Way Finding in Social Networks


  • 8/14/2019 Way Finding in Social Networks


    Wayfinding in Social Networks

    David Liben-Nowell

    Carleton College

    [email protected]

    Microsoft Research Cambridge | 7 December 2007

    Workshop on Online Social Networks


    Analyzing Social Networks, pre-1995

    Social Network Analysis: Old School

    social networks have been around for 100K+ years!

    before the web, hard to acquire (surveys, interviews, ...).

    but many interesting, relevant, generalizable observations!


    Zachary's Karate Club

    [Figure: the karate club's social network; nodes are labeled inst, 2 through 33, and admin.]

    [Zachary 1977]

    Recorded interactions in a karate club for 2 years.

    During observation, an administrator/instructor conflict developed; the club broke into two clubs.

    Who joins which club?

    Split along the administrator/instructor minimum cut (!)
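The minimum-cut idea can be sketched on a small made-up graph. This is not Zachary's data: the edge list, the node names other than inst/admin, and the helper `min_cut_side` are illustrative assumptions. The sketch treats every friendship as a unit-capacity undirected edge, pushes flow along BFS augmenting paths, and reports which nodes stay on the instructor's side once no path to the administrator remains.

```python
from collections import deque

def min_cut_side(edges, source, sink):
    """Return (cut_size, source-side nodes) for an undirected unit-capacity graph."""
    cap = {}   # residual capacity per directed pair
    adj = {}   # neighbor sets
    for u, v in edges:
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap[(v, u)] = cap.get((v, u), 0) + 1
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def bfs_parents():
        # Breadth-first search that only follows edges with residual capacity left.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = bfs_parents()
        if sink not in parent:
            # No augmenting path: the nodes still reachable form one side of a minimum cut.
            return flow, set(parent)
        # Push one unit of flow back along the path that was found.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Hypothetical toy club: two tight groups joined by the single friendship b-c.
toy_edges = [("inst", "a"), ("inst", "b"), ("a", "b"), ("b", "c"),
             ("c", "admin"), ("c", "d"), ("d", "admin")]
cut_size, inst_side = min_cut_side(toy_edges, "inst", "admin")
print(cut_size, sorted(inst_side))
```

In this toy graph the only friendship crossing between the two groups is b-c, so the predicted split puts inst, a, and b in one club and everyone else in the other.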


    Why to think algorithmically about social networks

    The small-world phenomenon

    Models of the navigable small world

    Some conclusions and musings


    Milgram: Six Degrees of Separation

    Social Networks as Networks: [Milgram 1967]

    People given letter, asked to forward to one friend.

    Source: random residents of Omaha, NE;

    Target: stockbroker near Boston, MA.

    Completed chains averaged six hops to reach the target.


    Milgram: The Explanation?

    the small-world problem

    Why is a random Omahaian close to a Boston stockbroker?

    Standard (pseudosociological, pseudomathematical) explanation:

    (Erdős/Rényi) random graphs have small diameter.

    Bogus! In fact, many bogosities:

      degree distribution

      clustering coefficients

      ...


    High School Friendships

    [Moody 2001]


    Homophily

    homophily: a person x's friends tend to be similar to x.

    One explanation for high clustering: (semi)transitivity of similarity.

    x, y both friends of u ⇒ x and u similar; y and u similar

    ⇒ x and y similar

    ⇒ x and y friends
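To make "high clustering" concrete, here is a minimal sketch of the local clustering coefficient: the fraction of pairs of u's friends who are themselves friends. The adjacency dictionary `toy_adj` and the function name are hypothetical, chosen only for illustration.

```python
from itertools import combinations

def clustering_coefficient(adj, u):
    """Fraction of pairs of u's friends who are friends with each other."""
    friends = adj[u]
    if len(friends) < 2:
        return 0.0  # no pairs of friends to compare
    pairs = list(combinations(sorted(friends), 2))
    linked = sum(1 for x, y in pairs if y in adj[x])
    return linked / len(pairs)

# Hypothetical toy network: x and y are both friends of u and of each other; z only knows u.
toy_adj = {
    "u": {"x", "y", "z"},
    "x": {"u", "y"},
    "y": {"u", "x"},
    "z": {"u"},
}
print(clustering_coefficient(toy_adj, "u"))
```

Of the three pairs of u's friends, only x and y are linked, so u's coefficient is one third; homophily would tend to push this number up.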


    Navigability of Social Networks

    [Kleinberg 2000]

    Milgram experiment shows more than small diameter:

    People can construct short paths!

    Milgram's result is algorithmic, not existential.

    Theorem [Kleinberg 2000]: No local-information algorithm

    can find short paths in Watts/Strogatz networks.


    Homophily and Greedy Applications

    homophily: a person x's friends tend to be similar to x.

    Key idea: getting closer in similarity space ⇒ getting closer in graph-distance space

    [Killworth Bernard 1978] (reverse small-world experiment)

    [Dodds Muhamad Watts 2003]

    In searching a social network for a target,

    most people chose the next step because of

    geographical proximity or similarity of occupation

    (more geography early in chains; more occupation late).

    Suggests the greedy algorithm in social-network routing:

    if aiming for target t, pick your friend who's most like t.


    Greedy Routing

    Greedy algorithm:

    if aiming for target t, pick your friend who's most like t.

    Geography: greedily route based on distance to t.

    Occupation: greedily route based on distance

    in the (implicit) hierarchy of occupations.

    Want Pr[u, v friends] to decay smoothly as d(u, v) increases.

    (Need social cues to help narrow in on t. Not just homophily! Can't just have many disjoint cliques.)
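A minimal sketch of the greedy rule, assuming each person knows only their own friends and some distance function to the target. The names `greedy_route`, `friends`, `positions`, and `dist` are hypothetical; the toy data places ten people on a line with one long-range friendship.

```python
def greedy_route(friends, dist, source, target, max_hops=100):
    """Forward to the friend closest to the target; stop on arrival or when stuck."""
    path = [source]
    current = source
    while current != target and len(path) <= max_hops:
        best = min(friends[current], key=lambda f: dist(f, target), default=None)
        # Stuck: no friend is strictly closer to the target than the current holder.
        if best is None or dist(best, target) >= dist(current, target):
            return path, False
        path.append(best)
        current = best
    return path, current == target

# Hypothetical toy world: person i lives at point i on a line and knows both neighbors.
positions = {i: i for i in range(10)}
friends = {i: {j for j in (i - 1, i + 1) if j in positions} for i in positions}
friends[0].add(6)  # one long-range friendship

def dist(a, b):
    return abs(positions[a] - positions[b])

print(greedy_route(friends, dist, 0, 8))
```

The route from 0 to 8 takes the long-range link first and then homes in through local links, which is the scattered-plus-localized mix of friendships the slides call for. If friendships formed only disjoint cliques, the function would return early with `False`.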


    The LiveJournal Community

    www.livejournal.com    "Baaaaah," says Frank.

    Online blogging community.

    Currently 14.3 million users; 1.3 million in February 2004.

    LiveJournal users provide:

    disturbingly detailed accounts of their personal lives.

    profiles (birthday, hometown, explicit list of friends)

    Yields a social network, with users' geographic locations.

    (500K people in the continental US.)

    [DLN Novak Kumar Raghavan Tomkins 2005]


    LiveJournal

    0.1% of LJ friendships

    [DLN Novak Kumar Raghavan Tomkins 2005]


    Distance versus LJ link probability

    [Figure: log-log plot of link probability (1e-07 to 1e-03) against distance (10 km to 1000 km).]

    [DLN Novak Kumar Raghavan Tomkins 2005]


    Analyzing Social Networks, pre-1995

    Social Network Analysis: Old School

    social networks have been around for 100K+ years!

    before the web, hard to acquire (surveys, interviews, ...).

    but many interesting, relevant, generalizable observations!

    Social Network Analysis: New Sch00l

    automatically extract networks without having to ask.

    phone calls, emails, online communities, ...

    big! but are these really social networks?


    The Hewlett-Packard Email Community

    [Adamic Adar 2005]

    Corporate research community.

    Captured email headers over 3 months.

    Define friendship as ≥ 6 emails u → v and ≥ 6 emails v → u.

    Yields a social network (n = 430),

    with positions in the corporate hierarchy.
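The friendship definition above can be sketched as a filter over email headers. This is only an illustration: the `(sender, recipient)` tuple format, the function name `friendship_edges`, and the toy data are assumptions, not the format of the HP data set.

```python
from collections import Counter

def friendship_edges(emails, threshold=6):
    """Undirected edges {u, v} where each person sent the other at least `threshold` emails."""
    sent = Counter(emails)  # (sender, recipient) -> number of emails
    edges = set()
    for (u, v), count in sent.items():
        # Require the threshold in both directions; Counter gives 0 for unseen pairs.
        if u != v and count >= threshold and sent[(v, u)] >= threshold:
            edges.add(frozenset((u, v)))
    return edges

# Hypothetical headers: ann and bob write to each other often; cat rarely replies to ann.
toy_emails = ([("ann", "bob")] * 6 + [("bob", "ann")] * 7 +
              [("ann", "cat")] * 10 + [("cat", "ann")] * 2)
print(friendship_edges(toy_emails))
```

Requiring the threshold in both directions drops one-way traffic such as announcements, so only the ann-bob pair counts as a friendship here.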


    Emails and the HP Corporate Hierarchy

    black: HP corporate hierarchy; gray: exchanged emails.

    [Adamic Adar 2005]


    Emails and the HP Corporate Hierarchy

    So what?

    [Adamic Adar 2005]


    Requisites for Navigability

    [Figure: log-log plot of link probability (1e-07 to 1e-03) against distance (10 km to 1000 km).]

    [Kleinberg 2000]:

    for a social network to be navigable without global knowledge,

    need well-scattered friends (to reach faraway targets)

    need well-localized friends (to home in on nearby targets)


    Kleinberg: Navigable Social Networks

    [Kleinberg 2000]

    put n people on a k-dimensional grid

    connect each to its immediate neighbors

    add one long-range link per person; $\Pr[u \to v] \propto 1/d(u,v)^{\alpha}$.


    Navigability of Social Networks

    put n people on a k-dimensional grid

    connect each to its immediate neighbors

    add one long-range link per person; $\Pr[u \to v] \propto 1/d(u,v)^{\alpha}$.

    Theorem [Kleinberg 2000]: (short = polylog(n))

    If $\alpha \neq k$

    then no local-information algorithm can find short paths.

    If $\alpha = k$

    then people can find short $O(\log^2 n)$ paths using

    the greedy algorithm.
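The grid model can be simulated directly. The sketch below is a toy under stated assumptions: a 2-D grid with Manhattan distance, one long-range link per person, and hypothetical helper names (`build_long_links`, `greedy_hops`). It is meant to show the mechanics, not to reproduce the theorem's asymptotics.

```python
import random

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def build_long_links(side, alpha, rng):
    """One long-range contact per node, chosen with probability proportional to d^(-alpha)."""
    nodes = [(x, y) for x in range(side) for y in range(side)]
    links = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [manhattan(u, v) ** -alpha for v in others]
        links[u] = rng.choices(others, weights=weights, k=1)[0]
    return links

def greedy_hops(side, links, source, target):
    """Greedy steps from source to target using grid neighbors plus the long-range link."""
    current, hops = source, 0
    while current != target:
        x, y = current
        candidates = [(x + dx, y + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if 0 <= x + dx < side and 0 <= y + dy < side]
        candidates.append(links[current])
        # Some grid neighbor is always one step closer, so this loop terminates.
        current = min(candidates, key=lambda v: manhattan(v, target))
        hops += 1
    return hops

rng = random.Random(0)  # fixed seed so the sketch is repeatable
side = 10
for alpha in (0, 2, 4):
    links = build_long_links(side, alpha, rng)
    pairs = [((rng.randrange(side), rng.randrange(side)),
              (rng.randrange(side), rng.randrange(side))) for _ in range(200)]
    mean_hops = sum(greedy_hops(side, links, s, t) for s, t in pairs) / len(pairs)
    print(alpha, round(mean_hops, 2))
```

On a grid this small the differences between exponents are slight; the theorem concerns how path length grows with n, so a meaningful comparison would need much larger grids and a faster sampling scheme than this quadratic-time one.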


    Geography's Role in LiveJournal

    [DLN Novak Kumar Raghavan Tomkins 2005]

    By simulating the Milgram experiment,

    we find that LJ is navigable via geographically greedy routing.

    By Kleinberg's theorem,

    navigable 2D geographic mesh ⇒ $\Pr[u \to v] \propto d(u, v)^{-2}$.

    Original goal of this research:

    verify that $\Pr[u \to v] \propto d(u, v)^{-2}$ in LiveJournal.


    The LiveJournal Odyssey

    Dot shown for every

    inhabited location

    in LiveJournal network.

    Circles are centered

    on Ithaca, NY.

    Each successive circle's

    population increases by 50,000.

    Uniform population ⇒ radii would decrease quadratically.

    (actually mostly increase!)

    People don't live on a uniform grid!


    Why does distance fail?

    [Figure: two pairs of people, each pair 500 m apart.]

    Population density varies widely across the US!

    Two people 500 m apart: best friends in Minnesota, strangers in Manhattan.


    Rank-Based Friendship

    How do we handle non-uniformly distributed populations?

    Instead of distance, use rank as fundamental quantity.

    $\mathrm{rank}_A(B) := |\{C : d(A, C) < d(A, B)\}|$

    How many people live closer to A than B does?

    Rank-Based Friendship: $\Pr[A \text{ is a friend of } B] \propto 1/\mathrm{rank}_A(B)$.

    Probability of friendship ∝ 1/(number of closer candidates)
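A minimal sketch of rank and rank-based sampling, following the set definition literally (the count ranges over everyone, including A, so A's nearest neighbor has rank 1). The function names, the `positions` dictionary, and the treatment of co-located people are assumptions made for illustration.

```python
import random

def ranks_from(a, positions, dist):
    """rank_a(b) = |{c : d(a, c) < d(a, b)}|, counting over everyone (a included)."""
    d = {c: dist(positions[a], positions[c]) for c in positions}
    return {b: sum(1 for c in positions if d[c] < d[b]) for b in positions if b != a}

def sample_friend(a, positions, dist, rng):
    """Pick one friend for a with probability proportional to 1 / rank_a(b)."""
    ranks = ranks_from(a, positions, dist)
    people = list(ranks)
    # Assumption: anyone co-located with a (rank 0) is treated as rank 1.
    weights = [1.0 / max(r, 1) for r in ranks.values()]
    return rng.choices(people, weights=weights, k=1)[0]

# Hypothetical people on a line; d is far away in distance but only third in rank.
toy_positions = {"a": 0, "b": 1, "c": 3, "d": 10}

def line_dist(p, q):
    return abs(p - q)

print(ranks_from("a", toy_positions, line_dist))
print(sample_friend("a", toy_positions, line_dist, random.Random(0)))
```

Because only the ordering of distances matters, stretching or compressing the map (sparse Minnesota versus dense Manhattan) leaves the link probabilities unchanged as long as the same number of people lie in between.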


    Relating Rank and Distance

    Rank-Based Friendship: $\Pr[A \text{ is a friend of } B] \propto 1/\mathrm{rank}_A(B)$.

    Kleinberg (k-dim grid): $\Pr[A \text{ is a friend of } B] \propto 1/d(A, B)^k$.

    Uniform k-dimensional grid:

    radius-d ball volume ≈ $d^k$ ⇒ 1/rank ≈ $1/d^k$

    For a uniform grid, rank-based friendship

    has (essentially) same link probabilities as Kleinberg.
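The volume argument can be checked numerically for k = 2. The sketch below counts integer grid points strictly closer to the origin than distance d (that count is the rank of someone at distance d, under Euclidean distance on a fully populated grid, which is an assumption of this example) and compares it with d squared.

```python
import math

def rank_at_distance(d):
    """Number of 2-D integer grid points strictly closer to the origin than distance d."""
    r = int(math.ceil(d))
    return sum(1 for x in range(-r, r + 1) for y in range(-r, r + 1)
               if x * x + y * y < d * d)

# rank / d^2 should settle near a constant (the area of the unit disc, pi).
ratios = {d: rank_at_distance(d) / d ** 2 for d in (10, 20, 50)}
for d, ratio in ratios.items():
    print(d, round(ratio, 3))
```

Since the ratio is roughly constant, 1/rank and 1/d² differ only by a constant factor, which is why the two models give essentially the same link probabilities on a uniform grid.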


    Population Networks

    A rank-based population network consists of:

    a k-dimensional grid L of locations.

    a population P of people, living at points in L (n := |P|).

    a set $E \subseteq P \times P$ of friendships:

      one edge from each person in each direction

      one edge from each person, chosen by rank-based friendship

    e.g., locations rounded to the nearest integral point in longitude/latitude.


    The Real Theorem

    Theorem: For any n-person rank-based population network

    in a k-dimensional grid, $k = \Theta(1)$, for any source $s \in P$ and for a randomly chosen target $t \in P$,

    the expected length (over t) of Greedy(s, loc(t)) is $O(\log^3 n)$.

    Intuition: difficulty of halving distance to isolated target t

    is canceled by low probability of choosing t.

    Real theorem: not just for grids.

    (use doubling dimension of metric space instead of k).


    Short Paths and Rank-Based Friendships

    Theorem: For any n-person population network in a k-dim grid,

    for any source $s \in P$ and a randomly chosen target $t \in P$,

    the expected length (over t) of Greedy(s, t) is $O(\log^3 n)$.

    Theorem [Kleinberg 2000]: For any n-person uniform-density

    population network, any source s, and any target t,

    the length of Greedy(s, t) is $O(\log^2 n)$ with high probability.

    Lose: expectation (not whp).

    Lose: another log factor.

    Gain: arbitrary population densities.

    Gain? holds in real networks?


    Ranks and Friendships in LiveJournal

    [Figure: log-log plot of link probability (1e-07 to 1e-01) against rank (1 to 100000), with fitted curves $r^{-1.15}$ and $r^{-1.2} + \varepsilon$.]

    shows $\Pr_u[u \text{ is friends with the } v : \mathrm{rank}_u(v) = r]$

    very close to 1/r, as required for rank-based friendship!

    again, must correct for nongeographic friends.


    Ranks and Friendships in LiveJournal

    [Figure: log-log plot of link probability minus ε (1e-11 to 1e-01) against rank (1 to 100000), with fitted curve $r^{-1.25}$.]

    shows probability of rank-r friendship, less $\varepsilon = 5.0 \times 10^{-6}$.

    LJ location resolution is city-only.

    for the average u, ranks {r, ..., r + 1300} are in the same city

    we'll average probabilities over ranks {r, ..., r + 1300}
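The two corrections described above (subtract the background probability ε, then average over a window of consecutive ranks) can be sketched as follows. The function name, the list-based input format, and the toy numbers are assumptions; the slides use ε = 5.0 × 10⁻⁶ and a window of 1300 ranks on the real data.

```python
def corrected_and_averaged(prob_by_rank, eps, window):
    """Subtract eps from each link probability, then average over ranks r .. r + window.

    prob_by_rank[i] is the observed link probability at rank i + 1.
    """
    corrected = [p - eps for p in prob_by_rank]
    averaged = []
    for start in range(0, len(corrected) - window):
        chunk = corrected[start:start + window + 1]  # window + 1 consecutive ranks
        averaged.append(sum(chunk) / len(chunk))
    return averaged

# Hypothetical data: an exact 1/r law plus a constant background of 0.01.
toy_eps = 0.01
toy_probs = [1.0 / r + toy_eps for r in range(1, 11)]
print(corrected_and_averaged(toy_probs, toy_eps, window=1))
```

Averaging smooths out the artifact that everyone in the same city is tied in distance, at the cost of blurring the curve at low ranks.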


    Ranks and Friendships in LiveJournal

    [Figure: log-log plot of link probability minus ε, averaged (1e-09 to 1e-01), against rank (1 to 100000), with fitted curve $r^{-1.30}$.]

    still very close to 1/r

    (though not absolutely perfect)


    Geographic/Nongeographic Friendships

    [Figure: two log-log plots against distance (10 km to 1000 km): link probability, and link probability minus ε (both 1e-07 to 1e-03).]

    good estimate of friendship probability:

    $\Pr[u \to v] \approx \varepsilon + f(d(u, v))$ for $\varepsilon \approx 5.0 \times 10^{-6}$.

    ε friends: nongeographic; f(d) friends: geographic.

    LJ: E[number of u's ε friends] = 500,000 · ε ≈ 2.5.

    LJ: average degree ≈ 8.

    5.5/8 ⇒ roughly 66% of LJ friendships are geographic, 33% are not.
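The arithmetic behind these figures, written out as a short calculation. The variable names are illustrative; the three input numbers are the ones quoted on the slides.

```python
n_people = 500_000   # LJ users in the continental US
eps = 5.0e-6         # background (nongeographic) link probability
avg_degree = 8       # approximate average degree in LJ

nongeo_friends = n_people * eps            # expected nongeographic friends per person
geo_friends = avg_degree - nongeo_friends  # the rest are attributed to geography
geo_fraction = geo_friends / avg_degree

print(round(nongeo_friends, 2), round(geo_friends, 2), round(geo_fraction, 3))
```

With these rounded inputs the geographic share comes out a little above two-thirds, consistent with the rough 66%/33% split quoted above given that the average degree is itself approximate.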


    A Puzzle to Ponder

    This talk:

    66% of LJ friendships form through geographic processes.

    Geography is very important, even in a virtual community.

    The 5.5 rank-based geographic friends per person suffice to account for the small world.

    Not in this talk: Why should networking people care?

    Also not in this talk: And what about the other 2.5 nongeographic friends?

    But why does geography matter so much?

    virtualized real-world friendships?


    Routing Choices

    In real life, many ways to choose a next step when searching!

    Geography: greedily route based on distance to t.

    Occupation: greedily route based on distance in the (implicit) hierarchy of occupations.

    Age, hobbies, alma mater, ...

    Popularity: choose people with high outdegree.

    [Adamic Lukose Puniyani Huberman 2001]

    [Kim Yoon Han Jeong 2002] ...

    What does "closest" mean in real life?

    How do you weight various proximities?

    minimum over all proximities? [Dodds Watts Newman 2002]

    a more complicated combination?
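One of the options listed above, taking the minimum over all proximities, can be sketched as follows. Everything here is hypothetical: the people, the two metrics, and especially the scaling that makes kilometers and occupation levels comparable, which is exactly the open weighting question the slide raises.

```python
def combined_distance(u, t, metrics):
    """Smallest distance between u and t over several notions of proximity."""
    return min(metric(u, t) for metric in metrics)

def next_hop(candidates, target, metrics):
    """Greedy choice: the candidate friend who is closest to the target in any one metric."""
    return min(candidates, key=lambda f: combined_distance(f, target, metrics))

# Hypothetical people with a location (km along a line) and an occupation code.
people = {
    "target": {"km": 1000, "job": 3},
    "far_colleague": {"km": 0, "job": 3},
    "near_stranger": {"km": 900, "job": 9},
}

def geo(a, b):
    # Assumed scaling: 100 km counts as one unit of distance.
    return abs(people[a]["km"] - people[b]["km"]) / 100

def job(a, b):
    return abs(people[a]["job"] - people[b]["job"])

print(next_hop(["far_colleague", "near_stranger"], "target", [geo, job]))
```

Under the minimum rule the distant colleague wins because sharing the target's occupation counts as being very close, even though the other candidate lives much nearer.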


    Open Directions


    A half-sociological, half-computational question ...

    Why should rank-based friendship hold, even approximately?

    Are there natural processes that generate it?

    E.g., a generative process based on geographic interests?


    Thank you!

    David Liben-Nowell

    [email protected]

    http://cs.carleton.edu/faculty/dlibenno

    Some references


    Milgram 1967. The Small World Problem. Psychology Today.

    Kleinberg 2000. The Small-World Phenomenon: An Algorithmic Perspective. STOC'00.

    Dodds Muhamad Watts 2003. An Experimental Study of Search in Global Social Networks. Science.

    Liben-Nowell Novak Kumar Raghavan Tomkins 2005. Geographic Routing in Social Networks. Proc. Natl. Acad. Sci.

    Kumar Liben-Nowell Tomkins 2006. Navigating Low-Dimensional and Hierarchical Population Networks. ESA'06.

    Adamic Adar 2005. How to search a social network. Social Networks.
