The Center for Naval Analyses
Another View of the Small World
Brian McCue
(Original paper published in Social Networks 24 (2002), pages 121-133)
This work is not a product of the CNA Corporation, a non-profit research and analysis organization.
\[ p(U, M, S, t) = \frac{\binom{M}{t}\binom{U-M}{S-t}}{\binom{U}{S}}. \]
Capture-Recapture and the Hypergeometric Distribution
“It’s a small world!”
A common expression
Usual “Small World” Topics
• Lengths of typical acquaintance chains (“degrees of separation” joining individuals).
• Sizes of typical acquaintance volumes (numbers of people known to an individual).
• Network structures of individuals’ acquaintanceships.
What do we mean?
When we say, “It’s a small world,” do we mean:
• “It’s a short acquaintance chain”?
• “It’s a small acquaintance volume”?
• Or something about structure … ?
It’s a short acquaintance chain!
Acquaintance Chains?
Duh!
Chains
It’s a small acquaintance volume!
Acquaintance volumes?
When do we vote next?
Volume
Structure?
It’s a small world; I sample it at no great rate, and I keep getting all these repeats!
Structure!
That’s right!
“It’s a small world”
• “It must be a small world, because I sample the population at no great rate and keep getting all these repeats.”
• The “small world” is the world from which we would be sampling, if we were sampling randomly from a structureless world and experiencing the observed level of coincidental meetings.
Operational world size
• The evocation of this imaginary, small, structureless world is a statement about the structure of the real, large, structured world.
• We will estimate the size of this “operational world,” and thereby learn about the real world.
Definitions
• W = size of an individual’s “world” (does not include the individual herself).
• I = number of meetings she has had.
• I_k = number of meetings with person k
(I_k is defined for k = 1, 2, …, W).
• W_j = number of individuals met j times
(W_j is defined for j = 0, 1, 2, …, I).
Distributing I balls over W boxes
• W^I ways to do it.
• We don’t care about the order of introductions.
• We don’t care which person is which.
A box with two balls is a coincidental re-introduction.
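As a rough companion to the balls-and-boxes picture, the sketch below computes how many coincidental re-introductions one would expect on average when I balls are thrown uniformly at random into W boxes. It relies on the standard occupancy result that each box stays empty with probability (1 - 1/W)^I; the function name `expected_repeats` is a hypothetical label, and the two trial inputs borrow numbers that appear in the deck's later examples.

```python
import math

def expected_repeats(world_size: int, meetings: int) -> float:
    """Expected number of balls that land in an already-occupied box."""
    # Probability that one particular box is still empty after all throws.
    # log1p/expm1 keep precision when world_size is very large.
    log_p_empty = meetings * math.log1p(-1.0 / world_size)
    expected_occupied = -world_size * math.expm1(log_p_empty)
    # Every ball beyond the first in a box counts as one repeat.
    return meetings - expected_occupied

print(expected_repeats(24, 11))            # village-sized world
print(expected_repeats(2_000_000, 2000))   # city-sized world
```

Under these assumptions, eleven meetings in a world of 24 give about two repeats on average, and 2,000 meetings in a world of 2,000,000 give about one.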
Probability of a configuration
• There are W^I ways to assign I balls to W boxes.
• We don’t care about the order of introductions, so I!/(I_1! × I_2! × I_3! × … × I_W!) configurations can’t be told apart.
• We don’t care which person is which, so W!/(W_0! × W_1! × W_2! × … × W_I!) configurations can’t be told apart.
• So the probability of any configuration is:
\[ \frac{W!\, I!}{W_0!\, W_1!\, W_2! \cdots W_I!\; I_1!\, I_2!\, I_3! \cdots I_W!\; W^{I}}. \]
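The configuration probability can be sanity-checked on a tiny world by brute force. The sketch below is only an illustration: `configuration_probability` and `brute_force_probability` are hypothetical names, and the check enumerates every one of the W^I ordered assignments, so it is practical only for very small W and I.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial

def configuration_probability(world_size, meetings_per_person):
    """Closed-form probability; meetings_per_person lists I_k for every person."""
    assert len(meetings_per_person) == world_size
    total = sum(meetings_per_person)                 # I
    people_by_count = Counter(meetings_per_person)   # W_j, including W_0
    denominator = world_size ** total                # W^I
    for w_j in people_by_count.values():
        denominator *= factorial(w_j)
    for i_k in meetings_per_person:
        denominator *= factorial(i_k)
    return Fraction(factorial(world_size) * factorial(total), denominator)

def brute_force_probability(world_size, meetings_per_person):
    """Enumerate all W^I ordered assignments and count the matching ones."""
    total = sum(meetings_per_person)
    target = sorted(meetings_per_person)
    hits = 0
    for assignment in product(range(world_size), repeat=total):
        counts = Counter(assignment)
        if sorted(counts.get(k, 0) for k in range(world_size)) == target:
            hits += 1
    return Fraction(hits, world_size ** total)

config = [2, 1, 0]   # three people: one met twice, one once, one never
print(configuration_probability(3, config), brute_force_probability(3, config))
```

Exact fractions are used so that the two results can be compared for equality rather than within a floating-point tolerance.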
Small village example
• A visitor meets randomly 9 people, two of them twice.
• Given a total population of W, the probability of this happening is
\[ \frac{W!\; 11!}{(W-9)!\, 7!\, 2! \cdots 0! \;\cdot\; 1!\, 2!\, 1!\, 2!\, 1!\, 1!\, 1!\, 1!\, 1!\, 0!\, 0! \cdots \;\cdot\; W^{11}} \]
Likelihood, a function of W
• Probability, given W, that what happened would happen.
• Can be used to estimate W.
• Suggests that there are about 24 people.
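The village estimate can be reproduced by evaluating the likelihood for each candidate W and keeping the largest. This is a minimal sketch: the function name `village_likelihood` and the search range of 9 to 200 are assumptions made for illustration, and the factors equal to 0! or 1! are left out because they do not change the value.

```python
from fractions import Fraction
from math import factorial

def village_likelihood(world_size: int) -> Fraction:
    """P(7 people met once and 2 met twice | W), for 11 meetings in all."""
    if world_size < 9:
        return Fraction(0)   # nine distinct people cannot fit in a smaller world
    # W! / (W_0! W_1! W_2!) with W_0 = W - 9, W_1 = 7, W_2 = 2
    label_arrangements = Fraction(
        factorial(world_size),
        factorial(world_size - 9) * factorial(7) * factorial(2),
    )
    # I! / (product of I_k!) with I = 11 and two people met twice
    meeting_orders = Fraction(factorial(11), factorial(2) * factorial(2))
    return label_arrangements * meeting_orders / world_size ** 11

candidates = range(9, 201)   # assumed search range, not from the slides
best_w = max(candidates, key=village_likelihood)
print(best_w, float(village_likelihood(best_w)))
```

With exact arithmetic the maximum falls at W = 24, matching the figure quoted on the slide.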
Realistic numbers
• W_1 and I_1 are nearly equal to I.
• These equal a few thousand for most people, but can only be estimated approximately.
• For j, k > 1, W_j and I_k are small, and people might recall them.
More definitions
\[ W_c = \sum_{i \ge 2} W_i \]
\[ I_c = \sum_{i \ge 2} i \cdot W_i = \sum_{j \ge 2} I_j \]
S = I_c − W_c
S is the number of surprising reintroductions.
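These three summaries follow directly from the counts a person can recall. The sketch below assumes the recalled data arrive as a dictionary mapping i (at least 2) to W_i; the name `recalled_summary` and that input format are illustrative choices, not part of the original paper.

```python
def recalled_summary(people_by_count: dict[int, int]) -> tuple[int, int, int]:
    """Return (W_c, I_c, S) from {i: W_i} for the recalled counts i >= 2."""
    if any(i < 2 for i in people_by_count):
        raise ValueError("only counts i >= 2 belong in the recalled summary")
    w_c = sum(people_by_count.values())                        # people met repeatedly
    i_c = sum(i * w_i for i, w_i in people_by_count.items())   # meetings with them
    return w_c, i_c, i_c - w_c                                 # S = I_c - W_c

# Small village example: two people were each met twice.
print(recalled_summary({2: 2}))
```

For the village this gives W_c = 2, I_c = 4 and S = 2, so the 11 meetings involve I − S = 9 distinct people.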
Likelihood of W, re-written
\[ L(W \mid W_2, \ldots, W_I, I_1, I_2, I_3, \ldots, I_W) = \frac{W!\; I!}{(W - I + S)!\,(I - I_c)!\,\left(\prod_{i \ge 2} W_i!\right)\left(\prod_{j \ge 2} I_j!\right) W^{I}}. \]
What the person doesn’t remember has factorial = 1 so it doesn’t matter.
Things a person might remember.
Maximizing L(W)
• L(W) still contains factorials of some big numbers.
• But we can find the W that maximizes L(W) by finding the W such that
\[ \frac{L(W)}{L(W-1)} = \frac{W}{W - I + S} \cdot \left(\frac{W-1}{W}\right)^{I} = 1. \]
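One way to apply the ratio condition numerically is a bisection over integer W. The sketch below is an illustration under the assumption that L(W) has a single peak, so that the ratio L(W)/L(W−1) crosses 1 only once; the function names are hypothetical.

```python
import math

def log_ratio(world_size: int, meetings: int, surprises: int) -> float:
    """log of L(W)/L(W-1) = log[ W/(W - I + S) * (1 - 1/W)^I ]."""
    distinct = meetings - surprises   # I - S people actually met
    return (-math.log1p(-distinct / world_size)
            + meetings * math.log1p(-1.0 / world_size))

def most_likely_world_size(meetings: int, surprises: int) -> int:
    """Largest W with L(W) >= L(W-1), assuming a single peak in L(W)."""
    if surprises <= 0:
        raise ValueError("with no repeats the likelihood keeps rising with W")
    low = meetings - surprises + 1    # smallest W at which the ratio is finite
    if log_ratio(low, meetings, surprises) < 0:
        return low - 1                # peak sits at W = I - S itself
    high = 2 * low
    while log_ratio(high, meetings, surprises) >= 0:
        high *= 2                     # expand until the ratio drops below 1
    while high - low > 1:             # invariant: ratio >= 1 at low, < 1 at high
        mid = (low + high) // 2
        if log_ratio(mid, meetings, surprises) >= 0:
            low = mid
        else:
            high = mid
    return low

print(most_likely_world_size(11, 2))   # small village example
```

For the village numbers (I = 11, S = 2) the search settles on 24, in line with the earlier slide. Working with logarithms avoids the large factorials entirely.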
Estimating I
• The phonebook test of Freeman and Thompson presents 301 surnames and asks the subject how many are names of people she knows.
• I = score × (total names in book) / 301.
• Book contains about 100,000 names.
• Typical result is 1,000 – 6,000.
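The scaling step is simple arithmetic, sketched below. The function name `estimate_acquaintances` and the two trial scores (3 and 18 recognized names) are hypothetical, picked only because they land near the ends of the typical range for a 100,000-name book.

```python
def estimate_acquaintances(recognized: int, names_in_book: int,
                           names_shown: int = 301) -> float:
    """Scale the recognized share of the sample up to the whole phonebook."""
    if not 0 <= recognized <= names_shown:
        raise ValueError("recognized must lie between 0 and names_shown")
    return recognized * names_in_book / names_shown

for score in (3, 18):   # hypothetical scores, not data from the study
    print(score, round(estimate_acquaintances(score, 100_000)))
```

Each recognized name is worth roughly 332 acquaintances at this book size, which is why the estimate of I is coarse.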
Estimating W
• A person has I = 2,000, S = 1: this leads to a W of about 2,000,000 in
\[ \frac{L(W)}{L(W-1)} = \frac{W}{W - I + S} \cdot \left(\frac{W-1}{W}\right)^{I} = 1. \]
• If I = 4,320 and S = 12, W = 775,000.
But it’s worth computing L(W)
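A minimal sketch of such a computation follows, for the case I = 2,000 and S = 1. With S = 1 the only possible recollection is one person met twice (W_2 = 1, I_c = 2), so the full likelihood is determined. Log-gamma functions stand in for the large factorials; the function name and the coarse grid of candidate world sizes are assumptions made for illustration.

```python
import math

def log_likelihood_one_repeat(world_size: int, meetings: int) -> float:
    """log L(W) when exactly one person was met twice (S = 1, W_2 = 1, I_c = 2)."""
    distinct = meetings - 1                             # I - S people actually met
    return (math.lgamma(world_size + 1)                 # log W!
            - math.lgamma(world_size - distinct + 1)    # log (W - I + S)!
            + math.lgamma(meetings + 1)                 # log I!
            - math.lgamma(meetings - 2 + 1)             # log (I - I_c)!
            - math.log(2)                               # log of the single I_j! = 2!
            - meetings * math.log(world_size))          # log W^I

meetings = 2000
grid = range(10_000, 10_000_001, 10_000)   # assumed coarse grid of world sizes
peak_w = max(grid, key=lambda w: log_likelihood_one_repeat(w, meetings))
peak_l = math.exp(log_likelihood_one_repeat(peak_w, meetings))
print(peak_w, round(peak_l, 3))
```

On this grid the peak lies close to 2,000,000, and the likelihood there is a sizeable fraction rather than a vanishingly small number.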
Observations on likelihoods
• Maxima are surprisingly high.
• Even S = 3 is enough to make a distinct peak.
• Resulting world sizes are
– Much less than the real world’s size.
– Comparable to (mostly less than or equal to) city sizes.
Conclusions
• We each might as well be drawing a lifetime’s introductions from a small city.
• For people who really do draw introductions from limited populations, coincidental re-introductions could be used to estimate I.
Discussion
• But what about those acquaintance chains, and the six degrees of separation?
– In light of US population size and estimates of I, six degrees is surprisingly many, not surprisingly few. For a random structure, four degrees would be plenty.
– Small world-size suggests that extra degrees are needed to make jumps from world to world.
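The "four degrees would be plenty" remark can be illustrated with a deliberately idealized count: if each person knew I others and no two acquaintance circles overlapped, d steps would reach I^d people. The sketch below assumes a round US population of 300 million, which is not a figure from the slides, and the function name is hypothetical.

```python
def degrees_needed(population: int, acquaintances: int) -> int:
    """Smallest number of steps d with acquaintances**d >= population."""
    if acquaintances < 2:
        raise ValueError("need at least two acquaintances per person")
    steps, reach = 0, 1
    while reach < population:
        reach *= acquaintances   # idealized: no overlap between circles
        steps += 1
    return steps

US_POPULATION = 300_000_000   # assumed round figure, not taken from the slides
for i in (1_000, 6_000):      # ends of the typical range for I
    print(i, degrees_needed(US_POPULATION, i))
```

Both ends of the typical range for I give three steps under this overlap-free assumption, so the observed six degrees point to structure rather than to a sparse random net.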
Suggestions for future work
• Get solid data on coincidental re-introductions.
• Do math to find:
– Why maxima of L(W) are equal for equal S’s.
– Faster way of computing L’s for successive W’s.
• Think about how small worlds might connect and how we could, perhaps through coincidental reintroductions, discover how they really do connect.
Connected small worlds
or
Or what?
Partial Bibliography
• Manfred Kochen (editor), 1989, The Small World, Ablex Publishing Corporation, Norwood, MA. Includes the following chapters:
– H. Russell Bernard, Eugene C. Johnsen, Peter D. Killworth, Scott Robinson, “Estimating the Size of an Average Personal Network and of an Event Subpopulation.”
– Linton C. Freeman and Claire R. Thompson, “Estimating Acquaintanceship Volume.”
– Alden S. Klovdahl, “Urban Social Networks, Some Methodological Problems and Possibilities.”
– Ithiel de Sola Pool and Manfred Kochen, “Contacts and Influence,” originally published in Social Networks 1 (1978), pages 5-51.
• Brian McCue, “Estimating the Number of Unheard U-boats: A Problem in Traffic Analysis,” 2000, Military Operations Research, Volume 5, Number 4, pp 5-18.
• Stanley Milgram, “The Small World Problem,” 1967 Psychology Today 1, pp 61-67.
• Ray Solomonoff and Anatol Rapoport, 1951, “Connectivity of Random Nets,” Bulletin of Mathematical Biophysics 13, pp 107-117.
• Ray Solomonoff, 1952, “An Exact Method for the Computation of the Connectivity of Random Nets,” Bulletin of Mathematical Biophysics 14, pp 153-157.
• Jeffrey Travers and Stanley Milgram, 1969, “An Experimental Study of the Small World Problem,” Sociometry 32, pp 425-443.
• Duncan Watts, 1999, Small Worlds: The Dynamics of Networks between Order and Randomness, Princeton, Princeton University Press.