Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Predictability of conversation partners Taro Takaguchi, Mitsuhiro Nakamura, Nobuo Sato, Kazuo Yano, and Naoki Masuda
preprint arXiv:1104.5344
社会ネットワーク中での 会話パターンの予測可能性
高口太朗, 中村光宏, 佐藤信夫, 矢野和男, 増田直紀
“Predictability of conversation partners” preprint arXiv:1104.5344
Conven&onal assump&ons
Mobility pa+erns in the physical space
Temporal order of selec6ng interac6on partners
by random (or Lévy) walk by Poisson process
Actual behavior: to what degree random / determinis6c ?
Modeling human behavior such as
Predictability of mobility pa7erns (Song et al., 2010)
Mobility pa+erns are largely predictable. Method: measuring the entropy of the sequence of loca6ons of each cell-‐phone user
T = X1, · · · , Xt, Xt+1, · · · , XL
How about other components of human behavior?
Predictability of interac&on partners
interacts with
-‐ Characterize individuals in a social organiza6on -‐ Related to epidemics and informa6on spreading
2, 3, 2, 2, 3, 1“partner sequence”
6me
2 2 2 3 3 1
discard the 6ming of events
Data set: face-‐to-‐face interac&on log in an office
Recording from two offices in Japanese companies -‐ subjects: 163 individuals -‐ period: 73 days -‐ total conversa6on events: 51,879 6mes
Name tag with an infrared module (The Business Microscope system)
We used the data collected by World Signal Center, Hitachi, Ltd., Japan. (株)日立製作所 ワールドシグナルセンタにより収集された データセットを使用した。
h+p://www.hitachi-‐hitec.com/jyouhou/business-‐microscope/
Details of data set
-‐ A conversa6on event = communica6on between close tags
-‐ Conversa6on events: undirected -‐ Time resolu6on = one minute
-‐ Mul6ple events ini6ated within the same minute
→ determine their order at random
!"#$%&!'#(%%%%%%%%%%%%%%%)*+%%%%)*,%%%%-./'!"01,22342+42+%+56+7%%%%%%%%+%%%%%%%%7%%%%%%%%+%#"1,22342+42+%+56+8%%%%%%%%+%%%%%%%%9%%%%%%%%5%#"1,22342+42+%+56+8%%%%%%%%8%%%%%%,9%%%%%%%%+%#"1,22342+42+%+56+:%%%%%%%%,%%%%%%%%8%%%%%%%%+%#"1,22342+42+%+56+3%%%%%%%%+%%%%%%+7%%%%%%%%5%#"1,22342+42+%+56+3%%%%%%%%7%%%%%%%%5%%%%%%%%+%#"1,22342+42+%+56,5%%%%%%%%+%%%%%%,,%%%%%%%%+%#"1,22342+42+%+56,9%%%%%%%%+%%%%%%,,%%%%%%%%;%#"1,22342+42+%+56,9%%%%%%%%7%%%%%%%%9%%%%%%%%+%#"1
!"#$%&!'#(%%%%%%%%%%%%%%%)*+%%%%)*,%%%%-./'!"01,22342+42+%+56+7%%%%%%%%+%%%%%%%%7%%%%%%%%+%#"1,22342+42+%+56+8%%%%%%%%+%%%%%%%%9%%%%%%%%5%#"1,22342+42+%+56+3%%%%%%%%+%%%%%%+7%%%%%%%%5%#"1,22342+42+%+56,5%%%%%%%%+%%%%%%,,%%%%%%%%+%#"1,22342+42+%+56,9%%%%%%%%+%%%%%%,,%%%%%%%%;%#"1
<<<<<
<<<<<
=7>%9>%+7>%,,>%,,>%<<<?
@'A @BA
@CA
Three entropy measures (cf. Song et al., 2010)
H0i = log2 ki
H1i = −
j∈Ni
Pi(j) log2 Pi(j)
Random entropy: interac6on in a totally random manner
Uncorrelated entropy: heterogeneity among
Condi6onal entropy: second-‐order correla6on
-‐ for individual who has partners in the recording period, i ki
Pi(j)
Pi(|j)
Pi(j) : the probability to talk with j
: to talk with immediately a`er with j
H2i = −
j∈Ni
Pi(j)
∈Ni
Pi(|j) log2 Pi(|j)
cf. Entropy
Ex.) coin toss
H(P ) ≡ −
ω∈Ω
P (ω) log2 P (ω)
H(p) = −p log2 p− (1− p) log2(1− p)ω ∈ head, tailP (head) = p
p0 1
1
“The uncertainty of the random variable”
p = 0 : always tail 1/2
Example 1: random
1
2
3
i
Pi(1|∗) = Pi(1),
Pi(2|∗) = Pi(2),
Pi(3|∗) = Pi(3).
H0i = log2 3
H2i = H
1i < H
0i
Suppose that
Pi(1) = 1/2, Pi(2) = 1/3, Pi(3) = 1/6
H1i = −
3
j=1
Pi(j) log2 Pi(j) =2
3+
1
2log2 3
Example 2: predictable
1
2
3
i Suppose that
Pi(1) = 1/2, Pi(2) = 1/3, Pi(3) = 1/6
0 1 12/3 0 01/3 0 0
Pi(|j)
j
H0i = log2 3
H1i =
2
3+ log2 3
H2i =
1
2log2 3−
1
3
< H
1i < H
0i
(same as example 1)
Quan&fying the predictability
Ii = H1i −H
2i
The informa6on about the NEXT partner that is earned by knowing the PREVIOUS partner
Interpreta6on:
Mutual informa6on of the partner sequence
Ii
predictable random
0 H1i
Aim of the study
Examine the predictability of conversa6on partners
-‐ similar to that of mobility pa+erns
“Predictability” = the mutual informa6on of the partner sequence
Data set: face-‐to-‐face interac6on log from two offices in Japan
events
interacts with
2 2 2 3 3 1
Ii = H1i −H
2i
i
Histogram of the entropies
!"
!"#$%&'()'*!+*,*+"-./
0 1 2 3 4 5 67
07
17
27
37!"7
!"0
!"1
8-9
2.0 2.5 3.0 3.5 4.02.0
2.5
3.0
3.5
4.0
Hi1
(b)
Hi2
, regardless of the value of Ii = H1i −H
2i 0 H
1i
→ Partner sequences are predictable to some extent.
Significance of Ii
Null hypothesis: “ is posi6ve only because of the small data size.”
Compare with the value obtained from shuffled partner sequences
1 50 100 1500
1
2
3
i in the ascending order of I i
Ii
(a)
Ii
Empirical Ii
Ii
Error bars: 99% confiden6al intervals of shuffled sequences
Main cause of the predictability
The bursty human ac6vity pa+ern: characterized by the long-‐tailed inter-‐event intervals
“ tends to talk with within a short period a`er talking with .”
6me
i j j
P>(τ) of a typical individual
Test 1 (1)
6me 2 3 1
Partner sequence in a given day
with 1
2
3
with
with
τ (1)2 τ (2)2
τ (1)3
Purpose: examine the contribu6on of the bursty ac6vity
extract the sequences with each partner
1 1 1 3 2 2 3 3 2 1
τ (1)1 τ (2)1 τ (3)1 τ (4)1
τ (3)2
τ (2)3 τ (3)3
Test 1 (2)
with 1
2
3
with
with
Next, shuffle the intervals within each partner
merge
Calculate of this shuffled sequence Ibursti 1
2
τ (1)2τ (2)2
τ (1)3
τ (1)1τ (2)1 τ (3)1τ (4)1
τ (3)2
τ (2)3τ (3)3
2 3 1 1 1 1 3 2 2 3 3 2 1 6me
Result of Test 1
1 50 100 1500
1
2
3
i in the ascending order of I i
I iburst
(a)
Generate 100 shuffled sequence
Empirical Ii
Error bars: mean ± sd for the shuffled sequences
Contribu6on of burs6ness ≈ 80%
Test 2
events 2 2 2 3 3 1 2 3
2, 3, 2, 3, 1of the merged sequence Imerge
iCalculate
Purpose: examine the predictability a`er ominng the burs6ness
merge into one
Result of Test 2
1 50 100 1500
1
2
3
i in the ascending order of I imerge
I imerge
(b)
Shuffle the merged sequence with replacement condi6oned that no partner ID appears successively
ImergeiOriginal
Error bars: 99% confiden6al intervals of shuffled sequences
Imergei is significantly large.
→ Predictability remains without the effect of the burs6ness.
Summary so far
Partner sequence: predictable to some extent
Main cause of predictability: the bursty ac6vity pa+ern
-‐ Predictability remains a`er ominng the burs6ness.
2.0 2.5 3.0 3.5 4.02.0
2.5
3.0
3.5
4.0
Hi1
(b)
Hi2
Predictability depends on individuals
I i
num
ber o
f ind
ivid
uals
0.5 1.0 1.5 2.00
10
20
30
40
Depends on ’s posi6on in the social network?
Histogram of Ii
Ii
i
Conversa&on network
Node : individual
Link : a pair of individuals having at least one event
Link weight : The total number of events for the pair
i
j
wij
Undirected and weighted network
Degree
Strength
Mean weight
Node a+ribu6ons
kisi =
j wij
wi = si/ki
Correla&on between and node a7ribu&ons
0 10 20 30 40 50 60 70
0.5
1.0
1.5
2.0
k i
R ! 1.698 " 10 3
(a)
I i
0 500 1000 1500 2000
0.5
1.0
1.5
2.0
si
R ! 0.5111(b)
I i
0 20 40 60
0.5
1.0
1.5
2.0
wi
R ! 0.6533(c)
I i
Ii Ii
Ii
ki
wi
si
No correla6on Nega6ve
Nega6ve
Ii
Hypotheses
-‐ Fix , small → abundance of weak links (with small weights)
Hypothesis about links
-‐ “Strength of weak 6es” (Granove+er, 1973)
Hypothesis about nodes
w ≥ 5w ≤ 4
ki wi
Strength of weak &es (Granove7er, 1973)
Weak links tend to connect different communi6es. Weak links offer non-‐redundant contacts.
“bridge”
Hypotheses
-‐ Fix , small → abundance of weak links (with small weights)
Hypothesis about links
-‐ Weak links connect different communi6es.
Hypothesis about nodes
w ≥ 5w ≤ 4
ki wi
• Individuals connec6ng different communi6es → large
• Individuals concealed in a community → small Ii
Hypothesis about links
Rela6ve neighborhood overlap of a link (Onnela et al., 2007)
i j i j
Oij = 1Oij = 0
Purpose: quan6fy the degree of being concealed in a community
i j
Oij = 1/2
Oij =| Intersection of the neighbors of i and j|| Union of the neighbors of i and j|− 2
Hypothesis about links
P cum(w)
(a)
<O> w
0.2 0.4 0.6 0.8 1.0
0.30
0.35
0.40
0.45
“Strength of weak 6es” hypothesis; a link with large tends to have a large → increases with
wij Oij
Oij wij
Ow averaged over the links with wij ≤ wOij
frac6on of the links with
:
wij ≤ w
Consistent with the hypothesis
Hypotheses
-‐ Fix , small → abundance of weak links (with small weights)
Hypothesis about links
-‐ Weak links connects different communi6es.
Hypothesis about nodes
w ≥ 5w ≤ 4
ki wi
• Individuals connec6ng different communi6es → large
• Individuals concealed in a community → small Ii
Hypothesis about nodes
i i
Ci = 0 Ci = 1
Ci =# triangles including i
ki(ki − 1)/2
Clustering coefficient of a node (Wa+s and Strogatz, 1998)
Purpose: quan6fy the degree of being concealed in a community
i
Ci = 1/2
Abundance of triangles
-‐ 3-‐clique percola6on community (Palla et al., 2005)
-‐ Hierarchical structure (Ravász and Barabási, 2003) Nodes connec6ng communi6es → small Ci
Clustering coefficient with link-‐weight threshold
Ci(wthr) : a`er elimina6ng the links with wij < wthr
Original CN
wthr = 3
12 5
7
1 1 3
5 2
10
Expecta6on: triangles persist around individuals with small
Ii
Ci(wthr) wthrfor a fixed
Ii
Hypothesis about nodes
! "! #! $! %! &!!!'(
!'#
!')
!'"
!'&
!'!
!'&
w *+,
RI, C(w *+,)RI, C(w *+,)
-./01,,234*51670128850526*
wthr
: Pearson correla6on coefficient between and Ii Ci(wthr)
: Par6al correla6on coefficient between and Ii Ci(wthr)
with and fixed ki(wthr) si(wthr)
Hypotheses
-‐ Fix , small → abundance of weak links (with small weights)
Hypothesis about links
-‐ Weak links connects different communi6es.
Hypothesis about nodes
ki wi
• Individuals connec6ng different communi6es → large
• Individuals concealed in a community → small Ii
! "! #! $! %! &!!!'(
!'#
!')
!'"
!'&
!'!
!'&
w *+,
RI, C(w *+,)RI, C(w *+,)
-./
01,,234*51670128850526*
wthrP cum(w)
(a)
<O> w
0.2 0.4 0.6 0.8 1.0
0.30
0.35
0.40
0.45
Conclusions
Predictability of partner sequence -‐ Face-‐to-‐face interac6on log in offices in Japanese companies
Conversa6on partners: predictable to some extent
-‐ A main cause: the bursty ac6vity pa+ern
-‐ Significant predictability a`er ominng the burs6ness
Dependence on the individual’s posi6on in the conversa6on network -‐ “Strength of weak 6es”
-‐ INTER-‐community individuals → large
-‐ INTRA-‐community individuals → small
IiIi
Preprint available electronically
Taro Takaguchi, Mitsuhiro Nakamura, Nobuo Sato, Kazuo Yano, and Naoki Masuda, “Predictability of conversa6on partners”, preprint arXiv:1104.5344