Analysis of Fusing Online and Co-presence Social Networks
Juan (Susan) Pan, Daniel Boston, and Cristian Borcea
Department of Computer Science, New Jersey Institute of Technology
Pervasive social applications
• Traditional social apps
• Location-aware social apps
• Socially-aware apps
  ▪ BUBBLE Rap: uses social knowledge to improve packet forwarding in delay-tolerant networks
  ▪ Tribler: uses social knowledge to reduce peer-to-peer communication overhead
Social information collection
• Declared by users
  ▪ Implicitly, through online social networks
  ▪ Explicitly, through surveys
• Extracted from user online interactions
• Extracted from user mobility traces
  ▪ Location traces
  ▪ Co-presence traces (e.g., using Bluetooth)
Social information representation
• Multiple social graphs (e.g., Facebook and co-presence)
  ▪ Vertices -> users
  ▪ Edges -> social ties
• Online social networks (OSN) provide a relatively stable social graph
  ▪ Many connections are weak (example: actors have millions of “friends”)
  ▪ Not all social contacts use OSN apps
• Co-presence social network (CSN) identifies social ties grounded in real-world interactions
  ▪ Hard to differentiate social connections from passers-by
Research questions
• Do OSN and CSN just reinforce each other or capture different types of social ties?
• Can a fused network take advantage of the strengths of both?
• How can we quantify the benefits of this fusion?
• Can we measure the contribution of each source network to the fused network?
Outline
Motivation
Data collection
Social graph representation
Analysis of global network parameters
Analysis of local network parameters
Conclusions
Study participants
• One month of CSN data and Facebook data for the same set of 104 students
  ▪ Volunteers
  ▪ Received compensation
  ▪ Belong to various departments at NJIT
Bluetooth based co-presence data
[Figure: Bluetooth sighting tables with columns User, Seen, Time; rows: A B 1:00, B A 1:05, A B 1:07, B C 1:05, A C 1:07; diagram label: INTERNET]
Co-presence statistics

                     Max             Mean          Standard Dev.
Meeting Duration     220 hrs 2 min   1 hr 16 min   7 hrs 34 min
Meeting Frequency    51              2.2           3.79
Facebook data
• Subjects gave us permission to collect data
  ▪ Friends, wall writings, comments, photo tags
• An online interaction is a wall writing, comment, or photo tag
  ▪ Count number of interactions between user pairs

                      Max   Mean   Standard Dev.
Online Interactions   40    2      4
Weighted social graphs are more accurate
• OSN: Weight_online = number of interactions
• CSN: Weight_co-presence = 0.5 × Weight_duration + 0.5 × Weight_frequency
• How to make OSN and CSN weights comparable? Need weight normalization
  ▪ OSN: Weight_online in [1,40]
  ▪ CSN:
    Weight_duration = (Duration / MAX_duration) × 40, in [1,40]
    Weight_frequency = (Frequency / MAX_frequency) × 40, in [1,40]
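The normalization above can be sketched in a few lines of Python. This is only an illustration of the formulas on the slide, not the authors' code; the function names and the choice of minutes as the duration unit are assumptions. The maxima used in the example (220 hrs 2 min, 51 meetings) come from the co-presence statistics.

```python
SCALE = 40.0  # upper end of the OSN weight range reported on the slide


def normalize(value, max_value, scale=SCALE):
    """Scale a raw value into the OSN weight range: (value / max) * 40."""
    return (value / max_value) * scale


def copresence_weight(duration, frequency, max_duration, max_frequency):
    """CSN edge weight: equal mix of normalized duration and frequency."""
    w_duration = normalize(duration, max_duration)
    w_frequency = normalize(frequency, max_frequency)
    return 0.5 * w_duration + 0.5 * w_frequency


# Maxima taken from the co-presence statistics (duration in minutes).
MAX_DURATION = 220 * 60 + 2   # 220 hrs 2 min
MAX_FREQUENCY = 51

# A pair with the longest total duration and the most meetings gets weight 40.
strongest = copresence_weight(MAX_DURATION, MAX_FREQUENCY,
                              MAX_DURATION, MAX_FREQUENCY)
```

Note that the formula as written maps small raw values close to 0, while the slide lists the range as [1,40]; the sketch does not clamp the lower bound.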
CSN noise reduction
• How to remove edges due to passers-by in CSN?
  ▪ Very short and infrequent co-presence does not indicate the presence of a social tie
Threshold selection
• Find duration & frequency thresholds for adding a CSN edge
• Increase thresholds until the edit distance between CSN and OSN stabilizes
  ▪ Edit distance: number of edge additions/deletions to transform one graph into the other
  ▪ Keep OSN unchanged because Facebook friendship confirmations validate social ties
• Total meeting duration threshold α = 160 minutes per month
• Total meeting frequency threshold β = 3 times per month
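A minimal sketch of the threshold filter and the edit distance follows. It assumes that a CSN edge is kept only when both thresholds are met and that thresholds are inclusive; the slides do not state either detail. The data layout, function names, and toy values are hypothetical.

```python
def edge(u, v):
    """Undirected edge as an order-independent key."""
    return frozenset((u, v))


def edit_distance(edges_a, edges_b):
    """Edge additions plus deletions needed to turn one graph into the
    other, for two graphs over the same node set."""
    return len(edges_a ^ edges_b)


def filter_csn(meetings, alpha, beta):
    """Keep pairs whose total minutes >= alpha AND meeting count >= beta.
    Combining the two thresholds with AND is an assumption."""
    return {e for e, (minutes, count) in meetings.items()
            if minutes >= alpha and count >= beta}


# Toy data: pair -> (total minutes per month, number of meetings).
meetings = {
    edge("A", "B"): (300, 5),
    edge("A", "C"): (10, 1),    # likely a passer-by
    edge("B", "C"): (200, 2),
}
osn = {edge("A", "B"), edge("B", "D")}  # OSN is kept unchanged

unfiltered = edit_distance(filter_csn(meetings, 0, 0), osn)
filtered = edit_distance(filter_csn(meetings, 160, 3), osn)
```

Sweeping `alpha` and `beta` upward and recording `edit_distance` at each step would show where the distance stops changing, which is the stabilization criterion described above.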
Resulting social graphs
[Figure: three graph panels: Co-presence Social Network, Online Social Network, Fused Network (51 shared edges)]
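One simple way to read "fused network" is the union of the two edge sets, with shared edges counted once. The sketch below assumes that reading; how the authors combine edge weights is not described on the slides, so weights are left out. All names and the toy edges are hypothetical.

```python
def fuse(osn_edges, csn_edges):
    """Return (fused edge set, shared edge set) for two undirected graphs.
    Fusion as a plain set union is an assumption."""
    return osn_edges | csn_edges, osn_edges & csn_edges


def average_degree(num_edges, num_nodes):
    """Each undirected edge contributes to the degree of two nodes."""
    return 2 * num_edges / num_nodes


# Toy example with edges as frozensets of endpoints.
osn = {frozenset("AB"), frozenset("BC")}
csn = {frozenset("BC"), frozenset("CD"), frozenset("AD")}
fused, shared = fuse(osn, csn)

# Size of the union: |OSN| + |CSN| - |shared|.
assert len(fused) == len(osn) + len(csn) - len(shared)
```

With the counts reported later in the deck (165 OSN edges, 196 CSN edges, 51 shared edges, 104 nodes), the same identity gives 165 + 196 - 51 = 310 fused edges, and `average_degree(310, 104)` rounds to 5.96, matching the reported fused average degree.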
Analysis of global network parameters: degree, connectivity, centrality, cohesiveness
Degree distribution

                  OSN    CSN    Fused
Average degree    3.17   3.77   5.96

Correlation (online, co-presence) = 0.202

• OSN degree approximately follows a power-law distribution
• CSN degree does not follow a power-law distribution as closely as OSN's
  ▪ Due to meetings with familiar strangers
  ▪ Consequently, a similar result is observed for the fused network
• Most nodes have high degree in either CSN or OSN, but not both
  ▪ 3 nodes have high degree in both CSN and OSN (social butterflies)
• Increased average degree means people meet different sets of contacts in the two source networks
Connectivity

                                            OSN    CSN     Fused   Weighted
Number of edges                             165    196     310     N
Size of LCC (largest connected component)   63     84      98      N
Diameter of LCC                             7      8       7       N
Average length of shortest path             12.3   21.98   8.77    Y

• CSN contributes 27% more edges than OSN
• Compared to OSN, CSN has 55% more connected people
• Almost all people are connected in the fused network
• Average weighted shortest path is reduced in the fused network
  ▪ Stronger social connectivity: a reason to leverage it in social apps
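The unweighted rows of the table (size and diameter of the LCC) can be computed with breadth-first search. The sketch below is a generic illustration on a toy graph, not the authors' tooling; the adjacency-dictionary layout and function names are assumptions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def largest_component(adj):
    """Node set of the largest connected component (LCC)."""
    seen, best = set(), set()
    for node in adj:
        if node not in seen:
            component = set(bfs_distances(adj, node))
            seen |= component
            if len(component) > len(best):
                best = component
    return best


def diameter(adj, component):
    """Longest shortest path, in hops, between two nodes of a component."""
    return max(max(bfs_distances(adj, n).values()) for n in component)


# Toy graph: a path a-b-c-d, a separate pair e-f, and an isolated node g.
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"},
       "e": {"f"}, "f": {"e"}, "g": set()}
lcc = largest_component(toy)
lcc_diameter = diameter(toy, lcc)
```

The weighted average shortest path in the last row would need a weighted search such as Dijkstra's algorithm, and a rule for turning tie strength into edge length that the slides do not specify.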
Betweenness centrality and cluster coefficient

                                          OSN     CSN     Fused   Weighted
Average weighted betweenness              49.1    90.13   94.83   Y
Average length of shortest path           12.3    21.98   8.77    Y
Average edge weight                       3.02    3.64    1.95    Y
Average weighted clustering coefficient   0.156   0.122   0.157   Y

• CSN has a much longer average shortest path than OSN
  ▪ Hence, average betweenness is high
• In the fused network, the average shortest path is low, but betweenness is highest
  ▪ Social centrality is improved
• Average edge weight shows that people interact more in real life than online
• A person who is highly socially active online is not necessarily highly socially active in real life
  ▪ Thus, smaller values in the fused network
• OSN has higher cohesiveness
  ▪ People become friends when sharing common friends
• OSN contributes more to the fused network
Analysis of local network parameters: node, edge, community
Similarity of node degree and edge weight
• Calculate the Euclidean distance of the degree vector (104 nodes) and the shared edge weight vector (51 edges)
• Similarity is the inverse of distance

                       Distance(OSN, CSN)   Distance(OSN, fused)   Distance(CSN, fused)
Weighted node degree   0.558                0.306                  0.256
Node degree            0.399                0.305                  0.225
Edge weight            0.560                0.324                  0.295

• CSN is more similar to the fused network
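The distance used in the table can be sketched as a plain Euclidean distance between two vectors indexed by the same nodes (or the same shared edges). The reported values are below 1, which suggests the vectors were normalized first; the slides do not say how, so the sketch applies no normalization. Function names and the toy vectors are hypothetical.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def similarity(x, y):
    """Inverse of distance; identical vectors are treated as infinitely similar."""
    d = euclidean_distance(x, y)
    return math.inf if d == 0 else 1.0 / d


# Toy degree vectors for the same four users in each network.
osn_degree = [3, 0, 2, 1]
csn_degree = [1, 4, 2, 1]
fused_degree = [3, 4, 3, 2]

d_osn_fused = euclidean_distance(osn_degree, fused_degree)
d_csn_fused = euclidean_distance(csn_degree, fused_degree)
```

In this toy case the CSN vector is closer to the fused vector than the OSN vector is, which is the same qualitative pattern as in the table.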
Computation of community similarity
• How to quantify community similarity across networks?
  ▪ Few communities are the same
  ▪ Better to quantify community overlapping
• Compute k-clique overlapping clusters on the three networks separately
• Use the community overlapping matrix to compute the distance between networks (inverse of similarity)
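The k-clique step can be illustrated with a brute-force clique percolation sketch: enumerate all k-cliques, then merge cliques that share k-1 nodes. This is suitable only for toy graphs and is not the authors' implementation. The overlap matrix shown here simply counts shared members between communities of two networks; the exact matrix and the distance derived from it are not defined on the slides, so the sketch stops at the matrix.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Clique percolation: communities are unions of k-cliques that are
    chained together through shared (k-1)-node overlaps."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Brute-force enumeration of k-cliques (exponential; toy graphs only).
    cliques = [frozenset(c) for c in combinations(sorted(adj), k)
               if all(b in adj[a] for a, b in combinations(c, 2))]
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Two distinct k-cliques are adjacent when they share exactly k-1 nodes.
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return [frozenset(g) for g in groups.values()]


def overlap_matrix(comms_a, comms_b):
    """Entry [i][j] counts members shared by community i of network A
    and community j of network B (an assumed definition)."""
    return [[len(a & b) for b in comms_b] for a in comms_a]


# Toy network: two triangles sharing an edge, plus a third triangle
# attached by a single bridge edge (4, 5).
net_a = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5),
         (5, 6), (5, 7), (6, 7)]
net_b = [(1, 2), (1, 3), (2, 3)]
comms_a = k_clique_communities(net_a, 3)
comms_b = k_clique_communities(net_b, 3)
matrix = overlap_matrix(comms_a, comms_b)
```

Raising k demands denser groups, which is why the slides treat k=5 communities as stronger than k=3 communities.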
Community similarity

                    K=3    K=4   K=5
Dist(OSN, fused)    2561   142   26.5
Dist(CSN, fused)    2289   135   32.0

• Fused network has a larger average community size than OSN and CSN (fused=6.1, CSN=4.9, OSN=5.2)
• CSN is closer to the fused network for weaker communities (k=3,4)
• OSN is closer to the fused network for stronger communities (k=5)
• OSN contributes stronger social communities than CSN
Conclusions
• CSN and OSN represent two different classes of social engagement
• Applications may benefit from a fused network that merges CSN and OSN
  ▪ CSN increases the fused network's connectivity and communication strength
  ▪ OSN strengthens the community structure and lowers the average path length of the fused network
  ▪ Typical example: friend-of-friend apps
Mobius project
Decentralized two-tier infrastructure for mobile social computing
• P2P tier
  ▪ Collects on-line social information
  ▪ Manages social state
  ▪ Runs user-deployed services to support mobile apps
  ▪ Dynamically adapts to geo-social context
    ▪ Energy-efficiency, scalability, reliability
• Mobile tier
  ▪ Runs mobile applications
  ▪ Collects geo-social information from phones
Application scenario: community multimedia sharing system
Thank you!
Acknowledgment: NSF Grant CNS-0831753
http://www.cs.njit.edu/~borcea/mobius/
Related work
• Kostakos [2010]
  ▪ The networks are very sparse
  ▪ Co-presence social ties are based on only one meeting
  ▪ Does not consider user interaction (edge weight)
  ▪ There is no proper noise reduction
• Eagle [2009], Cranshaw [2010]
  ▪ Focused on using co-presence data to predict friendship
• Mtibaa [2008]
  ▪ Concludes that the two graphs are similar
  ▪ Conference over a single day
  ▪ These results cannot be generalized
Power law distribution
• Node degrees in large-scale real-world social networks often follow a power-law distribution
  ▪ A few nodes have high degree, and many others have low degree