23
MCD: Mutual Community Detection across Multiple Social Networks Philip S. Yu 1,2 and Jiawei Zhang 1 1 University of Illinois at Chicago 2 Tsinghua University

MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

MCD: Mutual Community Detection across Multiple Social

Networks

Philip S. Yu1,2 and Jiawei Zhang1

1University of Illinois at Chicago 2Tsinghua University

Page 2: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Users use multiple social networks simultaneously

Locations

Tips

8 AM 12 PM 4 PM 8 PM 11 PM

User Accounts

Temporal Activities

User Accounts

Locationslocate

locate

Tweets

8 AM 12 PM 4 PM 8 PM 11 PM

Temporal Activities

Partially Aligned Social Networks

anchor users

anchor links

non-anchor users

Page 3: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Problem Studied: Simultaneous Community Detection of Multiple Partially Aligned Networks

Locations

Tips

8 AM 12 PM 4 PM 8 PM 11 PM

User Accounts

Temporal Activities

User Accounts

Locationslocate

locate

Tweets

8 AM 12 PM 4 PM 8 PM 11 PM

Temporal Activitiescommunities in social networks

Page 4: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Motivations: Mutual community detection

Locations

Tips

8 AM 12 PM 4 PM 8 PM 11 PM

User Accounts

Temporal Activities

User Accounts

Locationslocate

locate

Tweets

8 AM 12 PM 4 PM 8 PM 11 PM

Temporal Activities

isolated user: cannot be handled in traditional

community detection methods

transfer

Page 5: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Locations

Tips

8 AM 12 PM 4 PM 8 PM 11 PM

User Accounts

Temporal Activities

User Accounts

Locationslocate

locate

Tweets

8 AM 12 PM 4 PM 8 PM 11 PM

Temporal Activities

community boundary users: hard to be partitioned in traditional community detection methods

they are not connected

transfer

Motivations: Mutual community detection

Page 6: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Challenges• Challenge 1: Similarity measure among users with

heterogeneous information in social networks

• Challenge 2: Community detection in each network

• Challenge 3: Mutual community detection

Page 7: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Challenge 1: Similarity Measure among Users with Heterogeneous Information

Page 8: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Solution: Social Meta Path based Similarity Measure

Similarity score between users x and y based on meta paths 1-7

ymeta path

Page 9: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Challenge 2: Community detection in each heterogeneous network

• Solution: Partition users into different clusters based on certain cost function, e.g., normalized-cut

S(u, v) is the meta path based similarity between u and v

Page 10: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

• Solution: transfer information about the anchor users across networks based on Clustering Discrepancy

Challenge 3: Mutual community detection

Normalized Discrepancy is used, Definition refer to the paper

Page 11: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Example: Discrepancy

community detection discrepancy

where

Page 12: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

• The optimal Community Detection of networks G(1) and G(2) can be represented as

sensitivity analysis of θ is available in the experiments

Mutual Community Detection Objective Function

Page 13: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Mutual Community Detection Objective Function

solution: fix one variable, update the other variable until convergence

Convergence analysis is available in the experiments

Objective Equation of Normalized-Cut in Network G(1)

Objective Equation of Normalized-Cut in Network G(2)

Objective Equation of Normalized-Discrepancy

between Networks G(1) and G(2)

Page 14: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Experiments• Dataset: Foursquare &Twitter (Anchor Links: 3,388)

Page 15: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Experiments• Comparison Methods

• MCD: the method proposed in this paper

• SI-Clus: Social Influence based multi-network clustering methods [34][37]

• NCUT: partition users by minimizing the normalized-cut cost calculated with social links[23]

• KMeans: extend traditional KMeans to partition users based on the social link information only[22]

[34] J. Zhang and P. Yu. Community detection for emerging networks. In SDM, 2015. [37] Y. Zhou and L. Liu. Social influence based clustering of heterogeneous information networks. In KDD, 2013. [22] G. Qi, C. Aggarwal, and T. Huang. Community detection with edge content in social media networks. In ICDE, 2012.[23] J. Shi and J. Malik. Normalized cuts and image segmentation. TPAMI, 2000.

Page 16: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Experiments• Evaluation Metrics

• Clustering Quality Metrics • (1) normalized-dbi, (2) entropy,

• (3) density, (4) Silhouette index

• Clustering Consensus Metrics • (1) rand, (2) variation of information,

• (3)mutual information, (4) normalized mutual information

Page 17: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Community Detection Quality in FoursquarePartial Alignment of Foursquare and Twitter

Page 18: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Community Detection Quality in Twitter

Page 19: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Consensus of the Community Detection Results between Foursquare and Twitter

Page 20: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Convergence AnalysisTo address the objective function: iterative updating variables

both X(1) and X(2) can converge within 200 iterations

Page 21: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Parameter AnalysisIn the objective function: θ is the weight of the discrepancy term

higher ndbi: better quality; lower rand: more consensus

smaller θ: favors high quality results in each network

larger θ favors consensus results between networks

Page 22: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Summary• Problem Studied: Mutual Community Detection across

Partially Aligned Social Networks

• Proposed Method:

• Social Meta Path based Similarity Measure among users

• Normalized-Cut based Isolated Community Detection

• Normalized-Discrepancy based Mutual Community Detection

Page 23: MCD: Mutual Community Detection across Multiple Social ...jzhang2/files/2015_bigdata... · users based on the social link information only[22] [34] J. Zhang and P. Yu. Community detection

Q&A