Local/Global Term Analysis for Discovering Community Differences in Social Networks David Fuhry, Yiye Ruan, and Srinivasan Parthasarathy Data Mining Research Laboratory Dept. of Computer Science and Engineering The Ohio State University


Local/Global Term Analysis for Discovering Community Differences in Social Networks

David Fuhry, Yiye Ruan, and Srinivasan Parthasarathy

Data Mining Research Laboratory
Dept. of Computer Science and Engineering

The Ohio State University


Communities in Social Networks

Observations:
• Social networks consist of many interacting communities of users.
• Each community can be characterized by the content which its members generate.

Motivating questions:
• Given a community, how can we determine what its members are talking about, relative to the entire social network?
• Given two communities, how can we determine the difference between them?


Methodology

• A community’s users mention relevant terms frequently.
• Many works look at #hashtags or most frequent terms.
• But not all frequent terms are relevant.
• Desiderata:
  – Consider all content terms
  – Interpretable
  – Scalable to million-user social networks


Four-step Process

• Four-step process for determining community differences:
  – Community Discovery
  – Term Extraction & Aggregation
  – Visualization
  – Handling Time Varying Data

[Diagram labels: Network, Content]


1. Community Discovery (I)

• Keyword-search-based identification of candidate users
• Extract underlying network of users
• Local community identification
  – Graph clustering (e.g. METIS [KARYPIS’99], Graclus [DHILLON’07], MLR-MCL [SATULURI’09], Localized Clustering (L-Spar) [SATULURI’11])
  – Modularity [NEWMAN’04]
  – Content-Sensitive Viewpoint Neighborhoods [Asur’09]
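
As an illustration of the modularity criterion listed above, here is a minimal sketch that scores a given partition of a small undirected, unweighted graph. The function name, the edge-list input, and the toy graph are assumptions made for this example; the slides do not specify an implementation.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition.

    edges: undirected edges as (u, v) pairs, each listed once, no self-loops.
    community_of: dict mapping every node to a community label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # summed degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (internal edge share - expected share)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Toy graph (illustrative only): two triangles joined by one bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community_of = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(edges, community_of)
```

A higher Q means more edges fall inside communities than a degree-preserving random graph would predict; placing every node in one community gives Q = 0.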


1. Community Discovery (II)

• Start with the network of all users
• Extract candidate communities
  – Using any community discovery algorithm
• Filter candidate communities by keyword strength
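
The filtering step above might look like the sketch below. The slides do not define "keyword strength", so this example assumes it is the fraction of a community's messages that contain at least one keyword. The function names, the threshold, and the sample data are all hypothetical.

```python
def keyword_strength(messages, keywords):
    """Assumed definition: share of messages containing at least one keyword."""
    if not messages:
        return 0.0
    keywords = {k.lower() for k in keywords}
    hits = sum(1 for text in messages if keywords & set(text.lower().split()))
    return hits / len(messages)

def filter_communities(communities, messages_by_user, keywords, min_strength):
    """Keep only communities whose pooled messages reach min_strength."""
    kept = {}
    for name, users in communities.items():
        pooled = [msg for u in users for msg in messages_by_user.get(u, [])]
        if keyword_strength(pooled, keywords) >= min_strength:
            kept[name] = users
    return kept

# Illustrative data only.
communities = {"c1": ["ann", "bob"], "c2": ["cat", "dan"]}
messages_by_user = {
    "ann": ["new nikon lens today", "lunch"],
    "bob": ["nikon d800 review"],
    "cat": ["great coffee"],
    "dan": ["traffic again", "nikon ad"],
}
kept = filter_communities(communities, messages_by_user, ["nikon"], 0.5)
```

Here c1 mentions the keyword in 2 of 3 messages and is kept, while c2 (1 of 3) falls below the assumed 0.5 threshold.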


2. Term Extraction & Aggregation

• Extract terms from each message and weight them
  – Term Frequency
  – TF/IDF
  – Domain-dependent semantic importance
• Merge terms
  – Combine synonyms
  – Handle hypernyms
• Aggregate terms by user
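
A rough sketch of the extraction and aggregation step, under several assumptions: a simple regex tokenizer, a hand-written synonym table (hypothetical), and TF-IDF computed with each user treated as one document. Hypernym handling and domain-dependent weights are not covered here.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical synonym table: maps variant terms to one canonical term.
SYNONYMS = {"pic": "photo", "pics": "photo", "photos": "photo"}

def extract_terms(text):
    """Lowercase, tokenize on word characters and '#', then merge synonyms."""
    tokens = re.findall(r"[#\w]+", text.lower())
    return [SYNONYMS.get(t, t) for t in tokens]

def aggregate_by_user(messages):
    """messages: iterable of (user, text). Returns {user: Counter of term counts}."""
    per_user = defaultdict(Counter)
    for user, text in messages:
        per_user[user].update(extract_terms(text))
    return dict(per_user)

def tf_idf(per_user):
    """Weight each count by idf = log(N / df), with one 'document' per user."""
    n = len(per_user)
    df = Counter(term for counts in per_user.values() for term in counts)
    return {
        user: {t: c * math.log(n / df[t]) for t, c in counts.items()}
        for user, counts in per_user.items()
    }

# Illustrative data only.
messages = [
    ("ann", "New photos from my Nikon"),
    ("ann", "Nikon pics again"),
    ("bob", "Coffee and photos"),
]
per_user = aggregate_by_user(messages)
weights = tf_idf(per_user)
```

In this toy data "photo" is used by every user, so its TF-IDF weight drops to zero, while "nikon" keeps a positive weight for ann.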


3. Visualization

• Plot terms by frequency across two axes.
  – Global (all users) on Y-axis
  – Local (community users) on X-axis
• Terms on the regression line are equifrequent in both groups
• Terms off the regression line are relatively more frequent in one group
• Support for multiple scales of local community identification
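
The local/global comparison above can be sketched numerically, without the plot. This example assumes relative term frequencies on both axes and an ordinary least-squares line; the slides do not say how frequencies are scaled or how the regression is fit, and the counts are made up.

```python
from collections import Counter

def relative_freq(counts):
    """Convert raw term counts to shares of the total."""
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Needs at least two distinct x values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def local_global_residuals(local_counts, global_counts):
    """Vertical distance of each shared term from the line (x = local, y = global).

    Negative: below the line, relatively more frequent in the community.
    Positive: above the line, relatively more frequent globally.
    """
    local = relative_freq(local_counts)
    glob = relative_freq(global_counts)
    terms = sorted(set(local) & set(glob))
    xs = [local[t] for t in terms]
    ys = [glob[t] for t in terms]
    intercept, slope = fit_line(xs, ys)
    return {t: y - (intercept + slope * x) for t, x, y in zip(terms, xs, ys)}

# Illustrative counts only.
global_counts = Counter({"photo": 400, "lens": 100, "health": 300, "blog": 200})
local_counts = Counter({"photo": 40, "lens": 30, "health": 5, "blog": 25})
residuals = local_global_residuals(local_counts, global_counts)
```

Sorting terms by residual gives a ranked list of what the community over- or under-discusses relative to everyone else; in the toy data "lens" falls below the line and "health" above it.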


4. Handling Time Varying Data

• Time range divided into batches
• Perform steps 1 to 3 for each batch
• Visualize results
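
One possible way to divide the time range into batches is sketched below. Equal-width windows, numeric timestamps, and the (timestamp, user, text) tuple layout are assumptions; the slides do not say how batches are chosen.

```python
from collections import defaultdict

def batch_messages(messages, start, end, n_batches):
    """Split (timestamp, user, text) messages into equal-width windows over [start, end)."""
    width = (end - start) / n_batches
    batches = defaultdict(list)
    for ts, user, text in messages:
        if start <= ts < end:
            # Clamp to the last batch to guard against float rounding at the edge.
            index = min(int((ts - start) // width), n_batches - 1)
            batches[index].append((user, text))
    return [batches.get(i, []) for i in range(n_batches)]

# Illustrative data only: timestamps are plain numbers.
messages = [
    (0, "ann", "a"),
    (24, "bob", "b"),
    (25, "ann", "c"),
    (99, "cat", "d"),
    (100, "dan", "e"),  # outside [0, 100), dropped
]
batches = batch_messages(messages, 0, 100, 4)
sizes = [len(b) for b in batches]
```

Each batch can then be passed through steps 1 to 3 independently, yielding one local/global comparison per time window.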


Experimental Results

Using a dataset of 1M tweets, we look at groups discussing Canon, Nikon, and Olympus cameras.

Between the Nikon and Olympus communities, the Olympus community talks more about blogs.


Experimental Results

Between the camera and global communities, the camera community talks less about health, teeth, and success.


Experimental Results

Using a dataset of 2M tweets about the “Occupy” movement, we compare “Occupy Oakland” to the entire “Occupy” movement.

The Occupy Oakland movement talks less about NYPD, p2 (a group of progressives using social media), and tcot (“Top Conservatives On Twitter”).


Filter and Zoom


Conclusions

• Four-part visual analytic framework for discovering differences between communities in social networks.
  – Simple
  – Scalable
• Qualitative and quantitative results.
• Future
  – Temporal
  – More quantitative measures
  – Automatically determine best scale


Thank You!