45
Rise and Fall Patterns of Information Diffusion: Model and Implications Yasuko Matsubara (Kyoto University), Yasushi Sakurai (NTT), B. Aditya Prakash (CMU), Lei Li (UCB), Christos Faloutsos (CMU) KDD 2012 1 Y. Matsubara et al.

Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Rise and Fall Patterns of Information Diffusion: Model and Implications �Yasuko Matsubara (Kyoto University), Yasushi Sakurai (NTT), B. Aditya Prakash (CMU), Lei Li (UCB), Christos Faloutsos (CMU) �

KDD 2012 1 Y. Matsubara et al.

Page 2: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Motivation �

Social media facilitate faster diffusion of news and rumors

KDD 2012 2

Q: How do news and rumors spread in social media? �

Y. Matsubara et al.

Page 3: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

News spread in social media �

MemeTracker [Leskovec et al. KDD’09] short phrases sourced from U.S. politics in 2008

KDD 2012 3

20 40 60 80 100 120 140 1600

100

200

Time (hours)

# of

men

tions

“you can put lipstick on a pig” (# of mentions in blogs)

“yes we can”

Y. Matsubara et al.

20 40 60 80 100 120 140 1600

50

100

Time (hours)

# of

men

tions

(per hour, 1 week)

Page 4: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

News spread in social media �

MemeTracker [Leskovec et al. KDD’09] short phrases sourced from U.S. politics in 2008

KDD 2012 4

20 40 60 80 100 120 140 1600

100

200

Time (hours)

# of

men

tions

“you can put lipstick on a pig” (# of mentions in blogs)

“yes we can”

Y. Matsubara et al.

20 40 60 80 100 120 140 1600

50

100

Time (hours)

# of

men

tions

Breaking news�

Decay �News

spread

(per hour, 1 week)

Page 5: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Rise and fall patterns in social media�

Twitter (# of hashtags per hour)

Google trend (# of queries per week)

KDD 2012 5

0 20 40 60 80 10020406080

100120

Time (weeks)

Valu

e

10 20 30 40 50 600

50

100

Time

Valu

e

0 50 100 1500

100

200

Time

Valu

e

0 50 100 1500

1000

2000

Time

Valu

e

Y. Matsubara et al.

“#assange” “#stevejobs”

“harry potter” (2010 - 2011) “tsunami” (in 2005)

(per hour, 1week) (per hour, 1 week)

(per week, 1 year) (per week, 2 years)

Page 6: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

Rise and fall patterns in social media�

KDD 2012 6

How many patterns are there? -Earlier work claims there’re several classes •  four classes on YouTube [Crane et al. PNAS’08] •  six classes on Media [Yang et al. WSDM’11]

Y. Matsubara et al.

Page 7: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Rise and fall patterns in social media�

KDD 2012 7

Q. How many classes are there after all?

A. Our answer is “ONE”!

We can represent all patterns by single model

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

Y. Matsubara et al.

Page 8: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Outline �

KDD 2012 8

-  Motivation -  Problem definition -  Proposed method -  Experiments -  Discussion – SpikeM at work -  Conclusions �

Y. Matsubara et al.

Page 9: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Problem definition �

KDD 2012 9

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

Goal: predict/model social activity �

Given:

-  Network of bloggers/users -  External shock/event -  Quality of the event β Find:

-  How blogging activity will evolve over time

Problem 1 (What-if?)�β

Y. Matsubara et al.

Page 10: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Problem definition �

KDD 2012 10

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

Goal: predict/model social activity �

Given:

-  Behavior of spikes Find:

-  Equation/model that can explain them, e.g., -  # of potential bloggers -  Strength of external shock -  Quality of the event β

Epidemic process by word-of-mouth

Problem 2 (Model design) �

β

Y. Matsubara et al.

Page 11: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Outline �

KDD 2012 11

-  Motivation -  Problem definition -  Proposed method -  Experiments -  Discussion – SpikeM at work -  Conclusions �

Y. Matsubara et al.

Page 12: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Proposed method�

SpikeM capture 3 properties of real spike

KDD 2012 12

20 40 60 80 100 120 140 1600

100

200

Time (hours)

# of

men

tions

1. periodicities

Y. Matsubara et al.

Page 13: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Proposed method�

SpikeM capture 3 properties of real spike

KDD 2012 13

20 40 60 80 100 120 140 1600

100

200

Time (hours)

# of

men

tions

2. avoid infinity

Y. Matsubara et al.

1. periodicities

Page 14: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Proposed method�

SpikeM capture 3 properties of real spike

KDD 2012 14

20 40 60 80 100 120 140 1600

100

200

Time (hours)

# of

men

tions

3. power-law fall

Y. Matsubara et al.

1. periodicities 2. avoid infinity

Page 15: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Proposed method�

SpikeM capture 3 properties of real spike

KDD 2012 15

20 40 60 80 100 120 140 1600

100

200

Time (hours)

# of

men

tions

3. power-law fall

SpikeM capture behavior of real spikes using few parameters

Y. Matsubara et al.

1. periodicities 2. avoid infinity

Page 16: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Main idea (details) �-  1. Un-informed bloggers (clique of N bloggers/nodes)

KDD 2012 16

Time n=0

Y. Matsubara et al.

Nodes (bloggers) consist of two states

- Un-informed of rumor

-  informed, and Blogged about rumor

U

B

Page 17: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Main idea (details) �-  1. Un-informed bloggers (clique of N bloggers/nodes)

-  2. External shock at time nb (e.g, breaking news)

KDD 2012 17

Time n=0 Time n=nb

Y. Matsubara et al.

External shock

-  Event happened at time -  bloggers are informed, blog about news

nb

Sb

Sb

Page 18: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Main idea (details) �-  1. Un-informed bloggers (clique of N bloggers/nodes)

-  2. External shock at time nb (e.g, breaking news)

-  3. Infection (word-of-mouth effects)

KDD 2012 18

Time n=0 Time n=nb Time n=nb+1

β

Y. Matsubara et al.

Infectiveness of a blog-post

-  Strength of infection (quality of news) -  Decay function (how infective a blog posting is)

!f (n)

Page 19: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Main idea (details) �-  1. Un-informed bloggers (clique of N bloggers/nodes)

-  2. External shock at time nb (e.g, breaking news)

-  3. Infection (word-of-mouth effects)

KDD 2012 19

Time n=0 Time n=nb Time n=nb+1

β

Y. Matsubara et al.

Infectiveness of a blog-post

-  Strength of infection (quality of news) -  Decay function (how infective a blog posting is)

!f (n)

Decay function: �

f (n) = ! *n!1.5

f (n)

n

Linear scale f (n)

n

Log scale

-1.5

Page 20: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

SpikeM-base (details) �Equations of SpikeM (base)

KDD 2012 20

-  Total population of available bloggers -  Strength of infection/news -  External shock at birth (time ) -  Background noise

nbSb

N

nb,Sb

!

!

Y. Matsubara et al.

!B(n+1) =U(n) " (!B(t)+ S(t)) " f (n+1# t)+!t=nb

n

$

U(n+1) =U(n)!"B(n+1)Un-informed

Blogged

Page 21: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

SpikeM - with periodicity (details) �Full equation of SpikeM

KDD 2012 21 Y. Matsubara et al.

U(n+1) =U(n)!"B(n+1)

Un-informed

Blogged

!B(n+1) = p(n+1) " U(n) " (!B(t)+ S(t)) " f (n+1# t)+!t=nb

n

$%

&''

(

)**Periodicity

12pm Peak activity 3am

Low activity

Time n

activity

p(n)Bloggers change their

activity over time (e.g., daily, weekly, yearly)�

Page 22: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Learning parameters -  Given a real time sequence -  Minimize the error (Levenberg-Marquardt (LM) fitting)

Model fitting (Details) �

SpikeM consists of 7 parameters

KDD 2012 22

! = {N,",nb,Sb,#,Pa,Ps}

X = {X(1),...X(n),...,X(nd )}

D(X,! ) = (X(n)!"B(n))2n=1

nd

#

Y. Matsubara et al.

Page 23: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Analysis �

SpikeM matches reality exponential rise and power-raw fall

KDD 2012 23

Y. Matsubara et al.

rise fall

SpikeM vs. SI model (susceptible infected model)

Page 24: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Analysis �

KDD 2012 24

Linear-log

Log-log

Rise-part SpikeM: exponential SI model: exponential

Y. Matsubara et al.

rise fall

Reverse x-axis�

Page 25: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Analysis �

KDD 2012 25

Linear-log

Log-log

Y. Matsubara et al.

rise fall

Fall-part SpikeM: power law SI model: exponential

SpikeM matches reality

Page 26: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Outline �

KDD 2012 26

-  Motivation -  Problem definition -  Proposed method -  Experiments -  Discussion – SpikeM at work -  Conclusions �

Y. Matsubara et al.

Page 27: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Experiments�

We answer the following questions…

KDD 2012 27

Q1. Match real spikes - Q1-1: K-SC clusters - Q1-2: MemeTracker - Q1-3: Twitter - Q1-4: Google trend

Q2. Forecast future patterns

Y. Matsubara et al.

Page 28: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Q1-1 Explaining K-SC clusters�

KDD 2012 28

Six patterns of K-SC [Yang et al. WSDM’11]

SpikeM can generate all patterns in K-SC

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

20 40 60 80 100 1200

50

100

Time

Valu

e

OriginalSpikeM

Y. Matsubara et al.

Page 29: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Q1-2 Matching MemeTracker patterns �

KDD 2012 29

MemeTracker (memes in blogs) [Leskovec et al. KDD’09]

SpikeM can fit various patterns in blog

Linear scale

Log scale

Y. Matsubara et al.

Noise-robust

fitting�Outliers

Page 30: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Q1-3 Matching Twitter data�

KDD 2012 30

Twitter data (hashtags)

SpikeM can generate various patterns in social media

Y. Matsubara et al.

Linear scale

Log scale

Page 31: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Q1-4 Matching Google trend data�

KDD 2012 31

Volume of searches for queries on Google

SpikeM can capture various patterns

Y. Matsubara et al.

Page 32: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Q2 Tail-part forecasts�

KDD 2012 32

- Given a first part of the spike - forecast the tail part

SpikeM can capture tail part (AR: fail)

Y. Matsubara et al.

Page 33: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Outline �

KDD 2012 33

-  Motivation -  Problem definition -  Proposed method -  Experiments -  Discussion – SpikeM at work -  Conclusions �

Y. Matsubara et al.

Page 34: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

SpikeM at work �

SpikeM is capable of various applications

KDD 2012 34

-  A1. What-if forecasting

-  A2. Outlier detection

-  A3. Reverse engineering

Y. Matsubara et al.

Page 35: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

A1. “What-if” forecasting �

KDD 2012 35

Forecast not only tail-part, but also rise-part!

e.g., given (1) first spike, (2) release date of two sequel movies (3) access volume before the release date

? � ? �

(1) First spike (2) Release date (3) Two weeks before release

Y. Matsubara et al.

Page 36: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

A1. “What-if” forecasting �

KDD 2012 36

Forecast not only tail-part, but also rise-part!

SpikeM can forecast upcoming spikes

Y. Matsubara et al.

(1) First spike (2) Release date (3) Two weeks before release

Page 37: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

A2. Outlier detection �

KDD 2012 37

Fitting result of “tsunami (Google trend)” in log-log scale

Y. Matsubara et al.

Another earthquake �

One year after Indian Ocean earthquake �Indian Ocean

earthquake �

Page 38: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

A3. Reverse engineering �

KDD 2012 38

SpikeM provide an intuitive explanation PDF of parameters over 1,000 memes/hashtags

Y. Matsubara et al.

Meme�

Twitter�

Page 39: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

A3. Reverse engineering �

KDD 2012 39

SpikeM provide an intuitive explanation PDF of parameters over 1,000 memes/hashtags

Y. Matsubara et al.

Observation 1 Total population N is

almost same

N =1,000 ~ 2,000

Meme�

Twitter�

Page 40: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

A3. Reverse engineering �

KDD 2012 40

SpikeM provide an intuitive explanation PDF of parameters over 1,000 memes/hashtags

Y. Matsubara et al.

Observation 2 Strength of first burst (news) is

! *N =1.0

Meme�

Twitter�

Page 41: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

A3. Reverse engineering �

KDD 2012 41

SpikeM provide an intuitive explanation PDF of parameters over 1,000 memes/hashtags

Y. Matsubara et al.

Observation 3 Daily periodicity with phase shift

Every meme has the same periodicity without lag

Ps = 0

(Twitter) Daily periodicity with

more spread in (i.e., Multiple time zone)

Ps

Meme�

Twitter�

Page 42: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Outline �

KDD 2012 42

-  Motivation -  Background -  Proposed method -  Experiments -  Discussion – SpikeM at work -  Conclusions �

Y. Matsubara et al.

Page 43: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Conclusions�

KDD 2012 43

SpikeM has following advantages:

•  Unification power It includes earlier patterns/models

•  Practicality: It Matches real datasets

•  Parsimony It requires only 7 parameters

•  Usefulness: What-if scenarios, outliers, etc.

Y. Matsubara et al.

Page 44: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Acknowledgements

Thanks Jaewon Yang & Jure Leskovec for the six clusters [WSDM’11]

Funding

44  KDD 2012 Y. Matsubara et al.

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

0 50 1000

50

100

Page 45: Rise and Fall Patterns of Information Diffusion: Model and ...yasuko/PUBLICATIONS/...Rise and fall patterns in social media Twitter (# of hashtags per hour) Google trend (# of queries

Thank you�

KDD 2012 45 Y. Matsubara et al.

Code: http://www.kecl.ntt.co.jp/csl/sirg/people/yasuko/software.html

Email: matsubara.yasuko lab.ntt.co.jp

Yasushi Sakurai

Yasuko Matsubara

B. Aditya Prakash

Lei Li Christos Faloutsos