View
917
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Professor Christos Faloutsos (Carnegie Mellon University, USA), gave a lecture on "Influence Propagation in Large Graphs - Theorems and Algorithms" in the Distinguished Lecturer Series - Leon The Mathematician.
Citation preview
Influence propagation in large graphs -
theorems and algorithmsB. Aditya Prakash
http://www.cs.cmu.edu/~badityap
Christos Faloutsoshttp://www.cs.cmu.edu/~christos
Carnegie Mellon UniversityA.U.T.’12
Thank you!
• Yannis Manolopoulos
AUT '12 Prakash and Faloutsos 2012 2
Networks are everywhere!
Human Disease Network [Barabasi 2007]
Gene Regulatory Network [Decourty 2008]
Facebook Network [2010]
The Internet [2005]
Prakash and Faloutsos 2012 3AUT '12
Dynamical Processes over networks are also everywhere!
Prakash and Faloutsos 2012 4AUT '12
Why do we care?• Information Diffusion• Viral Marketing• Epidemiology and Public Health• Cyber Security• Human mobility • Games and Virtual Worlds • Ecology• Social Collaboration........
Prakash and Faloutsos 2012 5AUT '12
Why do we care? (1: Epidemiology)
• Dynamical Processes over networks[AJPH 2007]
CDC data: Visualization of the first 35 tuberculosis (TB) patients and their 1039 contacts
Diseases over contact networks
Prakash and Faloutsos 2012 6AUT '12
Why do we care? (1: Epidemiology)
• Dynamical Processes over networks
• Each circle is a hospital• ~3000 hospitals• More than 30,000 patients transferred
[US-MEDICARE NETWORK 2005]
Problem: Given k units of disinfectant, whom to immunize?
Prakash and Faloutsos 2012 7AUT '12
Why do we care? (1: Epidemiology)
CURRENT PRACTICE OUR METHOD
~6x fewer!
~6x fewer!
[US-MEDICARE NETWORK 2005]
Prakash and Faloutsos 2012 8AUT '12Hospital-acquired inf. took 99K+ lives, cost $5B+ (all per year)
Why do we care? (2: Online Diffusion)
> 800m users, ~$1B revenue [WSJ 2010]
~100m active users
> 50m users
Prakash and Faloutsos 2012 9AUT '12
Why do we care? (2: Online Diffusion)
• Dynamical Processes over networks
Celebrity
Buy Versace™!
Followers
Social Media MarketingPrakash and Faloutsos 2012 10AUT '12
High Impact – Multiple Settings
Q. How to squash rumors faster?
Q. How do opinions spread?
Q. How to market better?
epidemic out-breaks
products/viruses
transmit s/w patches
Prakash and Faloutsos 2012 11AUT '12
Research Theme
DATALarge real-world
networks & processes
ANALYSISUnderstanding
POLICY/ ACTIONManaging
Prakash and Faloutsos 2012 12AUT '12
In this talk
ANALYSISUnderstanding
Given propagation models:
Q1: Will an epidemic happen?
Prakash and Faloutsos 2012 13AUT '12
In this talk
Q2: How to immunize and control out-breaks better?
POLICY/ ACTIONManaging
Prakash and Faloutsos 2012 14AUT '12
Outline
• Motivation• Epidemics: what happens? (Theory)• Action: Who to immunize? (Algorithms)
Prakash and Faloutsos 2012 15AUT '12
A fundamental questionStrong Virus
Strong Virus
Epidemic?Epidemic?
Prakash and Faloutsos 2012 16AUT '12
example (static graph)
Weak VirusWeak Virus
Epidemic?Epidemic?
Prakash and Faloutsos 2012 17AUT '12
Problem Statement
Find, a condition under which– virus will die out exponentially quickly– regardless of initial infection condition
above (epidemic)
below (extinction)
# Infected
time
Separate the regimes?
Prakash and Faloutsos 2012 18AUT '12
Threshold (static version)
Problem Statement• Given:
–Graph G, and –Virus specs (attack prob. etc.)
• Find: –A condition for virus extinction/invasion
Prakash and Faloutsos 2012 19AUT '12
Threshold: Why important?
• Accelerating simulations• Forecasting (‘What-if’ scenarios)• Design of contagion and/or topology• A great handle to manipulate the spreading
– Immunization– Maximize collaboration…..
Prakash and Faloutsos 2012 20AUT '12
Outline
• Motivation• Epidemics: what happens? (Theory)
– Background– Result (Static Graphs)– Proof Ideas (Static Graphs)– Bonus 1: Dynamic Graphs– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
Prakash and Faloutsos 2012 21AUT '12
“SIR” model: life immunity (mumps)
• Each node in the graph is in one of three states– Susceptible (i.e. healthy)– Infected– Removed (i.e. can’t get infected again)
Prob. β Prob. δ
t = 1 t = 2 t = 3Prakash and Faloutsos 2012 22AUT '12
Terminology: continued
• Other virus propagation models (“VPM”)– SIS : susceptible-infected-susceptible, flu-like– SIRS : temporary immunity, like pertussis– SEIR : mumps-like, with virus incubation (E = Exposed)….………….
• Underlying contact-network – ‘who-can-infect-whom’
Prakash and Faloutsos 2012 23AUT '12
Related Work R. M. Anderson and R. M. May. Infectious Diseases of Humans. Oxford University Press, 1991. A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical Processes on Complex Networks.
Cambridge University Press, 2010. F. M. Bass. A new product growth for model consumer durables. Management Science,
15(5):215–227, 1969. D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real
networks. ACM TISSEC, 10(4), 2008. D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly
Connected World. Cambridge University Press, 2010. A. Ganesh, L. Massoulie, and D. Towsley. The effect of network topology in spread of
epidemics. IEEE INFOCOM, 2005. Y. Hayashi, M. Minoura, and J. Matsukubo. Recoverable prevalence in growing scale-free
networks and the effective immunization. arXiv:cond-at/0305549 v2, Aug. 6 2003. H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42, 2000. H. W. Hethcote and J. A. Yorke. Gonorrhea transmission dynamics and control. Springer
Lecture Notes in Biomathematics, 46, 1984. J. O. Kephart and S. R. White. Directed-graph epidemiological models of computer viruses.
IEEE Computer Society Symposium on Research in Security and Privacy, 1991. J. O. Kephart and S. R. White. Measuring and modeling computer virus prevalence. IEEE
Computer Society Symposium on Research in Security and Privacy, 1993. R. Pastor-Santorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical
Review Letters 86, 14, 2001.
……… ……… ………
All are about either:
• Structured topologies (cliques, block-diagonals, hierarchies, random)
• Specific virus propagation models
• Static graphs
All are about either:
• Structured topologies (cliques, block-diagonals, hierarchies, random)
• Specific virus propagation models
• Static graphs
Prakash and Faloutsos 2012 24AUT '12
Outline
• Motivation• Epidemics: what happens? (Theory)
– Background– Result (Static Graphs)– Proof Ideas (Static Graphs)– Bonus 1: Dynamic Graphs– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
Prakash and Faloutsos 2012 25AUT '12
How should the answer look like?
• Answer should depend on:– Graph– Virus Propagation Model (VPM)
• But how??– Graph – average degree? max. degree? diameter?– VPM – which parameters? – How to combine – linear? quadratic? exponential?
?diameterdavg ?/)( max22 ddd avgavg …..
Prakash and Faloutsos 2012 26AUT '12
Static Graphs: Our Main Result
• Informally,
•
For, any arbitrary topology (adjacency matrix A) any virus propagation model (VPM) in standard literature
the epidemic threshold depends only •on the λ, first eigenvalue of A, and •some constant , determined by the virus propagation model
λ λVPMC
No epidemic if λ *
< 1
VPMCVPMC
Prakash and Faloutsos 2012 27AUT '12In Prakash+ ICDM 2011 (Selected among best papers).
w/ DeepayChakrabarti
Our thresholds for some models
• s = effective strength• s < 1 : below threshold
Models Effective Strength (s)
Threshold (tipping point)
SIS, SIR, SIRS, SEIR s = λ .
s = 1 SIV, SEIV s = λ .
(H.I.V.) s = λ .
12
221
vv
v
2121 VVISIPrakash and Faloutsos 2012 28AUT '12
Our result: Intuition for λ
“Official” definition:• Let A be the adjacency
matrix. Then λ is the root with the largest magnitude of the characteristic polynomial of A [det(A – xI)].
• Doesn’t give much intuition!
“Un-official” Intuition • λ ~ # paths in the
graph
u≈ .k
kA
(i, j) = # of paths i j of length k
kA
Prakash and Faloutsos 2012 29AUT '12
Largest Eigenvalue (λ)
λ ≈ 2 λ = N λ = N-1
N = 1000λ ≈ 2 λ= 31.67 λ= 999
better connectivity higher λ
Prakash and Faloutsos 2012 30AUT '12 N nodes
Examples: Simulations – SIR (mumps)
(a) Infection profile (b) “Take-off” plot
PORTLAND graph: synthetic population, 31 million links, 6 million nodes
Effective StrengthTime ticks
Prakash and Faloutsos 2012 31AUT '12
Examples: Simulations – SIRS (pertusis)
Effective StrengthTime ticks (a) Infection profile (b) “Take-off” plot
PORTLAND graph: synthetic population, 31 million links, 6 million nodesPrakash and Faloutsos 2012 32AUT '12
Models and more modelsModel Used for
SIR Mumps
SIS Flu
SIRS Pertussis
SEIR Chicken-pox
……..
SICR Tuberculosis
MSIR Measles
SIV Sensor Stability
H.I.V.……….
2121 VVISI
33
Ingredient 1: Our generalized model
34
Endogenous Transitions
Susceptible Infected
Vigilant
Exogenous Transitions
Endogenous Transitions
Endogenous Transitions
Susceptible Infected
Vigilant
Special case
Susceptible Infected
Vigilant
35
Special case: H.I.V.
2121 VVISI
Multiple Infectious, Vigilant states
36
“Terminal”
“Non-terminal”
Ingredient 2: NLDS+Stability
• View as a NLDS– discrete time – non-linear dynamical system (NLDS)
Probability vector Specifies the state of the system at time t
37
size mN x 1
.
.
.
.
.
size N (number of nodes in the graph)
.
.
.
SS
II
VV
Ingredient 2: NLDS + Stability
• View as a NLDS– discrete time – non-linear dynamical system (NLDS)
Non-linear functionExplicitly gives the evolution of system
38
size mN x 1
.
.
.
.
.
.
.
.
Ingredient 2: NLDS + Stability
• View as a NLDS– discrete time – non-linear dynamical system (NLDS)
• Threshold Stability of NLDS
39
= probability that node i is not attacked by any of its infectious neighbors
Special case: SIR
size 3N x 1 II
RR
SS
NLDS
II
RR
SS
40
Fixed Point
11.
11.
00.
00.
00.
00.
State when no node is infected
Q: Is it stable? 41
Stability for SIR
Stableunder threshold
Unstableabove threshold
42
λ * < 1
VPMC
Graph-based
Model-based
43
General VPM structure
Topology and stability
See paper See paper for full for full proofproof
Outline
• Motivation• Epidemics: what happens? (Theory)
– Background– Result (Static Graphs)– Proof Ideas (Static Graphs)– Bonus 1: Dynamic Graphs– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
Prakash and Faloutsos 2012 44AUT '12
λ * < 1VPMC
Graph-based
Model-basedGeneral VPM structure
Topology and stability
See paper See paper for full for full proofproof
45AUT '12 Prakash and Faloutsos 2012
Outline
• Motivation• Epidemics: what happens? (Theory)
– Background– Result (Static Graphs)– Proof Ideas (Static Graphs)– Bonus 1: Dynamic Graphs– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
Prakash and Faloutsos 2012 46AUT '12
Dynamic Graphs: Epidemic?
adjacency matrixadjacency matrix
8
8
Alternating behaviorsDAY (e.g., work)
Prakash and Faloutsos 2012 47AUT '12
adjacency matrixadjacency matrix
8
8
Dynamic Graphs: Epidemic?Alternating behaviorsNIGHT
(e.g., home)
Prakash and Faloutsos 2012 48AUT '12
• SIS model– recovery rate δ– infection rate β
• Set of T arbitrary graphs
Model Description
dayday
N
N nightnight
N
N , weekend…..
Infected
Healthy
XN1
N3
N2
Prob. βProb. β Prob. δ
Prakash and Faloutsos 2012 49AUT '12
• Informally, NO epidemic if
eig (S) = < 1
Our result: Dynamic Graphs Threshold
Single number! Largest eigenvalue of The system matrix S
In Prakash+, ECML-PKDD 2010
S =
Prakash and Faloutsos 2012 50AUT '12
Synthetic MIT Reality Mining
log(fraction infected)
Time
BELOW
AT
ABOVE ABOVE
AT
BELOW
Infection-profile
Prakash and Faloutsos 2012 51AUT '12
“Take-off” plotsFootprint (# infected @ “steady state”)
Our threshold
Our threshold
(log scale)
NO EPIDEMIC
EPIDEMIC
EPIDEMIC
NO EPIDEMIC
Synthetic MIT Reality
Prakash and Faloutsos 2012 52AUT '12
Outline
• Motivation• Epidemics: what happens? (Theory)
– Background– Result (Static Graphs)– Proof Ideas (Static Graphs)– Bonus 1: Dynamic Graphs– Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)
Prakash and Faloutsos 2012 53AUT '12
Competing Contagions
iPhone v Android Blu-ray v HD-DVD
54AUT '12 Prakash and Faloutsos 2012Biological common flu/avian flu, pneumococcal inf etc
A simple model
• Modified flu-like • Mutual Immunity (“pick one of the two”)• Susceptible-Infected1-Infected2-Susceptible
Virus 1 Virus 2
Prakash and Faloutsos 2012 55AUT '12
Question: What happens in the end?
green: virus 1red: virus 2
Footprint @ Steady State Footprint @ Steady State = ?
Number of Infections
Prakash and Faloutsos 2012 56AUT '12
ASSUME: Virus 1 is stronger than Virus 2
Question: What happens in the end?
green: virus 1red: virus 2
Number of Infections
Strength Strength
??= Strength Strength
2
Footprint @ Steady State Footprint @ Steady State
57AUT '12 Prakash and Faloutsos 2012
ASSUME: Virus 1 is stronger than Virus 2
Answer: Winner-Takes-All
green: virus 1red: virus 2
Number of Infections
58AUT '12 Prakash and Faloutsos 2012
ASSUME: Virus 1 is stronger than Virus 2
Our Result: Winner-Takes-All
Given our model, and any graph, the weaker virus always dies-out completely
Given our model, and any graph, the weaker virus always dies-out completely
1. The stronger survives only if it is above threshold 2. Virus 1 is stronger than Virus 2, if: strength(Virus 1) > strength(Virus 2)3. Strength(Virus) = λ β / δ same as before!
59AUT '12 Prakash and Faloutsos 2012In Prakash+ WWW 2012
Real Examples
Reddit v Digg Blu-Ray v HD-DVD
[Google Search Trends data]
60AUT '12 Prakash and Faloutsos 2012
Outline
• Motivation• Epidemics: what happens? (Theory)• Action: Who to immunize? (Algorithms)
Prakash and Faloutsos 2012 61AUT '12
?
?
Given: a graph A, virus prop. model and budget k; Find: k ‘best’ nodes for immunization (removal).
k = 2
??
Full Static Immunization
Prakash and Faloutsos 2012 62AUT '12
Outline
• Motivation• Epidemics: what happens? (Theory)• Action: Who to immunize? (Algorithms)
– Full Immunization (Static Graphs)– Fractional Immunization
Prakash and Faloutsos 2012 63AUT '12
Challenges
• Given a graph A, budget k, Q1 (Metric) How to measure the ‘shield-
value’ for a set of nodes (S)?
Q2 (Algorithm) How to find a set of k nodes with highest ‘shield-value’?
Prakash and Faloutsos 2012 64AUT '12
Proposed vulnerability measure λ
Increasing λ Increasing vulnerability
λ is the epidemic threshold
“Safe” “Vulnerable” “Deadly”
Prakash and Faloutsos 2012 65AUT '12
1
9
10
3
4
5
7
8
6
2
9
1
11
10
3
4
56
7
8
2
9
Original Graph Without {2, 6}
Eigen-Drop(S) Δ λ = λ - λs
Eigen-Drop(S) Δ λ = λ - λs
Δ
A1: “Eigen-Drop”: an ideal shield value
Prakash and Faloutsos 2012 66AUT '12
(Q2) - Direct Algorithm too expensive!
• Immunize k nodes which maximize Δ λ
S = argmax Δ λ• Combinatorial!• Complexity:
– Example: • 1,000 nodes, with 10,000 edges • It takes 0.01 seconds to compute λ• It takes 2,615 years to find 5-best nodes!
Prakash and Faloutsos 2012 67AUT '12
A2: Our Solution
• Part 1: Shield Value– Carefully approximate Eigen-drop (Δ λ)– Matrix perturbation theory
• Part 2: Algorithm– Greedily pick best node at each step– Near-optimal due to submodularity
• NetShield (linear complexity)– O(nk2+m) n = # nodes; m = # edges
Prakash and Faloutsos 2012 68AUT '12In Tong, Prakash+ ICDM 2010
Experiment: Immunization quality
Log(fraction of infected nodes)
NetShield
Degree
PageRank
Eigs (=HITS)Acquaintance
Betweeness (shortest path)
Lower is
better TimePrakash and Faloutsos 2012 69AUT '12
Outline
• Motivation• Epidemics: what happens? (Theory)• Action: Who to immunize? (Algorithms)
– Full Immunization (Static Graphs)– Fractional Immunization
Prakash and Faloutsos 2012 70AUT '12
Fractional Immunization of NetworksB. Aditya Prakash, Lada Adamic, Theodore Iwashyna (M.D.), Hanghang Tong, Christos Faloutsos
Under review
Prakash and Faloutsos 2012 71AUT '12
Fractional Asymmetric Immunization
Hospital Another Hospital
Drug-resistant Bacteria (like XDR-TB)
Prakash and Faloutsos 2012 72AUT '12
Fractional Asymmetric Immunization
Hospital Another Hospital
Drug-resistant Bacteria (like XDR-TB)
Prakash and Faloutsos 2012 73AUT '12
Fractional Asymmetric Immunization
Hospital Another Hospital
Problem: Given k units of disinfectant, how to distribute them to maximize
hospitals saved?
Prakash and Faloutsos 2012 74AUT '12
Our Algorithm “SMART-ALLOC”
CURRENT PRACTICE SMART-ALLOC
[US-MEDICARE NETWORK 2005]• Each circle is a hospital, ~3000 hospitals• More than 30,000 patients transferred
~6x fewer!
~6x fewer!
Prakash and Faloutsos 2012 75AUT '12
Running Time
≈
Simulations SMART-ALLOC
> 1 week
14 secs
> 30,000x speed-up!
Wall-Clock Time
Lower is better
Prakash and Faloutsos 2012 76AUT '12
Experiments
K = 200 K = 2000
PENN-NETWORK SECOND-LIFE
~5 x ~2.5 x
Lower is better
Prakash and Faloutsos 2012 77AUT '12
Acknowledgements
Funding
Prakash and Faloutsos 2012 78AUT '12
References1. Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks (B.
Aditya Prakash, Deepayan Chakrabarti, Michalis Faloutsos, Nicholas Valler, Christos Faloutsos) - In IEEE ICDM 2011, Vancouver (Invited to KAIS Journal Best Papers of ICDM.)
2. Virus Propagation on Time-Varying Networks: Theory and Immunization Algorithms (B. Aditya Prakash, Hanghang Tong, Nicholas Valler, Michalis Faloutsos and Christos Faloutsos) – In ECML-PKDD 2010, Barcelona, Spain
3. Epidemic Spreading on Mobile Ad Hoc Networks: Determining the Tipping Point (Nicholas Valler, B. Aditya Prakash, Hanghang Tong, Michalis Faloutsos and Christos Faloutsos) – In IEEE NETWORKING 2011, Valencia, Spain
4. Winner-takes-all: Competing Viruses or Ideas on fair-play networks (B. Aditya Prakash, Alex Beutel, Roni Rosenfeld, Christos Faloutsos) – In WWW 2012, Lyon
5. On the Vulnerability of Large Graphs (Hanghang Tong, B. Aditya Prakash, Tina Eliassi-Rad and Christos Faloutsos) – In IEEE ICDM 2010, Sydney, Australia
6. Fractional Immunization of Networks (B. Aditya Prakash, Lada Adamic, Theodore Iwashyna, Hanghang Tong, Christos Faloutsos) - Under Submission
7. Rise and Fall Patterns of Information Diffusion: Model and Implications (Yasuko Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos) - Under Submission
79
http://www.cs.cmu.edu/~badityap/AUT '12 Prakash and Faloutsos 2012
Analysis Policy/Action Data
Propagation on Large Networks
B. Aditya Prakash Christos Faloutsos
80AUT '12 Prakash and Faloutsos 2012