Anomaly Detection and Virus Propagation in Large Graphs Christos Faloutsos CMU

Thank you!

• Dr. Ching-Hao (Eric) Mao

• Prof. Kenneth Pao

Taiwan'12 Faloutsos, Prakash, Chau, Koutra, Akoglu 2


Outline

• Part 1: anomaly detection
  – OddBall (anomaly detection)
  – Belief Propagation
  – Conclusions
• Part 2: influence propagation

OddBall: Spotting Anomalies in Weighted Graphs

Leman Akoglu, Mary McGlohon, Christos Faloutsos

Carnegie Mellon University School of Computer Science

PAKDD 2010, Hyderabad, India

Main idea

For each node,
• extract ‘ego-net’ (= 1-step-away neighbors)
• extract features (#edges, total weight, etc.)
• compare with the rest of the population

What is an egonet?

[Figure: an ego node and its egonet]

Selected Features
• Ni: number of neighbors (degree) of ego i
• Ei: number of edges in egonet i
• Wi: total weight of egonet i
• λw,i: principal eigenvalue of the weighted adjacency matrix of egonet i
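The four egonet features can be computed directly from a weighted edge list. The following is a minimal sketch, not the OddBall implementation: the function names, the edge-list format, and the shifted power iteration used for the eigenvalue are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def build_adjacency(edges):
    """Build a symmetric weighted adjacency map from (u, v, weight) triples."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return adj


def principal_eigenvalue(members, adj, iters=300):
    """Estimate the largest eigenvalue of the egonet's weighted adjacency matrix.

    Power iteration on (A + I): the shift avoids oscillation on bipartite
    egonets such as stars. The Rayleigh quotient x'Ax is returned.
    """
    x = {u: 1.0 for u in members}
    for _ in range(iters):
        y = {u: x[u] + sum(w * x[v] for v, w in adj[u].items() if v in members)
             for u in members}
        norm = math.sqrt(sum(val * val for val in y.values()))
        x = {u: val / norm for u, val in y.items()}
    return sum(x[u] * w * x[v]
               for u in members for v, w in adj[u].items() if v in members)


def egonet_features(edges):
    """Return {ego: {"N", "E", "W", "lambda_w"}} for every node."""
    adj = build_adjacency(edges)
    features = {}
    for ego in adj:
        members = set(adj[ego]) | {ego}
        # Every undirected edge inside the egonet appears twice in this list.
        inside = [(u, v, w) for u in members
                  for v, w in adj[u].items() if v in members]
        features[ego] = {
            "N": len(adj[ego]),
            "E": len(inside) // 2,
            "W": sum(w for _, _, w in inside) / 2,
            "lambda_w": principal_eigenvalue(members, adj),
        }
    return features


# Hypothetical toy graph: a triangle a-b-c plus a pendant node d attached to c.
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0), ("c", "d", 2.0)]
feats = egonet_features(edges)
```

Comparing each node's feature values against the rest of the population (the third step of the main idea) is left out of this sketch.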

Near-Clique/Star

[Figures: near-clique and near-star egonets; one labeled example: Andrew Lewis (director)]

Outline

• Part 1: anomaly detection
  – OddBall (anomaly detection)
  – Belief Propagation
  – Conclusions
• Part 2: influence propagation

E-bay Fraud detection

w/ Polo Chau & Shashank Pandit, CMU [WWW’07]

E-bay Fraud detection - NetProbe

Popular press

And less desirable attention:
• E-mail from ‘Belgium police’ (‘copy of your code?’)

Outline

• OddBall (anomaly detection)
• Belief Propagation
  – Ebay fraud
  – Symantec malware detection
  – Unification results
• Conclusions

Polo Chau, Machine Learning Dept
Carey Nachenberg, Vice President & Fellow
Jeffrey Wilhelm, Principal Software Engineer
Adam Wright, Software Engineer
Prof. Christos Faloutsos, Computer Science Dept

Polonium: Tera-Scale Graph Mining and Inference for Malware Detection

PATENT PENDING

SDM 2011, Mesa, Arizona

Polonium: The Data

60+ terabytes of data anonymously contributed by participants of the worldwide Norton Community Watch program
• 50+ million machines
• 900+ million executable files

Constructed a machine-file bipartite graph (0.2 TB+)
• 1 billion nodes (machines and files)
• 37 billion edges

Polonium: Key Ideas

• Use Belief Propagation to propagate domain knowledge in machine-file graph to detect malware
• Use “guilt-by-association” (i.e., homophily)
  – E.g., files that appear on machines with many bad files are more likely to be bad
• Scalability: handles 37 billion-edge graph

Polonium: One-Interaction Results

• 84.9% True Positive Rate
• 1% False Positive Rate

True Positive Rate: % of malware correctly identified
False Positive Rate: % of non-malware wrongly labeled as malware

[Plot: true positive rate vs. false positive rate, with the “Ideal” point marked]
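The two rates defined above can be computed from ground-truth and predicted labels as follows. This is a small illustrative sketch; the function name and the toy labels are hypothetical and unrelated to the Polonium data.

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate).

    y_true / y_pred hold 1 for malware and 0 for non-malware.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn)  # share of malware correctly identified
    fpr = fp / (fp + tn)  # share of non-malware wrongly labeled as malware
    return tpr, fpr


# Hypothetical toy labels: 4 malware files, 6 benign files.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tpr, fpr = tpr_fpr(y_true, y_pred)
```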

Outline

• Part 1: anomaly detection
  – OddBall (anomaly detection)
  – Belief Propagation
    • Ebay fraud
    • Symantec malware detection
    • Unification results
  – Conclusions
• Part 2: influence propagation

Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms

Danai Koutra, U Kang, Hsing-Kuo Kenneth Pao, Tai-You Ke, Duen Horng (Polo) Chau, Christos Faloutsos

ECML PKDD, 5-9 September 2011, Athens, Greece

Problem Definition: GBA techniques

Given: Graph; & few labeled nodes
Find: labels of rest (assuming network effects)

Homophily and Heterophily

[Figure: homophily vs. heterophily examples, Step 1 and Step 2]

• All methods handle homophily
• NOT all methods handle heterophily
• BUT the proposed method does!

Are they related?

• RWR (Random Walk with Restarts)
  – Google’s PageRank (‘if my friends are important, I’m important, too’)
• SSL (Semi-supervised learning)
  – minimize the differences among neighbors
• BP (Belief propagation)
  – send messages to neighbors, on what you believe about them


YES!

Correspondence of Methods

Method | Matrix            | Unknown | Known
RWR    | [I – c A D^-1]    | x       | (1-c) y
SSL    | [I + a(D - A)]    | x       | y
FaBP   | [I + a D - c’A]   | bh      | φh

Each row reads: Matrix × Unknown = Known.

• A: adjacency matrix (example: 0 1 0 / 1 0 1 / 0 1 0)
• D: diagonal matrix of degrees d1, d2, d3
• Unknown: final labels/beliefs
• Known: prior labels/beliefs
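The FaBP row of the table is a plain linear system, so a tiny instance can be solved by hand. The sketch below uses the 3-node chain from the slide as A; the values of a and c’, the prior vector, and the helper names are assumptions chosen only for illustration, and the dense Gaussian elimination stands in for whatever solver the authors use at scale.

```python
def build_system(A, a, c_prime):
    """Return the matrix I + a*D - c'*A, with D the diagonal degree matrix."""
    n = len(A)
    degrees = [sum(row) for row in A]
    M = []
    for i in range(n):
        row = []
        for j in range(n):
            diag = (1.0 + a * degrees[i]) if i == j else 0.0
            row.append(diag - c_prime * A[i][j])
        M.append(row)
    return M


def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [M[i][:] + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


# 3-node chain from the slide; a, c_prime and phi are illustrative assumptions.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
a, c_prime = 0.04, 0.2
phi = [0.1, 0.0, 0.0]          # prior belief: only node 0 is labeled
M = build_system(A, a, c_prime)
beliefs = solve(M, phi)        # final beliefs b_h
```

With these values the belief decays with distance from the labeled node, which is the homophily ("guilt-by-association") behavior described earlier.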

Results: Scalability

FaBP is linear in the number of edges.

[Plot: runtime (min) vs. # of edges (Kronecker graphs)]

Results (5): Parallelism

FaBP is ~2x faster and wins/ties on accuracy.

[Plot: % accuracy vs. runtime (min)]

Conclusions

• Anomaly detection: hand-in-hand with pattern discovery (‘anomalies’ == ‘rare patterns’)

• ‘OddBall’ for large graphs

• ‘NetProbe’ and belief propagation: exploit network effects.

• FaBP: fast & accurate


Outline

• Part 1: anomaly detection
  – OddBall (anomaly detection)
  – Belief Propagation
  – Conclusions
• Part 2: influence propagation

Influence propagation in large graphs - theorems and algorithms

B. Aditya Prakash
http://www.cs.cmu.edu/~badityap

Christos Faloutsos
http://www.cs.cmu.edu/~christos

Carnegie Mellon University

Networks are everywhere!

Human Disease Network [Barabasi 2007]

Gene Regulatory Network [Decourty 2008]

Facebook Network [2010]

The Internet [2005]


Dynamical Processes over networks are also everywhere!

Why do we care?
• Information Diffusion
• Viral Marketing
• Epidemiology and Public Health
• Cyber Security
• Human mobility
• Games and Virtual Worlds
• Ecology
• Social Collaboration
• ...

Why do we care? (1: Epidemiology)

• Dynamical Processes over networks [AJPH 2007]

CDC data: Visualization of the first 35 tuberculosis (TB) patients and their 1039 contacts

Diseases over contact networks

Why do we care? (1: Epidemiology)

• Dynamical Processes over networks

• Each circle is a hospital
• ~3000 hospitals
• More than 30,000 patients transferred

[US-MEDICARE NETWORK 2005]

Problem: Given k units of disinfectant, whom to immunize?

Why do we care? (1: Epidemiology)

CURRENT PRACTICE vs. OUR METHOD: ~6x fewer!

[US-MEDICARE NETWORK 2005]

Hospital-acquired inf. took 99K+ lives, cost $5B+ (all per year)

Why do we care? (2: Online Diffusion)

> 800m users, ~$1B revenue [WSJ 2010]

~100m active users

> 50m users


Why do we care? (2: Online Diffusion)

• Dynamical Processes over networks

Celebrity

Buy Versace™!

Followers

Social Media Marketing

High Impact – Multiple Settings

Q. How to squash rumors faster?

Q. How do opinions spread?

Q. How to market better?

epidemic out-breaks

products/viruses

transmit s/w patches

Research Theme

• DATA: Large real-world networks & processes
• ANALYSIS: Understanding
• POLICY/ACTION: Managing

In this talk

ANALYSIS (Understanding): Given propagation models:
Q1: Will an epidemic happen?

POLICY/ACTION (Managing):
Q2: How to immunize and control out-breaks better?

Outline

• Part 1: anomaly detection
• Part 2: influence propagation
  – Motivation
  – Epidemics: what happens? (Theory)
  – Action: Who to immunize? (Algorithms)

A fundamental question

Strong Virus: Epidemic?

example (static graph)

Weak Virus: Epidemic?

Problem Statement

Find a condition under which
– the virus will die out exponentially quickly
– regardless of initial infection condition

[Plot: # infected vs. time; curves above the threshold (epidemic) and below it (extinction). Separate the regimes?]

Threshold (static version)

Problem Statement
• Given:
  – Graph G, and
  – Virus specs (attack prob. etc.)
• Find:
  – A condition for virus extinction/invasion

Threshold: Why important?

• Accelerating simulations
• Forecasting (‘What-if’ scenarios)
• Design of contagion and/or topology
• A great handle to manipulate the spreading
  – Immunization
  – Maximize collaboration
  – ...

Outline

• Motivation
• Epidemics: what happens? (Theory)
  – Background
  – Result (Static Graphs)
  – Proof Ideas (Static Graphs)
  – Bonus 1: Dynamic Graphs
  – Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)

“SIR” model: lifelong immunity (mumps)

• Each node in the graph is in one of three states
  – Susceptible (i.e. healthy)
  – Infected
  – Removed (i.e. can’t get infected again)

[Figure: example at t = 1, t = 2, t = 3; infection with prob. β, removal with prob. δ]
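The SIR dynamics described above can be simulated in a few lines. This is a minimal discrete-time sketch under stated assumptions: synchronous updates, each infected node attacking each susceptible neighbor independently with probability β per step, and removal with probability δ per step. The function names and the ring graph are hypothetical.

```python
import random


def count_states(state):
    """Return the tuple (#susceptible, #infected, #removed)."""
    values = list(state.values())
    return values.count("S"), values.count("I"), values.count("R")


def simulate_sir(adj, seeds, beta, delta, steps, rng):
    """Run a synchronous SIR simulation and return per-step state counts."""
    state = {u: "S" for u in adj}
    for u in seeds:
        state[u] = "I"
    history = [count_states(state)]
    for _ in range(steps):
        nxt = dict(state)
        for u, s in state.items():
            if s != "I":
                continue
            # Each infected node attacks each susceptible neighbor with prob. beta.
            for v in adj[u]:
                if state[v] == "S" and rng.random() < beta:
                    nxt[v] = "I"
            # An infected node is removed with prob. delta and stays removed.
            if rng.random() < delta:
                nxt[u] = "R"
        state = nxt
        history.append(count_states(state))
    return history


# Hypothetical contact network: a ring of 20 nodes, one initial infection.
n = 20
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
history = simulate_sir(ring, seeds=[0], beta=0.5, delta=0.3,
                       steps=30, rng=random.Random(0))
no_spread = simulate_sir(ring, seeds=[0], beta=0.0, delta=1.0,
                         steps=3, rng=random.Random(0))
```

Plotting the infected count from `history` against the step index gives an infection profile for one run.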

Terminology: continued

• Other virus propagation models (“VPM”)
  – SIS: susceptible-infected-susceptible, flu-like
  – SIRS: temporary immunity, like pertussis
  – SEIR: mumps-like, with virus incubation (E = Exposed)
  – ...
• Underlying contact-network: ‘who-can-infect-whom’

Related Work

R. M. Anderson and R. M. May. Infectious Diseases of Humans. Oxford University Press, 1991.
A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical Processes on Complex Networks. Cambridge University Press, 2010.
F. M. Bass. A new product growth for model consumer durables. Management Science, 15(5):215–227, 1969.
D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real networks. ACM TISSEC, 10(4), 2008.
D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, 2010.
A. Ganesh, L. Massoulie, and D. Towsley. The effect of network topology in spread of epidemics. IEEE INFOCOM, 2005.
Y. Hayashi, M. Minoura, and J. Matsukubo. Recoverable prevalence in growing scale-free networks and the effective immunization. arXiv:cond-mat/0305549 v2, Aug. 6 2003.
H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42, 2000.
H. W. Hethcote and J. A. Yorke. Gonorrhea transmission dynamics and control. Springer Lecture Notes in Biomathematics, 46, 1984.
J. O. Kephart and S. R. White. Directed-graph epidemiological models of computer viruses. IEEE Computer Society Symposium on Research in Security and Privacy, 1991.
J. O. Kephart and S. R. White. Measuring and modeling computer virus prevalence. IEEE Computer Society Symposium on Research in Security and Privacy, 1993.
R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters 86, 14, 2001.
...

All are about either:

• Structured topologies (cliques, block-diagonals, hierarchies, random)

• Specific virus propagation models

• Static graphs

Outline

• Motivation
• Epidemics: what happens? (Theory)
  – Background
  – Result (Static Graphs)
  – Proof Ideas (Static Graphs)
  – Bonus 1: Dynamic Graphs
  – Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)

What should the answer look like?

• Answer should depend on:
  – Graph
  – Virus Propagation Model (VPM)
• But how??
  – Graph: average degree? max. degree? diameter?
  – VPM: which parameters?
  – How to combine: linear? quadratic? exponential?

[Slide shows candidate expressions built from d_avg, d_max, and the diameter.]

Static Graphs: Our Main Result

• Informally, for
  – any arbitrary topology (adjacency matrix A)
  – any virus propagation model (VPM) in standard literature
the epidemic threshold depends only
  1. on λ, the first eigenvalue of A, and
  2. on some constant C_VPM, determined by the virus propagation model

No epidemic if λ · C_VPM < 1

In Prakash+ ICDM 2011 (Selected among best papers). w/ Deepayan Chakrabarti

Our thresholds for some models

• s = effective strength
• s < 1: below threshold

Models                   | Effective Strength (s)            | Threshold (tipping point)
SIS, SIR, SIRS, SEIR     | s = λ · β/δ                       | s = 1
SIV, SEIV                | s = λ · C_VPM (model-specific)    | s = 1
S I1 I2 V1 V2 (H.I.V.)   | s = λ · C_VPM (model-specific)    | s = 1
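For the SIS-like models the threshold test reduces to computing λ and checking s = λ · β/δ against 1. The sketch below is an illustration under assumptions: the graph is given as an undirected adjacency list, λ is estimated with a shifted power iteration (the shift and iteration count are choices made here, not taken from the talk), and the function names are hypothetical.

```python
import math


def largest_eigenvalue(adj, iters=500):
    """Estimate the largest eigenvalue of an undirected 0/1 adjacency list.

    Power iteration on (A + I), so that bipartite graphs such as stars
    do not oscillate; the Rayleigh quotient x'Ax is returned.
    """
    nodes = list(adj)
    x = {u: 1.0 for u in nodes}
    for _ in range(iters):
        y = {u: x[u] + sum(x[v] for v in adj[u]) for u in nodes}
        norm = math.sqrt(sum(val * val for val in y.values()))
        x = {u: val / norm for u, val in y.items()}
    return sum(x[u] * sum(x[v] for v in adj[u]) for u in nodes)


def effective_strength(lam, beta, delta):
    """s = lambda * beta / delta, the SIS-like row of the table."""
    return lam * beta / delta


def below_threshold(adj, beta, delta):
    """True if s < 1, i.e. the condition under which no epidemic is expected."""
    return effective_strength(largest_eigenvalue(adj), beta, delta) < 1.0


# Two toy graphs: a 5-node clique (lambda = 4) and a star with 4 leaves (lambda = 2).
clique = {i: [j for j in range(5) if j != i] for i in range(5)}
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
lam_clique = largest_eigenvalue(clique)
lam_star = largest_eigenvalue(star)
```

On the clique, β = 0.1 and δ = 0.5 give s = 0.8 (below threshold), while β = 0.2 gives s = 1.6 (above).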

Our result: Intuition for λ

“Official” definition:
• Let A be the adjacency matrix. Then λ is the root with the largest magnitude of the characteristic polynomial of A [det(A – xI)].
• Doesn’t give much intuition!

“Un-official” intuition:
• λ ~ # paths in the graph
• A^k ≈ λ^k · u u^T, where u is the principal eigenvector
• A^k (i, j) = # of paths i → j of length k
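The path-counting intuition can be checked on a triangle, where λ = 2 and u = (1, 1, 1)/√3, so λ^k · u u^T has every entry equal to 2^k / 3. The sketch below is illustrative only; the helper names are hypothetical, and the "paths" counted by A^k may revisit nodes (they are walks in graph-theory terms).

```python
def matmul(X, Y):
    """Multiply two square integer matrices given as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(A, k):
    """Return A^k by repeated multiplication (fine for small k)."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result


# Triangle graph: every pair of the 3 nodes is connected.
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
A2 = matpow(A, 2)
A3 = matpow(A, 3)
A10 = matpow(A, 10)

# Rank-one approximation: entry of lambda^k * u u^T for the triangle.
approx_10 = 2 ** 10 / 3
ratio = A10[0][0] / approx_10
```

Here `A2[0][1]` is 1 (the single length-2 route 0 → 2 → 1), and for k = 10 the exact count and the approximation differ by well under one percent.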

Largest Eigenvalue (λ)

[Figure: three example graphs with N nodes each]

          | graph 1 | graph 2   | graph 3
N nodes   | λ ≈ 2   | λ ≈ √N    | λ = N-1
N = 1000  | λ ≈ 2   | λ = 31.67 | λ = 999

better connectivity → higher λ

Examples: Simulations – SIR (mumps)

(a) Infection profile: fraction of infections vs. time ticks
(b) “Take-off” plot: footprint vs. effective strength

PORTLAND graph: synthetic population, 31 million links, 6 million nodes

Examples: Simulations – SIRS (pertussis)

(a) Infection profile: fraction of infections vs. time ticks
(b) “Take-off” plot: footprint vs. effective strength

PORTLAND graph: synthetic population, 31 million links, 6 million nodes

λ · C_VPM < 1
• λ: graph-based
• C_VPM: model-based

General VPM structure; topology and stability.

See paper for full proof

Outline

• Motivation
• Epidemics: what happens? (Theory)
  – Background
  – Result (Static Graphs)
  – Proof Ideas (Static Graphs)
  – Bonus 1: Dynamic Graphs
  – Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)


Dynamic Graphs: Epidemic?

Alternating behaviors, each with its own adjacency matrix (8 × 8 in the example):
• DAY (e.g., work)
• NIGHT (e.g., home)

Model Description

• SIS model
  – recovery rate δ
  – infection rate β
• Set of T arbitrary graphs (day, night, weekend, ...), each an N × N adjacency matrix

[Figure: node X with neighbors N1, N2, N3, shown as infected or healthy; infection with prob. β, recovery with prob. δ]

Our result: Dynamic Graphs Threshold

• Informally, NO epidemic if eig(S) < 1
• Single number! eig(S) is the largest eigenvalue of the system matrix S

In Prakash+, ECML-PKDD 2010

S =

Details


Infection-profile

[Plots: log(fraction infected) vs. time, for Synthetic and MIT Reality Mining; curves BELOW, AT, and ABOVE the threshold]

“Take-off” plots

[Plots: footprint (# infected @ “steady state”, log scale) for Synthetic and MIT Reality; our threshold separates NO EPIDEMIC from EPIDEMIC]

Outline

• Motivation
• Epidemics: what happens? (Theory)
  – Background
  – Result (Static Graphs)
  – Proof Ideas (Static Graphs)
  – Bonus 1: Dynamic Graphs
  – Bonus 2: Competing Viruses
• Action: Who to immunize? (Algorithms)

Competing Contagions

• iPhone v Android
• Blu-ray v HD-DVD
• Biological: common flu/avian flu, pneumococcal inf. etc.

A simple model

• Modified flu-like
• Mutual Immunity (“pick one of the two”)
• Susceptible-Infected1-Infected2-Susceptible

[Figure: Virus 1 and Virus 2]

Question: What happens in the end?

ASSUME: Virus 1 is stronger than Virus 2

[Plot: number of infections over time; green: virus 1, red: virus 2. Footprint @ steady state = ?]

Answer: Winner-Takes-All

ASSUME: Virus 1 is stronger than Virus 2

[Plot: number of infections over time; green: virus 1, red: virus 2]

Our Result: Winner-Takes-All

ASSUME: Virus 1 is stronger than Virus 2

Given our model, and any graph, the weaker virus always dies out completely.

1. The stronger survives only if it is above threshold
2. Virus 1 is stronger than Virus 2 if: strength(Virus 1) > strength(Virus 2)
3. Strength(Virus) = λ β / δ (same as before!)

In Prakash, Beutel, + WWW 2012
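The three statements above translate into a short decision rule. This sketch is only an illustration: the function names are hypothetical, being above threshold (strength > 1) is treated as survival even though the slide states it only as a necessary condition, and the equal-strength case, which the slide does not cover, returns None.

```python
def strength(lam, beta, delta):
    """Strength(Virus) = lambda * beta / delta."""
    return lam * beta / delta


def predict_survivor(lam, virus1, virus2):
    """Return "virus 1", "virus 2", "both die out", or None for a tie.

    virus1 / virus2 are (beta, delta) pairs; lam is the largest
    eigenvalue of the shared contact graph.
    """
    s1 = strength(lam, *virus1)
    s2 = strength(lam, *virus2)
    if s1 == s2:
        return None  # equal strengths are not covered by the slide
    name, s = ("virus 1", s1) if s1 > s2 else ("virus 2", s2)
    # The weaker virus dies out; the stronger one needs to be above threshold.
    return name if s > 1.0 else "both die out"


# Illustrative parameters (assumptions, not values from the talk).
strong_case = predict_survivor(4.0, (0.5, 0.5), (0.2, 0.5))
weak_case = predict_survivor(1.0, (0.3, 0.5), (0.2, 0.5))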

Real Examples

Reddit v Digg Blu-Ray v HD-DVD

[Google Search Trends data]


Outline

• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)

Full Static Immunization

Given: a graph A, virus prop. model and budget k;
Find: k ‘best’ nodes for immunization (removal).

[Figure: example graph with k = 2]

Outline

• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
  – Full Immunization (Static Graphs)
  – Fractional Immunization

Challenges

• Given a graph A, budget k,
  – Q1 (Metric) How to measure the ‘shield-value’ for a set of nodes (S)?
  – Q2 (Algorithm) How to find a set of k nodes with highest ‘shield-value’?

Proposed vulnerability measure λ

Increasing λ → increasing vulnerability

λ is the epidemic threshold

“Safe” → “Vulnerable” → “Deadly”

A1: “Eigen-Drop”: an ideal shield value

Eigen-Drop(S): Δλ = λ − λs, where λs is the largest eigenvalue after removing the nodes in S

[Figure: original graph vs. the graph without nodes {2, 6}]

(Q2) - Direct Algorithm too expensive!

• Immunize k nodes which maximize Δλ: S = argmax Δλ
• Combinatorial!
• Complexity:
  – Example:
    • 1,000 nodes, with 10,000 edges
    • It takes 0.01 seconds to compute λ
    • It takes 2,615 years to find 5-best nodes!
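The cost figure in the example follows from counting the candidate sets. The short calculation below assumes one λ computation (0.01 seconds) per 5-node subset of the 1,000 nodes and a 365-day year; both are assumptions about how the slide's number was derived.

```python
import math

nodes = 1000
k = 5
seconds_per_lambda = 0.01

# Number of distinct k-node subsets that a direct search must evaluate.
subsets = math.comb(nodes, k)

total_seconds = subsets * seconds_per_lambda
years = total_seconds / (365 * 24 * 3600)
```

This gives roughly 2,600 years, in line with the figure on the slide.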

A2: Our Solution

• Part 1: Shield Value
  – Carefully approximate Eigen-drop (Δλ)
  – Matrix perturbation theory
• Part 2: Algorithm
  – Greedily pick best node at each step
  – Near-optimal due to submodularity
• NetShield (linear complexity)
  – O(nk² + m), n = # nodes; m = # edges

In Tong, Prakash+ ICDM 2010
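To make the Eigen-Drop measure and the greedy step concrete, here is a naive sketch. It is not NetShield: it recomputes the exact eigen-drop for every candidate instead of using the perturbation-based shield-value approximation, so it has none of the linear-time guarantees. All function names and the shifted power iteration are assumptions made for illustration.

```python
import math


def largest_eigenvalue(adj, iters=500):
    """Estimate the largest adjacency eigenvalue via power iteration on A + I."""
    if not adj:
        return 0.0
    x = {u: 1.0 for u in adj}
    for _ in range(iters):
        y = {u: x[u] + sum(x[v] for v in adj[u]) for u in adj}
        norm = math.sqrt(sum(val * val for val in y.values()))
        x = {u: val / norm for u, val in y.items()}
    return sum(x[u] * sum(x[v] for v in adj[u]) for u in adj)


def remove_nodes(adj, removed):
    """Return a copy of the graph with the given nodes (and their edges) deleted."""
    return {u: [v for v in nbrs if v not in removed]
            for u, nbrs in adj.items() if u not in removed}


def eigen_drop(adj, removed):
    """Delta lambda = lambda(original) - lambda(graph without `removed`)."""
    return largest_eigenvalue(adj) - largest_eigenvalue(remove_nodes(adj, removed))


def greedy_immunize(adj, k):
    """Pick k nodes one at a time, each maximizing the exact eigen-drop so far."""
    chosen = []
    for _ in range(k):
        candidates = [u for u in adj if u not in chosen]
        best = max(candidates,
                   key=lambda u: eigen_drop(adj, set(chosen) | {u}))
        chosen.append(best)
    return chosen


# Toy graph: a star with center 0 and four leaves (lambda = 2).
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
drop_center = eigen_drop(star, {0})
drop_leaf = eigen_drop(star, {1})
picked = greedy_immunize(star, 1)
```

Removing the center drops λ from 2 to 0, while removing a leaf only drops it to √3, so the greedy step picks the center.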

Experiment: Immunization quality

[Plot: log(fraction of infected nodes) vs. time; lower is better]

Methods compared:
• NetShield
• Degree
• PageRank
• Eigs (=HITS)
• Acquaintance
• Betweenness (shortest path)

Outline

• Motivation
• Epidemics: what happens? (Theory)
• Action: Who to immunize? (Algorithms)
  – Full Immunization (Static Graphs)
  – Fractional Immunization

Fractional Immunization of Networks

B. Aditya Prakash, Lada Adamic, Theodore Iwashyna (M.D.), Hanghang Tong, Christos Faloutsos

Under review

Fractional Asymmetric Immunization

[Figure: a hospital and another hospital; drug-resistant bacteria (like XDR-TB)]

Problem: Given k units of disinfectant, how to distribute them to maximize hospitals saved?

Our Algorithm “SMART-ALLOC”

CURRENT PRACTICE vs. SMART-ALLOC: ~6x fewer!

[US-MEDICARE NETWORK 2005]
• Each circle is a hospital, ~3000 hospitals
• More than 30,000 patients transferred

Running Time

Wall-clock time (lower is better):
• Simulations: > 1 week
• SMART-ALLOC: 14 secs

> 30,000x speed-up!

Experiments

• PENN-NETWORK, K = 200: ~5x
• SECOND-LIFE, K = 2000: ~2.5x

Lower is better

Acknowledgements

Funding


References

1. Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks (B. Aditya Prakash, Deepayan Chakrabarti, Michalis Faloutsos, Nicholas Valler, Christos Faloutsos) - In IEEE ICDM 2011, Vancouver (Invited to KAIS Journal Best Papers of ICDM.)
2. Virus Propagation on Time-Varying Networks: Theory and Immunization Algorithms (B. Aditya Prakash, Hanghang Tong, Nicholas Valler, Michalis Faloutsos and Christos Faloutsos) - In ECML-PKDD 2010, Barcelona, Spain
3. Epidemic Spreading on Mobile Ad Hoc Networks: Determining the Tipping Point (Nicholas Valler, B. Aditya Prakash, Hanghang Tong, Michalis Faloutsos and Christos Faloutsos) - In IEEE NETWORKING 2011, Valencia, Spain
4. Winner-takes-all: Competing Viruses or Ideas on fair-play networks (B. Aditya Prakash, Alex Beutel, Roni Rosenfeld, Christos Faloutsos) - In WWW 2012, Lyon
5. On the Vulnerability of Large Graphs (Hanghang Tong, B. Aditya Prakash, Tina Eliassi-Rad and Christos Faloutsos) - In IEEE ICDM 2010, Sydney, Australia
6. Fractional Immunization of Networks (B. Aditya Prakash, Lada Adamic, Theodore Iwashyna, Hanghang Tong, Christos Faloutsos) - Under Submission
7. Rise and Fall Patterns of Information Diffusion: Model and Implications (Yasuko Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos) - Under Submission

http://www.cs.cmu.edu/~badityap/

Analysis Policy/Action Data

Propagation on Large Networks

B. Aditya Prakash Christos Faloutsos
