Network Archaeology: Uncovering Ancient Networks from Present-Day Interactions
Saket Navlakha, Carl Kingsford
Presentation by Roshna Ramesh


Introduction

• Objective: To infer the growth history of present-day networks.

• Present Scenario:
  – Only a static snapshot of the network is available
  – No node-by-node or edge-by-edge history of changes

Proposed models

• Growth models
  – Duplication Mutation with Complementarity (DMC)
  – Forest Fire (FF)
  – Preferential Attachment (PA)

Forward growth or backward growth?

Network Reconstruction Algorithms

• Given: the present-day network Gt at time t

• To find: what the network looked like at time t-Δt.

Set Δt=1 and apply Bayes’ theorem
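Written out, the Bayes step with Δt = 1 can be sketched as below. The symbol M for the assumed growth model and the star on the inferred graph are notation introduced here for illustration, not taken from the slides. The denominator does not depend on the candidate G_{t-1}, so it can be dropped from the maximization.

```latex
G^{*}_{t-1}
  = \arg\max_{G_{t-1}} P(G_{t-1} \mid G_t, M)
  = \arg\max_{G_{t-1}} \frac{P(G_t \mid G_{t-1}, M)\, P(G_{t-1} \mid M)}{P(G_t \mid M)}
  = \arg\max_{G_{t-1}} P(G_t \mid G_{t-1}, M)\, P(G_{t-1} \mid M)
```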

Duplication Mutation with Complementarity(DMC)

• Given: a simple, connected 2-node graph
• Steps:
  – 1. Node v enters the network by duplicating from a random anchor node u. Initially, v is connected to all of u’s neighbors (and to no other nodes).
  – 2. For each neighbor x of v, decide to modify the edge or its complement with probability qmod. If the edge is to be modified, delete either edge (v,x) or (u,x) by the flip of a fair coin.
  – 3. Add edge (u,v) with probability qcon.
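The three DMC steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the adjacency-dictionary representation, the integer node labels, and the function names `dmc_step` and `grow_dmc` are assumptions made for this example.

```python
import random


def dmc_step(adj, q_mod, q_con, rng):
    """Add one node to adj (dict: node -> set of neighbors) using the DMC steps."""
    u = rng.choice(sorted(adj))      # random anchor node
    v = len(adj)                     # new label; assumes nodes are 0..n-1
    # Step 1: v duplicates u, so it starts with exactly u's neighbors.
    adj[v] = set(adj[u])
    for x in adj[v]:
        adj[x].add(v)
    # Step 2: with probability q_mod, drop (v, x) or (u, x) by a fair coin.
    for x in sorted(adj[v]):
        if rng.random() < q_mod:
            loser = v if rng.random() < 0.5 else u
            adj[loser].discard(x)
            adj[x].discard(loser)
    # Step 3: connect the anchor and the duplicate with probability q_con.
    if rng.random() < q_con:
        adj[u].add(v)
        adj[v].add(u)
    return u, v


def grow_dmc(n, q_mod, q_con, seed=0):
    """Grow a DMC network to n nodes from a connected 2-node seed graph."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}
    while len(adj) < n:
        dmc_step(adj, q_mod, q_con, rng)
    return adj


if __name__ == "__main__":
    network = grow_dmc(100, q_mod=0.4, q_con=0.7)
    print(len(network), "nodes,", sum(map(len, network.values())) // 2, "edges")
```

With `q_mod = 0` and `q_con = 1` nothing is ever deleted and every duplicate links to its anchor, so the seed edge grows into a complete graph, which makes a convenient sanity check.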

DMC-Example

Reversing DMC

• Considers the common neighbors of u and v, i.e. those neighbors that were not modified.
• Considers the neighbors of either u or v.
• Once a pair is chosen, either u or v is deleted at random.
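One way to turn these considerations into a single reversal step is sketched below. The scoring formula is an assumption made for this example, since the slide does not state it: each common neighbor contributes a factor (1 - qmod), each node adjacent to only one of u or v contributes qmod/2, and the pair contributes qcon if u and v are adjacent or 1 - qcon otherwise. Giving the surviving node the union of both neighborhoods is likewise an assumption, and the function names are hypothetical.

```python
import math
import random
from itertools import combinations


def dmc_pair_log_score(adj, u, v, q_mod, q_con):
    """Log-score that v was duplicated from u (assumed form, see lead-in)."""
    nu = adj[u] - {v}
    nv = adj[v] - {u}
    common = len(nu & nv)        # neighbors whose edges were left unmodified
    only_one = len(nu ^ nv)      # neighbors kept by only one of u or v
    gamma = q_con if v in adj[u] else 1.0 - q_con
    return (math.log(gamma)
            + common * math.log(1.0 - q_mod)
            + only_one * math.log(q_mod / 2.0))


def reverse_dmc_step(adj, q_mod, q_con, rng):
    """Merge the best-scoring pair; requires 0 < q_mod < 1 and 0 < q_con < 1."""
    u, v = max(combinations(sorted(adj), 2),
               key=lambda pair: dmc_pair_log_score(adj, pair[0], pair[1], q_mod, q_con))
    if rng.random() < 0.5:       # which of the two is deleted is a coin flip
        u, v = v, u
    merged = (adj[u] | adj[v]) - {u, v}
    for x in adj.pop(v):         # remove v from the graph
        adj[x].discard(v)
    for x in merged:             # the survivor u inherits the union of neighbors
        adj[x].add(u)
    adj[u] = merged
    return u, v


if __name__ == "__main__":
    graph = {0: {2, 3}, 1: {2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
    kept, removed = reverse_dmc_step(graph, 0.3, 0.7, random.Random(0))
    print("kept", kept, "removed", removed, "->", graph)
```

Log-scores are used instead of raw products so that the score does not underflow on nodes with many neighbors.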

Forest Fire (FF)

• Given: a simple, connected 2-node graph
• Steps:
  – 1. Node v enters the network, selects a random anchor node u, and links to it.
  – 2. Node v randomly chooses x neighbors of u and links to them, where x is an integer chosen from a geometric distribution with mean p/(1-p). These vertices are flagged as active vertices.
  – 3. Set u to each active vertex and recursively apply step 2. Node u becomes non-active. Stop when no active vertices remain.
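The Forest Fire steps above can be sketched as follows. This is an illustrative reading of the slide, not the authors' code: it assumes that only not-yet-visited neighbors can be chosen, that x is capped at the number of such neighbors, and that 0 <= p < 1. The names `geometric`, `ff_burn`, `ff_step` and `grow_ff` are hypothetical.

```python
import random


def geometric(p, rng):
    """Sample x with P(x = k) = (1 - p) * p**k, which has mean p / (1 - p)."""
    x = 0
    while rng.random() < p:
        x += 1
    return x


def ff_burn(adj, u, p, rng):
    """Return the set of nodes visited by a fire started at anchor u."""
    visited = {u}
    active = [u]
    while active:
        w = active.pop()                     # w becomes non-active
        candidates = sorted(adj[w] - visited)
        x = min(geometric(p, rng), len(candidates))
        chosen = rng.sample(candidates, x)   # step 2: pick x neighbors of w
        visited.update(chosen)
        active.extend(chosen)                # step 3: recurse from each of them
    return visited


def ff_step(adj, p, rng):
    """Add one node: pick an anchor, burn from it, link to every visited node."""
    u = rng.choice(sorted(adj))
    v = len(adj)                             # new label; assumes nodes are 0..n-1
    burned = ff_burn(adj, u, p, rng)         # burn before v exists in the graph
    adj[v] = set(burned)
    for x in burned:
        adj[x].add(v)
    return u, v


def grow_ff(n, p, seed=0):
    """Grow a Forest Fire network to n nodes from a connected 2-node seed."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}
    while len(adj) < n:
        ff_step(adj, p, rng)
    return adj


if __name__ == "__main__":
    network = grow_ff(100, p=0.3)
    print(sum(map(len, network.values())) // 2, "edges")
```

With `p = 0` the fire never spreads past the anchor, so each new node adds exactly one edge and the result is a tree.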

FF-Example

[Figure: Forest Fire example showing the new node v, anchor nodes u, and visited neighbors x]

Reversing FF

• Forest Fire Simulation Procedure
  – Assume v does not exist in the network.
  – Simulate the FF model starting from anchor u.
  – Get S(v) as a list of visited nodes.
  – Take the fraction of simulations in which S(v)=N(v).

• No uncertainty as to which node to remove.
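The simulation procedure above amounts to a Monte Carlo estimate, which might be sketched as follows. The burn routine is the same illustrative reading of the FF model used for forward growth (unvisited neighbors only, x capped at the number available); the name `ff_likelihood` and the `trials` parameter are assumptions made for this example.

```python
import random


def geometric(p, rng):
    """Sample x with P(x = k) = (1 - p) * p**k, which has mean p / (1 - p)."""
    x = 0
    while rng.random() < p:
        x += 1
    return x


def ff_burn(adj, u, p, rng):
    """Return the set of nodes visited by a fire started at anchor u."""
    visited = {u}
    active = [u]
    while active:
        w = active.pop()
        candidates = sorted(adj[w] - visited)
        x = min(geometric(p, rng), len(candidates))
        chosen = rng.sample(candidates, x)
        visited.update(chosen)
        active.extend(chosen)
    return visited


def ff_likelihood(adj, u, v, p, trials, rng):
    """Estimate how likely it is that v entered the network with anchor u."""
    # Assume v does not exist: drop it and every edge that touches it.
    without_v = {w: nbrs - {v} for w, nbrs in adj.items() if w != v}
    target = adj[v]                          # N(v) in the present-day network
    hits = sum(ff_burn(without_v, u, p, rng) == target for _ in range(trials))
    return hits / trials                     # fraction with S(v) = N(v)


if __name__ == "__main__":
    graph = {0: {1, 2, 3}, 1: {0, 3}, 2: {0}, 3: {0, 1}}
    rng = random.Random(0)
    for anchor in sorted(graph[3]):
        print("anchor", anchor, ff_likelihood(graph, anchor, 3, 0.3, 2000, rng))
```

Only neighbors of v need to be tried as anchors, because v always links to its anchor in step 1 of the model.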

Preferential Attachment (PA)

• Given: a graph with a set of k+1 nodes.
• Steps:
  – 1. Create a probability distribution histogram in which each node u is assigned a probability of du/2m (du is the degree of u and m is the number of edges).
  – 2. Choose k nodes according to the distribution.
  – 3. Node v links to those k nodes from step 2.
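The PA procedure above can be sketched as follows. Two details are assumptions made for this example because the slide does not specify them: the seed of k+1 nodes is taken to be a clique, and the k targets are drawn one at a time until k distinct nodes have been chosen. The name `grow_pa` is hypothetical.

```python
import random


def grow_pa(n, k, seed=0):
    """Grow a preferential attachment network to n nodes, k links per new node."""
    rng = random.Random(seed)
    # Seed graph on k+1 nodes (assumed here to be a clique).
    adj = {i: set(range(k + 1)) - {i} for i in range(k + 1)}
    while len(adj) < n:
        nodes = sorted(adj)
        # Weights proportional to degree, i.e. to d_u / 2m after normalization.
        weights = [len(adj[u]) for u in nodes]
        targets = set()
        while len(targets) < k:              # redraw until k distinct nodes
            targets.add(rng.choices(nodes, weights=weights)[0])
        v = len(adj)                         # new label; assumes nodes are 0..n-1
        adj[v] = targets
        for u in targets:
            adj[u].add(v)
    return adj


if __name__ == "__main__":
    network = grow_pa(100, k=5)
    print(max(len(nbrs) for nbrs in network.values()), "is the largest degree")
```

Passing raw degrees as weights is enough, since `random.choices` normalizes them, which is equivalent to dividing each degree by 2m.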

Reversal of PA

• No anchor node for PA

• Compute which node is to be removed using

Tests

• Model reversibility using the greedy likelihood algorithm
• Recovery of past social networks
• Effect of deviation from the assumed model
  – Noise

Model reversibility using the greedy likelihood algorithm

• Gt = 100
• DMC:
  – Increasing qcon: easier to reverse
  – Increasing qmod: more difficult to reverse
• FF:

  p (value)     Anchor relationships identified     Performance
  0.1 to 0.5    25% to 64%                          Good

• PA:
  – Increasing k: easier to reverse
  – When k > 15, the Kendall tau value exceeds 80%

Result: PA is most easily reversible
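The Kendall tau value quoted for PA compares two orderings of the same nodes. A minimal sketch of that measure is given below, assuming it is applied to the true node arrival order and the order recovered by reversing the model, and assuming neither ordering contains ties. The function name and arguments are hypothetical.

```python
from itertools import combinations


def kendall_tau(true_order, inferred_order):
    """Kendall tau between two orderings of the same nodes (no ties)."""
    rank_true = {node: i for i, node in enumerate(true_order)}
    rank_inferred = {node: i for i, node in enumerate(inferred_order)}
    concordant = discordant = 0
    for a, b in combinations(true_order, 2):
        # Positive when both orderings agree on which of a, b came first.
        agreement = ((rank_true[a] - rank_true[b])
                     * (rank_inferred[a] - rank_inferred[b]))
        if agreement > 0:
            concordant += 1
        elif agreement < 0:
            discordant += 1
    pairs = len(true_order) * (len(true_order) - 1) // 2
    return (concordant - discordant) / pairs


if __name__ == "__main__":
    print(kendall_tau(["a", "b", "c", "d"], ["b", "a", "c", "d"]))
```

A value of 1 means the recovered order matches the true order exactly, and -1 means it is fully reversed.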

Effect of deviation from the assumed model

Noise
• DMC
  – Most sensitive to noise
  – Can tolerate noise up to 30%
• PA
  – Most resilient to noise
• FF
  – Between PA and DMC

Effect of noise on DMC model

Effect of noise on PA model

Recovery of past social networks

• Scenario: Only a single snapshot of network is available.

• Models were applied to part of the Last.fm music social network
  – Nodes: users
  – Edges: links to users who are friends

• DMC performed poorly compared to PA. Why?

Conclusion

• DMC performed best in biological networks
• PA performed best in social networks
• An individual node’s journey can be tracked through time
• Present-day networks are strongly linked to their past