Supervised Random Walks

Pawan Goyal

CSE, IITKGP

September 8, 2014

Pawan Goyal (IIT Kharagpur), Supervised Random Walks, September 8, 2014

Correlation Discovery by random walk

Problem definition: Estimate the importance/affinity of node “B” with respect to another node “A” in the graph.

Framework: Random walk with restarts

Goal: Compute the importance of node “B” for node “A”.

Consider a random walker that starts from node “A”, choosing among the available edges every time.

Except that, before he makes a choice, with probability c, he goes back to node “A” (restart).
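
The walker described above can be simulated directly. The following is a minimal sketch, not the method of the cited papers: the toy graph, the value c = 0.15, the step count, and the rule that a walker with no outgoing edges restarts are all assumptions made for illustration.

```python
import random

def simulate_rwr(neighbors, start, c=0.15, steps=100_000, seed=0):
    """Estimate visit frequencies of a random walk with restarts.

    neighbors: dict mapping each node to the list of nodes it links to.
    start:     the node "A" that the walker starts from and restarts at.
    c:         restart probability (0.15 is an arbitrary illustrative value).
    """
    rng = random.Random(seed)
    visits = {node: 0 for node in neighbors}
    current = start
    for _ in range(steps):
        visits[current] += 1
        # Before choosing an edge, go back to the start with probability c.
        # Restarting at a dead end is an assumption of this sketch.
        if rng.random() < c or not neighbors[current]:
            current = start
        else:
            current = rng.choice(neighbors[current])
    return {node: count / steps for node, count in visits.items()}

# Hypothetical toy graph: a path A - B - C - D with edges in both directions.
toy_graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}
freq = simulate_rwr(toy_graph, start="A")
```

The fraction of steps spent at a node approximates its importance with respect to “A”. On this path graph the far end “D” is visited least often.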

Random walk with restarts

Let $u_A(B)$ denote the steady-state probability that the random walker will find himself at node “B”.

$u_A(B)$ is what we want: the importance of “B” with respect to “A”.

$\mathbf{u}_A = (u_A(1), \ldots, u_A(N))$

Steady-state vector: $\mathbf{u}_A = (1 - c)\,A\,\mathbf{u}_A + c\,\mathbf{v}_A$

$A$: transition matrix, $c$: restart probability, $\mathbf{v}_A$: restart vector with all its $N$ elements zero except for the entry corresponding to node “A”, which is 1.
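
One way to obtain the steady-state vector is to iterate the equation until it stops changing. The sketch below assumes that the transition matrix is column-stochastic (column j holds the outgoing probabilities of node j), so that multiplying it by a column vector propagates probability mass. The 3-node matrix, the tolerance, and c = 0.15 are illustrative choices.

```python
def rwr_steady_state(A, restart_index, c=0.15, tol=1e-12, max_iter=1000):
    """Iterate u <- (1 - c) * A u + c * v until it stops changing.

    A is a list of rows, assumed column-stochastic: A[i][j] is the
    probability of moving from node j to node i.
    """
    n = len(A)
    v = [1.0 if i == restart_index else 0.0 for i in range(n)]
    u = v[:]  # start with all probability mass on the restart node
    for _ in range(max_iter):
        new_u = [
            (1 - c) * sum(A[i][j] * u[j] for j in range(n)) + c * v[i]
            for i in range(n)
        ]
        if sum(abs(x - y) for x, y in zip(new_u, u)) < tol:
            return new_u
        u = new_u
    return u

# Hypothetical 3-node graph: node 0 links to 1 and 2, node 1 links to
# 0 and 2, node 2 links to 0. Each column sums to 1.
A = [
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]
u_A = rwr_steady_state(A, restart_index=0)
```

Because each update shrinks the error by a factor of (1 − c), the iteration converges for any c > 0. The result sums to 1, and the restart node receives the largest score in this example.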

The problem of link prediction and recommendation

Link Prediction: We are given a snapshot of a social network at time t.

We seek to predict the edges that will be added to the network during the interval from time t to a future time t′.

E.g., we are given a large network, say Facebook, at time t, and for each user we would like to predict which new edges (friendships) that user will create between t and t′.

Link Recommendation Problem: The same problem can also be viewed as a link recommendation problem, where we aim to suggest to each user a list of people with whom the user is likely to create new connections.
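
To make the setup concrete, the sketch below splits a timestamped edge list into the snapshot at time t and the edges added during (t, t′]. The edge log, user names, and helper names are hypothetical and serve only as an illustration.

```python
def split_by_time(timestamped_edges, t, t_prime):
    """Split (u, v, time) triples into the snapshot at t and the edges added in (t, t']."""
    snapshot = {(u, v) for u, v, ts in timestamped_edges if ts <= t}
    future = {(u, v) for u, v, ts in timestamped_edges if t < ts <= t_prime}
    return snapshot, future

def new_links_of(source, future_edges):
    """Nodes the source connects to during (t, t']: the prediction targets."""
    return {v for u, v in future_edges if u == source}

# Hypothetical edge log: (user, friend, creation time in days).
edges = [
    ("ann", "bob", 1),
    ("bob", "cat", 2),
    ("ann", "cat", 5),
    ("ann", "dan", 7),
    ("bob", "dan", 12),
]
snapshot, future = split_by_time(edges, t=3, t_prime=10)
targets = new_links_of("ann", future)
```

A predictor sees only `snapshot`. Its ranked suggestions for “ann” are then compared against `targets`, here the users “cat” and “dan”.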

Challenges Involved

Sparsity: Real networks are very sparse. On Facebook, a typical user is connected to about 100-200 out of more than 500 million nodes.

Can it be modeled using network features only?

New edges in Facebook social network

Creation of New Links: Important questions

How do network and node features interact?

How important is it to have common interests and characteristics?

How important is it to be in the same social circle and be “close” in the network in order to eventually connect?

Develop a method that combines the features of nodes (user profile) and edges (interaction) with the network structure.

Supervised Random Walks

Basic Idea: In a supervised way, learn how to bias a PageRank-like random walk on the network so that it visits given nodes (positive training examples) more often than the others.

Use node and edge features to learn edge strengths.

A random walk on such a weighted network will be more likely to visit “positive” than “negative” nodes.

Link prediction: “positive”: nodes to which new edges will be created in the future; “negative”: all other nodes.

Link recommendation: “positive”: nodes that the user clicks on.

Learning Task

Training data: A source node s is given, along with training examples: nodes to which s will create links in the future.

Goal: Learn a function that assigns a strength (random walk transition probability) to each edge.

Link Prediction: Baseline Approaches

Link prediction as a classification task: Take nodes to which s has created edges as positive training examples, and all other nodes as negative training examples.

Learn a classifier that predicts where node s is going to create links.

Random walk with restarts: Start a random walk at node s and compute the proximity of every other node to node s.

Relation to personalized PageRank

We are given a source node $s$ and a set of destination nodes $d_1, \ldots, d_k \in D$ to which $s$ will create edges in the future.

The aim is to bias the random walk such that it will visit the nodes $d_i$ more often than the other nodes in the network.

Can we directly set an arbitrary transition probability on each edge?

This would result in drastic over-fitting.

Instead, we assign the transition probability for each edge $(u,v)$ based on features of nodes $u$ and $v$, as well as features of the edge $(u,v)$.

Problem Formulation

Directed graph $G(V,E)$

Node $s$, destination nodes $D = \{d_1, \ldots, d_k\}$ and no-link nodes $L = \{l_1, \ldots, l_n\}$

Each edge $(u,v)$ has a feature vector $\psi_{uv}$ that describes the nodes $u$ and $v$ (e.g., gender, age, hometown) and the interaction attributes (e.g., time of edge creation, messages exchanged, photos in which they appear together).

Compute the strength $a_{uv} = f_w(\psi_{uv})$ for edge $(u,v)$.

We want to learn the function $f_w(\psi)$ in the training phase of the algorithm.
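
As an illustration of what such a strength function can look like, the sketch below uses a logistic function of the inner product between the parameter vector and the feature vector. The logistic form is an assumption chosen because it always yields a positive strength; the feature values and the parameter vector are made up and not learned.

```python
import math

def edge_strength(w, psi):
    """Logistic edge strength: a_uv = 1 / (1 + exp(-w . psi_uv)).

    The logistic form is an assumption made for this sketch; the text
    only requires some parametric function f_w of the edge features.
    """
    score = sum(w_k * x_k for w_k, x_k in zip(w, psi))
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical feature vectors psi_uv with two made-up features per edge.
features = {
    ("s", "a"): [1.0, 0.2],
    ("s", "b"): [0.0, 0.9],
    ("a", "b"): [3.0, 0.1],
}
w = [0.5, -1.0]  # illustrative parameter vector, not learned
strengths = {edge: edge_strength(w, psi) for edge, psi in features.items()}
```

Training would adjust `w`; here it is fixed only to show how the same parameters produce a different strength for each edge.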

Predicting new edges using Edge Strength

Edge strengths of all edges are calculated using $f_w$.

Random walk with restarts is run from $s$.

The stationary distribution $p$ of the random walk assigns each node $u$ a probability $p_u$.

Top-ranked nodes are predicted as destinations of future links of $s$.
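
The final ranking step can be sketched as follows. The probabilities are hypothetical, and excluding the source and its existing neighbors from the candidates is an assumption of this sketch rather than something the slides state.

```python
def predict_links(p, source, existing_neighbors, k=2):
    """Return the k highest-scoring candidates for new links of `source`.

    p maps each node u to its stationary probability p_u. Excluding the
    source and its current neighbors is an assumption of this sketch.
    """
    candidates = [
        node for node in p
        if node != source and node not in existing_neighbors
    ]
    return sorted(candidates, key=lambda node: p[node], reverse=True)[:k]

# Hypothetical stationary probabilities from a walk restarted at "s".
p = {"s": 0.40, "a": 0.25, "b": 0.15, "c": 0.12, "d": 0.08}
top = predict_links(p, source="s", existing_neighbors={"a"}, k=2)
```

With these values the two predicted destinations are “b” and “c”, since “a” is already a neighbor of the source.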

Using edge weights

The function $f_w(\psi_{uv})$ combines the attributes $\psi_{uv}$ and the parameter vector $w$ to output a non-negative weight $a_{uv}$ for each edge.

We use this to build the random walk stochastic transition matrix $Q'$ such that

$$Q'_{uv} = \begin{cases} \dfrac{a_{uv}}{\sum_{w} a_{uw}} & \text{if } (u,v) \in E, \\ 0 & \text{otherwise.} \end{cases}$$

Corresponding matrix for random walk with restart:

$$Q_{uv} = (1 - c)\,Q'_{uv} + c\,\mathbf{1}(v = s)$$

Verify that $Q$ is row stochastic.

$P$ (a $1 \times n$ row vector) is the stationary distribution of the random walk with restarts, and is the solution of the following equation:

$$P = PQ$$
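
The construction above can be checked numerically. The sketch below builds the restart matrix from a hypothetical set of edge strengths, and then solves for the stationary row vector by repeated multiplication. It assumes every node has at least one outgoing edge; the strengths, node names, and c = 0.15 are illustrative.

```python
def build_restart_matrix(strengths, nodes, source, c=0.15):
    """Build Q from edge strengths a_uv: Q_uv = (1 - c) Q'_uv + c * 1(v == source)."""
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    Q = [[0.0] * n for _ in range(n)]
    for u in nodes:
        out = {v: a for (x, v), a in strengths.items() if x == u}
        total = sum(out.values())  # assumes every node has an outgoing edge
        for v, a in out.items():
            Q[index[u]][index[v]] = (1 - c) * a / total
        Q[index[u]][index[source]] += c  # restart mass goes to the source
    return Q

def stationary(Q, tol=1e-12, max_iter=10_000):
    """Solve P = P Q by repeated left-multiplication of a row vector."""
    n = len(Q)
    P = [1.0 / n] * n
    for _ in range(max_iter):
        new_P = [sum(P[u] * Q[u][v] for u in range(n)) for v in range(n)]
        if sum(abs(x - y) for x, y in zip(new_P, P)) < tol:
            return new_P
        P = new_P
    return P

# Hypothetical strengths a_uv on a small directed graph.
strengths = {
    ("s", "a"): 2.0, ("s", "b"): 1.0,
    ("a", "b"): 1.0, ("a", "s"): 1.0,
    ("b", "s"): 3.0,
}
nodes = ["s", "a", "b"]
Q = build_restart_matrix(strengths, nodes, source="s")
P = stationary(Q)
```

Each row of the unrestarted matrix sums to 1, so each row of the restart matrix sums to (1 − c) · 1 + c = 1. That is the row-stochastic property the slide asks to verify, and `sum(row)` confirms it for every row of `Q`.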

Optimization Problem

Aim: Learn the parameters $w$ of the function $f_w(\psi_{uv})$ that assigns each edge a strength $a_{uv}$.

Criterion: Assign the weights such that the random walk is more likely to visit nodes in $D$ than in $L$, i.e., $p_l < p_d$ for each $d \in D$ and $l \in L$.

Optimization function

$$\min_w F(w) = \|w\|^2 \quad \text{such that} \quad \forall d \in D,\ l \in L:\ p_l < p_d$$

The $p_i$ are the PageRank scores.

A smaller $w$ is preferred simply for regularization.
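
The hard constraints are easy to test for a given set of scores. The helper below lists the destination and no-link pairs that break the constraint; the scores and node names are hypothetical and serve only as an illustration.

```python
def violated_pairs(p, D, L):
    """List the (d, l) pairs that break the constraint p_l < p_d."""
    return [(d, l) for d in D for l in L if not p[l] < p[d]]

# Hypothetical PageRank-style scores for destination and no-link nodes.
p = {"d1": 0.20, "d2": 0.05, "l1": 0.10, "l2": 0.02}
bad = violated_pairs(p, D=["d1", "d2"], L=["l1", "l2"])
```

Here only the pair (d2, l1) is violated, because the no-link node l1 scores higher than the destination d2. A parameter vector is feasible for the hard version only when this list is empty.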

Optimization function: Softer version

$$\min_w F(w) = \|w\|^2 + \lambda \sum_{d \in D,\, l \in L} h(p_l - p_d)$$

$h(\cdot)$: loss function such that $h(p_l - p_d) = 0$ when $p_l < p_d$ and $h(p_l - p_d) > 0$ when $p_l - p_d > 0$.
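
The soft objective can be evaluated as in the sketch below. The squared hinge loss is one possible choice that satisfies the stated conditions on h; it is an assumption here, as are the scores, the parameter vector, and λ = 10. In the full method the scores depend on w through the random walk; the sketch takes them as given to keep the example short.

```python
def squared_hinge(x):
    """h(x) = max(0, x)^2: zero when p_l < p_d, positive when p_l > p_d."""
    return max(0.0, x) ** 2

def soft_objective(w, p, D, L, lam=1.0, h=squared_hinge):
    """F(w) = ||w||^2 + lam * sum over d in D, l in L of h(p_l - p_d)."""
    regularizer = sum(w_k * w_k for w_k in w)
    penalty = sum(h(p[l] - p[d]) for d in D for l in L)
    return regularizer + lam * penalty

# Hypothetical parameter vector and scores (p would normally be computed from w).
w = [0.5, -1.0]
p = {"d1": 0.20, "d2": 0.05, "l1": 0.10, "l2": 0.02}
value = soft_objective(w, p, D=["d1", "d2"], L=["l1", "l2"], lam=10.0)
```

Only the pair (d2, l1) contributes to the penalty, so the value is about 1.25 + 10 · 0.05² = 1.275. Unlike the hard version, a violated pair raises the objective instead of making the problem infeasible.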

Features used for the Facebook Network

For each edge (i, j),

The number of common friends between the two nodes

Communication and observation features: probability of communication and profile observation in a one-week period

Edge initiator: Individual making the friend request is encoded as +1 or -1

Edge age
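
A few of these features can be computed from a friendship graph alone. The sketch below assembles the common-friends count, the initiator sign, and the edge age for one edge. The communication and observation features are omitted because they need activity logs. The friend lists, time unit, and feature order are hypothetical.

```python
def edge_features(friends, i, j, initiator, created_at, now):
    """Assemble a small feature vector for edge (i, j).

    friends:   dict mapping each user to the set of their friends.
    initiator: the user who sent the friend request.
    The time unit and feature order are choices made for this sketch.
    """
    common = len(friends[i] & friends[j])    # number of common friends
    sign = 1.0 if initiator == i else -1.0   # edge initiator as +1 or -1
    age = now - created_at                   # edge age
    return [float(common), sign, float(age)]

# Hypothetical friendship sets.
friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob"},
    "dan": {"ann", "bob"},
}
psi = edge_features(friends, "ann", "bob", initiator="bob", created_at=3, now=10)
```

For the edge (ann, bob) this gives two common friends, an initiator sign of −1 because “bob” sent the request, and an age of 7 time units.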

References

Random Walk with Restarts: Pan, Jia-Yu, et al. “Automatic multimedia cross-modal correlation discovery.” Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004.

Supervised Random Walks: Backstrom, Lars, and Jure Leskovec. “Supervised random walks: predicting and recommending links in social networks.” Proceedings of the fourth ACM international conference on Web search and data mining. ACM, 2011.
