
Page 1: Many random walks are faster than one

Many random walks are faster than one

Noga Alon Tel Aviv University

Chen Avin Ben Gurion University

Michal Koucky Czech Academy of Sciences

Gady Kozma Weizmann Institute

Zvi Lotker Ben Gurion University

Mark R. Tuttle Intel

Page 2: Many random walks are faster than one

Random walks

• Random step:
  – Move to an adjacent node chosen at random (and uniformly)

• Random walk:
  – Take an infinite sequence of random steps

• Random walks are cool! Who needs applications?
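As a concrete illustration (not from the slides), here is a minimal Python sketch of these two definitions; the adjacency-list representation and the 4-cycle example are illustrative choices:

import random

def random_step(graph, node):
    # Random step: move to a neighbor of `node` chosen uniformly at random.
    return random.choice(graph[node])

def random_walk(graph, start, steps):
    # Random walk: a finite prefix of the (conceptually infinite) sequence of random steps.
    path = [start]
    for _ in range(steps):
        path.append(random_step(graph, path[-1]))
    return path

# Illustrative example: a 4-node cycle given as an adjacency list.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(random_walk(cycle4, start=0, steps=10))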

Page 3: Many random walks are faster than one

Applications

• Graph exploration
  – Randomization avoids the need to know the topology
  – Randomization rules when the graph is changing or unknown

• Tracking
  – Hunters and prey start on different nodes
  – Hunters must locate and track the prey

• Communication: devices send messages at random
  – Exhibits locality, simplicity, low overhead, robustness
  – Becoming a popular approach for mobile devices

• And querying, searching, routing, self-stabilization in wireless ad hoc, peer-to-peer, and distributed systems
  – Example: find the max node when the edges go up and down
  – Can’t use depth-first search: can’t backtrack over a missing edge

Page 4: Many random walks are faster than one

Latency is a problem

• There are many measures of latency:

– Hitting time: Expected time E(H) to visit a given node

– Cover time: Expected time E(C) to visit all nodes

– Mixing time: Expected time to reach the stationary distribution

• A walk spends a πv fraction of its time at node v on average

• After the mixing time, a walk is at node v with probability πv
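A small Python sketch (illustrative, not from the slides) of the time-average fact: over a long walk, the fraction of time spent at v approaches πv = deg(v)/2|E|; the example graph is made up for the demo:

import random
from collections import Counter

# Illustrative connected graph as an adjacency list.
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
two_m = sum(len(nbrs) for nbrs in graph.values())   # 2 * number of edges

steps = 200_000
v, visits = 0, Counter()
for _ in range(steps):
    v = random.choice(graph[v])
    visits[v] += 1

for u in graph:
    # Empirical fraction of time at u vs. the stationary probability deg(u) / 2|E|.
    print(u, round(visits[u] / steps, 3), "vs", round(len(graph[u]) / two_m, 3))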

Page 5: Many random walks are faster than one

Our question

Can multiple walks reduce the latency?

Choose a node v in a graph.
Start k random walks from node v.

Can k walks cover the graph k times faster than 1 walk?

Our answer:

Many times yes,

but not always.

Page 6: Many random walks are faster than one

Outline

• First some fun: Calculate speed-ups for simple graphs
  – Clique (complete graph): linear speed-up
  – Barbell: exponential speed-up
  – Cycle: logarithmic speed-up

• Then some answers: When is linear speed-up possible?
  – Simple formulation of our linear speed-up result
  – General formulations of our linear speed-up result
    • In terms of the ratio cover-time/hitting-time
    • In terms of mixing time

• Conclusions and open problems

Page 7: Many random walks are faster than one

Calculating speed-ups for simple graphs

Page 8: Many random walks are faster than one

Computer science probability

• Coin flipping
  – p is the probability a coin lands heads
  – 1/p is the expected waiting time until the coin lands heads

• Markov inequality: P(X ≥ k·μ) ≤ 1/k

• Chernoff inequality: P(X ≥ (1+δ)·μ) ≤ e^(-δ²·μ/4)

  – When μ = log(n) and δ = 4, this probability is very small!
    e^(-δ²·μ/4) = e^(-4·log(n)) = n^(-4)

  – If you expect log(n) samples to be bad, then with high probability fewer than 5·log(n) really are bad
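A quick numeric sanity check of these two tools in Python (the parameters p = 0.1 and n = 1000 are illustrative):

import math, random

# Waiting time: a coin with heads-probability p takes about 1/p flips on average.
p, trials = 0.1, 100_000
total = 0
for _ in range(trials):
    flips = 1
    while random.random() >= p:
        flips += 1
    total += flips
print(total / trials, "vs 1/p =", 1 / p)

# Chernoff: with mu = log(n) and delta = 4, e^(-delta^2 * mu / 4) equals n^(-4).
n = 1000
mu, delta = math.log(n), 4
print(math.exp(-delta**2 * mu / 4), "vs n^-4 =", n**-4.0)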

Page 9: Many random walks are faster than one

Calculating speed-ups for simple graphs

Let’s calculate E(C1) and E(Ck) for the clique, the barbell, the cycle.

Page 10: Many random walks are faster than one

The clique

Page 11: Many random walks are faster than one

Clique hitting time: n

• A random walk starting at node A
  – Chooses a random node each step
  – Chooses node B with probability 1/n
  – Expected waiting time until choosing B is n

• Hitting time from A to B is n

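A simulation sketch of this calculation (illustrative; it uses the slide’s model in which each step picks one of the n nodes uniformly, so node B is hit with probability 1/n per step):

import random

def clique_hitting_time(n, trials=20_000):
    # The slide's model: each step picks one of the n nodes uniformly,
    # so the target B (node 0 here) is hit with probability 1/n per step.
    total = 0
    for _ in range(trials):
        steps = 0
        while random.randrange(n) != 0:
            steps += 1
        total += steps + 1
    return total / trials

n = 100
print(clique_hitting_time(n), "vs n =", n)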

Page 12: Many random walks are faster than one

Clique cover time: n log(n)

• Random walk visits nodes a1, a2, …, an

• Assume i visited nodes, n-i unvisited nodes
  – (n-i)/n is the probability the next node chosen is unvisited
  – n/(n-i) is the expected time until an unvisited node is chosen

• Cover time is the sum of these waits:

  Ei = time to visit the (i+1)st node after visiting the ith node

  [Figure: visit sequence a1, a2, … , ai, ai+1, … , an, with the first i nodes visited]

  E(C) = E1 + E2 + … + En-1 = n/(n-1) + n/(n-2) + … + n/1 = n(1 + 1/2 + … + 1/(n-1)) ≈ n log(n)
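A sketch that compares the exact sum above with a simulation, again using the uniform-node model of the previous slides (n and the trial count are illustrative):

import math, random

def clique_cover_exact(n):
    # E(C) = n/(n-1) + n/(n-2) + ... + n/1, the sum derived above.
    return sum(n / (n - i) for i in range(1, n))

def clique_cover_sim(n, trials=2_000):
    total = 0
    for _ in range(trials):
        visited, steps = {0}, 0
        while len(visited) < n:
            visited.add(random.randrange(n))   # uniform node each step
            steps += 1
        total += steps
    return total / trials

n = 200
print(clique_cover_exact(n), clique_cover_sim(n), "vs n log n =", n * math.log(n))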

Page 13: Many random walks are faster than one

Clique speed-up: k

• A k-walk chooses nodes k times faster
  – 1 step of a k-walk chooses k nodes at random
  – k steps of a 1-walk choose k nodes at random
  – So on the clique, P(Ck ≥ t) = P(C1 ≥ kt)

• Calculate expectations, then regroup terms:

  E(C1) = Σt P(C1 ≥ t) ≈ Σt k·P(C1 ≥ kt) = k·Σt P(Ck ≥ t) = k·E(Ck)
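A sketch of the same comparison with k walks: one step of the k-walk reveals k uniformly random nodes, so the measured speed-up should be close to k (parameters are illustrative):

import random

def k_walk_cover_sim(n, k, trials=2_000):
    # One step of the k-walk reveals k uniformly random nodes of the clique.
    total = 0
    for _ in range(trials):
        visited, steps = {0}, 0
        while len(visited) < n:
            visited.update(random.randrange(n) for _ in range(k))
            steps += 1
        total += steps
    return total / trials

n = 200
c1 = k_walk_cover_sim(n, 1)
for k in (2, 4, 8):
    print(k, "walks: measured speed-up", round(c1 / k_walk_cover_sim(n, k), 2))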

Page 14: Many random walks are faster than one

The barbell

Page 15: Many random walks are faster than one

Barbell cover time: n²

• The walk starts at O and moves to L or R: let’s say to L
• The walk must move back to O in order to cover R
• How long do we expect to wait for this L → O transition?

– From L, the walk moves to O with probability 1/(n+1)

• Expect to fail n times and move to BL instead of O

– From BL, the walk takes a long time to return to L

• Remember the hitting time in the clique is n

• Expect n steps to return to L from inside BL

[Figure: barbell: clique BL attached to L and clique BR attached to R, joined by the path L - O - R]

E(C) ≥ E(L → O) ≈ (n)(n) = n²    (trust me)
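A simulation sketch of the barbell (an illustrative construction matching the picture above: two n-node cliques BL and BR attached to L and R, joined through O); the estimated single-walk cover time should be of order n²:

import random

def barbell(n):
    # Two n-node cliques BL (nodes 0..n-1) and BR (nodes n..2n-1),
    # attached to L and R respectively, joined through the path L - O - R.
    L, O, R = 2 * n, 2 * n + 1, 2 * n + 2
    adj = {v: [] for v in range(2 * n + 3)}
    for side, hub in ((range(0, n), L), (range(n, 2 * n), R)):
        for u in side:
            adj[u] = [w for w in side if w != u] + [hub]
            adj[hub].append(u)
    adj[L].append(O)
    adj[O] += [L, R]
    adj[R].append(O)
    return adj

def cover_time(adj, start, trials=200):
    total = 0
    for _ in range(trials):
        v, visited, steps = start, {start}, 0
        while len(visited) < len(adj):
            v = random.choice(adj[v])
            visited.add(v)
            steps += 1
        total += steps
    return total / trials

n = 20
print(cover_time(barbell(n), start=2 * n + 1), "vs n^2 =", n * n)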

Page 16: Many random walks are faster than one

Barbell speed-up: 2^k

• Start k=c log(n) walks on O (but let’s ignore ugly constants)

• Expect half to move to BL, half to BR: that’s log(n) in each

• Expect log(n) walks in BL and BR to stay there for n steps
  – Remember the hitting time for the clique is n

• Expect log(n) walks in BL and BR to cover them in n steps
  – Remember the k-walk cover time for the clique is n log(n)/k

• So expect log(n) walks to cover the barbell in n steps, not n²
  – Trust me: The proof must turn each “expect” into “with high probability”
  – Rejoice with me: That’s a speed-up of n = 2^log(n) = 2^k

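A sketch of the same experiment with k = O(log n) walks, reusing the barbell(n) builder from the previous sketch; the constant in k and the graph size are illustrative, and the measured speed-up should be well above k:

import math, random

def k_cover_time(adj, start, k, trials=200):
    # k walks all start at `start` and advance in lockstep; count the rounds
    # until every node has been seen by at least one of them.
    total = 0
    for _ in range(trials):
        walks, visited, steps = [start] * k, {start}, 0
        while len(visited) < len(adj):
            walks = [random.choice(adj[v]) for v in walks]
            visited.update(walks)
            steps += 1
        total += steps
    return total / trials

n = 20
g = barbell(n)                             # barbell builder from the sketch above
start_O = 2 * n + 1
k = 2 * max(1, round(math.log2(n)))        # k = O(log n) walks; the constant is arbitrary
c1, ck = k_cover_time(g, start_O, 1), k_cover_time(g, start_O, k)
print(k, "walks:", round(c1 / ck, 1), "times faster")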

Page 17: Many random walks are faster than one

The cycle

[Figure: cycle with nodes 0, 1, …, n; node i lies between i-1 and i+1]

Page 18: Many random walks are faster than one

Cycle cover time: n²

• Let Ei be expected time to reach 0 from i

– E0 = 0

– Ei = 1 + (Ei+1 + Ei-1)/2

– En = E1

• Solve these recurrence relations– Show Ei = (i-1)E1 - (i-1)i

• Notice Ei+1 – Ei = Ei – Ei-1 – 2

• Define Di+1 = Ei+1 – Ei and notice Di+1 = Di – 2 = E1 – 2i

– Show E1 ≈ n

• Notice E1 = En = (n-1)E1 - (n-1)n and solve for E1

• So Ei ≈ (i-1)n – (i-1)i = (i-1)(n-i)– Maximized at i = n/2 and maximum value is n2/4

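A sketch that checks the hitting-time formula Ei = i(n+1-i) ≈ i(n-i) by simulation on the cycle 0, 1, …, n (the cycle size is illustrative):

import random

def cycle_hit_time(n, i, trials=20_000):
    # Expected steps for a walk on the cycle 0, 1, ..., n to reach node 0, starting at node i.
    total = 0
    for _ in range(trials):
        v, steps = i, 0
        while v != 0:
            v = (v + random.choice((-1, 1))) % (n + 1)
            steps += 1
        total += steps
    return total / trials

n = 20
for i in (1, n // 2, n):
    print(i, round(cycle_hit_time(n, i), 1), "vs i(n+1-i) =", i * (n + 1 - i))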

Page 19: Many random walks are faster than one

Cycle speed-up: log(k)

• Theorem: If E(Ck) ≤ n²/s, then s = O(log k)

• Proof: We will show the following:

  1/2 ≤ P(one of the k walks hits node n/2 within 2n²/s steps) ≤ k·e^(-Ω(s))

  so e^(Ω(s)) ≤ 2k, and therefore s = O(log k)

[Figure: cycle with nodes 0, 1, …, n and the opposite node n/2 marked]

Page 20: Many random walks are faster than one

Claim: P(some walk w hits node n/2 within 2n²/s steps) ≥ 1/2

• E(Ck) ≤ n²/s by assumption

• P(Ck ≥ 2n²/s) ≤ 1/2 by Markov

• So all nodes are hit within 2n²/s steps with probability ≥ 1/2

• So some walk hits node n/2 within 2n²/s steps with probability ≥ 1/2

Page 21: Many random walks are faster than one

• Walk w must take n/2 more steps in one direction than the other to reach node n/2
  – Let Si = +1 or -1 indicate whether w moves left or right at step i
  – Let Dt = S1 + S2 + … + St be the difference, steps left - steps right
  – So hitting node n/2 within 2n²/s steps requires |Dt| ≥ n/2 for some t ≤ 2n²/s

• We can show using Chernoff that

  P(|Dt| ≥ n/2 for some t ≤ 2n²/s) ≤ e^(-Ω(s))

• So

  P(some walk w hits node n/2) ≤ k·P(a given walk w hits node n/2) ≤ k·e^(-Ω(s))
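A simulation sketch of the theorem’s content: k walks from the same node on a cycle give a speed-up that grows only roughly logarithmically in k, far below k (cycle size and k values are illustrative):

import random

def cycle_k_cover(n, k, trials=300):
    # k walks start at node 0 of the n-node cycle and move in lockstep.
    total = 0
    for _ in range(trials):
        walks, visited, steps = [0] * k, {0}, 0
        while len(visited) < n:
            walks = [(v + random.choice((-1, 1))) % n for v in walks]
            visited.update(walks)
            steps += 1
        total += steps
    return total / trials

n = 100
c1 = cycle_k_cover(n, 1)
for k in (2, 4, 16):
    print(k, "walks: measured speed-up", round(c1 / cycle_k_cover(n, k), 2))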

Page 22: Many random walks are faster than one

These speed-ups are all over the map! (linear, exponential, logarithmic)

What is the right answer?

When is linear speed-up possible?

A simple answer.

A general answer.

Page 23: Many random walks are faster than one

Matthews’ Theorem

• Theorem: For any graph G

  C1 ≤ H1 · log(n)

• This bound may or may not be tight
  – On a clique, the cover time is n log(n) and the hitting time is n
  – On a line, the cover time and the hitting time are both ≈ n²

Page 24: Many random walks are faster than one

Matthews’ Theorem for k walks

• Theorem: For any graph G and k ≤ log(n)

  Ck ≤ (e/k) · H1 · log(n) + noise

• Think of a random walk of length e·H1 as a trial
  – Starting from any node, either the walk hits v or it doesn’t

• Bound the probability that log(n) trials fail
  – A walk hits v within H1 expected time (hitting-time definition)
  – A walk of length e·H1 fails to hit v with probability < 1/e (Markov)
  – So log(n) walks of length e·H1 all fail with probability < (1/e)^log(n) = 1/n

• Obtain log(n) trials using k random walks
  – k walks of length (log(n)/k) · e·H1 amount to log(n) trials

• So the k-walk cover time is ≤ (e/k) · H1 · log(n) + noise

Page 25: Many random walks are faster than one

Simple speed-up

• Theorem: When Matthews’ bound is tight,
  we have linear speed-up for k ≤ log(n)

• Proof:
  – C1 ≈ H1 · log(n) when Matthews’ bound is tight
  – Ck ≤ (e/k) · H1 · log(n) by the previous result
  – So Ck ≤ (e/k) · C1

• Observations:
  – Matthews is tight for many important graphs: cliques, expanders, the torus, hypercubes, d-dimensional grids, d-regular balanced trees, certain random graphs, etc.
  – We can prove a speed-up even when Matthews is not tight …

Page 26: Many random walks are faster than one

General speed-up

• Speed-up in terms of the cover-time/hitting-time ratio:

  – Theorem: If R(n) = E(C1)/E(H1) and k ≤ R(n)^(1-ε), then

    E(Ck) ≤ (1/k) · E(C1) + noise

  – When Matthews is tight, R(n) = log(n)
    • Replaces the constant e with 1, but at the cost of a slightly smaller k

• Speed-up in terms of mixing time:

  – Theorem: If G is a d-regular graph with mixing time M, then

    E(Ck) ≤ (M · log(n)/k) · E(C1) + noise

Page 27: Many random walks are faster than one

Expanders

• Expanders are highly-connected, sparse graphs:
  – Every node has degree d
  – Every set S of at most half the nodes has at least α·|S| neighbors

• Expanders have many applications:
  – Robust communication networks
  – Error-correcting codes, random number generators, cryptography
  – Distributed memories, sorting networks, topology, physics, …

• Expanders yield impressive cover-time speed-ups:
  – We proved linear speed-up for many graphs for k ≤ log(n)
  – We can prove linear speed-up for expanders for k as large as n!

Page 28: Many random walks are faster than one

Conclusions

• Linear speed-ups are possible for many important graphs
  – Speed-ups are related to the ratio C1/H1 of cover time and hitting time
  – Linear speed-ups occur when this ratio is large
  – This result is tight …

• Open problems:
  – Is the speed-up always at most k? Always at least log k?
  – Is there a property characterizing the speed-up better than C1/H1?
  – What if the random walks start at different nodes, not the same node?
  – What if the random walks can communicate or leave “breadcrumbs”?
  – What if the prey can move, not just the hunters?
  – What if the graph is actually changing dynamically?