
Page 1: Many random walks are faster than one

Many random walks are faster than one

Noga Alon Tel Aviv University

Chen Avin Ben Gurion University

Michal Koucky Czech Academy of Sciences

Gady Kozma Weizmann Institute

Zvi Lotker Ben Gurion University

Mark R. Tuttle Intel

Page 2: Many random walks are faster than one

Random walks

• Random step:
  – Move to an adjacent node chosen at random (and uniformly)

• Random walk:
  – Take an infinite sequence of random steps

• Random walks are cool! Who needs applications?
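As a concrete illustration (not from the slides), here is a minimal Python sketch of these two definitions; the adjacency-list representation and the 4-cycle example are illustrative choices:

import random

def random_step(graph, node):
    # Random step: move to a neighbor of `node` chosen uniformly at random.
    return random.choice(graph[node])

def random_walk(graph, start, steps):
    # Random walk: a finite prefix of the (conceptually infinite) sequence of random steps.
    path = [start]
    for _ in range(steps):
        path.append(random_step(graph, path[-1]))
    return path

# Illustrative example: a 4-node cycle given as an adjacency list.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(random_walk(cycle4, start=0, steps=10))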

Page 3: Many random walks are faster than one

Applications

• Graph exploration
  – Randomization avoids the need to know the topology
  – Randomization rules when the graph is changing or unknown

• Tracking
  – Hunters and prey start on different nodes
  – Hunters must locate and track the prey

• Communication: devices send messages at random
  – Exhibits locality, simplicity, low overhead, robustness
  – Becoming a popular approach for mobile devices

• And querying, searching, routing, self-stabilization in wireless ad hoc, peer-to-peer, and distributed systems
  – Example: find the max node when the edges go up and down
  – Can’t use depth-first search: can’t backtrack over a missing edge

Page 4: Many random walks are faster than one

Latency is a problem

• There are many measures of latency:

– Hitting time: Expected time E(H) to visit a given node

– Cover time: Expected time E(C) to visit all nodes

– Mixing time: Expected time to reach the stationary distribution

• A walk spends a πv fraction of its time at node v on average

• After the mixing time, a walk is at node v with probability πv
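A small Python sketch (illustrative, not from the slides) of the time-average fact: over a long walk, the fraction of time spent at v approaches πv = deg(v)/2|E|; the example graph is made up for the demo:

import random
from collections import Counter

# Illustrative connected graph as an adjacency list.
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
two_m = sum(len(nbrs) for nbrs in graph.values())   # 2 * number of edges

steps = 200_000
v, visits = 0, Counter()
for _ in range(steps):
    v = random.choice(graph[v])
    visits[v] += 1

for u in graph:
    # Empirical fraction of time at u vs. the stationary probability deg(u) / 2|E|.
    print(u, round(visits[u] / steps, 3), "vs", round(len(graph[u]) / two_m, 3))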

Page 5: Many random walks are faster than one

Our question

Can multiple walks reduce the latency?

Choose a node v in a graph.
Start k random walks from node v.

Can k walks cover the graph k times faster than 1 walk?

Our answer:

Many times yes,

but not always.

Page 6: Many random walks are faster than one

Outline

• First some fun: Calculate speed-ups for simple graphs
  – Clique (complete graph): linear speed-up
  – Barbell: exponential speed-up
  – Cycle: logarithmic speed-up

• Then some answers: When is linear speed-up possible?
  – Simple formulation of our linear speed-up result
  – General formulations of our linear speed-up result
    • In terms of the ratio cover-time/hitting-time
    • In terms of mixing time

• Conclusions and open problems

Page 7: Many random walks are faster than one

Calculating speed-ups for simple graphs

Page 8: Many random walks are faster than one

Computer science probability

• Coin flipping
  – p is the probability a coin lands heads
  – 1/p is the expected waiting time until the coin lands heads

• Markov inequality: P(X ≥ k·μ) ≤ 1/k

• Chernoff inequality: P(X ≥ (1+δ)·μ) ≤ e^(-δ²·μ/4)

  – When μ = log(n) and δ = 4, this probability is very small!
    e^(-δ²·μ/4) = e^(-4·log(n)) = n^(-4)

  – If you expect log(n) samples to be bad, then with high probability fewer than 5·log(n) really are bad
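A quick numeric sanity check of these two tools in Python (the parameters p = 0.1 and n = 1000 are illustrative):

import math, random

# Waiting time: a coin with heads-probability p takes about 1/p flips on average.
p, trials = 0.1, 100_000
total = 0
for _ in range(trials):
    flips = 1
    while random.random() >= p:
        flips += 1
    total += flips
print(total / trials, "vs 1/p =", 1 / p)

# Chernoff: with mu = log(n) and delta = 4, e^(-delta^2 * mu / 4) equals n^(-4).
n = 1000
mu, delta = math.log(n), 4
print(math.exp(-delta**2 * mu / 4), "vs n^-4 =", n**-4.0)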

Page 9: Many random walks are faster than one

Calculating speed-ups for simple graphs

Let’s calculate E(C1) and E(Ck) for the clique, the barbell, the cycle.

Page 10: Many random walks are faster than one

The clique

Page 11: Many random walks are faster than one

Clique hitting time: n

• A random walk starting at node A
  – Chooses a random node each step
  – Chooses node B with probability 1/n
  – Expected waiting time until choosing B is n

• Hitting time from A to B is n

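A simulation sketch of this calculation (illustrative; it uses the slide’s model in which each step picks one of the n nodes uniformly, so node B is hit with probability 1/n per step):

import random

def clique_hitting_time(n, trials=20_000):
    # The slide's model: each step picks one of the n nodes uniformly,
    # so the target B (node 0 here) is hit with probability 1/n per step.
    total = 0
    for _ in range(trials):
        steps = 0
        while random.randrange(n) != 0:
            steps += 1
        total += steps + 1
    return total / trials

n = 100
print(clique_hitting_time(n), "vs n =", n)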

Page 12: Many random walks are faster than one

Clique cover time: n log(n)

• Random walk visits nodes a1, a2, …, an

• Assume i visited nodes, n-i unvisited nodes
  – (n-i)/n is the probability the next node chosen is unvisited
  – n/(n-i) is the expected time until an unvisited node is chosen

• Cover time is the sum of these waits:

  Ei = time to visit the (i+1)st node after visiting the ith node

  [Figure: visit sequence a1, a2, … , ai, ai+1, … , an, with the first i nodes visited]

  E(C) = E1 + E2 + … + En-1 = n/(n-1) + n/(n-2) + … + n/1 = n(1 + 1/2 + … + 1/(n-1)) ≈ n log(n)
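A sketch that compares the exact sum above with a simulation, again using the uniform-node model of the previous slides (n and the trial count are illustrative):

import math, random

def clique_cover_exact(n):
    # E(C) = n/(n-1) + n/(n-2) + ... + n/1, the sum derived above.
    return sum(n / (n - i) for i in range(1, n))

def clique_cover_sim(n, trials=2_000):
    total = 0
    for _ in range(trials):
        visited, steps = {0}, 0
        while len(visited) < n:
            visited.add(random.randrange(n))   # uniform node each step
            steps += 1
        total += steps
    return total / trials

n = 200
print(clique_cover_exact(n), clique_cover_sim(n), "vs n log n =", n * math.log(n))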

Page 13: Many random walks are faster than one

Clique speed-up: k

• A k-walk chooses nodes k times faster
  – 1 step of a k-walk chooses k nodes at random
  – k steps of a 1-walk choose k nodes at random
  – So on the clique, P(Ck ≥ t) = P(C1 ≥ kt)

• Calculate expectations, then regroup terms:

  E(C1) = Σt P(C1 ≥ t) ≈ Σt k·P(C1 ≥ kt) = k·Σt P(Ck ≥ t) = k·E(Ck)
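A sketch of the same comparison with k walks: one step of the k-walk reveals k uniformly random nodes, so the measured speed-up should be close to k (parameters are illustrative):

import random

def k_walk_cover_sim(n, k, trials=2_000):
    # One step of the k-walk reveals k uniformly random nodes of the clique.
    total = 0
    for _ in range(trials):
        visited, steps = {0}, 0
        while len(visited) < n:
            visited.update(random.randrange(n) for _ in range(k))
            steps += 1
        total += steps
    return total / trials

n = 200
c1 = k_walk_cover_sim(n, 1)
for k in (2, 4, 8):
    print(k, "walks: measured speed-up", round(c1 / k_walk_cover_sim(n, k), 2))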

Page 14: Many random walks are faster than one

The barbell

Page 15: Many random walks are faster than one

Barbell cover time: n²

• The walk starts at O and moves to L or R: let’s say to L
• The walk must move back to O in order to cover R
• How long do we expect to wait for this L → O transition?

– From L, the walk moves to O with probability 1/(n+1)

• Expect to fail n times and move to BL instead of O

– From BL, the walk takes a long time to return to L

• Remember the hitting time in the clique is n

• Expect n steps to return to L from inside BL

[Figure: barbell: clique BL attached to L and clique BR attached to R, joined by the path L - O - R]

E(C) ≥ E(L → O) ≈ (n)(n) = n²    (trust me)
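A simulation sketch of the barbell (an illustrative construction matching the picture above: two n-node cliques BL and BR attached to L and R, joined through O); the estimated single-walk cover time should be of order n²:

import random

def barbell(n):
    # Two n-node cliques BL (nodes 0..n-1) and BR (nodes n..2n-1),
    # attached to L and R respectively, joined through the path L - O - R.
    L, O, R = 2 * n, 2 * n + 1, 2 * n + 2
    adj = {v: [] for v in range(2 * n + 3)}
    for side, hub in ((range(0, n), L), (range(n, 2 * n), R)):
        for u in side:
            adj[u] = [w for w in side if w != u] + [hub]
            adj[hub].append(u)
    adj[L].append(O)
    adj[O] += [L, R]
    adj[R].append(O)
    return adj

def cover_time(adj, start, trials=200):
    total = 0
    for _ in range(trials):
        v, visited, steps = start, {start}, 0
        while len(visited) < len(adj):
            v = random.choice(adj[v])
            visited.add(v)
            steps += 1
        total += steps
    return total / trials

n = 20
print(cover_time(barbell(n), start=2 * n + 1), "vs n^2 =", n * n)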

Page 16: Many random walks are faster than one

Barbell speed-up: 2^k

• Start k=c log(n) walks on O (but let’s ignore ugly constants)

• Expect half to move to BL, half to BR: that’s log(n) in each

• Expect log(n) walks in BL and BR to stay there for n steps
  – Remember the hitting time for the clique is n

• Expect log(n) walks in BL and BR to cover them in n steps
  – Remember the k-walk cover time for the clique is n log(n)/k

• So expect log(n) walks to cover the barbell in n steps, not n²
  – Trust me: The proof must turn each “expect” into “with high probability”
  – Rejoice with me: That’s a speed-up of n = 2^log(n) = 2^k

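A sketch of the same experiment with k = O(log n) walks, reusing the barbell(n) builder from the previous sketch; the constant in k and the graph size are illustrative, and the measured speed-up should be well above k:

import math, random

def k_cover_time(adj, start, k, trials=200):
    # k walks all start at `start` and advance in lockstep; count the rounds
    # until every node has been seen by at least one of them.
    total = 0
    for _ in range(trials):
        walks, visited, steps = [start] * k, {start}, 0
        while len(visited) < len(adj):
            walks = [random.choice(adj[v]) for v in walks]
            visited.update(walks)
            steps += 1
        total += steps
    return total / trials

n = 20
g = barbell(n)                             # barbell builder from the sketch above
start_O = 2 * n + 1
k = 2 * max(1, round(math.log2(n)))        # k = O(log n) walks; the constant is arbitrary
c1, ck = k_cover_time(g, start_O, 1), k_cover_time(g, start_O, k)
print(k, "walks:", round(c1 / ck, 1), "times faster")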

Page 17: Many random walks are faster than one

The cycle

[Figure: cycle with nodes 0, 1, …, n; node i lies between i-1 and i+1]

Page 18: Many random walks are faster than one

Cycle cover time: n²

• Let Ei be expected time to reach 0 from i

– E0 = 0

– Ei = 1 + (Ei+1 + Ei-1)/2

– En = E1

• Solve these recurrence relations– Show Ei = (i-1)E1 - (i-1)i

• Notice Ei+1 – Ei = Ei – Ei-1 – 2

• Define Di+1 = Ei+1 – Ei and notice Di+1 = Di – 2 = E1 – 2i

– Show E1 ≈ n

• Notice E1 = En = (n-1)E1 - (n-1)n and solve for E1

• So Ei ≈ (i-1)n – (i-1)i = (i-1)(n-i)– Maximized at i = n/2 and maximum value is n2/4

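A sketch that checks the hitting-time formula Ei = i(n+1-i) ≈ i(n-i) by simulation on the cycle 0, 1, …, n (the cycle size is illustrative):

import random

def cycle_hit_time(n, i, trials=20_000):
    # Expected steps for a walk on the cycle 0, 1, ..., n to reach node 0, starting at node i.
    total = 0
    for _ in range(trials):
        v, steps = i, 0
        while v != 0:
            v = (v + random.choice((-1, 1))) % (n + 1)
            steps += 1
        total += steps
    return total / trials

n = 20
for i in (1, n // 2, n):
    print(i, round(cycle_hit_time(n, i), 1), "vs i(n+1-i) =", i * (n + 1 - i))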

Page 19: Many random walks are faster than one

Cycle speed-up: log(k)

• Theorem: If E(Ck) ≤ n²/s, then s = O(log k)

• Proof: We will show the following:

  1/2 ≤ P(one of the k walks hits node n/2 within 2n²/s steps) ≤ k·e^(-Ω(s))

  so e^(Ω(s)) ≤ 2k, and therefore s = O(log k)

[Figure: cycle with nodes 0, 1, …, n and the opposite node n/2 marked]

Page 20: Many random walks are faster than one

Claim: P(some walk w hits node n/2 within 2n²/s steps) ≥ 1/2

• E(Ck) ≤ n²/s by assumption

• P(Ck ≥ 2n²/s) ≤ 1/2 by Markov

• So all nodes are hit within 2n²/s steps with probability ≥ 1/2

• So some walk hits node n/2 within 2n²/s steps with probability ≥ 1/2

Page 21: Many random walks are faster than one

• Walk w must take n/2 more steps in one direction than the other to reach node n/2
  – Let Si = +1 or -1 indicate whether w moves left or right at step i
  – Let Dt = S1 + S2 + … + St be the difference, steps left - steps right
  – So hitting node n/2 within 2n²/s steps requires |Dt| ≥ n/2 for some t ≤ 2n²/s

• We can show using Chernoff that

  P(|Dt| ≥ n/2 for some t ≤ 2n²/s) ≤ e^(-Ω(s))

• So

  P(some walk w hits node n/2) ≤ k·P(a given walk w hits node n/2) ≤ k·e^(-Ω(s))
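A simulation sketch of the theorem’s content: k walks from the same node on a cycle give a speed-up that grows only roughly logarithmically in k, far below k (cycle size and k values are illustrative):

import random

def cycle_k_cover(n, k, trials=300):
    # k walks start at node 0 of the n-node cycle and move in lockstep.
    total = 0
    for _ in range(trials):
        walks, visited, steps = [0] * k, {0}, 0
        while len(visited) < n:
            walks = [(v + random.choice((-1, 1))) % n for v in walks]
            visited.update(walks)
            steps += 1
        total += steps
    return total / trials

n = 100
c1 = cycle_k_cover(n, 1)
for k in (2, 4, 16):
    print(k, "walks: measured speed-up", round(c1 / cycle_k_cover(n, k), 2))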

Page 22: Many random walks are faster than one

These speed-ups are all over the map! (linear, exponential, logarithmic)

What is the right answer?

When is linear speed-up possible?

A simple answer.

A general answer.

Page 23: Many random walks are faster than one

Matthews’ Theorem

• Theorem: For any graph G

  C1 ≤ H1 · log(n)

• This bound may or may not be tight
  – On a clique, the cover time is n log(n) and the hitting time is n
  – On a line, the cover time and the hitting time are both ≈ n²

Page 24: Many random walks are faster than one

Matthews’ Theorem for k walks

• Theorem: For any graph G and k ≤ log(n)

  Ck ≤ (e/k) · H1 · log(n) + noise

• Think of a random walk of length e·H1 as a trial
  – Starting from any node, either the walk hits v or it doesn’t

• Bound the probability that log(n) trials fail
  – A walk hits v within H1 expected time (hitting-time definition)
  – A walk of length e·H1 fails to hit v with probability < 1/e (Markov)
  – So log(n) walks of length e·H1 all fail with probability < (1/e)^log(n) = 1/n

• Obtain log(n) trials using k random walks
  – k walks of length (log(n)/k) · e·H1 amount to log(n) trials

• So the k-walk cover time is ≤ (e/k) · H1 · log(n) + noise

Page 25: Many random walks are faster than one

Simple speed-up

• Theorem: When Matthews’ bound is tight,
  we have linear speed-up for k ≤ log(n)

• Proof:
  – C1 ≈ H1 · log(n) when Matthews’ bound is tight
  – Ck ≤ (e/k) · H1 · log(n) by the previous result
  – So Ck ≤ (e/k) · C1

• Observations:
  – Matthews is tight for many important graphs: cliques, expanders, the torus, hypercubes, d-dimensional grids, d-regular balanced trees, certain random graphs, etc.
  – We can prove a speed-up even when Matthews is not tight …

Page 26: Many random walks are faster than one

General speed-up

• Speed-up in terms of the cover-time/hitting-time ratio:

  – Theorem: If R(n) = E(C1)/E(H1) and k ≤ R(n)^(1-ε), then

    E(Ck) ≤ (1/k) · E(C1) + noise

  – When Matthews is tight, R(n) = log(n)
    • Replaces the constant e with 1, but at the cost of a slightly smaller k

• Speed-up in terms of mixing time:

  – Theorem: If G is a d-regular graph with mixing time M, then

    E(Ck) ≤ (M · log(n)/k) · E(C1) + noise

Page 27: Many random walks are faster than one

Expanders

• Expanders are highly-connected, sparse graphs:
  – Every node has degree d
  – Every set S of at most half the nodes has at least α·|S| neighbors

• Expanders have many applications:
  – Robust communication networks
  – Error-correcting codes, random number generators, cryptography
  – Distributed memories, sorting networks, topology, physics, …

• Expanders yield impressive cover-time speed-ups:
  – We proved linear speed-up for many graphs for k ≤ log(n)
  – We can prove linear speed-up for expanders for k as large as n!

Page 28: Many random walks are faster than one

Conclusions

• Linear speed-ups are possible for many important graphs
  – Speed-ups are related to the ratio C1/H1 of cover time and hitting time
  – Linear speed-ups occur when this ratio is large
  – This result is tight …

• Open problems:
  – Is the speed-up always at most k? Always at least log k?
  – Is there a property characterizing the speed-up better than C1/H1?
  – What if the random walks start at different nodes, not the same node?
  – What if the random walks can communicate or leave “breadcrumbs”?
  – What if the prey can move, not just the hunters?
  – What if the graph is actually changing dynamically?