CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Fall 2011 Prof. Jennifer Welch CSCE 668 Self Stabilization 1


Reference


Shlomi Dolev, Self-Stabilization, MIT Press, 2000, Chapter 2.

Slides prepared for the book by Shlomi Dolev are available at http://www.cs.bgu.ac.il/~dolev/book/slides.html


Self-Stabilization


A powerful form of fault tolerance.

Starting from an arbitrary system configuration, the algorithm is able to start working properly all on its own.

An arbitrary system configuration is caused by some transient failure: message loss, corrupted memory, processor failure, loss of synchrony, …

As long as the system is well-behaved for sufficiently long, the algorithm can correct itself.

The paradigm has been applied to both shared memory and message passing models.


Definitions


An execution is no longer defined to start with an initial configuration; instead, it can start with an arbitrary configuration.

Depending on the problem to be solved, certain executions are considered legal, forming the set LE.

A configuration C is safe if every admissible execution starting with C is in LE.

An algorithm is self-stabilizing if every admissible execution reaches a safe configuration.


Self-Stabilization Definition


[Figure: an execution starts in an arbitrary configuration, eventually reaches a safe configuration, and is a legal execution from that point on.]


Communication Model


A "hybrid" of message passing and shared memory

The communication topology is represented as an undirected graph, not necessarily fully connected.

Processors correspond to vertices.

Corresponding to each edge (pi, pj) are two shared read/write registers:

Rij: written by pi and read by pj

Rji: written by pj and read by pi
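
The register layout above can be sketched as a small data structure. This is only an illustration in Python, not part of the slides: the names make_registers, incoming, and outgoing are hypothetical, and the edge list in the usage note is an assumed four-processor example.

```python
def make_registers(edges):
    """Create one register per direction of each undirected edge.

    The key (i, j) stands for R_ij: written by p_i and read by p_j.
    Registers start as None; a self-stabilizing algorithm must cope with
    whatever values they happen to hold.
    """
    registers = {}
    for i, j in edges:
        registers[(i, j)] = None  # R_ij
        registers[(j, i)] = None  # R_ji
    return registers


def outgoing(registers, i):
    """Keys of the registers that p_i writes."""
    return sorted(key for key in registers if key[0] == i)


def incoming(registers, i):
    """Keys of the registers that p_i reads."""
    return sorted(key for key in registers if key[1] == i)
```

For example, with edges [(0, 1), (1, 2), (2, 3), (1, 3)] there are eight registers, and p1 reads from R01, R21, and R31.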


Communication Model


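[Figure: example communication graph on processors p0, p1, p2, p3 with edges (p0, p1), (p1, p2), (p2, p3), and (p1, p3). Each edge carries two registers: R01 and R10, R12 and R21, R23 and R32, R13 and R31.]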


Self-Stabilizing Spanning Tree Definition


Every processor has a variable parent in its local state.

There is a distinguished root processor.

LE consists of all admissible executions in which the parent variables form a spanning tree rooted at root.


SS Spanning Tree Algorithm


Each processor has two local variables: parent, the id of the neighbor that is its parent, and dist, its estimated distance to the root.

The root sets dist to 0 and copies its state to all its "outgoing" registers.

A non-root reads its neighbors' states from its "incoming" registers, adopts as its parent the neighbor with the smallest distance, and sets its own distance to one more than that.

Nodes perform these actions repeatedly.


SS Spanning Tree Algorithm


Code for root p0:

while true do
    parent := dist := 0
    for each neighbor pj do
        R0j := 0    // write shared variable
    endfor
endwhile


SS Spanning Tree Algorithm


Code for non-root pi:

while true do
    for each neighbor pj do
        neigh-dist[j] := Rji    // read shared variable
    endfor
    dist := 1 + min{neigh-dist[j] : pj is a neighbor}
    foundParent := false
    for each neighbor pj do
        if !foundParent and neigh-dist[j] = dist - 1 then
            parent := j; foundParent := true
        endif
        Rij := dist    // write shared variable
    endfor
endwhile

Note: storage of negative values is not allowed.
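
As a rough illustration of the root and non-root rules, the following Python sketch runs them from a random starting configuration. It makes several assumptions that are not in the slides: each step is a whole loop iteration (coarser than single reads and writes), the schedule is round-robin, and the function name stabilize and its parameters are hypothetical. Initial values are kept nonnegative, as the algorithm requires.

```python
import random


def stabilize(adj, root, rounds, seed=0):
    """Run the spanning-tree rules from an arbitrary (random) configuration.

    adj maps each processor id to the list of its neighbors' ids.
    Returns the final dist and parent variables.
    """
    rng = random.Random(seed)
    # R[(i, j)] is the register written by p_i and read by p_j.
    R = {(i, j): rng.randrange(10) for i in adj for j in adj[i]}
    dist = {i: rng.randrange(10) for i in adj}
    parent = {i: rng.choice(adj[i]) for i in adj}
    for _ in range(rounds):
        for i in adj:  # round-robin: every processor steps once per round
            if i == root:
                dist[i] = 0
                parent[i] = root  # assumption: the root is its own parent
            else:
                neigh_dist = {j: R[(j, i)] for j in adj[i]}  # read registers
                dist[i] = 1 + min(neigh_dist.values())
                # adopt the first neighbor whose value is dist - 1
                parent[i] = next(j for j in adj[i]
                                 if neigh_dist[j] == dist[i] - 1)
            for j in adj[i]:
                R[(i, j)] = dist[i]  # write registers
    return dist, parent
```

On a small graph such as {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]} with root 0, running for twice the number of processors is more than enough rounds for the dist values to equal the true distances, whatever the seed.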


Output of Spanning Tree Algorithm


[Figure: example output on an eight-node graph. Numbers are distances (the root has 0; the other nodes have 1, 1, 1, 2, 2, 2, 3). Red arrows indicate parents. Black edges are non-tree edges.]


Correctness Proof of SS ST Alg


Definition: Executions are partitioned into asynchronous rounds, which are the shortest segments containing at least one step by each processor.

Definition: Δ is the degree (maximum number of neighbors) of the communication graph.

Definition: D is the diameter of the communication graph.
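
The definition of asynchronous rounds can be made concrete with a short sketch. The function name split_into_rounds and the representation of a schedule as a list of processor ids are assumptions made for illustration only.

```python
def split_into_rounds(schedule, processors):
    """Partition a schedule into asynchronous rounds.

    A round is the shortest segment in which every processor takes at
    least one step. Returns (complete_rounds, leftover), where leftover
    is the trailing segment that is not yet a full round.
    """
    everyone = set(processors)
    rounds, current, seen = [], [], set()
    for p in schedule:
        current.append(p)
        seen.add(p)
        if seen == everyone:  # shortest segment covering all processors
            rounds.append(current)
            current, seen = [], set()
    return rounds, current
```

For the schedule 0, 0, 1, 2, 1, 2, 2, 0, 1 on processors {0, 1, 2}, the first round is 0, 0, 1, 2, the second is 1, 2, 2, 0, and the final step by processor 1 starts a third round.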


Correctness Proof of SS ST Alg


Lemma: Consider any admissible execution. There exist T1 < T2 < … < TD such that after asynchronous round Tk:

(a) every proc. at distance ≤ k from root has dist = shortest path distance to root, and the parent variables form a BFS tree;

(b) every proc. at distance > k from root has dist ≥ k.


Correctness Proof of SS ST Alg


Proof: By induction on k.

Basis (k = 1): Let T1 = 5Δ, where Δ is the degree of the communication graph.

Initially all distances are nonnegative.

Procs might start with the program counter in the middle of an iteration of the outer while loop; after at most 2Δ rounds, partial iterations are done.

After the next Δ rounds, all non-root procs have completed the read for-loop at least once and computed dist: all are > 0.

After the next Δ rounds, all non-root procs have completed the write for-loop at least once.

After the next Δ rounds, all non-root procs have completed the read for-loop at least once and computed dist: every neighbor of root reads 0 from root and > 0 from every other node, so it sets dist to 1 and parent to root.


Correctness Proof of SS ST Alg


Induction (k > 1): Assume the lemma for k - 1 and show it for k. Let Tk = T_{k-1} + 2Δ (Δ is the degree of the communication graph).

Consider the execution just after the end of asynchronous round T_{k-1}.

After the next Δ rounds, all non-root nodes have executed the write for-loop at least once (and written their dist values).

After the next Δ rounds, all non-root nodes have executed the read for-loop at least once.

Suppose pi is at distance d ≤ k from root.

pi has at least one neighbor pj at distance d-1 ≤ k-1 from root, and no neighbor that is closer to the root.

By the inductive hypothesis, pj's register has the correct value in it and all other neighbors of pi have registers with values ≥ d-1.

Thus pi correctly computes dist and parent.

Suppose pi is at distance > k from root.

Every neighbor of pi is at distance ≥ k from root.

By the inductive hypothesis, all their registers have values ≥ k-1.

Thus pi computes dist to be ≥ k.


Correctness Proof of SS ST Alg


Since every processor is at most distance D from the root, the previous lemma implies that a correct breadth-first spanning tree has been constructed after O(ΔD) asynchronous rounds (Δ is the degree, D the diameter), no matter what the starting configuration.


Another Classic SS Algorithm


Proposed by Dijkstra.

Suggested for mutual exclusion; we will view it as a "token circulation" algorithm.

Uses a stronger model of computation: in one atomic step, a proc can read all its "incoming" registers and write all its "outgoing" registers.


Ring Communication Topology


Procs are arranged in a unidirectional ring.

Only need one register for each proc.

[Figure: unidirectional ring p0 → p1 → p2 → p3 → p0 with registers R0, R1, R2, R3. p0 writes into R0, p1 reads from R0, etc.]


Processor's States


Each processor's state consists solely of an integer, ranging from 0 to K - 1 (for a suitable value of K).

Actually, the processor just stores this information in its register.


Definition of Holding the Token


Proc p0 holds the token if R0 = R_{n-1}.

Proc pi (other than p0) holds the token if Ri ≠ R_{i-1}.
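
The two token conditions translate directly into a predicate on a configuration. In this sketch a configuration is assumed to be a Python list R with R[i] holding register Ri; the name token_holders is hypothetical.

```python
def token_holders(R):
    """Return the indices of all processors that hold the token in R."""
    n = len(R)
    holders = []
    if R[0] == R[n - 1]:  # p0 holds the token when R0 = R_{n-1}
        holders.append(0)
    # p_i (i != 0) holds the token when R_i differs from R_{i-1}
    holders.extend(i for i in range(1, n) if R[i] != R[i - 1])
    return holders
```

For instance, when all registers are equal only p0 holds the token, while the configuration 0, 1, 0, 1 has three token holders (p1, p2, p3).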


Self-Stabilizing Token Circulation Definition


LE consists of all admissible executions in which

in every configuration, only one processor holds the token, and

every processor holds the token infinitely often.

(Note resemblance to mutual exclusion problem.)


Dijkstra's Algorithm


code for p0:

while true do
    if R0 = R_{n-1} then
        R0 := (R0 + 1) mod K
    endif
endwhile

code for pi, i ≠ 0:

while true do
    if Ri ≠ R_{i-1} then
        Ri := R_{i-1}
    endif
endwhile

Each if-statement executes atomically.
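
A minimal Python sketch of these two rules follows. It assumes a configuration is a list R of register values and uses a round-robin schedule p0, p1, …, p_{n-1}, which is only one of many admissible schedules; the names step and run_rounds are hypothetical.

```python
def step(R, i, K):
    """Processor p_i takes one atomic step; R is updated in place."""
    n = len(R)
    if i == 0:
        if R[0] == R[n - 1]:       # p0 holds the token
            R[0] = (R[0] + 1) % K
    elif R[i] != R[i - 1]:         # p_i holds the token
        R[i] = R[i - 1]


def run_rounds(R, K, rounds):
    """Run the given number of round-robin rounds and return R."""
    for _ in range(rounds):
        for i in range(len(R)):
            step(R, i, K)
    return R
```

With K = 5, one round from 2, 0, 4, 1 yields 2, 2, 2, 2, because this particular schedule lets the value of R0 sweep the ring in a single pass; other schedules can take much longer.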


Analysis of Dijkstra's Algorithm


Lemma: If all registers are equal in a configuration, then the configuration is safe.

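Proof (by picture): Suppose K = 5.

[Figure: ring of p0, p1, p2, p3 in which every register holds 3, so only p0 holds the token. p0 changes R0 to 4, and the 4 is then copied around the ring by p1, p2, and p3 in turn. Next p0 changes R0 to 0, the 0 is copied around the ring, p0 changes R0 to 1, and so on.]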


Analysis of Dijkstra's Algorithm


If execution begins with arbitrary values between 0 and K-1 in the registers, how can we show that eventually all the values will be the same (i.e., reach a safe state)?

Depends on K being large enough.

Suppose K = n+1 (so there are n+1 different values).

Lemma 1: In every configuration, there is at least one integer in {0,…,K-1} that does not appear in any register.
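
Lemma 1 is a counting argument: n registers cannot hold n+1 distinct values. The sketch below checks it exhaustively for a small ring; the function names are hypothetical and configurations are assumed to be sequences of register values.

```python
from itertools import product


def missing_values(R, K):
    """Values in {0, ..., K-1} that appear in no register of R."""
    return sorted(set(range(K)) - set(R))


def lemma1_holds(n):
    """Check Lemma 1 for every configuration of n registers, K = n + 1."""
    K = n + 1
    return all(missing_values(c, K) for c in product(range(K), repeat=n))
```

With K = n the same check can fail: the configuration 0, 1, 2 on three registers with K = 3 has no missing value.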


Analysis of Dijkstra's Algorithm


Lemma 2: In every admissible execution (starting from any configuration), p0 holds the token, and thus changes R0, at least once during every n rounds.

Proof: Suppose in contradiction there is a segment of n rounds in which p0 does not change R0.

Once p1 takes a step in the first round, R1 = R0, and this equality remains true.

Once p2 takes a step in the second round, R2 = R1 = R0, and this equality remains true.

…

Once p_{n-1} takes a step in the (n-1)-st round, R_{n-1} = R_{n-2} = … = R0.

So when p0 takes a step in the n-th round, it will change R0.


Analysis of Dijkstra's Algorithm


Theorem: In any admissible execution starting at any configuration C, a safe configuration is reached within O(n^2) rounds.

Proof: Let j be a value not in any register in C (one exists by Lemma 1).

By Lemma 2, p0 changes R0 (by incrementing it) at least once every n rounds.

Thus eventually R0 holds j, in configuration D, after at most O(n^2) rounds.

Since other procs only copy values, no register holds j between C and D.

After at most n more rounds, the value j propagates around the ring to p_{n-1}, so all registers hold j, which is a safe configuration.
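
The bound can be checked by brute force on a small ring. The sketch below is an illustration under stated assumptions, not a proof: it uses K = n + 1, one fixed schedule per round (p_{n-1} down to p0, chosen because it propagates values slowly), and it stops when exactly one processor holds the token, a condition that every all-equal configuration satisfies and that is preserved by every step. All function names are hypothetical.

```python
from itertools import product


def token_count(R):
    """Number of processors holding the token in configuration R."""
    n = len(R)
    return (R[0] == R[n - 1]) + sum(R[i] != R[i - 1] for i in range(1, n))


def rounds_until_one_token(start, K):
    """Rounds of the schedule p_{n-1}, ..., p0 until one token remains."""
    R, n, rounds = list(start), len(start), 0
    while token_count(R) != 1:
        for i in reversed(range(n)):  # each processor steps once per round
            if i == 0:
                if R[0] == R[n - 1]:
                    R[0] = (R[0] + 1) % K
            elif R[i] != R[i - 1]:
                R[i] = R[i - 1]
        rounds += 1
    return rounds


def worst_case_rounds(n):
    """Maximum over all starting configurations, with K = n + 1."""
    K = n + 1
    return max(rounds_until_one_token(c, K)
               for c in product(range(K), repeat=n))
```

Following the proof, at most n increments of R0 (each within n rounds) plus n more rounds are needed, so worst_case_rounds(n) should never exceed n^2 + n.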


What about Reducing K?


Easy to see that K = n (n different values) suffices: either there is a missing value or p0's value is unique.

Can also show that K = n - 1 (n-1 different values) suffices.

But if K < n - 1 (fewer than n-1 different values), then there is a counter-example.

If the strong atomicity model is weakened to our familiar read/write atomicity, then K > 2n - 2 suffices.
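
The claims about small K under the strong atomicity model can be explored by brute force on small rings. The sketch below builds the graph of configurations that do not have exactly one token holder and looks for a cycle of moves inside it. It is an illustration with assumptions: it ignores fairness (any cycle counts, which is stricter than the slides' admissible executions), and the names holders, move, and has_bad_cycle are hypothetical.

```python
from itertools import product


def holders(R):
    """Indices of processors holding the token in configuration R."""
    n = len(R)
    out = [0] if R[0] == R[n - 1] else []
    return out + [i for i in range(1, n) if R[i] != R[i - 1]]


def move(R, i, K):
    """Configuration after token holder p_i takes one atomic step."""
    new = list(R)
    new[i] = (R[0] + 1) % K if i == 0 else R[i - 1]
    return tuple(new)


def has_bad_cycle(n, K):
    """True if some cycle of moves never reaches a one-token configuration."""
    bad = {c for c in product(range(K), repeat=n) if len(holders(c)) != 1}
    succ = {c: [d for d in (move(c, i, K) for i in holders(c)) if d in bad]
            for c in bad}
    indeg = {c: 0 for c in bad}
    for c in bad:
        for d in succ[c]:
            indeg[d] += 1
    # Kahn's algorithm: peel off nodes with no incoming edge; whatever is
    # left over lies on, or is reachable only through, a cycle.
    frontier = [c for c in bad if indeg[c] == 0]
    removed = 0
    while frontier:
        c = frontier.pop()
        removed += 1
        for d in succ[c]:
            indeg[d] -= 1
            if indeg[d] == 0:
                frontier.append(d)
    return removed < len(bad)
```

For n = 4, has_bad_cycle(4, 2) finds a counter-example (for instance the cycle through 0, 1, 0, 1, which always has three tokens), while has_bad_cycle(4, 3) finds none, matching the K = n - 1 threshold.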