Selected topics in distributed computing Shmuel Zaks [email protected]



l'Aquila, Distributed Computing, 12-14/1/09 2

Part 1: Lower bounds and impossibility results

Part 2: On two fundamental notions: snapshot in a network, cost of synchronization

Part 3: Self stabilization

A new approach to fault tolerance

A unified approach to transient failures by formally incorporating them into the design model

Part 3.1: Clock synchronization (of Gouda and Herman)

Model: shared memory, synchronous

[Figure: clock synchronization; the clocks of a five-node network advance together from 6 to 7 to 8]

Program at each processor p:

choose a neighbor x of p;
if t(p) = t(x) then t(p) := t(p)+1;

Stabilizing if all start with the same time, however …

[Figure: five-node network with clocks 6, 7, 6, 4, 7]

if t(p) = t(x) then t(p) := t(p)+1;

Is it stabilizing? no! (may even deadlock)

Program at each processor p:

choose a neighbor x of p;
if t(p) = t(x) then t(p) := t(p)+1;
if t(p) < t(x) then t(p) := t(x);

Is it stabilizing?

[Figure: an execution from clocks 6, 7, 6, 4, 7 in which all clocks reach 7 and then advance to 8]

if t(p) = t(x) then t(p) := t(p)+1;
if t(p) < t(x) then t(p) := t(x);

Is it stabilizing? yes!

[Figure: another execution from clocks 6, 7, 6, 4, 7, in which the clocks pass through the values 7 and 8 and then 8 and 9 without all agreeing]

if t(p) = t(x) then t(p) := t(p)+1;
if t(p) < t(x) then t(p) := t(x);

Is it stabilizing? no!

Program at each processor p:

choose a neighbor x of p fairly;
if t(p) = t(x) then t(p) := t(p)+1;
if t(p) < t(x) then t(p) := t(x);

Is it stabilizing? yes!
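The two-rule program can be tried out with a small simulation. The sketch below is only illustrative: the five-node ring topology, the function names `unison_step` and `is_deadlocked_increment_only`, and the way the neighbor choice is passed in as a list are all assumptions, not part of the slides. It models one synchronous round in which every processor reads the old clock of its chosen neighbor, and it checks the deadlock of the increment-only rule on the clocks 6, 7, 6, 4, 7.

```python
from typing import Dict, List


def unison_step(t: List[int], choice: List[int]) -> List[int]:
    """One synchronous round: processor p reads the old clock of neighbor choice[p]."""
    new_t = []
    for p, x in enumerate(choice):
        if t[p] == t[x]:
            new_t.append(t[p] + 1)   # rule 1: equal clocks, advance by one
        elif t[p] < t[x]:
            new_t.append(t[x])       # rule 2: behind the neighbor, catch up
        else:
            new_t.append(t[p])       # ahead of the neighbor, no rule applies
    return new_t


def is_deadlocked_increment_only(t: List[int], neighbors: Dict[int, List[int]]) -> bool:
    """True if the single rule 't(p) = t(x) -> t(p)+1' is enabled nowhere."""
    return all(t[p] != t[x] for p in neighbors for x in neighbors[p])


# Hypothetical topology: a ring of five processors (the slides do not give the graph).
ring = {p: [(p - 1) % 5, (p + 1) % 5] for p in range(5)}
clocks = [6, 7, 6, 4, 7]
next_clocks = unison_step(clocks, [1, 2, 3, 4, 0])  # everyone looks at its right neighbor
```

Varying the `choice` list from round to round (for example round robin over each node's neighbors) is one way to model the fair choice the program asks for.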

Proof:

f(s) = n × range(s) + top(s)

range(s) = max clock - min clock

top(s) = # max clocks

S: clocks 6, 7, 6, 4, 7
n = 5, range(S) = 7-4 = 3, top(S) = 2
f(S) = 5 × 3 + 2 = 17

S → S1: clocks 7, 7, 7, 7, 7
n = 5, range(S1) = 7-7 = 0, top(S1) = 5
f(S1) = 5 × 0 + 5 = 5 < 17 = f(S)

S → S2: clocks 7, 7, 7, 8, 8
n = 5, range(S2) = 8-7 = 1, top(S2) = 2
f(S2) = 5 × 1 + 2 = 7 < 17 = f(S)

S2 → S3: clocks 8, 8, 8, 9, 9
n = 5, range(S3) = 9-8 = 1, top(S3) = 2
f(S3) = 5 × 1 + 2 = 7 = f(S2)
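The worked values of the potential function can be recomputed mechanically. This is a minimal sketch; the function name `clock_potential` is made up for illustration, and the four clock vectors are the configurations S, S1, S2, S3 from the slides.

```python
from typing import List


def clock_potential(clocks: List[int]) -> int:
    """f(s) = n * range(s) + top(s) for a vector of clock values."""
    n = len(clocks)
    clock_range = max(clocks) - min(clocks)   # range(s) = max clock - min clock
    top = clocks.count(max(clocks))           # top(s) = number of clocks at the maximum
    return n * clock_range + top


S = [6, 7, 6, 4, 7]
S1 = [7, 7, 7, 7, 7]
S2 = [7, 7, 7, 8, 8]
S3 = [8, 8, 8, 9, 9]
values = [clock_potential(s) for s in (S, S1, S2, S3)]  # [17, 5, 7, 7]
```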

S: clocks 6, 7, 6, 4, 7

1. f(s) = n × range(s) + top(s) ≥ n

2. s → s' ⇒ f(s') ≤ f(s)

3. Eventually s → ... → s' s.t. f(s') < f(s)

hw: prove.

Use for: correctness (especially termination) + complexity

Part 3.2: Mutual exclusion (Dijkstra’s 1st algorithm)

• distributed control
• ring of (finite-state machine) processes
• for each machine define privileges (when a transition is enabled)
• nodes are influenced only by neighbors
• legitimate states
• in each legitimate state there is at least one privileged machine
• in each legitimate state each possible move will bring the system again to a legitimate state

A system is self-stabilizing if, regardless of the initial state and regardless of the privilege selected each time for the next move, at least one privilege will always be present and the system is guaranteed to find itself in a legitimate state after a finite number of moves.

[Figure: non-stable states lead to stable states within a finite number of transitions]

In Dijkstra’s model:

• Token ring.
• N machines, denoted 0 through N-1.
• A machine is privileged (“has token”) or not privileged.
• A daemon/scheduler determines which of the privileged machines will make the next move.
• Legitimate states: all states with exactly one privileged machine.

Notes to consider:

• scheduler (daemon): points at each time to one of the privileged machines, or points at each time to one of the machines.
• centralized scheduler, distributed scheduler
• fairness: Do we need fairness, and of what kind?
• Unconditional fairness: each machine gets pointed to infinitely often.
• Type of fairness: each machine is pointed at infinitely often, round robin, or other.

Dijkstra’s 1st algorithm

N processors 0,1,…,N-1. Solution with k-state machines.

machine P0: if x[0] = x[N-1] then x[0] := (x[0]+1) mod k

machine Pi (i = 1,2,…,N-1): if x[i] ≠ x[i-1] then x[i] := x[i-1]
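The two rules translate almost directly into code. The sketch below is an illustration under assumptions: the helper names `privileged`, `move` and `run` are invented here, and the schedule is supplied by hand to play the role of a centralized daemon. The starting configuration 0 2 4 1 0 with N=5, K=5 is one that appears later in the slides.

```python
from typing import List


def privileged(x: List[int]) -> List[int]:
    """Indices of the machines whose rule is enabled in configuration x."""
    n = len(x)
    priv = [0] if x[0] == x[n - 1] else []
    priv += [i for i in range(1, n) if x[i] != x[i - 1]]
    return priv


def move(x: List[int], i: int, k: int) -> List[int]:
    """Let privileged machine i make its move; returns the new configuration."""
    assert i in privileged(x), "the daemon may only pick a privileged machine"
    y = list(x)
    if i == 0:
        y[0] = (x[0] + 1) % k      # machine P0 increments mod k
    else:
        y[i] = x[i - 1]            # machine Pi copies its left neighbor
    return y


def run(x: List[int], schedule: List[int], k: int) -> List[int]:
    """Apply a hand-written central-daemon schedule, one machine per step."""
    for i in schedule:
        x = move(x, i, k)
    return x


start = [0, 2, 4, 1, 0]                      # N=5, K=5
end = run(start, [0, 4, 0, 2], k=5)          # machines chosen by the daemon, in order
is_legitimate = len(privileged(end)) == 1    # exactly one privileged machine
```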

[Figure: ring of four machines p0, p1, p2, p3 with states from {0, 1, 2}; N=4, K=3. The figure is repeated alongside the rule for P0 and the rule for Pi.]

Example 1: [Figure: execution on the four-machine ring, K=3] Etc…

Example 2: [Figure: execution on the four-machine ring, K=2] Etc…

Example 3: [Figure: a longer execution on the four-machine ring, K=2] Etc…

Machine number   Privileged        Non-privileged
0                X[0] = X[N-1]     X[0] ≠ X[N-1]
1,…,N-1          X[i] ≠ X[i-1]     X[i] = X[i-1]

Note: informally, “If machine is privileged, then make it non-privileged”.

machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1]

machine P0: if x[0] = x[N-1] then x[0] := (x[0]+1) mod k

Legitimate States: exactly one privileged machine

[Figure: configurations on the four-machine ring, K=2]

Legitimate States

0 0 0 0 0
1 0 0 0 0
1 1 0 0 0
…
1 1 1 1 1
2 1 1 1 1
…
2 2 2 2 2
…
K–1 K–1 K–1 K–1 K–1
0 K–1 K–1 K–1 K–1
0 0 K–1 K–1 K–1
…
0 0 0 0 0

Goal: Show that from an arbitrary state, the system stabilizes to a legitimate state in a finite number of steps.

Intuition:

N=5, K=5
machine: 0 1 2 3 4
         0 2 4 1 0
         1 2 4 1 0
         1 2 4 1 1
         2 2 4 1 1
         2 2 2 1 1

N=5, K=4
machine: 0 1 2 3 4
         0 3 2 1 0
         1 3 2 1 0
         1 1 2 1 0
         1 1 2 2 0
         1 1 2 2 2

N=5, K=3
machine: 0 1 2 3 4
         0 1 2 1 0
         0 0 2 1 0
         1 0 2 1 0
         1 0 2 1 1
         1 0 2 2 1
         1 0 0 2 1
         1 1 0 2 1
         2 1 0 2 1
         2 1 0 2 2
         2 1 0 0 2
         2 1 1 0 2
         2 2 1 0 2
         0 2 1 0 2
         0 2 1 0 0
         0 2 1 1 0
         0 2 2 1 0
         0 0 2 1 0

hw: can you get to this configuration?

hw: which configurations have the same property? (Check also for all other examples.)


Theorem:

Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2 .
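For small rings the statement can be checked exhaustively. The sketch below is not a proof and not taken from the slides: the function names `successors` and `stabilizes` are invented, and it relies on the assumption that a configuration is legitimate exactly when one machine is privileged. It marks a configuration as "good" once every possible central-daemon move from it leads to a good configuration, starting from the legitimate ones; the system stabilizes for the given N and K if every configuration ends up good.

```python
from itertools import product
from typing import List, Tuple

Config = Tuple[int, ...]


def successors(x: Config, k: int) -> List[Config]:
    """All configurations reachable by one move of one privileged machine."""
    n = len(x)
    out = []
    if x[0] == x[n - 1]:                           # P0 is privileged
        out.append(((x[0] + 1) % k,) + x[1:])
    for i in range(1, n):
        if x[i] != x[i - 1]:                       # Pi is privileged
            out.append(x[:i] + (x[i - 1],) + x[i + 1:])
    return out


def stabilizes(n: int, k: int) -> bool:
    """True if every execution from every configuration reaches a legitimate one."""
    configs = list(product(range(k), repeat=n))
    succ = {c: successors(c, k) for c in configs}
    good = {c for c in configs if len(succ[c]) == 1}   # legitimate configurations
    changed = True
    while changed:
        changed = False
        for c in configs:
            if c not in good and succ[c] and all(s in good for s in succ[c]):
                good.add(c)
                changed = True
    return len(good) == len(configs)


results = {(n, k): stabilizes(n, k) for (n, k) in [(4, 2), (4, 3), (4, 4), (5, 3)]}
```

The state space has K^N configurations, so this brute-force check is only practical for very small N and K.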

Lemma 1: If all states are equal then the system is stabilized. (Hw)

Lemma 2: (liveness) At every configuration, at least one machine is privileged.

Lemma 3: Starting from any configuration, P0 will eventually make a move.

Corollary: Starting at any configuration, P0 will make an infinite number of steps. (This still does not guarantee stabilization.)

machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1]

machine P0: if x[0] = x[N-1] then x[0] := (x[0]+1) mod k

Definition: Pi is called special in a configuration if ∀ j ≠ i: x[j] ≠ x[i].

Lemma 4: If P0 is special in a configuration, then the system will eventually stabilize.

Lemma 5: If in a configuration one of the states is missing, then the system will eventually stabilize.

machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1]

machine P0: if x[0] = x[N-1] then x[0] := (x[0]+1) mod k

Proof of the Theorem:

Case 1: K < N-1
Case 2: K = N-1
Case 3: K = N
Case 4: K > N

Case 1. K < N-1: there is an execution that never stabilizes (e.g., N=4, K=2).

Case 4. K > N: there are more states than machines, so one state is missing, … Lemma 5.

Case 3. K = N: either one state is missing (… Lemma 5), or all states are distinct (… Lemma 4).

Case 2. K = N-1: either one state is missing (… Lemma 5), or one state occurs twice.

So, one state occurs twice. Either P0 is special (… Lemma 4), or x[0] = x[i] for some i ≠ 0:

  i < N-1: P0 is not enabled, and after any other move there is a missing state, and … Lemma 5.

  i = N-1: If Pi, i ≠ 0, makes a move, there is a missing state, and done by Lemma 5. If P0 makes a move, we get back to the previous case.

Theorem:

Assuming a distributed scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-1.

hw: prove

Use of a potential function

S = a configuration (a system state)

r(S) = number of maximal runs of equal states
Example: S = 2222011000, r(S) = 4

f(S) = r(S) + 1 if first(S) = last(S)
f(S) = r(S) otherwise.

Note: S is legitimate iff f(S) = 2.
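A short computation makes the definition concrete. The sketch assumes the reading r(S) = number of maximal runs of equal states, with f(S) = r(S) + 1 when the first and last states are equal and f(S) = r(S) otherwise; the names `runs` and `ring_potential` are hypothetical.

```python
from typing import Sequence


def runs(S: Sequence[int]) -> int:
    """r(S): number of maximal runs of equal adjacent states."""
    return 1 + sum(1 for a, b in zip(S, S[1:]) if a != b)


def ring_potential(S: Sequence[int]) -> int:
    """f(S) = r(S) + 1 if first(S) = last(S), else r(S)."""
    r = runs(S)
    return r + 1 if S[0] == S[-1] else r


S = [2, 2, 2, 2, 0, 1, 1, 0, 0, 0]       # the example 2222011000
r_S = runs(S)                            # runs: 2222 | 0 | 11 | 000
f_S = ring_potential(S)
```

Under this reading, f(S) - 1 counts the privileged machines: each boundary between runs privileges one Pi with i ≥ 1, and first(S) = last(S) privileges P0. That is why f(S) = 2 corresponds to exactly one privilege.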

Show: If K > N and f(S) > 2, then f(S) never increases and is eventually decreased.

Recall: A common technique to prove termination and correctness

Hw:
1. finish the proof.
2. use f to analyze complexity

Some general comments on self stabilization

Applications

• Distributed data structures
• Digital and analog circuits
• Genetic algorithms
• Network protocols
• Sensor networks

Self stabilization

• is a special kind of fault tolerance
• guarantees an automatic recovery from a transient failure
• as a design goal is a too strong property and thus either too difficult to achieve or achieved at the expense of other goals

Back to Dijkstra’s algorithm

Uniform protocol: same program at each processor

Theorem: There is no uniform protocol that solves the mutual exclusion problem (under either a centralized or a distributed scheduler).

machine Pi: if x[i] ≠ x[i-1] then x[i] := x[i-1]

machine P0: if x[0] = x[N-1] then x[0] := (x[0]+1) mod k

[Figure: ring of machines p0, p1, p2, p3, …, p5, p6]

Part 3.3: Mutual exclusion (Dijkstra’s 3rd algorithm)

N processors 0,1,…,N-1. Solution with 3-state machines (all arithmetic is mod 3).

P0: if x[0]+1 = x[1] then x[0] := x[0]-1;

PN-1: if (x[N-2] = x[0]) ∧ (x[N-1] ≠ x[0]+1) then x[N-1] := x[0]+1;

Pi (i = 1,2,…,N-2): if (x[i]+1 = x[i-1]) ∨ (x[i]+1 = x[i+1]) then x[i] := x[i]+1;
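The three rules of the 3-state solution can be exercised with a few lines of code. This is a sketch under assumptions: all arithmetic is taken mod 3, the guards are read as "and" for PN-1 and "or" for Pi, and the names `privileged3` and `move3` are invented here. Starting from a ring in which all five machines hold state 1, it follows the single privilege as it travels from PN-1 down to P0 and back.

```python
from typing import List


def privileged3(x: List[int]) -> List[int]:
    """Indices of the machines whose guard holds (arithmetic mod 3)."""
    n = len(x)
    priv = []
    if (x[0] + 1) % 3 == x[1]:                                    # bottom machine P0
        priv.append(0)
    for i in range(1, n - 1):                                     # middle machines Pi
        if (x[i] + 1) % 3 == x[i - 1] or (x[i] + 1) % 3 == x[i + 1]:
            priv.append(i)
    if x[n - 2] == x[0] and x[n - 1] != (x[0] + 1) % 3:           # top machine PN-1
        priv.append(n - 1)
    return priv


def move3(x: List[int], i: int) -> List[int]:
    """Let privileged machine i make its move; returns the new configuration."""
    assert i in privileged3(x), "only a privileged machine may move"
    n = len(x)
    y = list(x)
    if i == 0:
        y[0] = (x[0] - 1) % 3
    elif i == n - 1:
        y[i] = (x[0] + 1) % 3
    else:
        y[i] = (x[i] + 1) % 3
    return y


# Follow the unique privilege for a few steps, starting with all states equal.
x = [1, 1, 1, 1, 1]
trace = [list(x)]
for _ in range(6):
    x = move3(x, privileged3(x)[0])
    trace.append(list(x))
```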

[Figure: ring of five machines p0, p1, p2, p3, p4 with states from {0, 1, 2}. The figure is repeated alongside the rule for P0, the rule for PN-1, and the rule for Pi.]

If all are in the same state?

[Figure: ring of five machines p0, …, p4, all in state 1, and the moves that follow]

PN-1: if (x[N-2] = x[0]) ∧ (x[N-1] ≠ x[0]+1) then x[N-1] := x[0]+1;

Pi: if (x[i]+1 = x[i-1]) ∨ (x[i]+1 = x[i+1]) then x[i] := x[i]+1;

P0: if x[0]+1 = x[1] then x[0] := x[0]-1

x[0] x[1] x[2] ... x[n-1]

[Figure: the three states 0, 1, 2 arranged in a cycle]

For two neighbors:

x[i] = x[i+1]: no arrow
x[i]+1 = x[i+1]: x[i] ← x[i+1]
x[i] = x[i+1]+1: x[i] → x[i+1]

n = 8

x[0] x[1] x[2] ... x[n-1]

s: 2 → 1 ← 2 ← 0 ← 1   1 ← 2   2

f(s) = number of ← + 2 × number of →

f(s) = 4 + 2 × 1 = 6
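The arrow count for the example can be reproduced in code. The sketch assumes the following convention, which matches the example value f(s) = 6: a pair with x[i]+1 = x[i+1] (mod 3) carries a left arrow of weight 1, a pair with x[i] = x[i+1]+1 (mod 3) carries a right arrow of weight 2, and equal neighbors carry no arrow. The names `arrows` and `arrow_potential` and the use of "<" and ">" as arrow symbols are illustrative choices.

```python
from typing import List


def arrows(x: List[int]) -> List[str]:
    """Arrow between each pair of neighbors: '<' left, '>' right, '' none."""
    out = []
    for a, b in zip(x, x[1:]):
        if (a + 1) % 3 == b:
            out.append("<")        # x[i]+1 = x[i+1]: arrow points left
        elif a == (b + 1) % 3:
            out.append(">")        # x[i] = x[i+1]+1: arrow points right
        else:
            out.append("")         # equal neighbors: no arrow
    return out


def arrow_potential(x: List[int]) -> int:
    """f(s) = number of left arrows + 2 * number of right arrows."""
    a = arrows(x)
    return a.count("<") + 2 * a.count(">")


s = [2, 1, 2, 0, 1, 1, 2, 2]
s_arrows = arrows(s)
f_s = arrow_potential(s)
```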

Lemma 1: If there is a unique arrow in a configuration, then it moves back and forth infinitely.

Corollary: It suffices to show that from any configuration we eventually reach a configuration with a unique arrow.

Lemma 2: (liveness) Starting from any initial configuration, at least one process is privileged.

Lemma 3: Starting from any configuration, eventually f ≥ 1.

Proof: if f ≥ 1 during an execution, the claim holds;
else f = 0, but then there is no arrow, and in the next step there is a unique arrow, and then apply Lemma 1.

Lemma 4: Between every two successive moves of PN-1 there is at least one move of P0.

Proof: recall the program for PN-1:

if (x[N-2] = x[0]) ∧ (x[N-1] ≠ x[0]+1) then x[N-1] := x[0]+1;

After PN-1 makes a step, x[N-1] = x[0]+1, and this condition makes PN-1 non-privileged. The guard x[N-1] ≠ x[0]+1 can become true again only after P0 makes a step [P0: if x[0]+1 = x[1] then x[0] := x[0]-1].

Lemma 5: A sequence of moves in which P0 does not move is finite.

(therefore: impossible by Lemma 2)

Proof: by Lemma 4, in such a sequence PN-1 makes at most one step. So, from some point on only Pi, 1 ≤ i ≤ n-2, make steps: steps 1,2,3,4,5.

1,2: no change in arrows
3,4,5: decrease no. of arrows

Proof (cont.): from some point only steps 1 and 2. Follows from topology.


Theorem: Starting from any initial configuration, the system eventually stabilizes.

Proof: by Lemma 5 P0 makes infinite no. of moves.

P0 … P0 … P0 … P0 … P0 … …

Proof (cont.)

P0 … P0 … P0 … P0 … P0 … …

phases

“there exists a rightmost arrow, and it points to the right”

Falsified in each phase.

P0 … P0 … P0 … P0 … P0 … …
F T F T F T F T F T

This is done only in steps 3, 4, 6.
6: ok. So, assume only 3 and 4.
In each phase: they have Δf = -3; 0 and 7 have Δf ≤ 2; others have Δf = 0.
So, in each phase f is decreased by at least 1.

Complexity?

(following Chernoy, Shalom and Zaks)

hw: (1 + 5/6) n² ≤ stabilization time ≤ (3 + 13/18) n²

References

• E. W. Dijkstra, Self-stabilizing systems in spite of distributed control, CACM, 17, 11, 1974, pp. 643-644.

• E. W. Dijkstra, A belated proof of self-stabilization, Distributed Computing, 1, 1, 1986, pp. 5-6.

• M. G. Gouda and T. Herman, Stabilizing Unison, Department of Computer Science, 1990.

• V. Chernoy, M. Shalom and S. Zaks, A Self-stabilizing algorithm with tight bounds for mutual exclusion on a ring, DISC 2008.