Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring 2004 1 Principles of Reliable Distributed Systems Recitation 5: Reliable Broadcasts Spring 2005 Aran Bergman

Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring 2004

Principles of Reliable Distributed Systems

Recitation 5: Reliable Broadcasts

Spring 2005

Aran Bergman

Last time on 046272

• Consistent Global State
  – FIFO Order
  – Happens-before relation (Causal Order)

• Synchronous vs. Asynchronous models

• Failure Models (Processes and Links)

• Reliable Broadcast Services

Process Failure Models (Reminder)

• The diagram is organized by severity.

• The arrows represent proper subsets, e.g. the Crash failure model is a proper subset of the Receive Omission model.
  – Receive Omission: a faulty process stops prematurely, or intermittently omits to receive messages sent to it, or both.

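[Diagram: process failure models in increasing order of severity: Crash; Receive Omission and Send Omission; General Omission; Timing; Authenticated Byzantine; Byzantine. The models are grouped into Benign and Malicious.]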

Link Failure Models (Reminder)

• Reliable links:
  – every message sent is eventually delivered

• Failure types:
  – Crash
  – Loss (omission)
  – Timing
  – Byzantine

Reliable Broadcast Specifications

• Validity: if a correct process broadcasts m then all correct processes eventually deliver m

• Agreement: if a correct process delivers m then all correct processes eventually deliver m
  – Uniform Agreement: if any process delivers m then all correct processes eventually deliver m

• Integrity: m is delivered by a correct process at most once, and only if it was previously broadcast

Reliable Broadcast (cont’d)

• What happens if a process fails during the broadcast of a message?

FIFO Broadcast

• If a process broadcasts a message m before it broadcasts a message m’, then no correct process delivers m’ unless it has previously delivered m.

• Alternative definition?
  – “all messages broadcast by the same process are delivered to all processes in the order they are sent”

• Are these definitions equivalent?

Example 1

[Diagram: timelines of p (faulty) and q (correct), with messages m1, m2 and m3.]

• Also, this alternative definition forces faulty processes to deliver messages. (impossible)
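To make the gap between the two FIFO definitions concrete, here is a minimal Python sketch. It assumes one reading of the alternative definition (relative order is preserved, but gaps are allowed) and a hypothetical trace in which q delivers m1 and m3 but never m2. The function names and the trace are illustrative and are not taken from the slide's diagram.

```python
def fifo_order_ok(broadcast_order, delivered):
    """Original definition: m' is delivered only after every message the
    same sender broadcast earlier has already been delivered."""
    seen = set()
    for m in delivered:
        idx = broadcast_order.index(m)
        if any(prev not in seen for prev in broadcast_order[:idx]):
            return False
        seen.add(m)
    return True


def alt_order_ok(broadcast_order, delivered):
    """Alternative reading: delivered messages appear in broadcast order,
    but nothing forbids skipping a message."""
    positions = [broadcast_order.index(m) for m in delivered]
    return positions == sorted(positions)


sent_by_p = ["m1", "m2", "m3"]   # p broadcasts m1, then m2, then m3
q_delivers = ["m1", "m3"]        # hypothetical: q never delivers m2

print(fifo_order_ok(sent_by_p, q_delivers))  # False
print(alt_order_ok(sent_by_p, q_delivers))   # True
```

Under this reading the trace passes the alternative definition while violating the original one, so the two are not equivalent.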

Causal Broadcast

• If the broadcast of a message m causally precedes the broadcast of a message m’, then no correct process delivers m’ unless it has previously delivered m.

• Event e causally precedes event f (e→f) iff:
  – a process executes both e and f, in that order, or
  – e is the broadcast of some message m and f is the delivery of m, or
  – there is an event h such that e→h and h→f.
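The three rules can be sketched as a small Python function that builds the relation from per-process event histories. The event encoding (`"bcast"` / `"dlv"` tuples) and the function name are assumptions made for illustration only.

```python
from itertools import product


def causally_precedes(histories):
    """histories: {process: [("bcast" | "dlv", msg), ...]} in local order.
    Returns the set of pairs (e, f) with e -> f, where an event is
    written as (process, kind, msg)."""
    events = [(p, kind, m) for p, h in histories.items() for kind, m in h]
    rel = set()
    # Rule 1: the same process executes e and then f.
    for p, h in histories.items():
        for i in range(len(h)):
            for j in range(i + 1, len(h)):
                rel.add(((p, *h[i]), (p, *h[j])))
    # Rule 2: the broadcast of m precedes every delivery of m.
    for e, f in product(events, events):
        if e[1] == "bcast" and f[1] == "dlv" and e[2] == f[2]:
            rel.add((e, f))
    # Rule 3: close the relation under transitivity.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(rel), list(rel)):
            if b == c and (a, d) not in rel:
                rel.add((a, d))
                changed = True
    return rel


h = {"A": [("bcast", "m1")],
     "B": [("dlv", "m1"), ("bcast", "m2")]}
rel = causally_precedes(h)
print((("A", "bcast", "m1"), ("B", "bcast", "m2")) in rel)  # True
```

Here the broadcast of m1 causally precedes the broadcast of m2 only through rule 3, via B's delivery of m1.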

Causal Broadcast (cont’d)

• Alternative definition?
  – “if the broadcast of m causally precedes the broadcast of m’, then every correct process that delivers both messages must deliver m before m’.”

• Are these definitions equivalent?

Example 2

[Diagram: timelines of A (faulty), B (faulty) and C (correct), with messages m1 and m2.]

• In a system with failures:
  – A broadcasts a message that is delivered only by B.
  – B broadcasts a response to A.
  – C delivers a response to a message it never delivers.

Atomic Broadcast and Uniformity

• Atomic Broadcast = Total Order

• Uniform: limit the behavior of faulty processes
  – Agreement, Integrity
  – FIFO Order, Causal Order, Total Order

Benign Failures

• Suppose processes are only subject to crash failures.
  – They operate correctly up to the time they crash (by definition).

• Can we assume that the message deliveries that a process makes before crashing are always ‘correct’?

Benign Failures (cont’d)

• Even if a faulty process behaves correctly until it crashes, it may still deliver messages out-of-order before it crashes!

• Coordinator-based Atomic Broadcast algorithm:
  – When a process intends to broadcast a message m, it first sends m to a coordinator.
  – The coordinator delivers messages in the order in which it receives them, and periodically informs the other processes of this message delivery order.
  – Other processes deliver messages according to this order.

Benign Failures (cont’d)

– If the coordinator crashes, another process takes over as coordinator.
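The following toy Python run illustrates how a crash-faulty coordinator can end up with a delivery order that differs from everyone else's. The scenario is hypothetical: the first coordinator delivers and then crashes before announcing its order, and the new coordinator happens to have received the same messages in the opposite order. Class and function names are illustrative.

```python
class Process:
    def __init__(self, name):
        self.name = name
        self.delivered = []   # delivery order at this process

    def deliver_in_order(self, order):
        for m in order:
            if m not in self.delivered:
                self.delivered.append(m)


def run():
    c1, c2, r = Process("c1"), Process("c2"), Process("r")
    # c1 is the coordinator: it receives b then a, delivers them at once,
    # and crashes before informing anyone of this order.
    c1.deliver_in_order(["b", "a"])
    # c2 takes over; it happened to receive a before b.
    order = ["a", "b"]
    for p in (c2, r):
        p.deliver_in_order(order)
    return c1.delivered, c2.delivered, r.delivered


print(run())  # (['b', 'a'], ['a', 'b'], ['a', 'b'])
```

The surviving processes agree with each other, yet the crashed coordinator delivered in a different order before it failed, even though it never deviated from the algorithm.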

Broadcast Primitives

Broadcast Algorithms

• Our model:
  – Asynchronous
  – Benign process failures
  – Link specifications:
    • Validity: If p sends m to q, and both p and q and the link between them are correct, then q eventually receives m.
    • Uniform Integrity: For any message m, q receives m at most once from p, and only if p previously sent m to q.

• Our algorithms:
  – Satisfy Uniform Integrity.
  – Not optimized.

Notations

• Reliable broadcast:
  – broadcast(R,m), deliver(R,m)

• FIFO broadcast:
  – broadcast(F,m), deliver(F,m)

• Causal broadcast:
  – broadcast(C,m), deliver(C,m)

• Every message includes:
  – The sender’s ID, denoted: sender(m)
  – A sequence number, denoted: seq#(m)

Reliable Broadcast

Reliable Broadcast (cont’d)

• When does the algorithm provide Reliable Broadcast?

• If we assume that:
  – There are only receive-omission failures
  – Every process p (whether correct or faulty) is connected to every correct process via a path consisting entirely of correct processes and links (with the possible exception of p itself)

• Then the algorithm satisfies Uniform Agreement.
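As a point of reference, here is a minimal Python sketch of one common diffusion-style Reliable Broadcast: each process relays a message to everyone the first time it receives it, and then delivers it. It may differ in detail from the algorithm presented in the recitation. The in-memory `Network`, the fully connected topology, and the `down` set of broken links are all assumptions made for the example.

```python
from collections import deque


class Node:
    def __init__(self, pid, network):
        self.pid, self.network = pid, network
        self.delivered = []   # R-delivered messages, in delivery order
        self.seq = 0

    def r_broadcast(self, payload):
        self.seq += 1
        m = (self.pid, self.seq, payload)   # sender(m), seq#(m), body
        self.network.send_to_all(self.pid, m)

    def on_receive(self, m):
        if m in self.delivered:             # already delivered: ignore
            return
        if m[0] != self.pid:                # relay first, then deliver
            self.network.send_to_all(self.pid, m)
        self.delivered.append(m)


class Network:
    """Toy in-memory network; `down` holds broken directed links."""
    def __init__(self, pids, down=()):
        self.nodes = {p: Node(p, self) for p in pids}
        self.down, self.queue = set(down), deque()

    def send_to_all(self, src, m):
        for dst in self.nodes:
            if (src, dst) not in self.down:
                self.queue.append((dst, m))

    def run(self):
        while self.queue:
            dst, m = self.queue.popleft()
            self.nodes[dst].on_receive(m)


net = Network(["p", "q", "r"], down={("p", "r")})
net.nodes["p"].r_broadcast("hello")
net.run()
print([n.delivered for n in net.nodes.values()])
```

Although the link from p to r is broken, r still delivers the message because q relays it, which is the path assumption stated above.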

FIFO Broadcast

• We give a reduction of FIFO Broadcast to Reliable Broadcast.

• The only assumption is that we have Reliable Broadcast. We don’t need the other assumptions (apart from benign failures for Uniform Integrity).

FIFO Broadcast (cont’d)

FIFO Broadcast (cont’d)

• The given algorithm also satisfies Uniform FIFO Broadcast.

• If the Reliable Broadcast algorithm used satisfies Uniform Agreement, the algorithm also satisfies Uniform Agreement.
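One way such a reduction can look is sketched below in Python: R-delivered messages are buffered and released per sender in sequence-number order. This is a sketch under the assumption that sequence numbers start at 1 for every sender; the class and attribute names are hypothetical.

```python
from collections import defaultdict


class FifoLayer:
    """Turns R-deliveries (any order) into F-deliveries (per-sender order)."""
    def __init__(self):
        self.next_seq = defaultdict(lambda: 1)  # next seq# expected per sender
        self.bag = {}                           # (sender, seq#) -> payload
        self.f_delivered = []

    def on_r_deliver(self, sender, seq, payload):
        self.bag[(sender, seq)] = payload
        # Release every buffered message from this sender that is now in order.
        while (sender, self.next_seq[sender]) in self.bag:
            n = self.next_seq[sender]
            self.f_delivered.append((sender, n, self.bag.pop((sender, n))))
            self.next_seq[sender] = n + 1


layer = FifoLayer()
layer.on_r_deliver("p", 2, "b")   # arrives early: buffered
layer.on_r_deliver("p", 1, "a")   # unblocks both messages
print(layer.f_delivered)          # [('p', 1, 'a'), ('p', 2, 'b')]
```

Because a message is never released before its predecessors, even a process that later crashes cannot have F-delivered out of order.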

Causal Broadcast

• Why not use LTS?
  – It gives us causal delivery order + total order!

• In the lecture notes you saw an implementation with Vector Clocks

Causal Broadcast (cont’d)

Causal Broadcast (cont’d)

• We give a reduction of Causal Broadcast to Uniform FIFO Broadcast.

• The algorithm satisfies Uniform Causal Order.

• If the FIFO Broadcast satisfies Uniform Agreement, the derived algorithm also satisfies Uniform Agreement.
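A reduction of this kind can be sketched in Python as follows: every C-broadcast piggybacks the messages the sender has C-delivered since its previous broadcast. This is a sketch assuming an underlying FIFO layer exposed as a plain callable; the names `CausalLayer`, `f_broadcast`, and `rcnt_dlvrs` are illustrative and the recitation's own algorithm may differ in detail.

```python
class CausalLayer:
    def __init__(self, f_broadcast):
        self.f_broadcast = f_broadcast   # callable supplied by the FIFO layer
        self.rcnt_dlvrs = []             # C-delivered since our last C-broadcast
        self.c_delivered = []

    def c_broadcast(self, m):
        # Piggyback everything delivered recently, followed by m itself.
        self.f_broadcast(self.rcnt_dlvrs + [m])
        self.rcnt_dlvrs = []

    def on_f_deliver(self, batch):
        for m in batch:
            if m not in self.c_delivered:
                self.c_delivered.append(m)
                self.rcnt_dlvrs.append(m)


sent = []                      # stands in for the FIFO broadcast channel
b = CausalLayer(sent.append)
b.on_f_deliver(["m1"])         # B gets m1 from A
b.c_broadcast("m2")            # B responds; the batch is ["m1", "m2"]
c = CausalLayer(sent.append)
c.on_f_deliver(sent[0])        # C never received A's own broadcast
print(c.c_delivered)           # ['m1', 'm2']
```

In this run C still delivers m1 before m2, because m1 travels inside B's batch.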

Causal Broadcast (cont’d)

• The above algorithm is a “brute force” one (and very inefficient in message length)

• Instead of sending the messages in rcntDlvrs, we can maintain a msgList (like msgSet, but maintains order) of F-delivered messages and send only message IDs.

• Each process, when F-delivering a message, should check the msgList to see if it can deliver messages according to the order of received IDs.

Causal Broadcast (cont’d)

• Since we have FIFO Broadcast, we don’t need to send all the IDs, only the ID of the last message a process delivered from each process.

• Thus we get Vector Clocks
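The resulting delivery rule can be sketched in Python using the standard vector-clock condition: a message from sender j is deliverable once it is the next message from j and everything it depends on from other processes has been delivered. This sketch does not assume an underlying FIFO layer (the condition on the sender's own entry enforces per-sender order); process IDs as integers and the class name are assumptions.

```python
class VcCausal:
    def __init__(self, pid, n):
        self.pid = pid
        self.vc = [0] * n     # vc[k] = seq# of the last message delivered from k
        self.pending, self.delivered = [], []

    def broadcast(self, payload):
        stamp = list(self.vc)
        stamp[self.pid] += 1  # this is our next message
        msg = (self.pid, stamp, payload)
        self.receive(msg)     # the sender delivers its own message
        return msg

    def _ready(self, sender, stamp):
        return (stamp[sender] == self.vc[sender] + 1 and
                all(stamp[k] <= self.vc[k]
                    for k in range(len(stamp)) if k != sender))

    def receive(self, msg):
        self.pending.append(msg)
        progress = True
        while progress:       # keep draining until nothing else is ready
            progress = False
            for m in list(self.pending):
                sender, stamp, payload = m
                if self._ready(sender, stamp):
                    self.pending.remove(m)
                    self.vc[sender] += 1
                    self.delivered.append(payload)
                    progress = True


a, b, c = VcCausal(0, 3), VcCausal(1, 3), VcCausal(2, 3)
m1 = a.broadcast("m1")
b.receive(m1)
m2 = b.broadcast("m2")   # causally after m1
c.receive(m2)            # held back: m1 is missing
c.receive(m1)
print(c.delivered)       # ['m1', 'm2']
```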

Causal Broadcast (Take II)

Uniform Specifications

• Uniform Agreement: If a process (whether correct or faulty) delivers a message m, then all correct processes eventually deliver m.

• Uniform Integrity: For any message m, every process (whether correct or faulty) delivers m at most once, and only if some process broadcast m.

• Uniform FIFO Order: If a process broadcasts a message m before it broadcasts a message m’, then no process (whether correct or faulty) delivers m’ unless it has previously delivered m.

• Uniform Causal Order: If the broadcast of a message m causally precedes the broadcast of a message m’, then no process (whether correct or faulty) delivers m’ unless it has previously delivered m.

• Uniform Total Order: If any two processes p and q (whether correct or faulty) both deliver messages m and m’, then p delivers m before m’ iff q delivers m before m’.
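The difference between Total Order and its uniform variant can be checked mechanically. The Python sketch below tests whether every pair of processes delivers their common messages in the same relative order; passing all histories checks the uniform property, while passing only those of correct processes checks the non-uniform one. It assumes each message appears at most once per history, and the function name and sample histories are hypothetical.

```python
from itertools import combinations


def total_order_ok(histories):
    """histories: {process: [messages in delivery order]}.
    True iff every two processes deliver their common messages in the
    same relative order."""
    for p, q in combinations(histories, 2):
        common = set(histories[p]) & set(histories[q])
        p_view = [m for m in histories[p] if m in common]
        q_view = [m for m in histories[q] if m in common]
        if p_view != q_view:
            return False
    return True


# Hypothetical run: c1 crashed after delivering b, a; c2 and r are correct.
run = {"c1": ["b", "a"], "c2": ["a", "b"], "r": ["a", "b"]}
correct_only = {p: h for p, h in run.items() if p != "c1"}

print(total_order_ok(correct_only))  # True: Total Order holds
print(total_order_ok(run))           # False: Uniform Total Order does not
```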