EECS 491 Introduction to Distributed Systems Fall 2019 Harsha V. Madhyastha

EECS 491 Introduction to Distributed Systems · 2019. 9. 26. · Case study: GFS Google File System Distributed storage system tailored to Google’s workload Workload characteristics


EECS 491: Introduction to Distributed Systems

Fall 2019

Harsha V. Madhyastha


Case study: GFS

● Google File System
  ◆ Distributed storage system tailored to Google’s workload

● Workload characteristics and setting:
  ◆ Multi-GB files
  ◆ Files are mostly appended to
  ◆ Failures are extremely common

September 24, 2019 EECS 491 – Lecture 7 2


High-level Design

● Files are split into 64 MB chunks

● Every chunk is replicated on three randomly selected machines

● A central chunkmaster server picks replicas of every chunk
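The design above can be illustrated with a minimal sketch. It assumes a chunk is addressed by integer-dividing the byte offset by the chunk size, and that "randomly selected" means a uniform sample of distinct machines. The function names and machine names are hypothetical and do not come from GFS itself.

```python
import random

CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB chunks, as on the slide
REPLICATION_FACTOR = 3          # three replicas per chunk, as on the slide


def chunk_index(byte_offset):
    """Return the index of the chunk that holds the given byte offset."""
    return byte_offset // CHUNK_SIZE


def num_chunks(file_size):
    """Number of chunks needed for a file of file_size bytes (ceiling division)."""
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE


def pick_replicas(machines, rng=random):
    """Hypothetical chunkmaster policy: pick three distinct machines uniformly at random."""
    return rng.sample(machines, REPLICATION_FACTOR)


# Hypothetical cluster of five machines and a 3 GB file.
machines = ["m1", "m2", "m3", "m4", "m5"]
file_size = 3 * 1024 ** 3
placement = {i: pick_replicas(machines) for i in range(num_chunks(file_size))}
```

Here `placement` maps each chunk index of the file to the three machines that would store it.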


GFS Overview


Replication in GFS

[Figure: Client, Primary, two Backups, and Chunkmaster]


Replication in GFS

● High latency to distant primary
  ◆ In data center, bandwidth degrades with distance


Replication in GFS

● High latency to distant primary
● Submitting write to nearest replica will compromise total ordering of writes


[Figure: Client, Client2, Primary, two Backups, and Chunkmaster]


Replication in GFS

● High latency to distant primary
● Writing to nearest replica compromises total ordering
● Optimize performance without violating consistency?
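The ordering problem can be made concrete with a small sketch. It assumes a chunk behaves like a single register where the last applied write wins. The log contents, the `Primary` class, and the helper names are hypothetical and used only for illustration.

```python
def apply_in_order(writes):
    """Apply writes to a single register; the last write applied wins."""
    value = None
    for w in writes:
        value = w
    return value


# Writing to the nearest replica: each replica sees its local client's write first.
replica_a_log = ["x=1 (Client)", "x=2 (Client2)"]
replica_b_log = ["x=2 (Client2)", "x=1 (Client)"]
diverged = apply_in_order(replica_a_log) != apply_in_order(replica_b_log)


class Primary:
    """Hypothetical primary that stamps every write with a sequence number."""

    def __init__(self):
        self.next_seq = 0

    def order(self, write):
        seq = self.next_seq
        self.next_seq += 1
        return (seq, write)


def replay(stamped_writes):
    """Apply stamped writes in sequence-number order, whatever order they arrived in."""
    return apply_in_order(w for _, w in sorted(stamped_writes))


primary = Primary()
stamped = [primary.order("x=1 (Client)"), primary.order("x=2 (Client2)")]
# Both replicas converge even if the stamped writes arrive in different orders.
same = replay(stamped) == replay(list(reversed(stamped)))
```

In this sketch `diverged` is true for nearest-replica writes, while routing every write through one primary gives all replicas the same final state. The cost is the round trip to a possibly distant primary, which is the tension the last bullet raises.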


Data flow vs. Control flow
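The slide itself survives only as a title, so the following is a sketch under an assumption: that it refers to the decoupling described in the GFS paper, where the bulk data is pushed along a chain of replicas starting at the one nearest the client, and only a small commit request goes to the primary, which fixes the order. All class and method names are hypothetical.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.buffer = {}   # data_id -> payload received but not yet applied
        self.log = []      # applied writes, in the order chosen by the primary

    def push_data(self, data_id, data, chain):
        """Data flow: buffer the payload, then forward it to the next replica in the chain."""
        self.buffer[data_id] = data
        if chain:
            chain[0].push_data(data_id, data, chain[1:])

    def apply(self, seq, data_id):
        """Control flow: apply buffered data at the position the primary assigned."""
        self.log.append((seq, self.buffer.pop(data_id)))


class PrimaryReplica(Replica):
    def __init__(self, name, backups):
        super().__init__(name)
        self.backups = backups
        self.next_seq = 0

    def commit(self, data_id):
        """Assign the next sequence number and have every replica apply in that order."""
        seq = self.next_seq
        self.next_seq += 1
        self.apply(seq, data_id)
        for backup in self.backups:
            backup.apply(seq, data_id)
        return seq


b1, b2 = Replica("backup1"), Replica("backup2")
primary = PrimaryReplica("primary", [b1, b2])

# Client is nearest to backup1, Client2 is nearest to backup2.
b1.push_data("d1", b"hello", [b2, primary])
b2.push_data("d2", b"world", [primary, b1])

# Commit requests are small, so sending them to the distant primary is cheap.
primary.commit("d2")
primary.commit("d1")
```

After both commits, all three logs hold the same writes in the same order, even though the payloads entered the system at different replicas.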


GFS Performance Benchmark


Implementing RSMs

● Logical clock based ordering of requests
  ◆ Cannot serve requests if any one replica is down

● Primary-backup replication
  ◆ Replace primary/backup upon failure


Availability of P/B-based RSM

● When is RSM unavailable to serve requests?

● Temporarily:
  ◆ While primary is bootstrapping new backup
  ◆ Replica is down but viewservice is yet to detect it

● Permanently:
  ◆ Link between primary and backup is down, but both can talk to viewservice
  ◆ Primary fails while bootstrapping backup


How to …

● … make RSM tolerant to network partitions?

● … ensure that operations don’t block even if some machines are unavailable?


RSM via Consensus

● Idea: Apply update if majority of replicas commit
● If 2f+1 replicas, need f+1 to commit

● Why majority? Why not fewer or more?
● Remaining replicas cannot accept some other update
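The majority argument can be checked by brute force on a small cluster. This is an illustrative sketch, not part of the lecture: it enumerates every pair of quorums of a given size and tests whether they share a replica. The function names are hypothetical.

```python
from itertools import combinations


def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def all_quorums_intersect(n, q):
    """True if every two size-q subsets of n replicas share at least one replica."""
    replicas = range(n)
    return all(
        set(a) & set(b)
        for a in combinations(replicas, q)
        for b in combinations(replicas, q)
    )


def failures_tolerated(n, q):
    """Replicas that can be down while a size-q quorum can still be assembled."""
    return n - q


f = 2
n = 2 * f + 1   # 5 replicas
```

With 5 replicas, quorums of 3 always overlap, so two different updates cannot both be committed. Quorums of 2 can be disjoint, which answers "why not fewer". Quorums of 4 also overlap but tolerate only one failure instead of two, which answers "why not more".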


Context for Today’s Lecture

● Say all replicas are in sync with each other

● First: Among several concurrent new updates, how to pick next update to apply?

● Later: How to apply all updates in a consistent order at all replicas?


Let’s plan a camping trip!

● Before going away on internships, three friends plan on going camping

● Can only coordinate via unreliable text messages

● How to decide on a camp to meet at?


[Figure: the three friends, Alice, Bob, and Sam]


Strawman Approaches

● Every user sends their proposal to everyone
● Every user accepts first proposal received
● Proposal accepted by majority is chosen
● Why might this not work?

● Every user tags proposal with seq number
● Every user collects proposals and accepts highest seq number proposal
● Why might this not work?
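One possible answer to each "why might this not work?" can be shown with a small sketch. The scenarios are assumptions chosen for illustration: in the first, each friend's own proposal reaches them first; in the second, one message is lost. The camp names and helper names are hypothetical.

```python
from collections import Counter


def chosen_by_majority(accepted):
    """Return the proposal accepted by a majority of users, or None if there is none."""
    proposal, votes = Counter(accepted.values()).most_common(1)[0]
    return proposal if votes > len(accepted) // 2 else None


# Strawman 1: everyone accepts the first proposal received, which is their own.
first_received = {"Alice": "Yosemite", "Bob": "Acadia", "Sam": "Zion"}
outcome_1 = chosen_by_majority(first_received)   # split vote, nothing is chosen


def accept_highest(seen):
    """Accept the proposal with the highest sequence number among those seen."""
    return max(seen, key=lambda p: p[0])[1]


# Strawman 2: the text carrying proposal 3 never reaches Alice.
alice_seen = [(1, "Yosemite"), (2, "Acadia")]
bob_seen = [(1, "Yosemite"), (2, "Acadia"), (3, "Zion")]
alice_choice = accept_highest(alice_seen)
bob_choice = accept_highest(bob_seen)
```

In the first case no proposal ever reaches a majority, so the friends are stuck. In the second, Alice and Bob settle on different camps because neither can know whether they have seen every proposal.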


Paxos

● Original paper submitted in 1990
  ◆ Tells mythical story of Greek island of Paxos with “legislators” and “current law” passed through parliamentary voting protocol

● Widely used in industry today


Desirable Properties

● Safety
  ◆ “No bad things happen”
  ◆ System never reaches an undesirable state

● Liveness
  ◆ “Good things eventually happen”
  ◆ System makes progress eventually

● Tradeoff between consistency and latency


Desired Properties of Solution

● Safety:
  ◆ Choose a proposal only if accepted by a majority
  ◆ Choose from proposals made

● Liveness:
  ◆ If proposals exist, one will eventually be chosen
  ◆ If a proposal is chosen, all replicas will eventually discover that it was chosen


Project 2

● View service:
  ◆ Draw state diagram
  ◆ What events can cause view to change?
  ◆ What state determines how to change view?
  ◆ Think about cases not covered by unit tests

● Primary backup service:
  ◆ Carefully think about implications of every failed RPC
  ◆ Sleep for PingInterval before retrying RPC
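The retry advice can be sketched as a simple loop. This is an illustration in Python rather than the project's own code: `PING_INTERVAL` stands in for the slide's PingInterval with an assumed value, and `call_with_retry` and the `(ok, reply)` return convention are hypothetical.

```python
import time

PING_INTERVAL = 0.1   # seconds; assumed value, stands in for the project's PingInterval


def call_with_retry(rpc, max_attempts=5, sleep=time.sleep):
    """Call rpc() until it reports success, sleeping PING_INTERVAL after each failure.

    rpc is a hypothetical callable returning (ok, reply). A failed call does not
    reveal whether the server executed the request, so the request must be safe
    to send more than once.
    """
    for _ in range(max_attempts):
        ok, reply = rpc()
        if ok:
            return reply
        sleep(PING_INTERVAL)
    return None
```

The `sleep` parameter exists only so the loop can be exercised without real delays. The comment about duplicate execution is the kind of implication the first bullet asks about: the reply may have been lost after the server already applied the request.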


Roles of a Process in Paxos

● Three conceptual roles
  ◆ Proposers propose values
  ◆ Acceptors accept values; chosen if majority accept
  ◆ Learners learn the outcome (chosen value)

● In reality, a process can play any/all roles
● Roles in bank account example?
● Roles in camping trip example?
● Roles if Paxos used to pass laws?


Paxos: High-level Intuition

● May be unable to reach consensus in one round
● So, protocol runs over multiple rounds
● In each round:
  ◆ Elect a leader
  ◆ Proposal by current leader accepted by majority

● Once a value is accepted by a majority, a different value won’t be proposed subsequently


Paxos Overview

● Three phases within each round

● Prepare phase (elect a leader):
  ◆ Proposer sends unique proposal no. to all acceptors
  ◆ Waits to get commitment from majority of acceptors

● Accept phase (get majority to accept):
  ◆ Proposer sends proposed value to all acceptors
  ◆ Waits to get proposal accepted by majority

● Learn phase (disseminate chosen value):
  ◆ Learners discover value accepted by majority
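The three phases can be sketched for a single decision as follows. This is a simplified illustration under stated assumptions: messages are plain function calls that are never lost, the proposer also acts as the learner, and the rule that a proposer adopts the highest-numbered value already accepted comes from standard single-decree Paxos rather than from this slide. All names are hypothetical.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1           # highest proposal number promised so far
        self.accepted_n = -1         # number of the proposal accepted, if any
        self.accepted_value = None

    def prepare(self, n):
        """Prepare phase: promise to reject proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, -1, None

    def accept(self, n, value):
        """Accept phase: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run one round; return the chosen value, or None if the round failed."""
    majority = len(acceptors) // 2 + 1

    # Prepare phase: collect commitments from acceptors.
    promises = [p for p in (a.prepare(n) for a in acceptors) if p[0]]
    if len(promises) < majority:
        return None

    # If some acceptor already accepted a value, propose that value instead.
    prev_n, prev_value = max(((pn, pv) for _, pn, pv in promises), key=lambda t: t[0])
    if prev_n >= 0:
        value = prev_value

    # Accept phase: the value is chosen once a majority accepts it.
    accepts = sum(a.accept(n, value) for a in acceptors)

    # Learn phase: here the proposer simply returns what it learned.
    return value if accepts >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, 1, "Yosemite")
second = propose(acceptors, 2, "Zion")      # later round, different proposed value
stale = propose(acceptors, 1, "Acadia")     # proposal number already superseded
```

In this run the first round chooses "Yosemite", the second round rediscovers and re-proposes "Yosemite" instead of "Zion", and the stale round gets no commitments. Tolerating lost messages and competing proposers would need timeouts and retries with higher proposal numbers, which the sketch leaves out.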
