8
Write Fast, Read in the Past: Causal Consistency for Client- Side Applications Marc Shapiro Joint work with: Marek Zawirski, Inria & LIP6 Annette Bieniusa, U. Kaiserslautern Valter Balegas, U. Nova Lisboa Sérgio Duarte, U. Nova Lisboa Carlos Baquero, U. Minho Nuno Preguiça, U. Nova Lisboa [Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps] Cloud-backed applications 2 1. Request 4. Transmit update 50~200ms 3. Reply 2. Process request & store update Large amounts of shared mutable data Geo-replication: high availability, low latency, fault tolerance 10~200ms 10~200ms databas e app server ad-hoc app interface [Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps] SwiftCloud approach 3 Transmit App + database at/near client Update shared store locally Availability + consistency partial database app Process request & store update Transmit Transmit fail-over full database Transmit [Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps] Requirements & challenges Scalability with #items, #clients Partial replication of data … and of metadata Small, bounded metadata Availability causal consistency Eventual exactly-once delivery Causal order … despite partial replication … despite transient failures … despite permanent failures Convergence 4

Write Fast, Read in the Past: Cloud-backed applications ...people.rennes.inria.fr/Adrien.Lebre/PUBLIC/CloudDay2015/MarcShapi… · Current & future work ... (Ruby),'riak:java:crdt'(Java)

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Write Fast, Read in the Past: Cloud-backed applications ...people.rennes.inria.fr/Adrien.Lebre/PUBLIC/CloudDay2015/MarcShapi… · Current & future work ... (Ruby),'riak:java:crdt'(Java)

Write Fast, Read in the Past: Causal Consistency for Client-

Side Applications

Marc ShapiroJoint work with:

Marek Zawirski, Inria & LIP6Annette Bieniusa, U. Kaiserslautern

Valter Balegas, U. Nova LisboaSérgio Duarte, U. Nova Lisboa

Carlos Baquero, U. MinhoNuno Preguiça, U. Nova Lisboa

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Cloud-backed applications

2

1. Request4. Transmit

update

50~200ms3. Reply

2. Process request & store update

Large amounts of shared mutable dataGeo-replication: high availability, low latency, fault tolerance

10~200ms

10~20

0ms

database

app server

ad-hoc app interface

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

SwiftCloud approach

3

Transmit

App + database at/near clientUpdate shared store locallyAvailability + consistency

partialdatabase

app

Process request& store update Transmit

Transmit

fail-over

fulldatabase

Transmit

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Requirements & challenges

Scalability with #items, #clients• Partial replication of data• … and of metadata• Small, bounded metadata

Availability ⇒ causal consistency• Eventual exactly-once delivery• Causal order• … despite partial replication• … despite transient failures• … despite permanent failures

Convergence

4

Page 2: Write Fast, Read in the Past: Cloud-backed applications ...people.rennes.inria.fr/Adrien.Lebre/PUBLIC/CloudDay2015/MarcShapi… · Current & future work ... (Ruby),'riak:java:crdt'(Java)

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Metadata

Tracking causality• Dependence: piggy-backed on update message• Delivery: compare with incoming message

Representations:• Graph: worst-case cost, transitive closure cost• Vector: compact transitive closure, constant

cost + optimisationsIssue: size(vector) = O (#masters)Log: pruning not contingent on clients

5

1063

depend on 10th update of

replica 1

received 10th update from

Replica 1

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Causality + scalability: DC

Full replica state:• Causally consistent• Transitively closed

Vector clock• size = O(#DCs)

6

DC

DC

DC

C

C C

C

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Causality + scalability: client

Partial replica: cache interest set Topology: Communicate with DC onlyState = DC state |interest set ∪ client updatesConcurrency: size(vector) = O(#DCs)!

7

DC

DC

DC

C

C C

C

Remote: Causally

consistent Local: Read My Writes

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Causal consistency & fail-over

Asynchronous propagation• At least once: stubborn• At most once: filter out duplicates: how?• Consistency with new DC?

8

fail-overDC

DC

DC

C

C C

C

Page 3: Write Fast, Read in the Past: Cloud-backed applications ...people.rennes.inria.fr/Adrien.Lebre/PUBLIC/CloudDay2015/MarcShapi… · Current & future work ... (Ruby),'riak:java:crdt'(Java)

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Inconsistency with new DC

Causal gap at client: y.add(C) depends on x.add(T)

Oregon cannot satisfy: fail-over impossible; client blocks until Ireland recovers

9

xy

xy

y.add A

x.add T

y.add B

Oregon DC

Ireland DC

(y.add A)

(y.add A)

(y.add B)Paris Client

y.add(C)y.readread x

read not available

write not transmitted

xy

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Solution: reading in the past

Client served only updates that are K-stable ⇒ no gaps• K=1: always fresh, fail-over problematic• K=N: fail-over guaranteed, staleness‣ updates possibly buffered indefinitely

Default: K=2, covers common failures• Also improves throughput

10

x.read

xy

xy

y.add A

x.add T

y.add B

Oregon DC

Ireland DC

(y.add A)

(y.add A)

Paris Client y.add(C)y.readx

y

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Convergence by construction

Merging concurrent updates:• Deterministic• Dependent only on delivery poset• Not on delivery order, local info

Sufficient conditions:• Delivered updates commute• (or) Monotonic semi-lattice

CRDTs

11 [Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Conflict-free Replicated Data Types (CRDTs)Converge concurrent updates

Encapsulate replication & resolutionRe-usable data types

Correct by construction

12

Register• Last-Writer Wins• Multi-Value

Set• Grow-Only• 2P• Observed-Remove

Map

Counter• Unlimited• Restricted non-

negativeGraph

• Directed• Monotonic DAG• Edit graph

Sequence

Page 4: Write Fast, Read in the Past: Cloud-backed applications ...people.rennes.inria.fr/Adrien.Lebre/PUBLIC/CloudDay2015/MarcShapi… · Current & future work ... (Ruby),'riak:java:crdt'(Java)

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Highly-Available Transactions

Unit of reading / writing• Read committed• Read atomic

Update is visible only when • preceding visible• all in transaction visible

Single ID, increment• read in the past

No synchronisation necessary

13 [Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Experiments

3 DCs in Amazon EC21 DC = 1 EC2 medium instance100 client nodes in PlanetLabMultiple sessions per nodeCache size: 512 objects

14

ms

msms

Social networking application•90% cache hitsEvaluate•Availability, responsiveness• Size of metadata• Latency, throughput

[Write'Fast,'Read'in'the'Past:'Causal+'Consistency'for'Client:Side'Apps]

• remote = application logic & data in DC'• SwiftCloud: application, updates at client,

asynchronous commit'• SwiftCloud RIP: read K-stable'• stable updates: read one, write K

Update caching + Read-In-Past minimize latency

RTT 2. RTT

Operations with > 1 cache miss, selective queries

Read-in-past + client-assisted fault tolerance

Client-side caching & updates

15

writes,RIP

reads,RIPreads/writes,

remote, no FT

reads, remote + stable update

writ

es, r

emot

e+st

able

upd

ate

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Latency vs. throughput

SwiftCloud

1 DC

2 DC

3 DC

1 DC

2 DC

3 DC

classic synch.

10×

60×

Page 5: Write Fast, Read in the Past: Cloud-backed applications ...people.rennes.inria.fr/Adrien.Lebre/PUBLIC/CloudDay2015/MarcShapi… · Current & future work ... (Ruby),'riak:java:crdt'(Java)

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Latency vs. throughput

SwiftCloud

1 DC

2 DC

3 DC

1 DC

2 DC

3 DC

classic synch. 6×

12×

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Staleness for fault tolerance

18

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Metadata overhead

19

SwiftCloud PRACTI/Depot

● ● ●● ● ●

0

1000

2000

3000

4000

5000

7500 15000 22500 30000 375007500 15000 22500 30000 37500#users

syst

em m

etad

ata

[byt

es]

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

SummaryFast response, high availability

• Shared store replicated at client• Full consistency guarantees

Availability: Causal consistency is strongest possible• (+ transactions + red-blue)

Challenges & solutions:• Large database: partial replicate data, metadata• Small, bounded metadata: DC-based, pruning• Fail-over‣ At-most-once: Client ID + merge clocks‣ Avoid gaps: read in the past (K-stable)

• Convergence: CRDTs

20

Page 6: Write Fast, Read in the Past: Cloud-backed applications ...people.rennes.inria.fr/Adrien.Lebre/PUBLIC/CloudDay2015/MarcShapi… · Current & future work ... (Ruby),'riak:java:crdt'(Java)

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Current & future work

SyncFree EU project• Inria, Nova, Minho, Kaiserslautern,

UC Louvain, Koç• Basho, Trifork, Rovio

Objectives• CRDTs in theory + real apps• Extreme scalability (DC and clients)• Transactions, invariants, security• Languages, patterns, tools• Crowd-sourced experiments

21 [Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Backup slides

22

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Internet round-trip times

23

RTTS (ms)

Endpoints Speed of light min mean max

Berkeley – Canberra 79.9 174.0 174.7 176.0

Berkeley – New York 27.4 85.0 85.0 85.0

Berkeley – Trondheim 55.6 197.0 197.0 197.0

Pittsburgh – Ottawa 4.3 44.0 44.1 62.0

Pittsburgh – Hong-Kong 85.9 217.0 223.1 393.0

Pittsburgh – Dublin 42.0 115.0 115.7 116.0

Pittsburgh – Seattle 22.9 83.0 83.9 84.0

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Red/Blue consistency

Mix synchronous (red) and asynchronous (blue) updates

Correctness• All reads and updates are causally

ordered• Blue updates commute with both red

and blue ones• Red updates are totally ordered with

red ones (but not with blue ones)[Li et al. OSDI 2012]

24

Page 7: Write Fast, Read in the Past: Cloud-backed applications ...people.rennes.inria.fr/Adrien.Lebre/PUBLIC/CloudDay2015/MarcShapi… · Current & future work ... (Ruby),'riak:java:crdt'(Java)

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps][From strong to eventual consistency: getting it right]

Eventual Consistency

• Every update eventually reaches every replica at least once

• An update has effect on a replica at most once• At all times, the state of a replica satisfies the

objects’ specifications• Two replicas that received the same set of

updates eventually reach the same stateTransactional: replace “update” with “transaction

comprising one or more updates”.

25 [Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

CRDTs in the wild

Walter'[SOSP'2011]:'c:set'Riak'2.0:'counter,'set,'map'Facebook'Apollo'(announced'2014:06)'StateBox'(Erlang),'KnockBox'(Clojure),'meangirls'(Ruby),'riak:java:crdt'(Java)

26

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

SwiftCloud: geo-replication right to the client machine

Extreme numbers of clients

Large database

Causal consistency for servers and clients

Fault tolerance

Metadata proportional to # clients?

Full replication?

Switching servers?

27 [Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

SwiftCloud: geo-replication right to the client machine

Low latency … for reads

• Partial replication/caching in clients… for writes

• Weak consistency• Mergeable writes => CRDT• + Transactions with parallel snapshot isolation (PSI

transactions)

28

Page 8: Write Fast, Read in the Past: Cloud-backed applications ...people.rennes.inria.fr/Adrien.Lebre/PUBLIC/CloudDay2015/MarcShapi… · Current & future work ... (Ruby),'riak:java:crdt'(Java)

[Write Fast, Read in the Past: Causal+ Consistency for Client-Side Apps]

Causal transactions

Read a causally-consistent stateAll-or-nothing, isolated updatesNo consensus (not ACID)

29

xy

xy

!y1

!x1

?x1

!y2

!x2

?x2

x

?y1 ?y2

Not Available

Eventual Consistency

Causal Consistency

Non-Monotonic SI

Parallel SIUpdate

Serialisability

Snapshot Isolation Serialisability

Strict Serialisability

Transactional Causal Cons.

Partial Order Writes

Hard toprogram

Easy toprogram

Live

Causal order

Transactions

Total order writes

Total order reads

Total order reads & writesReal time

Total Order Writes

Total order reads

Total order reads & writes

Fast writes

Highperformance

Lowperformance

Wait-Free QueriesMinimum

Commitment Sync

Genuine Partial Replication

Forward Freshness

Genuine Partial Replication

Forward Freshness

Minimum Commitment SyncRelax read ordering

Wait-Free QueriesDecoupled read/writes

Relax write ordering

No order whatsoever

No atomicity