Don’t Give Up on Serializability Just Yet Neha Narula

Don’t Give Up on Serializability Just Yetgotocon.com/dl/.../slides/...DontGiveUpOnSerializabilityJustYet.pdf · Don’t Give Up on Serializability Just Yet Neha Narula MIT CSAIL



Don’t Give Up on Serializability Just Yet

Neha Narula MIT CSAIL

GOTO Chicago May 2015


A journey into serializable systems

@neha

•  PhD candidate at MIT

•  Formerly at Google

•  Research in fast transactions for multi-core databases and distributed systems


“However, the most important person in my gang will be a systems programmer. A person who can debug a device driver or a distributed system is a person who can be trusted in a Hobbesian nightmare of breathtaking scope; a systems programmer has seen the terrors of the world and understood the intrinsic horror of existence.”


1M messages/sec

1/5 of all page views in the US

1M messages/sec from mobile devices

Databases are difficult to scale

Application servers are stateless; add more for more traffic

Database is stateful

Distributed databases

Partition data on multiple servers for more performance

Example partitioned database

[Diagram: webservers send requests to three database servers that hold the widgets table, partitioned by widget_id into the ranges 0-99, 100-199, and 200-299. A fourth database server is marked with a "?".]
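The range partitioning in the diagram can be sketched in Go roughly as follows. This is a minimal illustration, not code from the talk: the server names (`db-0`, `db-1`, `db-2`) and the `route` helper are hypothetical, and only the widget_id ranges come from the slide.

```go
package main

import (
	"fmt"
	"sort"
)

// shard pairs a database server with the inclusive upper bound of the
// widget_id range it owns. The server names are hypothetical.
type shard struct {
	maxID  int
	server string
}

// shards is sorted by maxID; the ranges mirror the slide.
var shards = []shard{
	{99, "db-0"},
	{199, "db-1"},
	{299, "db-2"},
}

// route returns the server that owns widgetID, or false if no shard covers it.
func route(widgetID int) (string, bool) {
	if widgetID < 0 {
		return "", false
	}
	// Binary search for the first shard whose upper bound covers the id.
	i := sort.Search(len(shards), func(i int) bool { return shards[i].maxID >= widgetID })
	if i == len(shards) {
		return "", false
	}
	return shards[i].server, true
}

func main() {
	for _, id := range []int{42, 150, 299, 300} {
		server, ok := route(id)
		fmt.Println(id, server, ok)
	}
}
```

An id such as 300 has no owner until a new server and range are added, which is one way to read the "?" in the diagram: adding capacity means changing the routing table and moving data.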

2007
•  MapReduce
•  Google File System
•  Bigtable

Pros/Cons

Pros
•  In-memory
•  HIGHLY scalable
•  Transparently fault tolerant
•  Geo replication

Cons
•  No schema
•  Require complex key/row/document design
•  No query language
•  No indexes
•  No transactions
•  No guarantees


mysql> BEGIN TRANSACTION
       UPDATE …
       COMMIT


Problem with dropping transactions

•  Difficult to reason about concurrent interleavings

•  Might result in incorrect, unrecoverable state


“The hacker discovered that multiple simultaneous withdrawals are processed essentially at the same time and that the system's software doesn't check quickly enough for a negative balance”

http://arstechnica.com/security/2014/03/yet-another-exchange-hacked-poloniex-loses-around-50000-in-bitcoin/
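The bug described in the quote is a check-then-act race: two withdrawals both pass the balance check before either one debits. A minimal Go sketch of the fix follows. It uses a hypothetical in-memory `account` type and a mutex as a stand-in for transactional isolation; it is not the exchange's code, and all names are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// account is a hypothetical in-memory balance used only for illustration.
type account struct {
	mu      sync.Mutex
	balance int
}

// withdraw checks the balance and debits it as one atomic step. Without the
// lock, two concurrent calls could both pass the check before either debits.
func (a *account) withdraw(amount int) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.balance < amount {
		return false
	}
	a.balance -= amount
	return true
}

// run issues n concurrent withdrawals of amount against a starting balance
// and returns how many succeeded along with the final balance.
func run(start, amount, n int) (succeeded, final int) {
	a := &account{balance: start}
	results := make(chan bool, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- a.withdraw(amount)
		}()
	}
	wg.Wait()
	close(results)
	for ok := range results {
		if ok {
			succeeded++
		}
	}
	return succeeded, a.balance
}

func main() {
	succeeded, final := run(100, 60, 10)
	fmt.Println(succeeded, final)
}
```

Because the check and the debit happen under one lock, the balance can never go negative no matter how the ten goroutines interleave. A serializable database transaction gives the same property without the application managing locks itself.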

Consistency guarantees help us reason about our code and avoid subtle bugs

Consistency
A very misused word in systems!
•  C as in ACID
•  C as in CAP
•  C as in sequential, causal, eventual, strict consistency

ACID Transactions
•  Atomic: Whole thing happens or not
•  Consistent: Application-defined correctness
•  Isolated: Other transactions do not interfere
•  Durable: Can recover correctly from a crash

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
...
COMMIT

What is Serializability?

Serializability != Serial

The result of executing a set of transactions is the same as if those transactions had executed one at a time, in some serial order.

If each transaction preserves correctness, the DB will be in a correct state.

We can pretend like there’s no concurrency!

Database transactions should be serializable

TXN1(k, j Key) (Value, Value) {
    a := GET(k)
    b := GET(j)
    return a, b
}

TXN2(k, j Key) {
    ADD(k,1)
    ADD(j,1)
}

Initially: k=0, j=0

To the programmer, the execution looks like TXN1 followed by TXN2, or TXN2 followed by TXN1.

Valid return values for TXN1: (0,0) or (1,1)
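To see why only (0,0) and (1,1) are valid, the Go sketch below enumerates every interleaving of the four operations and records what TXN1 would read if nothing prevented the interleaving. The helper names (`schedules`, `run`, `serializable`) are illustrative assumptions, and the code simulates the two transactions from the slide directly instead of calling a real database.

```go
package main

import "fmt"

// result is the pair of values TXN1 returns.
type result struct{ a, b int }

// schedules returns every interleaving of two TXN1 steps and two TXN2 steps
// that preserves each transaction's internal order. Each schedule lists which
// transaction (1 or 2) runs at each of the four steps.
func schedules() [][4]int {
	var out [][4]int
	for mask := 0; mask < 16; mask++ {
		var s [4]int
		ones := 0
		for i := 0; i < 4; i++ {
			if mask&(1<<i) != 0 {
				s[i] = 1
				ones++
			} else {
				s[i] = 2
			}
		}
		if ones == 2 {
			out = append(out, s)
		}
	}
	return out
}

// run executes one schedule starting from k=0, j=0 and returns what TXN1 read.
func run(s [4]int) result {
	k, j := 0, 0
	step1, step2 := 0, 0
	var r result
	for _, txn := range s {
		switch {
		case txn == 1 && step1 == 0:
			r.a = k // a := GET(k)
			step1++
		case txn == 1:
			r.b = j // b := GET(j)
		case step2 == 0:
			k++ // ADD(k,1)
			step2++
		default:
			j++ // ADD(j,1)
		}
	}
	return r
}

// serializable reports whether r matches the result of one of the two serial
// orders: TXN1 then TXN2 gives (0,0), TXN2 then TXN1 gives (1,1).
func serializable(r result) bool {
	return r == result{0, 0} || r == result{1, 1}
}

func main() {
	for _, s := range schedules() {
		r := run(s)
		fmt.Println(s, r, serializable(r))
	}
}
```

Of the six interleavings, two produce (0,1) or (1,0), which no serial order could return. Two others are not serial yet still return a serial result, which illustrates that serializable does not mean serial.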

Benefits of Serializability
•  Do not have to reason about interleavings
•  Do not have to express invariants separately from the code!

Serializability Costs
•  On a multi-core database, serialization and cache line transfers
•  On a distributed database, serialization and network calls

Concurrency control: Locking and coordination

Eventual consistency
If no new updates are made to the object, eventually all accesses will return the last updated value.

Eventual consistency (corrected)
If no new updates are made to the object, eventually all accesses will return the same value.
(What is last, really?)
(And when do we stop writing?)
(And what about multi-key transactions?)

Sequential consistency: cache coherence

[Diagram: processors P1, P2, and P3 sharing one RAM]

P1:  W(x)a
P2:            W(x)b
P3:                        R(x)a        R(x)b
                                                 time -->

P1:  W(x)a
P2:                              W(x)b
P3:                  R(x)a                  R(x)b
                                                 time -->

P1:  W(x)a
P2:            W(x)b
P3:                        R(x)b        R(x)a
                                                 time -->

P1:                                  W(x)a
P2:          W(x)b
P3:                      R(x)b                  R(x)a
                                                 time -->

External Consistency
Everything that sequential consistency has
Except results actually match time.
An external observer

P1:  W(x)a
P2:            W(x)b
P3:                        R(x)b        R(x)a
                                                 time -->

P3: “The value of x is b!” “Then I read x=a?”

Not Externally Consistent
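The difference between the two models can be checked mechanically for small histories. The Go sketch below is a brute-force illustration under stated assumptions: every operation is treated as instantaneous, the integer timestamps are read off the slide's left-to-right layout, and the `op` type and function names are hypothetical. Sequential consistency only needs some legal total order that respects each process's program order; external consistency additionally requires that order to match real time.

```go
package main

import (
	"fmt"
	"sort"
)

// op is a single read or write of x. Timestamps are assumed positions on the
// slide's time axis, with every operation treated as instantaneous.
type op struct {
	write bool   // true for W(x)val, false for R(x)val
	val   string // value written, or value the read returned
	time  int
}

// legal reports whether every read in order returns the latest preceding write.
func legal(order []op) bool {
	cur := ""
	for _, o := range order {
		if o.write {
			cur = o.val
		} else if o.val != cur {
			return false
		}
	}
	return true
}

// sequentiallyConsistent searches for any legal total order that keeps each
// process's operations in program order (the order within procs[p]).
func sequentiallyConsistent(procs [][]op) bool {
	next := make([]int, len(procs))
	var search func(cur string) bool
	search = func(cur string) bool {
		done := true
		for p := range procs {
			if next[p] == len(procs[p]) {
				continue
			}
			done = false
			o := procs[p][next[p]]
			if !o.write && o.val != cur {
				continue // this read cannot come next
			}
			nextCur := cur
			if o.write {
				nextCur = o.val
			}
			next[p]++
			if search(nextCur) {
				return true
			}
			next[p]--
		}
		return done
	}
	return search("")
}

// externallyConsistent requires the real-time order itself to be legal.
// It assumes all timestamps are distinct.
func externallyConsistent(procs [][]op) bool {
	var all []op
	for _, p := range procs {
		all = append(all, p...)
	}
	sort.Slice(all, func(i, j int) bool { return all[i].time < all[j].time })
	return legal(all)
}

// slide33 is the history above: a is written before b, yet P3 reads b, then a.
func slide33() [][]op {
	return [][]op{
		{{write: true, val: "a", time: 1}},
		{{write: true, val: "b", time: 2}},
		{{val: "b", time: 3}, {val: "a", time: 4}},
	}
}

func main() {
	h := slide33()
	fmt.Println(sequentiallyConsistent(h), externallyConsistent(h))
}
```

For this history the checker finds a legal order (write b, read b, write a, read a), so it is sequentially consistent, but that order contradicts real time, so it is not externally consistent. The search is exponential and only meant for hand-sized examples.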

CAP Theorem
•  Brewer’s PODC talk: “Consistency, Availability, Partition-tolerance: choose two” in 2000
   – Partition-tolerance is a failure model
   – Choice: can you process reads and writes during a partition or not?

•  FLP result: “Impossibility of Distributed Consensus with One Faulty Process” in 1985
   – Asynchronous model; cannot tell the difference between message delay and failure


What does this mean?

It’s impossible to decide anything on the internet?


NP-hard

What does CAP mean?

It’s impossible to 100% of the time decide everything on the internet if we can’t rely on synchronous messaging

We can 100% of the time decide everything if partitions heal (we know the upper bound on message delays)

We can still play Candy Crush

CAP
Consistency vs. Performance

Consistency (like serializability) requires communication and blocking

How do we reduce these costs while:
•  Producing a correct ordering of reads and writes and
•  Handling failures and (eventually) making progress?

Improving Serializability Performance

Technique                           Systems
Atomic clocks to bound time skew    Spanner
Transaction chopping                Lynx, ROCOCO
Commutative locking                 Escrow transactions, abstract data types, Doppel
Deterministic ordering              Granola, Calvin

Goal: parallel performance
•  Different concurrency control schemes for popular, contended data
•  Commutative locking
•  Abstract datatypes
•  Per-core (or per-server) data and constraints

Operation Model
Developers write transactions as stored procedures which are composed of operations on keys and values:

Traditional key/value operations:
    value GET(k)
    void PUT(k,v)

Operations on numeric values which modify the existing value:
    void INCR(k,n)
    void MAX(k,n)
    void MULT(k,n)

Ordered PUT, insert to an ordered list, user-defined functions:
    void OPUT(k,v,o)
    void TOPK_INSERT(k,v,o)
    void UDF(k,v,a)

Slide annotations: Replicate for reads; Save last write; Replicate for commutative operations; Log operations
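The idea of replicating data for commutative operations can be sketched in Go as follows. This is a simplified illustration and not the implementation of any system named in the talk: each worker applies INCR and MAX to a private slice with no locking, and a later merge folds the slices into the shared store. The `slice` type, `reconcile`, `run`, and the key names are assumptions, and the sketch assumes each key is only ever updated with one kind of operation.

```go
package main

import (
	"fmt"
	"sync"
)

// slice is one worker's private record of pending commutative updates.
type slice struct {
	incr map[string]int // summed INCR deltas per key
	max  map[string]int // largest MAX argument seen per key
}

func newSlice() *slice {
	return &slice{incr: map[string]int{}, max: map[string]int{}}
}

// INCR records an addition locally; additions commute, so order is irrelevant.
func (s *slice) INCR(k string, n int) { s.incr[k] += n }

// MAX records a candidate maximum locally; taking a maximum also commutes.
func (s *slice) MAX(k string, n int) {
	if cur, ok := s.max[k]; !ok || n > cur {
		s.max[k] = n
	}
}

// reconcile folds every worker's slice into the shared store. It assumes a
// given key is used with INCR or with MAX, never both.
func reconcile(store map[string]int, slices []*slice) {
	for _, s := range slices {
		for k, d := range s.incr {
			store[k] += d
		}
		for k, m := range s.max {
			if cur, ok := store[k]; !ok || m > cur {
				store[k] = m
			}
		}
	}
}

// run starts the given number of workers, each performing ops updates on two
// hypothetical contended keys, and returns the store after reconciliation.
func run(workers, ops int) map[string]int {
	slices := make([]*slice, workers)
	var wg sync.WaitGroup
	for w := range slices {
		slices[w] = newSlice()
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := 0; i < ops; i++ {
				slices[w].INCR("likes", 1)
				slices[w].MAX("high_bid", w*ops+i)
			}
		}(w)
	}
	wg.Wait()
	store := map[string]int{}
	reconcile(store, slices)
	return store
}

func main() {
	store := run(4, 1000)
	fmt.Println(store["likes"], store["high_bid"])
}
```

The workers never touch shared memory while updating, so a hot key costs no lock contention or cache line transfers. The trade-off is that a GET of such a key is only accurate after a merge, which is why a real system has to decide when reads and reconciliation happen.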

Spanner/F1

“We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions.”

Takeaways
•  Use well-tested, long-lived database systems
•  Use SERIALIZABLE until it becomes a performance problem
•  Think about what is changing when you move to systems with different models

Thanks!

The Stata Center via emax: http://hip.cat/emax/

[email protected] http://nehanaru.la

@neha

Questions?

Please remember to evaluate via the GOTO Guide App