43
Phase Reconciliation for Contended In- Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

Embed Size (px)

Citation preview

Page 1: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

1

Phase Reconciliation for Contended In-Memory

Transactions

Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris

MIT CSAIL and Harvard

Page 2: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

2

IncrTxn(k Key) {INCR(k, 1)

}

LikePageTxn(page Key, user Key) {INCR(page, 1)liked_pages := GET(user)PUT(user, liked_pages + page)

}

FriendTxn(u1 Key, u2 Key) {PUT(friend:u1:u2, 1)PUT(friend:u2:u1, 1)

}

Page 3: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

3

IncrTxn(k Key) {INCR(k, 1)

}

LikePageTxn(page Key, user Key) {INCR(page, 1)liked_pages := GET(user)PUT(user, liked_pages + page)

}

FriendTxn(u1 Key, u2 Key) {PUT(friend:u1:u2, 1)PUT(friend:u2:u1, 1)

}

Page 4: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

4

Applications experience write contention on popular data

Problem

Page 5: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

5

Page 6: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

6

Concurrency Control Enforces Serial Execution

core 0

core 1

core 2

INCR(x,1)

INCR(x,1)

INCR(x,1)

time

Transactions on the same records execute one at a time

Page 7: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

7

Throughput on a Contentious Transactional Workload

Page 8: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

8

Throughput on a Contentious Transactional Workload

Page 9: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

9

INCR on the Same Records Can Execute in Parallel

core 0

core 1

core 2

INCR(x0,1)

INCR(x1,1)

INCR(x2,1)

time

• Transactions on the same record can proceed in parallel on per-core slices and be reconciled later

• This is correct because INCR commutes

1

1

1

per-core slices of record x

x is split across cores

Page 10: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

10

Databases Must Support General Purpose Transactions

IncrTxn(k Key) {INCR(k, 1)

}

PutMaxTxn(k1 Key, k2 Key) {v1 := GET(k1)v2 := GET(k2)if v1 > v2:

PUT(k1, v2)else:

PUT(k2, v1)return v1,v2

}

IncrPutTxn(k1 Key, k2 Key, v Value) {

INCR(k1, 1)PUT(k2, v)

}

Must happen

atomically

Must happen atomically

Returns a value

Page 11: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

11

Challenge

Fast, general-purpose serializable transaction execution with per-core slices for contended records

Page 12: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

12

Phase Reconciliation

• Database automatically detects contention to split a record among cores

• Database cycles through phases: split, reconciliation, and joined

reconciliation

Joined Phase

Split Phase

Doppel, an in-memory transactional database

Page 13: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

13

Contributions

Phase reconciliation– Splittable operations– Efficient detection and response to

contention on individual records– Reordering of split transactions and

reads to reduce conflict– Fast reconciliation of split values

Page 14: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

14

Outline

1. Phase reconciliation2. Operations3. Detecting contention4. Performance evaluation

Page 15: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

15

Split Phase

core 0

core 1

core 2

INCR(x,1)

INCR(x,1) PUT(y,2)

INCR(x,1) PUT(z,1)

core 3 INCR(x,1) PUT(y,2)

core 0

core 1

core 2

INCR(x0,1)

INCR(x1,1) PUT(y,2)

INCR(x2,1) PUT(z,1)

core 3 INCR(x3,1) PUT(y,2)

• The split phase transforms operations on contended records (x) into operations on per-core slices (x0, x1, x2, x3)

split phase

Page 16: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

16

• Transactions can operate on split and non-split records• Rest of the records use OCC (y, z)• OCC ensures serializability for the non-split parts of the

transaction

core 0

core 1

core 2

INCR(x0,1)

INCR(x1,1) PUT(y,2)

INCR(x2,1) PUT(z,1)

core 3 INCR(x3,1) PUT(y,2)

split phase

Page 17: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

17

• Split records have assigned operations for a given split phase• Cannot correctly process a read of x in the current state• Stash transaction to execute after reconciliation

core 0

core 1

core 2

INCR(x0,1)

INCR(x1,1) PUT(y,2)

INCR(x2,1) PUT(z,1)

core 3 INCR(x3,1) PUT(y,2)

split phase

INCR(x1,1)

INCR(x2,1)

INCR(x1,1)

GET(x)

Page 18: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

18

core 0

core 1

core 2

INCR(x0,1)

INCR(x1,1) PUT(y,2)

INCR(x2,1) PUT(z,1)

core 3 INCR(x3,1) PUT(y,2)

reco

ncile

split phase

• All threads hear they should reconcile their per-core state

• Stop processing per-core writes

GET(x)

INCR(x1,1)

INCR(x2,1)

INCR(x1,1)

Page 19: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

19

• Reconcile state to global store• Wait until all threads have finished reconciliation• Resume stashed read transactions in joined phase

core 0

core 1

core 2

core 3

All done

reco

ncile

reconciliation phase joined phase

x = x + x0

x = x + x1

x = x + x2

x = x + x3

GET(x)

Page 20: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

20

core 0

core 1

core 2

core 3

x = x + x0

x = x + x1

x = x + x2

x = x + x3

reco

ncile

reconciliation phase

GET(x)

All done

joined phase

• Reconcile state to global store• Wait until all threads have finished reconciliation• Resume stashed read transactions in joined phase

Page 21: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

21

core 0

core 1

core 2

core 3

GET(x)

• Process new transactions in joined phase using OCC• No split data

All done

joined phase

INCR(x)

INCR(x, 1)

GET(x)GET(x)

Page 22: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

22

Batching Amortizes the Cost of Reconciliation

core 0

core 1

core 2

INCR(x0,1)

INCR(x1,1) INCR(y,2)

INCR(x2,1) INCR(z,1)

core 3 INCR(x3,1) INCR(y,2)

GET(x)

• Wait to accumulate stashed transactions, batch for joined phase• Amortize the cost of reconciliation over many transactions• Reads would have conflicted; now they do not

INCR(x1,1)

INCR(x2,1) INCR(z,1)

GET(x)

GET(x)

All done

reco

ncile

GET(x)

GET(x)

GET(x)

split phasejoined phase

Page 23: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

23

Phase Reconciliation Summary

• Many contentious writes happen in parallel in split phases

• Reads and any other incompatible operations happen correctly in joined phases

Page 24: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

24

Outline

1. Phase reconciliation2. Operations3. Detecting contention4. Performance evaluation

Page 25: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

Ordered PUT and insert to an ordered

list

Operation Model

Developers write transactions as stored procedures which are composed of operations on keys and values:

25

value GET(k)void PUT(k,v)

void INCR(k,n)void MAX(k,n)void MULT(k,n)

void OPUT(k,v,o)void TOPK_INSERT(k,v,o)

Traditional key/value operations

Operations on numeric values

which modify the existing value

Not splittable

Splittable

Page 26: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

26

27

55

MAX Can Be Efficiently Reconciled

core 0

core 1

core 2

MAX(x0,55)

MAX(x1,10)

MAX(x2,21)

21

• Each core keeps one piece of state xi

• O(#cores) time to reconcile x• Result is compatible with any order

MAX(x0,2)

MAX(x1,27)

x = 55

Page 27: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

27

What Operations Does Doppel Split?

Properties of operations that Doppel can split:– Commutative– Can be efficiently reconciled– Single key– Have no return value

However:– Only one operation per record per split

phase

Page 28: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

28

Outline

1. Phase reconciliation2. Operations3. Detecting contention4. Performance evaluation

Page 29: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

29

Which Records Does Doppel Split?

• Database starts out with no split data• Count conflicts on records–Make key split if #conflicts >

conflictThreshold

• Count stashes on records in the split phase–Move key back to non-split if #stashes

too high

Page 30: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

30

Outline

1. Phase reconciliation2. Operations3. Detecting contention4. Performance evaluation

Page 31: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

31

Experimental Setup and Implementation

• All experiments run on an 80 core Intel server running 64 bit Linux 3.12 with 256GB of RAM

• Doppel implemented as a multithreaded Go server; one worker thread per core

• Transactions are procedures written in Go• All data fits in memory; don’t measure RPC• All graphs measure throughput in

transactions/sec

Page 32: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

32

Performance Evaluation

• How much does Doppel improve throughput on contentious write-only workloads?

• What kinds of read/write workloads benefit?

• Does Doppel improve throughput for a realistic application: RUBiS?

Page 33: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

33

Doppel Executes Conflicting Workloads in Parallel

Th

rou

gh

pu

t (m

illio

ns

txn

s/se

c)

20 cores, 1M 16 byte keys, transaction: INCR(x,1) all on same key

Doppel OCC 2PL0

5000000

10000000

15000000

20000000

25000000

30000000

35000000

Page 34: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

34

Doppel Outperforms OCC Even With Low Contention

20 cores, 1M 16 byte keys, transaction: INCR(x,1) on different keys

5% of writes to contended

key

Page 35: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

35

Contentious Workloads Scale Well

1M 16 byte keys, transaction: INCR(x,1) all writing same key

Communication of phase changing

Page 36: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

36

LIKE Benchmark

• Users liking pages on a social network• 2 tables: users, pages• Two transactions:

– Increment page’s like count, insert user like of page– Read a page’s like count, read user’s last like

• 1M users, 1M pages, Zipfian distribution of page popularity

Doppel splits the page-like-counts for popular pagesBut those counts are also read more often

Page 37: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

37

Benefits Even When There Are Reads and Writes to the Same Popular Keys

Doppel OCC0

1000000

2000000

3000000

4000000

5000000

6000000

7000000

8000000

9000000

Th

rou

gh

pu

t (m

illio

ns

txn

s/se

c)

20 cores, transactions: 50% LIKE read, 50% LIKE write

Page 38: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

38

Doppel Outperforms OCC For A Wide Range of Read/Write Mixes

20 cores, transactions: LIKE read, LIKE write

Doppel does not split any data and

performs the same as OCC

More stashed read transactions

Page 39: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

39

RUBiS

• Auction application modeled after eBay– Users bid on auctions, comment, list new items,

search

• 1M users and 33K auctions• 7 tables, 17 transactions• 85% read only transactions (RUBiS bidding mix)

• Two workloads:– Uniform distribution of bids– Skewed distribution of bids; a few auctions are very

popular

Page 40: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

40

StoreBid Transaction

StoreBidTxn(bidder, amount, item) {INCR(NumBidsKey(item),1)MAX(MaxBidKey(item), amount)

OPUT(MaxBidderKey(item), bidder, amount)

PUT(NewBidKey(), Bid{bidder, amount, item})}

All commutative operations on potentially conflicting auction

metadata

Inserting new bids is not likely to conflict

Page 41: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

41

Uniform Skewed0

2000000

4000000

6000000

8000000

10000000

12000000

DoppelOCC

Doppel Improves Throughput on an Application Benchmark

Th

rou

gh

pu

t (m

illio

ns

txn

s/se

c)

80 cores, 1M users 33K auctions, RUBiS bidding mix

8% StoreBid Transactions

3.2x throughput improveme

nt

Page 42: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

42

Related Work

• Commutativity in distributed systems and concurrency control– [Weihl ’88]

– CRDTs [Shapiro ’11]

– RedBlue consistency [Li ’12]

– Walter [Lloyd ’12]

• Optimistic concurrency control– [Kung ’81]

– Silo [Tu ’13]

• Split counters in multicore OSes

Page 43: Phase Reconciliation for Contended In-Memory Transactions Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard 1

43

ConclusionDoppel:

• Achieves parallel performance when many

transactions conflict by combining per-core data and

concurrency control

• Performs comparably to OCC on uniform or read-

heavy workloads while improving performance

significantly on skewed workloads.

http://pdos.csail.mit.edu/doppel