110
Eris: Coordination-Free Consistent Transactions Using In-Network Concurrency Control Jialin Li, Ellis Michael, Dan R. K. Ports

Eris: Coordination-Free Consistent Transactions Using In ... · B1 C1 A2 B2 C1A3 B2 C1 DROP T1 (ABC) A1 B1 C1 T1 (ABC) A1 B1 C1 T2 (AB) A2 B2 T3 (A) A3. Network Implementation

  • Upload
    others

  • View
    44

  • Download
    0

Embed Size (px)

Citation preview

Eris: Coordination-Free Consistent Transactions Using

In-Network Concurrency ControlJialin Li, Ellis Michael, Dan R. K. Ports

Web services and applications rely on distributed storage systems

Web services and applications rely on distributed storage systems

Web services and applications rely on distributed storage systems

Web services and applications rely on distributed storage systems

Web services and applications rely on distributed storage systems

Partitioned for Scalability, Replicated for Availability

Partitioned for Scalability, Replicated for Availability

Partitioned for Scalability, Replicated for Availability

Shard 3

Client

Shard 1

Shard 2

Existing transactional systems: extensive coordination

Shard 3

Client

Shard 1

Shard 2

req prepare ok commit

Existing transactional systems: extensive coordination

Shard 3

Client

Shard 1

Shard 2

req prepare ok commit

Existing transactional systems: extensive coordination

Shard 3

Client

Shard 1

Shard 2

req prepare ok commit

Existing transactional systems: extensive coordination

• Processes independent transactions without coordination in the normal case

• Performance within 3% of a nontransactional, unreplicated system on TPC-C

• Strongly consistent, fault tolerant transactions with minimal performance penalties

In this talk … Eris

Key Contributions

A new architecture that divides the responsibility for transactional guarantees in a new way

…leveraging the datacenter network to order messages within and across shards

…and a co-designed transaction protocol with minimal coordination.

Traditional Layered Approach

Atomic Commitment (2PC)

Concurrency Control (2PL)

Concurrency Control (2PL)

Replication (Paxos)

Replica Replica

Replica

Replication (Paxos)

Replica Replica

Replica

Traditional Layered Approach

Atomic Commitment (2PC)

Concurrency Control (2PL)

Concurrency Control (2PL)

Replication (Paxos)

Replica Replica

Replica

Replication (Paxos)

Replica Replica

Replica

Ordering (within shard)

Reliability (within shard)

Isolation

Traditional Layered Approach

Atomic Commitment (2PC)

Concurrency Control (2PL)

Concurrency Control (2PL)

Replication (Paxos)

Replica Replica

Replica

Replication (Paxos)

Replica Replica

Replica

Ordering (within shard)

Reliability (within shard)

Ordering (across shard)

Isolation

Traditional Layered Approach

Atomic Commitment (2PC)

Concurrency Control (2PL)

Concurrency Control (2PL)

Replication (Paxos)

Replica Replica

Replica

Replication (Paxos)

Replica Replica

Replica

Ordering (within shard)

Reliability (within shard)

Reliability (across shards)

Ordering (across shard)

Isolation

Traditional Layered Approach

Ordering (within shard)

Reliability (within shard)

Reliability (across shards)

Ordering (across shard)

Isolation

Traditional Layered Approach

Ordering (within shard)

Reliability (within shard)

Reliability (across shards)

Multi-sequencing

Independent Transaction Protocol

General Transaction Protocol

Eris

A new way to divide the responsibilities for different guarantees

Ordering (across shard)

Isolation

Traditional Layered Approach

Ordering (within shard)

Reliability (within shard)

Reliability (across shards)

Multi-sequencing

Independent Transaction Protocol

General Transaction Protocol

Eris

ApplicationNetwork

A new way to divide the responsibilities for different guarantees

Outline

1. Introduction

2. In-Network Concurrency Control

3. Transaction Model

4. Eris Protocol

5. Evaluation

In-Network Concurrency Control Goals

• Globally consistent ordering across messages delivered to multiple destination shards

• No reliable delivery guarantee

• Recipients can detect dropped messages

A

B

C

Receivers

T1(ABC)

T1(ABC)

T1(ABC)

T2(AB)

T2(AB)

A

B

C

Receivers

T1(ABC)

T1(ABC)

T1(ABC)

T2(AB)

T2(AB)

A

B

C

Receivers

T1(ABC)

T1(ABC)

T1(ABC)

T2(AB)

T2(AB)

A

B

C

Receivers

T1(ABC)

T1(ABC)

T2(AB)

T2(AB)

A

B

C

Receivers

T1(ABC)

T1(ABC)

T2(AB)

T2(AB)

DROP

A

B

C

Receivers

T1(ABC)

T1(ABC)

T2(AB)

T2(AB)

DROPT1(ABC)

T2(AB)T2

(AB)

T2(AB)T2

(AB)

T1(ABC)T1

(ABC)

T1(ABC)T1

(ABC)

T1(ABC)T1

(ABC)

A

B

C

Receivers

T1(ABC)

T1(ABC)

T2(AB)

T2(AB)

DROPT1(ABC)

Multi-Sequenced Groupcast

• Groupcast: message header specifies a set of destination multicast groups

• Multi-sequenced groupcast: messages are sequenced atomically across all recipient groups

• Sequencer keeps a counter for each group

• Extends OUM in NOPaxos [OSDI ’16]

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0

A

B

C

Receivers

Sequencer

T1 (ABC)

Counter: A0 B0 C0

A

B

C

Receivers

Sequencer

T1 (ABC)

Counter: A0 B0 C0

A

B

C

Receivers

Sequencer

T1 (ABC)

Counter: A0 B0 C0A1 B1 C1

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1 B1 C1

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

T2(AB)

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

T2(AB)

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

T2(AB)

A2 B2 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

A2 B2 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T2 (AB)

A2 B2

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

A2 B2 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T2(AB)

A2B2

T2(AB)

A2 B2

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

A2 B2 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T2(AB)

A2 B2

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

A2 B2 C1

T3(A)

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T2(AB)

A2 B2

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

A2 B2 C1

T3(A)

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T2(AB)

A2 B2

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

A2 B2 C1

T3(A)

A3 B2 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T2(AB)

A2 B2

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

A2 B2 C1A3 B2 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T2(AB)

A2 B2

T3(A)

A3

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

A2 B2 C1A3 B2 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T2(AB)

A2 B2

T3(A)

A3

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

A2 B2 C1A3 B2 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T2(AB)

A2 B2

T3(A)

A3

A

B

C

Receivers

Sequencer

Counter: A0 B0 C0A1 B1 C1

T1 (ABC)

A1B1 C1

A2 B2 C1A3 B2 C1

DROP

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T2(AB)

A2 B2

T3(A)

A3

Network Implementation

• Groupcast routing using OpenFlow

• Sequencer implementations:

✤ Programmable switches, written in P4

✤ Middlebox prototype using network processors

• Global epoch number for sequencer failures

What have we accomplished so far?

• Consistently ordered groupcast primitive with drop detection

• How do we go from multi-sequenced groupcast to transactions?

Outline

1. Introduction

2. In-Network Concurrency Control

3. Transaction Model

4. Eris Protocol

5. Evaluation

Transaction ModelEris supports two types of transactions

• Independent transactions:

✤ One-shot (stored procedures)

✤ No cross-shard dependencies

✤ Proposed by H-Store [VLDB ’07] and Granola [ATC ’12]

• Fully general transactions

Independent Transaction

Name SalaryAlice 600

Name SalaryBob 350

Name SalaryCharlie 400

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

Independent Transaction

Name SalaryAlice 600

Name SalaryBob 350

Name SalaryCharlie 400

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

Independent Transaction

Name SalaryAlice 600

Name SalaryBob 350

Name SalaryCharlie 400

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

Name SalaryBob 450

Name SalaryCharlie 500

Independent Transaction

Name SalaryAlice 600

Name SalaryBob 350

Name SalaryCharlie 400

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

Name SalaryBob 450

Name SalaryCharlie 500

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE 500 < (SELECT AVG(t2.Salary) FROM tb t2) COMMIT

Independent Transaction

Name SalaryAlice 600

Name SalaryBob 350

Name SalaryCharlie 400

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

Name SalaryBob 450

Name SalaryCharlie 500

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE 500 < (SELECT AVG(t2.Salary) FROM tb t2) COMMIT

Not In

depend

ent!

Independent Transaction

Name SalaryAlice 600

Name SalaryBob 350

Name SalaryCharlie 400

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

Name SalaryBob 450

Name SalaryCharlie 500

Independent Transaction

Name SalaryAlice 600

Name SalaryBob 350

Name SalaryCharlie 400

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

START TRANSACTIONUPDATE tb t1 SET t1.Salary = t1.Salary + 100 WHERE t1.Salary < 500 COMMIT

Name SalaryBob 450

Name SalaryCharlie 500

Many applications consist entirely of independent transactions (e.g. TPC-C)

Why independent transactions?

• No coordination/communication across shards

• Executing them serially at each shard in a consistent order guarantees serializability

• Multi-sequenced groupcast establishes such an order

• How to handle message drops and sequencer/server failures?

Outline

1. Introduction

2. In-Network Concurrency Control

3. Transaction Model

4. Eris Protocol

5. Evaluation

Shard 3

Client

Shard 1

Shard 2

Sequencer

Normal Case

Learner

Learner

Learner

Replica

Replica

Replica

Replica

Replica

Replica

Shard 3

Client

Shard 1

Shard 2

Sequencer

Normal Case

Learner

Learner

Learner

Replica

Replica

Replica

Replica

Replica

Replica

Shard 3

Client

Shard 1

Shard 2

Sequencer

Normal Case

Learner

Learner

Learner

Replica

Replica

Replica

Replica

Replica

Replica

Shard 3

Client

Shard 1

Shard 2

Sequencer

Normal Case

Learner

Learner

Learner

Replica

Replica

Replica

Replica

Replica

Replica

Shard 3

Client

Shard 1

Shard 2

Sequencer

Normal Case

Learner

Learner

Learner

Replica

Replica

Replica

Replica

Replica

Replica

Shard 3

Client

Shard 1

Shard 2

Sequencer

1 round trip

Normal Case

Learner

Learner

Learner

Replica

Replica

Replica

Replica

Replica

Replica

Shard 3

Client

Shard 1

Shard 2

Sequencer

1 round trip

nocoordination

Normal Case

Learner

Learner

Learner

Replica

Replica

Replica

Replica

Replica

Replica

How to handle dropped messages?A

B

C

DROP

T1 (ABC)

A1 B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1B1 C1

T3(A)

A3

How to handle dropped messages?A

B

CT1

(ABC)

A1 B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1B1 C1

T3(A)

A3

How to handle dropped messages?A

B

CT1

(ABC)

A1 B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1B1 C1

T2(AB)

A2 B2

T3(A)

A3

How to handle dropped messages?A

B

CT1

(ABC)

A1 B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1B1 C1

T2(AB)

A2 B2

T3(A)

A3

How to handle dropped messages?A

B

CT1

(ABC)

A1 B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1B1 C1

T2(AB)

A2 B2

T3(A)

A3

How to handle dropped messages?A

B

CT1

(ABC)

A1 B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1B1 C1

T2(AB)

A2 B2

T3(A)

A3

Global coordination problem

The Failure CoordinatorA

B

C

DROP

FailureCoordinator

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T2(AB)

A2 B2

The Failure CoordinatorA

B

C

DROP

FailureCoordinator

Received A2?T1

(ABC)

A1B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T2(AB)

A2 B2

The Failure CoordinatorA

B

C

DROP

FailureCoordinator

Received A2?Received A2?

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T2(AB)

A2 B2

The Failure CoordinatorA

B

C

DROP

FailureCoordinator

Received A2?

Received A2?

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T2(AB)

A2 B2

The Failure CoordinatorA

B

C

DROP

FailureCoordinator

Not Found

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T2(AB)

A2 B2

T2(AB)

A2 B2

The Failure CoordinatorA

B

C

DROP

FailureCoordinator

Not Found

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T2(AB)

A2 B2

T2(AB)

A2 B2

The Failure CoordinatorA

B

C

DROP

FailureCoordinator

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T3(A)

A3T2

(AB)

A2 B2

T2(AB)

A2 B2

The Failure CoordinatorA

B

C

FailureCoordinator

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1C1

T1 (ABC)

A1 B1 C1

T3(A)

A3T2(AB)

A2B2

T2(AB)

A2 B2

The Failure CoordinatorA

B

C

DROP

Received A2?

Received A2?

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T1 (ABC)

A1 B1C1

FailureCoordinator

The Failure CoordinatorA

B

C

DROP

Not Found

Not Found

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T1 (ABC)

A1 B1C1

FailureCoordinator

The Failure CoordinatorA

B

C

DROP

Not Found

Not Found

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T1 (ABC)

A1 B1C1

FailureCoordinator

The Failure CoordinatorA

B

C

DROP

Drop A2

Drop A2

Drop A2

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T1 (ABC)

A1 B1C1

FailureCoordinator

The Failure CoordinatorA

B

C

Drop A2

Drop A2

Drop A2

NOOP

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T1 (ABC)

A1 B1C1

FailureCoordinator

The Failure CoordinatorA

B

C

Drop A2

Drop A2

Drop A2

NOOP

Drops: A2

Drops: A2

T1 (ABC)

A1B1 C1

T1 (ABC)

A1 B1 C1

T3(A)

A3

T1 (ABC)

A1 B1C1

FailureCoordinator

Designated Learner and Sequencer FailuresDesignated learner (DL) failure:

• View change based protocol

• Ensures new DL learns all committed transactions from previous views

Sequencer failure:

• Higher epoch number from the new sequencer

• Epoch change ensures all replicas across all shards start the new epoch in consistent states

Can we process non-independent transactions efficiently?

Can we process non-independent transactions efficiently?

Yes, by dividing them into multiple independent transactions

(See the paper!)

Outline

1. Introduction

2. In-Network Concurrency Control

3. Transaction Model

4. Eris Protocol

5. Evaluation

Evaluation Setup• 3-level fat-tree topology testbed

• 15 shards, 3 replicas per shard

• 2.5 GHz Intel Xeon E5-2680 servers

• Middlebox sequencer implementation using Cavium Octeon CN6880

• YCSB+T and TPC-C workloads

Comparison Systems

• Lock-Store (2PC + 2PL + Paxos)

• TAPIR [SOSP ’15]

• Granola [ATC ‘12]

• Non-transactional, unreplicated (NT-UR)

Eris performs well on independent transactions

Lock-Store TAPIR Granola Eris NT-UR0K

300K

600K

900K

1,200K

Distributed independent transactions

Thro

ughp

ut (t

xns/

sec)

Eris performs well on independent transactions

Lock-Store TAPIR Granola Eris NT-UR0K

300K

600K

900K

1,200K

Distributed independent transactions

Thro

ughp

ut (t

xns/

sec)

Eris outperforms Lock-Store, TAPIR and

Granola by more than 3X

Eris performs well on independent transactions

Lock-Store TAPIR Granola Eris NT-UR0K

300K

600K

900K

1,200K

Distributed independent transactions

Thro

ughp

ut (t

xns/

sec)

Eris achieves throughput within

10% of NT-UR

Eris outperforms Lock-Store, TAPIR and

Granola by more than 3X

Eris performs well on independent transactions

Lock-Store TAPIR Granola Eris NT-UR0K

300K

600K

900K

1,200K

Distributed independent transactions

Thro

ughp

ut (t

xns/

sec)

Eris achieves throughput within

10% of NT-UR

Eris outperforms Lock-Store, TAPIR and

Granola by more than 3X

More than 70% reduction in latency compared to Lock-Store, and within 10% latency of NT-UR

Eris also performs well on general transactions

Lock-Store TAPIR Granola Eris NT-UR0K

300K

600K

900K

1,200K

Distributed general transactions

Thro

ughp

ut (t

xns/

sec)

Eris also performs well on general transactions

Lock-Store TAPIR Granola Eris NT-UR0K

300K

600K

900K

1,200K

Distributed general transactions

Thro

ughp

ut (t

xns/

sec) Eris maintains

throughput within 10% of NT-UR

0K

60K

120K

180K

240K

Lock-Store TAPIR Granola Eris NT-UR

TPC-C benchmark

Thro

ughp

ut (t

xns/

sec)

Eris excels at complex transactional application.

0K

60K

120K

180K

240K

Lock-Store TAPIR Granola Eris NT-UR

TPC-C benchmark

Thro

ughp

ut (t

xns/

sec)

Eris excels at complex transactional application.

7.6X and 6.4X higher throughput than

Lock-Store and Tapir

0K

60K

120K

180K

240K

Lock-Store TAPIR Granola Eris NT-UR

TPC-C benchmark

Thro

ughp

ut (t

xns/

sec)

Eris excels at complex transactional application.

7.6X and 6.4X higher throughput than

Lock-Store and Tapir

within 3% throughput of NT-UR

Eris is resilient to network anomalies

0K

450K

900K

1,350K

1,800K

0.01% 0.1% 1% 10%

Eris Lock-Store TAPIRGranola NT-UR

Packet Drop Rate

Thro

ughp

ut (t

xns/

sec)

Eris is resilient to network anomalies

0K

450K

900K

1,350K

1,800K

0.01% 0.1% 1% 10%

Eris Lock-Store TAPIRGranola NT-UR

Packet Drop Rate

TAPIRLock-Store

Eris

Granola

NT-UR

Thro

ughp

ut (t

xns/

sec)

Related WorkCo-designing distributed systems with the network

• NOPaxos [OSDI ‘16], Speculative Paxos [NSDI ‘15], NetPaxos [SOSR ‘15]

Sequencers for transaction processing

• Hyder [CIDR ‘11], vCorfu [NSDI ‘17], Calvin [SIGMOD ‘12]

Independent and other restricted transaction models

• H-Store [VLDB ‘07], Granola [ATC ‘12], Calvin [SIGMOD ‘12]

Conclusion• A new division of responsibility for transaction processing

✤ An in-network concurrency control mechanism that establishes a consistent order of transactions across shards

✤ An efficient protocol that ensures reliable delivery of independent transactions

✤ A general transaction layer atop independent transaction processing

• Result: strongly consistent, fault-tolerant transactions with minimal performance overhead