Database High Availability Using SHADOW Systems Jaemyung Kim, Kenneth Salem, Khuzaima Daudjee, Ashraf Aboulnaga, and Xin Pan University of Waterloo SoCC 2015

Page 1: Database High Availability Using SHADOW Systems

Database High Availability Using SHADOW Systems

Jaemyung Kim, Kenneth Salem, Khuzaima Daudjee, Ashraf Aboulnaga, and Xin Pan

University of Waterloo

SoCC 2015

Page 2: Database High Availability Using SHADOW Systems

SHADOW Hot Standby HA for Cloud

How can we exploit shared persistent storage to build better highly available database systems?

Page 3: Database High Availability Using SHADOW Systems

Overview

1 Standalone and Hot Standby Failure Handling

2 Shared Storage in Cloud Settings

3 SHADOW: Hot Standby HA for Cloud

4 Performance Evaluation

5 Conclusion

Page 4: Database High Availability Using SHADOW Systems

Example DBMS Setting

[Figure: DBMS with a buffer pool, a log, and a database (DB); labels: BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), restart recovery]

Page 5: Database High Availability Using SHADOW Systems

Standalone Restart Recovery

[Figure: DBMS with a buffer pool, a log, and a database (DB); labels: BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), restart recovery]
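The restart recovery shown on this slide can be illustrated with a minimal sketch. It assumes a deliberately simplified, redo-only model: every name below (`LogRecord`, `redo_start`, `restart_recovery`) is hypothetical, undo of uncommitted transactions is omitted, and this is neither PostgreSQL's recovery code nor the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    lsn: int
    kind: str        # "BEGIN_CHECKPOINT", "END_CHECKPOINT", or "UPDATE"
    page: str = ""
    value: int = 0


def redo_start(log):
    """Index of the BEGIN_CHECKPOINT of the last completed checkpoint (0 if none)."""
    begin = None
    start = 0
    for i, rec in enumerate(log):
        if rec.kind == "BEGIN_CHECKPOINT":
            begin = i
        elif rec.kind == "END_CHECKPOINT" and begin is not None:
            start = begin      # this checkpoint completed, so redo can start here
            begin = None
    return start


def restart_recovery(db, log):
    """Redo logged updates that the on-disk pages have not yet seen.

    db maps page id -> (value, page_lsn); the page LSN makes redo idempotent.
    """
    recovered = dict(db)
    for rec in log[redo_start(log):]:
        if rec.kind != "UPDATE":
            continue
        _, page_lsn = recovered.get(rec.page, (0, 0))
        if rec.lsn > page_lsn:
            recovered[rec.page] = (rec.value, rec.lsn)
    return recovered


# Illustrative log: the node fails after LSN 5, before page A's latest value reaches disk.
log = [
    LogRecord(1, "UPDATE", "A", 10),
    LogRecord(2, "BEGIN_CHECKPOINT"),
    LogRecord(3, "UPDATE", "B", 20),
    LogRecord(4, "END_CHECKPOINT"),
    LogRecord(5, "UPDATE", "A", 11),
]
disk = {"A": (10, 1)}
recovered = restart_recovery(disk, log)
```

In this model, a longer checkpoint interval leaves more log after the last completed checkpoint, so restart recovery has more records to redo.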

Page 7: Database High Availability Using SHADOW Systems

Typical Shared-Nothing Hot Standby

[Figure: Active DBMS and Standby DBMS, each with its own buffer pool, log, and database (DB); labels: BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), failover]

Page 8: Database High Availability Using SHADOW Systems

Hot Standby Failure and Failover

[Figure: Active DBMS and Standby DBMS, each with its own buffer pool, log, and database (DB); labels: BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), failover]
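A shared-nothing hot standby pair with failover can be sketched as follows. This is a toy model under stated assumptions: the classes `Node` and `HotStandbyPair` are hypothetical, log shipping is modeled as a direct synchronous call, and failure detection is left out entirely.

```python
class Node:
    """One DBMS instance with its own private log and buffer pool."""

    def __init__(self):
        self.log = []           # private copy of the log
        self.buffer_pool = {}   # in-memory page values

    def apply(self, record):
        lsn, page, value = record
        self.log.append(record)
        self.buffer_pool[page] = value


class HotStandbyPair:
    def __init__(self):
        self.active = Node()
        self.standby = Node()

    def commit_update(self, page, value):
        record = (len(self.active.log) + 1, page, value)
        self.active.apply(record)
        # Synchronous shipping: the standby has applied the record
        # before the commit returns to the client.
        self.standby.apply(record)
        return record[0]

    def failover(self):
        # The standby already holds every committed update, so it takes over
        # with a warm buffer pool and no restart recovery.
        self.active, self.standby = self.standby, None
        return self.active


pair = HotStandbyPair()
pair.commit_update("A", 1)
pair.commit_update("B", 2)
new_active = pair.failover()
```

Each node keeps its own log and database copy in this arrangement, so every update is written on both sides.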

Page 10: Database High Availability Using SHADOW Systems

Hot Standby Is Widely Used

[Figure: Active DBMS and Standby DBMS, each with its own buffer pool, log, and database (DB); labels: BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), failover]

Page 11: Database High Availability Using SHADOW Systems

Typical Hot Standby High Availability

[Figure: Active DBMS and Standby DBMS, each with a buffer pool, a log, and a database (DB); caption: Storage in Cloud?]

Persistent Storage = Reliable Shared Storage

How can we exploit shared persistent storage to build better highly available database systems?

Page 12: Database High Availability Using SHADOW Systems

Shared Storage in Cloud Settings

[Figure: Active DBMS and Standby DBMS, each with a buffer pool, a log, and a database (DB); caption: Storage in Cloud?]

Persistent Storage = Reliable Shared Storage

How can we exploit shared persistent storage to build better highly available database systems?

Page 15: Database High Availability Using SHADOW Systems

SHADOW: Hot Standby HA for Cloud

[Figure: Active DBMS and Standby DBMS, each with a buffer pool, over Log and DB; labels: async rep, write-offloading, coordinated checkpoint, BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), failover; caption: Recycled Hot Standby HA in Cloud]

Simplicity: pushes responsibility for durability and replication into the storage system

Flexibility: decouples database replication from DBMS replication

Performance: write-offloading (logging and checkpointing)

Efficiency: less I/O bandwidth

Page 16: Database High Availability Using SHADOW Systems

SHADOW: Single Logical Log

[Figure: Active DBMS and Standby DBMS, each with a buffer pool, over Log and DB; labels: async rep, write-offloading, coordinated checkpoint, BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), failover; caption: Single Logical Log]

Simplicity: pushes responsibility for durability and replication into the storage system

Flexibility: decouples database replication from DBMS replication

Performance: write-offloading (logging and checkpointing)

Efficiency: less I/O bandwidth

Page 17: Database High Availability Using SHADOW Systems

SHADOW: Single Logical Database

[Figure: Active DBMS and Standby DBMS, each with a buffer pool, over Log and DB; labels: async rep, write-offloading, coordinated checkpoint, BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), failover; caption: Single Logical Database]

Simplicity: pushes responsibility for durability and replication into the storage system

Flexibility: decouples database replication from DBMS replication

Performance: write-offloading (logging and checkpointing)

Efficiency: less I/O bandwidth

Page 18: Database High Availability Using SHADOW Systems

SHADOW: Hot Standby HA for Cloud

[Figure: Active DBMS and Standby DBMS, each with a buffer pool, over Log and DB; labels: async rep, write-offloading, coordinated checkpoint, BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), failover; caption: SHADOW High Availability]

Simplicity: pushes responsibility for durability and replication into the storage system

Flexibility: decouples database replication from DBMS replication

Performance: write-offloading (logging and checkpointing)

Efficiency: less I/O bandwidth

Page 19: Database High Availability Using SHADOW Systems

SHADOW: Hot Standby HA for Cloud

[Figure: Active DBMS and Standby DBMS, each with a buffer pool, over Log and DB; labels: async rep, write-offloading, coordinated checkpoint, BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), failover]

Simplicity: pushes responsibility for durability and replication into the storage system

Flexibility: decouples database replication from DBMS replication

Performance: write-offloading (logging and checkpointing)

Efficiency: less I/O bandwidth

Page 20: Database High Availability Using SHADOW Systems

SHADOW: Node Failure and Failover

[Figure: Active DBMS and Standby DBMS, each with a buffer pool, over Log and DB; labels: async rep, write-offloading, coordinated checkpoint, BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), failover]

Simplicity: pushes responsibility for durability and replication into the storage system

Flexibility: decouples database replication from DBMS replication

Performance: write-offloading (logging and checkpointing)

Efficiency: less I/O bandwidth

Page 22: Database High Availability Using SHADOW Systems

Advantages of SHADOW HA

[Figure: Active DBMS and Standby DBMS, each with a buffer pool, over Log and DB; labels: async rep, write-offloading, coordinated checkpoint, BEGIN CHECKPOINT, END CHECKPOINT, C, node failure (x), failover]

Simplicity: pushes responsibility for durability and replication into the storage system

Flexibility: decouples database replication from DBMS replication

Performance: write-offloading (logging and checkpointing)

Efficiency: less I/O bandwidth
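The write-offloading idea named on this slide can be sketched as a toy model. It reflects one reading of the slide labels (single logical log, single logical database, write-offloading, coordinated checkpoint): the active instance appends only to the shared log, and the standby replays that log and is the only instance that writes database pages. The classes `SharedStorage`, `Active`, and `Standby` are hypothetical, and this is not the authors' implementation.

```python
class SharedStorage:
    """Reliable shared storage holding one logical log and one logical database."""

    def __init__(self):
        self.log = []             # entries are (lsn, page, value)
        self.db = {}              # page -> value
        self.checkpoint_lsn = 0   # log position covered by the last checkpoint


class Active:
    def __init__(self, storage):
        self.storage = storage
        self.buffer_pool = {}
        self.db_page_writes = 0   # stays 0: the active never writes database pages

    def update(self, page, value):
        lsn = len(self.storage.log) + 1
        self.storage.log.append((lsn, page, value))   # the only write the active does
        self.buffer_pool[page] = value
        return lsn


class Standby:
    def __init__(self, storage):
        self.storage = storage
        self.buffer_pool = {}
        self.dirty = set()
        self.replayed_lsn = 0

    def replay(self):
        # LSNs start at 1, so the slice skips everything already replayed.
        for lsn, page, value in self.storage.log[self.replayed_lsn:]:
            self.buffer_pool[page] = value
            self.dirty.add(page)
            self.replayed_lsn = lsn

    def checkpoint(self):
        # Offloaded checkpoint: the standby flushes pages on the active's behalf.
        self.replay()
        for page in self.dirty:
            self.storage.db[page] = self.buffer_pool[page]
        self.dirty.clear()
        self.storage.checkpoint_lsn = self.replayed_lsn

    def failover(self):
        # Catch up on the log tail, then serve from the warm buffer pool.
        self.replay()
        return self.buffer_pool


storage = SharedStorage()
active = Active(storage)
standby = Standby(storage)
active.update("A", 1)
active.update("B", 2)
standby.checkpoint()
active.update("A", 3)          # the active fails after this commit
state = standby.failover()
```

In this sketch the database and log each exist once, in shared storage, and the active's I/O is limited to log appends.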

Page 27: Database High Availability Using SHADOW Systems

Experimental Methodology

Compare the SHADOW system with two baselines:

Standalone (SA): single DBMS with restart recovery (varying checkpoint interval)

Synchronous Replication (SR): two replicated DBMSes (native PostgreSQL implementation)

TPC-C Benchmark (100 Warehouses), no think time

PostgreSQL 9.3 (database fits in memory)

Linux kernel 3.2.0-56

Amazon EC2 with Elastic Block Store (EBS)

Page 28: Database High Availability Using SHADOW Systems

TPC-C Benchmark Throughput

[Figure: throughput (tpmC), y-axis 0 to 60000; panel labels: Standalone, Hot Standby; configurations shown: SAD, SA10, SA]

Amazon EC2 c3.4xlarge instances, 100WH TPC-C workload; database fits in memory (PostgreSQL 9.3, Linux kernel 3.2.0-56)

Page 29: Database High Availability Using SHADOW Systems

TPC-C Benchmark Throughput

[Figure: throughput (tpmC), y-axis 0 to 60000; panel labels: Standalone, Hot Standby; configurations shown: SAD, SA10, SA, SR]

Amazon EC2 c3.4xlarge instances, 100WH TPC-C workload; database fits in memory (PostgreSQL 9.3, Linux kernel 3.2.0-56)

Page 30: Database High Availability Using SHADOW Systems

TPC-C Benchmark Throughput

[Figure: throughput (tpmC), y-axis 0 to 60000; panel labels: Standalone, Hot Standby; configurations shown: SAD, SA10, SA, SR, SHADOW]

Amazon EC2 c3.4xlarge instances, 100WH TPC-C workload; database fits in memory (PostgreSQL 9.3, Linux kernel 3.2.0-56)

Page 31: Database High Availability Using SHADOW Systems

Variability of TPC-C Throughput

[Figure: box-and-whisker plots of tpmC, y-axis 0 to 60000; panel labels: Standalone, Hot Standby; configurations shown: SAD, SA10, SA, SR, SHADOW]

Box-and-whisker plot (Q1, Q2, Q3)

Ten-second intervals (new-order transactions per second x 60)
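The per-interval metric behind this plot can be worked through with a short sketch. It assumes the input is a list of new-order transaction counts, one per ten-second interval. The sample counts are made up for illustration and are not measurements from the talk, and the quartile method is also an assumption, since the slides do not say which one was used.

```python
import statistics


def tpmc_per_interval(new_order_counts, interval_seconds=10):
    """Convert per-interval new-order counts to tpmC (transactions/second x 60)."""
    return [count / interval_seconds * 60 for count in new_order_counts]


def quartiles(values):
    """Return (Q1, Q2, Q3) using the 'inclusive' method (an assumption)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


# Hypothetical counts for five consecutive ten-second intervals.
counts = [5000, 6000, 7000, 8000, 9000]
rates = tpmc_per_interval(counts)
q1, q2, q3 = quartiles(rates)
spread = q3 - q1   # interquartile range: smaller means steadier throughput
```

A narrow box (small `spread`) corresponds to throughput that varies little from one interval to the next.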

Page 32: Database High Availability Using SHADOW Systems

Conclusion

SHADOW Hot Standby HA for Cloud

How can we exploit shared storage to build better highly available database systems?

Single logical copy of the database and log

Pushes responsibility for replication out of the DBMS and into the underlying storage tier

Decouples database replication from DBMS replication

Outperforms PostgreSQL's native replication on a TPC-C benchmark

More stable throughput (tpmC) over time

Geographical limitation: replication coverage is limited to shared storage coverage

Page 33: Database High Availability Using SHADOW Systems

Mahalo!
