Shootout at the AWS Corral

EC2

RDS

Heroku

Josh Berkus, PostgreSQL Experts Inc., SCALE 13

https://github.com/manageacloud/cloud-benchmark-postgres

Thanks! Ruben Rubio Rey

What is Amazon Web Services?

Magic?

Image by Sperlingsmaedchen. Free for non-profit use only.

a bunch of servers

with virtualization

shared storage

… and a great API with stuff

The Good
● Fast deployment
  – new servers in minutes, with a script
● Easy scale-out
  – add replicas in minutes
● Minimize ops staff
  – no HW wranglers

¢heap (at the low end)

ex$pensive (at the high end)

The Bad
● Low system resources
  – VMs are small/slow
● Security
  – more attack vectors

The Ugly
● Everything is shared
  – Network
  – IO / Storage
  – CPU (partly)

Your performance depends on somebody else's peak load.

Sharing is Not Caring

ephemeral cloud
● DR is not optional
● your virtual DB server will go away
● you need replicas & backup

why Postgres?

transactional DB workout
● works CPU
● works RAM
● works IO
● works network IO

at the same time, in parallel

Cast of Three

The Gunslinger: Roll-Your-Own
The Rancher: RDS
The Dandy: Heroku

The Gunslinger: Roll Your Own on EC2

Become a Gunslinger

I. Create an EC2 instance

II. Install PostgreSQL on it

III. Configure PostgreSQL

Roll-Your-Own ++
● cheapest option
● highly configurable
● install whatever you want
  – version
  – extensions

Roll-Your-Own --
● you still do all the admin
  – installation
  – backup/redundancy
  – updates
  – OS updates
● configuration required

A Fistful of Services
● AMIs
  – “clone” your database server setup
● AWS's other services
  – caching, queueing, S3 storage, etc.

Instance Types
● m3.* general-purpose
● c3.* small, CPU-bound DBs $$
● r3.* maximize caching $$$
● i2.* Data warehousing $$$$

instance tips
● Use m3.* if you don't know what to use
● Get enough RAM to cache your database if you can

Storage Types
● EBS + Provisioned IOPS
  – large size, latency issues
  – reliable, snapshots
  – choose “EBS optimized”
● General SSD
  – better for bursts (about 20% better)
  – high variability

Storage Types
● Instance Storage + SSD option
  – low-latency, limited size
  – risky: data loss, data corruption
  – just for “running with scissors”

PrIOPS != throughput

PrIOPS fallacy
● not a guarantee, a limit
  – but mostly pretty consistent
● each IOP is no more than 8K
● random access → each page is an IOP
  – no real prefetch
● PrIOPS ~ rows/second
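As a rough illustration of why PrIOPS is not throughput, the sketch below multiplies an IOPS limit by the 8K-per-IOP ceiling from the slide. The function name and the choice of 1000 and 4000 PrIOPS as inputs are illustrative assumptions; real volumes may behave differently.

```python
# Upper bound on random-access throughput implied by a PrIOPS limit.
# Assumption (taken from the slide): each I/O operation moves at most 8 KB.
IOP_SIZE_BYTES = 8 * 1024


def max_throughput_mb_per_s(priops: int, iop_size_bytes: int = IOP_SIZE_BYTES) -> float:
    """Return the throughput ceiling in MB/s (1 MB = 1024 * 1024 bytes)."""
    return priops * iop_size_bytes / (1024 * 1024)


for priops in (1000, 4000):
    ceiling = max_throughput_mb_per_s(priops)
    print(f"{priops} PrIOPS -> at most {ceiling:.2f} MB/s of random access")
```

Under these assumptions even 4000 PrIOPS caps random access at a few tens of MB/s, which is why the limit behaves more like rows/second than like bandwidth.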

Stuff to Set Up
● Backup: WAL-E to S3
● Replication: not optional
  – in another Availability Zone
● Monitoring for instance failure
● Secure your instance
  – SSL
  – pg_hba.conf

Configuration Tips
● random_page_cost = 1.5
● wal_buffers = 32 to 64MB
● stats_temp_directory = /mnt/tmpfs/stats
● synchronous_commit = off
  – if you can afford it
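One way to keep these settings reproducible is to generate the postgresql.conf lines from a script. The sketch below is only an illustration: the `render_conf` helper is hypothetical, and `wal_buffers` is set to the low end of the slide's "32 to 64MB" range as an arbitrary choice.

```python
# Render the slide's suggested settings as postgresql.conf lines.
# Values come from the slide; wal_buffers uses 32MB, the low end of the
# suggested range (an arbitrary pick for this sketch).
SETTINGS = {
    "random_page_cost": 1.5,
    "wal_buffers": "32MB",
    "stats_temp_directory": "/mnt/tmpfs/stats",
    # Only if losing the most recent commits after a crash is acceptable.
    "synchronous_commit": "off",
}


def render_conf(settings: dict) -> str:
    """Return one 'name = value' line per setting."""
    lines = []
    for name, value in settings.items():
        # String values are single-quoted; numbers are written bare.
        rendered = f"'{value}'" if isinstance(value, str) else str(value)
        lines.append(f"{name} = {rendered}")
    return "\n".join(lines)


print(render_conf(SETTINGS))
```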

Junior Gunslinger
● “small”, “economical”
● m3.medium
● 1 core, 3.75GB RAM
● 40GB + 1000 PrIOPS

Senior Gunslinger
● “large”, “performance”
● r3.2xlarge
● 8 cores, 61GB RAM
● 200GB + 4000 PrIOPS

Junior Gunslinger Cost
instance: $36.50/month
EBS PrIOPS: $105/month
S3 Archive: $5/month
X2 for replica (instance + EBS; S3 archive counted once)
== $288.00 a month
(+ misc charges)
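The arithmetic behind the total can be sketched as follows. The prices are the slide's figures; the assumption that a replica doubles the instance and EBS charges while the S3 archive is paid once is inferred, since it is the combination that reproduces $288.00. The function name is illustrative.

```python
# Monthly cost of the small EC2 setup, in USD (figures from the slide).
# Assumption: each node pays for its own instance and EBS volume,
# while the S3 archive is shared and paid once.
INSTANCE = 36.50
EBS_PRIOPS = 105.00
S3_ARCHIVE = 5.00


def monthly_cost(nodes: int) -> float:
    """Return the monthly cost for the given number of database nodes."""
    return nodes * (INSTANCE + EBS_PRIOPS) + S3_ARCHIVE


print(f"single node:  ${monthly_cost(1):.2f}")
print(f"with replica: ${monthly_cost(2):.2f}")
```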

Cheaper Gunslinger
instance: $36.50/month
EBS PrIOPS: $105/month
S3 Archive: $5/month
no replica
== $146.50 a month
(+ misc charges)

Senior Gunslinger
archive-only: $760.70/month
with replica: $1509.40/month
(+ misc charges)

The Rancher: RDS

Relational Database Service

In The Middle

Ranching 101
1. Go to AWS RDS
2. Choose “PostgreSQL”
3. Select instance size and storage
4. Launch
5. Connect over port 5432

RDS ++
● Simpler deployment
● AWS manages updates, uptime
● Easy replicas
● Double redundancy
  – multizone warm standby OR replicas
  – regular DB snapshots

RDS --
● Limited extensions
● 9.3 only
  – and not promptly updated
● No shell access
● Still might have to configure Postgres

Rancher equipment
● Integration with some AWS services
  – caching
  – S3 snapshotting
  – plus regular access to other services
● 2 dozen extensions available

RDS Options
● Instance types:
  – same m3.* and r3.* options
  – no c3 or i2 instances currently
● again, get enough RAM
● All storage is EBS
  – take PrIOPS storage options

RDS redundancy
● Do Multi-AZ instances or replication
  – Multi-AZ: automated failover
  – Replication: better performance
● Set up auto DB snapshots
  – automatically deleted snapshots?

RDS configuration
● Same as Roll-Your-Own
  – except: be cautious, defaults are OK
● Except fewer security options

RDS Cost
Small, single:    $ 184.35
Small, redundant: $ 358.70
Large, single:    $1119.35
Large, redundant: $2342.70
(+ misc charges)

The Dandy: Heroku

Heroku for white-glove service

Doing the Dandy
1. Choose “Create Database”
2. Pick a size
3. Launch
4. Connect using supplied credentials

Heroku ++
● Heroku manages everything
  – updates, backups, availability, configuration
● really no Ops staff
● Heroku-only features
● Latest Postgres stuff
  – sometimes feature previews

Heroku --
● No configurability
  – webapp assumed
  – can't control AZ, etc.
● Limited extensions & versions
● Costs escalate
● No shell

Dandy Bling
● git-based instance management
  – works really well with Rails/Django
● Dataclips
  – web-sharable matviews!
● Followers == replicas

Dandy Bling
● About 20 extensions
● Heroku addons and apps
● encryption
● Access all AWS services

Heroku options
● 5 database “sizes”
● 3 levels of HA/uptime
● that's it

Heroku Sizing
● Small
  – Standard 2: 3.5 GB RAM
  – shared hosting
● Large
  – Standard 6: 60GB RAM
  – dedicated instance

Heroku Sizing
● Small
  – archive: $200
  – HA: $350
● Large
  – archive: $2000
  – HA: $3500

the shootout

pgbench++
● ships with Postgres
● microbenchmark
  – very simple “bank trade” workload
● fast to set up and run

pgbench--
● doesn't do complex queries
● pure random data / access
● unrealistic balance of work
  – too reliant on single-row write speed
● not very tunable

pgbench sizing
1. memory read-write (RW):
  – 50% of RAM, write transactions
2. memory read-only (RO):
  – 50% of RAM, read-only queries
3. disk read-write (RW):
  – 200% of RAM, write transactions

pgbench small
● memory RW
  – pgbench -i -s 100 --foreign-keys
  – pgbench -c 4 -T 900
● memory RO
  – pgbench -i -s 100 --foreign-keys
  – pgbench -c 4 -T 900 -S
● disk RW
  – pgbench -i -s 400 --foreign-keys
  – pgbench -c 4 -T 900

pgbench large
● memory RW
  – pgbench -i -s 1000 --foreign-keys
  – pgbench -c 16 -T 900
● memory RO
  – pgbench -i -s 1000 --foreign-keys
  – pgbench -c 16 -T 900 -S
● disk RW
  – pgbench -i -s 7000 --foreign-keys
  – pgbench -c 16 -T 900
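The three tests differ only in scale factor, client count, and the read-only flag, so the command lines can be generated rather than typed. A minimal sketch, assuming connection options (host, database, user) come from the environment as they do on the slides; the helper name and the test table are illustrative, not taken from the benchmark repository.

```python
from typing import List, Tuple


def pgbench_commands(scale: int, clients: int, seconds: int = 900,
                     read_only: bool = False) -> Tuple[List[str], List[str]]:
    """Return (init_argv, run_argv) for one pgbench test."""
    # Initialization: build the tables at the given scale, with foreign keys.
    init_argv = ["pgbench", "-i", "-s", str(scale), "--foreign-keys"]
    # Run: N concurrent clients for a fixed duration in seconds.
    run_argv = ["pgbench", "-c", str(clients), "-T", str(seconds)]
    if read_only:
        run_argv.append("-S")  # select-only transactions
    return init_argv, run_argv


# (name, scale factor, read-only?) for the large node, as listed on the slide.
LARGE_TESTS = [
    ("memory RW", 1000, False),
    ("memory RO", 1000, True),
    ("disk RW", 7000, False),
]

for name, scale, read_only in LARGE_TESTS:
    init_argv, run_argv = pgbench_commands(scale, clients=16, read_only=read_only)
    print(f"{name}: {' '.join(init_argv)} ; {' '.join(run_argv)}")
```

Each argv list could be passed to `subprocess.run(argv, check=True)` to execute the test against a real server.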

metrics
● TPS: transactions-per-second
  – measures multiple things
● Load Time: time to build the database

run many many times

Box Plot

[Figure: box plot legend. Markers: Min, 10th percentile, 50th percentile (Median), 90th percentile, Max. Example values: 495, 587, 1685, 2537, and 6156 TPS, labeled 0.3X, 0.4X, Median, 1.7X, and 4X.]
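The box plot markers can be computed from repeated runs with the Python standard library. This is a sketch only: the TPS samples are made-up numbers, not results from the talk, and the "inclusive" quantile method is one of several reasonable choices.

```python
import statistics


def summarize_tps(samples):
    """Return min, 10th/50th/90th percentile, and max of TPS samples."""
    # quantiles(n=10) returns the 9 cut points at 10%, 20%, ..., 90%.
    deciles = statistics.quantiles(samples, n=10, method="inclusive")
    return {
        "min": min(samples),
        "p10": deciles[0],
        "p50": deciles[4],
        "p90": deciles[8],
        "max": max(samples),
    }


# Hypothetical TPS results from repeated runs of one test.
runs = [1510, 1685, 1490, 1720, 1650, 2100, 980, 1700, 1660, 1580, 1640]
summary = summarize_tps(runs)
for name, value in summary.items():
    ratio = value / summary["p50"]
    print(f"{name}: {value:.0f} TPS ({ratio:.1f}X median)")
```

Reporting each marker as a multiple of the median, as the legend does, makes the spread between good and bad runs easy to compare across platforms.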

when the smoke clears ...

features
         Versions         Extensions  Superuser  Replication
EC2      Any              All         Yes        Yes
RDS      9.3 only         Some        No         Yes
Heroku   9.3, 9.4, betas  Some        No         Yes

         Auto-Failover  Snapshots  Extras  Support
EC2      No             DIY        DIY     No
RDS      Yes*           Yes        No      No
Heroku   Yes            Yes        Yes     Yes

[Chart: Small Instance Pricing. Y axis: cost per month, $0 to $400. Bars for EC2, RDS, Heroku; series: Archive and HA.]

[Chart: Large Instance Pricing. Y axis: cost per month, $0 to $4,000. Bars for EC2, RDS, Heroku; series: Archive and HA.]
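To compare the pricing across platforms, each option can be expressed as a multiple of the EC2 price. The figures below are copied from the cost slides in this deck; treating the RDS "single" and "redundant" prices as the Archive and HA series is an assumption, and the function name is illustrative.

```python
# Monthly prices in USD, copied from the cost slides.
# Assumption: "archive" = single node plus archive/backups,
# "ha" = the replicated or redundant setup.
PRICES = {
    "small": {
        "EC2": {"archive": 146.50, "ha": 288.00},
        "RDS": {"archive": 184.35, "ha": 358.70},
        "Heroku": {"archive": 200.00, "ha": 350.00},
    },
    "large": {
        "EC2": {"archive": 760.70, "ha": 1509.40},
        "RDS": {"archive": 1119.35, "ha": 2342.70},
        "Heroku": {"archive": 2000.00, "ha": 3500.00},
    },
}


def relative_to_ec2(size: str, tier: str) -> dict:
    """Return each platform's price as a multiple of the EC2 price."""
    base = PRICES[size]["EC2"][tier]
    return {name: tiers[tier] / base for name, tiers in PRICES[size].items()}


for size in PRICES:
    for tier in ("archive", "ha"):
        ratios = relative_to_ec2(size, tier)
        print(size, tier, {name: round(r, 2) for name, r in ratios.items()})
```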

[Chart: Small Node Load Time, In-Memory DB (smaller is faster). Y axis: minutes, 0 to 5. Bars for EC2, Heroku, RDS, RDS HA; series: median and 90%.]

[Chart: Small Node Load Time, On-Disk DB (smaller is faster). Y axis: minutes, 0 to 30. Bars for EC2, Heroku, RDS, RDS HA; series: median and 90%.]

[Chart: Large Node Load Time, In-Memory DB (smaller is faster). Y axis: minutes, 0 to 30. Bars for EC2, Heroku, RDS, RDS HA; series: median and 90%.]

[Chart: Large Node Load Time, On-Disk DB (smaller is faster). Y axis: minutes, 0 to 180. Bars for EC2, Heroku, RDS, RDS HA; series: median and 90%.]

[Chart: In-Memory RW Test, Small Node (taller is faster). Y axis: TPS, 0 to 800. Bars for EC2, Heroku, RDS, RDS HA; series: median and 90%.]

[Chart: In-Memory RW Test, Large Node (taller is faster). Y axis: TPS, 0 to 3000. Bars for EC2, Heroku, RDS, RDS HA; series: median and 90%.]

[Chart: In-Memory RO Test, Small Node (taller is faster). Y axis: TPS, 0 to 7000. Bars for EC2, Heroku, RDS, RDS HA; series: median and 90%.]

[Chart: In-Memory RO Test, Large Node (taller is faster). Y axis: TPS, 0 to 30000. Bars for EC2, Heroku, RDS, RDS HA; series: median and 90%.]

[Chart: On-Disk RW Test, Small Node (taller is faster). Y axis: TPS, 0 to 400. Bars for EC2, Heroku, RDS, RDS HA; series: median and 90%.]

[Chart: On-Disk RW Test, Large Node (taller is faster). Y axis: TPS, 0 to 1600. Bars for EC2, Heroku, RDS, RDS HA; series: median and 90%.]

What's Next

More Clouds
● Rackspace
● Digital Ocean
● OpenShift
● Google Compute Engine

More Benchmarks
● OLTPBench?
  – Wikipedia, Auctionmark, Epinions
● DVDStore?
● New benchmark?
  – really need something more “webby”
● NoSQLish benchmark?

Better Visualizations
● better graphs
● automated graph generation
● detailed response times and time graphs

“running with scissors”
● test for pure ephemeral instances
● no transaction log
● local SSD
● just for RO load-balancing

more shooting
● Josh Berkus: josh@pgexperts.com
  – www.pgexperts.com
● More Shootouts
  – www.databasesoup.com
  – https://github.com/manageacloud/cloud-benchmark-postgres/
  – pgConf.US NYC, pgCon Ottawa

Copyright 2015 PostgreSQL Experts Inc. Released under the Creative Commons Share-Alike 3.0 License. All images and trademarks are the property of their respective owners.