
How Heroku Postgres Works

Greg Burek
gregburek@heroku.com

Nov 2015

Outline

Heroku and Databases as a Service

How we manage 1 million+ Postgres DBs

Monitoring for 1m+

Incident response for 1m+

Disaster Recovery for 1m+

What is Heroku?

Applications

Monitoring
Logging

Build And Deploy

Add-ons

$ heroku addons:create heroku-postgresql --app demo-app
Creating postgresql-solid-8793... done, (free)
Adding postgresql-solid-8793 to demo-app... done
Setting HEROKU_POSTGRESQL_GRAY_URL and restarting demo-app... done, v2782
Database has been created and is available
 ! This database is empty. If upgrading, you can transfer
 ! data from another database with pgbackups:restore
Use `heroku addons:docs heroku-postgresql` to view documentation.

heroku-postgresql

Applications

Monitoring
Logging

Build And Deploy

Databases

Monitoring
Logging

Export and Import of data
Backups for HA and DR

heroku-postgresql

heroku-redis

Single Tenant Production

Server
  Cluster
    Database

Multi-tenant Production

Server
  cl: Db
  cl: Db
  cl: Db

Hobby

Server
  Cluster
    Db Db Db Db
    Db Db Db Db
    Db Db Db Db

Single Tenant Production

Server
  lxc
    Database

Multi-tenant Production

Server
  lxc: Db
  lxc: Db
  lxc: Db

Hobby

Server
  lxc
    Db Db Db Db
    Db Db Db Db
    Db Db Db Db

Plan                                 vCPU  RAM    PIOPs  Multi-tenant  Connections
standard-0, premium-0                2     1GB    200    Yes           120
standard-2, premium-2                2     3.5GB  200    Yes           400
standard-4, premium-4                2     15GB   1000   No            500
standard-5, premium-5                4     30GB   2000   No            500
standard-6, premium-6                8     60GB   3000   No            500
standard-7, premium-7, enterprise-7  16    120GB  4000   No            500
enterprise-8                         32    240GB  4000   No            500

Shogun

Monitoring

psql

SSH

AWS APIs

Service {
  database: 'd3lwi9ef2',
  port: 5432,
  username: 'u23f8doife9',
  password: 'dfwefujp',
  created_at: '2012-05-02',
  state: 'available'
}

Server {
  ip: '192.168.0.1',
  instance_id: 'i-2fidj3c8',
  ami: 'pg-prod',
  availability_zone: 'us-east-1a',
  created_at: '2012-05-02',
  state: 'booting'
}

available

creating

uncertain

unavailable

deprovisioning

deprovisioned

service.feel
service.tick

Need to do this all the time

SSH: echo '1'

psql: select 1
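The two probes above can be turned into a state with a small pure function. This is only a sketch: the `State` enum, the `feel` signature, and the rule that one failed round means "uncertain" while a second means "unavailable" are assumptions made for illustration, not the behaviour of the actual service.

```python
from enum import Enum


class State(Enum):
    AVAILABLE = "available"
    UNCERTAIN = "uncertain"
    UNAVAILABLE = "unavailable"


# The two probes named on the slides; how they are executed is left out here.
SSH_PROBE = "echo '1'"
PSQL_PROBE = "select 1"


def feel(ssh_ok: bool, psql_ok: bool, previous: State) -> State:
    """Map one round of probe results to a service state (assumed mapping)."""
    if ssh_ok and psql_ok:
        return State.AVAILABLE
    # Assumption: a first failed round only makes the service "uncertain";
    # a further failed round marks it "unavailable".
    if previous is State.AVAILABLE:
        return State.UNCERTAIN
    return State.UNAVAILABLE
```

Keeping the classification separate from the code that runs SSH and psql makes it cheap to call on every check and easy to test without a server.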

agentless-ish

collectd
pglogplex-collector

wal-e

Incident Workers

Automated incident resolution

await_resolution

triggered

human_intervention

resolved

archived

resource down?

restart resource and file a ticket

HA leader down?

fail over to standby and file a ticket

server down?

stop and start AWS instance
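The three question-and-answer pairs above read like a lookup from incident type to an automated response. A minimal sketch of that idea follows; the incident names, function names, and the fallback of paging a human for unknown incident types are all hypothetical, and the functions only describe steps rather than perform them.

```python
from typing import Callable, Dict, List

Action = Callable[[str], List[str]]


def restart_resource(target: str) -> List[str]:
    return [f"restart resource {target}", f"file ticket for {target}"]


def fail_over(target: str) -> List[str]:
    return [f"fail over {target} to standby", f"file ticket for {target}"]


def cycle_instance(target: str) -> List[str]:
    return [f"stop AWS instance {target}", f"start AWS instance {target}"]


# Hypothetical incident names mapped to the responses listed on the slides.
PLAYBOOK: Dict[str, Action] = {
    "resource_down": restart_resource,
    "ha_leader_down": fail_over,
    "server_down": cycle_instance,
}


def resolve(kind: str, target: str) -> List[str]:
    """Return the automated steps for an incident, or escalate if unknown."""
    action = PLAYBOOK.get(kind)
    if action is None:
        return [f"page a human about {kind} on {target}"]
    return action(target)
```

For example, `resolve("server_down", "i-2fidj3c8")` yields the stop step followed by the start step for that instance.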

Stuff happens constantly

Incidents let us not worry about 99% of it

Circuit Breakers

everything down?

page a human

EBS disk apocalypse?

page a human
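One way to picture a circuit breaker for automated remediation is a sliding-window counter: when too many incidents arrive in a short period, automation stops and a human is paged. The sketch below assumes that design; the class name, the threshold, and the window length are illustrative and do not come from the talk.

```python
from collections import deque
from typing import Deque


class CircuitBreaker:
    """Stops automated remediation when too many incidents arrive at once."""

    def __init__(self, max_incidents: int, window_seconds: float) -> None:
        self.max_incidents = max_incidents      # hypothetical threshold
        self.window_seconds = window_seconds    # hypothetical window
        self._times: Deque[float] = deque()

    def allow_automation(self, now: float) -> bool:
        """Record an incident at time `now`; False means page a human."""
        self._times.append(now)
        # Drop incidents that fell out of the sliding window.
        while self._times and now - self._times[0] > self.window_seconds:
            self._times.popleft()
        return len(self._times) <= self.max_incidents
```

With `CircuitBreaker(2, 60.0)`, a third incident inside the same minute trips the breaker, and automation resumes once the window has cleared.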

database disk full?

add a new EBS disk and xfs_growfs

/dev/xvdg

LVM

/database

/dev/xvdg

LVM

/database

/dev/xvdh
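The disk-full response shown in the diagram can be sketched as an ordered list of standard LVM and XFS commands. This is an assumption about how such a step might be scripted, not the actual tooling: the volume group and logical volume names are hypothetical, attaching the EBS volume through the AWS APIs is not shown, and the function only builds the commands rather than running them.

```python
from typing import List


def grow_plan(new_device: str, volume_group: str, logical_volume: str,
              mount_point: str) -> List[List[str]]:
    """Return the commands, in order, that extend an XFS filesystem on LVM."""
    lv_path = f"/dev/{volume_group}/{logical_volume}"
    return [
        ["pvcreate", new_device],                   # register the new disk with LVM
        ["vgextend", volume_group, new_device],     # add it to the volume group
        ["lvextend", "-l", "+100%FREE", lv_path],   # grow the logical volume
        ["xfs_growfs", mount_point],                # grow the mounted filesystem
    ]
```

A call such as `grow_plan("/dev/xvdh", "vg0", "database", "/database")` returns four commands that could be passed one at a time to `subprocess.run(..., check=True)` on the server.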

Server Features

Infrastructure Feature Flags

Immutable-ish Infrastructure

Durability and Availability

INSERT INTO …

1. Write to WAL

2. Keep it in memory

3. Respond to client

4. Flush to disk
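The ordering of the four steps is what makes the write durable before the data files are touched. The toy store below illustrates that ordering only; it is not how Postgres is implemented, and `TinyStore`, `checkpoint`, and `recover` are names invented for this sketch, with Python lists and dicts standing in for files.

```python
from typing import Dict, List, Tuple


class TinyStore:
    """Toy key-value store that follows the four steps in order."""

    def __init__(self) -> None:
        self.wal: List[Tuple[str, str]] = []   # durable log (a list stands in for a file)
        self.memory: Dict[str, str] = {}       # in-memory pages
        self.disk: Dict[str, str] = {}         # data files
        self._flushed = 0                      # count of WAL records already on disk

    def insert(self, key: str, value: str) -> str:
        self.wal.append((key, value))   # 1. write to WAL
        self.memory[key] = value        # 2. keep it in memory
        return "INSERT 0 1"             # 3. respond to client

    def checkpoint(self) -> None:
        # 4. flush to disk, later and in bulk
        for key, value in self.wal[self._flushed:]:
            self.disk[key] = value
        self._flushed = len(self.wal)

    def recover(self) -> None:
        """After a crash, rebuild memory from disk plus unflushed WAL."""
        self.memory = dict(self.disk)
        for key, value in self.wal[self._flushed:]:
            self.memory[key] = value
```

Because the client is answered only after the WAL append, a crash between steps 3 and 4 loses nothing: `recover` replays the unflushed records.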

Ship WAL at least every 60s

S3

fork

follower

timeline

T0

participant

participant

followers

fork

Point In Time Recovery
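Point-in-time recovery combines a base backup with the archived WAL that follows it. The sketch below shows one way to choose what to restore for a given target time; the function name, the use of plain numbers for timestamps, and the representation of WAL segments as (start, end) pairs are assumptions for illustration.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float]


def plan_recovery(base_backups: List[float], wal_segments: List[Segment],
                  target: float) -> Optional[Tuple[float, List[Segment]]]:
    """Pick the newest base backup at or before `target` and the WAL to replay.

    Times are plain numbers (for example, Unix timestamps). Each WAL segment
    is a (start, end) pair. Returns None when no base backup precedes the
    target time.
    """
    candidates = [b for b in base_backups if b <= target]
    if not candidates:
        return None
    base = max(candidates)
    # Replay every segment that overlaps the interval [base, target].
    replay = sorted(s for s in wal_segments if s[1] > base and s[0] <= target)
    return base, replay
```

Replay would then stop partway through the last selected segment, at the target time, rather than at the end of the segment.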

disaster

HA recovery

STONITH

complicated project

modularize and build APIs

composable services

abstract-able services

Thanks!

@gregburek

@herokupostgres
