Replication - Nick Carboni - ManageIQ Design Summit 2016

Database Replication in ManageIQ

Introduction

Nick Carboni

Software Engineer at Red Hat

Appliance and database for ManageIQ (Platform Team)

Replication in ManageIQ

Multiple databases can be used for geographically distant environments

Database replication allows for easy reporting and visibility

Database 1 Database 2

Global Database

Replication!

rubyrep

http://www.rubyrep.org

https://github.com/rubyrep/rubyrep

Last commit in 2011

rubyrep

Ruby + ActiveRecord

Triggers = Developer headaches

SQL

Configuration is … hard

elsif commands.include? command
  status = commands[command][:command].run(args.slice(1, 1_000_000))

rubyrep in ManageIQ

We use our own fork

https://github.com/ManageIQ/rubyrep

Mainly support for new versions of ActiveRecord

rubyrep in ManageIQ

PR to add support for Rails 4.2

March 27, 2015

“Yup. It's basically dead. Our changes are also SOOO crazy different, that this is like a Frankenstein project now. I wonder if we should just write something different from scratch given our experience...or perhaps we should rethink replication altogether and come up with a plan.”

-@Fryguy

rubyrep in ManageIQ

Worker process

Database Synchronization role

Worker will run on the server with the role

Passive global database

pglogical

http://2ndquadrant.com/en-us/resources/pglogical/

https://github.com/2ndQuadrant/pglogical/

Inclusion into PostgreSQL core in progress

9.7+??

pglogical

PostgreSQL extension

Process runs on database server

Managed by PostgreSQL service

SQL stored procedure API

postgresql.conf configuration

pglogical

PostgreSQL logical decoding

https://www.postgresql.org/docs/9.4/static/logicaldecoding.html

Output plugin

pglogical_output

Consumer extension

pglogical

Replication slots

https://www.postgresql.org/docs/9.4/static/warm-standby.html#STREAMING-REPLICATION-SLOTS

pglogical in ManageIQ

AR connection adapter extension

Exposes a ruby method per pglogical stored procedure

Configure “subscriptions” to remote regions through global region UI

Exclude table configuration unchanged

“Advanced Settings” tab in remote region UI
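
The "one Ruby method per stored procedure" idea can be sketched roughly as follows. This is a minimal illustration, not the ManageIQ implementation: the `PglogicalClient` class and its method names are hypothetical, and the connection is assumed to be any object that responds to `quote` and `select_value`, as ActiveRecord connection adapters do. Only the `pglogical.create_node`, `pglogical.create_subscription`, and `pglogical.drop_subscription` functions come from pglogical itself.

```ruby
# Hypothetical wrapper: one Ruby method per pglogical stored procedure.
class PglogicalClient
  # connection: any object responding to #quote and #select_value,
  # for example an ActiveRecord connection adapter (assumption).
  def initialize(connection)
    @connection = connection
  end

  # Wraps pglogical.create_node(node_name, dsn).
  def node_create(name, dsn)
    call_procedure("create_node", name, dsn)
  end

  # Wraps pglogical.create_subscription(subscription_name, provider_dsn).
  def subscription_create(name, provider_dsn)
    call_procedure("create_subscription", name, provider_dsn)
  end

  # Wraps pglogical.drop_subscription(subscription_name, ifexists).
  def subscription_drop(name, if_exists = false)
    call_procedure("drop_subscription", name, if_exists)
  end

  private

  # Builds "SELECT pglogical.<procedure>(<quoted args>)" and runs it.
  def call_procedure(procedure, *args)
    quoted = args.map { |arg| @connection.quote(arg) }.join(", ")
    @connection.select_value("SELECT pglogical.#{procedure}(#{quoted})")
  end
end
```

Keeping the SQL generation in a single private method means each public method stays a one-line mapping to its stored procedure, and every argument passes through the adapter's own quoting.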

Global Region Configuration

Performance testing

How long does it take to replicate a “backlog” of data?

Vary amount of data and artificial network latency

Measure time until “backlog” is zero

rubyrep - number of rows in rr_pending_changes table

pglogical - transaction log location compared to flush location

SELECT pg_xlog_location_diff(pg_current_xlog_location(), flush_location) AS lag_bytes FROM pg_stat_replication;

Also record time to insert test rows
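
The backlog measurement described above can be sketched in Ruby. This is an illustration only, not the script used for these tests: all helper names are hypothetical. It assumes transaction log locations are written as two hexadecimal numbers separated by a slash (high 32 bits, then low 32 bits), which is the value that `pg_xlog_location_diff` turns into a byte difference.

```ruby
# Convert a PostgreSQL transaction log location such as "16/B374D848"
# (two hexadecimal numbers: high 32 bits / low 32 bits) to a byte position.
def xlog_location_to_bytes(location)
  high, low = location.split("/").map { |part| part.to_i(16) }
  (high << 32) + low
end

# Byte difference between two locations, in the spirit of
# pg_xlog_location_diff(current, flushed).
def xlog_location_diff(current, flushed)
  xlog_location_to_bytes(current) - xlog_location_to_bytes(flushed)
end

# Poll until the backlog reaches zero and return the elapsed seconds.
# The block must return the current backlog: pending rows for rubyrep,
# lag bytes for pglogical.
def time_until_backlog_clear(poll_interval = 1.0)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  sleep(poll_interval) until yield.zero?
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end
```

In a real run the block passed to `time_until_backlog_clear` would query the database, for example by counting rows in `rr_pending_changes` or by running the `lag_bytes` query shown above.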

Insert Performance

Rows        pglogical              rubyrep
1,000       0.158 (σ = 0.006)      0.189 (σ = 0.011)
10,000      1.499 (σ = 0.095)      1.84 (σ = 0.093)
100,000     16.105 (σ = 0.296)     18.042 (σ = 0.320)
1,000,000   147.508 (σ = 3.557)    195.399 (σ = 18.317)

Sample size = 10
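
To make the gap between the two columns easier to read, the relative difference can be computed from the mean insert times in the table above. The snippet below is only a small arithmetic sketch; the constant and method names are hypothetical, and it ignores the reported standard deviations.

```ruby
# Mean insert times copied from the table above, keyed by row count.
INSERT_TIMES = {
  1_000     => { pglogical: 0.158,   rubyrep: 0.189 },
  10_000    => { pglogical: 1.499,   rubyrep: 1.84 },
  100_000   => { pglogical: 16.105,  rubyrep: 18.042 },
  1_000_000 => { pglogical: 147.508, rubyrep: 195.399 }
}.freeze

# Percentage by which rubyrep's mean insert time exceeds pglogical's,
# rounded to one decimal place.
def rubyrep_overhead_percent(times)
  ((times[:rubyrep] - times[:pglogical]) / times[:pglogical] * 100).round(1)
end

INSERT_TIMES.each do |rows, times|
  puts "#{rows} rows: rubyrep +#{rubyrep_overhead_percent(times)}%"
end
```

On these figures the rubyrep mean is roughly 12% to 32% higher than the pglogical mean, with the largest gap at 1,000,000 rows.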

No Latency

25ms Latency (50ms ping time)

100ms Latency (200ms ping time)

Q&A