Database Replication in ManageIQ

Replication - Nick Carboni - ManageIQ Design Summit 2016

Introduction

Nick Carboni

Software Engineer at Red Hat

Appliance and database for ManageIQ (Platform Team)

Replication in ManageIQ

Multiple databases can be used for geographically distant environments

Database replication allows for easy reporting and visibility

Database 1 Database 2

Global Database

Replication!

rubyrep

http://www.rubyrep.org

https://github.com/rubyrep/rubyrep

Last commit in 2011

rubyrep

Ruby + ActiveRecord

Triggers = Developer headaches

SQL

Configuration is … hard

elsif commands.include? command
  status = commands[command][:command].run(args.slice(1, 1_000_000))

rubyrep in ManageIQ

We use our own fork

https://github.com/ManageIQ/rubyrep

Mainly support for new versions of ActiveRecord

rubyrep in ManageIQ

PR to add support for Rails 4.2

March 27, 2015

“Yup. It's basically dead. Our changes are also SOOO crazy different, that this is like a Frankenstein project now. I wonder if we should just write something different from scratch given our experience...or perhaps we should rethink replication altogether and come up with a plan.”

-@Fryguy

rubyrep in ManageIQ

Worker process

Database Synchronization role

Worker will run on the server with the role

Passive global database

pglogical

http://2ndquadrant.com/en-us/resources/pglogical/

https://github.com/2ndQuadrant/pglogical/

Inclusion into PostgreSQL core in progress

9.7+??

pglogical

PostgreSQL extension

Process runs on database server

Managed by PostgreSQL service

SQL stored procedure API

postgresql.conf configuration

pglogical

PostgreSQL logical decoding

https://www.postgresql.org/docs/9.4/static/logicaldecoding.html

Output plugin

pglogical_output

Consumer extension

pglogical

Replication slots

https://www.postgresql.org/docs/9.4/static/warm-standby.html#STREAMING-REPLICATION-SLOTS

pglogical in ManageIQ

AR connection adapter extension

Exposes a ruby method per pglogical stored procedure

Configure “subscriptions” to remote regions through global region UI

Exclude table configuration unchanged

“Advanced Settings” tab in remote region UI
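A minimal sketch of what "a ruby method per pglogical stored procedure" could look like. The module name, the method names, and the RecordingConnection stand-in are hypothetical and are not taken from ManageIQ's code. The only things assumed to exist are the pglogical functions create_node, create_subscription, and drop_subscription. The stand-in connection records the SQL instead of running it, so the sketch works without a database.

```ruby
# Hypothetical sketch: one Ruby method per pglogical stored procedure.
# Names are illustrative assumptions, not ManageIQ's actual API.
module PglogicalProcedures
  # Registers this database as a pglogical node.
  def pglogical_node_create(name, dsn)
    execute("SELECT pglogical.create_node(#{quote(name)}, #{quote(dsn)})")
  end

  # Subscribes this database to a provider, e.g. a remote region.
  def pglogical_subscription_create(name, provider_dsn)
    execute("SELECT pglogical.create_subscription(#{quote(name)}, #{quote(provider_dsn)})")
  end

  # Drops a subscription; ifexists suppresses the error when it is missing.
  def pglogical_subscription_drop(name, ifexists = false)
    execute("SELECT pglogical.drop_subscription(#{quote(name)}, #{ifexists ? 'true' : 'false'})")
  end
end

# Stand-in for a connection adapter: records SQL rather than executing it.
class RecordingConnection
  include PglogicalProcedures

  attr_reader :statements

  def initialize
    @statements = []
  end

  def execute(sql)
    @statements << sql
    sql
  end

  # Simplified literal quoting: doubles embedded single quotes.
  def quote(value)
    "'#{value.to_s.gsub("'", "''")}'"
  end
end

connection = RecordingConnection.new
connection.pglogical_node_create("global_region", "host=global-db dbname=vmdb_production")
connection.pglogical_subscription_create("region_1", "host=region1-db dbname=vmdb_production")
puts connection.statements
```

In a real ActiveRecord adapter, execute and quote would come from the adapter itself. The database names and hosts shown are placeholders.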

Global Region Configuration

Performance testing

How long does it take to replicate a “backlog” of data?

Vary amount of data and artificial network latency

Measure time until “backlog” is zero

rubyrep - number of rows in rr_pending_changes table

pglogical - transaction log location compared to flush location

SELECT pg_xlog_location_diff(pg_current_xlog_location(), flush_location) AS lag_bytes FROM pg_stat_replication;

Also record time to insert test rows
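The pglogical backlog measurement can be sketched in Ruby as follows. A transaction log location such as 0/3000060 is a 64-bit position written as two hexadecimal halves, so the lag in bytes is the difference of the two positions, which is what pg_xlog_location_diff returns. The helper names and the polling loop are assumptions made for illustration and do not reproduce the test harness used for the talk. The block passed to time_until_caught_up stands in for a query against pg_stat_replication.

```ruby
# Converts a 'high/low' hexadecimal log location into a byte position.
def xlog_location_to_bytes(location)
  high, low = location.split("/").map { |part| part.to_i(16) }
  (high << 32) + low
end

# Byte distance between the current write position and the flushed position.
def lag_bytes(current_location, flush_location)
  xlog_location_to_bytes(current_location) - xlog_location_to_bytes(flush_location)
end

# Polls the given block until it reports a backlog of zero and
# returns the elapsed time in seconds.
def time_until_caught_up(poll_interval: 1.0)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  sleep(poll_interval) until yield.zero?
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Usage with canned samples in place of a live database query.
samples = [lag_bytes("0/3000060", "0/3000000"), 16, 0]
elapsed = time_until_caught_up(poll_interval: 0) { samples.shift }
puts "caught up after #{elapsed.round(3)} seconds"
```

For rubyrep, the same loop would apply with the block returning the row count of rr_pending_changes.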

Insert Performance

Rows        pglogical             rubyrep
1,000       0.158 (σ = 0.006)     0.189 (σ = 0.011)
10,000      1.499 (σ = 0.095)     1.84 (σ = 0.093)
100,000     16.105 (σ = 0.296)    18.042 (σ = 0.320)
1,000,000   147.508 (σ = 3.557)   195.399 (σ = 18.317)

Sample size = 10
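To make the insert timings above easier to compare, the following sketch computes the ratio of the rubyrep mean to the pglogical mean for each row count. The figures are copied from the slide. The constant and method names are illustrative, and the ratio is plain arithmetic on the reported means; it ignores the standard deviations.

```ruby
# Mean insert times copied from the slide, keyed by number of rows inserted.
INSERT_TIMES = {
  1_000     => { pglogical: 0.158,   rubyrep: 0.189 },
  10_000    => { pglogical: 1.499,   rubyrep: 1.84 },
  100_000   => { pglogical: 16.105,  rubyrep: 18.042 },
  1_000_000 => { pglogical: 147.508, rubyrep: 195.399 }
}.freeze

# Ratio of the rubyrep mean to the pglogical mean, rounded to two decimals.
def rubyrep_to_pglogical_ratio(times)
  (times[:rubyrep] / times[:pglogical]).round(2)
end

INSERT_TIMES.each do |rows, times|
  puts format("%9d rows: rubyrep/pglogical = %.2f", rows, rubyrep_to_pglogical_ratio(times))
end
```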

No Latency

25ms Latency (50ms ping time)

100ms Latency (200ms ping time)

Q&A