ADRIAN DOZSA
DECEMBER 5, 2019
Your account balance is eventually consistent - A Postgres Active-Active story from Banking
High availability is critical for payment systems: downtime of mission-critical banking systems costs on average $10M per hour. Availability requirements are expressed as “5 9’s”. In practice this requires both single-site and dual-site redundancy of database nodes, multi-master Active-Active processing across data centers, and “Continuous Availability” at both data centers even when the WAN between them fails. The overall design must balance extremely write-intensive workloads with high throughput and low latency, availability under a variety of failure modes, and enforcement of both immediate and eventual consistency of the “bank account balance”.
This talk presents our journey migrating a mission-critical payment system to Postgres in a multi-master Active-Active topology using Postgres-BDR, in a manner that addresses these challenges. It details the selection of replication modes, the transaction recovery and duplicate transaction avoidance mechanisms used during node failover, and appropriate patterns of use for replication conflict detection and resolution mechanisms (particularly Conflict-Free Replicated Data Types).
In conclusion, we’ll see how Postgres-BDR can satisfy the demanding requirements of a mission-critical banking application.
Abstract
• Payments Systems
• Where we started from
• Single Site
• High Availability
• Consistency
• Dual Site
• High Availability
• Consistency
• Conclusion
Agenda
• Over a decade of experience building payments systems
• Focus on Non-Functional Requirements (aka -ilities)
• Application Engineer/Architect, not a DBA
• ACI Worldwide
About me
About ACI Worldwide
Customer Segments: Banks, Financial Intermediaries, Merchants, Corporates
Deployment Models: Platform (ACI Data Center), Licensed
Solution Areas: Retail Payments, Real-Time Payments, Merchant Payments, Bill Payments, Digital Channels, Payments Intelligence
Americas: 4,600+ customers
EMEA: 400+ customers
Asia/Pacific: 200+ customers
45 years in payments
5,300+ organizations around the world use ACI solutions
Payments Systems
A few defining characteristics
o Very high cost of failure ($9.3M per hour as per ITIC 2018)
o Very High-Availability needs – 5 9’s (over 86% need 5 9’s as per ITIC 2018)
o No single point of failure
o Strict consistency requirements
o Low latency (tens of ms per business transaction)
o High throughput (thousands of business transactions per second)
o Very high write ratios (up to 100%)
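To put the availability and cost figures in perspective, the downtime budget implied by “5 9’s” can be worked out directly. This is illustrative arithmetic only: the function name and the 365-day year are assumptions, and the hourly cost is the ITIC 2018 figure quoted above.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines (5 -> 99.999%)."""
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


budget = downtime_minutes_per_year(5)
cost = budget / 60 * 9.3e6  # $9.3M per hour, as quoted from ITIC 2018
print(f"{budget:.2f} minutes/year, about ${cost:,.0f} if fully used")
```

Five nines leaves a little over five minutes of downtime per year, which is why a failover that needs a manual promotion step is hard to fit into the budget.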
Payments Systems
Not all payments are equal
Low value payments                                         | High value payments
Retail payments (aka cards)                                | Realtime and Wholesale payments
Your grocery store purchase                                | Your grocery store paying their supplier
Payment loss or uncertainty is tolerated on rare occasions | Payment loss or uncertainty is not tolerated
Two-message payment: Authorization now + Settlement later  | Single-message payment: Authorization and Settlement in one
“Second chance” to fix it                                  | No “second chances”
Where we started from
[Diagram: Site A (two Application nodes, RAC Cluster) and Site B (two Application nodes, RAC Cluster), linked by GoldenGate (async)]
• Oracle deployment
• RAC for single site availability
o No single point of failure
• GoldenGate for dual-site Active-Active availability
o Advanced conflict detection and resolution logic
o Not all data is replicated
• 5 9’s Availability
Single Site Availability
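[Diagram: Active-Passive replica pair. Two Application nodes connect through HAProxy; on failure the passive node must be promoted, causing a temporary outage.]
[Diagram: Active-Active replica pair (BDR). Two Application nodes connect through HAProxy; on failure the second node is already active, so there is no downtime.]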
• What about in-flight transactions?
• Retry on the second node
• No failed transactions
• Failure is transparent to business logic
Transparent failover
• Availability ✓
• Consistency ?
Single site
• Oracle RAC
o Shared disks
o Shared memory
o Shared locks (coordination)
• Implications
o Nodes never diverge (not even temporarily)
• Postgres BDR
o Separate disks
o Separate memory
o Independent transactions
o Nodes diverge (for a split second, or during failover)
Shared-everything vs Shared-nothing
Possible consistency failures (e.g. duplicates):
1. during failover and recovery
2. during switch-over
3. during failback
4. use of remote_write
Consistency failures
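Problem 1: failover and recovery
[Diagram: two Application nodes behind HAProxy, writing to a BDR pair with sync replication. A -$100 debit reaches both nodes, and a retry after failover applies it again, leaving -$200 on both nodes: Duplicate transaction.]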
• Commit uncertainty (à la Schrödinger's cat)
• No way to find out what happened
• Risk of duplicates on recovery
The problem
Solution: CAMO
[Diagram: two Application nodes behind HAProxy, writing to a BDR pair with sync replication. The -$100 debit is committed remote first, and both nodes end with -$100. No duplicates.]
Did it commit?
App: Do you have the txn?
DB: Yes!
App: Ok. Nothing to do.
• Commit at Most Once
• Remote first
• Allows safe application retries
• Removes commit uncertainty
• Protects against duplicates
• Needs application involvement
CAMO
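The retry dialogue above can be sketched in a few lines. This is a minimal in-memory model, not the BDR API: the `Node` class, `has_txn`, and the `fail_before_ack` switch are all hypothetical names introduced for illustration. The point it shows is that, because the commit happens remote first, the surviving node can answer "do you have the txn?" and the application retries only when the answer is no.

```python
import uuid


class Node:
    """Hypothetical stand-in for one database node; not the BDR API."""

    def __init__(self):
        self.balance = 0
        self.committed = set()  # ids of transactions committed on this node

    def apply(self, txn_id, amount):
        self.balance += amount
        self.committed.add(txn_id)

    def has_txn(self, txn_id):
        return txn_id in self.committed


class ConnectionLost(Exception):
    pass


def commit_remote_first(origin, partner, txn_id, amount, fail_before_ack=False):
    """Commit on the partner first, then on the origin."""
    partner.apply(txn_id, amount)
    if fail_before_ack:
        # The origin dies before acknowledging: the outcome is uncertain.
        raise ConnectionLost(txn_id)
    origin.apply(txn_id, amount)


def pay(origin, partner, amount, fail_before_ack=False):
    txn_id = uuid.uuid4().hex  # id the application keeps for the retry check
    try:
        commit_remote_first(origin, partner, txn_id, amount, fail_before_ack)
    except ConnectionLost:
        # "Do you have the txn?" Ask the surviving node before retrying.
        if not partner.has_txn(txn_id):
            partner.apply(txn_id, amount)
    return txn_id


node_a, node_b = Node(), Node()
txn = pay(node_a, node_b, -100, fail_before_ack=True)
print(node_b.balance)  # -100: the retry was skipped, so no duplicate
```

A blind retry in the `except` branch would have debited the account twice, which is the duplicate from Problem 1. The cost is the last bullet above: the application has to keep the transaction id and perform the check itself.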
CAMO Performance
[Chart: CAMO vs sync - 8k inserts. Latency [ms] (0 to 60) against TPS (200 to 12,800). Series: CAMO, CAMO remote_write, Sync remote_write.]
[Chart: CAMO vs sync - 32k inserts. Latency [ms] (1 to 1000) against TPS (200 to 4,000). Series: CAMO, CAMO remote_write, Sync remote_write.]
Problem 2: dual node connections
[Diagram: two Application nodes behind two HAProxy instances on a BDR pair. On a switch, idle connections move to the 2nd node, so key1 is inserted on both nodes: Duplicate.]
Solution: always connect to one node only
[Diagram: two Application nodes behind two HAProxy instances on a BDR pair. On a switch, all sessions are killed and the killed sessions are replayed, so key1 is inserted once. No duplicates.]
Problem 3: failback
[Diagram: two Application nodes behind HAProxy on a BDR pair. After failback the node is not in sync (replication lag, no CAMO), so key1 is inserted on both nodes: Duplicate.]
Solution: site failover
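Problem 4: remote_write
[Diagram: two Application nodes behind two HAProxy instances on a BDR pair using remote_write. key1 has been written on the second node but not yet applied, so a second insert of key1 is accepted there: Duplicate.]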
Solution: wait to apply
[Diagram: two Application nodes behind two HAProxy instances on a BDR pair using remote_write. The second node waits to apply all received changes first, so key1 is inserted once. No duplicates.]
Single Site Availability
• No single point of failure
• Transparent failover
• No lost transactions
• No duplicates (or other constraint violations)
• Postgres-BDR can successfully replace Oracle RAC
Single Site
Dual Site Availability
• Need Disaster Recovery → Dual site
• Proven and fast failover → Active site
• Latency → Asynchronous replication
→ Dual site Active-Active Asynchronous replication
Dual site
Dual site topology
[Diagram: one BDR Group spanning Site A and Site B. Each site has HAProxy and Application in front of a node pair replicating with Sync (CAMO); replication between the two sites is Async.]
• CAP Theorem → Availability vs Consistency
• High-Availability → Eventual consistency
• Conflict detection and resolution (timestamp based, CRDTs)
Active-Active
Timestamp based CDR
Site A                     | Site B
T1: John Doe, status A, …  | T2: John Doe, status B, …
T3: John Doe, status B, …  | T3: John Doe, status B, …
T1 < T2                    | T1 < T2
Convergence
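A minimal sketch of timestamp-based resolution, assuming each row version carries a commit timestamp and that timestamps are distinct. The `Version` type and `resolve_lww` function are hypothetical names, not part of BDR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    status: str
    timestamp: int  # commit timestamp; a larger value means later


def resolve_lww(local: Version, remote: Version) -> Version:
    """Keep whichever row version has the later commit timestamp."""
    return remote if remote.timestamp > local.timestamp else local


site_a = Version(status="A", timestamp=1)  # T1
site_b = Version(status="B", timestamp=2)  # T2

# Each site receives the other's change and resolves the conflict on its own.
on_site_a = resolve_lww(local=site_a, remote=site_b)
on_site_b = resolve_lww(local=site_b, remote=site_a)
print(on_site_a.status, on_site_b.status)  # B B
```

Both sites reach the same answer without coordinating, which is what makes the rule converge. With equal timestamps this sketch would keep the local version on each side, so a real implementation needs a deterministic tie-break.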
Disjoint updates
Site A                                   | Site B
T1: John Doe, new status, old address    | T2: John Doe, old status, new address
T3: John Doe, old status, new address    | T3: John Doe, old status, new address
T1 < T2                                  | T1 < T2
Discarded update
Column level conflict resolution
Site A                                   | Site B
T1: John Doe, new status, old address    | T2: John Doe, old status, new address
T3: John Doe, new status, new address    | T3: John Doe, new status, new address
apply column update                      | apply column update
Both updates retained
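Column-level resolution can be sketched by tracking a timestamp per column instead of per row. The row layout and the `merge_columns` function below are assumptions made for illustration, not how BDR stores this internally.

```python
def merge_columns(local: dict, remote: dict) -> dict:
    """Merge two row versions column by column; the later change wins per column.

    Each row maps column name -> (value, timestamp of that column's last change).
    """
    merged = {}
    for column in local:
        merged[column] = max(local[column], remote[column], key=lambda cell: cell[1])
    return merged


# Timestamp 0 is the original row. Site A changes the status at T1,
# Site B changes the address at T2.
site_a = {"status": ("new status", 1), "address": ("old address", 0)}
site_b = {"status": ("old status", 0), "address": ("new address", 2)}

row = merge_columns(site_a, site_b)
print({column: value for column, (value, _) in row.items()})
# {'status': 'new status', 'address': 'new address'}
```

With a single row-level timestamp, Site A's status change would lose to Site B's later row, which is the discarded update on the previous slide.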
Amounts
Site A                   | Site B
T0: John Doe, $1000, …   | T0: John Doe, $1000, …
+$200                    | -$100
T1: John Doe, $1200, …   | T2: John Doe, $900, …
T1 < T2                  | T1 < T2
T3: John Doe, $900, …    | T3: John Doe, $900, …
Convergence, but wrong
Where’s my money?
Amounts with CRDTs
Site A                   | Site B
T0: John Doe, $1000, …   | T0: John Doe, $1000, …
+$200                    | -$100
T1: John Doe, $1200, …   | T2: John Doe, $900, …
-$100                    | +$200
T3: John Doe, $1100, …   | T3: John Doe, $1100, …
Convergence, and correct
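The difference from timestamp-based resolution is that the change is replicated, not the resulting value. A minimal sketch of that idea follows; the `DeltaBalance` class and its `outbox` are hypothetical, and this is not BDR's CRDT implementation.

```python
class DeltaBalance:
    """Balance that replicates each change (delta) instead of the new value."""

    def __init__(self, initial: int):
        self.value = initial
        self.outbox = []  # local deltas to send to the other site

    def update(self, delta: int) -> None:
        self.value += delta
        self.outbox.append(delta)

    def receive(self, deltas: list) -> None:
        # Addition is commutative, so the order of arrival does not matter.
        for delta in deltas:
            self.value += delta


site_a, site_b = DeltaBalance(1000), DeltaBalance(1000)  # T0
site_a.update(+200)  # T1: $1200 on Site A
site_b.update(-100)  # T2: $900 on Site B

# T3: the sites exchange their deltas.
site_a.receive(site_b.outbox)
site_b.receive(site_a.outbox)
print(site_a.value, site_b.value)  # 1100 1100
```

Neither update is discarded, so both sites converge on $1100 rather than on whichever value was written last.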
Balances/Limits
Site A                   | Site B
T0: John Doe, $200, …    | T0: John Doe, $200, …
-$100                    | -$200
T1: John Doe, $100, …    | T2: John Doe, $0, …
-$200                    | -$100
T3: John Doe, -$100, …   | T3: John Doe, -$100, …
Convergence, and correct
Compromise
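The compromise is easier to see when each site checks the limit against its own copy of the balance. The sketch below assumes a simple "no overdraft" rule enforced locally; `try_debit` is a hypothetical helper.

```python
def try_debit(balance: int, amount: int) -> tuple:
    """Apply a debit only if the locally visible balance covers it.

    Returns the new local balance and the delta to replicate (0 if rejected).
    """
    if amount > balance:
        return balance, 0
    return balance - amount, -amount


start = 200  # T0 on both sites
site_a, delta_a = try_debit(start, 100)  # T1: accepted, Site A shows $100
site_b, delta_b = try_debit(start, 200)  # T2: accepted, Site B shows $0

# T3: each site applies the other's delta.
site_a += delta_b
site_b += delta_a
print(site_a, site_b)  # -100 -100
```

Both debits pass their local check, and the merged balance is arithmetically correct, yet the account ends up overdrawn. Preventing that would require coordination between sites before accepting the debit.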
• We need strict global constraints
o E.g. bank liquidity
• Hard problem… for some other time… ☺
Strict limits
• Dual site multi-master Active-Active Postgres deployment using BDR
• Fully redundant and fully consistent single site
• Strong eventual consistency across sites
• 5 9’s availability
• Postgres-BDR can successfully replace Oracle RAC + GoldenGate
Conclusion
QUESTIONS?
ADRIAN DOZSA
linkedin.com/in/dozsa
THANK YOU
Takeaway
You can bank on Postgres ☺