Exploiting Distributed Version Concurrency in a Transactional Memory Cluster
Kaloian Manassiev, Madalin Mihailescu and Cristiana Amza
University of Toronto, Canada
Transactional Memory Programming Paradigm
  Each thread executing a parallel region:
    Announces start of a transaction
    Executes operations on shared objects
    Attempts to commit the transaction
  If no data race, commit succeeds, operations take effect
  Otherwise commit fails, operations discarded, transaction restarted
  Simpler than locking!
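The begin / execute / commit-or-restart cycle above can be illustrated with a minimal C++ sketch. It is a single-word analogue built on compare-and-swap, not the DTM algorithm; the function name run_transaction is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Runs `body` as an optimistic "transaction" on one shared integer and
// returns the number of attempts that were needed.
// Assumption: a single atomic word stands in for the shared objects.
int run_transaction(std::atomic<int>& shared,
                    const std::function<int(int)>& body) {
    int attempts = 0;
    for (;;) {
        ++attempts;
        int snapshot = shared.load();   // start: take a private snapshot
        int result = body(snapshot);    // execute on the private copy
        // Commit attempt: succeeds only if `shared` still holds the snapshot.
        if (shared.compare_exchange_strong(snapshot, result)) {
            return attempts;            // operations take effect
        }
        // Commit failed: the result is discarded and the loop restarts.
    }
}
```

A body that runs without interference commits on its first attempt; if another writer changes the value in between, the attempt is discarded and the body is re-executed.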
Transactional Memory
  Used in multiprocessor platforms
  Our work: the first TM implementation on a cluster
    Supports both SQL and parallel scientific applications (C++)
TM in a Multiprocessor Node
  Multiple physical copies of data
  High memory overhead
[Figure: object A and a copy of A; T1: Read(A) and T2: Write(A); T1: Active, T2: Active.]
TM on a Cluster
Key Idea 1: Distributed Versions
  Different versions of data arise naturally in a cluster
  Create new version on different node, others read own versions
[Figure: one node writes while the other nodes read.]
Exploiting Distributed Page Versions
[Figure: Distributed Transactional Memory (DTM): nodes mem0/txn0, mem1/txn1, mem2/txn2, ..., memN/txnN connected by a network and labeled with versions v3, v2, v1, v0.]
Key Idea 2: Concurrent “Snapshots” Inside Each Node
[Figure sequence: inside one node, Txn0 (v1) and Txn1 (v2) read concurrently; individual pages are held at v1 or v2.]
Distributed Transactional Memory
  A novel fine-grained distributed concurrency control algorithm
    Low memory overhead
    Exploits distributed versions
    Supports multithreading within the node
    Provides 1-copy serializability
Outline
  Programming Interface
  Design
    Data access tracking
    Data replication
    Conflict resolution
  Experiments
  Related work and Conclusions
Programming Interface
  init_transactions()
  begin_transaction()
  allocate_dtmemory()
  commit_transaction()
  Need to declare TM variables explicitly
Data Access Tracking
  DTM traps reads and writes to shared memory by either one of:
    Virtual memory protection
      Classic page-level memory protection technique
    Operator overloading in C++
      Trapping reads: conversion operator
      Trapping writes: assignment ops (=, +=, …) & increment/decrement (++/--)
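The operator-overloading approach can be sketched as a small wrapper class. This is an illustration only: TrackedInt and AccessLog are hypothetical names, and the counters stand in for whatever hooks the DTM runtime actually invokes on each access.

```cpp
#include <cassert>

// Hypothetical counters standing in for the runtime's access-tracking hooks.
struct AccessLog {
    int reads = 0;
    int writes = 0;
};

// A tracked integer: every read and write passes through an overloaded operator.
class TrackedInt {
public:
    explicit TrackedInt(AccessLog& log, int initial = 0)
        : log_(log), value_(initial) {}

    // Trapping reads: implicit conversion to int.
    operator int() const { ++log_.reads; return value_; }

    // Trapping writes: plain assignment.
    TrackedInt& operator=(int v) { ++log_.writes; value_ = v; return *this; }
    // Assignment from another tracked value: one read of `other`, one write here.
    TrackedInt& operator=(const TrackedInt& other) {
        return *this = static_cast<int>(other);
    }
    // Compound assignment and increment/decrement both read and write the value.
    TrackedInt& operator+=(int v) { touch(); value_ += v; return *this; }
    TrackedInt& operator++() { touch(); ++value_; return *this; }
    TrackedInt& operator--() { touch(); --value_; return *this; }

private:
    void touch() { ++log_.reads; ++log_.writes; }
    AccessLog& log_;
    int value_;
};
```

Code that uses such a type reads like ordinary integer arithmetic, which is presumably why the interface asks for TM variables to be declared explicitly with wrapper types.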
Data Replication
[Figure: two nodes, each holding Page 1, Page 2, ..., Page n; update transaction T1(UPDATE) runs on one of them.]
Twin Creation
[Figure sequence: T1(UPDATE) writes p1 and a P1 Twin is created; T1 then writes p2 and a P2 Twin is created.]
Diff Creation
[Figure: T1(UPDATE) and the pages Page 1, Page 2, ..., Page n on both nodes at diff creation.]
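Twin and diff creation can be sketched as follows. The slides do not give the diff encoding, so the run-based byte encoding below is an assumption in the style of page-based distributed shared memory systems; all names (Page, DiffRun, make_twin, make_diff, apply_diff) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Page = std::vector<std::uint8_t>;

// One run of modified bytes: where it starts and the new contents.
struct DiffRun {
    std::size_t offset;
    std::vector<std::uint8_t> bytes;
};
using Diff = std::vector<DiffRun>;

// Twin creation: copy the page before the first write touches it.
Page make_twin(const Page& page) { return page; }

// Diff creation: compare the modified page against its twin and
// record only the byte runs that changed.
Diff make_diff(const Page& twin, const Page& page) {
    assert(twin.size() == page.size());
    Diff diff;
    std::size_t i = 0;
    while (i < page.size()) {
        if (twin[i] == page[i]) { ++i; continue; }
        DiffRun run{i, {}};
        while (i < page.size() && twin[i] != page[i]) {
            run.bytes.push_back(page[i++]);
        }
        diff.push_back(std::move(run));
    }
    return diff;
}

// Applying a diff on another node brings its copy of the page up to date.
void apply_diff(Page& page, const Diff& diff) {
    for (const DiffRun& run : diff) {
        for (std::size_t k = 0; k < run.bytes.size(); ++k) {
            page[run.offset + k] = run.bytes[k];
        }
    }
}
```

Sending only the changed runs rather than whole pages keeps the commit-time broadcast small when a transaction touches a few bytes per page.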
Broadcast of the Modifications at Commit
[Figure: T1(UPDATE) broadcasts its diffs tagged version 8; Latest Version = 7 on both nodes; earlier diffs v1 and v2 are already queued on pages.]
Other Nodes Enqueue Diffs
[Figure: the receiving node enqueues the v8 diffs on the affected pages, behind the queued v1 and v2 diffs; Latest Version = 7 on both nodes.]
Update Latest Version
[Figure: the receiving node sets Latest Version = 8; T1's node still has Latest Version = 7.]
Other Nodes Acknowledge Receipt
[Figure: the receiving node (Latest Version = 8) sends Ack (vers 8) to T1's node (Latest Version = 7).]
T1 Commits
[Figure: T1(UPDATE) commits; Latest Version = 8 on both nodes.]
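The commit sequence shown on these slides (broadcast, enqueue, version update, acknowledgment, commit) can be sketched as a single-process simulation. Direct function calls stand in for network messages, and the names Node, receive_diffs and commit are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct Node {
    int latest_version = 0;
    // Page id -> versions of diffs received but not yet applied.
    std::map<int, std::vector<int>> pending;

    // Handle a diff flush: enqueue the diffs, bump the latest version,
    // and return the acknowledged version.
    int receive_diffs(int version, const std::vector<int>& pages) {
        for (int page : pages) pending[page].push_back(version);
        latest_version = version;
        return version;
    }
};

// Commit an update transaction that wrote `written_pages` on `committer`.
// Returns the version assigned to the commit.
int commit(Node& committer, const std::vector<Node*>& others,
           const std::vector<int>& written_pages) {
    const int version = committer.latest_version + 1;
    std::size_t acks = 0;
    for (Node* node : others) {
        // Broadcast the diffs; each receiver enqueues them and acks.
        if (node->receive_diffs(version, written_pages) == version) ++acks;
    }
    // In this synchronous sketch every receiver acknowledges immediately.
    assert(acks == others.size());
    committer.latest_version = version;  // the transaction commits
    return version;
}
```

Note that receivers only enqueue the diffs here; nothing is applied to page contents at commit time, which matches the lazy application described on the following slides.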
Lazy Diff Application
[Figure sequence: one node's pages with their queues of pending diffs; Latest Version = 8.]
  Initially: Page 1 at V0 with pending diffs V1, V2, V8; Page 2 at V0 with pending diffs V1, V8; Page N at V3 with pending diffs V4, V5
  T2(V2): Rd(…, P1, P2) applies only the diffs up to V2: Page 1 moves to V2 and Page 2 to V1, each with V8 still pending
  T3(V8): Rd(PN) applies V4 and V5: Page N moves to V5
Waiting Due to Conflict
[Figure: T3(V8): Rd(PN, P2) needs Page 2 at V8, but Page 2 is at V1 (V8 pending) and in use by T2(V2): Rd(…, P1, P2); Latest Version = 8.]
  Wait until T2 commits
Transaction Abort Due to Conflict
[Figure sequence: Page 2 starts at V0 with pending diffs V1, V8; T3(V8): Rd(P2) brings Page 2 to V8; T2(V2): Rd(…, P1, P2) then finds Page 2 newer than its version: CONFLICT!; Latest Version = 8.]
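Lazy diff application and the abort case can be sketched together. Each page keeps a sorted queue of pending diff versions; a read at transaction version V applies only the diffs up to V, and reports a conflict when the page is already newer than V. The waiting policy ("wait until T2 commits") is not modeled, and PageState, ReadResult and read_page are hypothetical names.

```cpp
#include <cassert>
#include <set>

struct PageState {
    int version = 0;         // version of the contents currently in memory
    std::set<int> pending;   // versions of queued diffs, in ascending order
};

enum class ReadResult { Ok, Conflict };

// Read `page` on behalf of a transaction running at `txn_version`.
ReadResult read_page(PageState& page, int txn_version) {
    // The page was already upgraded past this transaction's version:
    // the required snapshot is gone, so the reader has to abort.
    if (page.version > txn_version) return ReadResult::Conflict;

    // Apply, in order, only the diffs the transaction is allowed to see.
    auto it = page.pending.begin();
    while (it != page.pending.end() && *it <= txn_version) {
        page.version = *it;            // stand-in for applying the diff bytes
        it = page.pending.erase(it);
    }
    return ReadResult::Ok;
}
```

With the page states from the slides, a read at V2 leaves Page 1 at V2 and Page 2 at V1 with V8 still queued, while a later read of Page 2 at V8 followed by another read at V2 yields a conflict.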
Write-Write Conflict Resolution
  Can be done in two ways:
    Executing all updates on a master node, which enforces serialization order
    OR
    Aborting the local update transaction upon receiving a conflicting diff flush
  More on this in the paper
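The second option can be sketched as a check performed when a diff flush arrives. The slides leave the details to the paper, so the page-granularity overlap test below is an assumption, and LocalUpdateTxn and on_diff_flush are hypothetical names.

```cpp
#include <cassert>
#include <set>

struct LocalUpdateTxn {
    std::set<int> write_set;   // ids of pages this transaction has written
    bool aborted = false;
};

// Called when a diff flush from another node arrives.
// Returns true if the local update transaction is (now) aborted.
bool on_diff_flush(LocalUpdateTxn& txn, const std::set<int>& flushed_pages) {
    for (int page : flushed_pages) {
        // Assumed conflict rule: the incoming diffs touch a page that the
        // local transaction has also written.
        if (txn.write_set.count(page) != 0) {
            txn.aborted = true;
            break;
        }
    }
    return txn.aborted;
}
```

A flush touching pages outside the local write set leaves the transaction running; a flush that overlaps it marks the transaction aborted so that it can be restarted.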
Experimental Platform
  Cluster of Dual AMD Athlon Computers
    512 MB RAM
    1.5GHz CPUs
    RedHat Fedora Linux OS
Benchmarks for Experiments
  TPC-W e-commerce benchmark
    Models an on-line book store
    Industry-standard workload mixes
      Browsing (5% updates)
      Shopping (20% updates)
      Ordering (50% updates)
    Database size of ~600MB
  Hash-table micro-benchmark (in paper)
Application of DTM for E-Commerce
[Figure: customers reach web servers over the Internet (HTTP); web servers call app servers (RPC); app servers query a single database (SQL).]
Application of DTM for E-Commerce
  We use a Transactional Memory Cluster as the DB Tier
[Figure: customers (HTTP), web servers, app servers (RPC), and a cluster of DB servers (SQL) in place of the single database.]
Cluster Architecture
[Figure: a scheduler in front of the MySQL in-memory tier (one master and several slaves), with MMAP on-disk databases.]
Implementation Details
  We use MySQL’s in-memory HEAP tables
    RB-Tree main-memory index
    No transactional properties
      Provided by inserting TM calls
  Multiple threads running on each node
Baseline for Comparison
  State-of-the-art Conflict-aware protocol for scaling e-commerce on clusters
    Coarse grained (per-table) concurrency control
    (USITS’03, Middleware’03)
Throughput Scaling
[Figure: throughput (WIPS, 0 to 350) versus # of slave replicas (0 to 8) for the Ordering, Shopping and Browsing mixes.]
Fraction of Aborted Transactions
# of slaves Ordering Shopping Browsing
1 1.15% 1.44% 0.63%
2 0.35% 2.27% 1.34%
4 0.07% 1.70% 2.37%
6 0.02% 0.41% 2.07%
8 0.00% 0.22% 1.59%
Comparison (browsing)
[Figure: throughput (WIPS, 0 to 350) versus number of replicas (0 to 8), Conflict-Aware versus DTM.]
Comparison (shopping)
[Figure: throughput (WIPS, 0 to 350) versus number of replicas (0 to 8), Conflict-Aware versus DTM.]
Comparison (ordering)
[Figure: throughput (WIPS, 0 to 200) versus number of replicas (0 to 8), Conflict-Aware versus DTM.]
Related Work
  Distributed concurrency control for database applications
    Postgres-R(SI), Wu and Kemme (ICDE’05)
    Ganymed, Plattner and Alonso (Middleware’04)
  Distributed object stores
    Argus (’83), QuickStore (’94), OOPSLA’03
  Distributed Shared Memory
    TreadMarks, Keleher et al. (USENIX’94)
    Tang et al. (IPDPS’04)
Conclusions
  New software-only transactional memory scheme on a cluster
    Both strong consistency and scaling
  Fine-grained distributed concurrency control
    Exploits distributed versions, low memory overheads
  Improved throughput scaling for e-commerce web sites
Questions?
Backup slides
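Example Program

    #include <stdlib.h>      /* rand() */
    #include <dtm_types.h>   /* dtm_int and the transaction calls */

    typedef struct Point {
        dtm_int x;
        dtm_int y;
    } Point;

    int main(void) {
        init_transactions();
        for (int i = 0; i < 10; i++) {
            begin_transaction();
            /* Allocate a Point in transactional memory and fill it in. */
            Point *p = allocate_dtmemory();
            p->x = rand();
            p->y = rand();
            commit_transaction();
        }
        return 0;
    }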
Query weights
[Figure: bar chart (0.0 to 1.0) of Writes versus Reads for OrdIdx(0.35), ShpIdx(0.1), BrwIdx(0.03), Ord,NoIdx(0.26), Shp,NoIdx(0.07), Brw,NoIdx(0.02).]
Decreasing the fraction of aborts
[Figure: bar chart of the fraction of aborts (0.00% to 3.00%) for M + 2S, M + 4S, M + 6S and M + 8S, each with and without conflict reduction. Bar labels: 1.34%, 2.34%, 2.37%, 2.68%, 2.07%, 2.83%, 1.59%, 1.34%.]
Micro benchmark experiments
[Figure: throughput (x 1000, 0 to 1200) versus number of machines (1 to 10) for the series 1%, 5%, 10%, 15%, 20%.]
Micro benchmark experiments (with read-only optimization)
[Figure: throughput (x 1000, 0 to 500) versus number of machines (1 to 10), R/O Opt versus Base.]
Fraction of aborts
# of machines 1 2 4 6 8 10
% aborts 0 0.57 1.69 2.94 4.05 5.08