Servers, storage, and networking have all been virtualized; the next big wave is the database. SQL databases are the one thing in the cloud that still requires single dedicated instances. Database virtualization changes this, enabling full elasticity without sacrificing functionality.
Database Virtualization: The Next Wave of Big Data
Mike Hogan, CEO
Agenda
• Big Data: A Moving Target
• Common Understanding of Virtualization
• Database Virtualization Challenge
• Alternative 1: NoSQL
• Alternative 2: Sharding
• Introducing Database Virtualization
• Narrowing the Gap Between Databases and Big Data
Big Data: A Moving Target
• Definition: Too much data to handle in a traditional database
• Big Data tools leverage scale-out architectures, e.g., Hadoop
• Technology advances make Big Data a moving target
• Databases adopting scale-out, virtual database architectures
[Figure: data volume plotted against time, with a region labeled "BIG Data"]
© Copyright 2013 ScaleDB. The information contained herein is subject to change without notice.
What is Database Virtualization?
The Dedicated Server
[Figure: utilization of a dedicated server. Labels: "Server Utilization (Average 10%)", "Usage Spike", "Headroom (to avoid failure)"]
The Virtualized App Server
Shared among many customers
Plenty of room for usage peaks
Virtualization enables Cloud Providers to sell 3-4 TIMES more servers than they actually own. This is how they make money.
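The arithmetic behind this claim can be sketched as follows. This is an illustration only, assuming each virtual server averages the 10% utilization shown on the previous slide, that loads simply add up, and that usage peaks rarely coincide. The function names are hypothetical.

```python
def average_load(tenants, avg_utilization):
    """Expected average load on one physical server shared by `tenants`
    virtual servers (simplifying assumption: loads are additive)."""
    return tenants * avg_utilization


def headroom(tenants, avg_utilization):
    """Fraction of the physical server left over to absorb usage peaks."""
    return 1.0 - average_load(tenants, avg_utilization)


# 1 tenant is the dedicated server; 3 and 4 match the "3-4 times" claim.
for tenants in (1, 3, 4):
    load = average_load(tenants, 0.10)
    room = headroom(tenants, 0.10)
    print(f"{tenants} tenant(s): average load {load:.0%}, headroom {room:.0%}")
```

Under these assumptions, even four virtual servers on one physical machine leave more than half of its capacity free for spikes.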
Database Virtualization Challenges
• No coordination between databases (data & locking)
[Figure: You and the Bank. Bank Balance = $10M. Requests: Withdraw $10M, Wire $8M, Wire $8M. Resulting Bank Balance = -$16M]
• Requires a distributed locking solution
• Distributed locking is fairly easy to build…
• …but building it to perform well is extremely hard
• It took Oracle RAC 10 years …70 “cloud years”
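The bank example above can be reproduced in a few lines. This is a single-process sketch, not a distributed lock manager: `threading.Lock` merely stands in for the distributed lock, and the function names are hypothetical. Amounts are in millions of dollars.

```python
import threading


def run_uncoordinated(start_balance, requests):
    """Each database node checks the same stale copy of the balance,
    so every request that fits the stale balance is approved."""
    stale_view = start_balance
    approved = [amount for amount in requests if amount <= stale_view]
    return start_balance - sum(approved), approved


def run_with_lock(start_balance, requests):
    """Requests check and debit the balance while holding a shared lock,
    so each one sees the result of the previous one."""
    lock = threading.Lock()  # stand-in for a distributed lock
    balance = start_balance
    approved = []
    for amount in requests:
        with lock:
            if amount <= balance:
                balance -= amount
                approved.append(amount)
    return balance, approved


requests = [10, 8, 8]  # withdraw $10M, wire $8M, wire $8M
print(run_uncoordinated(10, requests))
print(run_with_lock(10, requests))
```

The sketch shows why the lock is needed, not how to make it fast. The hard part named on the slide, making the lock perform well across servers, is outside its scope.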
Alternative 1: NoSQL
Elasticity enables you to burst across servers, so you can run them at high utilization
Alternative 1: NoSQL
Moves functionality to the application tier…more work for you
[Figure: SQL and NoSQL stacks side by side, each with an App above the DBMS. Labels: "Your Application", "The DBMS", "You buy this part", "You build & maintain this part"]
Cons:
1. Non-relational (build this into your app)
2. Reduces consistency: different users/different answers
3. Removes transactions (build this into your app)
4. Less functionality, e.g. joins (build these into your app)
Pros:
1. Scalability
2. Elastic = high utilization
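To make "build these into your app" concrete, here is a minimal sketch of a join done in application code, the work a SQL database would do for a query joining orders to customers. The record layout and function name are hypothetical, and the two lists stand in for results fetched separately from a NoSQL store.

```python
def join_orders_to_customers(customers, orders):
    """Hash join in application code: index customers by id, then look up
    each order's customer. Orders with no matching customer are skipped,
    as an inner join would do."""
    customers_by_id = {customer["id"]: customer for customer in customers}
    joined = []
    for order in orders:
        customer = customers_by_id.get(order["customer_id"])
        if customer is not None:
            joined.append({
                "order_id": order["id"],
                "customer": customer["name"],
                "total": order["total"],
            })
    return joined


customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
orders = [
    {"id": 100, "customer_id": 1, "total": 25},
    {"id": 101, "customer_id": 3, "total": 40},  # no such customer
]
print(join_orders_to_customers(customers, orders))
```

Every query shape that needs a join requires code like this, which the application team then has to test and maintain.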
Alternative 2: SQL Sharding
[Figure: sharded servers. Labels: "Masters", "Slaves"]
EACH server must handle the peak for ITS data
Cons:
1. Not elastic = no bursting across servers
2. Rigid partitioning model
3. Requires slaves for fail-over (vs. high-availability)
4. You have to build & maintain routing code
Pros:
1. Relational
2. Consistent data (ACID)
3. Transactional
4. Full functionality
No elasticity means no bursting across servers, requiring low utilization.
Not highly available; relies on fail-over
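The routing code a sharded application has to carry might look like the sketch below. It assumes simple hash-modulo partitioning, which is only one of several possible schemes, and the shard names and function names are hypothetical. It also illustrates the rigid partitioning model: adding a shard changes where many existing keys live.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical shard names


def shard_for(key, shards=SHARDS):
    """Pick a shard by hashing the key and taking it modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def keys_moved(keys, old_shards, new_shards):
    """Count keys whose shard changes when the shard list changes."""
    return sum(
        1 for key in keys
        if shard_for(key, old_shards) != shard_for(key, new_shards)
    )


keys = [f"customer-{i}" for i in range(1000)]
moved = keys_moved(keys, SHARDS, SHARDS + ["shard-3"])
print(f"{moved} of {len(keys)} keys change shard when a fourth shard is added")
```

Each key always maps to exactly one server, so that server alone must absorb any peak on its data.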
Introducing Database Virtualization
Highly available data tier shared across multiple database clusters
[Figure: Database Tier (CPU) above Storage Tier (I/O)]
Virtualizes & Shares Storage Tier across Elastic Database Clusters
Shared among many customers
Plenty of room for usage peaks
Pros:
1. Relational
2. Consistent data (ACID)
3. Transactional
4. Full functionality
5. Elastic
6. No slaves
Introducing Database Virtualization
Processed at the storage tier; only results are sent back to the database
[Figure: Database Tier (CPU) above Storage Tier (I/O)]
Distributed Parallel Process Across Storage Servers
Query: What were my sales last month?
• Distributed Parallel Processing: Similar to Map-Reduce & Oracle Exadata
• This Narrows the Gap between Databases and Big Data
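The idea of processing at the storage tier and returning only results can be sketched in the map-reduce style the slide mentions. This is not ScaleDB's implementation: plain Python lists stand in for the rows held by each storage server, threads stand in for the servers, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sales(rows, month):
    """Runs on one storage server: filter and sum locally, so only a
    single number travels back to the database tier."""
    return sum(row["amount"] for row in rows if row["month"] == month)


def total_sales(storage_nodes, month):
    """Runs on the database tier: fan the query out to every storage
    server in parallel, then combine the partial results.
    Assumes at least one storage node."""
    with ThreadPoolExecutor(max_workers=len(storage_nodes)) as pool:
        partials = list(
            pool.map(lambda rows: partial_sales(rows, month), storage_nodes)
        )
    return sum(partials)


storage_nodes = [
    [{"month": "2013-05", "amount": 100}, {"month": "2013-04", "amount": 50}],
    [{"month": "2013-05", "amount": 25}],
]
print(total_sales(storage_nodes, "2013-05"))
```

The database tier never sees the individual rows, only one partial sum per storage server.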
Database Virtualization Enables DBaaS
Processing shared across database nodes
Highly available data tier shared across multiple database clusters
[Figure: Database Tier (CPU) above Storage Tier (I/O)]
Virtualizes & Shares Storage Tier across Elastic Database Clusters
Cloud Computing’s Enabling Technologies
Server
• Server Virtualization
• VMware, Citrix
Storage
• Storage Virtualization
• EMC, NetApp, IBM, Dell, HP
Network
• Network Virtualization
• Cisco, VMware, Oracle
DBMS
• Database Virtualization
• ScaleDB
How About Performance?
Performance: ScaleDB vs. InnoDB
Performance tests running on DL380 servers, large data set
[Bar chart: operations per second, axis 0 to 2500]
• MariaDB + InnoDB: 550
• ScaleDB 1-Node: 1238
• ScaleDB 2-Nodes: 1884
• ScaleDB 3-Nodes: 2236
Benchmark Details: YCSB Workload A, 1:1 Read/Write Ratio, Database Size: 200M Rows, MariaDB V5.3.5
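The figures on this slide are easier to compare as ratios against the MariaDB + InnoDB baseline. The sketch below only does that arithmetic. It assumes the four values (550, 1238, 1884, 2236) pair with the four bars in the order they are listed, and the function name is hypothetical.

```python
def speedup(baseline_ops, results):
    """Return each result as a multiple of the baseline throughput."""
    return {name: ops / baseline_ops for name, ops in results.items()}


BASELINE_OPS = 550  # MariaDB + InnoDB, operations per second
SCALEDB_OPS = {"1 node": 1238, "2 nodes": 1884, "3 nodes": 2236}

for name, factor in speedup(BASELINE_OPS, SCALEDB_OPS).items():
    print(f"ScaleDB {name}: {factor:.2f}x the baseline")
```

The same function applies to the other benchmark slides by substituting their baseline and ScaleDB values.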
Performance: ScaleDB vs. InnoDB
Performance tests running on HP Cloud (Read:Write Ratio = 1:1)
[Bar chart: operations per second, axis 0 to 5000]
• MySQL + InnoDB: 544
• ScaleDB 1-Node: 3542
• ScaleDB 2-Nodes: 4668
Benchmark Details: YCSB Workload A, 1:1 Read/Write Ratio, Database Size: 40M Rows, MySQL V5.1.42
Performance: ScaleDB vs. InnoDB
Performance tests running on HP Cloud (Read-Only)
[Bar chart: operations per second, axis 0 to 12000]
• MySQL + InnoDB: 930
• ScaleDB 1-Node: 6117
• ScaleDB 2-Nodes: 11920
Benchmark Details: YCSB Workload A, 1:0 Read/Write Ratio, Database Size: 40M Rows, MySQL V5.1.42
Performance: ScaleDB vs. InnoDB
Sysbench benchmark running on HP Cloud (Read-Only)
[Bar chart: transactions per second, axis 0 to 250]
• MySQL + InnoDB: 7
• ScaleDB 1-Node: 134
• ScaleDB 2-Nodes: 250
Benchmark Details: Sysbench, Read-Only, Database Size: 500M Rows, MySQL V5.1.42
Performance: ScaleDB vs. InnoDB
Sysbench benchmark running on HP Cloud (10% Write)
[Bar chart: transactions per second, axis 0 to 80]
• MySQL + InnoDB: 3
• ScaleDB 1-Node: 50
• ScaleDB 2-Nodes: 79
Benchmark Details: Sysbench, 10% Write, Database Size: 500M Rows, MySQL V5.1.42
Summary
• Database Scale-out & Parallelization Address Big Data
• Scaling-out SQL Database Problem: Distributed Locking
• Alternative 1: NoSQL
• Alternative 2: Sharding
• Both Shift Functionality to the Application Tier
• Introducing Database Virtualization…with Performance!
• Closing the Gap Between Databases and Big Data
Thank You