Contents
I. Why NewSQL?
II. NewSQL Basic Concepts
III. Types of NewSQL
IV. NewSQL Summary
Why NewSQL?
Thinking – Extreme Data
Qcon London 2012
Source: Netflix in the Cloud (http://www.slideshare.net/adrianco/netflix-in-the-cloud-2011)
Thinking - Traffic Explosion
QCon London 2012
Organizations need deeper insights
Solutions
□Buy high-end technology
□Hire more developers
□Using NoSQL
□Using NewSQL
Solution – Buy High End Technology
Oracle, IBM
Solution – Hire more developers
http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_4_series/madone_4_5
□Application Level Sharding
□Build your replication middleware
□…
Solutions – Use NoSQL
□New non-relational databases
□Distributed architecture
□Horizontal scalability
□No fixed table schema
□Typically no JOIN, UPDATE, or DELETE operations
□Typically no transactions
□Typically no SQL support
NoSQL Ecosystems
451 group
MongoDB
□Document-oriented database
  - JSON-style documents: lists, maps, primitives
  - Schema-less
□Transaction = update of a single document
□Rich query language for dynamic queries
□Tunable writes: speed vs. reliability
□Highly scalable and available
MongoDB Use Cases
□Use cases
  - High-volume writes
  - Complex data
  - Semi-structured data
□Major customers
  - Foursquare, Bit.ly, Intuit, SourceForge, NY Times, GILT Groupe, Evite, SugarCRM
Apache Cassandra
□Column-oriented database / extensible row store
  - Think of a row as roughly a java.util.SortedMap
□Transaction = update of a row
□Fast writes = append to a log
□Tunable reads/writes: consistency vs. availability
□Extremely scalable
  - Transparent and dynamic clustering
  - Rack- and datacenter-aware data replication
□CQL = "SQL"-like DDL and DML
Apache Cassandra Use Cases
□Use cases
  - Big data
  - Multi-datacenter distributed database
  - Persistent cache
  - (Write-intensive) logging
  - High availability (writes)
□Major customers
  - Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, SimpleGeo, Ooyala, OpenX
□"The largest production cluster has over 100 TB of data in over 150 machines." – Cassandra web site
Solutions – Use NewSQL
□New relational databases
□Retain SQL and ACID transactions
□New and improved distributed architecture
□Deliver strong scalability and performance
□NewSQL vendors: ScaleDB, NimbusDB, ..., VoltDB
http://www.cs.brown.edu/courses/cs227/slides/newsql/newsql-intro.pdf
NewSQL Definition – Wikipedia
NewSQL is a class of modern relational
database management systems that seek
to provide the same scalable performance
of NoSQL systems for OLTP workloads while
still maintaining the ACID guarantees of a
traditional single-node database system
http://en.wikipedia.org/wiki/NewSQL
NewSQL Definition – 451 Group
A DBMS that delivers the scalability and
flexibility promised by NoSQL while retaining
the support for SQL queries and/or ACID, or
to improve performance for appropriate
workloads.
http://www.cs.brown.edu/courses/cs227/slides/newsql/newsql-intro.pdf
NewSQL Definition – Stonebraker
SQL as the primary interface.
ACID support for transactions
Non-locking concurrency control.
High per-node performance.
Parallel, shared-nothing architecture.
http://www.cs.brown.edu/courses/cs227/slides/newsql/newsql-intro.pdf
NewSQL Categories
□New Database
□New MySQL Storage Engines
□Transparent Clustering
OSBC
The evolving database landscape
MySQL Ecosystem
NewSQL Ecosystem
New Database
□Newly designed from scratch to achieve scalability and performance
  - A key consideration in improving performance is making non-disk (memory) or new kinds of disks (flash/SSD) the primary data store.
  - Some (hopefully minor) changes to the code will be required, and data migration is still needed.
□Solutions
  - Software-only: VoltDB, NuoDB, Drizzle, Google Spanner
  - Supported as an appliance: Clustrix, Translattice
http://www.linuxforu.com/2012/01/newsql-handle-big-data/
New MySQL Storage Engines
□Highly optimized storage engines for MySQL
□Scale better than built-in engines such as InnoDB
  - Upside: keeps the MySQL interface
  - Downside: data must be migrated from other databases
□Solutions
  - TokuDB, MemSQL, Xeround, Akiban, NDB
http://www.linuxforu.com/2012/01/newsql-handle-big-data/
Transparent Clustering
□Retain the OLTP databases in their original format, but provide a pluggable feature to
  - Cluster transparently
  - Ensure scalability
□Avoid rewriting code or performing any data migration
□Solutions
  - Cluster transparently: Schooner MySQL, Continuent Tungsten, ScalArc
  - Ensure scalability: ScaleBase, dbShards
http://www.linuxforu.com/2012/01/newsql-handle-big-data/
NewSQL Products
□VoltDB
□Google Spanner
VoltDB
http://voltdb.com/products-services/products, http://www.slideshare.net/chris.e.richardson/polygot-persistenceforjavadevs-jfokus2012reorgpptx
□VoltDB, 2010, GPL/VoltDB Proprietary License, Java/C++
□Type: NewSQL, New Database
□Main point: in-memory database, Java stored procedures; VoltDB implements the design of the academic H-Store project
□Protocol: SQL
□Transaction: Yes
□Data storage: Memory
□Features
  - In-memory relational database
  - Durability through replication, snapshots, logging
  - Transparent partitioning
  - ACID-level consistency
  - Synchronous multi-master replication
  - Database replication
VoltDB - Technical Overview
“OLTP Through the Looking Glass” http://cs-www.cs.yale.edu/homes/dna/papers/oltpperf-sigmod08.pdf
VoltDB avoids the overhead of traditional databases
□K-safety for fault tolerance
  - no logging
□In-memory operation for maximum throughput
  - no buffer management
□Partitions operate autonomously and single-threaded
  - no latching or locking
□Built to horizontally scale
VoltDB - Partitions (1/3)
□1 partition per physical CPU core
  - Each physical server has multiple VoltDB partitions
□Data: two types of tables
  - Partitioned
    - A single column serves as the partitioning key
    - Rows are spread across all VoltDB partitions by the partition column
    - Transactional data (high frequency of modification)
  - Replicated
    - All rows exist within all VoltDB partitions
    - Relatively static data (low frequency of modification)
□Code: two types of work, both ACID
  - Single-partition
    - All insert/update/delete operations within a single partition
    - Majority of transactional workload
  - Multi-partition
    - CRUD against partitioned tables across multiple partitions
    - Insert/update/delete on replicated tables
VoltDB - Partitions (2/3)
Single-partition vs. Multi-partition

table orders (partitioned): customer_id (partition key), order_id, product_id
table products (replicated): product_id, product_name

Partition 1
  orders:   (1, 101, 2)  (1, 101, 3)  (4, 401, 2)
  products: (1, knife)   (2, spoon)   (3, fork)
Partition 2
  orders:   (2, 201, 1)  (5, 501, 3)  (5, 502, 2)
  products: (1, knife)   (2, spoon)   (3, fork)
Partition 3
  orders:   (3, 201, 1)  (6, 601, 1)  (6, 601, 2)
  products: (1, knife)   (2, spoon)   (3, fork)

select count(*) from orders where customer_id = 5    : single-partition
select count(*) from orders where product_id = 3    : multi-partition
insert into orders (customer_id, order_id, product_id) values (3,303,2)    : single-partition
update products set product_name = 'spork' where product_id = 3    : multi-partition
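The single-partition versus multi-partition distinction above can be sketched in a few lines of Java. This is only an illustration: the class name PartitionRouter and the modulo placement rule are assumptions chosen to reproduce the layout on the slide (customers 1 and 4 in partition 1, 2 and 5 in partition 2, 3 and 6 in partition 3), not VoltDB's actual hashing or API.

```java
import java.util.OptionalInt;

/** Hypothetical router: decides whether a statement can run on one partition. */
final class PartitionRouter {
    private final int partitionCount;

    PartitionRouter(int partitionCount) {
        this.partitionCount = partitionCount;
    }

    /** Assumed placement rule: partition-key value to a partition number in 1..partitionCount. */
    int partitionFor(int customerId) {
        return Math.floorMod(customerId - 1, partitionCount) + 1;
    }

    /**
     * A statement that pins the partitioning column (customer_id) touches one partition.
     * Anything else (a filter on product_id, or a write to a replicated table)
     * has to visit every partition.
     */
    String route(OptionalInt partitionKeyValue) {
        if (partitionKeyValue.isPresent()) {
            return "single-partition -> partition " + partitionFor(partitionKeyValue.getAsInt());
        }
        return "multi-partition -> all " + partitionCount + " partitions";
    }

    public static void main(String[] args) {
        PartitionRouter router = new PartitionRouter(3);
        // select count(*) from orders where customer_id = 5
        System.out.println(router.route(OptionalInt.of(5)));   // single-partition -> partition 2
        // select count(*) from orders where product_id = 3
        System.out.println(router.route(OptionalInt.empty())); // multi-partition -> all 3 partitions
    }
}
```

The point of the sketch is that routing depends only on whether the statement fixes the partitioning column, which is why choosing that column well keeps most of the workload single-partition.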
VoltDB - Partitions (3/3)
Looking inside a VoltDB partition
□Each partition contains data and an execution engine.
□The execution engine contains a queue for transaction requests.
□Requests are executed sequentially (single-threaded).
[Figure: a work queue feeds the execution engine, which operates on table data and index data]
□Data held in each partition
  - Complete copy of all replicated tables
  - Portion of rows (about 1/partitions) of all partitioned tables
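The queue-plus-single-thread design described above can be approximated with a standard Java single-thread executor. This is a minimal sketch under stated assumptions, not VoltDB's engine: PartitionEngine, addOrder, and countOrders are hypothetical names, and the "table" is just an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative partition: private data plus one worker thread draining a FIFO work queue. */
final class PartitionEngine implements AutoCloseable {
    // Plain HashMap on purpose: only the single worker thread ever touches it,
    // so no locks or latches are needed.
    private final Map<Integer, Integer> ordersPerCustomer = new HashMap<>();

    // A single-thread executor keeps an internal FIFO queue and runs one task at a time.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues a request that records one order for the customer. */
    Future<Integer> addOrder(int customerId) {
        return worker.submit(() -> ordersPerCustomer.merge(customerId, 1, Integer::sum));
    }

    /** Queues a read; it runs after every request queued before it. */
    Future<Integer> countOrders(int customerId) {
        return worker.submit(() -> ordersPerCustomer.getOrDefault(customerId, 0));
    }

    @Override
    public void close() {
        worker.shutdown();
    }

    /** Four client threads submit 250 requests each; the engine applies them one by one. */
    static int demo() {
        try (PartitionEngine engine = new PartitionEngine()) {
            Thread[] clients = new Thread[4];
            for (int c = 0; c < clients.length; c++) {
                clients[c] = new Thread(() -> {
                    for (int i = 0; i < 250; i++) {
                        engine.addOrder(5);
                    }
                });
                clients[c].start();
            }
            for (Thread client : clients) {
                client.join();
            }
            return engine.countOrders(5).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 1000
    }
}
```

Concurrent clients only contend on the queue; the data itself is never shared between threads, which is the property the slide summarizes as "no latching or locking".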
VoltDB - Compiling
□The database is constructed from
  - The schema (DDL)
  - The workload (Java stored procedures)
  - The project (users, groups, partitioning)
□VoltCompiler creates the application catalog
  - Copy to servers along with 1 .jar and 1 .so
  - Start servers
CREATE TABLE HELLOWORLD (
HELLO CHAR(15),
WORLD CHAR(15),
DIALECT CHAR(15),
PRIMARY KEY (DIALECT)
);
Schema
import org.voltdb.*;
@ProcInfo(
partitionInfo = "HELLOWORLD.DIA
singlePartition = true
)
public class Insert extends VoltPr
public final SQLStmt sql =
new SQLStmt("INSERT INTO HELLO
public VoltTable[] run( String hel
Stored Procedures
<?xml version="1.0"?>
<project>
<database name='data
<schema path='ddl.
<partition table='
</database>
</project>
Project.xml
VoltDB - Transactions
□All access to VoltDB is via Java stored procedures (Java + SQL)
□A single invocation of a stored procedure is a transaction (committed on success)
□Limits round trips between DBMS and application
□High-performance client applications communicate asynchronously with VoltDB
VoltDB - Clusters/Durability
□Scalability
  - Increase RAM in servers to add capacity
  - Add servers to increase performance/capacity
  - Each additional node consistently adds about 90% of single-node performance
□High availability
  - K-safety for redundancy
□Snapshots
  - Scheduled, continuous, on demand
□Spooling to data warehouse
□Disaster recovery/WAN replication (future)
  - Asynchronous replication
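The scaling figure quoted on this slide can be turned into a rough capacity estimate. The sketch below assumes the simplest reading, in which every node after the first contributes a fixed fraction (0.9 here) of one node's throughput. The class name and the 100,000 transactions-per-second baseline are made-up illustration values, not measurements from the source.

```java
/** Back-of-the-envelope estimate of cluster throughput under near-linear scaling. */
final class ScalingEstimate {
    /**
     * Assumed model: throughput(n) = singleNodeTps * (1 + efficiency * (n - 1)),
     * where each added node contributes `efficiency` of one node's throughput.
     */
    static double clusterThroughput(double singleNodeTps, int nodes, double efficiency) {
        if (nodes < 1) {
            throw new IllegalArgumentException("nodes must be >= 1");
        }
        return singleNodeTps * (1 + efficiency * (nodes - 1));
    }

    public static void main(String[] args) {
        double baseline = 100_000; // hypothetical single-node transactions per second
        for (int nodes = 1; nodes <= 5; nodes++) {
            System.out.printf("%d node(s): about %.0f tps%n",
                    nodes, clusterThroughput(baseline, nodes, 0.9));
        }
    }
}
```

With these assumed numbers, five nodes come out at roughly 4.6 times the single-node throughput rather than 5 times, which is the gap that the 90% figure implies.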
Google Spanner
http://research.google.com/archive/spanner.html
□Google, 2012, Paper, C++
□Type: NewSQL, New Database
□Main point: Google's scalable, multi-version, globally distributed, and synchronously replicated database
□Distributed multiversion database
  - General-purpose transactions (ACID)
  - SQL query language
  - Schematized tables
  - Semi-relational data model
□Running in production
  - Storage for Google’s ad data
  - Replaced a sharded MySQL database
Google Spanner Overview
http://research.google.com/archive/spanner.html
□Feature: lock-free distributed read transactions
□Property: external consistency of distributed transactions
  - First system at global scale
□Implementation: integration of concurrency control, replication, and 2PC
  - Correctness and performance
□Enabling technology: TrueTime
  - Interval-based global time
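To make "interval-based global time" concrete, here is a small Java sketch of the commit-wait idea from the Spanner paper: the clock returns an interval that is assumed to contain the true time, and a transaction's timestamp is only exposed once that timestamp has definitely passed. The class TrueTimeSketch, its fixed uncertainty bound epsilon, and the fake tick clock in main are illustrative assumptions; this is not Google's API.

```java
import java.util.function.LongSupplier;

/** Hypothetical stand-in for a TrueTime-style clock: time is an interval, not a point. */
final class TrueTimeSketch {
    /** Closed interval [earliest, latest] assumed to contain the true time. */
    static final class Interval {
        final long earliest;
        final long latest;

        Interval(long earliest, long latest) {
            this.earliest = earliest;
            this.latest = latest;
        }
    }

    private final LongSupplier localClock; // local clock reading, arbitrary units
    private final long epsilon;            // assumed bound on clock uncertainty

    TrueTimeSketch(LongSupplier localClock, long epsilon) {
        this.localClock = localClock;
        this.epsilon = epsilon;
    }

    Interval now() {
        long t = localClock.getAsLong();
        return new Interval(t - epsilon, t + epsilon);
    }

    /** True only once timestamp t has definitely passed on every clock within the bound. */
    boolean after(long t) {
        return now().earliest > t;
    }

    /** Picks a commit timestamp, then waits until it is definitely in the past. */
    long commitWait() {
        long commitTimestamp = now().latest;
        while (!after(commitTimestamp)) {
            Thread.onSpinWait();
        }
        return commitTimestamp;
    }

    public static void main(String[] args) {
        long[] tick = {0}; // fake clock that advances by one on every read
        TrueTimeSketch tt = new TrueTimeSketch(() -> tick[0]++, 5);
        long s = tt.commitWait();
        System.out.println(s + " " + tt.after(s)); // 5 true
    }
}
```

The wait lasts roughly twice the uncertainty bound, which is why a tight bound on clock error matters so much for this design.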
Design Goals for Spanner
http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf
MySQL Cluster – NDB Architecture
http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-overview.html
Schooner MySQL Active Cluster
http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-overview.html
dbShards Architecture
http://www.linuxforu.com/2012/01/newsql-handle-big-data/
NewSQL Summary
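Three Trends in the Database Industry
□NoSQL databases:
Designed to meet requirements such as the scalability of a distributed architecture, and to fit schema-less data management needs.
□NewSQL databases:
Designed either to meet requirements such as the scalability of a distributed architecture, or to improve performance where horizontal scaling is not required.
□Data grid/cache products:
Designed to keep data in memory in order to raise application and database performance.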
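Conclusion
□Many solutions exist for storing data
  - Drop the assumption that Oracle and MySQL are the only options
  - First identify the system's data characteristics and requirements (CAP, ACID/BASE)
  - Apply several solutions within one system
    - Small-scale data with complex relationships: RDBMS
    - Large-scale data processed in real time: NoSQL, NewSQL
    - Large-scale data kept for storage: Hadoop, etc.
□Choose an appropriate solution
  - Verify the issues that can arise in operation before adopting it
  - Most NewSQL solutions are still in beta (a hasty choice can backfire)
  - The solution needs to be verified down to the level of its program code
□Secure the stability of the NewSQL solution
  - The stability of the solution itself must be verified; it does not yet match the stability of current DBMSs
  - Secure a reliable way to store the data before applying it
  - Developers with operations and development experience are hard to find
  - Select a NewSQL product that matches the requirements
□Run a pilot before applying it to critical systems
  - Validate the selected solution and build in-house expertise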
Thank you.
Appendix.
Early – 2000s
http://www.cs.brown.edu/courses/cs227/slides/newsql/newsql-intro.pdf
□All the big players were heavyweight and expensive.
Oracle, DB2, Sybase, SQL Server, etc.
□Open-source databases were missing important features.
Postgres, mSQL, and MySQL.
Early – 2000s : eBay Architecture
http://highscalability.com/ebay-architecture
□Push functionality to application:
  - Joins
  - Referential integrity
  - Sorting
□No distributed transactions
Mid – 2000s
http://www.cs.brown.edu/courses/cs227/slides/newsql/newsql-intro.pdf
□MySQL + InnoDB is widely adopted by new web companies:
Supported transactions, replication, recovery.
Still must use custom middleware to scale out across multiple machines.
Memcache for caching queries.
Mid – 2000s : Facebook Architecture
http://www.techthebest.com/2011/11/29/technology-used-in-facebook/
□Scale out using custom middleware.
□Store ~75% of database in Memcache.
□No distributed transactions.
Late – 2000s
http://www.cs.brown.edu/courses/cs227/slides/newsql/newsql-intro.pdf
□MySQL + InnoDB is widely adopted by new web companies:
Supported transactions, replication, recovery.
Still must use custom middleware to scale out across multiple machines.
Memcache for caching queries.
Late – 2000s : MongoDB Architecture
http://sett.ociweb.com/sett/settAug2011.html
□Easy to use.
□Becoming more like a DBMS over time.
□No transactions.
Early – 2010s
http://www.cs.brown.edu/courses/cs227/slides/newsql/newsql-intro.pdf
□New DBMSs that can scale across multiple machines natively and provide ACID guarantees.
MySQL Middleware
Brand New Architectures
Database SPRAIN
□“An injury to ligaments... caused by being stretched beyond normal capacity”
□Six key drivers for NoSQL/NewSQL/DDG adoption
  - Scalability
  - Performance
  - Relaxed consistency
  - Agility
  - Intricacy
  - Necessity
Database SPRAIN - Scalability
□Associated sub-driver: Hardware economics
  - Scale-out across clusters of commodity servers
□Example project/service/vendor
  - BigTable, HBase, Riak, MongoDB, Couchbase, Hadoop
  - Amazon RDS, Xeround, SQL Azure, NimbusDB
  - Data grid/cache
□Associated use case:
  - Large-scale distributed data storage
  - Analysis of continuously updated data
  - Multi-tenant PaaS data layer
Database SPRAIN - Scalability
□User: StumbleUpon
□Problem: Scaling problems with recommendation engine on MySQL
□Solution: HBase
  - Started using Apache HBase to provide real-time analytics on Su.pr
  - MySQL lacked the performance headroom and scale
  - Multiple benefits, including not having to declare a schema
  - Enables the data to be used for multiple applications and use cases
Database SPRAIN - Performance
□Associated sub-driver: MySQL limitations
  - Inability to perform consistently at scale
□Example project/service/vendor
  - Hypertable, Couchbase, Membrain, MongoDB, Redis
  - Data grid/cache
  - VoltDB, Clustrix
□Associated use case:
  - Real-time data processing of mixed read/write workloads
  - Data caching
  - Large-scale data ingestion
Database SPRAIN - Performance
□User: AOL Advertising
□Problem: Real-time data processing to support targeted advertising
□Solution: Membase Server
  - Segmentation analysis runs in CDH, results passed into Membase
  - Makes use of its sub-millisecond data delivery
  - More time for analysis within a 40 ms targeting and response time
  - Also real-time log and event management
Database SPRAIN – Relaxed Consistency
□Associated sub-driver: CAP theorem
  - The need to relax consistency in order to maintain availability
□Example project/service/vendor:
  - Dynamo, Voldemort, Cassandra
  - Amazon SimpleDB
□Associated use case:
  - Multi-data center replication
  - Service availability
  - Non-transactional data off-load
Database SPRAIN – Relaxed Consistency
□User: Wordnik
□Problem: MySQL too consistent: blocked access to data during inserts and created numerous temp files to stay consistent.
□Solution: MongoDB
  - Single word definition contains multiple data items from various sources
  - MongoDB stores data as a complete document
  - Reduced the complexity of data storage
Database SPRAIN – Agility
□ Associated sub-driver: Polyglot persistence Choose most appropriate storage technology for app
in development
□Example project/service/vendor MongoDB, CouchDB, Cassandra
Google App Engine, SimpleDB, SQL Azure
□Associated use case: Mobile/remote device synchronization
Agile development
Data caching
Database SPRAIN – Agility
□User: Dimagi BHOMA (Better Health Outcomes through Mentoring and Assessments) project
□Problem: Deliver patient information to clinics despite a lack of reliable Internet connections
□Solution: Apache CouchDB
  - Replicates data from regional to national database when an Internet connection, and power, is available
  - Upload patient data from cell phones to local clinic
Database SPRAIN – Intricacy
□Associated sub-driver: Big data, total data
  - Rising data volume, variety and velocity
□Example project/service/vendor
  - Neo4j GraphDB, InfiniteGraph
  - Apache Cassandra, Hadoop
  - VoltDB, Clustrix
□Associated use case:
  - Social networking applications
  - Geo-locational applications
  - Configuration management database
Database SPRAIN – Intricacy
□User: Evident Software
□Problem: Mapping infrastructure dependencies for application performance management
□Solution: Neo4j
  - Apache Cassandra stores performance data
  - Neo4j used to map the correlations between different elements
  - Enables users to follow relationships between resources while investigating issues
Database SPRAIN – Necessity
□Associated sub-driver: Open source
  - The failure of existing suppliers to address the performance, scalability and flexibility requirements of large-scale data processing
□Example project/service/vendor
  - BigTable, Dynamo, MapReduce, Memcached
  - Hadoop, HBase, Hypertable, Cassandra, Membase
  - Voldemort, Riak, BigCouch
  - MongoDB, Redis, CouchDB, Neo4J
□Associated use case: All of the above
Database SPRAIN – Necessity
□BigTable: Google
□Dynamo: Amazon
□Cassandra: Facebook
□HBase: Powerset
□Voldemort: LinkedIn
□Hypertable: Zvents
□Neo4j: Windh Technologies
Yahoo: Apache Hadoop and Apache HBase
Digg: Apache Cassandra
Twitter: Apache Cassandra, Apache Hadoop and FlockDB