Wednesday, March 20, 13
Win some cool stuff, send a mail to
with NORMANDYJUG in the subject.
Introduction to NoSQL with Couchbase
Normandy JUG - March 19th 2013
Tugdual “Tug” Grall
Technical Evangelist
Twitter: @tgrall
Email: [email protected]
About me
• Tugdual “Tug” Grall
- Couchbase - Technical Evangelist
- eXo - CTO
- Oracle - Developer/Product Manager
- Mainly Java/SOA
- Developer in consulting firms
• Web
- @tgrall
- http://blog.grallandco.com
- tgrall
• NantesJUG co-founder
• Pet Project:
• http://www.resultri.com
INTRO TO NOSQL
RDBMS ARE NOT ENOUGH?
Growth is the New Reality
• Instagram gained nearly 1 million users overnight when they expanded to Android
Draw Something -‐ Social Game
35 million monthly active users in 1 month !!!
By contrast...
The Simpsons: Tapped Out
Daily Active Users (Millions)
How do you take this growth?
RDBMS is good for many things, but hard to scale
• Application Scales Out: just add more commodity web servers (Web/App Server Tier)
• RDBMS Scales Up: get a bigger, more complex server (Relational Database)
(Charts: System Cost and Application Performance vs. Users, for each tier. The relational database won't scale beyond a certain point.)
Scaling out RDBMS
• Run Many SQL Servers
• Data could be sharded
- Done by the application code
• Caching for faster response time
(Diagram: Web/App Server Tier, Memcached Tier, MySQL Tier)
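The pattern on this slide, where the application code picks the database server and a cache tier sits in front, can be sketched as follows. This is only an illustration under stated assumptions: plain dicts stand in for the MySQL servers and the Memcached tier, and the names `shard_for`, `read` and `write` are hypothetical.

```python
import zlib

# Stand-ins (assumptions): each dict plays the role of one MySQL server,
# and `cache` plays the role of the Memcached tier.
shards = [{}, {}, {}]
cache = {}

def shard_for(key):
    """Pick a shard in application code: stable hash of the key modulo the shard count."""
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]

def write(key, value):
    """Write to the owning shard and drop any stale cached copy."""
    shard_for(key)[key] = value
    cache.pop(key, None)

def read(key):
    """Cache-aside read: try the cache first, fall back to the shard, then fill the cache."""
    if key in cache:
        return cache[key]
    value = shard_for(key).get(key)
    if value is not None:
        cache[key] = value
    return value

write("user::1", {"name": "Tug"})
print(read("user::1"))  # first read comes from the shard and fills the cache
print(read("user::1"))  # second read is served from the cache
```

Adding a fourth shard changes `len(shards)`, so most keys would map to a different server. With this approach the application has to move that data itself.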
NoSQL Technology Scales Out
Scaling out flattens the cost and performance curves
• Application Scales Out: just add more commodity web servers (Web/App Server Tier)
• NoSQL Database Scales Out: cost and performance mirror the app tier (NoSQL Distributed Data Store)
(Charts: System Cost and Application Performance vs. Users, for each tier)
A New Technology?
Building new databases to meet the following requirements
• No schema required before inserting data
• No schema change required to change data format
• Auto-sharding without application participation
• Distributed queries
• Integrated main memory caching
• Data synchronization (multi-datacenter)
Bigtable - November 2006
Dynamo - October 2007
Cassandra - August 2008
Voldemort - February 2009
Very few organizations want to (fewer can) build and maintain database software technology. But every organization building interactive web applications needs this technology.
What Is Biggest Data Management Problem Driving Use of NoSQL in Coming Year?
• Lack of flexibility/rigid schemas: 49%
• Inability to scale out data: 35%
• Performance challenges: 29%
• Cost: 16%
• All of these: 12%
• Other: 11%
Source: Couchbase Survey, December 2011, n = 1351.
NO SQL TAXONOMIES
NoSQL Catalog
• Key-Value: Memcached, Membase, Coherence
• Data Structure: Redis
• Document: MongoDB, Couchbase
• Column: Cassandra, HBase
• Graph: Neo4j, InfiniteGraph
(The chart also splits products into two rows: Cache (memory only) and Database (memory/disk).)
Hadoop ?
Operational vs. Analytic Databases
• Real-time, Interactive Databases (NoSQL): fast access to data
- Couchbase, MongoDB, Cassandra, HBase
• Analytic Databases: get insights from data
- Cloudera, Hortonworks, MapR
COUCHBASE
Couchbase Server Core Principles
• Easy Scalability: grow cluster without application changes, without downtime, with a single click
• Consistent High Performance: consistent sub-millisecond read and write response times with consistent high throughput
• Always On 24x365: no downtime for software upgrades, hardware maintenance, etc.
• Flexible Data Model: JSON document model with no fixed schema
Couchbase 2.0 New Features
• JSON support
• Indexing and Querying
• Cross data center replication
• Incremental Map Reduce
Couchbase Handles Real World Scale
Couchbase Server 2.0 Architecture
Data Manager: RAM Cache, Indexing & Persistence Management (C & V8)
• Query Engine: Query API on port 8092
• Moxi: Memcapable 1.0 on port 11211
• Memcached with Couchbase EP Engine: Memcapable 2.0 on port 11210
• Object-level Cache
• storage interface
• New Persistence Layer (Disk Persistence)
Cluster Manager: Server/Cluster Management & Communication (Erlang/OTP)
• http REST management API/Web UI: HTTP on port 8091
• Erlang port mapper: port 4369
• Distributed Erlang: ports 21100 - 21199
• Components (some on each node, others one per cluster): Heartbeat, Process monitor, Global singleton supervisor, Configuration manager, Rebalance orchestrator, Node health monitor, vBucket state and replication manager
The Unreasonable Effectiveness of C by Damien Katz
Open Source Project - Apache 2.0
https://github.com/couchbase/
https://github.com/couchbaselabs/
Gerrit: http://review.couchbase.org/
SETTING UP TO DEVELOP
Install Couchbase Server 2.0
Ubuntu
RedHat
Mac OS X
Windows
or build from sources
Official SDKs
www.couchbase.com/develop
Clojure
Python
Ruby
libcouchbase
Go
Client SDKs
(Diagram: the Couchbase Client inside the App Server makes a connection, receives the topology, and then receives Couchbase Topology Updates.)
COUCHBASE OPERATIONS
Write Operation
(Diagram: the App Server writes Doc 1 to a Couchbase Server Node. The doc lands in the Managed Cache, then goes to the Replication Queue (to other node) and to the Disk Queue (to Disk).)
Basic Operations
COUCHBASE SERVER CLUSTER
• Docs distributed evenly across servers
• Each server stores both active and replica docs
- Only one doc active at a time
• Client library provides app with simple interface to database
• Cluster map provides map to which server doc is on
- App never needs to know
• App reads, writes, updates docs
• Multiple app servers can access same document at same time
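How a client library might use a cluster map to find the server for a doc can be sketched as below. This is a simplified illustration, not the actual Couchbase hashing scheme: the partition count, the CRC32-modulo hash, and the names `vbucket_for` and `active_server_for` are assumptions made for the example.

```python
import zlib

NUM_VBUCKETS = 8  # illustrative only; a real cluster uses many more partitions

servers = ["server1", "server2", "server3"]

# Hypothetical cluster map: for each vBucket, (active server, replica server).
# The replica is always placed on a different server than the active copy.
cluster_map = [
    (servers[vb % 3], servers[(vb + 1) % 3]) for vb in range(NUM_VBUCKETS)
]

def vbucket_for(key):
    """Hash the document key to a partition (vBucket) number."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS

def active_server_for(key):
    """Look up which server currently holds the active copy of the doc."""
    active, _replica = cluster_map[vbucket_for(key)]
    return active

for key in ["doc1", "doc2", "doc3"]:
    print(key, "->", active_server_for(key))
```

The app only calls the client library with a key. The key-to-server lookup stays inside the library, which is why the app never needs to know where a doc lives.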
(Diagram: Servers 1 to 3 each hold ACTIVE and REPLICA docs. App Servers 1 and 2 READ/WRITE/UPDATE docs through the Couchbase client library and its cluster map.)
Store & Retrieve Operations
• get (key) – Retrieve a document
• set (key, value) – Store a document, overwrites if exists
• add (key, value) – Store a document, error/exception if exists
• replace (key, value) – Store a document, error/exception if doesn't exist
• cas (key, value, cas) – Compare and swap, mutate document only if it hasn't changed while executing this operation
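The semantics of these five operations can be illustrated with a small in-memory stand-in. This is a sketch, not a Couchbase SDK: `ToyBucket` and its exception classes are hypothetical, and the CAS value is modelled as a simple counter that changes on every write.

```python
class KeyExists(Exception):
    pass

class KeyMissing(Exception):
    pass

class CasMismatch(Exception):
    pass

class ToyBucket:
    """In-memory stand-in that mimics the store/retrieve semantics."""

    def __init__(self):
        self._docs = {}      # key -> (value, cas)
        self._next_cas = 1

    def _store(self, key, value):
        cas = self._next_cas
        self._next_cas += 1
        self._docs[key] = (value, cas)
        return cas

    def get(self, key):
        if key not in self._docs:
            raise KeyMissing(key)
        return self._docs[key]          # (value, cas)

    def set(self, key, value):
        return self._store(key, value)  # overwrites if the key exists

    def add(self, key, value):
        if key in self._docs:
            raise KeyExists(key)
        return self._store(key, value)

    def replace(self, key, value):
        if key not in self._docs:
            raise KeyMissing(key)
        return self._store(key, value)

    def cas(self, key, value, cas):
        _value, current = self.get(key)
        if current != cas:
            raise CasMismatch(key)      # someone else changed the doc
        return self._store(key, value)

cb = ToyBucket()
cb.set("user::1", {"visits": 1})
value, token = cb.get("user::1")
cb.set("user::1", {"visits": 2})        # another writer changes the doc
try:
    cb.cas("user::1", {"visits": 99}, token)
except CasMismatch:
    print("document changed, re-read and retry")
```

The usual pattern with cas is a loop: get the doc and its CAS value, modify the doc, attempt the cas write, and start over if it is rejected.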
Atomic Counter Operations
These operations are always executed in order atomically.
• set (key, value) – Use set to initialize the counter
- cb.set("my_counter", 1)
• incr (key) – Increase an atomic counter value, default by 1
- cb.incr("my_counter") # now it's 2
• decr (key) – Decrease an atomic counter value, default by 1
- cb.decr("my_counter") # now it's 1
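Why atomicity matters for counters can be shown with a small in-process stand-in. In Couchbase the server applies each incr/decr atomically. Here a lock plays that role, and `ToyCounterStore` is a hypothetical class written only for this sketch.

```python
import threading

class ToyCounterStore:
    """In-memory stand-in: a lock makes each counter update atomic."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._values[key] = value

    def incr(self, key, delta=1):
        with self._lock:
            self._values[key] += delta
            return self._values[key]

    def decr(self, key, delta=1):
        with self._lock:
            self._values[key] -= delta
            return self._values[key]

cb = ToyCounterStore()
cb.set("page_views", 0)

def worker():
    # Each worker bumps the shared counter 1000 times.
    for _ in range(1000):
        cb.incr("page_views")

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(cb.incr("page_views", 0))  # 4000: no increment was lost
```

A get followed by a set from several app servers could lose updates, so counters are better handled with incr and decr than with read-modify-write.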
Mental Adjustments
• In SQL we tend to avoid hitting the database as much as possible
– Even with caching and indexing tricks, and massive improvements over the years, SQL still gets bogged down by complex joins and huge indexes, so we avoid making database calls
• In Couchbase, gets and sets are so fast that they are trivial, not bottlenecks. This is hard for many people to accept at first. Multiple get statements are commonplace; don't avoid them!
JSON Document Structure
meta (Meta Information Including Key; All Keys Unique and Kept in RAM)
{
  "id": "u::[email protected]",
  "rev": "1-0002bce0000000000",
  "flags": 0,
  "expiration": 0,
  "type": "json"
}
document (Document Value; Most Recent in RAM and Persisted to Disk)
{
  "uid": 123456,
  "firstname": "jasdeep",
  "lastname": "Jaitla",
  "age": 22,
  "favorite_colors": ["blue", "black"],
  "email": "[email protected]"
}
DEMONSTRATION
Add Nodes to Cluster
• Two servers added
- One-click operation
• Docs automatically rebalanced across cluster
- Even distribution of docs
- Minimum doc movement
• Cluster map updated
• App database calls now distributed over larger number of servers
(Diagram: COUCHBASE SERVER CLUSTER, User Configured Replica Count = 1. Servers 4 and 5 join Servers 1 to 3, and ACTIVE and REPLICA docs are redistributed across all five. App Servers 1 and 2 READ/WRITE/UPDATE through the Couchbase client library and its cluster map.)
Fail Over Node
• App servers accessing docs
• Requests to Server 3 fail
• Cluster detects server failed
- Promotes replicas of docs to active
- Updates cluster map
• Requests for docs now go to appropriate server
• Typically rebalance would follow
(Diagram: COUCHBASE SERVER CLUSTER with Servers 1 to 5, User Configured Replica Count = 1. Server 3 fails and replicas of its docs on the other servers become ACTIVE. App Servers 1 and 2 use the updated cluster map.)
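The failover steps above, promoting replicas and updating the cluster map, can be sketched as a pure function over a toy cluster map. The map layout and the name `fail_over` are assumptions made for illustration, not the server's internal data structures.

```python
# Hypothetical cluster map: vBucket id -> {"active": server, "replica": server}.
cluster_map = {
    0: {"active": "server1", "replica": "server2"},
    1: {"active": "server2", "replica": "server3"},
    2: {"active": "server3", "replica": "server1"},
    3: {"active": "server3", "replica": "server2"},
}

def fail_over(cluster_map, failed):
    """Return a new map in which replicas of the failed server's active partitions are promoted."""
    new_map = {}
    for vb, entry in cluster_map.items():
        active, replica = entry["active"], entry["replica"]
        if active == failed:
            # Promote the replica; the partition has no replica until a rebalance.
            active, replica = replica, None
        elif replica == failed:
            # The active copy is fine, but its replica is gone.
            replica = None
        new_map[vb] = {"active": active, "replica": replica}
    return new_map

new_map = fail_over(cluster_map, "server3")
for vb, entry in sorted(new_map.items()):
    print(vb, entry)
```

After failover some partitions have no replica left, which is one reason a rebalance typically follows.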
Indexing and Querying
• Indexing work is distributed amongst nodes
• Large data set possible
• Parallelize the effort
• Each node has index for data stored on it
• Queries combine the results from required nodes
(Diagram: COUCHBASE SERVER CLUSTER with Servers 1 to 3, each holding ACTIVE and REPLICA docs. A Query from the App Servers goes through the Couchbase client library and its cluster map to the nodes.)
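The idea that each node indexes its own data and a query combines results from the nodes can be sketched as a scatter-gather over sorted per-node indexes. The data and the names `query_node` and `query_cluster` are hypothetical; a real cluster does this work on the servers, not in the client.

```python
import heapq

# Hypothetical per-node indexes: each node indexes only the docs it stores,
# kept as sorted (index key, doc id) pairs.
node_indexes = {
    "server1": [("alice", "doc5"), ("dave", "doc2")],
    "server2": [("bob", "doc4"), ("erin", "doc7")],
    "server3": [("carol", "doc1"), ("frank", "doc9")],
}

def query_node(index, start, end):
    """Range scan on one node's local index."""
    return [(key, doc) for key, doc in index if start <= key <= end]

def query_cluster(start, end):
    """Scatter the range query to every node, then merge the sorted partial results."""
    partials = [query_node(index, start, end) for index in node_indexes.values()]
    return list(heapq.merge(*partials))

print(query_cluster("bob", "erin"))
```

Because every partial result is already sorted, the merge step stays cheap even when the data set is spread over many nodes.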
DEMONSTRATION
Cross Data Center Replication (XDCR)
(Diagram: two Couchbase clusters, one in the West Coast Data Center and one in the East Coast Data Center. Each has Servers 1 to 3 holding docs in RAM CACHE and on DISK.)
Map Function
DEMONSTRATION
Elastic Search Adapter
• Elastic Search is good for ad-hoc queries and faceted browsing
• Our adapter is aware of changing Couchbase topology
• Indexed by Elastic Search after stored to disk in Couchbase
I'm Excited to See What You Build
Q & A
• Contact me on Twitter: @tgrall
• Contact me by email: [email protected]
• Learn More About Design Patterns: CouchbaseModels.com
• Setting up for Ruby on Rails: CouchbaseOnRails.com
• Couchbase Docs: www.couchbase.com/docs/index-full.html
• Couchbase Forums: www.couchbase.com/forums
• IRC: #couchbase, #libcouchbase
Q&A