Transcript
Page 1: Couche Base par Tugdual Grall

Wednesday, March 20, 13

Page 2: Couche Base par Tugdual Grall

Win some cool stuff, send a mail to

[email protected]

with NORMANDYJUG in the subject.

Page 3: Couche Base par Tugdual Grall

Technical  Evangelist

Twitter: @tgrall

Email: [email protected]

Tugdual  “Tug”  Grall

Introduction to NoSQL with Couchbase

Normandy JUG - March 19th 2013

Page 4: Couche Base par Tugdual Grall

About me

• Tugdual “Tug” Grall

- Couchbase - Technical Evangelist

- eXo - CTO

- Oracle - Developer/Product Manager

- Mainly Java/SOA

- Developer in consulting firms

• Web

- @tgrall

- http://blog.grallandco.com

- tgrall

• NantesJUG co-founder

• Pet Project:

- http://www.resultri.com

Page 5: Couche Base par Tugdual Grall

INTRO  TO  NOSQL

Page 6: Couche Base par Tugdual Grall

RDBMS  ARE  NOT  ENOUGH?

Page 7: Couche Base par Tugdual Grall

Growth  is  the  New  Reality

• Instagram gained nearly 1 million users overnight when it expanded to Android

Page 8: Couche Base par Tugdual Grall

Draw Something - Social Game

35 million monthly active users in 1 month !!!

Page 9: Couche Base par Tugdual Grall

By  contrast....

The Simpsons: Tapped Out

[Chart: Daily Active Users (Millions)]

Page 10: Couche Base par Tugdual Grall

How do you take this growth?

RDBMS is good for many things, but hard to scale

Application Scales Out: Just add more commodity web servers

[Chart: Web/App Server Tier; System Cost and Application Performance vs. Users]

RDBMS Scales Up: Get a bigger, more complex server

[Chart: Relational Database; System Cost and Application Performance vs. Users; "Won’t scale beyond this point"]

Page 11: Couche Base par Tugdual Grall

Scaling out RDBMS

• Run many SQL servers

• Data could be sharded

- Done by the application code

• Caching for faster response time

[Diagram: Web/App Server Tier, Memcached Tier, MySQL Tier]
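A minimal sketch of what "done by the application code" can look like: the application hashes each key to pick a database server and keeps its own read cache. The dictionaries stand in for the MySQL and Memcached tiers, and every name here (`SHARDS`, `shard_for`, `read_user`, `write_user`) is hypothetical.

```python
import zlib

# Hypothetical stand-ins: each dict plays the role of one MySQL server,
# and `cache` plays the role of the Memcached tier.
SHARDS = [{}, {}, {}]
cache = {}


def shard_for(key: str) -> dict:
    """Pick a shard by hashing the key; the application owns this logic."""
    return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]


def write_user(key: str, row: dict) -> None:
    shard_for(key)[key] = row
    cache.pop(key, None)  # invalidate so readers do not see a stale copy


def read_user(key: str):
    if key in cache:               # cache hit: no database call
        return cache[key]
    row = shard_for(key).get(key)  # cache miss: ask the owning shard
    if row is not None:
        cache[key] = row
    return row


write_user("user:42", {"name": "Tug"})
print(read_user("user:42"))  # first read fills the cache
print(read_user("user:42"))  # second read is served from the cache
```

Changing the number of shards changes where most keys hash to, which is one reason this approach gets painful as the cluster grows.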

Page 12: Couche Base par Tugdual Grall

NoSQL Technology Scales Out

Scaling out flattens the cost and performance curves

Application Scales Out: Just add more commodity web servers

[Chart: Web/App Server Tier; System Cost and Application Performance vs. Users]

NoSQL Database Scales Out: Cost and performance mirrors app tier

[Chart: NoSQL Distributed Data Store; System Cost and Application Performance vs. Users]

Page 13: Couche Base par Tugdual Grall

A New Technology?

Building new database to answer the following requirements

• No schema required before inserting data

• No schema change required to change data format

• Auto-sharding without application participation

• Distributed queries

• Integrated main memory caching

• Data synchronization (multi-datacenter)

Bigtable: November 2006

Dynamo: October 2007

Cassandra: August 2008

Voldemort: February 2009

Very few organizations want to (fewer can) build and maintain database software technology. But every organization building interactive web applications needs this technology.

Page 14: Couche Base par Tugdual Grall

What Is Biggest Data Management Problem Driving Use of NoSQL in Coming Year?

Lack of flexibility/rigid schemas: 49%

Inability to scale out data: 35%

Performance challenges: 29%

Cost: 16%

All of these: 12%

Other: 11%

Source: Couchbase Survey, December 2011, n = 1351.

Page 15: Couche Base par Tugdual Grall

NO  SQL  TAXONOMIES

Page 16: Couche Base par Tugdual Grall

NoSQL Catalog

Key-Value: Memcached, Membase

Data Structure: Redis

Document: MongoDB, Couchbase

Column: Cassandra

Graph: Neo4j

[Matrix rows: Cache (memory only); Database (memory/disk)]

Page 17: Couche Base par Tugdual Grall

NoSQL Catalog

Key-Value: Memcached, Membase, Coherence

Data Structure: Redis

Document: MongoDB, Couchbase

Column: Cassandra, HBase

Graph: Neo4j, InfiniteGraph

[Matrix rows: Cache (memory only); Database (memory/disk)]

Page 18: Couche Base par Tugdual Grall

Hadoop  ?

Page 19: Couche Base par Tugdual Grall

Operational vs. Analytic Databases

NoSQL: Real-time, Interactive Databases

- Fast access to data

- Couchbase, MongoDB, Cassandra, HBase

Analytic Databases

- Get insights from data

- Cloudera, Hortonworks, MapR

Page 20: Couche Base par Tugdual Grall

COUCHBASE

Page 21: Couche Base par Tugdual Grall

Couchbase Server Core Principles

Easy Scalability: Grow cluster without application changes, without downtime, with a single click

Consistent High Performance: Consistent sub-millisecond read and write response times with consistent high throughput

Always On 24x365: No downtime for software upgrades, hardware maintenance, etc.

Flexible Data Model: JSON document model with no fixed schema.

Page 22: Couche Base par Tugdual Grall

Couchbase  2.0  New  Features

JSON support

Indexing and Querying

Cross data center replication

Incremental Map Reduce

Page 23: Couche Base par Tugdual Grall

Couchbase  Handles  Real  World  Scale

Page 24: Couche Base par Tugdual Grall

Couchbase Server 2.0 Architecture

Data Manager

- Query Engine: Query API (port 8092)

- Moxi: Memcapable 1.0 (port 11211)

- Memcached with Couchbase EP Engine: Memcapable 2.0 (port 11210)

- storage interface

- New Persistence Layer

Cluster Manager (Erlang/OTP)

- http: REST management API/Web UI (port 8091)

- Erlang port mapper (port 4369)

- Distributed Erlang (ports 21100 - 21199)

- Components, labelled in the diagram as "on each node" or "one per cluster": Heartbeat, Process monitor, Global singleton supervisor, Configuration manager, Rebalance orchestrator, Node health monitor, vBucket state and replication manager

Page 25: Couche Base par Tugdual Grall

Couchbase Server 2.0 Architecture

RAM Cache, Indexing & Persistence Management (C & V8)

- Query Engine: Query API (port 8092)

- Moxi: Memcapable 1.0 (port 11211)

- Couchbase EP Engine: Memcapable 2.0 (port 11210)

- Object-level Cache

- Disk Persistence

- storage interface

- New Persistence Layer

Server/Cluster Management & Communication (Erlang)

- http: REST management API/Web UI (port 8091)

- Erlang port mapper (port 4369)

- Distributed Erlang (ports 21100 - 21199)

- Components, labelled in the diagram as "on each node" or "one per cluster": Heartbeat, Process monitor, Global singleton supervisor, Configuration manager, Rebalance orchestrator, Node health monitor, vBucket state and replication manager

- Erlang/OTP

The Unreasonable Effectiveness of C by Damien Katz

Page 26: Couche Base par Tugdual Grall

Open Source Project

Apache 2.0

https://github.com/couchbase/

https://github.com/couchbaselabs/

Gerrit: http://review.couchbase.org/

Page 27: Couche Base par Tugdual Grall

SETTING  UP  TO  DEVELOP

Page 28: Couche Base par Tugdual Grall

Install  Couchbase  Server  2.0

Ubuntu

RedHat

Mac  OS  X

Windows

or  build  from  sources

Page 29: Couche Base par Tugdual Grall

Official  SDKs

www.couchbase.com/develop

Clojure

Python

Ruby

libcouchbase

Go

Page 30: Couche Base par Tugdual Grall

Client SDKs

[Diagram: the Couchbase Client inside the App Server makes a connection to the cluster, receives the topology, and then receives Couchbase Topology Updates]

Page 31: Couche Base par Tugdual Grall

COUCHBASE  OPERATIONS

Page 32: Couche Base par Tugdual Grall

Write Operation

[Diagram: an App Server writes Doc 1 to a Couchbase Server Node. On the node, Doc 1 goes into the Managed Cache, then onto the Replication Queue (to other node) and the Disk Queue (to Disk).]
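The write path in the diagram can be sketched as a toy model. This is not Couchbase code: the `Node` class and its methods are hypothetical, and the sketch assumes the write is acknowledged as soon as the document is in the managed cache, with disk and replication work queued for later.

```python
from collections import deque


class Node:
    """Toy model of one server node; all names here are illustrative."""

    def __init__(self, name):
        self.name = name
        self.cache = {}                   # managed cache (RAM)
        self.disk = {}                    # stand-in for the persistence layer
        self.disk_queue = deque()
        self.replication_queue = deque()

    def write(self, key, doc):
        # Assumption: acknowledge once the document is in the cache;
        # persistence and replication happen asynchronously.
        self.cache[key] = doc
        self.disk_queue.append(key)
        self.replication_queue.append(key)
        return "ok"

    def drain_disk_queue(self):
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.cache[key]

    def drain_replication_queue(self, other):
        while self.replication_queue:
            key = self.replication_queue.popleft()
            other.cache[key] = self.cache[key]


node_a, node_b = Node("a"), Node("b")
node_a.write("doc1", {"title": "hello"})
print(node_a.disk, node_b.cache)        # both still empty: work is only queued
node_a.drain_disk_queue()
node_a.drain_replication_queue(node_b)
print(node_a.disk, node_b.cache)        # now both hold doc1
```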

Page 33: Couche Base par Tugdual Grall

Basic Operations

• Docs distributed evenly across servers

• Each server stores both active and replica docs

- Only one doc active at a time

• Client library provides app with simple interface to database

• Cluster map provides map to which server doc is on

- App never needs to know

• App reads, writes, updates docs

• Multiple app servers can access same document at same time

[Diagram: COUCHBASE SERVER CLUSTER with SERVER 1, SERVER 2 and SERVER 3, each holding ACTIVE and REPLICA docs; APP SERVER 1 and APP SERVER 2 each contain the COUCHBASE Client Library with its CLUSTER MAP and send READ/WRITE/UPDATE requests to the cluster]
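A simplified sketch of how a client library might use a cluster map to find the server for a document. The map layout, the CRC32-modulo hashing, and the names `cluster_map`, `vbucket_for` and `servers_for` are assumptions made for illustration; the real client's hashing and map format differ in detail.

```python
import zlib

NUM_VBUCKETS = 8  # assumption for readability; real clusters use many more
SERVERS = ["server1", "server2", "server3"]

# Hypothetical cluster map: vBucket id -> (active server, replica server).
cluster_map = {
    vb: (SERVERS[vb % 3], SERVERS[(vb + 1) % 3])
    for vb in range(NUM_VBUCKETS)
}


def vbucket_for(key: str) -> int:
    """Hash the document key to a vBucket id."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def servers_for(key: str):
    """Return (active, replica) for a key; the app never sees this lookup."""
    return cluster_map[vbucket_for(key)]


for key in ("doc1", "doc2", "doc3"):
    active, replica = servers_for(key)
    print(key, "-> active:", active, "replica:", replica)
```

Because the key is hashed to a vBucket rather than directly to a server, the cluster can move vBuckets between servers and only the map has to change.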

Page 34: Couche Base par Tugdual Grall

Store & Retrieve Operations

• get (key) – Retrieve a document

• set (key, value) – Store a document, overwrites if exists

• add (key, value) – Store a document, error/exception if exists

• replace (key, value) – Store a document, error/exception if doesn’t exist

• cas (key, value, cas) – Compare and swap, mutate document only if it hasn’t changed while executing this operation
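The semantics of these five operations can be sketched with an in-memory stand-in. `ToyBucket` and its exception classes are hypothetical and only model the behavior described above; they are not an SDK API.

```python
class KeyExistsError(Exception):
    pass


class KeyNotFoundError(Exception):
    pass


class CasMismatchError(Exception):
    pass


class ToyBucket:
    """In-memory model of get/set/add/replace/cas (illustrative only)."""

    def __init__(self):
        self._docs = {}      # key -> (value, cas)
        self._next_cas = 1

    def _store(self, key, value):
        cas = self._next_cas  # every mutation gets a fresh cas value
        self._next_cas += 1
        self._docs[key] = (value, cas)
        return cas

    def get(self, key):
        if key not in self._docs:
            raise KeyNotFoundError(key)
        return self._docs[key]  # (value, cas)

    def set(self, key, value):
        return self._store(key, value)  # overwrites if the key exists

    def add(self, key, value):
        if key in self._docs:
            raise KeyExistsError(key)
        return self._store(key, value)

    def replace(self, key, value):
        if key not in self._docs:
            raise KeyNotFoundError(key)
        return self._store(key, value)

    def cas(self, key, value, cas):
        _, current = self.get(key)
        if current != cas:  # someone else changed the document
            raise CasMismatchError(key)
        return self._store(key, value)


bucket = ToyBucket()
bucket.add("user::1", {"name": "Tug"})
value, cas = bucket.get("user::1")
bucket.cas("user::1", {"name": "Tugdual"}, cas)    # succeeds: unchanged since get
try:
    bucket.cas("user::1", {"name": "stale"}, cas)  # fails: cas is now outdated
except CasMismatchError:
    print("document changed, re-read and retry")
```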

Page 35: Couche Base par Tugdual Grall

Atomic Counter Operations

These operations are always executed in order atomically.

• set (key, value) – Use set to initialize the counter

- cb.set("my_counter", 1)

• incr (key) – Increase an atomic counter value, default by 1

- cb.incr("my_counter") # now it’s 2

• decr (key) – Decrease an atomic counter value, default by 1

- cb.decr("my_counter") # now it’s 1
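To see why atomicity matters, here is a small sketch in which many threads increment the same counter. `ToyCounterStore` is a hypothetical in-process stand-in that uses a lock; on a real server the atomicity is provided server-side, not by the client.

```python
import threading


class ToyCounterStore:
    """Illustrative counter store; the lock plays the role of the server."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._values[key] = value

    def incr(self, key, delta=1):
        with self._lock:  # read-modify-write happens as one step
            self._values[key] += delta
            return self._values[key]

    def decr(self, key, delta=1):
        return self.incr(key, -delta)


cb = ToyCounterStore()
cb.set("my_counter", 1)

threads = [
    threading.Thread(target=cb.incr, args=("my_counter",)) for _ in range(100)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(cb.incr("my_counter", 0))  # 101: no increment was lost
```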

Page 36: Couche Base par Tugdual Grall

Mental Adjustments

• In SQL we tend to avoid hitting the database as much as possible

- Even with caching and indexing tricks, and massive improvements over the years, SQL still gets bogged down by complex joins and huge indexes, so we avoid making database calls

• In Couchbase, gets and sets are so fast that they are trivial, not bottlenecks. This is hard for many people to accept at first. Multiple get statements are commonplace; don’t avoid them!
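As an illustration of "multiple gets instead of a join", the sketch below assembles a user and their orders with one get per document. The `store` dictionary, the key names and the `order_ids` field are all hypothetical.

```python
# Hypothetical documents, keyed the way an application might choose to.
store = {
    "u::1": {"name": "Tug", "order_ids": ["o::10", "o::11"]},
    "o::10": {"item": "book", "total": 12},
    "o::11": {"item": "pen", "total": 3},
}


def get(key):
    """Stand-in for a key-value get against the database."""
    return store[key]


def user_with_orders(user_key):
    user = get(user_key)                                  # first get
    orders = [get(order_key) for order_key in user["order_ids"]]  # one get each
    return {"name": user["name"], "orders": orders}


result = user_with_orders("u::1")
print(result["name"], [order["item"] for order in result["orders"]])
```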

Page 37: Couche Base par Tugdual Grall

JSON Document Structure

Meta Information Including Key (All Keys Unique and Kept in RAM)

meta
{
  "id": "u::[email protected]",
  "rev": "1-0002bce0000000000",
  "flags": 0,
  "expiration": 0,
  "type": "json"
}

Document Value (Most Recent In RAM And Persisted To Disk)

document
{
  "uid": 123456,
  "firstname": "jasdeep",
  "lastname": "Jaitla",
  "age": 22,
  "favorite_colors": ["blue", "black"],
  "email": "[email protected]"
}
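A short sketch of building such a document in application code. The `u::` key prefix follows the slide; the helper `user_key`, the address `user@example.com` and the `twitter` field are hypothetical, and no real SDK call is shown.

```python
import json


def user_key(email: str) -> str:
    """Build a document key using the "u::" prefix seen on the slide."""
    return "u::" + email


doc = {
    "uid": 123456,
    "firstname": "jasdeep",
    "lastname": "Jaitla",
    "age": 22,
    "favorite_colors": ["blue", "black"],
    "email": "user@example.com",  # hypothetical address
}

key = user_key(doc["email"])
value = json.dumps(doc)           # the JSON text that would be stored

# No fixed schema: a later version of the document can simply add a field.
doc_v2 = dict(json.loads(value), twitter="@example")
print(key, sorted(doc_v2))
```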

Page 38: Couche Base par Tugdual Grall

DEMONSTRATION

Page 39: Couche Base par Tugdual Grall

Add Nodes to Cluster

• Two servers added

- One-click operation

• Docs automatically rebalanced across cluster

- Even distribution of docs

- Minimum doc movement

• Cluster map updated

• App database calls now distributed over larger number of servers

[Diagram: COUCHBASE SERVER CLUSTER growing from SERVER 1, SERVER 2, SERVER 3 to include SERVER 4 and SERVER 5, each holding ACTIVE and REPLICA docs; APP SERVER 1 and APP SERVER 2 send READ/WRITE/UPDATE requests through the COUCHBASE Client Library and its CLUSTER MAP; User Configured Replica Count = 1]
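The "even distribution, minimum movement" idea can be sketched on a toy vBucket map. This is an illustrative algorithm, not the one Couchbase uses; `rebalance` is a hypothetical function that only handles active copies and assumes every current owner is still in the server list.

```python
from collections import Counter


def rebalance(owner, servers):
    """Spread vBuckets evenly over `servers`, moving as few as possible."""
    new_owner = dict(owner)
    load = Counter({s: 0 for s in servers})
    load.update(new_owner.values())

    base, extra = divmod(len(new_owner), len(servers))
    # The busiest servers keep the "extra" vBuckets, so fewer have to move.
    by_load = sorted(servers, key=lambda s: -load[s])
    target = {s: base + (1 if i < extra else 0) for i, s in enumerate(by_load)}

    # One slot per vBucket that an under-loaded server still needs.
    receivers = [s for s in servers for _ in range(max(0, target[s] - load[s]))]

    for vb in sorted(new_owner):
        src = new_owner[vb]
        if load[src] > target[src] and receivers:
            dst = receivers.pop()
            new_owner[vb] = dst
            load[src] -= 1
            load[dst] += 1
    return new_owner


old_servers = ["server1", "server2", "server3"]
before = {vb: old_servers[vb % 3] for vb in range(12)}
after = rebalance(before, old_servers + ["server4", "server5"])

moved = sum(1 for vb in before if before[vb] != after[vb])
print("moved:", moved, "loads:", dict(Counter(after.values())))
```

After a real rebalance the cluster map is updated so that clients route requests to the new owners.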

Page 40: Couche Base par Tugdual Grall

Fail Over Node

• App servers accessing docs

• Requests to Server 3 fail

• Cluster detects server failed

- Promotes replicas of docs to active

- Updates cluster map

• Requests for docs now go to appropriate server

• Typically rebalance would follow

[Diagram: COUCHBASE SERVER CLUSTER with SERVER 1 to SERVER 5, each holding ACTIVE and REPLICA docs; SERVER 3 fails and its docs become active on other servers; APP SERVER 1 and APP SERVER 2 use the COUCHBASE Client Library and its CLUSTER MAP; User Configured Replica Count = 1]
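The failover steps above can be sketched on a toy cluster map with a replica count of 1. The map layout and the `fail_over` function are hypothetical; the sketch only shows replica promotion and the resulting map update.

```python
def fail_over(cluster_map, failed):
    """Return a new map with `failed` removed and its replicas promoted."""
    new_map = {}
    for vb, (active, replica) in cluster_map.items():
        if active == failed:
            new_map[vb] = (replica, None)  # promote the replica to active
        elif replica == failed:
            new_map[vb] = (active, None)   # replica lost until a rebalance
        else:
            new_map[vb] = (active, replica)
    return new_map


servers = ["server1", "server2", "server3"]
# Hypothetical map: vBucket id -> (active server, replica server).
cluster_map = {vb: (servers[vb % 3], servers[(vb + 1) % 3]) for vb in range(6)}

updated = fail_over(cluster_map, "server3")
for vb in sorted(updated):
    print(vb, cluster_map[vb], "->", updated[vb])
```

The entries left with no replica are why a rebalance typically follows: it recreates the missing copies on the remaining servers.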

Page 41: Couche Base par Tugdual Grall

Indexing and Querying

• Indexing work is distributed amongst nodes

• Large data set possible

• Parallelize the effort

• Each node has index for data stored on it

• Queries combine the results from required nodes

[Diagram: COUCHBASE SERVER CLUSTER with SERVER 1, SERVER 2 and SERVER 3, each holding ACTIVE and REPLICA docs; a Query from APP SERVER 1 or APP SERVER 2 goes through the COUCHBASE Client Library and its CLUSTER MAP to the servers]
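A minimal scatter-gather sketch of "each node has an index for its own data, and queries combine the results". The per-node indexes, the document ids and the `query` function are made up for illustration.

```python
import heapq

# Hypothetical per-node indexes: sorted (lastname, doc_id) rows that cover
# only the documents stored on that node.
node_indexes = {
    "server1": [("Adams", "doc5"), ("Moore", "doc2")],
    "server2": [("Baker", "doc4"), ("Young", "doc7")],
    "server3": [("Clark", "doc1"), ("Lopez", "doc8")],
}


def query(start, end):
    """Ask every node for its matching rows, then merge the sorted partials."""
    partials = []
    for index in node_indexes.values():
        partials.append([row for row in index if start <= row[0] <= end])
    return list(heapq.merge(*partials))


print(query("B", "M"))
# [('Baker', 'doc4'), ('Clark', 'doc1'), ('Lopez', 'doc8')]
```

Each node only scans its own rows, so the indexing and filtering work is parallel; the merge step is cheap because every partial result is already sorted.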

Page 42: Couche Base par Tugdual Grall

DEMONSTRATION

Page 43: Couche Base par Tugdual Grall

Cross Data Center Replication (XDCR)

[Diagram: two Couchbase Clusters, one in the West Coast Data Center and one in the East Coast Data Center; each has SERVER 1, SERVER 2 and SERVER 3, and each server holds docs in its RAM CACHE and on DISK]

Page 44: Couche Base par Tugdual Grall

Map Function

Page 45: Couche Base par Tugdual Grall

DEMONSTRATION

Page 46: Couche Base par Tugdual Grall

Elastic Search Adaptor

• Elastic Search is good for ad-hoc queries and faceted browsing

• Our adapter is aware of changing Couchbase topology

• Indexed by Elastic Search after stored to disk in Couchbase

Page 47: Couche Base par Tugdual Grall

I’m Excited to See What You Build, Q & A

Contact me on Twitter: @tgrall

Contact me by email: [email protected]

Learn More About Design Patterns: CouchbaseModels.com

Setting up for Ruby on Rails: CouchbaseOnRails.com

Couchbase Docs: www.couchbase.com/docs/index-full.html

Couchbase Forums: www.couchbase.com/forums

IRC: #couchbase, #libcouchbase

Page 48: Couche Base par Tugdual Grall

Win some cool stuff, send a mail to

[email protected]

with NORMANDYJUG in the subject.

Page 49: Couche Base par Tugdual Grall

Q&A
