CouchDB

Can we be comfortable without SQL?

ACCU • 14th April 2010

Guy Bolton King • [email protected]

Sunday, 18 April 2010

No SQL?

NOSQL

• Not Only SQL

• A broad church!

Definition by fiat

"NoSQL DEFINITION: Next Generation Databases mostly address some of the points: being non-relational, distributed, open-source and horizontal scalable. The original intention has been modern web-scale databases. The movement began early 2009 and is growing rapidly. Often more characteristics apply as: schema-free, replication support, easy API, eventually consistency / BASE (not ACID), and more."

http://nosql-database.org/

Definition by example

• Key-Value Stores
  • Amazon SimpleDB, Redis, Scalaris, Tokyo Cabinet, Berkeley DB

• Distributed Key-Value Stores
  • Amazon Dynamo, Project Voldemort

• BigTable
  • Google BigTable, HBase/Hadoop, Cassandra, Hypertable

• Graph
  • Neo4J, VertexDB

• Document Store
  • CouchDB, MongoDB, Riak

What's wrong with RDBMSs?

• Relationships don't survive partitioning:

• Integrity rules and Joins

• Schema rigidity

• Decomposing data too early

• Hard to change

• Integrity rules often surplus to requirements and defeat…

• ORMs

• Brrr.

• Mordac

CAP

• Consistency
  • All clients see the same data, even with concurrent updates

• Availability
  • All clients can read & write some version of the data

• Partition Tolerance
  • Scaling: we can partition the data across multiple nodes

• Pick Two

[Diagram: CAP triangle with corners Consistency, Availability and Partition Tolerance; systems placed on it: RDBMS, Paxos-based solutions, CouchDB]

CouchDB

Cluster Of Unreliable Commodity Hardware

Features

Concurrency

• Written in Erlang

• Scales to multiple cores smoothly

• One Erlang process per request

• Lock-free design: MVCC

• Each process sees the version of the database that existed at the beginning of the request

Robustness

• Erlang's crash-only design

• No DB repair necessary

• Snapshots/hot backups trivial: just copy the files

Fault Tolerance

• Continuous replication

• Load-balancing & sharding via couchdb-lounge

Who's using it?

• BBC

• Ubuntu One

• Opscode Chef

• Mozilla Raindrop

• Couchio

• Cloudant

• MySpace, IBM, Apple, eBay

• …and more

Architecture

diagram from couchdb.apache.org

Storage Engine

• B+Tree

• Append-only design

• Like SVN

• Values are structured JSON documents, with free-form binary attachments if needed

• Each document has a unique _id key

[Diagram, three steps: a tree with nodes labelled G, M, A and 3; a modified copy M' is then added alongside the original M]

View Servers

• Map/Reduce design

• Map/Reduce functions stored in special "_design" documents

• View server protocol allows any language to be used

• JavaScript by default

• Python, Scala, Ruby and others available

• Native Erlang views possible too

View results

• Each view's results stored as a B+Tree

• Map function results as leaf nodes

• Reduce function results in inner nodes

• Results are only updated when view used

• …and only for updated documents

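[Diagram: view B+Tree with reductions. Leaf nodes hold the map results A, G and M; inner nodes hold r([A]) and r([G, M]); the root holds r( [r([A]), r([G, M]), 3] )]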

JavaScript

• Objects are hashes:
    var obj = { "k1": "value", "k2": 12 };

• Values retrieved via index:
    var val = obj["k1"];

• …or dot-notation (iff key is a valid identifier and, in older engines, not a keyword):
    var val = obj.k1;

• Arrays are objects with integer keys and a length member:
    var ar = ["foo", 12];
    ar[0] === "foo";
    ar.length === 2;

Functions

• Functions are declared like this:
    function f(a, b) { return a + b; }

• They're values; this is better:
    var f = function(a, b) { return a + b; };

Iteration

• Iteration using for:
    for (var k in obj) { print(obj[k]); }

JSON

• JavaScript Object Notation

• Doug Crockford

• Serialise data as JavaScript representation

• That's all!

Client API

• RESTful HTTP transport

• JSON request and response body

• CouchDB listens on 127.0.0.1:5984 by default

RESTful HTTP

• PUT path/new-resource data

• POST path/resource data

• GET path/resource

• DELETE path/resource

Databases

• PUT db

• DELETE db

• GET _all_dbs

Documents

• Create + Update

• PUT db/id data

• Read

• GET db/id

• Delete

• DELETE db/id

Documents

• Updates for HTML <form> users

• POST db/id data

• Requires Content-Type: multipart/form-data

Creating Documents

• Create returns ok, id, rev

• Retrieve shows the same id and rev as _id and _rev members

Updating Documents

• Remember MVCC?

• Must specify latest _rev when updating

• If you don't have latest _rev, request will fail with 409 Conflict

• Need to get latest _rev again

• Resolve any conflicts in client
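
The revision check can be modelled in a few lines of JavaScript. This is a minimal sketch, not CouchDB's implementation: TinyStore, its put and get methods, and the "N-demo" revision strings are hypothetical stand-ins for PUT db/id, GET db/id and real _rev values.

```javascript
// In-memory stand-in for one database: id -> stored document.
// Hypothetical model; real CouchDB revisions look like "1-<hash>".
function TinyStore() {
  this.docs = {};
}

// Mimics PUT db/id: succeeds only if doc._rev matches the stored revision.
TinyStore.prototype.put = function (id, doc) {
  var current = this.docs[id];
  var currentRev = current ? current._rev : undefined;
  if (doc._rev !== currentRev) {
    return { error: "conflict", status: 409 };
  }
  // Bump the leading generation number of the revision.
  var generation = currentRev ? parseInt(currentRev, 10) + 1 : 1;
  var stored = {};
  for (var k in doc) { stored[k] = doc[k]; }
  stored._id = id;
  stored._rev = generation + "-demo";
  this.docs[id] = stored;
  return { ok: true, id: id, rev: stored._rev };
};

// Mimics GET db/id.
TinyStore.prototype.get = function (id) { return this.docs[id]; };

var store = new TinyStore();
var created = store.put("alice", { name: "Alice" });      // no _rev: create
var stale = store.put("alice", { name: "Alice B" });      // no _rev: rejected
var latest = store.get("alice");                          // fetch latest _rev
var updated = store.put("alice", { _rev: latest._rev, name: "Alice B" });
```

The second put is rejected because it does not carry the stored revision; fetching the document and resending with its _rev succeeds.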

Attachments

• Create
  • PUT db/id/attachment
    Content-Type: type

• Read
  • GET db/id/attachment

• Update
  • PUT db/id/attachment?rev=old-rev

• Delete
  • DELETE db/id/attachment

• Attachment meta-data is stored in a document's _attachments key

Tedious

• Specifying _rev gets a bit tedious when using curl

• There must be an easier way to experiment…

Futon

http://localhost:5984/_utils

Document IDs

• Must be unique

• Ensure uniqueness using your own scheme OR

• Ask CouchDB for some IDs:

• GET _uuids?count=n

• OR use on-the-fly id creation (inefficient):

• POST db data

Retrieving All Documents

• GET db/_all_docs?include_docs=true

Limiting

• Use parameters to _all_docs

• startkey

• endkey

• limit

• skip

• GET db/_all_docs?startkey="s"&limit=n
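
Key parameters are JSON values, so a string key keeps its double quotes and the whole value should be URL-encoded. A small helper along these lines can build the query string; buildQuery and the accu-food path used in the example calls are illustrative names, not part of CouchDB.

```javascript
// Parameters whose values must be sent as JSON text.
var JSON_PARAMS = { key: true, startkey: true, endkey: true };

// Build "path?name=value&..." with JSON-encoded, URL-escaped key parameters.
function buildQuery(path, params) {
  var parts = [];
  for (var name in params) {
    var raw = JSON_PARAMS[name] === true
      ? JSON.stringify(params[name])   // "s" becomes "\"s\""
      : String(params[name]);          // limit, skip, include_docs, ...
    parts.push(encodeURIComponent(name) + "=" + encodeURIComponent(raw));
  }
  return parts.length ? path + "?" + parts.join("&") : path;
}

var simple = buildQuery("accu-food/_all_docs", { startkey: "s", limit: 5 });
var complex = buildQuery("accu-food/_all_docs", { key: ["veg", "carrots"] });
```

Here simple is "accu-food/_all_docs?startkey=%22s%22&limit=5"; array keys are encoded the same way.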

Querying with Views

• So far, just looked at CouchDB as KV store

• Views are how we do queries

• _all_docs is a built-in view

• We need some data first

Food Example

• Each document represents a person and their fruit & veg preferences

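Who likes pears?

• people-by-fruit:

    function(doc) {
      // Skip documents that have no fruit preferences
      if (doc.preferences && doc.preferences.fruit) {
        for (var i in doc.preferences.fruit) {
          // One row per fruit: key is the fruit, value is the person's id
          emit(doc.preferences.fruit[i], doc._id);
        }
      }
    }

• emit(key, value) adds a row to the result

• i.e. it stores value under key in the view's B-tree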
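
To see which rows the map function produces, it can be run outside CouchDB with a stub emit. This is only a local sketch: the three sample documents are made up, the stub collects rows in an array rather than a B-tree, and a guard on doc.preferences is included so documents without preferences are skipped.

```javascript
// Stub for CouchDB's emit(): collect rows in a plain array.
var fruitRows = [];
function emit(key, value) {
  fruitRows.push({ key: key, value: value });
}

// The people-by-fruit map function.
var peopleByFruit = function (doc) {
  if (doc.preferences && doc.preferences.fruit) {
    for (var i in doc.preferences.fruit) {
      emit(doc.preferences.fruit[i], doc._id);
    }
  }
};

// Hypothetical sample documents.
var sampleDocs = [
  { _id: "alice", preferences: { fruit: ["pears", "apples"], veg: ["peas"] } },
  { _id: "bob", preferences: { veg: ["carrots"] } },
  { _id: "eve", preferences: { fruit: ["pears"] } }
];

sampleDocs.forEach(function (doc) { peopleByFruit(doc); });

// A view keeps its rows ordered by key; sort to imitate that.
fruitRows.sort(function (a, b) {
  return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
});

// Equivalent of querying the view with key="pears".
var pearLovers = fruitRows
  .filter(function (row) { return row.key === "pears"; })
  .map(function (row) { return row.value; });
```

Bob emits nothing because he has no fruit preferences, so only alice and eve appear under "pears".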

Design Documents

• Begin with _design/

• _design/food:
    {
      "views": {
        "people-by-fruit": {
          "map": "function(doc) { ... }"
        }
      }
    }

• (curl not much use for this: use Futon)

Running views

• GET db/_design/design-doc/_view/view-name

• Returns list of doc-id, key, value

• Can limit using key, startkey, endkey, limit and skip

• Show full containing doc with include_docs=true

Who likes pears?

$ curl -X GET 'http://localhost:5984/accu-food/_design/food/_view/people-by-fruit?key="pears"'

{"total_rows":10,"offset":6,"rows":[{"id":"alice","key":"pears","value":"alice"},{"id":"eve","key":"pears","value":"eve"},{"id":"harry","key":"pears","value":"harry"},{"id":"tom","key":"pears","value":"tom"}]}

What's happening?

• The map function is applied to any updated documents, and the view's B-tree updated

• We then index into the results with key

• If no documents are updated before the next use of the view, it will just index into the B-tree of results
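
The lazy, incremental behaviour can be illustrated with a toy model. All names here (LazyView, query, the changes array with seq numbers) are hypothetical, a plain object stands in for the view's B-tree, and emit is passed to the map function as an argument instead of being a global as it is in CouchDB.

```javascript
// Toy model of a view index that is only brought up to date when queried.
function LazyView(mapFn) {
  this.mapFn = mapFn;     // called as mapFn(doc, emit)
  this.rowsByDoc = {};    // doc id -> rows emitted for that doc
  this.indexedSeq = 0;    // highest update sequence already indexed
}

// changes: array of { seq: number, doc: object }, ordered by seq.
// Returns the matching rows plus how many documents had to be re-mapped.
LazyView.prototype.query = function (changes, key) {
  var self = this;
  var remapped = 0;
  changes.forEach(function (change) {
    if (change.seq <= self.indexedSeq) { return; }   // already indexed
    var emitted = [];
    self.mapFn(change.doc, function (k, v) {
      emitted.push({ id: change.doc._id, key: k, value: v });
    });
    self.rowsByDoc[change.doc._id] = emitted;        // replaces the old rows
    self.indexedSeq = change.seq;
    remapped += 1;
  });
  var rows = [];
  for (var id in this.rowsByDoc) {
    this.rowsByDoc[id].forEach(function (row) {
      if (row.key === key) { rows.push(row); }
    });
  }
  return { rows: rows, remapped: remapped };
};

var byFruit = new LazyView(function (doc, emit) {
  var fruit = (doc.preferences && doc.preferences.fruit) || [];
  for (var i = 0; i < fruit.length; i++) { emit(fruit[i], doc._id); }
});

var changes = [
  { seq: 1, doc: { _id: "alice", preferences: { fruit: ["pears"] } } },
  { seq: 2, doc: { _id: "bob", preferences: { fruit: ["apples"] } } }
];
var first = byFruit.query(changes, "pears");    // maps both documents
var second = byFruit.query(changes, "pears");   // nothing changed: no mapping
changes.push({ seq: 3, doc: { _id: "bob", preferences: { fruit: ["pears"] } } });
var third = byFruit.query(changes, "pears");    // maps only the updated bob
```

The first query pays for mapping every document, the second does no mapping at all, and the third re-maps only the one document that changed.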

Who likes carrots?

• Could do the same for veg as we did for fruit

• Can do better

• Modify people-by-fruit to emit a complex key

Who likes carrots?

$ curl -g -X GET 'http://localhost:5984/accu-food/_design/food/_view/people-by-food?key=["veg","carrots"]'

{"total_rows":16,"offset":11,"rows":[{"id":"bob","key":["veg","carrots"],"value":"bob"},{"id":"dick","key":["veg","carrots"],"value":"dick"}]}
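
The slides do not show the people-by-food map function itself. One plausible shape, assuming every member of doc.preferences is an array of names, emits a two-element key of [kind, name]. The sample documents and the local emit stub below are illustrative only.

```javascript
// Local stub for emit(), collecting rows in an array.
var foodRows = [];
function emitFood(key, value) {
  foodRows.push({ key: key, value: value });
}

// Assumed shape of a people-by-food map function (not taken from the talk).
// Inside CouchDB it would call the global emit() instead of emitFood().
var peopleByFood = function (doc) {
  if (!doc.preferences) { return; }
  for (var kind in doc.preferences) {        // "fruit", "veg", ...
    var items = doc.preferences[kind];
    for (var i = 0; i < items.length; i++) {
      emitFood([kind, items[i]], doc._id);   // complex key: [kind, name]
    }
  }
};

// Hypothetical sample documents.
[
  { _id: "alice", preferences: { fruit: ["pears"], veg: ["peas"] } },
  { _id: "bob", preferences: { veg: ["carrots"] } },
  { _id: "dick", preferences: { fruit: ["apples"], veg: ["carrots"] } }
].forEach(function (doc) { peopleByFood(doc); });

// Equivalent of key=["veg","carrots"]: compare keys as JSON text.
var wanted = JSON.stringify(["veg", "carrots"]);
var carrotLovers = foodRows
  .filter(function (row) { return JSON.stringify(row.key) === wanted; })
  .map(function (row) { return row.value; });
```

With one view keyed this way, fruit and veg queries differ only in the first element of the key.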

Who likes pears and carrots?

• Can't really be done in CouchDB alone

• Need to run two queries

• Make the intersection in client software

• Not every classic RDBMS problem can be solved!
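
The client-side step is small. The sketch below takes two view responses in the shape CouchDB returns and intersects their document ids; the response contents are made-up sample data and idsOf and intersect are hypothetical helper names.

```javascript
// Extract the document ids from a view response.
function idsOf(response) {
  return response.rows.map(function (row) { return row.id; });
}

// Ids present in both lists, in the order of the second list.
function intersect(a, b) {
  var seen = {};
  a.forEach(function (id) { seen[id] = true; });
  return b.filter(function (id) { return seen[id] === true; });
}

// Hypothetical responses from two separate view queries.
var likesPears = { rows: [
  { id: "alice", key: ["fruit", "pears"], value: "alice" },
  { id: "eve", key: ["fruit", "pears"], value: "eve" },
  { id: "tom", key: ["fruit", "pears"], value: "tom" }
] };
var likesCarrots = { rows: [
  { id: "bob", key: ["veg", "carrots"], value: "bob" },
  { id: "eve", key: ["veg", "carrots"], value: "eve" }
] };

var likesBoth = intersect(idsOf(likesPears), idsOf(likesCarrots));
```

Only eve appears in both result sets, so likesBoth holds a single id.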

Who likes statistics?

• We all do!

• How do we count

• The number of people who like pears?

• The number of people who like veg?


Reduce

• Modify people-by-food to return a count instead of doc._id

• Add a reduce function:

{
  "views": {
    "people-by-fruit": {
      "map": "function(doc) { ... }",
      "reduce": "function(keys, values, rereduce) { ... }"
    }
  }
}
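
One way the elided map and reduce bodies might look is sketched here. It assumes documents carry a `preferences` object keyed by type, as the show function later in the deck suggests. The `emit` and `sum` definitions are stand-ins for helpers the CouchDB JavaScript view server provides, and the sample documents are hypothetical.

```javascript
// Stand-ins for helpers provided by the CouchDB JavaScript view server;
// defined here only so the sketch runs outside CouchDB.
var emitted = [];
function emit(key, value) { emitted.push({key: key, value: value}); }
function sum(values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

// Map: emit one row per liked item, with a count of 1 instead of doc._id.
var map = function (doc) {
  for (var type in doc.preferences) {
    doc.preferences[type].forEach(function (item) {
      emit([type, item], 1);
    });
  }
};

// Reduce: add up the counts. Summing is valid for both the first pass
// and rereduce passes, so the rereduce flag is not consulted here.
var reduce = function (keys, values, rereduce) {
  return sum(values);
};

// Hypothetical documents (illustrative data only).
[
  {_id: "bob", preferences: {fruit: ["pears"], veg: ["carrots"]}},
  {_id: "dick", preferences: {fruit: ["pears", "apples"], veg: ["carrots"]}}
].forEach(map);

var total = reduce(null, emitted.map(function (r) { return r.value; }), false);
console.log(total); // total number of emitted rows
```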


How many people like pears?

$ curl -g -X GET 'http://localhost:5984/accu-food/_design/food/_view/food-count?key=["fruit","pears"]'

{"rows":[{"key":null,"value":4}]}


How many people like veg?

$ curl -g -X GET 'http://localhost:5984/accu-food/_design/food/_view/food-count?startkey=["veg","a"]&endkey=["veg","z"]'

{"rows":[{"key":null,"value":7}]}

• Collation is unicode order

• Uses IBM's ICU library

• Can skip ICU collation using raw=true
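
Key parameters such as startkey and endkey are JSON values, so a client has to JSON-encode and then URL-encode them. A minimal sketch of that step follows; the helper name `keyRangeUrl` is hypothetical, and the base URL is the one used in the curl example.

```javascript
// Build a key-range view query. Each key is JSON-encoded, then
// percent-encoded so brackets and quotes survive in the URL.
function keyRangeUrl(base, startkey, endkey) {
  return base +
    "?startkey=" + encodeURIComponent(JSON.stringify(startkey)) +
    "&endkey=" + encodeURIComponent(JSON.stringify(endkey));
}

var base = "http://localhost:5984/accu-food/_design/food/_view/food-count";
var url = keyRangeUrl(base, ["veg", "a"], ["veg", "z"]);
console.log(url);
```

Because objects collate after strings in CouchDB's view collation, a range of ["veg"] to ["veg", {}] is a common alternative that does not depend on every item name falling between "a" and "z".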


Grouping

• Querying a reduce view normally returns the final reduction, i.e. the value at the root node of the B-tree

• Can show "lower" results by adding group=true and group_level


Final scores

$ curl -g -X GET 'http://localhost:5984/accu-food/_design/food/_view/food-count?group=true'

{"rows":[{"key":["fruit","apples"],"value":2},{"key":["fruit","bananas"],"value":1},{"key":["fruit","durian"],"value":1},{"key":["fruit","kiwis"],"value":1},{"key":["fruit","kumquats"],"value":1},{"key":["fruit","pears"],"value":4},{"key":["veg","broccoli"],"value":1},{"key":["veg","carrots"],"value":3},{"key":["veg","celeriac"],"value":1},{"key":["veg","potatoes"],"value":2}]}

$ curl -g -X GET 'http://localhost:5984/accu-food/_design/food/_view/food-count?group=true&group_level=1'

{"rows":[{"key":["fruit"],"value":10},{"key":["veg"],"value":7}]}
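
To see how the group_level=1 totals relate to the group=true rows, the collapse can be reproduced client-side. This is only an illustration of the arithmetic, not how CouchDB computes it internally (it reuses reductions stored in the B-tree); the function name `groupLevel` is hypothetical.

```javascript
// Rows as returned with group=true (copied from the output above).
var rows = [
  {"key": ["fruit", "apples"], "value": 2},
  {"key": ["fruit", "bananas"], "value": 1},
  {"key": ["fruit", "durian"], "value": 1},
  {"key": ["fruit", "kiwis"], "value": 1},
  {"key": ["fruit", "kumquats"], "value": 1},
  {"key": ["fruit", "pears"], "value": 4},
  {"key": ["veg", "broccoli"], "value": 1},
  {"key": ["veg", "carrots"], "value": 3},
  {"key": ["veg", "celeriac"], "value": 1},
  {"key": ["veg", "potatoes"], "value": 2}
];

// Truncate each key to its first `level` elements and sum the values of
// adjacent rows that share the truncated key. Relies on rows being in
// key order, which is how views return them.
function groupLevel(rows, level) {
  var out = [];
  rows.forEach(function (row) {
    var prefix = row.key.slice(0, level);
    var last = out[out.length - 1];
    if (last && JSON.stringify(last.key) === JSON.stringify(prefix)) {
      last.value += row.value;
    } else {
      out.push({key: prefix, value: row.value});
    }
  });
  return out;
}

console.log(JSON.stringify(groupLevel(rows, 1)));
// [{"key":["fruit"],"value":10},{"key":["veg"],"value":7}]
```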


Reduce Parameters

• keys: n keys that correspond to the…

• values: n values of the map results to be reduced.

• rereduce: true if we're reducing non-leaf nodes
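
A reduce function that has to treat the two cases differently might look like the sketch below. It counts rows without assuming the map value is 1. The `sum` helper is a stand-in for the one the view server provides, the leaf values are made up, and keys are passed as null purely for brevity.

```javascript
// Stand-in for the view server's sum() helper.
function sum(values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

var countReduce = function (keys, values, rereduce) {
  if (rereduce) {
    // values are partial counts produced by earlier reduce calls
    return sum(values);
  }
  // values are raw map outputs: one entry per emitted row
  return values.length;
};

// Simulate two leaf reductions followed by one rereduce at the parent.
var leafA = countReduce(null, ["bob", "dick"], false);
var leafB = countReduce(null, ["alice"], false);
var root = countReduce(null, [leafA, leafB], true);
console.log(root);
```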


Reduce Caveats

• Reduce should produce a scalar value

• Don't try and produce a complex value

• e.g. a hash of unique keys and their counts

• Complex values munch storage, often becoming bigger than the document B-Tree itself


Map/Reduce wrap-up

• Map/Reduce functions are side-effect free

• Only scratched the surface in terms of techniques

• Haven't covered handling different document types (although we've hinted at it)

• 0.11 includes linked documents


Formatting results

• Output seen so far

• JSON

• Futon-formatted

• Wouldn't it be nice if we could get CouchDB to format our data?


Show Functions

• Format a document

• Live in the shows key of a design document

• function(doc, req)

• doc is the doc

• req is the HTTP Request object

• return HTTP Response object or body string

• GET db/_design/design/_show/show/id


function(doc, req) {
  log(req);  // write the request object to the CouchDB log
  send("<!DOCTYPE html>");
  send("<html><head><title>" + doc._id + "</title></head><body>");
  send("<h1>" + doc._id + "</h1>");
  // One <h2> section per preference type, one <li> per item
  for (var type in doc.preferences) {
    send("<h2>" + type + "</h2><ul>");
    for (var i in doc.preferences[type]) {
      send("<li>" + doc.preferences[type][i] + "</li>");
    }
    send("</ul>");
  }
  send("</body></html>");
  return "";
}


List Functions

• Format a view

• Live in the lists key of a design document

• function(head, req)

• head is {total_rows: n, offset: n}

• req is the HTTP Request object

• return HTTP Response object or body string


function(head, req) {
  start({"headers": {"Content-Type": "text/html"}});

  send("<!DOCTYPE html>\n");
  send("<html><head><title>People</title></head><body>");

  var row = null;
  var curtype = null;
  var curitem = null;

  // Rows arrive in key order, so start a new heading
  // whenever the type or the item changes
  while (row = getRow()) {
    if (row.key[0] !== curtype) {
      curtype = row.key[0];
      curitem = null;
      send("<h1>" + curtype + "</h1>");
    }
    if (row.key[1] !== curitem) {
      curitem = row.key[1];
      send("<h2>" + curitem + " are enjoyed by</h2>");
    }
    send("<p>" + row.value + "</p>");
  }

  send("</body></html>");
  return "";
}


Helper functions

• toJSON(obj)

• Converts obj to a JSON string

• JSON.parse(string)

• Converts JSON string to an object

• send(chunk)

• Append chunk to the body of the result

• start(obj)

• Like calling return obj


Format Helpers

• registerType(name, content-type, …)

• registers name as a format mapped to a list of content-types

• provides(name, function() { … })

• registers function as providing the rendering code for the given format. Formats can be specified explicitly as a query parameter ?format=name or by the HTTP Accept header.


Result object keys

• code

• headers

• body

• stop

• Returning {"stop": true} terminates list function early


Update functions

• Can change document data!

• Useful for adding timestamps

• Live in updates key of design document

• PUT db/_design/design/_update/update/id


Update functions

• function(curdoc, req)

• curdoc is the current contents of the document identified by req.id

• may be null if no such document exists

• can create one!

• return [newdoc, resp]

• resp is as the return value in show and list functions
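
A timestamping update function along these lines might look like the following sketch. The field name `updated_at` and the response text are hypothetical choices; the function is called directly here with a hand-built req object so that the sketch runs outside CouchDB.

```javascript
var addTimestamp = function (curdoc, req) {
  var doc = curdoc;
  if (!doc) {
    // No such document yet: create one using the id from the request.
    doc = {_id: req.id};
  }
  // Milliseconds since the epoch; "updated_at" is a hypothetical field name.
  doc.updated_at = new Date().getTime();
  // First element is the document to save, second is the response.
  return [doc, "updated " + doc._id];
};

// Simulate a request for a document that does not exist yet.
var result = addTimestamp(null, {id: "bob"});
console.log(result[1]);
```

A request with no id in the URL would leave req.id unset, so a fuller version would need to generate an id in that case.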


Validation

• validate_doc_update key in design doc

• One per design doc

• All validation functions in db are called on an update


Validation

• function(new-doc, old-doc, user-context)

• user-context contains {db, name, roles}

• throw({"forbidden": reason}) on failure
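
A small validation function in this style is sketched below. The rule it enforces (every document needs a `preferences` object) is an assumption chosen to match the example documents, and the `check` wrapper exists only to exercise the function outside CouchDB.

```javascript
var validate = function (newDoc, oldDoc, userCtx) {
  if (newDoc._deleted) {
    return; // always allow deletions
  }
  if (typeof newDoc.preferences !== "object" || newDoc.preferences === null) {
    throw({forbidden: "Document must have a preferences object"});
  }
};

// Test wrapper: returns "ok" or the rejection reason.
function check(doc) {
  try {
    validate(doc, null, {db: "accu-food", name: null, roles: []});
    return "ok";
  } catch (e) {
    return e.forbidden;
  }
}

console.log(check({_id: "bob", preferences: {fruit: ["pears"]}}));
console.log(check({_id: "eve"}));
```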


Replication


Replication API

• One-shot

• POST /_replicate {"source": "http://host-a/db", "target": "http://host-b/db"}

• Continuous

• POST /_replicate {"source": "…", "target": "…", "continuous": true}
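
The two request shapes can be built from one helper, sketched here. The function name `replicationRequest` is hypothetical, the path assumes the server-level _replicate endpoint, and actually sending the request is left to whatever HTTP client is in use.

```javascript
// Describe a replication request without sending it.
function replicationRequest(source, target, continuous) {
  var body = {source: source, target: target};
  if (continuous) {
    body.continuous = true; // keep replicating as new changes arrive
  }
  return {
    method: "POST",
    path: "/_replicate",
    headers: {"Content-Type": "application/json"},
    body: JSON.stringify(body)
  };
}

var oneShot = replicationRequest("http://host-a/db", "http://host-b/db", false);
var ongoing = replicationRequest("http://host-a/db", "http://host-b/db", true);
console.log(oneShot.body);
console.log(ongoing.body);
```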


Changes API

• Continuous replication requires host-b to monitor host-a

• Done via Changes API


Changes API

• GET db/_changes

• since=db-seq

• feed=[longpoll|continuous]

• default is immediate return; longpoll waits for 1 change; continuous keeps on truckin'

• heartbeat=millis

• returns {"seq": db-seq, "id": doc-id, "changes": [{"rev": rev, …}, …]}
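
With feed=continuous, each change arrives as one JSON object per line, and heartbeats show up as blank lines. A client-side parser for a chunk of that feed could be sketched as below. The function name `parseChanges` and the sample revisions are made up, and the sketch assumes each chunk ends on a line boundary.

```javascript
// Parse a chunk of a continuous _changes feed into change objects.
function parseChanges(chunk) {
  return chunk
    .split("\n")
    .filter(function (line) { return line.trim() !== ""; }) // skip heartbeats
    .map(function (line) { return JSON.parse(line); });
}

// Illustrative feed text: two changes separated by a heartbeat newline.
var sample =
  '{"seq":1,"id":"bob","changes":[{"rev":"1-abc"}]}\n' +
  '\n' +
  '{"seq":2,"id":"dick","changes":[{"rev":"2-def"}]}\n';

var changes = parseChanges(sample);
console.log(changes.map(function (c) { return c.id; }));
```

A client that remembers the last seq it processed can pass it back as since= when it reconnects.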


Conflicts

• Occur when the same document id is changed on both host-a and host-b

• Conflicting revs in the invisible _conflicts key

• Need a conflicts view to find them

• CouchDB always resolves conflicts automatically using its own algorithm

• We may want to do something else
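
A conflicts view needs only a map function. The sketch below shows one possible form; `emit` is a stand-in for the view server's helper, and the sample documents and revision strings are made up.

```javascript
// Stand-in for the view server's emit().
var emitted = [];
function emit(key, value) { emitted.push({key: key, value: value}); }

// Emit a row only for documents that carry conflicting revisions.
var conflictsMap = function (doc) {
  if (doc._conflicts) {
    emit(doc._id, doc._conflicts);
  }
};

conflictsMap({_id: "bob", _rev: "2-aaa", _conflicts: ["2-bbb"]});
conflictsMap({_id: "dick", _rev: "1-ccc"});
console.log(JSON.stringify(emitted));
```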


Resolving Conflicts

• Look at the conflicting and current revs

• GET db/id

• GET db/id?rev=conflict-rev

• Merge/replace in your client code

• Put the winner back

• PUT db/id

• Delete the conflicting revs

• DELETE db/id?rev=conflict-rev


Security


Authentication

• Both HTTP BasicAuth and Cookie-based authentication are available


Permissions & Roles

• Users have read or read/write privs.

• Also have roles

• Available in user-context parameter in validate_doc_update, so…

• You can implement your own write security scheme
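
A role-based write rule built on the user-context parameter might be sketched like this. The "editor" role is hypothetical; "_admin" is the role CouchDB gives server admins. The `canWrite` wrapper is only there to exercise the function outside CouchDB.

```javascript
var requireEditor = function (newDoc, oldDoc, userCtx) {
  var roles = userCtx.roles || [];
  if (roles.indexOf("_admin") !== -1 || roles.indexOf("editor") !== -1) {
    return; // allowed to write
  }
  throw({forbidden: "Only editors may write to this database"});
};

// Test wrapper: true if the write would be accepted for these roles.
function canWrite(roles) {
  try {
    requireEditor({_id: "bob"}, null, {db: "accu-food", name: "someone", roles: roles});
    return true;
  } catch (e) {
    return false;
  }
}

console.log(canWrite(["editor"]));
console.log(canWrite([]));
```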


What's not there

• No SSL

• Everyone can read

• …so you may want to reverse-proxy CouchDB and apply your own stricter rules


Performance & Tuning


View speed

• Allow use of old view data with stale=ok

• If all requests use stale=ok view will never update!

• Refresh views with externally scheduled job


Update speed

• Use the _bulk_docs endpoint with user-defined, monotonic document IDs

• Queue updates with batch=ok

• Commit to disk controlled by batch_save_size/interval .ini params or…

• POST db/_ensure_full_commit

• Require full sync-on-commit (more reliable, but slower): delayed_commits .ini param
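
As an illustration of the first point, a _bulk_docs request body with monotonic ids could be assembled as in this sketch. The helper name `bulkDocsBody`, the id prefix and the ten-digit padding are all assumptions. The padding makes the ids sort in insertion order.

```javascript
// Build a _bulk_docs body, assigning zero-padded sequential ids.
function bulkDocsBody(prefix, start, docs) {
  return {
    docs: docs.map(function (doc, i) {
      var seq = String(start + i);
      while (seq.length < 10) {
        seq = "0" + seq; // pad so ids sort in numeric order
      }
      var copy = {_id: prefix + seq};
      for (var k in doc) {
        copy[k] = doc[k];
      }
      return copy;
    })
  };
}

var body = bulkDocsBody("person-", 41, [
  {preferences: {fruit: ["pears"]}},
  {preferences: {veg: ["carrots"]}}
]);
console.log(JSON.stringify(body));
```

The resulting object would be sent as the JSON body of a POST to db/_bulk_docs.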


CouchDB benchmark

$ test/bench/run
1..6
# Single doc inserts: 162.75 docs/second
ok 1 single_doc_insert
# Single doc inserts with batch=ok: 466.09 docs/second
ok 2 batch_ok_doc_insert
# Bulk docs - 100: 4115.23 docs/second
ok 3 bulk_doc_100
# Bulk docs - 1000: 4202.56 docs/second
ok 4 bulk_doc_1000
# Bulk docs - 5000: 3934.68 docs/second
ok 5 bulk_doc_5000
# Bulk docs - 10000: 3942.67 docs/second
ok 6 bulk_doc_10000


View Servers

• Some view servers are faster than others

• Python is faster than JavaScript [http://www.mikealrogers.com/archives/673]

• JSON serialisation is a bottleneck

• Can use Erlang native view server

• …but only as a last resort! BIG trade-off in convenience.


Client API

• Native Erlang access via hovercraft

• Bypasses HTTP transport overhead


DB Space

• B-Tree grows and grows

• Need to remove old revs

• Compaction must be triggered manually

• POST db/_compact

• POST db/_compact/design


Old View Data

• Changing a view implementation creates a new view B-Tree

• Old B-Trees are left hanging around

• Manual cleanup is required

• POST db/_view_cleanup


Couch Apps

• You can host a full application stack in CouchDB alone

• Futon

• MVC perspective

• Controller: JavaScript in the browser

• Model: documents + views

• View: show + list + static attachments (CSS, JS)


Problems

• No way to re-use code in map/reduce/show/list

• This is unlikely to happen on the server side, since it will hamper internal caching of design documents

• Even unsophisticated list/show functions will duplicate big chunks of code


couchapp

• Build a design document in the filesystem

• couchapp push http://host/db

• ...will create a design document at _id with directory hierarchy mapped to JSON object hierarchy

• json files become JSON objects

• other files become strings


Code re-use

• Macros in .js files are expanded prior to upload

• !code path/to/file.js

• Includes the specified javascript in your function

• !json path.to.file

• Creates an object hierarchy with only the leafmost node populated


Cloning

• You can extract a pushed couchapp back into the filesystem

• This opens up the idea of easily sharing your couchapps

• Not a substitute for proper DVCS though!


Clustering

• Roll-your-own, or…

• Use couchdb-lounge

• Sharded CouchDB instances

• python+twisted proxy

• Delegates requests

• Joins results

• Still need to roll-your-own replication


Hosting

• Cloudant have a big clustering solution in private beta

• Couchio offer individual CouchDB instances


Why?

• You have a good idea of how the relationships in your model work

• You want to be able to deploy on desktop and server, and synchronise between them

• You may need to scale to handle big data at some point

• You want a flexible B-tree on which to build your own clustering solution, as Cloudant have


flip

Bowdlerised from http://browsertoolkit.com/fault-tolerance.png
