Not only SQL

Introduction to NoSQL with some focus on Cassandra, CouchDB and Neo4j

Niklas Gustavsson
niklas.gustavsson@callistaenterprise.se
@protocol7
With thanks to Mårten Gustafson (@martengustafson) for the original presentation

Callista Enterprise | www.callistaenterprise.se

What?

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia

Not a single technique
Not a single type of data
Not a single type of use case

Why?

Non-relational
Schema-less
“Easily” scalable
REST/JSON API = web friendly
Emphasize P in the CAP theorem

CAP what?

Brewer’s theorem

Consistency
Availability
Partition tolerance

Eventual consistency

Consistency

N = number of nodes each piece of data is stored on
W = number of nodes that must acknowledge each write
R = number of nodes consulted for each read

R + W > N
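The inequality says that every set of R nodes read from must share at least one node with every set of W nodes written to, so a read always reaches at least one replica holding the latest write. A minimal sketch of that reasoning, assuming plain Python and hypothetical function names (nothing here is taken from a particular data store):

```python
from itertools import combinations


def quorum_overlap_guaranteed(n: int, w: int, r: int) -> bool:
    """The rule from the slide: read and write quorums overlap when R + W > N."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")
    return r + w > n


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """Brute-force check: does every write set share a node with every read set?"""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


# With N = 3 replicas, compare a few (W, R) choices.
for w, r in [(2, 2), (3, 1), (1, 1)]:
    print(f"N=3 W={w} R={r} -> overlap guaranteed: {quorum_overlap_guaranteed(3, w, r)}")
```

Choosing W and R so that R + W <= N trades this guarantee for lower latency, which is where eventual consistency comes in.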

What’s out there?

Name            Storage type   License                  Implemented in
Amazon Dynamo   Key/Value      n/a                      ?
Cassandra       ColumnFamily   ASL 2.0                  Java
CouchDB         Document       ASL 2.0                  Erlang
Dynomite        Key/Value      BSD/MIT-style            Erlang
HBase           ColumnFamily   ASL 2.0                  Java
MongoDB         Document       AGPL v3.0                C++
Neo4J           Graph          AGPL v3.0 / Commercial   Java
Riak            Key/Value      ASL 2.0                  Erlang
Redis           Key/Value      BSD/MIT-style            C
Scalaris        Key/Value      ASL 2.0                  Erlang
Tokyo Cabinet   Key/Value      LGPL                     C
Voldemort       Key/Value      ASL 2.0                  Java

Distribution

Master / Slave
Master / Slave(s)
Masterless (Master / Master)

Distribution

Columns: Masterless, Master/Slave, Hot standby
(the extraction does not preserve which column each X belongs to)

Amazon Dynamo   X
Cassandra       X
CouchDB         X
Dynomite        X
HBase           ?
MongoDB         X X
Neo4J           *
Riak            X
Redis           X
Scalaris        X
Tokyo Cabinet
Voldemort       X

* Neo4J HA coming “soon”

This is a very simplified view

Common factor

“...of the web...”

Of the who?!

Of the web

“...Django may be built for the Web, but CouchDB is built of the Web. I’ve never seen software that so completely embraces the philosophies behind HTTP. CouchDB makes Django look old-school in the same way that Django makes ASP look outdated”

- http://jacobian.org/writing/of-the-web/

Of the web

“...CouchDB may succeed, and it may fail; who knows. I’m sure of one thing, though: this is what the software of the future looks like”

- http://jacobian.org/writing/of-the-web/

So freakin’ what?!

All your webish skillz and tools apply...

proxies

load balancers

caches

HTTP client libs (etag, if-modified-since, etc)

language-, platform- and OS-neutral

MIME / Content-Type

These guys can just suck it

HTTP/REST is integration that works

ColumnFamily

Cassandra

Origins at Facebook
Apache project
Thrift API

World view

Keyspace, like a database
ColumnFamily, like a table; can have an unbounded number of columns
Column, named, stores binary data
SuperColumn, a column of columns
Row, identified by key, contains columns
No schema
Sparse
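One way to picture this world view is as nested maps: column family, then row key, then column name, then a binary value. The sketch below is an in-memory illustration only, not the Thrift API; the function names, the "Users" column family and the sample rows are all hypothetical.

```python
from collections import defaultdict

# One keyspace: column family -> row key -> column name -> value (bytes).
# Hypothetical in-memory stand-in for illustration, not Cassandra's API.
keyspace = defaultdict(lambda: defaultdict(dict))


def insert(column_family: str, row_key: str, column: str, value) -> None:
    """Store one column; no schema is declared up front."""
    keyspace[column_family][row_key][column] = value


def get_row(column_family: str, row_key: str) -> dict:
    """Return a row's columns sorted by column name, or {} if the row is absent."""
    return dict(sorted(keyspace[column_family].get(row_key, {}).items()))


# Sparse rows: each row only carries the columns it actually has.
insert("Users", "user1", "name", b"User One")
insert("Users", "user1", "email", b"user1@example.org")
insert("Users", "user2", "name", b"User Two")

# A super column is a named column whose value is itself a map of columns.
insert("Users", "user2", "address", {"city": b"Gothenburg", "country": b"SE"})

print(get_row("Users", "user1"))
print(get_row("Users", "user2"))
```

Rows in the same column family need not share any columns, which is what "no schema" and "sparse" amount to in practice.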

The Ring

[Figure: ring diagram with positions numbered 1 to 12]

Partitioning

Token
Snitching
Placement
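Tokens and placement can be sketched as a lookup on a sorted ring: each node owns a token, a key is hashed onto the ring, and replicas go to the owning node and the nodes that follow it clockwise. This is a simplified illustration under stated assumptions: the 32-bit token space, the MD5-based hash, the evenly spaced tokens and the node names are all hypothetical, and snitching (awareness of racks and data centres) is left out.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # hypothetical token space, chosen for illustration


def token_for(key: str) -> int:
    """Hash a key to a position on the ring, in [0, RING_SIZE)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Ring:
    def __init__(self, node_names):
        self.nodes = list(node_names)
        # Assumption: tokens are spread evenly; real clusters assign them explicitly.
        step = RING_SIZE // len(self.nodes)
        self.tokens = [i * step for i in range(len(self.nodes))]

    def replicas(self, key: str, n: int = 3):
        """Return the n nodes holding the key: the owner plus its clockwise successors."""
        position = token_for(key)
        # First node whose token is >= the key's position, wrapping past the last node.
        start = bisect.bisect_left(self.tokens, position) % len(self.nodes)
        count = min(n, len(self.nodes))
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(count)]


ring = Ring([f"node{i}" for i in range(1, 13)])  # 12 positions, as on the ring slide
print(ring.replicas("user1", n=3))
```

Because placement depends only on the key's hash and the token list, any node can work out where a key lives without asking a master.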

Document Store

Relax

CouchDB

Document-oriented database
Kick ass replication
HTTP/JSON API
Map/reduce view (index) definitions

World view

One document == JSON
One document == One record
Many documents == One database
Many databases == One instance
No schema

World view

Documents can
have attachments (binary + MIME type)
be rendered differently (HTML, XML)

A document

{
  "_id": "b098445d587b1f347e48e1a79301de02",
  "_rev": "1-80bfd8302e0f08eec2396c8107cafc19",
  "platform": {
    "browser": "mozilla",
    "version": "1.9.1.8"
  },
  "timestamp": 1270131033337
}

_id: key, either you choose it or CouchDB does it for you

_rev: revision number
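The two special fields work together: the key identifies the document and the revision guards updates, so a writer holding a stale revision is rejected instead of silently overwriting newer data. The sketch below imitates that behaviour in memory. TinyStore and Conflict are hypothetical names, and the way the revision hash is computed here is an assumption for illustration, not CouchDB's actual algorithm.

```python
import hashlib
import json
import uuid


class Conflict(Exception):
    """Raised when an update carries a stale or missing revision."""


class TinyStore:
    """Hypothetical in-memory stand-in for a document database."""

    def __init__(self):
        self._docs = {}

    def put(self, doc: dict) -> dict:
        doc = dict(doc)
        # Key: either the caller chooses it, or the store generates one.
        doc_id = doc.setdefault("_id", uuid.uuid4().hex)
        current = self._docs.get(doc_id)
        if current is not None and doc.get("_rev") != current["_rev"]:
            raise Conflict(f"stale revision for {doc_id}")
        generation = 1 if current is None else int(current["_rev"].split("-")[0]) + 1
        body = {k: v for k, v in doc.items() if k != "_rev"}
        # Assumption: an MD5 of the JSON body stands in for the real revision hash.
        digest = hashlib.md5(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()
        doc["_rev"] = f"{generation}-{digest}"
        self._docs[doc_id] = doc
        return dict(doc)


store = TinyStore()
first = store.put({"platform": {"browser": "mozilla", "version": "1.9.1.8"}})
second = store.put({**first, "timestamp": 1270131033337})
print(first["_rev"].split("-")[0], second["_rev"].split("-")[0])
```

Over HTTP, CouchDB reports the same situation as a 409 Conflict response, and the client is expected to fetch the latest revision and retry.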

Views

Filter
Collate
Aggregate

Views

{
  "_id": "b098445d587b1f347e48e1a79301de02",
  "_rev": "1-80bfd8302e0f08eec2396c8107cafc19",
  "platform": {
    "browser": "mozilla",
    "version": "1.9.1.8"
  },
  "timestamp": 1270131033337
}

+

function(doc) {
  emit(doc.platform.browser, doc.platform.version);
}

=

{
  "total_rows": 1,
  "offset": 0,
  "rows": [
    {
      "id": "b098445d587b1f347e48e1a79301de02",
      "key": "mozilla",
      "value": "1.9.1.8"
    }
  ]
}
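To make the filter, collate and aggregate steps concrete, the view above can be imitated in a few lines of Python. This is an emulation for illustration, not how CouchDB runs views: run_view and count_by_key are hypothetical helpers, the second and third documents are made up, and plain string sorting stands in for CouchDB's own collation rules.

```python
from collections import Counter

docs = [
    {   # the document from the slide
        "_id": "b098445d587b1f347e48e1a79301de02",
        "platform": {"browser": "mozilla", "version": "1.9.1.8"},
        "timestamp": 1270131033337,
    },
    # Hypothetical extra documents, added for illustration only.
    {"_id": "doc-2", "platform": {"browser": "webkit", "version": "531.21"}},
    {"_id": "doc-3", "note": "no platform field"},
]


def map_browser(doc):
    """Map step: emit (browser, version); documents without a platform are filtered out."""
    platform = doc.get("platform")
    if platform:
        yield platform["browser"], platform["version"]


def run_view(documents, map_fn):
    """Apply the map function to every document and collate the rows by key."""
    rows = [
        {"id": doc["_id"], "key": key, "value": value}
        for doc in documents
        for key, value in map_fn(doc)
    ]
    rows.sort(key=lambda row: (row["key"], row["id"]))
    return {"total_rows": len(rows), "offset": 0, "rows": rows}


def count_by_key(view):
    """Reduce step: aggregate the rows into a count per key."""
    return dict(Counter(row["key"] for row in view["rows"]))


view = run_view(docs, map_browser)
print(view)
print(count_by_key(view))
```

Running run_view on the slide's document alone reproduces the single-row result shown above.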

Views

Views are stored
as an accessible web resource
on disk
and incrementally updated
as well as replicated with the database

Replication

Peer to peer
Online/Offline
Conflict detection and resolution
Any number of nodes

CouchDB - Takeaways

Kick ass replication
Views are fast
Can host and serve complete webapps

Graphs

How to persist a network?

Neo4J

Graph database
Embedded
Java (and other languages) API
REST API
Traversal and indexes

World view

Nodes and relationships
Transactional (!)
No schema

Traversing

1. Start at a node
2. Walk the graph
3. Filter nodes
4. Decide when to stop
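The four steps map directly onto a small breadth-first traversal with two pluggable decisions: which nodes to return and when to stop walking. The sketch below uses a plain adjacency dictionary and is not the Neo4J API; the graph, the node names and the function signature are hypothetical.

```python
from collections import deque


def traverse(graph, start, include, stop):
    """Breadth-first walk from `start`.

    include(node, depth) decides whether a node is returned (step 3);
    stop(node, depth) decides whether to stop walking beyond it (step 4).
    """
    seen = {start}
    queue = deque([(start, 0)])          # step 1: start at a node
    while queue:
        node, depth = queue.popleft()
        if include(node, depth):
            yield node
        if stop(node, depth):
            continue                     # do not follow relationships past this node
        for neighbour in graph.get(node, ()):   # step 2: walk the graph
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))


# Hypothetical "knows" relationships between people.
knows = {
    "anna": ["bo", "cia"],
    "bo": ["dan"],
    "cia": ["dan"],
    "dan": ["eva"],
    "eva": [],
}

# Everyone within two hops of anna, excluding anna herself.
within_two_hops = list(
    traverse(
        knows,
        "anna",
        include=lambda node, depth: depth > 0,
        stop=lambda node, depth: depth >= 2,
    )
)
print(within_two_hops)
```

Keeping the filter and the stop condition as separate callbacks lets the same walk answer different questions, such as friends of friends or everything reachable.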

Indexing

Indexing done at insert, or as a batch
Lucene-based indexer for full-text search

Replication

Neo4J HA coming soon, P2P replication
Master-slave replication

Outro

Test one or more NoSQL thingys
Get familiar with Brewer’s CAP theorem
Get familiar with the Dynamo paper

Slides/code

Slides: http://www.slideshare.net/protocol7, CC-BY-SA

Code: http://github.com/protocol7
