Not only SQL Niklas Gustavsson [email protected] @protocol7 With thanks to Mårten Gustafson (@martengustafson) for the original presentation

Not only SQL

DESCRIPTION

Introduction to NoSQL with some focus on Cassandra, CouchDB and Neo4j

Page 1: Not only SQL

Not only SQL

Niklas Gustavsson [email protected] @protocol7
With thanks to Mårten Gustafson (@martengustafson) for the original presentation

Page 2: Not only SQL

Callista Enterprise | www.callistaenterprise.se

What?

“NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia

Page 3: Not only SQL

Not a single technique
Not a single type of data
Not a single type of use case

Page 4: Not only SQL

Why?

Non-relational
Schema-less
“Easily” scalable
REST/JSON API = web friendly
Emphasize P in the CAP theorem

Page 5: Not only SQL

CAP what?

Brewer’s theorem

Consistency
Availability
Partition tolerance

Eventual consistency

Page 6: Not only SQL

Consistency

N = number of nodes each piece of data is stored on
W = number of nodes that must confirm each write
R = number of nodes consulted for each read

R + W > N
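The inequality matters because it forces every read set to share at least one node with every write set, so a read always touches a replica that holds the latest confirmed write. A minimal Python sketch of that reasoning follows; the function names are illustrative and do not come from any of the databases discussed here.

```python
from itertools import combinations


def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """Return True when R + W > N, the condition from the slide."""
    return r + w > n


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """Brute force: does every possible write set share a node with every read set?"""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)  # an empty intersection is falsy
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


if __name__ == "__main__":
    for n, w, r in [(3, 2, 2), (3, 1, 1), (3, 3, 1)]:
        print(n, w, r, is_strongly_consistent(n, w, r), quorums_always_overlap(n, w, r))
```

With N = 3, choosing W = 2 and R = 2 satisfies the condition, while W = 1 and R = 1 does not: a read may then hit a replica the write never reached, which is where eventual consistency comes in.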

Page 7: Not only SQL

What’s out there?

               Storage type   License            Implemented in
Amazon Dynamo  Key/Value      n/a                ?
Cassandra      ColumnFamily   ASL 2.0            Java
CouchDB        Document       ASL 2.0            Erlang
Dynomite       Key/Value      BSD/MIT-style      Erlang
HBase          ColumnFamily   ASL 2.0            Java
MongoDB        Document       AGPL v3.0          C++
Neo4J          Graph          AGPL v3.0 / Comm   Java
Riak           Key/Value      ASL 2.0            Erlang
Redis          Key/Value      BSD/MIT-style      C
Scalaris       Key/Value      ASL 2.0            Erlang
Tokyo Cabinet  Key/Value      LGPL               C
Voldemort      Key/Value      ASL 2.0            Java

Page 8: Not only SQL

Distribution

Master / Slave
Master / Slave(s)
Masterless (Master / Master)

Page 9: Not only SQL

Distribution

Masterless   Master/Slave   Hot standby

Amazon Dynamo X
Cassandra X
CouchDB X
Dynomite X
HBase ?
MongoDB X X
Neo4J*
Riak X
Redis X
Scalaris X
Tokyo Cabinet
Voldemort X

* Neo4J HA coming “soon”

This is a very simplified view

Page 10: Not only SQL

Common factor

“...of the web...”

Of the who?!

Page 11: Not only SQL

Of the web

“...Django may be built for the Web, but CouchDB is built of the Web. I’ve never seen software that so completely embraces the philosophies behind HTTP. CouchDB makes Django look old-school in the same way that Django makes ASP look outdated”

- http://jacobian.org/writing/of-the-web/

Page 12: Not only SQL

Of the web

“...CouchDB may succeed, and it may fail; who knows. I’m sure of one thing, though: this is what the software of the future looks like”

- http://jacobian.org/writing/of-the-web/

Page 13: Not only SQL

So freakin’ what?!

All your webish skillz and tools apply...

proxies

load balancers

caches

HTTP client libs (ETag, If-Modified-Since, etc.)

language-, platform- and OS-neutral

MIME / Content-Type

Page 14: Not only SQL

These guys can just suck it

HTTP/REST is integration that works

Page 15: Not only SQL

ColumnFamily

Page 16: Not only SQL

Cassandra

Origins at Facebook
Apache project
Thrift API

Page 17: Not only SQL

World view

Keyspace, like a database
ColumnFamily, like a table; can have any number of columns
Column, named, stores binary data
SuperColumn, a column of columns
Row, identified by key, contains columns
No schema
Sparse
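One way to picture this world view is as nested maps: a column family maps row keys to rows, and each row maps column names to binary values. The sketch below uses plain Python dictionaries, not the Cassandra or Thrift API; the column family name, row keys and values are made up for illustration, and SuperColumns are left out.

```python
from collections import defaultdict

# column family -> row key -> {column name: binary value}
# (a plain in-memory stand-in for one keyspace; not a Cassandra client)
keyspace = defaultdict(lambda: defaultdict(dict))


def insert(column_family: str, row_key: str, column: str, value: bytes) -> None:
    """Store one named column in a row; rows need not share the same columns."""
    keyspace[column_family][row_key][column] = value


def get_row(column_family: str, row_key: str) -> dict:
    """Return all columns of a row, ordered by column name."""
    return dict(sorted(keyspace[column_family][row_key].items()))


# Hypothetical data: the second row has no "email" column at all (sparse, no schema).
insert("Users", "alice", "twitter", b"@alice")
insert("Users", "alice", "email", b"alice@example.org")
insert("Users", "bob", "twitter", b"@bob")

print(get_row("Users", "alice"))
print(get_row("Users", "bob"))
```

Nothing declares which columns a row may contain, which is what "no schema" and "sparse" mean in practice.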

Page 18: Not only SQL

The Ring

[Figure: twelve nodes, numbered 1 to 12, arranged in a ring]

Page 19: Not only SQL

Partitioning

Token
Snitching
Placement
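A rough sketch of how tokens and placement fit together on a ring: each node owns a token, a key is hashed to a token of its own, and replicas go to the first node at or after that token plus the nodes that follow it clockwise. This is a simplified illustration in plain Python, not Cassandra's implementation; the `Ring` class, the node names and the evenly spaced tokens are assumptions, and snitching (rack and datacenter awareness) is ignored entirely.

```python
import bisect
import hashlib

RING_SIZE = 2**127  # assumed token space for this sketch


class Ring:
    def __init__(self, node_tokens: dict):
        # Sort (token, node) pairs so the ring can be searched with bisect.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def token_for(key: str) -> int:
        """Hash a row key onto the ring (MD5-based, as an illustrative choice)."""
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % RING_SIZE

    def replicas(self, key: str, n: int = 3) -> list:
        """First node whose token is >= the key's token, then the next nodes clockwise."""
        start = bisect.bisect_left(self._tokens, self.token_for(key)) % len(self._ring)
        count = min(n, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


if __name__ == "__main__":
    # Four hypothetical nodes with evenly spaced tokens.
    ring = Ring({f"node{i + 1}": i * RING_SIZE // 4 for i in range(4)})
    print(ring.replicas("user:42"))
```

Adding a node only changes ownership for the token range next to it, which is why this layout scales out without reshuffling all data.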

Page 20: Not only SQL

This slide intentionally left blank

Page 21: Not only SQL

Document Store

Relax

Page 22: Not only SQL

CouchDB

Document oriented database
Kick ass replication
HTTP/JSON API
Map/reduce view (index) definitions

Page 23: Not only SQL

World view

One document == JSON
One document == One record
Many documents == One database
Many databases == One instance
No schema

Page 24: Not only SQL

World view

Documents can
have attachments (binary + MIME type)
be rendered differently (HTML, XML)

Page 25: Not only SQL

A document

{
  "_id": "b098445d587b1f347e48e1a79301de02",
  "_rev": "1-80bfd8302e0f08eec2396c8107cafc19",
  "platform": {
    "browser": "mozilla",
    "version": "1.9.1.8"
  },
  "timestamp": 1270131033337
}

"_id": Key, either you choose it or CouchDB does it for you
"_rev": Revision number
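Since CouchDB speaks HTTP and JSON, storing a document like this is an HTTP PUT to the database URL followed by the document key. The sketch below only builds the request with Python's standard library and does not send it; the helper name `build_put_request`, the database name `stats` and the local server address are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_put_request(base_url: str, database: str, doc: dict) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT that stores `doc` under its own _id."""
    doc_id = urllib.parse.quote(doc["_id"], safe="")
    # The key travels in the URL; everything else, including _rev when
    # updating an existing document, travels in the JSON body.
    body = {name: value for name, value in doc.items() if name != "_id"}
    return urllib.request.Request(
        url=f"{base_url}/{database}/{doc_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    request = build_put_request(
        "http://localhost:5984",  # assumed local CouchDB instance
        "stats",                  # hypothetical database name
        {
            "_id": "b098445d587b1f347e48e1a79301de02",
            "platform": {"browser": "mozilla", "version": "1.9.1.8"},
            "timestamp": 1270131033337,
        },
    )
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running server; any HTTP client in any language could do the same, which is the point of the "of the web" argument.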

Page 26: Not only SQL

Views

Filter
Collate
Aggregate

Page 27: Not only SQL

Views

{
  "_id": "b098445d587b1f347e48e1a79301de02",
  "_rev": "1-80bfd8302e0f08eec2396c8107cafc19",
  "platform": {
    "browser": "mozilla",
    "version": "1.9.1.8"
  },
  "timestamp": 1270131033337
}

+

function(doc) {
  emit(doc.platform.browser, doc.platform.version);
}

=

{
  "total_rows": 1,
  "offset": 0,
  "rows": [
    {
      "id": "b098445d587b1f347e48e1a79301de02",
      "key": "mozilla",
      "value": "1.9.1.8"
    }
  ]
}
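To make the "document + map function = view result" equation concrete, here is a small Python simulation of what the view engine does with the map function on this slide. It is a sketch of the idea only: CouchDB itself runs JavaScript map functions and uses its own key collation, and the names `run_view` and `map_browser_version` are invented for this example.

```python
def map_browser_version(doc, emit):
    """Python counterpart of the JavaScript map function on the slide."""
    emit(doc["platform"]["browser"], doc["platform"]["version"])


def run_view(docs, map_fn):
    """Apply map_fn to every document and return rows ordered by key, then id."""
    rows = []
    for doc in docs:
        # Bind the current document id so each emitted row records its source.
        def emit(key, value, _id=doc["_id"]):
            rows.append({"id": _id, "key": key, "value": value})

        map_fn(doc, emit)
    # Simplified ordering: plain Python comparison instead of CouchDB collation.
    rows.sort(key=lambda row: (row["key"], row["id"]))
    return {"total_rows": len(rows), "offset": 0, "rows": rows}


if __name__ == "__main__":
    document = {
        "_id": "b098445d587b1f347e48e1a79301de02",
        "_rev": "1-80bfd8302e0f08eec2396c8107cafc19",
        "platform": {"browser": "mozilla", "version": "1.9.1.8"},
        "timestamp": 1270131033337,
    }
    print(run_view([document], map_browser_version))
```

Because rows come back sorted by key, a view doubles as an index: looking up all documents for one browser is a range read over adjacent rows.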

Page 28: Not only SQL

Views

Views are
stored as an accessible web resource
on disk
and incrementally updated
as well as replicated with the database (as view definitions in design documents)

Page 29: Not only SQL

Replication

Peer to peer
Online/Offline
Conflict detection and resolution
Any number of nodes

Page 30: Not only SQL

CouchDB - Takeaways

Kick ass replication
Views are fast
Can host and serve complete webapps

Page 31: Not only SQL

This slide intentionally left blank

Page 32: Not only SQL

Graphs

How to persist a network?

Page 33: Not only SQL

Neo4J

Graph database
Embedded
Java (and other languages) API
REST API
Traversal and indexes

Page 34: Not only SQL

World view

Nodes and relationships
Transactional (!)
No schema

Page 35: Not only SQL

Traversing

1. Start at a node
2. Walk the graph
3. Filter nodes
4. Decide when to stop
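The four steps above can be sketched as a breadth-first walk with two pluggable decisions: which nodes to include in the result and where to stop expanding. This is plain Python over an adjacency map, not the Neo4j API; the `traverse` function, its parameters and the sample graph are assumptions made for illustration.

```python
from collections import deque


def traverse(graph, start, include=lambda node, depth: True, stop=lambda node, depth: False):
    """Breadth-first walk: start at a node, follow relationships, filter, and prune."""
    visited = {start}
    queue = deque([(start, 0)])
    result = []
    while queue:
        node, depth = queue.popleft()
        if include(node, depth):      # step 3: filter nodes
            result.append(node)
        if stop(node, depth):         # step 4: decide when to stop
            continue                  # do not expand beyond this node
        for neighbour in graph.get(node, []):  # step 2: walk the graph
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, depth + 1))
    return result


# Hypothetical "knows" relationships.
friends = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
}

if __name__ == "__main__":
    # Friends and friends of friends: skip the start node, stop expanding at depth 2.
    print(traverse(friends, "alice",
                   include=lambda node, depth: depth > 0,
                   stop=lambda node, depth: depth >= 2))
```

Keeping the filter and the stop condition as separate functions lets the same walk answer different questions, such as "direct friends only" or "everyone reachable".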

Page 36: Not only SQL

Indexing

Indexing done at insert, or as a batch
Lucene-based indexer for full-text search

Page 37: Not only SQL

Replication

Neo4J HA coming soon, P2P replication
Master-slave replication

Page 38: Not only SQL

Outro

Test one or more NoSQL thingys
Get familiar with Brewer’s CAP theorem
Get familiar with the Dynamo paper

Page 40: Not only SQL

Slides/code

Slides: http://www.slideshare.net/protocol7, CC-BY-SA

Code: http://github.com/protocol7