Page 1: NoSQL Smackdown!

Tim Berglund

NoSQL SMACKDOWN

1

Page 2: NoSQL Smackdown!

@tlberglund

#nosql2

Page 3: NoSQL Smackdown!

Voldemort

3

Page 4: NoSQL Smackdown!

via negativa

4

Page 5: NoSQL Smackdown!

SQL

5

Page 6: NoSQL Smackdown!

tuple (n.)

relation (n.) An unordered set of tuples of the same type.

6

Page 7: NoSQL Smackdown!

tuple (n.) A function that maps attributes to values.

7

Page 8: NoSQL Smackdown!

tuple (n.) (A bundle of key-value pairs, but don’t tell anyone!)

8

Page 9: NoSQL Smackdown!

id  username    pwd_hash  born_at   monkey
1   mluther     d8c82af9  Nov 1483  FALSE
2   aaugustine  329b8dae  Nov 354   FALSE
3   gnyssa      e50ec9e0  Jun 335   FALSE
4   bonzo       330e01f2  Apr 2007  TRUE

9
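The definitions above (a tuple maps attributes to values; a relation is an unordered set of tuples) can be sketched in a few lines of Python. This is only an illustration of the idea, not how any database stores rows; `make_tuple` and `select` are hypothetical helper names.

```python
def make_tuple(**attrs):
    """A tuple as a bundle of (attribute, value) pairs.

    A frozenset is hashable and ignores attribute order, so tuples
    can be collected into an unordered set (the relation).
    """
    return frozenset(attrs.items())


# The relation from the slide: an unordered set of tuples of the same type.
users = {
    make_tuple(id=1, username="mluther", pwd_hash="d8c82af9", born_at="Nov 1483", monkey=False),
    make_tuple(id=2, username="aaugustine", pwd_hash="329b8dae", born_at="Nov 354", monkey=False),
    make_tuple(id=3, username="gnyssa", pwd_hash="e50ec9e0", born_at="Jun 335", monkey=False),
    make_tuple(id=4, username="bonzo", pwd_hash="330e01f2", born_at="Apr 2007", monkey=True),
}


def select(relation, predicate):
    """Return the subset of tuples for which predicate(row_as_dict) is true."""
    return {t for t in relation if predicate(dict(t))}


monkeys = select(users, lambda row: row["monkey"])
monkey_names = sorted(dict(t)["username"] for t in monkeys)
```

Because each tuple is just a set of key-value pairs, attribute order does not matter, and neither does row order.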

Page 10: NoSQL Smackdown!

Relations

10

Page 11: NoSQL Smackdown!

Comparing “NoSQL” to “relational” is a bit of a shell game.

- Eben Hewitt, author of Cassandra: The Definitive Guide

11

Page 12: NoSQL Smackdown!

Transactions

12

Page 13: NoSQL Smackdown!

CAP Theorem

[Triangle diagram with corners labeled C, A, and P]

13

Page 14: NoSQL Smackdown!

Tradeoff Between

Consistency

Availability

Partition Tolerance

14

Page 15: NoSQL Smackdown!

Between what?

15

Page 16: NoSQL Smackdown!

consistency (n.) All clients always have the same view of the data.

16

Page 17: NoSQL Smackdown!

availability (n.) All clients can always read or write within some maximum latency.

17

Page 18: NoSQL Smackdown!

partition tolerance (n.) No set of failures less than total network failure is allowed to cause the system to respond incorrectly.

18
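A toy model can make the three definitions above more concrete. The sketch below assumes a two-replica cluster and a boolean `partitioned` flag standing in for a network failure; `TinyCluster`, `Replica`, and the "CP"/"AP" mode labels are invented for this illustration and do not describe any real product.

```python
class Replica:
    def __init__(self):
        self.data = {}


class TinyCluster:
    """Two replicas. In "CP" mode writes are refused during a partition;
    in "AP" mode they are accepted locally and the replicas may diverge."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.left, self.right = Replica(), Replica()
        self.partitioned = False

    def write(self, replica, key, value):
        if not self.partitioned:
            # Healthy network: both replicas are updated together.
            self.left.data[key] = value
            self.right.data[key] = value
        elif self.mode == "AP":
            # Stay available: only the reachable replica sees the write.
            replica.data[key] = value
        else:
            # Stay consistent: give up availability instead.
            raise RuntimeError("write refused: replicas cannot be kept consistent")


ap = TinyCluster("AP")
ap.write(ap.left, "x", 1)
ap.partitioned = True
ap.write(ap.left, "x", 2)
ap_views = (ap.left.data["x"], ap.right.data["x"])  # the two clients' views differ

cp = TinyCluster("CP")
cp.write(cp.left, "x", 1)
cp.partitioned = True
try:
    cp.write(cp.left, "x", 2)
    cp_refused = False
except RuntimeError:
    cp_refused = True
```

During the partition the "AP" cluster keeps answering but its replicas disagree, while the "CP" cluster keeps one view of the data by rejecting the write.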

Page 19: NoSQL Smackdown!

[Diagram: four cluster nodes]

19

Page 20: NoSQL Smackdown!

[Diagram: four cluster nodes connected to a switch]

20

Page 21: NoSQL Smackdown!

[Diagram: four cluster nodes connected to a switch]

21

Page 22: NoSQL Smackdown!

CAP Theorem

[Triangle diagram with corners labeled C, A, and P]

22

Page 23: NoSQL Smackdown!

Strongly Consistent

[Triangle diagram with corners labeled C, A, and P]

MongoDB, Cassandra

23

Page 24: NoSQL Smackdown!

Always Available

[Triangle diagram with corners labeled C, A, and P]

CouchDB, Riak, Voldemort, Cassandra

24

Page 25: NoSQL Smackdown!

Partition Intolerant

[Triangle diagram with corners labeled C, A, and P]

MySQL, Oracle, SQL Server, Neo4J, Redis

25

Page 26: NoSQL Smackdown!

[Triangle diagram with corners labeled C, A, and P]

26

Page 27: NoSQL Smackdown!

via negativa

a way forward

27

Page 28: NoSQL Smackdown!

NoSQL is a set of different approaches to storing and retrieving data.

28

Page 29: NoSQL Smackdown!

What’s Different?

Data models

Querying

Approaches to scale

29

Page 30: NoSQL Smackdown!

Tradeoffs

Complex transactions vs. scalability

Consistency vs. availability (often)

Performance vs. durability

Horizontal vs. vertical scale

Cheap writes vs. cheap reads

30

Page 31: NoSQL Smackdown!

Origin

License

Implementation language

Data model

How does it scale?

API/Query language

Deployments

Support and community

31

Page 32: NoSQL Smackdown!

32

Page 33: NoSQL Smackdown!

Voldemort

33

Page 34: NoSQL Smackdown!

34

Page 35: NoSQL Smackdown!

Origin - Facebook Inbox search, back in 2007

License - Apache License 2.0

35

Page 36: NoSQL Smackdown!

Implementation Language - Java 6

Data Model - It’s a BigTable-based “column store.”

36

Page 37: NoSQL Smackdown!

Column

Name, Value, Timestamp

37

Page 39: NoSQL Smackdown!

Row

Key: Column, Column, Column, Column, Column

39

Page 40: NoSQL Smackdown!

Column Family

Key: Column, Column, Column

Key: Column, Column

Key: Column

40

Page 41: NoSQL Smackdown!

“Contacts” Column Family

050fe74e2: full_name, email, mobile

bbf77f01d: full_name, email

8b20d8f6: full_name

41

Page 42: NoSQL Smackdown!

SuperColumn

Name

key: Column

key: Column

key: Column

42

Page 43: NoSQL Smackdown!

Contact Info SuperColumn

4145bfaf15f10c2e6033f8b9c3143297a36f5fe3

key          value              timestamp
full_name    Tim Berglund       20101011T120502Z
email        [email protected]
mobile       [redacted]         19940217T145637Z
postal_code  80123              20101011T120452Z

43

Page 44: NoSQL Smackdown!

SuperColumn Family

[Diagram: three row keys, each mapping to one or more SuperColumns]

44

Page 45: NoSQL Smackdown!

Keyspace

SuperColumn Family

SuperColumn Family

SuperColumn Family

Column Family

Column Family

45

Page 46: NoSQL Smackdown!

A what?

46

Page 47: NoSQL Smackdown!

Nested Hash Table

Cluster.Keyspace.ColumnFamily[key] = <row>

Cluster.Keyspace.ColumnFamily[key1][key2] = <column>

...SuperColumnFamily[key] = <map of rows>

...SuperColumnFamily[key1][key2] = <row>

...SuperColumnFamily[key1][key2][key3] = <column>

47
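The nested-hash-table view can be sketched with plain Python dictionaries. This is a mental model only, not a client API: the `column` helper, the "ContactsByGroup" super column family, and the "friends" key are hypothetical, and the pairing of row keys with values borrowed from the earlier slides is illustrative.

```python
def column(name, value, timestamp):
    """A column is a (name, value, timestamp) triple."""
    return {"name": name, "value": value, "timestamp": timestamp}


cluster = {
    "Keyspace": {
        # Column family: row key -> column name -> column
        "Contacts": {
            "050fe74e2": {
                "full_name": column("full_name", "Tim Berglund", "20101011T120502Z"),
            },
        },
        # Super column family: row key -> super column name -> column name -> column
        "ContactsByGroup": {
            "friends": {
                "050fe74e2": {
                    "postal_code": column("postal_code", "80123", "20101011T120452Z"),
                },
            },
        },
    },
}

keyspace = cluster["Keyspace"]

row = keyspace["Contacts"]["050fe74e2"]                      # ColumnFamily[key] = <row>
name_col = keyspace["Contacts"]["050fe74e2"]["full_name"]    # ColumnFamily[key1][key2] = <column>

rows = keyspace["ContactsByGroup"]["friends"]                # SuperColumnFamily[key] = <map of rows>
sub_row = rows["050fe74e2"]                                  # ...[key1][key2] = <row>
zip_col = sub_row["postal_code"]                             # ...[key1][key2][key3] = <column>
```

Each extra pair of brackets is one more level of dictionary lookup, which is all the "super" prefix adds in this model.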

Page 48: NoSQL Smackdown!

Scalability - Rock star! (see Amazon Dynamo)

[Ring diagram: token positions 0000, 2000, 4000, 6000, 8000, A000, C000, E000]

48

Page 49: NoSQL Smackdown!

[Ring diagram: token positions 0000, 2000, 4000, 6000, 8000, A000, C000, E000]

49

Page 50: NoSQL Smackdown!

Scalability

- Consistent hashing

- No distinguished nodes

- Add and remove nodes on a live cluster

50
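A minimal sketch of consistent hashing may help here. It assumes one token per node, derived from an MD5 hash of the node name, and assigns each key to the first token at or after the key's hash, wrapping around the ring. `HashRing` and the node names are hypothetical; a production system would use its own partitioner, replication, and usually many tokens per node.

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, nodes=()):
        self._tokens = []   # sorted token positions on the ring
        self._owner = {}    # token -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(text):
        # 128-bit position on the ring
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        token = self._hash(node)
        bisect.insort(self._tokens, token)
        self._owner[token] = node

    def remove(self, node):
        token = self._hash(node)
        self._tokens.remove(token)
        del self._owner[token]

    def lookup(self, key):
        """Return the node owning `key`: first token clockwise from its hash."""
        if not self._tokens:
            raise LookupError("ring is empty")
        i = bisect.bisect_left(self._tokens, self._hash(key))
        return self._owner[self._tokens[i % len(self._tokens)]]  # wrap around


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(200)]
before = {k: ring.lookup(k) for k in keys}

ring.add("node-d")  # join a live cluster
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to the new node and all other keys stay where they were, which is why nodes can be added or removed without reshuffling the whole data set.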

Page 51: NoSQL Smackdown!

API

- Thrift RPC

- Easy to fetch columns by key

- Hadoop integration

- Native clients

51

Page 52: NoSQL Smackdown!

Deployments-

52

Page 54: NoSQL Smackdown!

Voldemort

54

Page 55: NoSQL Smackdown!

55

Page 56: NoSQL Smackdown!

Origin - Founders of DoubleClick were totally going to take over the Cloud

License - Database: GNU Affero GPL 3.0; Drivers: Apache License 2.0

56

Page 57: NoSQL Smackdown!

Implementation Language - C++

Data Model - JSON document database (this is so simple!)

57

Page 58: NoSQL Smackdown!

{
  "_id" : ObjectId("4cbd00455280f73d395922a4"),
  "contact" : {
    "tags" : ["man", "", "", ""],
    "firstName" : "Myron",
    "lastName" : "Dalton",
    "address1" : "4322 Maple Street",
    "city" : "Santa Ana",
    "state" : "CA",
    "postalCode" : "92705",
    "email" : "[email protected]"
  },
  "occupation" : "Long haul truck driver"
}

58

Page 59: NoSQL Smackdown!

Does it scale?

Well...it shards!

59

Page 60: NoSQL Smackdown!

API

- Native JavaScript console

- Binary drivers

- Ad-hoc query language (but it’s NOT SQL, okay?)

60

Page 61: NoSQL Smackdown!

db.address.find().limit(5)

db.contact.find({ "lastName": "Berglund" })

db.address.find({ $query: { "stateProvince": "CO" }, $orderBy: { "city": 1 } })

db.address.find({ "contact.city": "Chicago" })

db.address.remove({ _id: ObjectId("4cbcfd7df72291161b1d1bf2") })

61
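To show what a query such as `{ "contact.city": "Chicago" }` means, here is a toy in-memory matcher in Python. It only handles equality on dotted paths and is not how MongoDB evaluates queries; `get_path` and `find` are hypothetical helpers, and the second document is invented for the example.

```python
def get_path(doc, dotted):
    """Follow a dotted path such as "contact.city" through nested dicts."""
    current = doc
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(collection, query=None, limit=None):
    """Return documents whose fields equal every value in `query`."""
    query = query or {}
    hits = [d for d in collection
            if all(get_path(d, path) == wanted for path, wanted in query.items())]
    return hits if limit is None else hits[:limit]


addresses = [
    {"contact": {"firstName": "Myron", "lastName": "Dalton",
                 "city": "Santa Ana", "state": "CA"},
     "occupation": "Long haul truck driver"},
    # Invented document, present only so the query has something to filter out.
    {"contact": {"firstName": "Pat", "lastName": "Example",
                 "city": "Chicago", "state": "IL"},
     "occupation": "Hypothetical"},
]

in_chicago = find(addresses, {"contact.city": "Chicago"})
first_one = find(addresses, limit=1)
```

The dotted key reaches into the embedded `contact` document, which is the part a flat key/value store could not express.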

Page 62: NoSQL Smackdown!

API

- Can write MapReduce jobs in JavaScript

- Morphia for Java

- Mongoose for node.js

62
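For readers new to MapReduce, the shape of a job can be sketched in Python. This single-process version only shows the map, group, and reduce steps; it is not the database's JavaScript API, and `map_reduce`, `mapper`, `reducer`, and the sample records are hypothetical.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over every record, group emitted values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def mapper(doc):
    # Emit one (city, 1) pair per document.
    yield doc["city"], 1


def reducer(key, values):
    return sum(values)


# Illustrative records only.
contacts = [{"city": "Santa Ana"}, {"city": "Chicago"}, {"city": "Chicago"}]
per_city = map_reduce(contacts, mapper, reducer)
```

In a sharded deployment the map step can run next to the data on each node, with only the grouped values travelling over the network.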

Page 63: NoSQL Smackdown!

Deployments-

63

Page 65: NoSQL Smackdown!

Concerns

- Write durability?

- Sharding performance

- But everyone still wants to date her

Journaling coming in 1.8!

65

Page 66: NoSQL Smackdown!

Voldemort

66

Page 67: NoSQL Smackdown!

67

Page 68: NoSQL Smackdown!

Origin

- Neo Technology in 2003

- Malmö and San Francisco

68

Page 69: NoSQL Smackdown!

License

- GPL3, full-featured

- Commercial

$49/mo antiviral

$499/mo advanced

$1,999/mo enterprise

69

Page 70: NoSQL Smackdown!

Maturity

- Production since 2003

- 1.0 in Feb 2010

Implementation Language

- Java 6

- Easily embeddable!

70

Page 71: NoSQL Smackdown!

Data Metaphor

- Graph

- Nodes, relationships

71

Page 72: NoSQL Smackdown!

[Graph diagram. Nodes: Tim, Matthew, Brian, Hollywood Types. Relationships: Knows, Writes with, Works with, Speaks with, Engages in disputation with]

All nodes and relationships have arbitrary properties

72
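A property graph of this kind can be sketched in Python as follows. This is an in-memory illustration of the data metaphor, not Neo4j's traversal API; the `Graph` class, its method names, and the sample nodes and properties are all hypothetical.

```python
class Graph:
    def __init__(self):
        self.nodes = {}          # node id -> arbitrary properties
        self.relationships = []  # (start, type, end, arbitrary properties)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def relate(self, start, rel_type, end, **props):
        self.relationships.append((start, rel_type, end, props))

    def neighbors(self, node_id, rel_type):
        return [end for start, kind, end, _ in self.relationships
                if start == node_id and kind == rel_type]

    def traverse(self, start, rel_type, depth):
        """Breadth-first: ids reachable from `start` within `depth` hops of rel_type."""
        seen, frontier = {start}, [start]
        for _ in range(depth):
            next_frontier = []
            for node in frontier:
                for neighbor in self.neighbors(node, rel_type):
                    if neighbor not in seen:
                        seen.add(neighbor)
                        next_frontier.append(neighbor)
            frontier = next_frontier
        return seen - {start}


g = Graph()
g.add_node("tim", occupation="Developer")       # illustrative properties
g.add_node("matthew")
g.add_node("brian")
g.relate("tim", "KNOWS", "matthew", since=2008)  # relationships carry properties too
g.relate("matthew", "KNOWS", "brian")

friends = g.traverse("tim", "KNOWS", depth=1)
friends_of_friends = g.traverse("tim", "KNOWS", depth=2)
```

A "friends of friends" question is a two-hop traversal here, where a relational schema would need a self-join per hop.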

Page 73: NoSQL Smackdown!

Query Model

- REST/JSON

- Java traversal API

- JTA/JTS XA

- Bindings in Clojure, Ruby, Python, PHP, Scala, Grails

73

Page 74: NoSQL Smackdown!

Scale Idiom

- Traditionally focused on single-node performance

- Recent HA support

- Master/slave

- ZK master election

- Writeable slaves

74

Page 75: NoSQL Smackdown!

Support

- Neo Technology

Deployments

- Box.net

- ThoughtWorks

75

Page 76: NoSQL Smackdown!

Voldemort

76

Page 77: NoSQL Smackdown!

77

Page 78: NoSQL Smackdown!

Origin - Internal datastore for Basho’s Salesforce.com apps

(Hey, it seemed like a good idea at the time!)

78

Page 79: NoSQL Smackdown!

License

Apache License 2.0 for OSS version

Closed-source “Enterprise DS” version

79

Page 80: NoSQL Smackdown!

Implementation Language - Erlang, C, SpiderMonkey JavaScript VM

Data Model - Key/value store, but with buckets!

80

Page 81: NoSQL Smackdown!

Key -> Value

That’s it.

81

Page 82: NoSQL Smackdown!

Bucket A

Key -> Value
Key -> Value
Key -> Value
Key -> Value

Bucket B

Key -> Value
Key -> Value
Key -> Value
Key -> Value

82

Page 83: NoSQL Smackdown!

Bucket A

name: Tim
occupation: Developer
birthday: 061972
city: Littleton

Bucket B

name: Aurelius
occupation: Bishop
birthday: 110354
city: Hippo

83

Page 84: NoSQL Smackdown!

Does it scale?

- Like a boss!

- No distinguished node

- Tunable consistency, replication

- Add nodes without taking the cluster down

84
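"Tunable consistency" in Dynamo-style stores is usually described with three numbers: N replicas per key, R replicas that must answer a read, and W replicas that must acknowledge a write. The slide does not spell these out, so treat the names as an assumption. The sketch below checks by brute force that a read is guaranteed to touch at least one replica holding the latest write exactly when R + W > N; `quorums_overlap` is a hypothetical helper.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """True if every set of r replicas shares a member with every set of w replicas."""
    replicas = range(n)
    return all(set(read_set) & set(write_set)
               for read_set in combinations(replicas, r)
               for write_set in combinations(replicas, w))


strict = quorums_overlap(n=3, r=2, w=2)  # reads always see the latest acknowledged write
fast = quorums_overlap(n=3, r=1, w=1)    # lower latency, but reads may be stale
```

Lowering R or W trades consistency for latency and availability, and raising them trades the other way.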

Page 85: NoSQL Smackdown!

API

- HTTP interface (slow, but featureful)

- Protocol Buffers (a performance beast)

85

Page 86: NoSQL Smackdown!

API

- Key CRUD

- MapReduce in JavaScript

- Graph traversals translate to MapReduce

86

Page 87: NoSQL Smackdown!

Deployments-

87

Page 89: NoSQL Smackdown!

Voldemort

89

Page 90: NoSQL Smackdown!

90

Page 91: NoSQL Smackdown!

Origin - Salvatore Sanfilippo wrote it for his analytics site, lloogg.com

License - Open Source-brand open source

91

Page 92: NoSQL Smackdown!

Implementation Language

- ANSI C, baby

- Wants a POSIX OS

- 340kB download!

92

Page 93: NoSQL Smackdown!

Data Model

-Key/value store++

-Strings

-Hashes, Sets

-Lists

-Sorted Sets

93

Page 94: NoSQL Smackdown!

Does it scale?

- Vertically, sure

- Plus it’s really fast

- Master/slave options

- Technically a CA system

94

Page 95: NoSQL Smackdown!

API

- Binary socket interface

- Commands look like assembly language

- Drivers for 22+ languages

95
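To give a feel for the terse, opcode-like commands, here is a sketch that encodes a command in the array-of-bulk-strings request format described in the Redis protocol documentation. It only builds the bytes and does not open a socket; `encode_command` is a hypothetical helper, and the key and value are taken from the earlier example data.

```python
def encode_command(*args):
    """Encode a command as an array of length-prefixed bulk strings."""
    parts = [b"*%d\r\n" % len(args)]                       # number of arguments
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))  # byte length, then payload
    return b"".join(parts)


set_city = encode_command("SET", "city", "Littleton")
ping = encode_command("PING")
```

Because every argument carries its byte length, values can contain any bytes, which is what makes the interface binary-safe.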

Page 96: NoSQL Smackdown!

96

Page 97: NoSQL Smackdown!

97

Page 98: NoSQL Smackdown!

98

Page 99: NoSQL Smackdown!

99

Page 100: NoSQL Smackdown!

Deployments-

craigslist

100

Page 101: NoSQL Smackdown!

Community/Support

- Officially sponsored by VMware

101

Page 102: NoSQL Smackdown!

Voldemort

102

Page 103: NoSQL Smackdown!

Do you need this?

Maybe.

103

Page 104: NoSQL Smackdown!

104

Page 106: NoSQL Smackdown!

Further Reading

Brewer’s Conjecture: http://www.podc.org/podc2000/

Proof of Brewer’s Conjecture (the “CAP Theorem”): http://bit.ly/cap-theorem-proof

Amazon Dynamo: http://bit.ly/amazon-dynamo and http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html

Google BigTable: http://bit.ly/big-table

The CAP Theorem Explained: http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

Visualizing NoSQL Databases on the CAP Venn Diagram: http://blog.nahurst.com/visual-guide-to-nosql-systems

Redis: http://redis.io/

Cassandra: http://cassandra.apache.org

MongoDB: http://mongodb.org

106

Page 107: NoSQL Smackdown!

Further Reading

CouchDB: http://couchdb.apache.org

Riak: http://basho.com

Voldemort: http://project-voldemort.com

Neo4J: http://neo4j.org

Pretty Much Everything About NoSQL: http://nosql.mypopescu.com

107

Page 108: NoSQL Smackdown!

Photo Credits

Wrestlers: http://www.flickr.com/photos/stigster/4573851095

Desert Road: http://www.flickr.com/photos/kenlund/2439199670

Kindergarten Graduation: http://www.flickr.com/photos/moyermk/3102262394

Clipboard: http://www.flickr.com/photos/wheatfields/264890076

Winning Wrestler: http://www.flickr.com/photos/jrandallc/2259174414

108