69
NoSQL Infrastructure

NoSQL Infrastructure

Embed Size (px)

DESCRIPTION

NoSQL databases are often touted for their performance and whilst it's true that they usually offer great performance out of the box, it still really depends on how you deploy your infrastructure. Dedicated vs cloud? In memory vs on disk? Spindal vs SSD? Replication lag. Multi data centre deployment. This talk considers all the infrastructure requirements of a successful high performance infrastructure with hints and tips that can be applied to any NoSQL technology. It includes things like OS tweaks, disk benchmarks, replication, monitoring and backups. Presented at NoSQL Roadshow Berlin 2013 by David Mytton.

Citation preview

Page 1: NoSQL Infrastructure

NoSQL Infrastructure

Page 2: NoSQL Infrastructure

~3333 write ops/s 0.07 - 0.05 ms response

Page 3: NoSQL Infrastructure

David Mytton

Woop Japan!

Page 4: NoSQL Infrastructure
Page 5: NoSQL Infrastructure

MongoDB at Server Density

Page 6: NoSQL Infrastructure

•27 nodes

MongoDB at Server Density

Page 7: NoSQL Infrastructure

• June 2009 - 4yrs

•27 nodes

MongoDB at Server Density

Page 8: NoSQL Infrastructure

•MySQL -> MongoDB

•27 nodes

MongoDB at Server Density

• June 2009 - 4yrs

Page 9: NoSQL Infrastructure

•MySQL -> MongoDB

•20TB data per month

•27 nodes

MongoDB at Server Density

• June 2009 - 4yrs

Page 10: NoSQL Infrastructure

Why?

Page 11: NoSQL Infrastructure

Why?

• Replication

Page 12: NoSQL Infrastructure

Why?

• Replication

• Official drivers

Page 13: NoSQL Infrastructure

Why?

• Replication

• Official drivers

• Easy deployment

Page 14: NoSQL Infrastructure

Why?

• Replication

• Official drivers

• Easy deployment

• Fast out of the box

Page 15: NoSQL Infrastructure

It’s a little different.

Page 16: NoSQL Infrastructure

Picture is unrelated! Mmm, ice cream.

• Fast network

Performance

Page 17: NoSQL Infrastructure

• Fast network

Performance

EC2 10 Gigabit Ethernet- Cluster Compute- High Memory Cluster- Cluster GPU- High I/O- High Storage

Page 18: NoSQL Infrastructure

• Fast network

Performance

Workload: Read/Write?

Result set size

What is being stored?

Page 19: NoSQL Infrastructure

• Fast network

Performance

Use Network Throughput

Normal 0-100Mb/s

Replication (Initial Sync) Burst +100Mb/s

Replication (Oplog) 0-100Mb/s

Backup Initial Sync + Oplog

Page 20: NoSQL Infrastructure

• Fast network

Performance

Inter-DC LAN

Page 21: NoSQL Infrastructure

• Fast network

Performance

Inter-DC LAN

Cross USA Washington, DC - San Jose, CA

Page 22: NoSQL Infrastructure

• Fast network

Performance

Location Ping RTT Latency

Within USA 40-80ms

Trans-Atlantic 100ms

Trans-Pacific 150ms

Europe - Japan 300ms

Page 23: NoSQL Infrastructure

Failover

•Replica sets

Page 24: NoSQL Infrastructure

Failover

•Replica sets

•Master/slave

Page 25: NoSQL Infrastructure

Failover

•Replica sets

•Min 3 nodes

•Master/slave

Page 26: NoSQL Infrastructure

Failover

•Replica sets

•Min 3 nodes

•Master/slave

•Automatic failover

Page 27: NoSQL Infrastructure

• Replication lag

Performance

Location Ping RTT Latency

Within USA 40-80ms

Trans-Atlantic 100ms

Trans-Pacific 150ms

Europe - Japan 300ms

Page 28: NoSQL Infrastructure

Replication Lag

1. Reads: eventual consistency

Page 29: NoSQL Infrastructure

Replication Lag

1. Reads: eventual consistency

2. Failover: slave behind

Page 30: NoSQL Infrastructure

Eventual Consistency

Stale data

Page 31: NoSQL Infrastructure

Eventual Consistency

Stale data

Inconsistent data

Page 32: NoSQL Infrastructure

Eventual Consistency

Stale data

Inconsistent data

Changing data

Page 33: NoSQL Infrastructure

Eventual Consistency

Use Case Needs consistency?

Graphs No

User profile Yes

Statistics Depends

Alert config Yes

Page 34: NoSQL Infrastructure

Replication Lag

1. Reads: eventual consistency

2. Failover: slave behind

Page 35: NoSQL Infrastructure

Slave behind

Failover: out of date master

Page 36: NoSQL Infrastructure

MongoDB WriteConcern

Changed Nov 27 2012

Page 37: NoSQL Infrastructure

• Safe by default

>>> from pymongo import MongoClient>>> connection = MongoClient(w=int/str)

Value Meaning

0 Unsafe

1 Primary

2 Primary + x1 secondary

3 Primary + x2 secondaries

MongoDB WriteConcern

Page 38: NoSQL Infrastructure
Page 39: NoSQL Infrastructure

Tags

{ _id : "someSet", members : [ {_id : 0, host : "A", tags : {"dc": "ny"}}, {_id : 1, host : "B", tags : {"dc": "ny"}}, {_id : 2, host : "C", tags : {"dc": "sf"}}, {_id : 3, host : "D", tags : {"dc": "sf"}}, {_id : 4, host : "E", tags : {"dc": "cloud"}} ] settings : { getLastErrorModes : { veryImportant : {"dc" : 3}, sortOfImportant : {"dc" : 2} } }}> db.foo.insert({x:1})> db.runCommand({getLastError : 1, w : "veryImportant"})

Page 40: NoSQL Infrastructure

{ _id : "someSet", members : [ {_id : 0, host : "A", tags : {"dc": "ny"}}, {_id : 1, host : "B", tags : {"dc": "ny"}}, {_id : 2, host : "C", tags : {"dc": "sf"}}, {_id : 3, host : "D", tags : {"dc": "sf"}}, {_id : 4, host : "E", tags : {"dc": "cloud"}} ] settings : { getLastErrorModes : { veryImportant : {"dc" : 3}, sortOfImportant : {"dc" : 2} } }}> db.foo.insert({x:1})> db.runCommand({getLastError : 1, w : "veryImportant"})

(A or B) + (C or D) + E

Tags

Page 41: NoSQL Infrastructure

{ _id : "someSet", members : [ {_id : 0, host : "A", tags : {"dc": "ny"}}, {_id : 1, host : "B", tags : {"dc": "ny"}}, {_id : 2, host : "C", tags : {"dc": "sf"}}, {_id : 3, host : "D", tags : {"dc": "sf"}}, {_id : 4, host : "E", tags : {"dc": "cloud"}} ] settings : { getLastErrorModes : { veryImportant : {"dc" : 3}, sortOfImportant : {"dc" : 2} } }}> db.foo.insert({x:1})> db.runCommand({getLastError : 1, w : "sortOfImportant"})

(A + C) or (D + E) ...

Tags

Page 42: NoSQL Infrastructure

Picture is unrelated! Mmm, ice cream.

• Fast network

•More RAM

Performance

Page 46: NoSQL Infrastructure

More RAM = expensive Performance

x2 4GB RAM 12 month Prices

Page 47: NoSQL Infrastructure

RAM

SSDs

Spinning disk

Cost Speed

Page 48: NoSQL Infrastructure

Softlayer disk pricing Performance

Page 49: NoSQL Infrastructure

EC2 disk/RAM pricing Performance

$2232/m

$2520/m

$43/m

$295/m

Page 50: NoSQL Infrastructure

SSD vs Spinning Performance

Page 51: NoSQL Infrastructure

SSD vs Spinning Performance

Page 52: NoSQL Infrastructure

SSD vs Spinning Performance

Page 53: NoSQL Infrastructure

Tips: rand()

•Field names

Page 54: NoSQL Infrastructure

Tips: rand()

•Field names

•Covered indexes

Page 55: NoSQL Infrastructure

Tips: rand()

•Field names

•Covered indexes

•Collections / databases

Page 56: NoSQL Infrastructure

Backups

What is the use case?

Page 57: NoSQL Infrastructure

Backups

What is the use case?

Fixing user errors?

Point in time restore?

Disaster recovery?

Page 58: NoSQL Infrastructure

Backups

•Disaster recovery

Offsite

Page 59: NoSQL Infrastructure

Backups

•Disaster recovery

Age

Offsite

Page 60: NoSQL Infrastructure

Backups

•Disaster recovery

Age

Offsite

Restore time

Page 61: NoSQL Infrastructure

Backups

Frequency

Consistency

Verification

Page 62: NoSQL Infrastructure

www.flickr.com/photos/daddo83/3406962115/

Monitoring

•System

Disk i/o

Disk use

Page 63: NoSQL Infrastructure

www.flickr.com/photos/daddo83/3406962115/

Monitoring

Disk i/o

Disk use

•System

Swap

Page 64: NoSQL Infrastructure

www.flickr.com/photos/daddo83/3406962115/

Monitoring

Optime

State

•Replication

Page 65: NoSQL Infrastructure

mongostat

Page 66: NoSQL Infrastructure

Monitoring tools

Run yourself

Ganglia

Page 67: NoSQL Infrastructure

Monitoring tools

www.serverdensity.com/nosql

Page 68: NoSQL Infrastructure

www.serverdensity.com/nosql

Page 69: NoSQL Infrastructure

David Mytton

[email protected]

@davidmytton

Woop Japan!

www.serverdensity.com/nosql