How to Keep Your Data Safe in MongoDB

CTO, 10gen

Eliot Horowitz

#MongoDBDays

What can go wrong?

• Network breaks in transit

• Server crashes while processing

• Server blows up after processing a write before replication

• Server processes, crashes, and then a conflicting write happens elsewhere

• All copies burn in a fire

• 20 years later, no one remembers how to read it

What is data safety?

Version 1

Probability that a given write or piece of data is accessible given human intervention and infinite time.

Version 2

Probability that a given write or piece of data is visible in a query.

Single Server – How a Write Works

• Client sends a write operation to server

• Received by server’s TCP stack

• MongoDB process queues write

• Write happens in memory

• Depending on what Write Concern asks for

– Respond immediately

– Wait for data to be journaled, then respond

Single Server – What can go wrong

• Network can go down once message hits other side

• Client doesn’t know what happened without going back and checking

• Write could fail for logical reason (unique key exception)

• Server could crash before journaled

– Write is lost

• Server could crash after journaled

– When server is recovered, write is replayed and safe

• Hard drive can crash irrecoverably

• Data center could lose power for large period of time
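The crash cases above can be sketched with a toy model. This is an illustration only, not MongoDB code: the class `SingleServer` and its methods are hypothetical, and the model assumes that a write not waiting for the journal has not reached the journal by the time the crash happens.

```python
class SingleServer:
    """Toy model of one server: writes land in memory, the journal is durable."""

    def __init__(self):
        self.memory = {}    # in-memory state, lost on a crash
        self.journal = []   # durable log, survives a crash

    def write(self, key, value, wait_for_journal):
        # The write always happens in memory first.
        self.memory[key] = value
        if wait_for_journal:
            # Acknowledged only after the write is in the journal.
            self.journal.append((key, value))
        # Assumption: without wait_for_journal, the crash below happens
        # before the write would have been journaled in the background.
        return "acknowledged"

    def crash_and_recover(self):
        # Memory is wiped; recovery replays the journal.
        self.memory = {}
        for key, value in self.journal:
            self.memory[key] = value


server = SingleServer()
server.write("a", 1, wait_for_journal=False)  # acknowledged immediately
server.write("b", 2, wait_for_journal=True)   # acknowledged after journaling
server.crash_and_recover()
print(server.memory)  # {'b': 2}
```

Both writes were acknowledged, but only the journaled one is still there after recovery.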

Any single server will fail

Replica Sets

Replica Set - Reminders

• N nodes

• Each node has a full copy of the data

• Replication is asynchronous

Replica Set - Acknowledgements

• “w” : how many servers must apply a write before it is acknowledged

• w=2 : do not acknowledge until the write is on two servers

– If the primary fails, election guarantees the new primary has all writes acknowledged with w=2

• w=majority : do not acknowledge until the write is on a majority of nodes in the replica set

– If any primary is elected automatically, all writes acknowledged with w=majority will be on that primary
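One way to see why these guarantees depend on the size of the set is a counting argument: a new primary needs votes from a majority, so a write is visible to every possible electorate only if every majority-sized group contains at least one node holding it. The sketch below checks that by brute force. It is not the election protocol itself, and the function names are hypothetical.

```python
from itertools import combinations


def majority(n):
    """Smallest number of nodes that is more than half of n."""
    return n // 2 + 1


def every_electorate_sees_write(n, w):
    """True if every majority-sized group of nodes overlaps every
    possible set of w nodes that could be holding the write."""
    nodes = range(n)
    for holders in combinations(nodes, w):
        for voters in combinations(nodes, majority(n)):
            if not set(holders) & set(voters):
                return False
    return True


print(every_electorate_sees_write(3, 2))            # True
print(every_electorate_sees_write(5, 2))            # False
print(every_electorate_sees_write(5, majority(5)))  # True
```

In this counting model w=2 is enough for a 3-node set, while a 5-node set needs w=majority.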

Good, but not enough…

What if I lose an entire data center?

Replica Set - tags

• A node can have a set of tags

– region=us-east

– color=blue

• Operator configures write levels

– Critical – has to be in 3 regions

– Important – has to be in 2 regions

• w=critical

– Do not acknowledge a write until it is in 3 data centers

– Losing an entire data center causes no data loss
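The tag rules above amount to counting distinct tag values among the nodes that hold a write. A minimal sketch, assuming a hypothetical four-node set; the node names, the `NODE_TAGS` and `WRITE_LEVELS` tables, and the `satisfies` helper are all illustrative and not part of any MongoDB API.

```python
# Hypothetical node names and tags, for illustration only.
NODE_TAGS = {
    "n1": {"region": "us-east"},
    "n2": {"region": "us-east"},
    "n3": {"region": "us-west"},
    "n4": {"region": "eu"},
}

# Each write level maps a tag name to the number of distinct values required.
WRITE_LEVELS = {
    "critical": {"region": 3},
    "important": {"region": 2},
}


def satisfies(level, nodes_with_write):
    """True if the nodes holding the write cover enough distinct tag values."""
    for tag, required in WRITE_LEVELS[level].items():
        values = {NODE_TAGS[n][tag] for n in nodes_with_write if tag in NODE_TAGS[n]}
        if len(values) < required:
            return False
    return True


print(satisfies("critical", {"n1", "n2", "n3"}))   # False
print(satisfies("important", {"n1", "n2", "n3"}))  # True
print(satisfies("critical", {"n1", "n3", "n4"}))   # True
```

Note that three nodes are not enough for the critical level if two of them share a region.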

What about sharding?

• Same rules apply

• Given a series of writes, they may go to different shards

– A w=majority at the end means all writes on that socket are acknowledged by a majority of the relevant replica set

• Config servers have no impact on fault tolerance/durability, only on admin uptime (or real uptime in a disaster)

Examples

Personal Blog

• Single server

• No replication

• Hourly backups

• If server crashes

– Down until back up

– All acknowledged writes are safe

• If server is destroyed

– Have to recover from backup

– Lose up to 1 hour of writes

Departmental App

• Single replica set

• 3 nodes in a single server

• If any single node goes down

– System is still readable/writeable

– Writes done with w=2 are safe

• If 2 nodes go down at the same time

– Only writes with w=3 are safe (bad idea)

– No primary, last node is read-only
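The two failure cases above follow from simple arithmetic on a 3-node set. A small sketch, using hypothetical helper names, that assumes a primary exists only while a majority of nodes is up and that a write is safe in the worst case only if it is on more nodes than were lost:

```python
def majority(total_nodes):
    return total_nodes // 2 + 1


def set_state(total_nodes, nodes_alive):
    """Availability of the replica set for a given number of live nodes."""
    if nodes_alive >= majority(total_nodes):
        return "readable and writeable"
    if nodes_alive > 0:
        return "read-only (no primary)"
    return "down"


def min_safe_w(nodes_lost):
    """Smallest w that leaves at least one copy after losing nodes_lost nodes."""
    return nodes_lost + 1


print(set_state(3, 2), "/ safe w:", min_safe_w(1))
print(set_state(3, 1), "/ safe w:", min_safe_w(2))
```

With 3 nodes, w=3 is the only level that survives a double failure, but it also means every write blocks as soon as one node is down, which is why the slide calls it a bad idea.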

Core User Database

• Single replica set

• 3 data centers

– Primary data center: 3 nodes (p=2)

– 2 alternates with 2 nodes each (p=1)

• Different types of operations

– Password change (w=majority)

– Adds a “like” (w=2)

– Login count (w=1)

Core User Database – cont’d

• Lose any single server

– Can only lose a login count

• Lose any 2 servers

– Could lose a “like” if you are unlucky

• Lose a data center

– Still have a majority

– All password changes are safe
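The outcomes above can be checked with the node counts from the previous slide: 3 + 2 + 2 = 7 nodes, so a majority is 4. The sketch below is a worst-case count only; the data center names and helper functions are hypothetical.

```python
# Node counts per data center, as described in the example.
DATA_CENTERS = {"primary": 3, "alternate-1": 2, "alternate-2": 2}
TOTAL_NODES = sum(DATA_CENTERS.values())  # 7
MAJORITY = TOTAL_NODES // 2 + 1           # 4

# Number of nodes that must hold a write before it is acknowledged.
COPIES_REQUIRED = {
    "password change": MAJORITY,  # w=majority
    "like": 2,                    # w=2
    "login count": 1,             # w=1
}


def survives_worst_case(copies, servers_lost):
    """A write is guaranteed to survive only if it is on more
    servers than were lost."""
    return copies > servers_lost


def majority_remains_without(data_center):
    """True if the set can still elect a primary after losing a data center."""
    return TOTAL_NODES - DATA_CENTERS[data_center] >= MAJORITY


for operation, copies in COPIES_REQUIRED.items():
    print(f"{operation}: guaranteed to survive the loss of {copies - 1} server(s)")

largest = max(DATA_CENTERS.values())
print("password change survives losing the largest data center:",
      survives_worst_case(MAJORITY, largest))
```

Even the largest data center holds only 3 of 7 nodes, so losing it leaves exactly a majority and at least one copy of every w=majority write.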

Choice is a double-edged sword

When to give a choice?

• Give choice over semantics

– Developers and Operators know their needs

• Tuning parameters are dangerous

– System should be smart enough to avoid thousands of knobs

• Defaults should be

– Intuitive and sensible

– Changing is hard

– Always changing a little

Semantics Knobs – too many?

Already have them in different architecture components

• Caching

• Worker queues

• Asynchronous replication

• Synchronous replication

• Two-phase commit

MongoDB gives you the choice of durability semantics from many systems in one.

• Control per write

• One source of truth in architecture

What should you do?

• Pick a default write level for your app

• Only deviate with good reason

• Test disaster scenarios so you know what’s going to happen
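One way to follow the first two points is to keep a single default in one place and list every deviation next to its reason. A minimal sketch; the names `DEFAULT_WRITE_LEVEL`, `DEVIATIONS`, and `write_level_for` are hypothetical, and the chosen default is only an example.

```python
# The application-wide default: one place to look, one place to change.
DEFAULT_WRITE_LEVEL = "majority"

# Every deviation is listed explicitly, together with its reason.
DEVIATIONS = {
    "login count": (1, "losing a single increment is acceptable"),
}


def write_level_for(operation):
    """Return the write level for an operation, falling back to the default."""
    if operation in DEVIATIONS:
        level, _reason = DEVIATIONS[operation]
        return level
    return DEFAULT_WRITE_LEVEL


print(write_level_for("password change"))  # majority
print(write_level_for("login count"))      # 1
```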

Thank You
