CTO, 10gen
Eliot Horowitz
#MongoDBDays
How to keep your data safe in MongoDB
What can go wrong?
• Network breaks in transit
• Server crashes while processing
• Server blows up after processing a write before replication
• Server processes, crashes, and then a conflicting write happens elsewhere
• All copies burn in a fire
• 20 years later, no one remembers how to read it
What is data safety?
Version 1
Probability that a given write or piece of data is accessible given human intervention and infinite time.
Version 2
Probability that a given write or piece of data is visible in a query.
Single Server – How a Write Works
• Client sends a write operation to server
• Received by server’s TCP stack
• MongoDB process queues write
• Write happens in memory
• Depending on what Write Concern asks for
– Respond immediately
– Wait for data to be journaled, then respond
Single Server – What can go wrong
• Network can go down once message hits other side
• Client doesn’t know what happens without going back and checking
• Write could fail for logical reason (unique key exception)
• Server could crash before journaled
– Write is lost
• Server could crash after journaled
– When server is recovered, write is replayed and safe
• Hard drive can crash irrecoverably
• Data center could lose power for a long period of time
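The crash-before-journal and crash-after-journal cases above can be sketched with a toy model. This is not MongoDB code: the class `SingleServer` and its methods are hypothetical names, and the journal is reduced to a plain list that survives a crash.

```python
class SingleServer:
    """Toy model of one server: writes land in memory first, then in a journal."""

    def __init__(self):
        self.memory = []   # applied in memory, wiped by a crash
        self.journal = []  # persisted, replayed on recovery

    def write(self, doc, wait_for_journal=False):
        # The write always happens in memory first.
        self.memory.append(doc)
        if wait_for_journal:
            # Respond only after the write has been journaled.
            self.flush_journal()
        return "acknowledged"

    def flush_journal(self):
        # Persist every in-memory write that is not yet in the journal.
        self.journal.extend(self.memory[len(self.journal):])

    def crash_and_recover(self):
        # Memory is lost; only journaled writes are replayed.
        self.memory = list(self.journal)


server = SingleServer()
server.write({"_id": 1}, wait_for_journal=True)  # journaled before the response
server.write({"_id": 2})                         # acknowledged before journaling
server.crash_and_recover()

surviving_ids = [doc["_id"] for doc in server.memory]
print(surviving_ids)
```

Both writes were acknowledged, but only the one that waited for the journal is still there after recovery.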
Any single server will fail
Replica Sets
Replica Set - Reminders
• N nodes
• Each node has a full copy of the data
• Replication is asynchronous
Replica Set - Acknowledgements
• “w” : how many servers must apply write before acknowledged
• w=2 : do not acknowledge until write is on two servers
– If primary fails, election guarantees new primary has all writes acknowledged with w=2
• w=majority : do not acknowledge until write is on a majority of nodes in a replica set
– If any primary is elected automatically, all writes acknowledged with w=majority will be on primary.
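A minimal sketch of the acknowledgement rule above, as plain arithmetic rather than driver code. The function names `majority` and `is_acknowledged` are hypothetical, and the model assumes every node counts toward the majority.

```python
def majority(n_nodes):
    """Smallest number of nodes that is more than half of the set."""
    return n_nodes // 2 + 1


def is_acknowledged(w, nodes_with_write, n_nodes):
    """True once enough nodes have applied the write for the requested w."""
    required = majority(n_nodes) if w == "majority" else int(w)
    return nodes_with_write >= required


# A 5-node replica set needs 3 nodes for a majority.
print(majority(5))
print(is_acknowledged(2, nodes_with_write=1, n_nodes=5))           # only the primary has it
print(is_acknowledged(2, nodes_with_write=2, n_nodes=5))           # primary plus one secondary
print(is_acknowledged("majority", nodes_with_write=2, n_nodes=5))  # still short of 3
```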
Good, but not enough…
What if I lose an entire data center?
Replica Set - tags
• A node can have a set of tags
– region=us-east
– color=blue
• Operator configures write level
– Critical – has to be in 3 regions
– Important – has to be in 2 regions
• w=critical
– Do not acknowledge write until it’s in 3 data centers
– Losing an entire data center causes no data loss
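The tag rules above can be sketched as a check over distinct tag values. This is an illustrative model, not server configuration: the node names, the regions other than us-east, and the helper `satisfies_mode` are assumptions made for the example.

```python
# Hypothetical nodes and their tags.
NODE_TAGS = {
    "node-a": {"region": "us-east", "color": "blue"},
    "node-b": {"region": "us-east", "color": "green"},
    "node-c": {"region": "us-west", "color": "blue"},
    "node-d": {"region": "eu-west", "color": "green"},
}

# Operator-defined write levels: tag name -> number of distinct values required.
WRITE_MODES = {
    "critical": {"region": 3},
    "important": {"region": 2},
}


def satisfies_mode(mode, nodes_with_write):
    """True if the nodes holding the write cover enough distinct tag values."""
    for tag, needed in WRITE_MODES[mode].items():
        values = {NODE_TAGS[n][tag] for n in nodes_with_write if tag in NODE_TAGS[n]}
        if len(values) < needed:
            return False
    return True


# Two nodes in the same region do not count as two regions.
print(satisfies_mode("important", ["node-a", "node-b"]))
print(satisfies_mode("important", ["node-a", "node-c"]))
print(satisfies_mode("critical", ["node-a", "node-c", "node-d"]))
```

The point of counting distinct values rather than nodes is that two copies in one data center are lost together.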
What about sharding?
• Same rules apply
• Given a series of writes, they may go to different shards
– A w=majority at the end means all writes on that socket are acknowledged by a majority of the relevant replica set
• Config servers have no impact on fault tolerance/durability, only on admin uptime (or real uptime in a disaster)
Examples
Personal Blog
• Single server
• No replication
• Hourly backups
• If server crashes
– Down until back up
– All acknowledged writes safe
• If server is destroyed
– Have to recover from backup
– Lose up to 1 hour of writes
Departmental App
• Single replica set
• 3 nodes in a single server
• If any single node goes down
– System is still readable/writeable
– Writes done with w=2 are safe
• If 2 nodes go down at the same time
– Only writes with w=3 are safe (bad idea)
– No primary, last node is read-only
Core User Database
• Single replica set
• 3 data centers
– Primary data center: 3 nodes (p=2)
– 2 alternates with 2 nodes each (p=1)
• Different types of operations
– Password change (w=majority)
– Adds a “like” (w=2)
– Login count (w=1)
Core User Database – cont’d
• Lose any single server
– Can only lose a login count
• Lose any 2 servers
– Could lose a “like” if you are unlucky
• Lose a data center
– Still have a majority
– All password changes are safe
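The failure cases above come down to worst-case counting, sketched below for the 7-node layout (3 + 2 + 2). The helper names are hypothetical, and the model only counts copies: it does not simulate elections or rollback.

```python
TOTAL_NODES = 3 + 2 + 2  # primary data center plus two alternates


def majority(n_nodes):
    return n_nodes // 2 + 1


def guaranteed_copies_left(w, nodes_lost):
    """Worst case: every lost node happened to hold a copy of the write."""
    return max(0, w - nodes_lost)


def can_elect_primary(total_nodes, nodes_lost):
    """A primary can be elected only while a majority of nodes is still up."""
    return total_nodes - nodes_lost >= majority(total_nodes)


W_MAJORITY = majority(TOTAL_NODES)  # 4 of 7

# Lose any single server: only a w=1 write (login count) can vanish.
print(guaranteed_copies_left(1, nodes_lost=1))
print(guaranteed_copies_left(2, nodes_lost=1))

# Lose any 2 servers: a w=2 write ("like") can vanish if both held it.
print(guaranteed_copies_left(2, nodes_lost=2))

# Lose the primary data center (3 nodes): a w=majority write keeps a copy,
# and the remaining 4 nodes are still a majority.
print(guaranteed_copies_left(W_MAJORITY, nodes_lost=3))
print(can_elect_primary(TOTAL_NODES, nodes_lost=3))
```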
Choice is a double-edged sword
When to give a choice?
• Give choice over semantics
– Developers and Operators know their needs
• Tuning parameters are dangerous
– System should be smart enough to avoid thousands of knobs
• Defaults should be
– Intuitive and sensible
– Changing is hard
– Always changing a little
Semantics Knobs – too many?
Already have them in different architecture components
• Caching
• Worker queues
• Asynchronous replication
• Synchronous replication
• Two-phase commit
MongoDB gives you the choice of durability semantics from many systems in one.
• Control per write
• One source of truth in architecture
What should you do?
• Pick a default write level for your app
• Only deviate with good reason
• Test disaster scenarios so you know what’s going to happen
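One way to apply the advice above is to keep a single default and a short, explicit list of exceptions. This is a sketch only: choosing w=majority as the default is an assumption for the example, and the operation names and `write_concern_for` helper are hypothetical.

```python
# Assumed default for the whole app.
DEFAULT_WRITE_CONCERN = {"w": "majority"}

# Deviations, each one a deliberate decision (operation names are hypothetical).
OVERRIDES = {
    "add_like": {"w": 2},     # losing one in a rare double failure is tolerable
    "login_count": {"w": 1},  # cheap counter, loss is acceptable
}


def write_concern_for(operation):
    """Return the override if one exists, otherwise the app-wide default."""
    return OVERRIDES.get(operation, DEFAULT_WRITE_CONCERN)


print(write_concern_for("password_change"))
print(write_concern_for("add_like"))
print(write_concern_for("login_count"))
```

Keeping the exceptions in one table makes every deviation from the default visible and reviewable.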
Thank You