MongoDB large scale data-centric architectures QConSF 2012 Kenny Gorman Founder, ObjectRocket @objectrocket @kennygorman

MongoDB Large-scale Data Centric Architectures

Page 1: MongoDB Large-scale Data Centric Architectures

7/30/2019 MongoDB Large-scale Data Centric Architectures

http://slidepdf.com/reader/full/mongodb-large-scale-data-centric-architectures 1/27

MongoDB large scale data-centric architectures

QConSF 2012

Kenny Gorman

Founder, ObjectRocket

@objectrocket @kennygorman

MongoDB at scale

● Designing for scale

● Techniques to ease pain

● Things to avoid

What is scale?

● Scale; massive adoption/usage

● Scale; a very big, busy, or tricky system.

● Scale; I just want to sleep.

● Scale; The docs just seem silly now.

● Scale; Am I the only one with this problem?

Vintage playbook

● No joins, foreign keys, triggers, stored procs

● De-normalize until it hurts

● Split vertically, then horizontally.

○ Conventional wisdom. eBay an early pioneer.

● Many DBAs, sysadmins, storage engineers, etc.

● Huge hardwarez

● You have your own datacenter or co-location

● You realize your ORM has been screwing you

● You better have some clever folks on staff, shit gets weird at scale

Example:

# keep retrying the schema change until it finally goes through
while True:
    try:
        add_column()
        exit()
    except Exception as e:
        print("%s; crud" % e)

Vintage scaling playbook

Scaling today

● Many persistence store options

● Horizontal scalability is expected

● Cloud-based architectures prevalent

○ Hardware and data centers are abstracted from developers

● Focus on rapid development

● Mostly developers, maybe some devops

● Expectations that stuff just works

● Technologies are less mature, fewer tunables

Enter MongoDB

● Document-based NoSQL database

● JSON/BSON (www.bson.org)

● Developer's dream

● Ops nightmare (for now)

● Schema-less

● Built-in horizontal scaling framework

● Built-in replication

● ~65% of deployments in the cloud

MongoDB challenges

● The lock scope

● Visibility

● Schema

● When bad things happen

A MongoDB document

{
  _id : ObjectId("4e77bb3b8a3e000000004f7a"),
  when : Date("2011-09-19T02:10:11.3Z"),
  author : "alex",
  title : "No Free Lunch",
  text : "This is the text of the post",
  tags : [ "business", "ramblings" ],
  votes : 5,
  voters : [ "jane", "joe", "spencer" ]
}

MongoDB keys for success at scale

● Design Matters!

Design for scale; macro level

● Keep it simple

● Break up workloads

● Tune your workloads

● NoORM; dump it

● Shard early

● Replicate

● Load test pre-production!

Your success is only as good as the thing you do a million times a second

Design for scale; specifics

● Embedded vs not

● Indexing

○ The right amount

○ Covered

● Atomic operations

● Use profiler and explain()

Example; document embedding

// yes, guaranteed 1 i/o

{userid: 100, post_id: 10, comments:["comment1","comment2"..]}

db.blog.find({"userid":100}).explain()

{ ..., "nscannedObjects" : 1, ... }

// no

{userid: 100, post_id: 10, comment: "hi, this is kewl"}

{userid: 100, post_id: 10, comment: "thats what you think"}

{userid: 100, post_id: 10, comment: "I am thirsty"}

db.blog.find({"userid":100}).explain()

{ ..., "nscannedObjects" : 3, ... }
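
The move from the "no" layout to the "yes" layout can be sketched in plain Python. This is only an illustration under assumptions: the helper name `embed_comments` is hypothetical, the rows are the three comment documents from the slide, and nothing here talks to a real MongoDB server.

```python
from collections import defaultdict

def embed_comments(rows):
    """Fold one-document-per-comment rows into one document per (userid, post_id)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["userid"], row["post_id"])].append(row["comment"])
    # One output document per post, with its comments embedded as an array.
    return [
        {"userid": userid, "post_id": post_id, "comments": comments}
        for (userid, post_id), comments in grouped.items()
    ]

rows = [
    {"userid": 100, "post_id": 10, "comment": "hi, this is kewl"},
    {"userid": 100, "post_id": 10, "comment": "thats what you think"},
    {"userid": 100, "post_id": 10, "comment": "I am thirsty"},
]
docs = embed_comments(rows)
print(len(rows), "documents before,", len(docs), "after")
```

The embedded form is what lets a lookup by userid touch a single document instead of one document per comment.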

Example; covered Indexes

mongos> db.foo.find({"foo":1},{_id:0,"foo":1}).explain()

{

"cursor" : "BtreeCursor foo_-1",
"isMultiKey" : false,
"n" : 1,
"nscannedObjects" : 1,
"nscanned" : 1,
"nscannedObjectsAllPlans" : 1,
"nscannedAllPlans" : 1,
"scanAndOrder" : false,
"indexOnly" : true,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 0,
"indexBounds" : { "foo" : [[1,1]]}

}
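
A rough way to reason about when a query can come back "indexOnly" is sketched below. It is a simplified rule of thumb, not MongoDB's planner: the function `is_covered` is hypothetical, and it ignores multikey indexes, embedded fields, and sharding details. The idea is that every filtered field and every returned field must live in the index, and `_id` must be projected out unless it is indexed too.

```python
def is_covered(index_fields, query_fields, projection):
    """Simplified check: can the index alone answer this query?"""
    indexed = set(index_fields)
    # Every field used in the filter must be part of the index.
    if not set(query_fields) <= indexed:
        return False
    # Fields explicitly switched on in the projection.
    included = {field for field, on in projection.items() if on}
    if not included:
        # No inclusion list means whole documents are returned.
        return False
    # _id comes back by default unless it is explicitly excluded.
    if projection.get("_id", 1):
        included.add("_id")
    return included <= indexed

# Same shape as the query on the slide: filter on foo, return only foo.
print(is_covered(["foo"], ["foo"], {"_id": 0, "foo": 1}))
# Forgetting to exclude _id forces a document fetch.
print(is_covered(["foo"], ["foo"], {"foo": 1}))
```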

Design for scale

● Shard keys

● Tradeoffs

● Local vs Scattered

● Figure out at design time

Example; Shard Keys

● Tuning for writes

● Queries are scattered

{

_id: ObjectId("4e77bb3b8a3e000000004f7a"),

skey: md5(userid+date), // shard key

payload: {...}

}
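
A minimal sketch of how such a hashed key spreads writes, assuming the key is the MD5 hex digest of the userid and date concatenated as strings. The names `make_skey` and `bucket_for`, the date string, and the modulo bucketing are illustrative assumptions; real chunk placement is range-based and decided by the cluster, not by a modulo.

```python
import hashlib
from collections import Counter

def make_skey(userid, date):
    """Build the shard key as md5(userid + date), hex encoded."""
    return hashlib.md5(f"{userid}{date}".encode("utf-8")).hexdigest()

def bucket_for(skey, n_buckets):
    """Stand-in for chunk placement: map the hash onto n_buckets."""
    return int(skey, 16) % n_buckets

# 1000 users writing on the same (hypothetical) day.
keys = [make_skey(userid, "2012-11-08") for userid in range(1000)]
spread = Counter(bucket_for(key, 4) for key in keys)
print(sorted(spread.items()))
```

Because consecutive userids hash to unrelated values, inserts land on all buckets rather than piling onto one. The cost is the one the slide names: a query by userid alone cannot be computed into a single key, so it scatters.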

Example; Shard Keys

● Tuning for reads

○ Localized queries

○ Writes reasonably distributed

{

userid: 999, // shard key

post: {"userid":23343,

"capt":"hey checkout my pic",

"url":"http://www.lolcats.com"

}

}
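
The difference between a localized and a scattered query can be sketched as a toy router. Everything here is assumed for illustration: the chunk boundaries in `BOUNDS`, the shard names, and the function `shards_for` are hypothetical, and userids are assumed to be non-negative.

```python
import bisect

# Hypothetical layout: SHARDS[i] owns userids from BOUNDS[i] up to BOUNDS[i + 1];
# the last shard owns everything from BOUNDS[-1] upward.
BOUNDS = [0, 1000, 2000, 3000]
SHARDS = ["shard0", "shard1", "shard2", "shard3"]

def shards_for(query):
    """Return the shards a query must visit."""
    if "userid" in query:
        # Shard key present: exactly one shard owns this range.
        idx = bisect.bisect_right(BOUNDS, query["userid"]) - 1
        return [SHARDS[idx]]
    # No shard key: the query has to be sent everywhere.
    return list(SHARDS)

print(shards_for({"userid": 999}))
print(shards_for({"post.url": "http://www.lolcats.com"}))
```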

Design for scale; architecture

● Engage all processors

○ Single writer lock

○ Concurrency

● Replication

○ Understand elections, and fault zones

○ Understand the 'shell game', rebuilding slaves

■ Fragmentation

○ Client connections, getLastError

● Sharding

○ Pick good keys or die

○ Get enough I/O

Design for scale; architecture

● I/O

○ You need it

○ Conventional wisdom is wrong

○ Maybe they don't have big databases?

Example; 'shell game'

[Diagram: APP, one Master, two Slaves]

Perform work on slave, then stepDown() back to primary
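
One possible reading of the shell game is a rolling plan: do the work on each slave in turn, then step the master down and do the same work on it last. The sketch below only builds that ordering. The function `shell_game_plan`, the step names, and the host names are hypothetical, and the real steps (resyncing a member, calling `rs.stepDown()`) would be run by hand or by separate tooling.

```python
def shell_game_plan(members, primary):
    """Order maintenance so the primary is only touched after it steps down."""
    secondaries = [m for m in members if m != primary]
    # Work on every secondary first; the application keeps using the primary.
    plan = [("rebuild", m) for m in secondaries]
    # Hand the primary role to a freshly rebuilt member.
    plan.append(("stepDown", primary))
    # The old primary is now a secondary and can be rebuilt too.
    plan.append(("rebuild", primary))
    return plan

for step, host in shell_game_plan(["db1", "db2", "db3"], primary="db1"):
    print(step, host)
```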

Example; network partition

[Diagram: A ????? B]

"replSet can't see a majority, will not try to elect self"
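
The rule behind that log line is a strict majority of voting members. A minimal sketch, assuming each member has one vote and counts itself among the members it can see (the function name `can_elect` is hypothetical):

```python
def can_elect(visible_votes, total_votes):
    """A member may stand for election only if it sees a strict majority of votes."""
    return visible_votes * 2 > total_votes

# Two-member set split by a partition: each side sees only itself.
print(can_elect(1, 2))
# Three-member set split 2/1: only the side with two members can elect.
print(can_elect(2, 3), can_elect(1, 3))
```

This is why a two-member set partitioned into A and B ends up with no primary on either side, and why member placement across fault zones matters.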

Example; write concern

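// WriteConcern.SAFE waits for the server to acknowledge the write;
// to also wait for the local journal commit, use WriteConcern.JOURNAL_SAFE

BasicDBObject doc = new BasicDBObject();

doc.put("payload","foo");

coll.insert(doc, WriteConcern.SAFE);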

Random parting tips

● Monitor elections, and who is primary

● Write scripts to kill sessions > N ms or based on your architecture

● Automate or die

● Tools

○ Mongostat

○ Historical performance
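
The selection half of a kill script can be sketched as a filter over a list shaped like the `inprog` array returned by `db.currentOp()`. This is a sketch under assumptions: the function `ops_to_kill` and the sample operations are hypothetical, the threshold is in seconds because it relies on the `secs_running` field, and the actual kill (for example `db.killOp(opid)` in the shell) is left out.

```python
def ops_to_kill(inprog, max_secs, ops=("query",)):
    """Pick opids of operations of the given types running longer than max_secs."""
    return [
        op["opid"]
        for op in inprog
        if op.get("op") in ops and op.get("secs_running", 0) > max_secs
    ]

# Hypothetical snapshot of in-progress operations.
inprog = [
    {"opid": 101, "op": "query", "ns": "blog.posts", "secs_running": 45},
    {"opid": 102, "op": "query", "ns": "blog.posts", "secs_running": 1},
    {"opid": 103, "op": "insert", "ns": "blog.posts", "secs_running": 90},
]
print(ops_to_kill(inprog, max_secs=30))
```

Restricting the filter to reads by default is a deliberate choice here: which operation types are safe to kill depends on your architecture, as the slide says.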

Gotchas, risks, shit that will make you nuts

● Logical schema corruption

● That lock!

● Not enough I/O

● Engaging all processors

● Visibility

● Not understanding how MongoDB works

● FUD

Contact

@kennygorman

@objectrocket

[email protected]

https://www.objectrocket.com

https://github.com/objectrocket/rocketstat