Technical Director, 10gen @forjared Jared Rosoff #MongoSV 2012 Schema Design -- Inboxes!

MongoDB Advanced Schema Design - Inboxes


Schema design considerations for building status feeds and messaging systems at scale.


Page 1: MongoDB Advanced Schema Design - Inboxes

Technical Director, 10gen

@forjared

Jared Rosoff

#MongoSV 2012

Schema Design -- Inboxes!

Agenda

• Problem overview

• Design Options
  – Fan out on Read
  – Fan out on Write
  – Fan out on Write with Bucketing

• Conclusions

Problem Overview

Let’s get Social

Sending Messages

?

Reading my Inbox

?

Design Options

3 Approaches (there are more)

• Fan out on Read

• Fan out on Write

• Fan out on Write with Bucketing

Fan out on read

• Generally, not the right approach

• 1 document per message sent

• Multiple recipients in an array key

• Reading an inbox is finding all messages with my own name in the recipient field

• Requires scatter-gather on sharded cluster

• Then a lot of random IO on a shard to find everything

Fan out on Read

// Shard on "from"
sh.shardCollection("myapp.messages", { "from": 1 })

// Make sure we have an index to handle inbox reads
// (ensureIndex is the older name; current shells use createIndex)
db.messages.ensureIndex({ "to": 1, "sent": 1 })

msg = {
  from: "Joe",
  to: ["Bob", "Jane"],
  sent: new Date(),
  message: "Hi!"
}

// Send a message: one document, stored once
db.messages.save(msg)

// Read my inbox: matches any message whose "to" array contains "Joe"
db.messages.find({ to: "Joe" }).sort({ sent: -1 })
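To make the scatter-gather cost concrete, here is a minimal in-memory sketch of fan out on read in plain JavaScript. It does not talk to MongoDB. The names `SHARD_COUNT`, `shardFor`, `sendMessage` and `readInbox` are hypothetical, and the toy hash only stands in for routing on the "from" shard key.

```javascript
// In-memory sketch of fan out on read (assumption: 3 shards, toy hash routing).
const SHARD_COUNT = 3;
const shards = Array.from({ length: SHARD_COUNT }, () => []);

// Toy hash standing in for routing on the "from" shard key.
function shardFor(key) {
  let sum = 0;
  for (const ch of key) sum += ch.charCodeAt(0);
  return sum % SHARD_COUNT;
}

// Sending is one write to one shard: the message is stored once.
function sendMessage(msg) {
  shards[shardFor(msg.from)].push(msg);
  return 1; // number of writes performed
}

// Reading an inbox has to ask every shard, then merge the results.
function readInbox(user) {
  let shardsQueried = 0;
  const found = [];
  for (const shard of shards) {
    shardsQueried += 1;
    for (const msg of shard) {
      if (msg.to.includes(user)) found.push(msg);
    }
  }
  found.sort((a, b) => b.sent - a.sent); // newest first
  return { messages: found, shardsQueried };
}

sendMessage({ from: "Joe", to: ["Bob", "Jane"], sent: new Date(1000), message: "Hi!" });
sendMessage({ from: "Jane", to: ["Bob"], sent: new Date(2000), message: "Lunch?" });

const inbox = readInbox("Bob");
// Bob sees "Lunch?" then "Hi!", after querying all 3 shards.
console.log(inbox.messages.map((m) => m.message), inbox.shardsQueried);
```

Each send costs a single write, but every inbox read touches all shards, which is the tradeoff the bullets describe.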

Fan out on read – Send Message

[Diagram: "Send Message" shown against Shard 1, Shard 2 and Shard 3]

Fan out on read – Inbox Read

[Diagram: "Read Inbox" shown against Shard 1, Shard 2 and Shard 3]

Fan out on write

• Tends to scale better than fan out on read

• 1 document per recipient

• Reading my inbox is just finding all of the messages with me as the recipient

• Can shard on recipient, so inbox reads hit one shard

• But still lots of random IO on the shard

Fan out on Write

// Shard on "recipient" and "sent"
sh.shardCollection("myapp.messages", { "recipient": 1, "sent": 1 })

msg = {
  from: "Joe",
  to: ["Bob", "Jane"],
  sent: new Date(),
  message: "Hi!"
}

// Send a message: insert one copy per recipient
msg.to.forEach(function (recipient) {
  // Build a fresh document each time so every copy gets its own _id
  db.messages.insert({
    from: msg.from, to: msg.to, recipient: recipient,
    sent: msg.sent, message: msg.message
  })
})

// Read my inbox
db.messages.find({ recipient: "Joe" }).sort({ sent: -1 })
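The same idea can be sketched in memory to show where the work moves. This is a plain JavaScript illustration, not MongoDB code. `NUM_SHARDS`, `shardForRecipient`, `fanOutSend` and `readInboxFromOneShard` are hypothetical names, and the toy hash stands in for routing on the "recipient" shard key.

```javascript
// In-memory sketch of fan out on write (assumption: 3 shards, toy hash routing).
const NUM_SHARDS = 3;
const inboxShards = Array.from({ length: NUM_SHARDS }, () => []);

// Toy hash standing in for routing on the "recipient" shard key.
function shardForRecipient(name) {
  let sum = 0;
  for (const ch of name) sum += ch.charCodeAt(0);
  return sum % NUM_SHARDS;
}

// One copy per recipient, each routed by the recipient's name.
function fanOutSend(msg) {
  let writes = 0;
  for (const recipient of msg.to) {
    const copy = { ...msg, recipient }; // separate object per recipient
    inboxShards[shardForRecipient(recipient)].push(copy);
    writes += 1;
  }
  return writes;
}

// All of a user's copies live on one shard, so only that shard is read.
function readInboxFromOneShard(user) {
  return inboxShards[shardForRecipient(user)]
    .filter((m) => m.recipient === user)
    .sort((a, b) => b.sent - a.sent); // newest first
}

const writes = fanOutSend({
  from: "Joe", to: ["Bob", "Jane"], sent: new Date(1000), message: "Hi!",
});
// Two recipients means two writes; Bob's inbox holds one message.
console.log(writes, readInboxFromOneShard("Bob").length);
```

The send cost now grows with the number of recipients, while a read only scans the one shard that owns the recipient.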

Fan out on write – Send Message

[Diagram: "Send Message" shown against Shard 1, Shard 2 and Shard 3]

Fan out on write – Read Inbox

[Diagram: "Read Inbox" shown against Shard 1, Shard 2 and Shard 3]

Fan out on write with bucketing

• Generally the best approach

• Each “inbox” document is an array of messages

• Append a message onto “inbox” of recipient

• Bucket inbox documents so there aren’t too many messages per document

• Can shard on recipient, so inbox reads hit one shard

• 1 or 2 documents to read the whole inbox

Fan out on Write with Bucketing

// Shard on "owner / sequence"
sh.shardCollection("myapp.inbox", { "owner": 1, "sequence": 1 })
sh.shardCollection("myapp.users", { "user_name": 1 })

msg = {
  from: "Joe",
  to: ["Bob", "Jane"],
  sent: new Date(),
  message: "Hi!"
}

// Send a message
msg.to.forEach(function (recipient) {
  // Atomically bump the recipient's message count; 50 messages per bucket
  var user = db.users.findAndModify({
    query: { user_name: recipient },
    update: { "$inc": { "msg_count": 1 } },
    upsert: true,
    new: true
  })
  var sequence = Math.floor(user.msg_count / 50)

  // Append the message to the current bucket, creating it if needed
  db.inbox.update(
    { owner: recipient, sequence: sequence },
    { "$push": { "messages": msg } },
    { upsert: true }
  )
})

// Read my inbox: the two newest buckets
db.inbox.find({ owner: "Joe" }).sort({ sequence: -1 }).limit(2)
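The bucket arithmetic is easy to check outside the database. The sketch below mimics the counter-then-append flow with plain JavaScript maps. `BUCKET_SIZE`, `deliver` and `readNewestBuckets` are hypothetical names; the maps stand in for the users and inbox collections and are an assumption of this example, not part of the talk.

```javascript
// In-memory sketch of bucketed fan out on write.
const BUCKET_SIZE = 50;       // same divisor as the "/ 50" in the slide's code
const msgCounts = new Map();  // stands in for the users collection
const buckets = new Map();    // stands in for the inbox collection, keyed "owner/sequence"

// Increment the recipient's counter, derive the bucket, append the message.
function deliver(recipient, msg) {
  const count = (msgCounts.get(recipient) || 0) + 1; // like $inc on msg_count
  msgCounts.set(recipient, count);
  const sequence = Math.floor(count / BUCKET_SIZE);
  const key = `${recipient}/${sequence}`;
  if (!buckets.has(key)) {
    buckets.set(key, { owner: recipient, sequence, messages: [] }); // like upsert
  }
  buckets.get(key).messages.push(msg); // like $push
  return sequence;
}

// Return the newest buckets for an owner, highest sequence first.
function readNewestBuckets(owner, limit = 2) {
  return [...buckets.values()]
    .filter((b) => b.owner === owner)
    .sort((a, b) => b.sequence - a.sequence)
    .slice(0, limit);
}

for (let i = 1; i <= 120; i++) {
  deliver("Bob", { from: "Joe", sent: new Date(i), message: `msg ${i}` });
}

const newest = readNewestBuckets("Bob");
// Prints [ [ 2, 21 ], [ 1, 50 ] ]: bucket 2 has 21 messages, bucket 1 has 50.
console.log(newest.map((b) => [b.sequence, b.messages.length]));
```

Because the counter is incremented before the division, the first bucket (sequence 0) holds 49 messages and every later full bucket holds 50. Reading the two newest buckets therefore returns between 1 and 99 of the most recent messages, depending on how full the current bucket is.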

Bucketed fan out on write - Send

[Diagram: "Send Message" shown against Shard 1, Shard 2 and Shard 3]

Bucketed fan out on write - Read

[Diagram: "Read Inbox" shown against Shard 1, Shard 2 and Shard 3]

Discussion

Tradeoffs

| | Fan out on Read | Fan out on Write | Bucketed Fan out on Write |
|---|---|---|---|
| Send Message Performance | Best: single shard, single write | Good: shard per recipient, multiple writes | Worst: shard per recipient, appends (grows) |
| Read Inbox Performance | Worst: broadcast to all shards, random reads | Good: single shard, random reads | Best: single shard, single read |
| Data Size | Best: message stored once | Worst: copy per recipient | Worst: copy per recipient |
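The table can be turned into a rough back-of-the-envelope model. The function below is only a sketch under simplifying assumptions: `estimateCosts` and its parameter names are hypothetical, the bucketed write count ignores the extra counter update per recipient, and the bucketed read count assumes evenly filled buckets.

```javascript
// Rough cost model for the three designs (simplified, illustrative only).
function estimateCosts({ recipients, shards, inboxSize, bucketSize }) {
  return {
    fanOutOnRead: {
      writesPerSend: 1,          // message stored once
      shardsPerRead: shards,     // scatter-gather across the cluster
      docsPerRead: inboxSize,    // one document per message
      copiesStored: 1,
    },
    fanOutOnWrite: {
      writesPerSend: recipients, // one copy per recipient
      shardsPerRead: 1,          // sharded on recipient
      docsPerRead: inboxSize,    // still one document per message
      copiesStored: recipients,
    },
    bucketedFanOutOnWrite: {
      writesPerSend: recipients, // one append per recipient
      shardsPerRead: 1,
      docsPerRead: Math.ceil(inboxSize / bucketSize), // messages grouped into buckets
      copiesStored: recipients,
    },
  };
}

const costs = estimateCosts({ recipients: 3, shards: 3, inboxSize: 100, bucketSize: 50 });
console.log(costs);
```

With these sample numbers, fan out on read does 1 write but queries 3 shards, while the bucketed design does 3 appends and reads about 2 documents from a single shard.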

Things to consider

• Lots of recipients
  – Fan out on write might become prohibitive
  – Consider introducing a “Group”

• Very large message size
  – Multiple copies of messages can be a burden
  – Consider a single copy of the message with a “pointer” per inbox

• More writes than reads
  – Fan out on read might be okay
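The "pointer per inbox" idea for very large messages can be sketched as follows. This is an in-memory illustration in plain JavaScript. `messageStore`, `inboxPointers`, `sendWithPointers` and `readInboxViaPointers` are hypothetical names, and the layout is one possible reading of the bullet rather than a schema given in the talk.

```javascript
// In-memory sketch: store each message body once, fan out only small pointers.
const messageStore = new Map();  // message id -> full message (single copy)
const inboxPointers = new Map(); // owner -> array of { messageId, sent }
let nextId = 1;

// Write the body once, then append a lightweight pointer to each inbox.
function sendWithPointers(msg) {
  const id = nextId++;
  messageStore.set(id, msg);
  for (const recipient of msg.to) {
    if (!inboxPointers.has(recipient)) inboxPointers.set(recipient, []);
    inboxPointers.get(recipient).push({ messageId: id, sent: msg.sent });
  }
  return id;
}

// Resolve an inbox by following its pointers, newest first.
function readInboxViaPointers(owner) {
  const pointers = inboxPointers.get(owner) || [];
  return [...pointers]
    .sort((a, b) => b.sent - a.sent)
    .map((p) => messageStore.get(p.messageId));
}

sendWithPointers({ from: "Joe", to: ["Bob", "Jane"], sent: new Date(1000), message: "Hi!" });
console.log(messageStore.size, readInboxViaPointers("Bob")[0].message);
```

The price of the smaller footprint is an extra lookup per message on read, so this trades some of the read benefit of fan out on write for less duplicated data.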

Comments – where do they live?

Conclusion

Summary

• Multiple ways to model status updates

• Bucketed fan out on write is typically the better approach

• Think about how your model distributes across shards

• Think about how much random IO needs to happen on a shard

Technical Director, 10gen

Jared Rosoff

#MongoSV

Thank You