Transcript
Page 1: Socialite, the Open Source Status Feed Part 3: Scaling the Data Feed

Building a Social Platform

Part 3: Scaling the Data Feed

Socialite

• Reference Implementation
  – Various Fanout Feed Models
  – User Graph Implementation
  – Content storage

• Configurable models and options
• REST API in Dropwizard (Yammer)
  – https://dropwizard.github.io/dropwizard/

• Built-in benchmarking

https://github.com/10gen-labs/socialite

Architecture

[Architecture diagram; legible labels: Graph Service, Content, Proxy]

Feed Service

• Two main functions:
  – Aggregating “followed” content for a user
  – Forwarding a user’s content to “followers”

• Common implementation models:
  – Fanout on read
    • Query content of all followed users on the fly
  – Fanout on write
    • Add to a “cache” of each user’s timeline for every post
    • Various storage models for the timeline
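The two models can be contrasted with a small in-memory sketch. This is only an illustration under stated assumptions: the `following` map, the `content` array, and the function names are hypothetical stand-ins for the user graph and content services, not Socialite's code.

```javascript
// In-memory stand-ins for the user graph and content store (illustrative only).
const following = { jsr: ["djw", "ian"], ian: ["djw"] };
const content = [];   // every post, in insertion order
const timelines = {}; // materialized timelines, used only by fanout on write

// Invert the "following" map to find who follows a given author.
function followersOf(author) {
  return Object.keys(following).filter((u) => following[u].includes(author));
}

// Fanout on write: store the post once, then push it to each follower's timeline.
function post(author, message) {
  const doc = { _a: author, _m: message, seq: content.length };
  content.push(doc);
  for (const follower of followersOf(author)) {
    (timelines[follower] = timelines[follower] || []).unshift(doc);
  }
  return doc;
}

// Fanout on read: no stored timeline; query all followed authors at read time.
function readTimelineOnRead(user, limit) {
  return content
    .filter((doc) => following[user].includes(doc._a))
    .sort((a, b) => b.seq - a.seq) // newest first
    .slice(0, limit);
}

// Fanout on write: the timeline was built at post time, so a read is a lookup.
function readTimelineOnWrite(user, limit) {
  return (timelines[user] || []).slice(0, limit);
}

post("ian", "earlier from ian");
post("djw", "message from daz");
post("ian", "message from ian");
console.log(readTimelineOnRead("jsr", 10).map((d) => d._m));
console.log(readTimelineOnWrite("jsr", 10).map((d) => d._m));
```

Both reads return the same three messages, newest first. The difference is where the work happens: at read time in the first model, at post time in the second.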

Fanout On Read

Pros

– Simple implementation
– No extra storage for timelines

Cons

– Timeline reads (typically) hit all shards
– Often involves reading more data than required
– May require additional indexing on Content
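A rough model of the first two cons, assuming content is placed on shards by a hash of the author. The hash function, shard count, and author names below are made up for illustration; they do not describe how Socialite or MongoDB actually place documents.

```javascript
// Hypothetical placement rule: route an author's posts to a shard by a simple hash.
function shardFor(author, shardCount) {
  let h = 0;
  for (const ch of author) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h % shardCount;
}

// Set of shards a fanout-on-read timeline query has to contact.
function shardsTouched(followedAuthors, shardCount) {
  return new Set(followedAuthors.map((a) => shardFor(a, shardCount)));
}

// Worst case for a "newest N" read in this model: each contacted shard returns
// up to N candidates, and only N survive the merge.
function worstCaseDocsRead(followedAuthors, shardCount, limit) {
  return shardsTouched(followedAuthors, shardCount).size * limit;
}

const followed = Array.from({ length: 130 }, (_, i) => "user" + i);
console.log(shardsTouched(followed, 3).size);
console.log(worstCaseDocsRead(followed, 3, 50));
```

With many followed authors the set of contacted shards tends toward all of them, and the number of candidate documents read grows with the shard count even though the caller only wants `limit` results.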

Fanout On Write

Pros

– Timeline can be single document read
– Dormant users easily excluded
– Working set minimized

Cons

– Fanout for large follower lists can be expensive
– Additional storage for materialized timelines
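A minimal sketch of the trade-off, assuming dormant users are identified by a last-seen timestamp. The `lastSeen` map, the 30-day window, and the function name are hypothetical; the slides do not say how dormancy is detected.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Fan a post out only to followers seen within `activeWindowMs` (a hypothetical policy).
function fanoutOnWrite(post, followers, lastSeen, timelines, now, activeWindowMs) {
  let writes = 0;
  for (const follower of followers) {
    const seen = lastSeen[follower];
    if (seen === undefined || now - seen > activeWindowMs) continue; // dormant: skip
    (timelines[follower] = timelines[follower] || []).push(post);
    writes += 1;
  }
  return writes; // one timeline update per active follower
}

const now = 1e12; // fixed clock so the example is deterministic
const lastSeen = { jsr: now - DAY_MS, ian: now - 2 * DAY_MS, bob: now - 90 * DAY_MS };
const timelines = {};
const writes = fanoutOnWrite(
  { _a: "djw", _m: "message from daz" },
  ["jsr", "ian", "bob", "eve"],
  lastSeen, timelines, now, 30 * DAY_MS
);
console.log(writes); // 2: bob is outside the window and eve has never been seen
```

The write count scales with the number of active followers, which is the expensive part for large follower lists; skipping dormant users is what keeps the working set small.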

Fanout On Write

• Three different approaches
  – Time buckets
  – Size buckets
  – Cache

• Each has different pros & cons

Timeline Buckets - Time

Upsert to time range buckets for each user

> db.timed_buckets.find().pretty()
{
    "_id" : {"_u" : "jsr", "_t" : 516935},
    "_c" : [
        {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"},
        {"_id" : ObjectId("...dd2"), "_a" : "ian", "_m" : "message from ian"}
    ]
}
{
    "_id" : {"_u" : "ian", "_t" : 516935},
    "_c" : [
        {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"}
    ]
}
{
    "_id" : {"_u" : "jsr", "_t" : 516934},
    "_c" : [
        {"_id" : ObjectId("...da7"), "_a" : "ian", "_m" : "earlier from ian"}
    ]
}
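The upsert into time buckets can be sketched in memory as follows. The bucket width of one hour is an assumption (the slides do not state the unit behind `_t`), and the `Map` stands in for the `timed_buckets` collection.

```javascript
const BUCKET_MS = 60 * 60 * 1000; // assumed bucket width (one hour); not stated in the slides

// The key mirrors the compound _id in the documents: {_u: user, _t: time bucket}.
function bucketKey(user, timestampMs) {
  return JSON.stringify({ _u: user, _t: Math.floor(timestampMs / BUCKET_MS) });
}

// Upsert: create the bucket if it is missing, then append the entry to _c.
function pushToTimeBucket(store, user, entry, timestampMs) {
  const key = bucketKey(user, timestampMs);
  if (!store.has(key)) {
    store.set(key, { _id: JSON.parse(key), _c: [] });
  }
  store.get(key)._c.push(entry);
  return store.get(key);
}

const store = new Map();
const t0 = 516934 * BUCKET_MS; // falls in bucket 516934 under the assumed width
pushToTimeBucket(store, "jsr", { _a: "ian", _m: "earlier from ian" }, t0);
pushToTimeBucket(store, "jsr", { _a: "djw", _m: "message from daz" }, t0 + BUCKET_MS);
pushToTimeBucket(store, "jsr", { _a: "ian", _m: "message from ian" }, t0 + BUCKET_MS + 5);
console.log(store.size); // 2 buckets for jsr: 516934 and 516935
```

Because the bucket is derived from the timestamp alone, a busy hour yields a large document and a quiet hour a small one, which is the sizing inconsistency the size-bucket model addresses.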

Timeline Buckets - Size

More complex, but more consistently sized

> db.sized_buckets.find().pretty()
{
    "_id" : ObjectId("...122"),
    "_c" : [
        {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"},
        {"_id" : ObjectId("...dd2"), "_a" : "ian", "_m" : "message from ian"},
        {"_id" : ObjectId("...da7"), "_a" : "ian", "_m" : "earlier from ian"}
    ],
    "_s" : 3,
    "_u" : "jsr"
}
{
    "_id" : ObjectId("...011"),
    "_c" : [
        {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"}
    ],
    "_s" : 1,
    "_u" : "ian"
}
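A minimal in-memory sketch of size buckets, assuming `_s` counts the entries in `_c` and that a new bucket is started once the current one is full. The cap of 3 and the function name are illustrative. In MongoDB the same effect could plausibly be had with a single upsert that matches on `_u` and on `_s` below the cap, combined with `$push` and `$inc`; that is an assumption, not something the slides show.

```javascript
const MAX_BUCKET_SIZE = 3; // assumed cap, chosen small for illustration

function pushToSizedBucket(buckets, user, entry) {
  // Find a bucket for this user that still has room; otherwise start a new one.
  let bucket = buckets.find((b) => b._u === user && b._s < MAX_BUCKET_SIZE);
  if (!bucket) {
    bucket = { _u: user, _s: 0, _c: [] };
    buckets.push(bucket);
  }
  bucket._c.push(entry);
  bucket._s += 1; // keep the counter in step with the array length
  return bucket;
}

const buckets = [];
for (const m of ["m1", "m2", "m3", "m4"]) {
  pushToSizedBucket(buckets, "jsr", { _a: "ian", _m: m });
}
pushToSizedBucket(buckets, "ian", { _a: "djw", _m: "message from daz" });
console.log(buckets.map((b) => [b._u, b._s])); // [["jsr", 3], ["jsr", 1], ["ian", 1]]
```

The extra bookkeeping (the `_s` counter and the "has room" match) is the added complexity; the payoff is that no bucket grows past the cap.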

Timeline - Cache

Store a limited cache, fall back to fanout on read

– Create single cache doc on demand with upsert
– Limit size of cache with $slice
– Timeout docs with TTL for inactive users

> db.timeline_cache.find().pretty()
{
    "_c" : [
        {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"},
        {"_id" : ObjectId("...dd2"), "_a" : "ian", "_m" : "message from ian"},
        {"_id" : ObjectId("...da7"), "_a" : "ian", "_m" : "earlier from ian"}
    ],
    "_u" : "jsr"
}
{
    "_c" : [
        {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"}
    ],
    "_u" : "ian"
}
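One plausible reading of the cache model, sketched in memory. It assumes the cache document is created when a user reads their timeline, that writes only touch caches that already exist, and that a missing cache (for example, one expired by TTL) is rebuilt via fanout on read. The cap of 3, the function names, and the `fanoutOnRead` callback are all hypothetical.

```javascript
const CACHE_LIMIT = 3; // assumed cap, standing in for a $push with $slice

// Write path: append to an existing cache only; users without a cache doc are skipped.
function pushToCache(cache, user, entry) {
  const doc = cache.get(user);
  if (!doc) return false;
  doc._c.push(entry);
  doc._c = doc._c.slice(-CACHE_LIMIT); // keep only the newest CACHE_LIMIT entries
  return true;
}

// Read path: serve from the cache, or rebuild it on demand via fanout on read.
// `fanoutOnRead(user)` is assumed to return entries ordered oldest to newest.
function readTimeline(cache, user, fanoutOnRead) {
  if (!cache.has(user)) {
    const rebuilt = fanoutOnRead(user).slice(-CACHE_LIMIT);
    cache.set(user, { _u: user, _c: rebuilt });
  }
  return cache.get(user)._c;
}

const cache = new Map();
const history = ["p1", "p2", "p3", "p4"].map((m) => ({ _a: "ian", _m: m }));

console.log(pushToCache(cache, "jsr", { _a: "ian", _m: "p0" })); // false: no cache yet
console.log(readTimeline(cache, "jsr", () => history).map((e) => e._m)); // ["p2", "p3", "p4"]
pushToCache(cache, "jsr", { _a: "djw", _m: "p5" });
console.log(cache.get("jsr")._c.map((e) => e._m)); // ["p3", "p4", "p5"]
```

Deleting an entry from `cache` simulates a TTL expiry: the next read rebuilds it, so losing the cache costs one fanout-on-read query rather than any data.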

Embedding vs Linking Content

Embedded content for direct access

– Great when it is small, predictable in size

Link to content, store only metadata

– Read only desired content on demand
– Further stabilizes cache document sizes

> db.timeline_cache.findOne({"_id" : "jsr"})
{
    "_c" : [
        {"_id" : ObjectId("...dc1")},
        {"_id" : ObjectId("...dd2")},
        {"_id" : ObjectId("...da7")}
    ],
    "_id" : "jsr"
}
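With linked content, a timeline read becomes two steps: fetch the list of references, then fetch only the content actually needed. A minimal sketch, in which the short string ids ("dc1" and so on) are hypothetical stand-ins for the elided ObjectIds and the `Map` stands in for the content collection:

```javascript
// Content store keyed by id (stand-in for the content collection).
const contentById = new Map([
  ["dc1", { _a: "djw", _m: "message from daz" }],
  ["dd2", { _a: "ian", _m: "message from ian" }],
  ["da7", { _a: "ian", _m: "earlier from ian" }],
]);

// Linked timeline: only references are stored, so every entry has the same size.
const timeline = { _id: "jsr", _c: [{ _id: "dc1" }, { _id: "dd2" }, { _id: "da7" }] };

// Resolve only the first `limit` references, preserving timeline order.
function resolveTimeline(timelineDoc, store, limit) {
  return timelineDoc._c
    .slice(0, limit)
    .map((ref) => store.get(ref._id))
    .filter((doc) => doc !== undefined); // tolerate content removed after linking
}

console.log(resolveTimeline(timeline, contentById, 2).map((d) => d._m));
// ["message from daz", "message from ian"]
```

The cost is the second lookup on every read; the benefit is that the timeline document stays small and uniform regardless of message size.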

Socialite Feed Service

• Implemented four models as plugins
  – FanoutOnRead
  – FanoutOnWrite – Buckets (size)
  – FanoutOnWrite – Buckets (time)
  – FanoutOnWrite – Cache

• Switchable by config
• Store content by reference or value
• Benchmark-able back to back
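The idea of switching feed models by configuration can be sketched with a small registry. Every name here (`feedModels`, `createFeedService`, the `model` and `storeContentBy` keys) is hypothetical and chosen for illustration; none of it is Socialite's actual configuration format or plugin API.

```javascript
// Hypothetical registry of feed models; keys are illustrative, not Socialite's config values.
const feedModels = new Map([
  ["fanoutOnRead", { storesTimeline: false }],
  ["bucketsBySize", { storesTimeline: true }],
  ["bucketsByTime", { storesTimeline: true }],
  ["cache", { storesTimeline: true }],
]);

// Pick a model from config; unknown names fail fast instead of silently defaulting.
function createFeedService(config) {
  const model = feedModels.get(config.model);
  if (!model) throw new Error("Unknown feed model: " + config.model);
  return {
    model: config.model,
    storesTimeline: model.storesTimeline,
    storeContentBy: config.storeContentBy || "value", // "value" (embed) or "reference" (link)
  };
}

const service = createFeedService({ model: "cache", storeContentBy: "reference" });
console.log(service);
```

Keeping the choice in configuration is what makes back-to-back benchmarking practical: the same load can be replayed against each model without code changes.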

Benchmark by feed type

Benchmarking the Feed

• Biggest challenge: scaling the feed
• High cost of "fanout on write"

• Popular user posts => # operations:
  – Content collection insert: 1
  – Timeline Cache: on average, 130+ cache document updates
• SCATTER GATHER (slowest shard determines latency)
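A back-of-envelope version of these numbers, assuming exactly one content insert plus one cache update per follower, and assuming a scatter-gather request completes only when its slowest shard responds. The latency figures in the usage are invented for illustration.

```javascript
// Operations triggered by one post under fanout on write.
function operationsPerPost(followerCount) {
  return 1 + followerCount; // 1 content insert + 1 cache update per follower
}

// A scatter-gather request is only as fast as its slowest shard.
function scatterGatherLatency(shardLatenciesMs) {
  return Math.max(...shardLatenciesMs);
}

console.log(operationsPerPost(130));            // 131
console.log(operationsPerPost(130) * 1000);     // 131000 operations/s at 1,000 posts/s
console.log(scatterGatherLatency([4, 6, 35]));  // 35
```

The first function shows why write load is dominated by the follower count rather than the post rate; the second shows why adding shards does not help latency if any one of them is slow.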

Benchmarking the Feed

• Timeline is different from content!
  – "It's a Cache"

IT CAN BE REBUILT!

Benchmarking the Feed

• MongoDB as a cache

Benchmarking the Feed

IT CAN BE REBUILT!

Effect of removing the cache, which forces a fallback to fanout on read and a rebuild of the cache:

[Chart]

Benchmarking the Feed

• Results
  – last two weeks
  – ran load with one million users
  – ran load with ten million users (currently running)
  – used avg send rate 1K/s; 2K/s; reads 10K-20K/s

  – 22 AWS c3.2xlarge servers (7.5GB RAM)
    – 18 across six shards (3 content, 3 user graph)
    – 4 mongos and app machines

  – 2 c2x4xlarge servers (30GB RAM)
    – timeline feed cache (six shards)

Summary

Socialite

• Real Working Implementation
  – Implements All Components
  – Configurable models and options

• Built-in benchmarking

• Questions?
  – We will be at "Ask The Experts" this afternoon!

https://github.com/10gen-labs/socialite

Thank You!