Robots at MySpace: Scaling a .NET Website with Microsoft Robotics Studio
Erik Nelson, Group Architect / enelson@myspace-inc.com
Akash Patel, Senior Architect / apatel@myspace-inc.com
Tony Chow, Development Manager / tchow@myspace-inc.com
Core Platform
MySpace.com
• MySpace is the largest Social Network
• … based in Los Angeles
• The largest .NET website in the world
• Can’t be big without Caching
• >6 million requests/second to the middle tier at peak
• Many TB of user generated cached data
• Data must not be stale
  – Users hate that
• New and interesting features require more than just a “cache”
• Our middle tier is called Data Relay, because it does much more than just cache.
• Data Relay has been in production since 2006!
CCR
• What is CCR?
• Coordination and Concurrency Runtime
• Part of the Robotics Toolkit
• Provides
  – Thread pools (Dispatcher)
  – Job queues (DispatcherQueue)
  – Flexible ways of connecting actions to those queues and pools (Ports and Arbiter)
Graphs are Cool
[Chart: Requests/Sec]
The Stream
• The stream is everywhere
• The stream is extremely volatile data
  – Both the “who” and the “what”
• Updates from our Users don’t just go to us
  – Twitter, Google, etc
• ~60,000 Stream Queries per Second
  – Over 5 billion a day
• 35 million updates a day
• ~5 TB of Data in our Stream
Why not a DB?
• We decided to be “publisher” based and not “subscriber” based
• For us, that would involve a massively distributed query
  – Hundreds of databases
• Decoupling writing from reading
OK So How Then?
Robots!
Robots?
• Lots of inputs and outputs!
• Need for minimum latency and decoupling between jobs!
• Just like a robot!
Abusing a Metaphor
• Our robots must
  – Incorporate incoming messages
  – Tell their neighbors about any messages they receive
  – Be able to answer lots of questions
  – Talk to other robots when they need more info
  – Deal with other robots being slow or missing
How Does CCR Help?
• Division of labor
  – Incorporate incoming messages
  – Tell their neighbors about any messages they receive
  – Be able to answer lots of questions
  – Talk to other robots when they need more info
  – Deal with other robots being slow or missing
How Does CCR Help?
• Queue Control
  – We can has Buckets
• Queue Division
  – Different destinations have their own queues
• Strict Pool Control
Akash Patel
Senior Architect
Activity Stream
• Activity Stream (News Feed)
  – Aggregation of your friends’ activities
• Activity Stream Generation
  – Explicitly: Status Update
  – Implicitly: Post New Photo Album
  – Auto: 3rd Party App
Imagine this is You …
Friends & Activities
Your Friends …
You post a new status update … an index is created
You upload a new photo album … index updated
Index grows with new activities
Publisher Based Cache
- Activity Associated to Publishing User
Where’s the Activity Stream?
• Activity Stream Generated by Querying
  – Filter & Merge Friends’ Activities
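The filter-and-merge idea can be sketched in plain C#. This is only an illustration, not Data Relay's code: the `Activity` type, the `StreamSketch` class, and the in-memory dictionary are hypothetical stand-ins for the distributed index cache. Each write touches only the publisher's own index, and a reader's stream is assembled at query time from the indexes of that reader's friends.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical activity record; the field names are assumptions.
public sealed class Activity
{
    public int PublisherId;
    public DateTime When;
    public string Text;
}

public static class StreamSketch
{
    // One index per publishing user, keyed by the publisher's id.
    static readonly Dictionary<int, List<Activity>> IndexByPublisher =
        new Dictionary<int, List<Activity>>();

    // Write path: an activity is stored only in the publisher's own index.
    public static void Publish(int publisherId, DateTime when, string text)
    {
        List<Activity> index;
        if (!IndexByPublisher.TryGetValue(publisherId, out index))
        {
            index = new List<Activity>();
            IndexByPublisher[publisherId] = index;
        }
        index.Add(new Activity { PublisherId = publisherId, When = when, Text = text });
    }

    // Read path: filter to the reader's friends, merge, newest first.
    public static List<Activity> GetStream(IEnumerable<int> friendIds, int maxItems)
    {
        return friendIds
            .Where(id => IndexByPublisher.ContainsKey(id))
            .SelectMany(id => IndexByPublisher[id])
            .OrderByDescending(a => a.When)
            .Take(maxItems)
            .ToList();
    }
}
```

Calling `StreamSketch.GetStream(friendIds, 20)` returns at most 20 activities, newest first. Friends who have published nothing contribute nothing.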
Friends & Activities
Very Volatile
Stream Architecture
• Utilizes Data Relay Framework
  – Message Based System
    • Fire & Forget Msgs [Save, Delete, Update]
    • RoundTrip Msgs [Get, Query, Execute]
  – Replication & Clustering Built-in
• Index Cache – Not a Key/Value Store
  – Storage & Querying System
  – 2 Tiered System (separates index from data)
Data Relay Architecture
[Diagram: nodes N1 to N9 arranged in Cluster 1, Cluster 2, and Cluster 3, with clusters collected into Group A and Group B. Hierarchy: Node, Cluster, Group]
• Data is partitioned across clusters
• Data is replicated within clusters
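A minimal sketch of what "partitioned across clusters, replicated within clusters" can mean for routing. The three-by-three node layout, the modulo hashing, and the `RoutingSketch` names are assumptions for illustration; the slides do not say how Data Relay assigns keys to clusters.

```csharp
using System;
using System.Collections.Generic;

public static class RoutingSketch
{
    // Assumed layout: three clusters of three nodes each.
    static readonly string[][] Clusters =
    {
        new[] { "N1", "N2", "N3" },
        new[] { "N4", "N5", "N6" },
        new[] { "N7", "N8", "N9" },
    };

    // Partitioning: each key maps to exactly one cluster.
    public static int ClusterFor(int key)
    {
        return (int)((uint)key % (uint)Clusters.Length);
    }

    // Replication: a write goes to every node in the owning cluster.
    public static string[] WriteTargets(int key)
    {
        return (string[])Clusters[ClusterFor(key)].Clone();
    }

    // A read can be served by any single replica; the caller supplies the pick.
    public static string ReadTarget(int key, int pick)
    {
        string[] nodes = Clusters[ClusterFor(key)];
        return nodes[(int)((uint)pick % (uint)nodes.Length)];
    }
}
```

Under these assumptions, losing one node leaves the other replicas in its cluster able to answer reads for the same keys.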
Stream Architecture
[Diagram: the Activities Index on nodes N1 to N9 in Cluster 1, Cluster 2, and Cluster 3]
Activity Stream Update
[Diagram: nodes N1 to N9 in Cluster 1, Cluster 2, and Cluster 3. Labels: New Activity Msg, Node 2 Proxy (Destination Node). Legend: Round Trip Msgs, Fire & Forget Msg]
CCR Perspective
[Diagram labels: New Activity Msg, N1, Port1, Port2, Arbiters, DispatcherQueue, Thread Pool]
Activity Stream Request
[Diagram: nodes N1 to N9 in Cluster 1, Cluster 2, and Cluster 3. Labels: Client, Activity Stream Request, Distributed Query - FriendList, SubQuery FriendList1, SubQuery FriendList2, SubQuery FriendList3, Node 1 Proxy (Destination Node). Legend: Round Trip Msgs, Fire & Forget Msg]
CCR Perspective
[Diagram labels: Client, Activity Stream Query, Port1, Port2, Arbiters, DispatcherQueue, Thread Pool]
Activity Stream Request
[Diagram: nodes N1 to N9 in Cluster 1, Cluster 2, and Cluster 3. Labels: Query Result, Sub-Query Result1, Sub-Query Result2, Sub-Query Result3]
Activity Stream Request
[Diagram: nodes N1 to N9 in Cluster 1, Cluster 2, and Cluster 3. Labels: Activity Stream Response, Query Result, Sub-Query Result1, Sub-Query Result2, Sub-Query Result3]
[Diagram labels: Activity Stream Request, Activity Stream Response, Query Result, Activity Index Cache, Activities Data Cache]
More Graphs
[Chart: Stream Requests and Index Gets, in Requests/Sec]
Index Cache
• De Facto Distributed Querying Platform
  – Sort, Merge, Filter
• Ubiquitous when Key/Value Store is not enough
  – Activity Stream
  – Videos
  – Music
  – MySpace Developer Platform
Robots Processing Your Every Move!
• CCR constructs in every NodeProxy
  – Ports
  – Arbiters
  – Dispatcher Queues
  – Dispatchers (Shared)
• Message Batching
  – Arbiter.Choice
    • Arbiter.MultipleItemReceive
    • Arbiter.Receive from the TimeoutPort
• Threadpool Flexibility
  – Number of pools
  – Flexibility to set & change pool size dynamically*
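The batching bullet describes a choice between two triggers: enough messages have arrived, or a timeout has fired. The sketch below shows that rule in plain C# without CCR. `BatcherSketch` and its members are hypothetical names, the clock is passed in by the caller so the behavior is deterministic, and the class is not thread-safe. In CCR, the port and arbiter would provide the queuing and coordination.

```csharp
using System;
using System.Collections.Generic;

// Flushes a batch when either `batchSize` messages have arrived or `maxWait`
// has passed since the first pending message, whichever happens first.
public sealed class BatcherSketch<T>
{
    readonly int batchSize;
    readonly TimeSpan maxWait;
    readonly Action<List<T>> flush;
    List<T> pending = new List<T>();
    DateTime firstArrival;

    public BatcherSketch(int batchSize, TimeSpan maxWait, Action<List<T>> flush)
    {
        this.batchSize = batchSize;
        this.maxWait = maxWait;
        this.flush = flush;
    }

    // Count trigger: comparable to receiving N items in one go.
    public void Add(T message, DateTime now)
    {
        if (pending.Count == 0) firstArrival = now;
        pending.Add(message);
        if (pending.Count >= batchSize) Flush();
    }

    // Timeout trigger: call periodically with the current time.
    public void Tick(DateTime now)
    {
        if (pending.Count > 0 && now - firstArrival >= maxWait) Flush();
    }

    void Flush()
    {
        List<T> batch = pending;
        pending = new List<T>();
        flush(batch);
    }
}
```

A full batch goes out immediately under load, while a lone message waits no longer than `maxWait` before it is sent.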
Activity Stream
• Activities are everywhere
MySpace
Tony Chow
Development Manager
Google Searchable Stream
Real-Time Stream
• Pushes user activities out to subscribers using the PubSubHubbub standard
• Anyone can subscribe to the Real-Time Stream, free of charge
• Launched in December 2009
• Major subscribers: Google, Groovy, OneRiot
• ~100 million messages delivered per day
What doesn’t work
[Diagram labels: Users, Front End, Groovy, OneRiot, Slow.com]
The Challenges
• Protect the user experience
• Constant stream to healthy subscribers
• Give all subscribers a fair chance at trying
• Prevent unhealthy subscribers from doing damage
Streaming Architecture
[Diagram labels: Users, Front End, Filter, activities, delivery, TransactionManager, <atom>, Subscribers (Groovy, Slow.com, iLike)]
• Queue
• Partition
• Throttle
• Async I/O
Policing the Stream
• So far so good—for occasionally slow subscribers
• But chronically underperforming subscribers call for more drastic measures
[Diagram labels: Groovy, Slow.com, iLike]
• Discard
• Unsubscribe
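One way to picture queue, partition, throttle, discard, and unsubscribe together is a per-subscriber delivery state, sketched below in plain C#. Everything here is an assumption for illustration: the `SubscriberSketch` type, dropping the oldest message when the queue is full, and unsubscribing after a fixed number of consecutive failures. The slides do not specify these policies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-subscriber state: a bounded queue plus a strike counter.
// Giving each subscriber its own queue keeps a slow one from blocking the rest.
public sealed class SubscriberSketch
{
    public readonly string Name;
    public bool Subscribed = true;
    readonly Queue<string> queue = new Queue<string>();
    readonly int maxQueued;
    readonly int maxStrikes;
    int strikes;

    public SubscriberSketch(string name, int maxQueued, int maxStrikes)
    {
        Name = name;
        this.maxQueued = maxQueued;
        this.maxStrikes = maxStrikes;
    }

    public int Queued { get { return queue.Count; } }

    // Queue and discard: a full queue drops its oldest message (assumed policy).
    public void Enqueue(string message)
    {
        if (!Subscribed) return;
        if (queue.Count == maxQueued) queue.Dequeue();
        queue.Enqueue(message);
    }

    // Throttle: deliver at most `budget` messages per round.
    // `send` returns false when the subscriber fails or times out.
    public void DeliverRound(int budget, Func<string, bool> send)
    {
        while (Subscribed && budget-- > 0 && queue.Count > 0)
        {
            if (send(queue.Peek())) { queue.Dequeue(); strikes = 0; }
            else if (++strikes >= maxStrikes) { Subscribed = false; queue.Clear(); }
            else break; // back off until the next round
        }
    }
}
```

A healthy subscriber drains its queue round by round. One that keeps failing first loses old messages and is eventually unsubscribed.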
Policing the Stream
[Diagram labels: Filter, unsubscribe]
Transaction Manager is Everywhere @ MySpace!
• Generic platform for reliable persistence
• Supports SQL, SOAP, REST, and SMTP calls
• MySpace Mail
• Friend Requests
• Status/Mood Update
• And much more!
The Role of CCR
• CCR is integral to DataRelay
• CCR Iterator Pattern for Async I/O
Asynchronous I/O
• Synchronous I/O
  – Needs lots of threads to do lots of I/O
  – Massive context switching
  – Doesn’t scale
• Asynchronous I/O
  – Efficient use of threads
  – Massively scales
  – Hard to program, harder to read
  – Gnarly and unmaintainable code
The CCR Iterator Pattern
• A better way to write async code
  – C# iterators make enumerators easier
  – CCR iterators make async I/O easier
• Makes async code look like sync code
The Difference
void Before() {
    cmd1.BeginExecuteNonQuery(result1 => {
        cmd1.EndExecuteNonQuery(result1);
        cmd2.BeginExecuteNonQuery(result2 => {
            cmd2.EndExecuteNonQuery(result2);
        }, null);
    }, null);
}

IEnumerable<ITask> After() {
    // Keep the IAsyncResult so the End call can complete the operation.
    IAsyncResult ar = cmd1.BeginExecuteNonQuery(result => port.Post(1), null);
    yield return Arbiter.Receive(...);
    cmd1.EndExecuteNonQuery(ar);

    ar = cmd2.BeginExecuteNonQuery(result => port.Post(1), null);
    yield return Arbiter.Receive(...);
    cmd2.EndExecuteNonQuery(ar);
}
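To show why an iterator can make asynchronous code read sequentially, here is a small stand-alone sketch in plain C#. It is not CCR and does not use `ITask` or `Arbiter`: the `AsyncStep` delegate, the `Run` driver, and the queue of pending completions are all hypothetical. The mechanism is only that the driver resumes the iterator after the current step signals completion, so no thread blocks between steps.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorSketch
{
    // Each yielded step starts some work and invokes `done` when it completes.
    public delegate void AsyncStep(Action done);

    // Drives the iterator: resume it only after the current step has completed.
    public static void Run(IEnumerable<AsyncStep> steps)
    {
        IEnumerator<AsyncStep> e = steps.GetEnumerator();
        Action resume = null;
        resume = () =>
        {
            if (e.MoveNext()) e.Current(resume);
            else e.Dispose();
        };
        resume();
    }

    // Reads top to bottom; each yield is a point where the method is suspended.
    // `pendingIo` stands in for I/O that completes some time later.
    public static IEnumerable<AsyncStep> TwoCommands(List<string> log, Queue<Action> pendingIo)
    {
        log.Add("begin 1");
        yield return done => pendingIo.Enqueue(done);
        log.Add("end 1, begin 2");
        yield return done => pendingIo.Enqueue(done);
        log.Add("end 2");
    }
}
```

After `IteratorSketch.Run(IteratorSketch.TwoCommands(log, io))`, the method has run up to its first yield. Each later call to `io.Dequeue()()` plays the role of an I/O completion and advances it one more step.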
The CCR Iterator Pattern
• Improves readability and maintainability
• Far less bug-prone
• Indispensable for asynchronous programming
What Now?
• We didn’t show any code samples…
• Because we are going to share more than samples …
WE ARE OPEN SOURCING!!
Open Source
• http://DataRelay.CodePlex.com
• Lesser GPL License for…
  – Data Relay Base
  – Our C#/Managed C++ Berkeley DB Wrapper and Storage Component
  – Index Cache System
  – Network transport
  – Serialization System
What Now?
• Places in our code with CCR
  – Bucketed batch
    • \Infrastructure\DataRelay\RelayComponent.Forwarding\Node.cs - ActivateBurstReceive(int count)
  – Distributed bulk message handling
    • \Infrastructure\DataRelay\RelayComponent.Forwarding\Forwarder.cs - HandleMessages
  – General Message Handling
    • \Infrastructure\DataRelay\DataRelay.RelayNode\RelayNode.cs
    • \Infrastructure\SocketTransport\Server\SocketServer.cs
Evaluate Us!
Please fill out an evaluation for our presentation!
More evaluations = more better for everyone.
Thank You! Questions?
• Erik Nelson
  – enelson@myspace-inc.com
• Akash Patel
  – apatel@myspace-inc.com
• Tony Chow
  – tchow@myspace-inc.com
• http://DataRelay.CodePlex.com