Triple R – Riak, Redis and RabbitMQ at XING Dr. Stefan Kaes, Sebastian Röbke
NoSQL matters Cologne, April 27, 2013
ActivityStream Intro
3 Types of Feeds
News Feed
Me Feed
Company Feed
Activity Creation
ActivityStream
POST /activitystream/activities
ActivityStream
Events App: events.event.created, events.participation.changed, ...
Groups App: groups.member.joined, groups.article.created, ...
User App: users.contact.created, users.profile.updated, ...
etc.
All of these events flow into the ActivityStream.
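The events above are routed by key (e.g. via a RabbitMQ topic exchange). A minimal sketch of AMQP-style topic matching, which decides which domain events become activities; the helper is illustrative, since RabbitMQ's topic exchange does this natively:

```ruby
# AMQP-style topic matching on routing keys like "groups.member.joined".
# "*" matches exactly one word; "#" matches a run of words (simplified
# here to one or more, whereas AMQP's "#" also matches zero words).
def topic_match?(pattern, routing_key)
  regex = pattern.split('.').map do |part|
    case part
    when '#' then '[^.]+(\.[^.]+)*'
    when '*' then '[^.]+'
    else Regexp.escape(part)
    end
  end.join('\.')
  /\A#{regex}\z/.match?(routing_key)
end
```

For example, a binding on `events.#` would receive both `events.event.created` and `events.participation.changed`.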
Old Approach
Activity: INSERT INTO `activities` ...
Comment: INSERT INTO `comments` ...
Like: INSERT INTO `likes` ...
several hundred millions
Efficient Activity Creation
But ... slow reads
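Writes were three cheap INSERTs, but reads were not. A hypothetical sketch of the kind of read query involved (table and column names are illustrative): rendering one news feed meant joining activities against comments and likes across several hundred million rows, after first fetching contact ids from a remote service.

```ruby
# Illustrative read-path query for the old, purely relational design.
# The IN (?) placeholder is filled with contact ids that first have to
# be fetched from the User App over HTTP on every feed render.
FEED_QUERY = <<~SQL
  SELECT a.*,
         COUNT(DISTINCT c.id) AS comment_count,
         COUNT(DISTINCT l.id) AS like_count
  FROM activities a
  LEFT JOIN comments c ON c.activity_id = a.id
  LEFT JOIN likes    l ON l.activity_id = a.id
  WHERE a.actor_id IN (?)
  GROUP BY a.id
  ORDER BY a.created_at DESC
  LIMIT 20
SQL
```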
ActivityStream (activities, likes, comments, etc.) reads from:
‣ User App: GET contacts, GET privacy settings
‣ Groups App: GET group memberships
‣ Companies App: GET followed companies
‣ Settings
?
Activities immediately visible
Consistency
SQL databases are well understood
Database master is a single point of failure
Sharding
Unsatisfactory Read Performance
New Approach
Materialized Feeds
ActivityStream stores activities, likes and comments together with materialized news feeds, me feeds and company feeds in its own Activity Storage. External data (User App: GET contacts, etc.) is consulted when an activity is created.
Requirements
‣ Better read performance
‣ Activities created by me must be visible to myself immediately
‣ Activities created by others should appear in my stream within a reasonable time frame
‣ Storage layer must tolerate high read and write loads
‣ Storage layer must provide easy capacity scaling
‣ Low maintenance
Option 1: Do-it-yourself SQL database design
Option 2: Off-the-shelf NoSQL database
We chose Riak
We tend to view it as a highly available distributed hash table
Eventual consistency/conflict resolution is the hard part
Bounded-size feeds are easy (see http://www.paperplanes.de/2011/12/15/storing-timelines-in-riak.html)
Unbounded feeds are much harder
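The bounded-timeline idea from the linked post can be sketched as follows; the limit, the `[timestamp, id]` entry shape and the function names are illustrative:

```ruby
FEED_LIMIT = 500  # illustrative bound on feed length

# Bounded feed in the style of "storing timelines in Riak": writers
# prepend the newest entry and trim to a fixed size. When Riak hands
# back siblings (concurrent writes), resolution is a union of both
# lists, re-sorted newest-first and trimmed again.
def push(feed, entry)
  ([entry] + feed).first(FEED_LIMIT)
end

def resolve(feed_a, feed_b)
  (feed_a | feed_b).sort_by { |timestamp, _id| -timestamp }.first(FEED_LIMIT)
end
```

The trim is what makes this easy: conflicting replicas converge because both sides apply the same bound. Unbounded feeds get no such luxury, which is why chunking (below) was needed.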
Object Model
JSON Documents:
‣ Activities: 2P-Set
‣ Feeds: bounded list of chunk references, each carrying a chunk sequence number, a youngest activity ref, an oldest activity ref, and the size of the referenced chunk
‣ FeedChunk: 2P-Set
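A 2P-set (two-phase set) like the one used for Activities and FeedChunk documents can be sketched generically; this is a textbook CRDT sketch, not XING's actual implementation:

```ruby
require 'set'

# 2P-set: an add-set plus a remove-set ("tombstones"). An element is
# present iff it was added and never removed; once removed, it can
# never be re-added. Merging two replicas is a pairwise union, which
# makes concurrent writes (Riak siblings) resolvable deterministically.
class TwoPhaseSet
  attr_reader :adds, :removes

  def initialize(adds = Set.new, removes = Set.new)
    @adds    = adds
    @removes = removes
  end

  def add(element)
    @adds.add(element)
  end

  def remove(element)
    # Only elements that were added can be removed.
    @removes.add(element) if @adds.include?(element)
  end

  def member?(element)
    @adds.include?(element) && !@removes.include?(element)
  end

  # Sibling resolution: union both sides; removals win over additions.
  def merge(other)
    TwoPhaseSet.new(@adds | other.adds, @removes | other.removes)
  end
end
```

The "removals win" property is exactly what a likes/comments set wants: a delete observed on either replica survives the merge.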
The Migration
Incremental rollout
Part 1:
From old to new
Let’s start simple!
Replicating some data
Old ActivityStream publishes
activity.created / activity.deleted
comment.created / comment.deleted
like.created / like.deleted
to the data migration processors, which feed the New ActivityStream.
Measuring performance
Old ActivityStream publishes
newsfeed.viewed / mefeed.viewed / companyfeed.viewed
to shadow query processors, which (alongside the data migration processors) exercise the New ActivityStream.
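A shadow query processor can be sketched like this; `new_system`, `fetch_feed` and the event shape are illustrative names, not the actual API:

```ruby
# On every *.viewed event, replay the same read against the new system
# and record how long it took, so the new stack is measured under real
# traffic while the old system still serves all responses.
def shadow_query(event, new_system, timings)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  new_system.fetch_feed(event[:feed_type], event[:user_id])
  timings << Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
rescue StandardError
  # Errors in the shadow path must never affect the user-facing request.
end
```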
Part 2:
From new to old
Old ActivityStream
New ActivityStream
DELETE /activitystream/activities/{id}
POST /activitystream/activities
DELETE /activitystream/activities/{activity_id}/comments/{id}
POST /activitystream/activities/{activity_id}/comments
DELETE /activitystream/activities/{activity_id}/likes/{user_id}
PUT /activitystream/activities/{activity_id}/likes/{user_id}
Beta Users A, B and C write to the new ActivityStream, which replays each call (including the activity id) against the old ActivityStream via the routes above.
Part 3:
What about the old data?
Bulk Data Migration: Failed Version 1
1. Reset data in the new system
2. Query the old system REST API for the feeds
3. Store them in the new system
4. Switch to the new system
This was naive: the old system was way too slow to return the millions of feeds in their full length.
Bulk Data Migration: Failed Version 2
1. Reset data in the new system
2. Read all activities based on a dump of the old system
3. Publish “created” messages to RabbitMQ for each activity/comment/like
4. Let the new system build its data structures
5. Switch to the new system
This was naive: you can’t replay the history of 2.5 years this way.
Bulk Data Migration: Successful Version
1. Reset data in the new system
2. Obtain data dump from old system
3. Extract data from the dumps and compute a representation of the feeds in Redis with a massive amount of batch processors
4. Use this data to build up the structures in the new system
5. Switch to the new system
It worked!
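The core of step 3 can be sketched as a fold over the activity dump; a plain Hash stands in for Redis here, and the `me:`/`news:` key scheme plus `contacts_of` are illustrative. The real batch processors wrote to Redis (e.g. one list per feed key) so many workers could share intermediate state:

```ruby
# Fold a dump of old activities into per-user feed representations:
# each activity lands in its author's me feed and in the news feed of
# every contact of the author.
def build_feed_representation(activity_dump, contacts_of)
  feeds = Hash.new { |h, k| h[k] = [] }
  activity_dump.each do |activity|
    feeds["me:#{activity[:actor]}"] << activity[:id]
    (contacts_of[activity[:actor]] || []).each do |reader|
      feeds["news:#{reader}"] << activity[:id]
    end
  end
  feeds
end
```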
But...
A lot of additional code
Fragile, manual steps involved
A lot of technology:
RabbitMQ, Riak, Redis, Varnish
One run took 5 days
But it worked!
Current Status
New system is live for all users since 12/12/12
Old and new system were kept in sync till April 2013
In case of serious trouble, we could have switched to the old system within seconds
Performance goals have been met
                            Old system   New system
happy        t < 0.1s          0.17%       62.01%
satisfied    t < 0.5s         41.36%       99.71%
tolerating   0.5s ≤ t < 2s    58.20%        0.28%
frustrated   t ≥ 2s            0.44%        0.00%
Apdex Score                     0.70         1.00
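The Apdex score above follows from the same buckets, with T = 0.5s (so tolerating ends at 4T = 2s, matching the table; the sub-0.1s "happy" bucket simply counts as satisfied). A sketch of the calculation:

```ruby
# Apdex = (satisfied + tolerating / 2) / total samples,
# with satisfied: t < T and tolerating: T <= t < 4T.
def apdex(response_times, t = 0.5)
  satisfied  = response_times.count { |x| x < t }
  tolerating = response_times.count { |x| x >= t && x < 4 * t }
  ((satisfied + tolerating / 2.0) / response_times.size).round(2)
end
```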
Production setup
‣ 10 Riak Servers as Primary Cluster
‣ 10 Riak Servers as Backup Cluster (Multi-DC Replication)
‣ SSDs, Raid 0 and a proper Linux file I/O scheduler setting (noop)
‣ Bitcask storage backend
‣ 4 REST API Servers
‣ 4 Background Worker Servers
‣ Monitoring using Ganglia and Logjam (App Performance)
Lessons learned
‣ Eventual consistency sounds easy, but is hard to implement correctly in practice
‣ There’s a steep learning curve at the beginning
‣ High update rates and large objects don’t go together well if your storage system offers just get, put and delete operations
‣ Achieving high performance requires careful thought about data structures, algorithms and access patterns
‣ Building a new system from scratch is a lot easier than migrating an existing system
‣ Protobuffs API is faster than HTTP
‣ Use the best-performing JSON library you can find (Ruby: Oj gem)
‣ Avoid a full-blown ORM for Riak if you care about performance (Ruby: Ripple gem)
‣ At one point we saturated the Gigabit network cards on the Riak cluster
‣ This led to compressing all data before storing it on the cluster and breaking news feeds into chunks
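The compress-and-chunk fix can be sketched with Ruby's standard library; the chunk size is illustrative (the talk does not state the real value), and the actual Riak store/fetch calls are omitted:

```ruby
require 'json'
require 'zlib'

CHUNK_SIZE = 50  # activities per feed chunk; illustrative value

# Split a feed into fixed-size chunks and deflate each chunk's JSON
# before it goes over the wire to the Riak cluster.
def chunk_feed(activity_ids)
  activity_ids.each_slice(CHUNK_SIZE).to_a
end

def pack_chunk(chunk)
  Zlib::Deflate.deflate(JSON.generate(chunk))
end

def unpack_chunk(blob)
  JSON.parse(Zlib::Inflate.inflate(blob))
end
```

Chunking keeps individual Riak objects small despite unbounded feeds, and compression cuts the bytes per request on the saturated network links.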
The professional network www.xing.com
Thank you for your attention!
Dr. Stefan Kaes
Twitter: @stkaes
Sebastian Röbke
Twitter: @boosty
We’re hiring: [email protected]