Triple R – Riak, Redis and RabbitMQ at XING Dr. Stefan Kaes, Sebastian Röbke NoSQL matters Cologne, April 27, 2013

Page 1: Triple R – Riak, Redis and RabbitMQ at XING

Triple R – Riak, Redis and RabbitMQ at XING Dr. Stefan Kaes, Sebastian Röbke

NoSQL matters Cologne, April 27, 2013

Page 2: Triple R – Riak, Redis and RabbitMQ at XING

ActivityStream Intro

Page 3: Triple R – Riak, Redis and RabbitMQ at XING
Page 4: Triple R – Riak, Redis and RabbitMQ at XING
Page 5: Triple R – Riak, Redis and RabbitMQ at XING
Page 6: Triple R – Riak, Redis and RabbitMQ at XING

3 Types of Feeds

Page 7: Triple R – Riak, Redis and RabbitMQ at XING

News Feed

Page 8: Triple R – Riak, Redis and RabbitMQ at XING

Me Feed

Page 9: Triple R – Riak, Redis and RabbitMQ at XING

Company Feed

Page 10: Triple R – Riak, Redis and RabbitMQ at XING

Activity Creation

Page 11: Triple R – Riak, Redis and RabbitMQ at XING
Page 12: Triple R – Riak, Redis and RabbitMQ at XING

ActivityStream

POST /activitystream/activities

Page 13: Triple R – Riak, Redis and RabbitMQ at XING
Page 14: Triple R – Riak, Redis and RabbitMQ at XING

ActivityStream

events.participation.changed

Page 15: Triple R – Riak, Redis and RabbitMQ at XING

Messages sent by each app to ActivityStream:

‣ Events App: events.participation.changed, events.event.created, ...
‣ Groups App: groups.member.joined, groups.article.created, ...
‣ User App: users.contact.created, users.profile.updated, ...
‣ etc. ...
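
The dotted message names above look like RabbitMQ routing keys. The slides do not say which exchange type is used, so the following is only a sketch that assumes a topic exchange and shows how AMQP-style patterns (`*` for exactly one word, `#` for zero or more words) would select such keys. The binding patterns in `BINDINGS` are hypothetical.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more words."""
    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb any number of words, including none
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])

    return match(pattern.split("."), routing_key.split("."))


# Hypothetical bindings for an ActivityStream consumer queue
BINDINGS = ["events.#", "groups.member.*", "users.contact.created"]


def is_consumed(routing_key: str) -> bool:
    return any(topic_matches(b, routing_key) for b in BINDINGS)


print(is_consumed("events.participation.changed"))
print(is_consumed("users.profile.updated"))
```

With these bindings, every `events.` message is consumed, while `users.profile.updated` would be ignored until a matching binding is added.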

Page 16: Triple R – Riak, Redis and RabbitMQ at XING

Old Approach

Page 17: Triple R – Riak, Redis and RabbitMQ at XING

Activity: INSERT INTO `activities` ...

Comment: INSERT INTO `comments` ...

Like: INSERT INTO `likes` ...

several hundred millions

Page 18: Triple R – Riak, Redis and RabbitMQ at XING

Efficient Activity Creation

Page 19: Triple R – Riak, Redis and RabbitMQ at XING

But ... slow reads

Page 20: Triple R – Riak, Redis and RabbitMQ at XING
Page 21: Triple R – Riak, Redis and RabbitMQ at XING

ActivityStream (activities, likes, comments) calls:

‣ User App: GET contacts
‣ Groups App: GET group memberships
‣ Companies App: GET followed companies
‣ Settings: GET privacy settings
‣ etc. ...

?

Page 22: Triple R – Riak, Redis and RabbitMQ at XING

ActivityStream (activities, likes, comments) calls:

‣ User App: GET contacts
‣ Groups App: GET group memberships
‣ Companies App: GET followed companies
‣ Settings: GET privacy settings
‣ etc. ...
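
To make the cost of this read path concrete, here is a minimal sketch of assembling a news feed at read time. It is not the original implementation: the table layout, the `actor` naming scheme, and the four stub functions (standing in for the remote GET calls) are all assumptions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE activities (id INTEGER PRIMARY KEY, actor TEXT, verb TEXT)")
# Write path: one cheap INSERT per activity
db.executemany("INSERT INTO activities (actor, verb) VALUES (?, ?)", [
    ("user:2", "updated profile"),
    ("group:7", "new article"),
    ("user:9", "attends event"),
    ("company:5", "posted update"),
    ("user:3", "added contact"),
])

# Hypothetical stubs standing in for the remote GET calls to the other apps
def get_contacts(user_id): return ["user:2", "user:3"]
def get_group_memberships(user_id): return ["group:7"]
def get_followed_companies(user_id): return ["company:5"]
def get_privacy_settings(actors): return {"user:3": "hidden"}

def news_feed(user_id, limit=10):
    # Read path: several remote lookups before the query can even be built
    sources = (get_contacts(user_id)
               + get_group_memberships(user_id)
               + get_followed_companies(user_id))
    hidden = {a for a, s in get_privacy_settings(sources).items() if s == "hidden"}
    visible = [s for s in sources if s not in hidden]
    placeholders = ",".join("?" * len(visible))
    rows = db.execute(
        f"SELECT id, actor, verb FROM activities "
        f"WHERE actor IN ({placeholders}) ORDER BY id DESC LIMIT ?",
        (*visible, limit),
    )
    return rows.fetchall()

print(news_feed(1))
```

Every feed view pays for all lookups plus a query whose `IN` list grows with the number of contacts, groups and companies, which is the kind of read cost the following slides refer to.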

Page 23: Triple R – Riak, Redis and RabbitMQ at XING

Activities immediately visible

Page 24: Triple R – Riak, Redis and RabbitMQ at XING

Consistency

Page 25: Triple R – Riak, Redis and RabbitMQ at XING

SQL databases are well understood

Page 26: Triple R – Riak, Redis and RabbitMQ at XING

Database Master is a Single Point of Failure

Page 27: Triple R – Riak, Redis and RabbitMQ at XING

Sharding

Page 28: Triple R – Riak, Redis and RabbitMQ at XING

Unsatisfactory Read Performance

Page 29: Triple R – Riak, Redis and RabbitMQ at XING

New Approach

Page 30: Triple R – Riak, Redis and RabbitMQ at XING

Materialized Feeds

Page 31: Triple R – Riak, Redis and RabbitMQ at XING

ActivityStream

‣ Activity Storage: activities, likes, comments
‣ Feeds: news feeds, me feeds, company feeds
‣ User App: GET contacts
‣ etc. ...

create

?
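
One common way to materialize feeds is to fan out on write: when an activity is created, its id is pushed into the stored feed of every recipient, so a read becomes a plain lookup. The sketch below illustrates that general idea only. The in-memory dictionaries, the `get_contacts` stub, and the `FEED_LIMIT` bound are assumptions, and the slides do not describe the actual write path in this detail.

```python
from collections import defaultdict, deque

FEED_LIMIT = 3  # hypothetical bound, kept small for the example

activities = {}  # stand-in for the activity storage
news_feeds = defaultdict(lambda: deque(maxlen=FEED_LIMIT))
me_feeds = defaultdict(lambda: deque(maxlen=FEED_LIMIT))


def get_contacts(user_id):
    # Hypothetical stub for the "GET contacts" call to the User App
    return {"alice": ["bob", "carol"], "bob": ["alice"]}.get(user_id, [])


def create_activity(activity_id, actor, verb):
    activities[activity_id] = {"actor": actor, "verb": verb}
    # The author sees the activity in the me feed right away
    me_feeds[actor].appendleft(activity_id)
    # Fan out: push the reference into each contact's news feed
    for contact in get_contacts(actor):
        news_feeds[contact].appendleft(activity_id)


def read_news_feed(user_id):
    # Read path: no remote lookups, just resolve the stored references
    return [activities[a] for a in news_feeds[user_id]]


create_activity(1, "alice", "updated profile")
create_activity(2, "bob", "joined group")
print(read_news_feed("alice"))
```

The trade-off is the reverse of the old approach: writes become more expensive (one update per recipient), reads become cheap.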

Page 32: Triple R – Riak, Redis and RabbitMQ at XING

Requirements

Page 33: Triple R – Riak, Redis and RabbitMQ at XING

Better Read Performance

Page 34: Triple R – Riak, Redis and RabbitMQ at XING

Activities created by me must be visible to myself immediately

Page 35: Triple R – Riak, Redis and RabbitMQ at XING

Activities created by others should appear within a reasonable time frame in my stream

Page 36: Triple R – Riak, Redis and RabbitMQ at XING

Storage layer must tolerate high read and write loads

Page 37: Triple R – Riak, Redis and RabbitMQ at XING

Storage layer must provide easy capacity scaling

Page 38: Triple R – Riak, Redis and RabbitMQ at XING

Low maintenance

Page 39: Triple R – Riak, Redis and RabbitMQ at XING

Option 1:

Do-it-yourself SQL database design

Page 40: Triple R – Riak, Redis and RabbitMQ at XING

Option 2:

Off the shelf NoSQL database

Page 41: Triple R – Riak, Redis and RabbitMQ at XING

We chose Riak

Page 42: Triple R – Riak, Redis and RabbitMQ at XING

We tend to view it as a highly available distributed hash table

Page 43: Triple R – Riak, Redis and RabbitMQ at XING

Eventual consistency/conflict resolution is the hard part

Page 44: Triple R – Riak, Redis and RabbitMQ at XING

Bounded size feeds are easy

http://www.paperplanes.de/2011/12/15/storing-timelines-in-riak.html

Page 45: Triple R – Riak, Redis and RabbitMQ at XING

Unbounded feeds are much harder

Page 46: Triple R – Riak, Redis and RabbitMQ at XING

Object Model

Page 47: Triple R – Riak, Redis and RabbitMQ at XING

JSON Documents

Page 48: Triple R – Riak, Redis and RabbitMQ at XING

Activities

Page 49: Triple R – Riak, Redis and RabbitMQ at XING

Activities

Page 50: Triple R – Riak, Redis and RabbitMQ at XING

Activities

Page 51: Triple R – Riak, Redis and RabbitMQ at XING

Activities

2P-Set
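
A 2P-Set (two-phase set) is a conflict-free replicated data type that keeps one set of added elements and one set of removed elements (tombstones). Two diverged copies are merged by taking the union of both sets, which makes the merge independent of order. The following is a minimal sketch of the data type itself. The slides do not show which parts of the activity document use it, so the `like:` element names are purely illustrative.

```python
class TwoPhaseSet:
    """Two-phase set: an element can be added, then removed, but never re-added."""

    def __init__(self, added=(), removed=()):
        self.added = set(added)
        self.removed = set(removed)

    def add(self, item):
        self.added.add(item)

    def remove(self, item):
        # Only elements that were observed as added can be removed
        if item in self.added:
            self.removed.add(item)

    def __contains__(self, item):
        return item in self.added and item not in self.removed

    def values(self):
        return self.added - self.removed

    def merge(self, other):
        # Union of both component sets: commutative, associative, idempotent
        return TwoPhaseSet(self.added | other.added, self.removed | other.removed)


# Two replicas (e.g. conflicting siblings of one document) diverge ...
a = TwoPhaseSet()
a.add("like:1")
b = TwoPhaseSet(a.added, a.removed)
a.remove("like:1")
b.add("like:2")

# ... and converge again regardless of merge order
print(a.merge(b).values() == b.merge(a).values())
```

The price for this simple merge is that tombstones are kept forever and a removed element cannot come back.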

Page 52: Triple R – Riak, Redis and RabbitMQ at XING

JSON Documents

Page 53: Triple R – Riak, Redis and RabbitMQ at XING

Feeds

Page 54: Triple R – Riak, Redis and RabbitMQ at XING

Feeds

bounded list of chunk references

Page 55: Triple R – Riak, Redis and RabbitMQ at XING

Feeds

chunk sequence number

Page 56: Triple R – Riak, Redis and RabbitMQ at XING

Feeds

youngest activity ref

Page 57: Triple R – Riak, Redis and RabbitMQ at XING

Feeds

oldest activity ref

Page 58: Triple R – Riak, Redis and RabbitMQ at XING

Feeds

size of referenced chunk
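
The fields listed on the preceding slides (chunk sequence number, youngest and oldest activity ref, size of the referenced chunk) suggest a small index document that points at separately stored chunks. The sketch below is one possible reading of that layout, not the original code. The capacity constants, the use of integer activity ids, and the decision to drop the oldest chunk once the reference list is full are all assumptions.

```python
from dataclasses import dataclass, field

CHUNK_CAPACITY = 3  # hypothetical maximum number of activities per chunk
MAX_CHUNK_REFS = 2  # hypothetical bound on the list of chunk references


@dataclass
class ChunkRef:
    seq: int       # chunk sequence number
    youngest: int  # youngest activity ref in the chunk
    oldest: int    # oldest activity ref in the chunk
    size: int      # size of the referenced chunk


@dataclass
class Feed:
    chunk_refs: list = field(default_factory=list)  # newest chunk first


chunks = {}  # seq -> activity ids, stand-in for separately stored FeedChunk documents


def append(feed, activity_id):
    head = feed.chunk_refs[0] if feed.chunk_refs else None
    if head is None or head.size >= CHUNK_CAPACITY:
        # Start a new chunk; assumption: the oldest chunk is dropped when the list is full
        seq = head.seq + 1 if head else 0
        head = ChunkRef(seq, activity_id, activity_id, 0)
        feed.chunk_refs.insert(0, head)
        chunks[seq] = []
        for dropped in feed.chunk_refs[MAX_CHUNK_REFS:]:
            chunks.pop(dropped.seq, None)
        del feed.chunk_refs[MAX_CHUNK_REFS:]
    chunks[head.seq].insert(0, activity_id)
    head.youngest = activity_id
    head.size += 1


def read(feed, limit):
    # Load only as many chunks as needed to fill one page
    out = []
    for ref in feed.chunk_refs:
        out.extend(chunks[ref.seq])
        if len(out) >= limit:
            break
    return out[:limit]


feed = Feed()
for activity_id in range(1, 8):
    append(feed, activity_id)
print(read(feed, 3))
```

With this layout, appending an activity rewrites only the small index and the newest chunk instead of one large feed object.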

Page 59: Triple R – Riak, Redis and RabbitMQ at XING

JSON Documents

Page 60: Triple R – Riak, Redis and RabbitMQ at XING

FeedChunk

Page 61: Triple R – Riak, Redis and RabbitMQ at XING

FeedChunk

2P-Set

Page 62: Triple R – Riak, Redis and RabbitMQ at XING

The Migration

Page 63: Triple R – Riak, Redis and RabbitMQ at XING

Incremental rollout

Page 64: Triple R – Riak, Redis and RabbitMQ at XING

Part 1:

From old to new

Page 65: Triple R – Riak, Redis and RabbitMQ at XING

Let’s start simple!

Page 66: Triple R – Riak, Redis and RabbitMQ at XING

Replicating some data

Page 67: Triple R – Riak, Redis and RabbitMQ at XING

Old ActivityStream

New ActivityStream

activity.deleted

activity.created

comment.deleted

comment.created

like.deleted

like.created

data migration processors
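
A data migration processor of this kind can be pictured as a small consumer that applies each message from the old system to the new store. The following is a minimal sketch under assumptions: the payload shape (a dict with an `id`), the in-memory `new_store`, and the handler logic are hypothetical, and no real message broker is involved.

```python
# Stand-in for the new system's storage
new_store = {"activities": {}, "comments": {}, "likes": {}}

# Maps the entity part of the message name to its bucket
BUCKETS = {"activity": "activities", "comment": "comments", "like": "likes"}


def process(routing_key, payload):
    """Apply one message from the old system to the new store."""
    entity, event = routing_key.split(".")
    bucket = new_store[BUCKETS[entity]]
    if event == "created":
        # Writing the same payload twice leaves the same state
        bucket[payload["id"]] = payload
    elif event == "deleted":
        # Deleting a missing key is a no-op
        bucket.pop(payload["id"], None)
    else:
        raise ValueError(f"unexpected message: {routing_key}")


process("activity.created", {"id": 1, "verb": "updated profile"})
process("activity.created", {"id": 1, "verb": "updated profile"})  # redelivery
process("like.created", {"id": "1:42"})
process("like.deleted", {"id": "1:42"})
print(new_store)
```

Keeping both handlers idempotent makes redelivered messages harmless. Out-of-order delivery (a delete arriving before its create) would need extra handling that this sketch leaves out.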

Page 68: Triple R – Riak, Redis and RabbitMQ at XING

Measuring performance

Page 69: Triple R – Riak, Redis and RabbitMQ at XING

Old ActivityStream

New ActivityStream

mefeed.viewed

newsfeed.viewed

companyfeed.viewed

shadow query processors

data migration processors
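
The idea behind shadow queries is to replay real read traffic against the new system without showing its results to users, so that its latency can be measured under production load. A minimal sketch, assuming a hypothetical payload with a `user_id` and a stub `new_system_feed` function in place of the real read:

```python
import time

timings = []  # (feed type, seconds, success) per shadow query


def new_system_feed(feed_type, user_id):
    # Hypothetical stand-in for a read against the new system
    return [f"{feed_type}:{user_id}:{i}" for i in range(3)]


def shadow_query(routing_key, payload, fetch=new_system_feed, clock=time.perf_counter):
    """Handle a '<feed>.viewed' message: run the same read on the new system and time it."""
    feed_type = routing_key.split(".")[0]  # "newsfeed.viewed" -> "newsfeed"
    started = clock()
    try:
        fetch(feed_type, payload["user_id"])  # result is discarded on purpose
        ok = True
    except Exception:
        ok = False  # failures are recorded but never reach the user
    timings.append((feed_type, clock() - started, ok))


shadow_query("newsfeed.viewed", {"user_id": 7})
print(timings[-1][0], timings[-1][2])
```

Because the user's request is still served by the old system, a slow or failing shadow query only shows up in the measurements.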

Page 70: Triple R – Riak, Redis and RabbitMQ at XING
Page 71: Triple R – Riak, Redis and RabbitMQ at XING

Part 2:

From new to old

Page 72: Triple R – Riak, Redis and RabbitMQ at XING

Old ActivityStream

New ActivityStream

DELETE /activitystream/activities/{id}

POST /activitystream/activities

DELETE /activitystream/activities/{activity_id}/comments/{id}

POST /activitystream/activities/{activity_id}/comments

DELETE /activitystream/activities/{activity_id}/likes/{user_id}

PUT /activitystream/activities/{activity_id}/likes/{user_id}

Beta User B

Beta User A

Beta User C

Page 73: Triple R – Riak, Redis and RabbitMQ at XING

Old ActivityStream

New ActivityStream

DELETE /activitystream/activities/{id}

POST /activitystream/activities

DELETE /activitystream/activities/{activity_id}/comments/{id}

POST /activitystream/activities/{activity_id}/comments

DELETE /activitystream/activities/{activity_id}/likes/{user_id}

PUT /activitystream/activities/{activity_id}/likes/{user_id}

activity idBeta User B

Beta User A

Beta User C

Page 74: Triple R – Riak, Redis and RabbitMQ at XING

Part 3:

What about the old data?

Page 75: Triple R – Riak, Redis and RabbitMQ at XING

Bulk Data Migration:

Failed Version 1

Page 76: Triple R – Riak, Redis and RabbitMQ at XING

Bulk Data Migration:

Failed Version 1

1. Reset data in the new system
2. Query the old system REST API for the feeds
3. Store them in the new system
4. Switch to the new system

Page 77: Triple R – Riak, Redis and RabbitMQ at XING

This was naive

Page 78: Triple R – Riak, Redis and RabbitMQ at XING

The old system was way too slow to return the millions of feeds in their full length

Page 79: Triple R – Riak, Redis and RabbitMQ at XING

Bulk Data Migration:

Failed Version 2

Page 80: Triple R – Riak, Redis and RabbitMQ at XING

Bulk Data Migration:

Failed Version 2

1. Reset data in the new system
2. Read all activities based on a dump of the old system
3. Publish “created” messages to RabbitMQ for each activity/comment/like
4. Let the new system build its data structures
5. Switch to the new system

Page 81: Triple R – Riak, Redis and RabbitMQ at XING

This was naive

Page 82: Triple R – Riak, Redis and RabbitMQ at XING

You can’t replay the history of 2.5 years this way

Page 83: Triple R – Riak, Redis and RabbitMQ at XING

Bulk Data Migration:

Successful Version

Page 84: Triple R – Riak, Redis and RabbitMQ at XING

Bulk Data Migration:

Successful Version

1. Reset data in the new system
2. Obtain data dump from old system
3. Extract data from the dumps and compute a representation of the feeds in Redis with a massive amount of batch processors
4. Use this data to build up the structures in the new system
5. Switch to the new system
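
Steps 3 and 4 above can be pictured as follows: instead of replaying every activity as an individual message, the dump is processed in batches into an intermediate per-owner structure, and each feed is then written exactly once. This is only a sketch of that idea. A plain dictionary stands in for Redis, and the dump row format and the `followers_of` lookup are assumptions.

```python
from collections import defaultdict
from itertools import islice


def batches(rows, size):
    """Yield lists of at most `size` rows."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch


def build_feeds(dump_rows, followers_of, batch_size=2):
    # Stand-in for Redis sorted sets: owner -> {activity_id: created_at}
    staged = defaultdict(dict)
    for batch in batches(dump_rows, batch_size):
        for activity_id, actor, created_at in batch:
            for owner in followers_of(actor):
                staged[owner][activity_id] = created_at
    # Final pass: each feed is emitted once, newest activity first
    return {
        owner: sorted(members, key=members.get, reverse=True)
        for owner, members in staged.items()
    }


# Hypothetical dump rows: (activity id, actor, creation timestamp)
dump = [(1, "a", 10), (2, "b", 30), (3, "a", 20)]
followers = {"a": ["x", "y"], "b": ["x"]}
print(build_feeds(dump, lambda actor: followers.get(actor, [])))
```

In a real run, the batches would be spread over many worker processes, which is where a shared intermediate store becomes useful.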

Page 85: Triple R – Riak, Redis and RabbitMQ at XING

It worked!

Page 86: Triple R – Riak, Redis and RabbitMQ at XING

But...

Page 87: Triple R – Riak, Redis and RabbitMQ at XING

A lot of additional code

Page 88: Triple R – Riak, Redis and RabbitMQ at XING

Fragile, manual steps involved

Page 89: Triple R – Riak, Redis and RabbitMQ at XING

A lot of technology:

RabbitMQ, Riak, Redis, Varnish

Page 90: Triple R – Riak, Redis and RabbitMQ at XING

One run took 5 days

Page 91: Triple R – Riak, Redis and RabbitMQ at XING

But it worked!

Page 92: Triple R – Riak, Redis and RabbitMQ at XING

Current Status

Page 93: Triple R – Riak, Redis and RabbitMQ at XING

New system has been live for all users since 12/12/12

Page 94: Triple R – Riak, Redis and RabbitMQ at XING

Old and new system were kept in sync till April 2013

Page 95: Triple R – Riak, Redis and RabbitMQ at XING

In case of serious trouble, we could have switched to the old system within seconds

Page 96: Triple R – Riak, Redis and RabbitMQ at XING

Performance goals have been met

Page 97: Triple R – Riak, Redis and RabbitMQ at XING

Performance goals have been met

                            Old system   New system
happy        t < 0.1s          0.17%       62.01%
satisfied    t < 0.5s         41.36%       99.71%
tolerating   0.5s ≤ t < 2s    58.20%        0.28%
frustrated   t ≥ 2s            0.44%        0.00%
Apdex Score                     0.70         1.00
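
The Apdex scores can be reproduced from the percentages in the table with the standard formula: (satisfied + tolerating / 2) / total. The sketch below assumes that the "happy" row is a subset of "satisfied", since the satisfied, tolerating and frustrated rows alone already add up to roughly 100%.

```python
def apdex(satisfied_pct, tolerating_pct, frustrated_pct):
    """Standard Apdex: satisfied count fully, tolerating count half, frustrated count zero."""
    total = satisfied_pct + tolerating_pct + frustrated_pct
    return (satisfied_pct + tolerating_pct / 2) / total


old = apdex(41.36, 58.20, 0.44)
new = apdex(99.71, 0.28, 0.00)
print(f"{old:.2f} {new:.2f}")  # prints "0.70 1.00"
```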

Page 98: Triple R – Riak, Redis and RabbitMQ at XING

Production setup

‣ 10 Riak Servers as Primary Cluster

‣ 10 Riak Servers as Backup Cluster (Multi-DC Replication)

‣ SSDs, Raid 0 and a proper Linux file I/O scheduler setting (noop)

‣ Bitcask storage backend

‣ 4 REST API Servers

‣ 4 Background Worker Servers

‣ Monitoring using Ganglia and Logjam (App Performance)

Page 99: Triple R – Riak, Redis and RabbitMQ at XING

Lessons learned

Page 100: Triple R – Riak, Redis and RabbitMQ at XING

‣ Eventual consistency sounds easy, but is hard to implement correctly in practice

‣ There’s a steep learning curve at the beginning

‣ High update rates and large objects don’t go together well, if your storage system offers just get, put and delete operations

‣ Achieving high performance requires careful thought about data structures, algorithms and access patterns

‣ Building a new system from scratch is a lot easier than migrating an existing system

Page 101: Triple R – Riak, Redis and RabbitMQ at XING

‣ Protobuffs API faster than HTTP

‣ Use the best performing JSON library you can find (Ruby: Oj gem)

‣ Avoid a full-blown ORM for Riak if you care about performance (Ruby: Ripple gem)

Page 102: Triple R – Riak, Redis and RabbitMQ at XING

‣ At one point we saturated the Gigabit network cards on the Riak cluster

‣ This led to compressing all data before storing it on the cluster and breaking news feeds into chunks
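
Compressing a JSON document before it is stored can be sketched in a few lines. The slides do not name the compression algorithm, so `zlib` is an assumption here, and the example document is made up.

```python
import json
import zlib


def pack(doc):
    """Serialize to compact JSON and compress before storing."""
    raw = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw)


def unpack(blob):
    """Reverse of pack(): decompress and parse."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Made-up feed chunk with repetitive content, as JSON documents tend to have
doc = {"activities": [{"id": i, "type": "users.contact.created"} for i in range(200)]}
blob = pack(doc)
print(len(json.dumps(doc)), len(blob))
```

Repeated keys and values compress well, so fewer bytes cross the network on every get and put.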

Page 103: Triple R – Riak, Redis and RabbitMQ at XING

The professional network www.xing.com

Thank you for your attention!

Dr. Stefan Kaes

Twitter: @stkaes

Sebastian Röbke

Twitter: @boosty

We’re hiring: [email protected]