76
Redis to the Rescue? O’Reilly MySQL Conference 2011-04-13

Redis to the Rescue?

  • Upload
    wooga

  • View
    19.802

  • Download
    13

Embed Size (px)

DESCRIPTION

Based in Berlin, wooga is the leading European social games developer. Social games offer interesting scaling challenges—when they become popular, the user base can grow quickly, often by 50.000 people per day or more. This case study recounts how we successfully scaled up two Facebook games to one million daily active users each, why we decided to replace MySQL (once partially, once fully) with Redis, what difficulties we encountered on the way and how we solved them. As an in-memory database, Redis offers an order-of-magnitude reduction in query roundtrip latency, but also introduces new challenges: How can you guarantee durability of data in the case of server outages? How do you best structure your data when there are no ad-hoc query capabilities? This talk will go into technical details of our backend architecture and discuss both its advantages and disadvantages, how it stacks up against other possible setups, and what lessons we have learned.

Citation preview

Page 1: Redis to the Rescue?

Redis to the Rescue?

O’Reilly MySQL Conference2011-04-13

Page 2: Redis to the Rescue?

Who?

• Tim Lossen / @tlossen

• Berlin, Germany

• backend developer at wooga

Page 3: Redis to the Rescue?
Page 4: Redis to the Rescue?

Redis IntroCase 1: Monster WorldCase 2: Happy HospitalDiscussion

Page 5: Redis to the Rescue?

Redis IntroCase 1: Monster WorldCase 2: Happy HospitalDiscussion

Page 6: Redis to the Rescue?

What?

• key-value-store

• in-memory database

• “data structure server”

Page 7: Redis to the Rescue?
Page 8: Redis to the Rescue?

Data Types

• strings (integers)

Page 9: Redis to the Rescue?

Data Types

• strings (integers)

• lists

• hashes

Page 10: Redis to the Rescue?

Data Types

• strings (integers)

• lists

• hashes

• sets

• sorted sets

Page 11: Redis to the Rescue?

flickr.com/photos/atzu/2645776918

Page 12: Redis to the Rescue?

Performance

• same for reads / writes

Page 13: Redis to the Rescue?

Performance

• same for reads / writes

• 50 K ops/second

- regular notebook, EC2 instance

Page 14: Redis to the Rescue?

Performance

• same for reads / writes

• 50 K ops/second

- regular notebook, EC2 instance

• 200 K ops/second

- intel core i7 X980 (3.33 GHz)

Page 15: Redis to the Rescue?

Durability

• snapshots

• append-only log

Page 16: Redis to the Rescue?

Other Features

• master-slave replication

• virtual memory

• ...

Page 17: Redis to the Rescue?

Redis IntroCase 1: Monster WorldCase 2: Happy HospitalDiscussion

Page 18: Redis to the Rescue?
Page 19: Redis to the Rescue?

June July Aug Sept OctMayApril

Daily Active Users

2010

Page 20: Redis to the Rescue?

June July Aug Sept OctMayApril

Daily Active Users

2010

Page 21: Redis to the Rescue?

Challenge

• traffic growing rapidly

Page 22: Redis to the Rescue?

Challenge

• traffic growing rapidly

• bottleneck: write throughput

- EBS volumes pretty slow

Page 23: Redis to the Rescue?

Challenge

• traffic growing rapidly

• bottleneck: write throughput

- EBS volumes pretty slow

• MySQL already sharded

- 4 x 2 = 8 shards

Page 24: Redis to the Rescue?

Idea

• move write-itensive data to Redis

Page 25: Redis to the Rescue?

Idea

• move write-itensive data to Redis

• first candidate: inventory

- integer fields

- frequently changing

Page 26: Redis to the Rescue?
Page 27: Redis to the Rescue?

Solution

• inventory = Redis hash

- atomic increment / decrement !

Page 28: Redis to the Rescue?

Solution

• inventory = Redis hash

- atomic increment / decrement !

• on-demand migration of users

- with batch roll-up

Page 29: Redis to the Rescue?

Results

• quick win

- implemented in 2 weeks

- 10% less load on MySQL servers

Page 30: Redis to the Rescue?

Results

• quick win

- implemented in 2 weeks

- 10% less load on MySQL servers

• decision: move over more data

Page 31: Redis to the Rescue?

But ...

• “honeymoon soon over”

Page 32: Redis to the Rescue?

But ...

• “honeymoon soon over”

• growing memory usage (fragmentation)

- servers need periodic “refresh”

- replication dance

Page 33: Redis to the Rescue?

Current Status

• hybrid setup

- 4 MySQL master-slave pairs

- 4 Redis master-slave pairs

Page 34: Redis to the Rescue?

Current Status

• hybrid setup

- 4 MySQL master-slave pairs

- 4 Redis master-slave pairs

• evaluating other alternatives

- Riak, Membase

Page 35: Redis to the Rescue?

Redis IntroCase 1: Monster WorldCase 2: Happy HospitalDiscussion

Page 36: Redis to the Rescue?
Page 37: Redis to the Rescue?

Challenge

• expected peak load:

- 16000 concurrent users

- 4000 requests/second

- mostly writes

Page 38: Redis to the Rescue?

“Memory is the new Disk, Disk is the new Tape.”

⎯ Jim Gray

Page 39: Redis to the Rescue?

Idea

• use Redis as main database

- excellent (write) performance

- virtual memory for capacity

Page 40: Redis to the Rescue?

Idea

• use Redis as main database

- excellent (write) performance

- virtual memory for capacity

• no sharding = simple operations

Page 41: Redis to the Rescue?

Data Model

• user = single Redis hash

- each entity stored in hash field (serialized to JSON)

Page 42: Redis to the Rescue?

Data Model

• user = single Redis hash

- each entity stored in hash field (serialized to JSON)

• custom Ruby mapping layer (“Remodel”)

Page 43: Redis to the Rescue?
Page 44: Redis to the Rescue?

1220032045 u1 {“level”: 4,“xp”: 241}

u1_pets [“p7”, “p8”]

p7 {“pet_type”: “Cat”}

p8 {“pet_type”: “Dog”}

1234599660 u1 {“level”: 1,“xp”: 22}

u1_pets [“p3”]

... ...

Page 45: Redis to the Rescue?

1220032045 u1 {“level”: 4,“xp”: 241}

u1_pets [“p7”, “p8”]

p7 {“pet_type”: “Cat”}

p8 {“pet_type”: “Dog”}

1234599660 u1 {“level”: 1,“xp”: 22}

u1_pets [“p3”]

... ...

Page 46: Redis to the Rescue?

1220032045 u1 {“level”: 4,“xp”: 241}

u1_pets [“p7”, “p8”]

p7 {“pet_type”: “Cat”}

p8 {“pet_type”: “Dog”}

1234599660 u1 {“level”: 1,“xp”: 22}

u1_pets [“p3”]

... ...

Page 47: Redis to the Rescue?

1220032045 u1 {“level”: 4,“xp”: 241}

u1_pets [“p7”, “p8”]

p7 {“pet_type”: “Cat”}

p8 {“pet_type”: “Dog”}

1234599660 u1 {“level”: 1,“xp”: 22}

u1_pets [“p3”]

... ...

Page 48: Redis to the Rescue?

1220032045 u1 {“level”: 4,“xp”: 241}

u1_pets [“p7”, “p8”]

p7 {“pet_type”: “Cat”}

p8 {“pet_type”: “Dog”}

1234599660 u1 {“level”: 1,“xp”: 22}

u1_pets [“p3”]

... ...

Page 49: Redis to the Rescue?

1220032045 u1 {“level”: 4,“xp”: 241}

u1_pets [“p7”, “p8”]

p7 {“pet_type”: “Cat”}

p8 {“pet_type”: “Dog”}

1234599660 u1 {“level”: 1,“xp”: 22}

u1_pets [“p3”]

... ...

Page 50: Redis to the Rescue?

Virtual Memory

• server: 24 GB RAM, 500 GB disk

- can only keep “hot data” in RAM

Page 51: Redis to the Rescue?

Virtual Memory

• server: 24 GB RAM, 500 GB disk

- can only keep “hot data” in RAM

• 380 GB swap file

- 50 mio. pages, 8 KB each

Page 52: Redis to the Rescue?

Dec 2010: Crisis

• memory usage growing fast

Page 53: Redis to the Rescue?

Dec 2010: Crisis

• memory usage growing fast

• cannot take snapshots any more

- cannot start new slaves

Page 54: Redis to the Rescue?

Dec 2010: Crisis

• memory usage growing fast

• cannot take snapshots any more

- cannot start new slaves

• random crashes

Page 55: Redis to the Rescue?

Analysis

• Redis virtual memory not compatible with:

- persistence

- replication

Page 56: Redis to the Rescue?

Analysis

• Redis virtual memory not compatible with:

- persistence

- replication

• need to implement our own!

Page 57: Redis to the Rescue?

Workaround

• “dumper” process

- tracks active users

- every 10 minutes, writes them into YAML file

Page 58: Redis to the Rescue?

ruby redis disk

Page 59: Redis to the Rescue?

ruby redis disk

Page 60: Redis to the Rescue?

ruby redis disk

Page 61: Redis to the Rescue?

ruby redis disk

Page 62: Redis to the Rescue?

ruby redis disk

Page 63: Redis to the Rescue?

Workaround

• in case of Redis crash

- start with empty database

- restore users on demand from YAML files

Page 64: Redis to the Rescue?

Real Solution

• Redis “diskstore”

- keeps all data on disk

- swaps data into memory as needed

Page 65: Redis to the Rescue?

Real Solution

• Redis “diskstore”

- keeps all data on disk

- swaps data into memory as needed

• under development (expected Q2)

Page 66: Redis to the Rescue?
Page 67: Redis to the Rescue?

Results

• average response time: 10 ms

Page 68: Redis to the Rescue?

Results

• average response time: 10 ms

• peak traffic:

- 1500 requests/second

- 15000 Redis ops/second

Page 69: Redis to the Rescue?

Current Status

• very happy with setup

- simple, robust, fast

- easy to operate

• still lots of spare capacity

Page 70: Redis to the Rescue?

Redis IntroCase 1: Monster WorldCase 2: Happy HospitalDiscussion

Page 71: Redis to the Rescue?

Advantages

• order-of-magnitude performance improvement

- removes main bottleneck

- enables simple architecture

Page 72: Redis to the Rescue?

Disadvantages

• main challenge: durability

- diskstore very promising

Page 73: Redis to the Rescue?

Disadvantages

• main challenge: durability

- diskstore very promising

• no ad-hoc queries

- think hard about data model

- hybrid approach?

Page 74: Redis to the Rescue?

Conclusion

• Ruby + Redis = killer combo

Page 75: Redis to the Rescue?

Q & A