Upload
mysqlops
View
2.768
Download
0
Embed Size (px)
DESCRIPTION
Couchbase,meetup
Citation preview
1
Steven Mih, Couchbase Inc.
Bin Cui, Couchbase Inc.
USING COUCHBASE FOR SOCIAL GAME SCALING AND SPEED
2
• Introduction • What is Couchbase Server?
– Simple, Fast, Elastic – Technology Overview (Architecture, data flow, rebalancing)
• Tribal Crossing Inc: Animal Party – Challenges before Couchbase
• Original Architecture
– Why Couchbase? • Simplicity • Performance • Flexibility
– Deploying Couchbase • New Architecture • EC2 • Data Model • Accessing data in Couchbase
• Product Roadmap • Q&A
Agenda
3
• Membase and CouchOne have merged to form Couchbase Inc. (headquartered in Silicon Valley)
• Team – Brings together the creators and core contributors of Memcached,
Membase and CouchDB technologies
– Doubles technical team size, accelerates roadmaps by over a year
• Products – Couchbase Server (Formerly Membase)
– Couchbase Single Server
– Mobile Couchbase (iPhone and Android)
• Technology – Most mature, reliable and widely deployed NoSQL technologies
– Fully featured, open source document datastore
– First complete, end-to-end NoSQL database product
Couchbase Inc.
4
Modern Interactive Web Application Architecture
Application Scales Out Just add more commodity web servers
Database Scales Up Get a bigger, more complex server
www.facebook.com/animalparty
Web Servers
Relational Database
Load Balancer
- Expensive and disruptive sharding - Doesn’t perform at Web Scale
5
Couchbase Server is a distributed database
Couchbase Servers
Web application server
Application user
Couchbase Web Console
6
Couchbase data layer scales like application logic tier Data layer now scales with linear cost and constant performance.
Application Scales Out Just add more commodity web servers
Database Scales Out Just add more commodity data servers
Scaling out flattens the cost and performance curves.
Couchbase Servers
www.facebook.com/animalparty
Web Servers
Load Balancer
Horizontally scalable, schema-less, auto-sharding, high-performance at Web Scale
7
Couchbase Demonstration
Couchbase Servers
In the EC2 or Datacenter
Web application server
Application user
• Membase ServerTemplate Demo
– Starting with four database nodes under load
– Dynamically scaling to eight database nodes
– Easy management and monitoring
– Not possible any other database technology
8
Couchbase Server is Simple, Fast, Elastic
• Five minutes or less to a working cluster – Downloads for Windows, Linux and OSX
– Start with a single node
– One button press joins nodes to a cluster
• Easy to develop against – Just SET and GET – no schema required
– Drop it in. 10,000+ existing applications already “speak Couchbase” (via memcached)
– Practically every language and application framework is supported, out of the box
• Easy to manage – One-click failover and cluster rebalancing
– Graphical and programmatic interfaces
– Configurable alerting
9
Couchbase Server is Simple, Fast, Elastic
• Predictable – “Never keep an application waiting”
– Quasi-deterministic latency and throughput
• Low latency – Built-in Memcached technology
– Auto-migration of hot data to lowest latency storage technology (RAM, SSD, Disk)
– Selectable write behavior – asynchronous, synchronous (on replication, persistence)
• High throughput – Multi-threaded
– Low lock contention
– Asynchronous wherever possible
– Automatic write de-duplication
10
Couchbase Server is Simple, Fast, Elastic
• Zero-downtime elasticity – Spread I/O and data across commodity
servers (or VMs)
– Consistent performance with linear cost
– Dynamic rebalancing of a live cluster
• All nodes are created equal – No special case nodes
– Clone to grow
• Extensible – Change feeds
– Real-time map-reduce
– RESTful interface for management
Couchbase Web Console
11
Proven at Small, and Extra Large Scale
• Leading cloud service (PAAS)
provider
• Over 150,000 hosted applications
• Couchbase Server serving over
6,200 Heroku customers
• Social game leader – FarmVille,
Mafia Wars, Empires and Allies,
Café World, FishVille
• Over 230 million monthly users
• Couchbase Server is the primary
database behind key Zynga
properties
13
moxi
11211 11210
memcached protocol listener/sender
Couchbase Storage Engine
engine interface
memcapable 1.0 memcapable 2.0
21100 – 21199 4369 8091
http R
EST
man
agem
ent
AP
I/W
eb U
I
Hea
rtb
eat
Pro
cess
mo
nit
or
Glo
bal
sin
glet
on
su
per
viso
r
Co
nfi
gura
tio
n m
anag
er
on each node
Erlang/OTP
Reb
alan
ce o
rch
estr
ato
r
No
de
hea
lth
mo
nit
or
one per cluster
vBu
cket
sta
te a
nd
rep
licat
ion
man
ager
HTTP distributed erlang erlang port mapper
Data Manager Cluster Manager
Couchbase Server Architecture
14
moxi
11211 11210
memcached protocol listener/sender
engine interface
memcapable 1.0 memcapable 2.0
21100 – 21199 4369 8091
http R
EST
man
agem
ent
AP
I/W
eb U
I
Hea
rtb
eat
Pro
cess
mo
nit
or
Glo
bal
sin
glet
on
su
per
viso
r
Co
nfi
gura
tio
n m
anag
er
on each node
Erlang/OTP
Reb
alan
ce o
rch
estr
ato
r
No
de
hea
lth
mo
nit
or
one per cluster
vBu
cket
sta
te a
nd
rep
licat
ion
man
ager
HTTP distributed erlang erlang port mapper
Couchbase Server Architecture
Couchbase Storage Engine
15
Couchbase “write” Data Flow – application view
User action results in the need to change the VALUE of KEY
Application updates key’s VALUE, performs SET operation
Couchbase client hashes KEY, identifies KEY’s master server SET request sent over
network to master server
Couchbase replicates KEY-VALUE pair, caches it in memory and stores it to disk
1
2
3 4
5
16
Couchbase Data Flow – under the hood
Listener-Sender
DiskDisk Disk
RAM*
mem
bas
e st
ora
ge e
ngi
ne
SSDSSD SSD
Listener-Sender
DiskDisk Disk
RAM*
mem
bas
e st
ora
ge e
ngi
ne
SSDSSD SSD
SET request arrives at KEY’s master server
Listener-Sender
Master server for KEY Replica Server 2 for KEY Replica Server 1 for KEY
2 2
1 SET acknowledgement returned to application
3
Disk Disk Disk
RAM*
Co
uch
bas
e st
ora
ge e
ngi
ne
SSD SSD SSD
2
4
17
Elasticity - Rebalancing
vBucket 1
vBucket 2
vBucket 3
vBucket 4
vBucket 5
vBucket 6
Node 1 Node 2 Node 3
vBucket 1
vBucket 2
vBucket 3
vBucket 4
vBucket 5
vBucket 6
vBucket 7
vBucket 8
vBucket 9
vBucket 10
vBucket 11
vBucket 12
Before • Adding Node 3 • Node 3 is in pending state • Clients talk to Node 1,2 only
After • Node 3 is balanced • Clients are reconfigured to talk to
Node 3
During • Rebalancing orchestrator recalculates
the vBucket map (including replicas) • Migrate vBuckets to the new server • Finalize migration
vBucket 7
vBucket 8
vBucket 9
vBucket 10
vBucket 11
vBucket 12
Pending state
vBucket 1
vBucket 2
vBucket 3
vBucket 4
vBucket 5
vBucket 6
vBucket 7
vBucket 8
vBucket 9
vBucket 10
vBucket 11
vBucket 12
Rebalancing
vBucket migrator vBucket migrator
Client
18
Data buckets are secure Couchbase “slices”
Couchbase data servers
In the data center
Web application server
Application user
On the administrator console
Bucket 1
Bucket 2
Aggregate Cluster Memory and Disk Capacity
19
• Support large-scale analytics on application data by streaming data from Couchbase to Hadoop – Real-time integration using Flume
– Batch integration using Sqoop
• Examples – Various game statistics (e.g., monthly / daily / hourly rankings)
– Analyze game patterns from users to enhance various game metrics
Couchbase and Hadoop Integration
memcached protocol listener/sender
engine interface
Couchbase Storage Engine
TAP
Flume
Sqoop
20
• Introduction • What is Couchbase Server?
– Simple, Fast, Elastic – Technology Overview (Architecture, data flow, rebalancing)
• Tribal Crossing Inc: Animal Party – Challenges before Couchbase
• Original Architecture
– Why Couchbase? • Simplicity • Performance • Flexibility
– Deploying Couchbase • New Architecture • EC2 • Data Model • Accessing data in Couchbase
• Product Roadmap • Q&A
Agenda
21
Common steps on scaling up database:
● Tune queries (indexing, explain query)
● Denormalization
● Cache data (APC / Memcache)
● Tune MySQL configuration
● Replication (read slaves)
Where do we go from here to prepare for the scale of a successful social game?
Tribal Crossing: Challenges
22
● Write-heavy requests
– Caching does not help
– MySQL / InnoDB limitation (Percona)
● Need to scale drastically over night
– My Polls – 100 to 1m users over a weekend
● Small team, no dedicated sysadmin
– Focus on what we do best – making games
● Keeping cost down
Tribal Crossing: Challenges
23
● MySQL with master-to-master replication and sharding
– Complex to setup, high administration cost
– Requires application level changes
● Cassandra
– High write, but low read throughput
– Live cluster reconfiguration and rebalance is quite complicated
– Eventual consistency gives too much burden to application developers
● MongoDB
– High read/write, but unpredictable latency
– Live cluster rebalance for existing nodes only
– Eventual consistency with slave nodes
Tribal Crossing: “Old” Architecture and Options
24
● SPEED, SPEED, SPEED
● Immediate consistency
● Interface is dead simple to use
– We are already using Memcache
● Low sysadmin overhead
● Schema-less data store
● Used and Proven by big guys like Zynga
● … and lastly, because Tribal CAN
– Bigger firms with legacy code base = hard to adapt
– Small team = ability to get on the cutting edge
Tribal Crossing: Why Couchbase Server?
25
● But, there are some different challenges in using Couchbase (currently 1.7) to handle the game data:
– No easy way to query data
– No transaction / rollback
➔ Couchbase Server 2.0 resolves them by using CouchDB as the underlying database engine
● Can this work for an online game?
– Break out of the old ORM / relational paradigm!
– We are not handling bank transactions
Tribal Crossing: New Challenges With Couchbase
26
Couchbase Cluster
Web Server
Tribal Crossing: Deploying Couchbase in EC2
● Basic production environment setup
● Dev/Stage environment – feel free to install Couchbase on your web server
Apache
Couchbase Couchbase
DNS Entry
Client-side Moxi
Cluster Mgmt. Requests
…
27
Tribal Crossing: Deploying Couchbase in EC2
● Amazon Linux AMI, 64-bit, EBS backed instance
● Setup swap space
● Install Couchbase’s Membase Server 1.7
● Access web console http://<hostname>:8091
● Start the new cluster with a single node
● Add the other nodes to the cluster and rebalance
Couchbase Cluster
Web Server
Apache
Couchbase
DNS Entry
Client-side Moxi
Cluster Mgmt. Requests
… Couchbase
28
Tribal Crossing: Deploying Couchbase in EC2
Moxi figures out which node in the cluster holds data for a given key.
● On each web server, install Moxi proxy
● Start Moxi by pointing it to the DNS entry you created
● Web apps connect to Moxi that is running locally memcache->addServer(‘localhost’,
11211);
Couchbase Cluster
Web Server
Apache
Couchbase Couchbase
DNS Entry
Client-side Moxi
Cluster Mgmt. Requests
…
29
Use case - simple farming game:
● A player can have a variety of plants on their farm.
● A player can add or remove plants from their farm.
● A Player can see what plants are on another player's farm.
Tribal Crossing: Representing Game Data in Couchbase
30
Representing Objects
● Simply treat an object as an associative array
● Determine the key for an object using the class name (or type) of the object and an unique ID
Representing Object Lists
● Denormalization
● Save a comma separated list or an array of object IDs
Tribal Crossing: Representing Game Data in Couchbase
31
Player Object
Key: 'Player1'
Array
(
[Id] => 1
[Name] => Shawn
)
Tribal Crossing: Representing Game Data in Couchbase
Plant Object
Key: 'Plant201'
Array
(
[Id] => 201
[Player_Id] => 1
[Name] => Starflower
) PlayerPlant List Key: 'Player1_PlantList' Array ( [0] => 201 [1] => 202 [2] => 204 )
32
● No need to “ALTER TABLE”
● Add new “fields” all objects at any time
– Specify default value for missing fields
– Increased development speed
● Using JSON for data objects though, owing to the ability to query on arbitrary fields in Couchbase 2.0
Tribal Crossing: Schema-less Game Data
33
Get all plants belong to a given player
Request: GET /player/1/farm
$plant_ids = couchbase->get('Player1_PlantList');
$response = array();
foreach ($plant_ids as $plant_id)
{
$plant = couchbase->get('Plant' . $plant_id);
$response[] = $plant;
}
echo json_encode($response);
Tribal Crossing: Accessing Game Data in Couchbase
34
Give a player a new plant
// Create the new plant
$new_plant = array (
'id' => 100,
'name' => 'Mushroom'
);
$couchbase->set('Plant100', $new_plant);
// Update the player plant list
$plant_ids = $couchbase->get('Player1_PlantList');
$plant_ids[] = $new_plant['id'];
$couchbase->set('Player1_PlantList', $plant_ids);
Tribal Crossing: Modifying Game Data in Couchbase
35
Concurrency issue can occur when multiple requests are working with the same piece of data.
Solution:
● CAS (check-and-set)
– Client can know if someone else has modified the data while you are trying to update
– Implement optimistic concurrency control
● Locking (try/wait cycle)
– GETL (get with lock + timeout) operations
– Pessimistic concurrency control
Tribal Crossing: Concurrency
36
● Record object relationships both ways
– Example: Plots and Plants
● Plot object stores id of the plant that it hosts
● Plant object stores id of the plot that it grows on
– Resolution in case of mismatch
● Don't sweat the extra calls to load data in a one-to-many relationship
– Use multiGet
– We can still cache aggregated results in a Memcache bucket if needed
Tribal Crossing: Data Relationship
37
Web Server
First migrated large or slow performing tables and frequently updated fields from MySQL to Couchbase
Tribal Crossing: Migrating to Couchbase Servers
memcached protocol listener/sender
engine interface
Couchbase Storage Engine
TAP
TAP Client
Apache + PHP
Client-side Moxi
Reporting Applications
MySQL
38
Tribal Crossing: Deployment
39
Tribal Crossing: Deployment
40
• Significantly reduced the cost incurred by scaling up database servers and managing them.
• Achieved significant improvements in various performance metrics (e.g., read, write, latency, etc.)
• Allowed them to focus more on game development and optimizing key metrics
• Plan to use real-time MapReduce, querying, and indexing abilities provided by the upcoming Elastic Couchbase 2.0
Tribal Crossing: Conclusion
41
• Introduction • What is Couchbase Server?
– Simple, Fast, Elastic – Technology Overview (Architecture, data flow, rebalancing)
• Tribal Crossing Inc: Animal Party – Challenges before Couchbase
• Original Architecture
– Why Couchbase? • Simplicity • Performance • Flexibility
– Deploying Couchbase • New Architecture • EC2 • Data Model • Accessing data in Couchbase
• Product Roadmap • Q&A
Agenda
42
• Mobile to cloud data synchronization
• Cross data center replication
Product Roadmap: Couchbase Server 2.0
…
Couchbase Single Server
US West Coast Data Center
Couchbase Server
…
Couchbase Single Server
US East Coast Data Center
Couchbase Server
CouchSync
CouchSync CouchSync
CouchSync
… … …
… …
CouchSync
43
• Replace Sqlite-based storage engine with CouchDB
• Support indexing and querying on values
• Integrate real-time MapReduce into Couchbase server
• SDK for Couchbase server
Product Roadmap: Couchbase Server 2.0
The world’s leading caching and clustering technology
The most reliable and full-featured document database
The fastest, most complete and most reliable database on the
planet
Membase Server 1.7 CouchDB 1.1 Couchbase Server 2.0
44
• Community Edition
– Open source build
– Free forum support
• Enterprise Edition
– Free for non-production use
– Certified, QA tested version of open source
– Case tracking and guaranteed SLA for production environments
• Community Sponsors in China
– Interested in being Couchbase Ambassador for Beijing?
Couchbase Product Download
QClub
技术演讲
QClub
分享
桥梁
交流
争鸣
社区 趋势
话题
沟通
Java
云计算
运维
架构
SOA
敏捷
.NET
Ruby 技术社区活动日历
畅想
InfoQ
Open Space
QClub
QClub是由InfoQ中文站定期组织的一个线下技术交流活动,目的是让中高端技术人员有一 个相对自由的交流思想和交友的平台。每次QClub只关注一个主题,邀请国内外在该主题领域有研究的技术专家分享其经验,但QClub更注重讨论,因为它认为讨论才是真理,从讨论中才能激发参与者的智慧和火花。 关注我们:weibo.com/infoqchina QClub专题页:http://www.infoq.com/cn/qclub/