
Page 1: Scaling Out Without Flipping Out

Scaling Out Without Flipping Out

Luke Tillman (@LukeTillman)

Language Evangelist at DataStax

Page 2: Scaling Out Without Flipping Out

Who are you?!

• Evangelist with a focus on the .NET Community

• Long-time developer (mostly with relational databases)

• Recently presented at Cassandra Summit 2014 with Microsoft

2

Page 3: Scaling Out Without Flipping Out

1 More Users, More Problems

2 With Great Power Comes Great Responsibility

3 Leaving the Relational Past Behind

4 Cassandra and Azure, BFFs

3

Page 4: Scaling Out Without Flipping Out

More Users, More Problems

Page 5: Scaling Out Without Flipping Out

Scaling and Availability

• We all want applications and services that are scalable and highly available

• Scaling our app tier is usually pretty painless, especially with cloud infrastructure

– App tier tends to be stateless

Page 6: Scaling Out Without Flipping Out

Ways We Scale our Relational Databases

6

SELECT array_agg(players), player_teams
FROM (
  SELECT DISTINCT t1.t1player AS players, t1.player_teams
  FROM (
    SELECT p.playerid AS t1id,
           concat(p.playerid, ':', p.playername, ' ') AS t1player,
           array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
    FROM player p
    LEFT JOIN plays pl ON p.playerid = pl.playerid
    GROUP BY p.playerid, p.playername
  ) t1
  INNER JOIN (
    SELECT p.playerid AS t2id,
           array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
    FROM player p
    LEFT JOIN plays pl ON p.playerid = pl.playerid
    GROUP BY p.playerid, p.playername
  ) t2 ON t1.player_teams = t2.player_teams AND t1.t1id <> t2.t2id
) innerQuery
GROUP BY player_teams

Scaling Up

SELECT * FROM denormalized_view

Denormalization

Page 7: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Page 8: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Replication

Primary

Replica 1

Replica 2

Page 9: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Replication

Primary

Replica 1

Replica 2

Failover

Process

Page 10: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Replication

Primary

Replica 1

Replica 2

Failover

Process

Monitor

Failover

Page 11: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Replication

Primary

Replica 1

Replica 2

Failover

Process

Monitor

Failover

Write Requests Read Requests

Page 12: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Replication

Primary

Replica 1

Replica 2

Failover

Process

Monitor

Failover

Write Requests Read Requests

Replication Lag

Page 13: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Page 14: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Sharding

A-F G-M N-T U-Z

Page 15: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Sharding

Router

A-F G-M N-T U-Z

Page 16: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Router

A-F G-M N-T U-Z

Sharding and Replication (and probably Denormalization)

Page 17: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Failover

Process

Router

A-F G-M N-T U-Z

Sharding and Replication (and probably Denormalization)

Page 18: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Failover

Process

Monitor

Failover

Router

A-F G-M N-T U-Z

Sharding and Replication (and probably Denormalization)


Page 20: Scaling Out Without Flipping Out

Ways we Scale our Relational Databases

7

Client

Users Data

Failover

Process

Monitor

Failover

Router

A-F G-M N-T U-Z

Replication Lag

Sharding and Replication (and probably Denormalization)


Page 22: Scaling Out Without Flipping Out

What is Cassandra?

• A Linearly Scaling and Fault Tolerant Distributed Database

• Fully Distributed

– Data spread over many nodes

– All nodes participate in a cluster

– All nodes are equal

– No SPOF (shared nothing)

– Runs on commodity hardware

22

Page 23: Scaling Out Without Flipping Out

What is Cassandra?

Linearly Scaling

– Have More Data? Add more nodes.

– Need More Throughput? Add more nodes.

Fault Tolerant

– Nodes Down != Database Down

– Datacenter Down != Database Down

23

Page 24: Scaling Out Without Flipping Out

What is Cassandra?

• Fully Replicated

• Clients write locally

• Data syncs across WAN

• Replication Factor per DC

24

US Europe

Client
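As a rough sketch of "Replication Factor per DC" in CQL: a keyspace declares how many replicas live in each logical datacenter. The keyspace name and DC names below ('killrvideo', 'US', 'Europe') are illustrative, and the DC names must match what the cluster's snitch reports.

CREATE KEYSPACE killrvideo
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'US': 3,       -- 3 replicas of every partition in the US DC
    'Europe': 2    -- 2 replicas in the Europe DC
  };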

Page 25: Scaling Out Without Flipping Out

Cassandra and the CAP Theorem

• The CAP Theorem limits what distributed systems can do

• Consistency

• Availability

• Partition Tolerance

• Limits? “Pick 2 out of 3”

• Cassandra is an AP system that is Eventually Consistent

25

Page 26: Scaling Out Without Flipping Out

With Great Power Comes Great

Responsibility

Page 27: Scaling Out Without Flipping Out

You Control the Fault Tolerance of Cassandra

• Replication Factor

– You set this on the server-side in Cassandra

• Consistency Level

– You set this on the client-side in your application

– Choose this for each read and write you do against Cassandra

27

Page 28: Scaling Out Without Flipping Out

Replication Factor (server-side)

• How many copies of the data should exist?

28

Client

B AD

C AB

A CD

D BC

Write A

RF=3
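For a single-DC sketch, the RF=3 in the diagram corresponds to something like the following keyspace definition (the keyspace name is made up for the example):

CREATE KEYSPACE demo
  WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 3 };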

Page 29: Scaling Out Without Flipping Out

Consistency Level (client-side)

• How many replicas do we need to hear from before we acknowledge?

29

Client

B AD

C AB

A CD

D BC

Write A

CL=QUORUM

Client

B AD

C AB

A CD

D BC

Write A

CL=ONE

Page 30: Scaling Out Without Flipping Out

Consistency Levels

• Applies to both Reads and Writes (i.e. is set on each query)

• ONE – one replica from any DC

• LOCAL_ONE – one replica from the local DC

• QUORUM – a majority (more than half) of replicas across all DCs

• LOCAL_QUORUM – a majority of replicas in the local DC

• ALL – all replicas

• TWO – two replicas from any DC

30
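Drivers let you set the consistency level per statement; in cqlsh the CONSISTENCY command approximates this by applying to the queries that follow in the session. A small sketch — the keyspace, table, and id are placeholders, not values from the demo:

CONSISTENCY LOCAL_QUORUM;
SELECT * FROM killrvideo.videos
WHERE videoid = 245e8024-14bd-4324-9a99-6c73a1c200ea;  -- hypothetical id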

Page 31: Scaling Out Without Flipping Out

Consistency Level and Speed

• How many replicas we need to hear from can affect how quickly we can read and write data in Cassandra

31

Client

B AD

C AB

A CD

D BC

5 µs ack

300 µs ack

12 µs ack

12 µs ack

Read A

(CL=QUORUM)

Page 32: Scaling Out Without Flipping Out

Consistency Level and Availability

• Consistency Level choice affects availability

• For example, QUORUM can tolerate one replica being down and still be available (in RF=3)

32

Client

B AD

C AB

A CD

D BC

A=2

A=2

A=2

Read A

(CL=QUORUM)

Page 33: Scaling Out Without Flipping Out

Consistency Level and Eventual Consistency

• Cassandra is an AP system that is Eventually Consistent, so replicas may disagree

• Column values are timestamped

• In Cassandra, Last Write Wins (LWW)

33

Client

B AD

C AB

A CD

D BC

A=2

Newer

A=1

Older

A=2

Read A

(CL=QUORUM)

Christos from Netflix: “Eventual Consistency != Hopeful Consistency”

https://www.youtube.com/watch?v=lwIA8tsDXXE

Page 34: Scaling Out Without Flipping Out

Leaving the Relational Past Behind

34

Page 35: Scaling Out Without Flipping Out

KillrVideo, a Video Sharing Site (like YouTube)

• Live demo available at http://www.killrvideo.com

– Written in C#, JavaScript

– Live demo running in Azure, backed by a DataStax Enterprise cluster

– Open source: https://github.com/luketillman/killrvideo-csharp

Page 36: Scaling Out Without Flipping Out

Data Structures

• Keyspace is like RDBMS Database or Schema

• Like RDBMS, Cassandra uses Tables to store data

• Partitions can have one row (narrow) or multiple rows (wide)

36

Keyspace

Tables

Partitions

Rows

Page 37: Scaling Out Without Flipping Out

Schema Definition (DDL)

• Easy to define tables for storing data

• First part of Primary Key is the Partition Key

CREATE TABLE videos (
  videoid uuid,
  userid uuid,
  name text,
  description text,
  preview_image_location text,
  tags set<text>,
  added_date timestamp,
  PRIMARY KEY (videoid)
);

37

Page 38: Scaling Out Without Flipping Out

Partition Key

Partition Key Determines Data Distribution

• The Partition Key determines node placement

38

videoid       | name                | description             | ...
689d56e5- …   | Keyboard Cat        | Keyboard Cat is the ... | ...
93357d73- …   | Nyan Cat            | Check out Nyan cat ...  | ...
d978b136- …   | Original Grumpy Cat | Visit Grumpy Cat’s …    | ...

Page 39: Scaling Out Without Flipping Out

Partition Key – Hashing

• The Partition Key is hashed using a consistent hashing function (Murmur3) and the output is used to place the data on a node

• The data is also replicated to RF-1 other nodes

39

Murmur3(videoid: 689d56e5- ...) → node A

B AD

C AB

A CD

D BC

RF=3 Partition Key

videoid       | name         | description             | ...
689d56e5- ... | Keyboard Cat | Keyboard Cat is the ... | ...

Page 40: Scaling Out Without Flipping Out

Hashing – Back to Reality

• Back in reality, Partition Keys actually hash to 128 bit numbers

• Nodes in Cassandra own token ranges (i.e. hash ranges)

40

B AD

C AB

A CD

D BC

Range | Start        | End
A     | 0xC000000..1 | 0x0000000..0
B     | 0x0000000..1 | 0x4000000..0
C     | 0x4000000..1 | 0x8000000..0
D     | 0x8000000..1 | 0xC000000..0

Murmur3 0xadb95e99da887a8a4cb474db86eb5769

Partition Key

videoid

689d56e5- ...
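If you want to see where rows actually land, CQL's built-in token() function returns the partition key's Murmur3 token, which maps onto the ranges above; a quick sketch against the videos table:

SELECT videoid, token(videoid)
FROM videos
LIMIT 3;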

Page 41: Scaling Out Without Flipping Out

Clustering Columns

• Second part of Primary Key is Clustering Column(s)

• Clustering columns affect ordering of data inside a partition (and on disk)

• Ascending/Descending order is possible

41

CREATE TABLE comments_by_video (
  videoid uuid,
  commentid timeuuid,
  userid uuid,
  comment text,
  PRIMARY KEY (videoid, commentid)
) WITH CLUSTERING ORDER BY (commentid DESC);

Page 42: Scaling Out Without Flipping Out

Clustering Columns – Wide Rows

• Use of Clustering Columns (and the layout on disk) is where the term “Wide Rows” comes from

42

Partition videoid='0fe6a...'
  commentid='82be1...' (10/1/2014 9:36AM) | userid='ac346...' | comment='Awesome!'
  commentid='765ac...' (9/17/2014 7:55AM) | userid='f89d3...' | comment='Garbage!'

CREATE TABLE comments_by_video (
  videoid uuid,
  commentid timeuuid,
  userid uuid,
  comment text,
  PRIMARY KEY (videoid, commentid)
) WITH CLUSTERING ORDER BY (commentid DESC);

Page 43: Scaling Out Without Flipping Out

Inserts and Updates

• Use INSERT or UPDATE to add and modify data

• Both will overwrite data (no constraints like RDBMS)

• INSERT and UPDATE are functionally equivalent

43

INSERT INTO comments_by_video (videoid, commentid, userid, comment)
VALUES ('0fe6a...', '82be1...', 'ac346...', 'Awesome!');

UPDATE comments_by_video
SET userid = 'ac346...', comment = 'Awesome!'
WHERE videoid = '0fe6a...' AND commentid = '82be1...';

Page 44: Scaling Out Without Flipping Out

TTL and Deletes

• Can specify a Time to Live (TTL) in seconds when doing an INSERT or UPDATE

• Use DELETE statement to remove data

• Can optionally specify columns to remove part of a row

44

INSERT INTO comments_by_video ( ... )
VALUES ( ... )
USING TTL 86400;

DELETE FROM comments_by_video
WHERE videoid = '0fe6a...' AND commentid = '82be1...';
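A quick sketch of the column-level delete mentioned above: naming columns after DELETE removes just those values and leaves the rest of the row (the id placeholders follow the slide's '0fe6a...' style):

DELETE comment
FROM comments_by_video
WHERE videoid = '0fe6a...' AND commentid = '82be1...';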

Page 45: Scaling Out Without Flipping Out

Querying

• Use SELECT to get data from your tables

• Always include Partition Key and optionally Clustering Columns in queries

• Can use ORDER BY (on Clustering Columns) and LIMIT

• Use range queries (for example, by date) to slice partitions

45

SELECT * FROM comments_by_video WHERE videoid = 'a67cd...' LIMIT 10;
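A hedged sketch of the date-range slicing mentioned above, using the built-in minTimeuuid()/maxTimeuuid() functions against the commentid clustering column; the dates and id placeholder are illustrative:

SELECT * FROM comments_by_video
WHERE videoid = 'a67cd...'
  AND commentid >= minTimeuuid('2014-09-01')
  AND commentid < maxTimeuuid('2014-10-01')
LIMIT 10;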

Page 46: Scaling Out Without Flipping Out

Breaking the Relational Mindset

• How do we data model when we have to query by the Partition Key (and optionally Clustering Columns)?

• Denormalize all the things!

• Disk is cheap now and writes in Cassandra are FAST

• Data modeling is very much query driven

• Many times we end up with a “table per query”

46
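To make “table per query” concrete, here is an illustrative sketch of two query-specific tables, one for listing a user's videos and one for listing recently added videos; the names and columns are made up for the example and are not the exact KillrVideo schema:

CREATE TABLE videos_by_user (
  userid uuid,
  added_date timestamp,
  videoid uuid,
  name text,
  PRIMARY KEY (userid, added_date, videoid)
) WITH CLUSTERING ORDER BY (added_date DESC, videoid ASC);

CREATE TABLE latest_videos (
  day text,              -- date bucket, e.g. '2014-11-08'
  added_date timestamp,
  videoid uuid,
  name text,
  PRIMARY KEY (day, added_date, videoid)
) WITH CLUSTERING ORDER BY (added_date DESC, videoid ASC);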

Page 47: Scaling Out Without Flipping Out

Users – The Relational Way

• Single Users table with all user data and an Id Primary Key

• Add an index on email address to allow queries by email

User logs into site → Find user by email address → Show basic information about user → Find user by id

47

Page 48: Scaling Out Without Flipping Out

Users – The Cassandra Way

User logs into site → Find user by email address → Show basic information about user → Find user by id

CREATE TABLE user_credentials (
  email text,
  password text,
  userid uuid,
  PRIMARY KEY (email)
);

CREATE TABLE users (
  userid uuid,
  firstname text,
  lastname text,
  email text,
  created_date timestamp,
  PRIMARY KEY (userid)
);

48
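The login flow then becomes two partition-key lookups, sketched below; the email and userid values are hypothetical:

-- Step 1: user logs in, look up credentials (and userid) by email
SELECT password, userid
FROM user_credentials
WHERE email = 'luke@example.com';

-- Step 2: show basic information about the user by id (the userid returned by step 1)
SELECT firstname, lastname, email, created_date
FROM users
WHERE userid = 622d9a28-0c4e-4b2a-9d0e-3f5c8a1b2c3d;  -- hypothetical id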

Page 49: Scaling Out Without Flipping Out

Cassandra and Azure, BFFs

Page 50: Scaling Out Without Flipping Out

Cassandra and Azure: Languages and Platforms

50

Open Source | Languages | Platforms

https://github.com/datastax https://github.com/Azure

Notes:

• DataStax also offers a C++ driver

• Over 20% of Azure VMs run Linux

Page 51: Scaling Out Without Flipping Out

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Page 52: Scaling Out Without Flipping Out

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Azure Storage

(Blob)

A7 instances

Page 53: Scaling Out Without Flipping Out

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Azure Storage

(Blob) SSD SSD

A7 instances G3/G4 instances

Page 54: Scaling Out Without Flipping Out

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Azure Storage

(Blob) SSD SSD

More safety, less performance ←→ Less safety, more performance

A7 instances G3/G4 instances

Page 55: Scaling Out Without Flipping Out

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Azure Storage

(Blob) SSD SSD

Logical DC1 Logical DC2

Multi-DC Replication

Page 56: Scaling Out Without Flipping Out

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Azure Storage

(Blob) SSD SSD

Frequent Snapshots

Page 58: Scaling Out Without Flipping Out

Marketplace Deployment from Preview Portal

58 https://portal.azure.com

Page 59: Scaling Out Without Flipping Out

Marketplace Deployment from Preview Portal

• Configure VM size in the Portal UI, click a button (yes, that easy)

• What you get:

– 8 VMs configured for use as DataStax Enterprise nodes

– 1 VM with OpsCenter

• Decommission any nodes you don't want/need, then use OpsCenter to deploy Cassandra

– More detailed instructions: http://www.tonyguid.net/2014/11/Datastax_now_what/

59

Page 60: Scaling Out Without Flipping Out

OpsCenter: Management and Monitoring

60

Page 61: Scaling Out Without Flipping Out

OpsCenter: Creating a Cluster and Adding Nodes

61

Page 62: Scaling Out Without Flipping Out

Picking a Distribution: Apache Cassandra

• Get the latest bleeding-edge features

• File JIRAs

• Support via community on mailing list and IRC

• Perfect for hacking

62

http://cassandra.apache.org

Page 64: Scaling Out Without Flipping Out

Some Parting Thoughts

• Spending time re-architecting or changing infrastructure to meet scale challenges doesn't add business value

• Can you make infrastructure and architecture decisions now that will help you scale in the future?

• Learn more

– Apache Cassandra: http://planetcassandra.org

– DataStax Enterprise, Free Tools: http://www.datastax.com

– Azure: http://azure.microsoft.com

64

Page 65: Scaling Out Without Flipping Out

Questions?

Follow me for updates or to ask questions later: @LukeTillman

Slides: http://www.slideshare.net/LukeTillman

65