78
Scale Your Apps for the Big Time Luke Tillman (@LukeTillman) Language Evangelist at DataStax

Scale Your Apps for the Big Time

Embed Size (px)

Citation preview

Scale Your Apps for the Big Time

Luke Tillman (@LukeTillman)

Language Evangelist at DataStax

Who are you?!

• Evangelist with a focus on the .NET Community

• Long-time developer (mostly with relational databases)

• Recently presented at Cassandra Summit 2014 with Microsoft

2

1 Why are we talking about this?

2 How is Cassandra different?

3 An Application for Sharing Videos

4 Handling User-Generated Videos

5 Scaling Out with Cassandra and Azure

3

Why are we talking about this?

Scaling and Availability

• We all want applications and

services that are scalable and highly

available

• Scaling our app tier is usually pretty

painless, especially with cloud

infrastructure

– App tier tends to be stateless

But wait, my relational database can do that

• Scaling up

– Faster hardware, bigger SAN, etc.

– Can get expensive, quickly

• My favorite Oracle Exception

6

But wait, my relational database can do that

• Denormalization

– Speed up poorly

performing queries

– Sometimes done as

“materialized” or

“indexed” views

– No more 3NF?

– ACID? Sometimes

views are eventually

consistent

7

SELECT array_agg(players), player_teams FROM ( SELECT DISTINCT t1.t1player AS players, t1.player_teams FROM ( SELECT p.playerid AS t1id, concat(p.playerid, ':', p.playername, ' ') AS t1player, array_agg (pl.teamid ORDER BY pl.teamid) AS player_teams FROM player p LEFT JOIN plays pl ON p.playerid = pl.playerid GROUP BY p.playerid, p.playername ) t1 INNER JOIN ( SELECT p.playerid AS t2id, array_agg (pl.teamid ORDER BY pl.teamid) AS player_teams FROM player p LEFT JOIN plays pl ON p.playerid = pl.playerid GROUP BY p.playerid, p.playername ) t2 ON t1.player_teams = t2.player_teams AND t1.t1id <> t2.t2id ) innerQuery GROUP BY player_teams

But wait, my relational database can do that

• Replication

– Master-slave replication, Read

replication

– Oftentimes includes failover

– ACID? Replicas are eventually

consistent

8

Client

Users Data

Replica 2

Replica 1

Primary

Write Requests Read Requests

But wait, my relational database can do that

• Sharding

– Picking the right shard key is

important

– Pushes a lot of complexity to the

application

– What about JOINs,

aggregations, or transactions

across shards?

– Schema changes are a lot more

“fun”

9

Client

Router

A-F G-M N-T U-Z

Users Data

Holy Complexity, Batman!

10

Client

Router

A-F G-M N-T U-Z

Users Data

11

Why might I think about using something else?

• Volume: I’ve got a lot of data

• Velocity: I need to read/write a lot of data really quickly

• Availability: I need data to always be available for reading/writing

• Geographic Complexity: I need my data in multiple data centers

• Structure: My data isn't best represented as relational

12

Let’s use some computer science

• Dynamo Paper (2007)

– How do we build a data store that is:

• Reliable

• Performant

• Always On

– Everything that’s old is new again

– 24 papers cited

13

Let’s use some computer science

• BigTable Paper (2006)

– Richer data model

– 1 key, lots of values

– Fast, sequential access

– 38 papers cited

14

Let’s use some computer science

• Cassandra (2008)

– Distributed features from Dynamo

– Data model and storage from BigTable

– Became a top-level Apache project in

February 2010

15

How’s it going after six years?

• Wait, Microsoft

Access?!

• Growing

16

DB Engines Rankings (March 2015)

http://db-engines.com/en/ranking

Before you get too excited, Cassandra is not…

• A Data Ocean, Lake, or Pond

• An In-Memory Database

• A Key-Value Store

• A magical database unicorn

17

When to think about using Oracle, SQL Server,

Postgres, <RDBMS>

Loose data model (joins, sub-selects, ad-hoc querying)

Absolute consistency (I need ACID!)

No need to use anything else

You’ll miss the long, candle lit dinners with your Oracle rep that

always ends with, “What’s your budget look like this year?”

18

When to think about using Cassandra

Uptime is a top priority

Unpredictable or high scaling requirements

Workload is transactional

Willing to put the time or effort into understanding how

Cassandra works and how to use it

19

Demo: Working with Cassandra

How is Cassandra different?

What is Cassandra?

• A Linearly Scaling and Fault Tolerant Distributed Database

• Fully Distributed

– Data spread over many nodes

– All nodes participate in a cluster

– All nodes are equal

– No SPOF (shared nothing)

22

What is Cassandra?

• Linearly Scaling

– Have More Data? Add more nodes.

– Need More Throughput? Add more nodes.

23

http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

What is Cassandra?

• Fault Tolerant

– Nodes Down != Database Down

– Datacenter Down != Database Down

24

What is Cassandra?

• Fully Replicated

• Clients write local

• Data syncs across WAN

• Replication Factor per DC

25

US Europe

Client

Cassandra and the CAP Theorem

• The CAP Theorem limits what distributed systems can do

• Consistency

• Availability

• Partition Tolerance

• Limits? “Pick 2 out of 3”

• Cassandra is an AP system that is Eventually Consistent

26

Two knobs control Cassandra fault tolerance

• Replication Factor (server side)

– How many copies of the data should exist?

27

Client

B AD

C AB

A CD

D BC

Write A

RF=3

Two knobs control Cassandra fault tolerance

• Consistency Level (client side)

– How many replicas do we need to hear from before we acknowledge?

28

Client

B AD

C AB

A CD

D BC

Write A

CL=QUORUM

Client

B AD

C AB

A CD

D BC

Write A

CL=ONE

Consistency Levels

• Applies to both Reads and Writes (i.e. is set on each query)

• ONE – one replica from any DC

• LOCAL_ONE – one replica from local DC

• QUORUM – 51% of replicas from any DC

• LOCAL_QUORUM – 51% of replicas from local DC

• ALL – all replicas

• TWO

29

Consistency Level and Speed

• How many replicas we need to hear from can affect how quickly

we can read and write data in Cassandra

30

Client

B AD

C AB

A CD

D BC

5 µs ack

300 µs ack

12 µs ack

12 µs ack

Read A

(CL=QUORUM)

Consistency Level and Availability

• Consistency Level choice affects availability

• For example, QUORUM can tolerate one replica being down and

still be available (in RF=3)

31

Client

B AD

C AB

A CD

D BC

A=2

A=2

A=2

Read A

(CL=QUORUM)

Consistency Level and Eventual Consistency

• Cassandra is an AP system that is Eventually Consistent so

replicas may disagree

• Column values are timestamped

• In Cassandra, Last Write Wins (LWW)

32

Client

B AD

C AB

A CD

D BC

A=2

Newer

A=1

Older

A=2

Read A

(CL=QUORUM)

Christos from Netflix: “Eventual Consistency != Hopeful Consistency”

https://www.youtube.com/watch?v=lwIA8tsDXXE

All Your Questions Answered at Cassandra Day!

• Topics we'll cover:

– How is data distributed around the ring?

– Read and write paths in Cassandra and why they're so fast

– CQL, how it's different from SQL

– Data modeling

– Client drivers and interacting with Cassandra from code

33

CREATE TABLE users ( userid uuid, firstname text, lastname text, email text, created_date timestamp, PRIMARY KEY (userid) );

SELECT firstname, lastname FROM users WHERE userid = 99051fe9-6a9c-46c2-b949-38ef78858dd0

An Application for Sharing Videos

Demo: KillrVideo

35

See the Live Demo, Get the Code

• Live demo available at http://www.killrvideo.com

– Written in C#, JavaScript

– Live Demo running in Azure, backed by DataStax Enterprise cluster

– Open source: https://github.com/luketillman/killrvideo-csharp

• Interesting use case because of different data modeling

challenges and the scale of something like YouTube

– More than 1 billion unique users visit YouTube each month

– 100 hours of video are uploaded to YouTube every minute

36

KillrVideo Logical Architecture

Web UI HTML5 / JavaScript

KillrVideo MVC App Serves up Web UI HTML and handles JSON requests from Web UI

Comments

Tracks comments on

videos by users

Uploads

Handles processing,

storing, and encoding

uploaded videos

Video Catalog

Tracks the catalog of

available videos

User Management

User accounts, login

credentials, profiles

Cassandra

Cluster (DSE)

App data storage

for services (e.g.

users, comments)

DataStax

OpsCenter

Management,

provisioning, and

monitoring

Azure Media

Services

Video encoding,

thumbnail

generation

Azure Storage

(Blob, Queue)

Video file and

thumbnail image

storage

Azure Service

Bus

Published events

from services for

interactions

Browser

Server

Services

Infrastructure

Inside a Simple Service: Video Catalog

Video Catalog

Tracks the catalog of

available videos

Cassandra

Cluster (DSE)

App data storage

for services (e.g.

users, comments)

Azure Service

Bus

Published events

from services for

interactions

Inside a Simple Service: Video Catalog

Video Catalog

Tracks the catalog of

available videos

Cassandra

Cluster (DSE)

App data storage

for services (e.g.

users, comments)

• Stores metadata about videos in

Cassandra (e.g. name, description,

location, thumbnail location, etc.)

Inside a Simple Service: Video Catalog

Video Catalog

Tracks the catalog of

available videos

Azure Service

Bus

Published events

from services for

interactions

• Publishes events about interesting things

that happen (e.g. YouTubeVideoAdded,

UploadedVideoAccepted, etc.)

Inside a More Complicated Service: Uploads

Uploads

Handles processing,

storing, and encoding

uploaded videos

Cassandra

Cluster (DSE)

App data storage

for services (e.g.

users, comments)

Azure Storage

(Blob, Queue)

Video file and

thumbnail image

storage

Azure Media

Services

Video encoding,

thumbnail

generation

Azure Service

Bus

Published events

from services for

interactions

Inside a More Complicated Service: Uploads

Uploads

Handles processing,

storing, and encoding

uploaded videos

Cassandra

Cluster (DSE)

App data storage

for services (e.g.

users, comments)

• Stores data about uploaded video file

locations, encoding jobs, job status, etc.

Inside a More Complicated Service: Uploads

Uploads

Handles processing,

storing, and encoding

uploaded videos

Azure Storage

(Blob, Queue)

Video file and

thumbnail image

storage

• Stores original and re-encoded video file

assets, as well as thumbnail preview

images generated

Inside a More Complicated Service: Uploads

Uploads

Handles processing,

storing, and encoding

uploaded videos

Azure Media

Services

Video encoding,

thumbnail

generation

• Re-encodes uploaded videos to format

suitable for the web, generates

thumbnail image previews

Inside a More Complicated Service: Uploads

Uploads

Handles processing,

storing, and encoding

uploaded videos

Azure Service

Bus

Published events

from services for

interactions

• Publishes events about interesting things

that happen (e.g.

UploadedVideoPublished, etc.)

Event Driven Architecture

• Only the application(s)

give commands

• Decoupled: Pub-sub

messaging to tell other

parts of the system

something interesting

happened

• Services could be

deployed, scaled, and

versioned independently

(AKA microservices)

40

Azure Service

Bus

User

Management

Comments

Video

Ratings

Sample Data

Search

Statistics Suggested

Videos

Uploads

Video

Catalog

Event Driven Architecture

• Only the application(s)

give commands

• Decoupled: Pub-sub

messaging to tell other

parts of the system

something interesting

happened

• Services could be

deployed, scaled, and

versioned independently

(AKA microservices)

40

Azure Service

Bus

Search

Suggested

Videos

Video

Catalog

Hey, I added this

new YouTube video

to the catalog!

Event Driven Architecture

• Only the application(s)

give commands

• Decoupled: Pub-sub

messaging to tell other

parts of the system

something interesting

happened

• Services could be

deployed, scaled, and

versioned independently

(AKA microservices)

40

Azure Service

Bus

Search

Suggested

Videos

Video

Catalog

Hey, I added this

new YouTube video

to the catalog!

Time to figure

out what videos

to suggest for

that new video.

Better index that

new video so it

shows up in

search results.

Cassandra and Azure: Languages and Platforms

49

Open

Source

Languages

Platforms

https://github.com/datastax https://github.com/Azure

Notes:

• DataStax also offers a C++ driver

• Over 20% of Azure VMs run Linux

Handling User-Generated Videos

Azure Media Services Concepts

51

Azure Media Services Concepts

51

Asset

Azure Storage

(Blob)

File File File

Asset: container for digital files (video, audio, image, etc.)

Azure Media Services Concepts

51

Asset

Azure Storage

(Blob)

Locator

Locator: URL for accessing an Asset

Azure Media Services Concepts

51

Job

Task Task Task

Job: series of Tasks to be performed

Azure Media Services Concepts

51

Asset

Task Task Task

Task: work that takes input Assets and produces output Assets

Azure Media Services Concepts

51

Media

Processor

Task Task Task

Media Processor: does work of processing Assets

Azure Media Services Concepts

51

Job

Notification

Endpoint

Azure Storage

(Queue)

Notification Endpoint: endpoint to listen for updates to Job status

Demo: Uploading a Video

Uploading the Source Video

• Possible workflow:

– Web UI posts file data to web server (MVC App)

– MVC App streams file data to Upload Service

– Upload Service uses Media Services SDK to stream file

data to Azure Storage

• That's a lot of extra load on the app tier just to

stream file data back to Azure Storage

• Why don't we cut out the middle man?

59

Web UI

Azure Storage

(Blob)

MVC App

Uploads

Service

Uploading the Source Video

60

Web UI

Azure Storage

(Blob)

killrvideo.com

killrvideo-storage.windowsazure.com

Uploading the Source Video

60

Web UI

Azure Storage

(Blob)

Cross domain

requests not allowed

killrvideo.com

killrvideo-storage.windowsazure.com

Uploading the Source Video

60

Web UI

Azure Storage

(Blob)

killrvideo.com

killrvideo-storage.windowsazure.com

CORS Rule: Allow requests

from killrvideo.com

Uploading the Source Video

60

Web UI

Azure Storage

(Blob)

killrvideo.com

killrvideo-storage.windowsazure.com

killrvideo.com is now

an allowed domain

Scaling Out with Cassandra and Azure

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Azure Storage

(Blob)

A7 instances

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Azure Storage

(Blob) SSD SSD

A7 instances G3/G4 instances

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Azure Storage

(Blob) SSD SSD

More safety

Less Performance

Less safety

More Performance

A7 instances G3/G4 instances

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Azure Storage

(Blob) SSD SSD

Logical DC1 Logical DC2

Multi-DC Replication

Deploying Cassandra in Azure

• IOPs are super important, choosing can be tricky

Azure Storage

(Blob) SSD SSD

Frequent Snapshots

Demo: Cassandra and Azure

73

Gallery Deployment from Azure Preview Portal

• Configure VM size in the Portal UI, click a button (yes, that easy)

• What you get:

– 8 VMs configured for use as DataStax Enterprise nodes

– 1 VM with OpsCenter

• Decommission any nodes you don't want/need, then use

OpsCenter to deploy Cassandra

– More detailed instructions:

http://www.tonyguid.net/2014/11/Datastax_now_what/

74

Picking a Distribution: Apache Cassandra

• Get the latest bleeding-edge

features

• File JIRAs

• Support via community on

mailing list and IRC

• Perfect for hacking

http://cassandra.apache.org

75

Some Parting Thoughts

• Spending time re-architecting or changing infrastructure to

meet scale challenges doesn't add business value

• Can you make infrastructure and architecture decisions now that

will help you scale in the future?

• Learn more

– Apache Cassandra: http://planetcassandra.org

– DataStax Enterprise, DevCenter, OpsCenter: http://www.datastax.com

– Azure: http://azure.microsoft.com

77

Questions?

Follow me for updates or to ask questions later @LukeTillman Slides: http://www.slideshare.net/LukeTillman

78