40
How To Size Up A Cassandra Cluster Joe Chu, Technical Trainer [email protected] April 2014 ©2014 DataStax Confidential. Do not distribute without consent.

How to size up an Apache Cassandra cluster (Training)

Embed Size (px)

DESCRIPTION

Learn how to size up an Apache Cassandra cluster by Joe Chu.

Citation preview

Page 1: How to size up an Apache Cassandra cluster (Training)

How To Size Up A Cassandra ClusterJoe Chu, Technical [email protected]

April 2014

©2014 DataStax Confidential. Do not distribute without consent.

Page 2: How to size up an Apache Cassandra cluster (Training)

2

What is Apache Cassandra?

• Distributed NoSQL database

• Linearly scalable

• Highly available with no single point of failure

• Fast writes and reads

• Tunable data consistency

• Rack and Datacenter awareness

©2014 DataStax Confidential. Do not distribute without consent.

Page 3: How to size up an Apache Cassandra cluster (Training)

3

Peer-to-peer architecture

• All nodes are the same

• No master / slave architecture

• Less operational overhead for better scalability.

• Eliminates single point of failure, increasing availability.

©2014 DataStax Confidential. Do not distribute without consent.

Master

Slave

Slave

Peer

Peer

PeerPeer

Peer

Page 4: How to size up an Apache Cassandra cluster (Training)

4

Linear Scalability

• Operation throughput increases linearly with the number of nodes added.

©2014 DataStax Confidential. Do not distribute without consent.

Page 5: How to size up an Apache Cassandra cluster (Training)

5

Data Replication

• Cassandra can write copies of data on different nodes.

RF = 3

• Replication factor setting determines the number of copies.

• Replication strategy can replicate data to different racks and and different datacenters.

©2014 DataStax Confidential. Do not distribute without consent.

INSERT INTO user_table (id, first_name, last_name) VALUES (1, ‘John’, ‘Smith’); R1

R2

R3

Page 6: How to size up an Apache Cassandra cluster (Training)

6

Node

• Instance of a running Cassandra process.

• Usually represented a single machine or server.

©2014 DataStax Confidential. Do not distribute without consent.

Page 7: How to size up an Apache Cassandra cluster (Training)

7

Rack

• Logical grouping of nodes.

• Allows data to be replicated across different racks.

©2014 DataStax Confidential. Do not distribute without consent.

Page 8: How to size up an Apache Cassandra cluster (Training)

8

Datacenter

• Grouping of nodes and racks.

• Each data center can have separate replication settings.

• May be in different geographical locations, but not always.

©2014 DataStax Confidential. Do not distribute without consent.

Page 9: How to size up an Apache Cassandra cluster (Training)

9

Cluster

• Grouping of datacenters, racks, and nodes that communicate with each other and replicate data.

• Clusters are not aware of other clusters.

©2014 DataStax Confidential. Do not distribute without consent.

Page 10: How to size up an Apache Cassandra cluster (Training)

10

Consistency Models

• Immediate consistency

When a write is successful, subsequent reads are guaranteed to return that latest value.

• Eventual consistency

When a write is successful, stale data may still be read but will eventually return the latest value.

©2014 DataStax Confidential. Do not distribute without consent.

Page 11: How to size up an Apache Cassandra cluster (Training)

11

Tunable Consistency

• Cassandra offers the ability to chose between immediate and eventual consistency by setting a consistency level.

• Consistency level is set per read or write operation.

• Common consistency levels are ONE, QUORUM, and ALL.

• For multi-datacenters, additional levels such as LOCAL_QUORUM and EACH_QUORUM to control cross-datacenter traffic.

©2014 DataStax Confidential. Do not distribute without consent.

Page 12: How to size up an Apache Cassandra cluster (Training)

12

CL ONE

• Write: Success when at least one replica node has acknowleged the write.

• Read: Only one replica node is given the read request.

©2014 DataStax Confidential. Do not distribute without consent.

R1

R2

R3CoordinatorClient

RF = 3

Page 13: How to size up an Apache Cassandra cluster (Training)

13

CL QUORUM

• Write: Success when a majority of the replica nodes has acknowledged the write.

• Read: A majority of the nodes are given the read request.

• Majority = ( RF / 2 ) + 1

©2013 DataStax Confidential. Do not distribute without consent.©2014 DataStax Confidential. Do not distribute without consent. 13

R1

R2

R3CoordinatorClient

RF = 3

Page 14: How to size up an Apache Cassandra cluster (Training)

14

CL ALL

• Write: Success when all of the replica nodes has acknowledged the write.

• Read: All replica nodes are given the read request.

©2013 DataStax Confidential. Do not distribute without consent.©2014 DataStax Confidential. Do not distribute without consent. 14

R1

R2

R3CoordinatorClient

RF = 3

Page 15: How to size up an Apache Cassandra cluster (Training)

16

Log-Structured Storage Engine

• Cassandra storage engine inspired by Google BigTable

• Key to fast write performance on Cassandra

©2014 DataStax Confidential. Do not distribute without consent.

Memtable

SSTable SSTable SSTable

Commit Log

Page 16: How to size up an Apache Cassandra cluster (Training)

17

Updates and Deletes

• SSTable files are immutable and cannot be changed.• Updates are written as new data.• Deletes write a tombstone, which mark a row or

column(s) as deleted.• Updates and deletes are just as fast as inserts.

©2014 DataStax Confidential. Do not distribute without consent.

SSTable SSTable SSTable

id:1, first:John, last:Smith

timestamp: …405

id:1, first:John, last:Williams

timestamp: …621

id:1, deleted

timestamp: …999

Page 17: How to size up an Apache Cassandra cluster (Training)

18

Compaction

• Periodically an operation is triggered that will merge the data in several SSTables into a single SSTable.

• Helps to limits the number of SSTables to read.• Removes old data and tombstones.• SSTables are deleted after compaction

©2014 DataStax Confidential. Do not distribute without consent.

SSTable SSTable SSTable

id:1, first:John, last:Smith

timestamp:405

id:1, first:John, last:Williams

timestamp:621

id:1, deleted

timestamp:999

New SSTable

id:1, deleted

timestamp:999

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Page 18: How to size up an Apache Cassandra cluster (Training)

19

Cluster Sizing

©2014 DataStax Confidential. Do not distribute without consent.

Page 19: How to size up an Apache Cassandra cluster (Training)

20

Cluster Sizing Considerations

• Replication Factor

• Data Size

“How many nodes would I need to store my data set?”

• Data Velocity (Performance)

“How many nodes would I need to achieve my desired throughput?” ©2014 DataStax Confidential. Do not distribute without consent.

Page 20: How to size up an Apache Cassandra cluster (Training)

21

Choosing a Replication Factor

©2014 DataStax Confidential. Do not distribute without consent.

Page 21: How to size up an Apache Cassandra cluster (Training)

22

What Are You Using Replication For?

• Durability or Availability?

• Each node has local durability (Commit Log), but replication can be used for distributed durability.

• For availability, a recommended setting is RF=3.

• RF=3 is the minimum necessary to achieve both consistency and availability using QUORUM and LOCAL_QUORUM.

©2014 DataStax Confidential. Do not distribute without consent.

Page 22: How to size up an Apache Cassandra cluster (Training)

23

How Replication Can Affect Consistency Level

• When RF < 3, you do not have as much flexibility when choosing consistency and availability.

• QUORUM = ALL

©2014 DataStax Confidential. Do not distribute without consent.

R1

R2

CoordinatorClient

RF = 2

Page 23: How to size up an Apache Cassandra cluster (Training)

24

Using A Larger Replication Factor

• When RF > 3, there is more data usage and higher latency for operations requiring immediate consistency.

• If using eventual consistency, a consistency level of ONE will have consistent performance regardless of the replication factor.

• High availability clusters may use a replication factor as high as 5.

©2014 DataStax Confidential. Do not distribute without consent.

Page 24: How to size up an Apache Cassandra cluster (Training)

25

Data Size

©2014 DataStax Confidential. Do not distribute without consent.

Page 25: How to size up an Apache Cassandra cluster (Training)

26

Disk Usage Factors

• Data Size

• Replication Setting

• Old Data

• Compaction

• Snapshots

©2014 DataStax Confidential. Do not distribute without consent.

Page 26: How to size up an Apache Cassandra cluster (Training)

27

Data Sizing

• Row and Column Data

• Row and Column Overhead

• Indices and Other Structures

©2014 DataStax Confidential. Do not distribute without consent.

Page 27: How to size up an Apache Cassandra cluster (Training)

28

Replication Overhead

• A replication factor > 1 will effectively multiply your data size by that amount.

©2014 DataStax Confidential. Do not distribute without consent.

RF = 1 RF = 2 RF = 3

Page 28: How to size up an Apache Cassandra cluster (Training)

29

Old Data

• Updates and deletes do not actually overwrite or delete data.

• Older versions of data and tombstones remain in the SSTable files until they are compacted.

• This becomes more important for heavy update and delete workloads.

©2014 DataStax Confidential. Do not distribute without consent.

Page 29: How to size up an Apache Cassandra cluster (Training)

30

Compaction

• Compaction needs free disk space to write the new SSTable, before the SSTables being compacted are removed.

• Leave enough free disk space on each node to allow compactions to run.

• Worst case for the Size Tier Compaction Strategy is

50% of the total data capacity of the node.

• For the Leveled Compaction Strategy, that is about 10% of the total data capacity.

©2014 DataStax Confidential. Do not distribute without consent.

Page 30: How to size up an Apache Cassandra cluster (Training)

31

Snapshots

• Snapshots are hard-links or copies of SSTable data files.

• After SSTables are compacted, the disk space may not be reclaimed if a snapshot of those SSTables were created.

Snapshots are created when:

• Executing the nodetool snapshot command• Dropping a keyspace or table• Incremental backups• During compaction

©2014 DataStax Confidential. Do not distribute without consent.

Page 31: How to size up an Apache Cassandra cluster (Training)

32

Recommended Disk Capacity

• For current Cassandra versions, the ideal disk capacity is approximate 1TB per node if using spinning disks and 3-5 TB per node using SSDs.

• Having a larger disk capacity may be limited by the resulting performance.

• What works for you is still dependent on your data model design and desired data velocity.

©2014 DataStax Confidential. Do not distribute without consent.

Page 32: How to size up an Apache Cassandra cluster (Training)

33

Data Velocity (Performance)

©2014 DataStax Confidential. Do not distribute without consent.

Page 33: How to size up an Apache Cassandra cluster (Training)

34

How to Measure Performance

• I/O Throughput

“How many reads and writes can be completed per second?”

• Read and Write Latency

“How fast should I be able to get a response for my read and write requests?”

©2014 DataStax Confidential. Do not distribute without consent.

Page 34: How to size up an Apache Cassandra cluster (Training)

35

Sizing for Failure

• Cluster must be sized taking into account the performance impact caused by failure.

• When a node fails, the corresponding workload must be absorbed by the other replica nodes in the cluster.

• Performance is further impacted when recovering a node. Data must be streamed or repaired using the other replica nodes.

©2014 DataStax Confidential. Do not distribute without consent.

Page 35: How to size up an Apache Cassandra cluster (Training)

36

Hardware Considerations for Performance

CPU• Operations are often CPU-intensive.• More cores are better.

Memory• Cassandra uses JVM heap memory.• Additional memory used as off-heap memory by

Cassandra, or as the OS page cache.

Disk• C* optimized for spinning disks, but SSDs will perform

better.• Attached storage (SAN) is strongly discouraged.

©2014 DataStax Confidential. Do not distribute without consent.

Page 36: How to size up an Apache Cassandra cluster (Training)

37

Some Final Words…

©2014 DataStax Confidential. Do not distribute without consent.

Page 37: How to size up an Apache Cassandra cluster (Training)

38

Summary

• Cassandra allows flexibility when sizing your cluster from a single node to thousands of nodes

• Your use case will dictate how you want to size and configure your Cassandra cluster. Do you need availability? Immediate consistency?

• The minimum number of nodes needed will be determined by your data size, desired performance and replication factor.

©2014 DataStax Confidential. Do not distribute without consent.

Page 38: How to size up an Apache Cassandra cluster (Training)

39

Additional Resources

• DataStax Documentationhttp://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePlanningAbout_c.html

• Planet Cassandrahttp://planetcassandra.org/nosql-cassandra-education/

• Cassandra Users Mailing [email protected]://mail-archives.apache.org/mod_mbox/cassandra-user/

©2014 DataStax Confidential. Do not distribute without consent.

Page 39: How to size up an Apache Cassandra cluster (Training)

40

Questions?

Questions?

©2014 DataStax Confidential. Do not distribute without consent.

Page 40: How to size up an Apache Cassandra cluster (Training)

Thank You

We power the big dataapps that transform business.

41©2014 DataStax Confidential. Do not distribute without consent.