How to size up an Apache Cassandra cluster (Training)


Learn how to size up an Apache Cassandra cluster, presented by Joe Chu.


How To Size Up A Cassandra Cluster

Joe Chu, Technical Trainer (jchu@datastax.com)

April 2014

©2014 DataStax Confidential. Do not distribute without consent.


What is Apache Cassandra?

• Distributed NoSQL database

• Linearly scalable

• Highly available with no single point of failure

• Fast writes and reads

• Tunable data consistency

• Rack and Datacenter awareness


Peer-to-peer architecture

• All nodes are the same

• No master / slave architecture

• Less operational overhead for better scalability.

• Eliminates single point of failure, increasing availability.

[Diagram: a master/slave topology contrasted with Cassandra's ring of identical peers]

Linear Scalability

• Operation throughput increases linearly with the number of nodes added.


Data Replication

• Cassandra can write copies of data on different nodes.


• Replication factor setting determines the number of copies.

• Replication strategy can replicate data to different racks and different datacenters.


INSERT INTO user_table (id, first_name, last_name) VALUES (1, 'John', 'Smith');

[Diagram: with RF = 3, the inserted row is written to three replicas R1, R2, and R3]

Node

• Instance of a running Cassandra process.

• Usually represents a single machine or server.


Rack

• Logical grouping of nodes.

• Allows data to be replicated across different racks.


Datacenter

• Grouping of nodes and racks.

• Each data center can have separate replication settings.

• May be in different geographical locations, but not always.


Cluster

• Grouping of datacenters, racks, and nodes that communicate with each other and replicate data.

• Clusters are not aware of other clusters.


Consistency Models

• Immediate consistency

When a write is successful, subsequent reads are guaranteed to return the latest value.

• Eventual consistency

When a write is successful, stale data may still be read but will eventually return the latest value.


Tunable Consistency

• Cassandra offers the ability to choose between immediate and eventual consistency by setting a consistency level.

• Consistency level is set per read or write operation.

• Common consistency levels are ONE, QUORUM, and ALL.

• For multi-datacenter deployments, additional levels such as LOCAL_QUORUM and EACH_QUORUM are available to control cross-datacenter traffic.
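A sketch of how these levels interact (the helper names are illustrative, not Cassandra's API; the rule that a read is immediately consistent when written replicas plus read replicas exceed RF is standard Cassandra guidance):

```python
# Illustrative sketch: map consistency levels to replica counts and check
# when a read is guaranteed to see the latest write (W + R > RF).

def replicas_required(level: str, rf: int) -> int:
    """Replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # majority
    if level == "ALL":
        return rf
    raise ValueError(f"unknown level: {level}")

def immediately_consistent(write_cl: str, read_cl: str, rf: int) -> bool:
    """True when written replicas + read replicas exceed RF."""
    return replicas_required(write_cl, rf) + replicas_required(read_cl, rf) > rf

# QUORUM writes + QUORUM reads at RF=3: 2 + 2 > 3, immediate consistency.
# ONE writes + ONE reads at RF=3: 1 + 1 <= 3, only eventual consistency.
```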


CL ONE

• Write: Success when at least one replica node has acknowledged the write.

• Read: Only one replica node is given the read request.

[Diagram: the client's request goes to a coordinator node; with RF = 3, a single replica among R1, R2, R3 satisfies CL ONE]

CL QUORUM

• Write: Success when a majority of the replica nodes have acknowledged the write.

• Read: A majority of the nodes are given the read request.

• Majority = ( RF / 2 ) + 1
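The majority formula uses integer division; a quick sketch:

```python
# The slide's majority formula: majority = (RF / 2) + 1, with integer division.

def quorum(rf: int) -> int:
    return rf // 2 + 1

# RF=3 -> 2 replicas, RF=5 -> 3 replicas.
# Note RF=2 -> 2 replicas, i.e. QUORUM is the same as ALL.
```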

[Diagram: the client's request goes to a coordinator node; with RF = 3, two replicas among R1, R2, R3 satisfy CL QUORUM]

CL ALL

• Write: Success when all of the replica nodes have acknowledged the write.

• Read: All replica nodes are given the read request.

[Diagram: the client's request goes to a coordinator node; with RF = 3, all replicas R1, R2, R3 must respond for CL ALL]

Log-Structured Storage Engine

• Cassandra storage engine inspired by Google BigTable

• Key to fast write performance on Cassandra

[Diagram: writes land in the commit log and memtable; memtables are flushed to immutable SSTables on disk]

Updates and Deletes

• SSTable files are immutable and cannot be changed.

• Updates are written as new data.

• Deletes write a tombstone, which marks a row or column(s) as deleted.

• Updates and deletes are just as fast as inserts.
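The version-resolution rule these bullets imply (last write wins by timestamp, with tombstones hiding the row) can be sketched like this; the row values mirror the slide, and the helper is illustrative, not Cassandra's API:

```python
# Illustrative sketch of last-write-wins reconciliation across SSTable
# versions of row id=1 (values taken from this slide).
versions = [
    {"id": 1, "first": "John", "last": "Smith", "ts": 405},
    {"id": 1, "first": "John", "last": "Williams", "ts": 621},
    {"id": 1, "deleted": True, "ts": 999},
]

def latest(versions):
    """The version with the highest timestamp wins; a tombstone hides the row."""
    winner = max(versions, key=lambda v: v["ts"])
    return None if winner.get("deleted") else winner

# Here the tombstone (ts 999) is newest, so a read returns no row.
```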


[Diagram: three SSTables, each holding a version of row id:1: (first:John, last:Smith, timestamp …405), (first:John, last:Williams, timestamp …621), (deleted, timestamp …999)]


Compaction

• Periodically an operation is triggered that will merge the data in several SSTables into a single SSTable.

• Helps to limit the number of SSTables to read.

• Removes old data and tombstones.

• The original SSTables are deleted after compaction.


[Diagram: compaction merges three SSTables holding versions of row id:1 (timestamps 405, 621, 999) into one new SSTable containing only the newest entry: id:1, deleted, timestamp 999]


Cluster Sizing


Cluster Sizing Considerations

• Replication Factor

• Data Size

“How many nodes would I need to store my data set?”

• Data Velocity (Performance)

“How many nodes would I need to achieve my desired throughput?”
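A back-of-the-envelope sketch of the data-size question (the formula and the per-node capacity figure are illustrative assumptions, not a DataStax sizing rule):

```python
import math

# Rough storage-driven node count: raw data is multiplied by the
# replication factor, then divided by the usable capacity of one node.

def nodes_for_data(raw_tb: float, rf: int, capacity_per_node_tb: float) -> int:
    """Minimum node count to store raw_tb of data at the given RF."""
    return math.ceil(raw_tb * rf / capacity_per_node_tb)

# e.g. 10 TB of raw data, RF=3, ~1 TB usable per spinning-disk node:
# nodes_for_data(10, 3, 1.0) -> 30 nodes
```

Performance (data velocity) then sets a second, independent lower bound on the node count; the larger of the two wins.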


Choosing a Replication Factor


What Are You Using Replication For?

• Durability or Availability?

• Each node has local durability (Commit Log), but replication can be used for distributed durability.

• For availability, a recommended setting is RF=3.

• RF=3 is the minimum necessary to achieve both consistency and availability using QUORUM and LOCAL_QUORUM.


How Replication Can Affect Consistency Level

• When RF < 3, you do not have as much flexibility when choosing consistency and availability.

• With RF = 2, QUORUM = ALL.

[Diagram: with RF = 2, QUORUM needs both replicas R1 and R2 to respond]

Using A Larger Replication Factor

• When RF > 3, there is more data usage and higher latency for operations requiring immediate consistency.

• If using eventual consistency, a consistency level of ONE will have consistent performance regardless of the replication factor.

• High availability clusters may use a replication factor as high as 5.


Data Size


Disk Usage Factors

• Data Size

• Replication Setting

• Old Data

• Compaction

• Snapshots


Data Sizing

• Row and Column Data

• Row and Column Overhead

• Indices and Other Structures


Replication Overhead

• A replication factor > 1 will effectively multiply your data size by that amount.
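As arithmetic, illustratively:

```python
# The slide's point as a formula: effective on-disk size = raw size * RF.

def effective_size_tb(raw_tb: float, rf: int) -> float:
    return raw_tb * rf

# 2 TB of raw data at RF=3 occupies ~6 TB across the cluster.
```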

[Diagram: disk usage of the same data set at RF = 1, RF = 2, and RF = 3]

Old Data

• Updates and deletes do not actually overwrite or delete data.

• Older versions of data and tombstones remain in the SSTable files until they are compacted.

• This becomes more important for heavy update and delete workloads.


Compaction

• Compaction needs free disk space to write the new SSTable, before the SSTables being compacted are removed.

• Leave enough free disk space on each node to allow compactions to run.

• Worst case for the Size-Tiered Compaction Strategy is 50% of the total data capacity of the node.

• For the Leveled Compaction Strategy, it is about 10% of the total data capacity.
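These headroom figures can be turned into a rough usable-capacity estimate (a sketch built only on the worst-case percentages above, not a precise rule):

```python
# Worst-case compaction headroom from the slide: Size-Tiered (STCS) may
# temporarily need ~50% of the node's capacity free, Leveled (LCS) ~10%.
HEADROOM = {"STCS": 0.50, "LCS": 0.10}

def usable_capacity_tb(disk_tb: float, strategy: str) -> float:
    """Capacity left for live data after reserving compaction headroom."""
    return disk_tb * (1 - HEADROOM[strategy])

# A 1 TB node using STCS should keep only ~0.5 TB of live data;
# with LCS, ~0.9 TB.
```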


Snapshots

• Snapshots are hard-links or copies of SSTable data files.

• After SSTables are compacted, the disk space may not be reclaimed if a snapshot of those SSTables was created.

Snapshots are created when:

• Executing the nodetool snapshot command

• Dropping a keyspace or table

• Incremental backups

• During compaction


Recommended Disk Capacity

• For current Cassandra versions, the ideal disk capacity is approximately 1 TB per node when using spinning disks, and 3-5 TB per node when using SSDs.

• Larger disk capacities per node are possible, but the resulting performance may become the limiting factor.

• What works for you is still dependent on your data model design and desired data velocity.


Data Velocity (Performance)


How to Measure Performance

• I/O Throughput

“How many reads and writes can be completed per second?”

• Read and Write Latency

“How fast should I be able to get a response for my read and write requests?”


Sizing for Failure

• The cluster must be sized taking into account the performance impact caused by node failure.

• When a node fails, the corresponding workload must be absorbed by the other replica nodes in the cluster.

• Performance is further impacted when recovering a node. Data must be streamed or repaired using the other replica nodes.
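A rough sketch of that failure impact, assuming the failed node's workload spreads evenly over the survivors (a simplification; the actual distribution depends on topology and token assignment):

```python
# Illustrative: per-node load multiplier when nodes fail, assuming the
# lost workload is absorbed evenly by the surviving nodes.

def load_multiplier_after_failure(n_nodes: int, failures: int = 1) -> float:
    survivors = n_nodes - failures
    if survivors <= 0:
        raise ValueError("no surviving nodes")
    return n_nodes / survivors

# In a 4-node cluster, losing one node raises load on the rest by ~33%,
# so each node should run with at least that much performance headroom.
```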


Hardware Considerations for Performance

CPU

• Operations are often CPU-intensive.

• More cores are better.

Memory

• Cassandra uses JVM heap memory.

• Additional memory is used as off-heap memory by Cassandra, or as the OS page cache.

Disk

• Cassandra is optimized for spinning disks, but SSDs will perform better.

• Shared network-attached storage (SAN) is strongly discouraged.


Some Final Words…


Summary

• Cassandra allows flexibility when sizing your cluster, from a single node to thousands of nodes.

• Your use case will dictate how you want to size and configure your Cassandra cluster. Do you need availability? Immediate consistency?

• The minimum number of nodes needed will be determined by your data size, desired performance and replication factor.


Additional Resources

• DataStax Documentation: http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePlanningAbout_c.html

• Planet Cassandra: http://planetcassandra.org/nosql-cassandra-education/

• Cassandra Users Mailing List: user-subscribe@cassandra.apache.org, http://mail-archives.apache.org/mod_mbox/cassandra-user/


Questions?



Thank You

We power the big data apps that transform business.

