Transcript
Page 1: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

©2013 DataStax Confidential. Do not distribute without consent.

@PatrickMcFadin

Patrick McFadin, Chief Evangelist/Solution Architect - DataStax

Cassandra 2.0: Better, Stronger, Faster

Thursday, October 3, 13

Page 2: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Five Years of Cassandra

[Timeline figure. Date axis: Jul-08, Jul-09, May-10, Feb-11, Dec-11, Oct-12, Jul-13. Release labels: 0.1, 0.3, 0.6, 0.7, 1.0, 1.2, ..., 2.0, DSE]


Page 3: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Cassandra 2.0 - Big new features


Page 4: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Lightweight transactions: the problem

Session 1
SELECT * FROM users WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (username, password) VALUES ('jbellis', 'xdg44hh')

Session 2
SELECT * FROM users WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (username, password) VALUES ('jbellis', '8dhh43k')

It’s a Race!

Who wins?
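The race on this slide can be reproduced with a small in-memory model. This is only a sketch: the `users` dict and the helpers `select_user` and `insert_user` are hypothetical stand-ins, not a Cassandra client, and "the later write overwrites" stands in for Cassandra's timestamp-based resolution.

```python
# In-memory stand-in for the users table (hypothetical, not a real client).
users = {}

def select_user(username):
    """Return the stored row, or None for an empty result set."""
    return users.get(username)

def insert_user(username, password):
    """Plain insert: overwrites unconditionally, like a normal write."""
    users[username] = {"password": password}

# Both sessions check first, and both see an empty result set.
session1_sees = select_user("jbellis")
session2_sees = select_user("jbellis")

# Each session then inserts, believing the name is free.
if session1_sees is None:
    insert_user("jbellis", "xdg44hh")
if session2_sees is None:
    insert_user("jbellis", "8dhh43k")

# The second insert silently replaced the first.
winner = users["jbellis"]["password"]
print(winner)
```

Neither session is told that it collided with the other, which is the gap a conditional write is meant to close.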


Page 5: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Why Locking Doesn’t Work

[Diagram: Client (locks) -> request -> Coordinator -> internal request -> Replica]

• Client locks
• Write times out
• Lock released
• Hint is replayed!!


Page 6: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Why Locking Doesn’t Work

[Diagram: Client (locks) -> request -> Coordinator -> internal request -> Replica, with an X marked]

• Client locks
• Write times out
• Lock released
• Hint is replayed!!


Page 7: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Why Locking Doesn’t Work

[Diagram: Client (locks) -> request -> Coordinator -> internal request -> Replica, with a hint and an X marked]

• Client locks
• Write times out
• Lock released
• Hint is replayed!!


Page 8: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Why Locking Doesn’t Work

[Diagram: Client (locks) -> request -> Coordinator -> internal request -> Replica, with a hint, a timeout response, and an X marked]

• Client locks
• Write times out
• Lock released
• Hint is replayed!!
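The four bullets can be walked through with a toy model. This is a sketch under stated assumptions: `Replica`, `Coordinator`, `WriteTimeout`, and the `lock_held` flag are hypothetical names, the lock is a plain boolean, and hint storage is reduced to a list that is replayed once the replica is reachable again.

```python
class WriteTimeout(Exception):
    """Raised when the replica does not answer in time."""

class Replica:
    def __init__(self):
        self.rows = {}
        self.up = True

class Coordinator:
    def __init__(self, replica):
        self.replica = replica
        self.hints = []  # writes saved for later delivery

    def write(self, key, value):
        if not self.replica.up:
            # Save a hint, then report a timeout to the client.
            self.hints.append((key, value))
            raise WriteTimeout(key)
        self.replica.rows[key] = value

    def replay_hints(self):
        for key, value in self.hints:
            self.replica.rows[key] = value
        self.hints.clear()

replica = Replica()
coordinator = Coordinator(replica)

lock_held = True            # 1. client locks
replica.up = False
try:
    coordinator.write("jbellis", "xdg44hh")
except WriteTimeout:
    timed_out = True        # 2. write times out
lock_held = False           # 3. lock released

row_after_release = replica.rows.get("jbellis")

replica.up = True
coordinator.replay_hints()  # 4. hint is replayed
row_after_replay = replica.rows.get("jbellis")
```

In this model the write that the client saw fail is applied after the lock is gone, so the lock never bounded the write's effect.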


Page 9: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Paxos

• Consensus algorithm
• All operations are quorum-based
• Each replica sends information about unfinished operations to the leader during prepare
• Paxos Made Simple
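The quorum and prepare bullets can be illustrated with textbook single-decree Paxos. This is a minimal sketch, not Cassandra's implementation: `Acceptor`, `propose`, and the integer ballots are hypothetical, and networking, durability, and retries are left out.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0     # highest ballot promised so far
        self.accepted = None  # (ballot, value) last accepted, if any

    def prepare(self, ballot):
        """Promise to ignore lower ballots; report any accepted proposal."""
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot, value):
        """Accept unless a higher ballot has been promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False

def propose(acceptors, ballot, value):
    """Return the chosen value, or None if no quorum was reached."""
    quorum = len(acceptors) // 2 + 1
    replies = [a.prepare(ballot) for a in acceptors]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < quorum:
        return None
    # Replicas reported unfinished work during prepare: finish it first.
    prior = [p for p in promises if p is not None]
    if prior:
        value = max(prior, key=lambda p: p[0])[1]
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum else None

acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, 1, "xdg44hh")
second = propose(acceptors, 2, "8dhh43k")
stale = propose(acceptors, 1, "late")
```

Here `second` returns the value already accepted under ballot 1 rather than its own, and `stale` is rejected because its ballot is too old.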


Page 10: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

LWT: details

• 4 round trips vs 1 for normal updates
• Paxos state is durable
• Immediate consistency with no leader election or failover
• ConsistencyLevel.SERIAL
• http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0


Page 12: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Using LWT

• Don’t overwrite an existing record

INSERT INTO USERS (username, email, ...)
VALUES ('jbellis', '[email protected]', ...)
IF NOT EXISTS;

• Only update record if condition is met

UPDATE USERS SET email = '[email protected]', ...
WHERE username = 'jbellis'
IF email = '[email protected]';
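The semantics of the two conditional statements can be modeled as compare-and-set operations. This is only a sketch: `insert_if_not_exists` and `update_if` are hypothetical helpers over a dict, the password values are reused from the earlier slide for illustration, and the boolean return stands in for the `[applied]` column of the real result. Atomicity is trivial here because the model is single-threaded; on a cluster it is what Paxos provides.

```python
users = {}

def insert_if_not_exists(username, row):
    """Model of INSERT ... IF NOT EXISTS: apply only when no row exists."""
    if username in users:
        return False
    users[username] = dict(row)
    return True

def update_if(username, column, expected, new_value):
    """Model of UPDATE ... IF column = expected."""
    row = users.get(username)
    if row is None or row.get(column) != expected:
        return False
    row[column] = new_value
    return True

# Two sessions race for the same username; only the first is applied.
first_applied = insert_if_not_exists("jbellis", {"password": "xdg44hh"})
second_applied = insert_if_not_exists("jbellis", {"password": "8dhh43k"})

# A conditional update succeeds only when the expected value matches.
good_update = update_if("jbellis", "password", "xdg44hh", "n3wpass")
bad_update = update_if("jbellis", "password", "xdg44hh", "other")
```

Unlike the plain insert, the losing session learns that its write was not applied.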


Page 13: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Triggers

CREATE TRIGGER <name> ON <table> USING <classname>;

DROP TRIGGER <name> ON [<keyspace>.]<table>;

• Executed on the coordinator before mutation
• Takes original mutation and adds any new mutations
• Jars deployed per server


Page 14: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Trigger implementation

class MyTrigger implements ITrigger
{
    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
    {
        ...
    }
}

• You have to implement your own ITrigger (for now)
• Compile and deploy to each server
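The idea behind `augment`, taking the original mutation and returning extra ones, can be shown with a language-neutral model. This is a sketch in Python rather than the Java ITrigger API: `audit_trigger`, `apply_write`, the `audit_log` table, and the tuple shape of a mutation are all hypothetical.

```python
def audit_trigger(key, update):
    """Hypothetical trigger: return one extra mutation for an audit table."""
    return [("audit_log", key, {"changed_columns": sorted(update)})]

def apply_write(tables, table, key, update, triggers):
    """Collect the original mutation plus trigger output, then apply all."""
    mutations = [(table, key, update)]
    for trigger in triggers:
        mutations.extend(trigger(key, update))
    for name, row_key, columns in mutations:
        tables.setdefault(name, {}).setdefault(row_key, {}).update(columns)
    return mutations

tables = {}
mutations = apply_write(
    tables, "users", "jbellis", {"password": "xdg44hh"}, [audit_trigger]
)
```

The trigger runs before anything is written, so the original and the added mutations are applied together.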


Page 15: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Experimental!

• Relies on internal RowMutation, ColumnFamily classes
• Not sandboxed. Be careful!
• Expect changes in 2.1


Page 16: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

CQL Improvements

• ALTER DROP
  • Remove a field from a CQL table

ALTER TABLE users DROP address3;

• Conditional schema changes
  • Only execute if condition met

CREATE KEYSPACE IF NOT EXISTS ks
WITH replication = { 'class': 'SimpleStrategy', 'replication_factor' : 3 };

CREATE TABLE IF NOT EXISTS test (k int PRIMARY KEY);

DROP KEYSPACE IF EXISTS ks;


Page 17: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

CQL Improvements

• Aliases in SELECT

SELECT event_id, dateOf(created_at) AS creation_date, blobAsText(content) AS content FROM timeline;

 event_id                | creation_date            | content
-------------------------+--------------------------+----------------------
 550e8400-e29b-41d4-a716 | 2013-07-26 10:44:33+0200 | Something happened!?

• Limit and TTL in prepared statements

SELECT * FROM myTable LIMIT ?;

UPDATE myTable USING TTL ? SET v = 2 WHERE k = 'foo';


Page 18: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Cassandra 2.0 - Minor features


Page 19: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Query performance

• Hint when reading time series data
  • Time series slices find data faster
• Hybrid approach to Leveled Compaction under stress
  • Use size tiered until we catch up
  • Reduce read latency impact
• Off-heap memory speedup
  • Bytes moved on and off 10x faster
• Removal of row-level bloom filters

Page 20: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Server performance

• Single pass compaction
  • No more incremental compaction for large storage rows
• LMAX Disruptor on Thrift interface
  • Crazy fast and efficient concurrent threads. Faster HSHA
• Support for pluggable off-heap memory allocators
  • JEMalloc support to start. Faster memory access.
• Bigger Level 0 file size
  • 5M was just too small. Now 256M


Page 21: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Removed features

• SuperColumns are gone!
  • Not the API, just the underlying implementation
• On-heap row cache
  • Row cache is no longer an option in the JVM
• Memory pressure relief valves - Gone from yaml
  • flush_largest_memtables_at
  • reduce_cache_sizes_at
  • reduce_cache_sizes_to


Page 22: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Operation Changes

• JDK 7 now required
• Vnodes are default
• Streaming overhaul
  • Control. Streams are grouped and broken into plans
  • Traceability. Each stream has an ID. Monitor each stream.
  • Performance. Streams are now pipelined. No waiting for ACK
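Why pipelining helps can be seen with back-of-the-envelope arithmetic. This is a deliberately simple model, not a measurement: the function names and the numbers (100 chunks, 10 ms to send each, 2 ms round trip) are hypothetical.

```python
def stop_and_wait_ms(chunks, send_ms, rtt_ms):
    """Each chunk is sent, then the sender idles until its ACK returns."""
    return chunks * (send_ms + rtt_ms)

def pipelined_ms(chunks, send_ms, rtt_ms):
    """Chunks are sent back to back; only the last ACK is waited for."""
    return chunks * send_ms + rtt_ms

waiting = stop_and_wait_ms(chunks=100, send_ms=10, rtt_ms=2)
pipelined = pipelined_ms(chunks=100, send_ms=10, rtt_ms=2)
print(waiting, pipelined)
```

In this model the saving grows with both the number of chunks and the round-trip time, so it matters most across slow links.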


Page 23: Cassandra Community Webinar | Cassandra 2.0 - Better, Faster, Stronger

Thank you!

Apache Cassandra 2.0 - Data model on fire

Next talk in my data model series!

