
C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1


Speaker: Jonathan Ellis, Apache Cassandra Chair and CTO/Co-Founder at DataStax. Keynote presentation on Apache Cassandra 2.0 and 2.1 at Cassandra Summit EU 2013.


Page 1: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

#CASSANDRAEU

Cassandra 2.0 and 2.1

Jonathan Ellis
CTO, DataStax

Page 2: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Five years of Cassandra

[Timeline: axis from Jul-08 to Jul-13 (ticks at Jul-09, May-10, Feb-11, Dec-11, Oct-12); releases marked 0.1, 0.3, 0.6, 0.7, 1.0, 1.2, ... and 2.0, plus DSE.]

Page 3: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Core values
• Massive scalability
• High performance
• Reliability/Availability

[Figure legend: Cassandra, HBase, Redis, MySQL]

Page 4: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

[Chart: VLDB benchmark (RWS). Throughput (ops/sec), 0 to 80,000, against number of nodes, 0 to 12, for Cassandra, HBase, Redis and MySQL; the Cassandra series is labeled.]

Page 5: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

[Chart: Endpoint benchmark (RW). Throughput (ops/sec), 0 to 35,000, against number of nodes (1, 2, 4, 8, 16, 32), for Cassandra, HBase and MongoDB; the Cassandra series is labeled.]

Page 6: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1


Page 7: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

New core value
• Massive scalability
• High performance
• Reliability/Availability
• Ease of use

CREATE TABLE users (
  id uuid PRIMARY KEY,
  name text,
  state text,
  birth_date int
);

CREATE INDEX ON users(state);

SELECT * FROM users
WHERE state='Texas' AND birth_date > 1950;

Page 8: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Native Drivers
• CQL native protocol: efficient, lightweight, asynchronous
• Java (GA): https://github.com/datastax/java-driver
• .NET (GA): https://github.com/datastax/csharp-driver
• Python (Beta): https://github.com/datastax/python-driver
• Coming soon: PHP, Ruby

Page 9: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Tracing

cqlsh:foo> INSERT INTO bar (i, j) VALUES (6, 2);
Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9

 activity                            | timestamp    | source    | source_elapsed
-------------------------------------+--------------+-----------+----------------
 Determining replicas for mutation   | 00:02:37,015 | 127.0.0.1 |            540
 Sending message to /127.0.0.2       | 00:02:37,015 | 127.0.0.1 |            779
 Message received from /127.0.0.1    | 00:02:37,016 | 127.0.0.2 |             63
 Applying mutation                   | 00:02:37,016 | 127.0.0.2 |            220
 Acquiring switchLock                | 00:02:37,016 | 127.0.0.2 |            250
 Appending to commitlog              | 00:02:37,016 | 127.0.0.2 |            277
 Adding to memtable                  | 00:02:37,016 | 127.0.0.2 |            378
 Enqueuing response to /127.0.0.1    | 00:02:37,016 | 127.0.0.2 |            710
 Sending message to /127.0.0.1       | 00:02:37,016 | 127.0.0.2 |            888
 Message received from /127.0.0.2    | 00:02:37,017 | 127.0.0.1 |           2334
 Processing response from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 |           2550

Page 10: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Authentication

[cassandra.yaml]
authenticator: PasswordAuthenticator
# DSE offers KerberosAuthenticator

Page 11: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Authentication

[cassandra.yaml]
authenticator: PasswordAuthenticator
# DSE offers KerberosAuthenticator

CREATE USER robin WITH PASSWORD 'manager' SUPERUSER;

ALTER USER cassandra WITH PASSWORD 'newpassword';

LIST USERS;

DROP USER cassandra;

Page 12: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Authorization

[cassandra.yaml]
authorizer: CassandraAuthorizer

GRANT select ON audit TO jonathan;

GRANT modify ON users TO robin;

GRANT all ON ALL KEYSPACES TO lara;

Page 13: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Lightweight transactions

Session 1:
SELECT * FROM users WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (...) VALUES ('jbellis', ...)

Session 2:
SELECT * FROM users WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (...) VALUES ('jbellis', ...)
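
The two sessions on this slide race: each checks that the username is free, sees an empty result, and then inserts. A minimal Python sketch of that interleaving follows. It uses a plain dictionary as a stand-in for the table, so `users`, `select` and `insert` are hypothetical helpers for illustration and not Cassandra or driver code.

```python
# Toy in-memory "table"; not Cassandra code. All names are illustrative.
users = {}

def select(username):
    """Return the stored row, or None for an empty result set."""
    return users.get(username)

def insert(username, owner):
    """Unconditional write: the last writer silently wins."""
    users[username] = owner

# Both sessions read before either one writes.
seen_by_session_1 = select("jbellis")
seen_by_session_2 = select("jbellis")

if seen_by_session_1 is None:
    insert("jbellis", "session 1")
if seen_by_session_2 is None:
    insert("jbellis", "session 2")

# Both sessions believed the name was free; session 1's row was overwritten.
print(users["jbellis"])
```

Neither session learns that it collided with the other, which is the problem a conditional write is meant to close.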

Page 14: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Paxos
• All operations are quorum-based
• Each replica sends information about unfinished operations to the leader during prepare
• Paxos Made Simple
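
Why quorums matter here can be checked with a little arithmetic: any two majorities of the same replica set share at least one replica, so a leader that hears from a quorum during prepare hears from some replica that saw any earlier accepted operation. The sketch below assumes a quorum means a simple majority of the replicas; the function names are made up for the example.

```python
from itertools import combinations

def quorum(replication_factor):
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1

def quorums_always_overlap(replication_factor):
    """Brute-force check that every pair of quorums shares a replica."""
    replicas = range(replication_factor)
    size = quorum(replication_factor)
    groups = [set(c) for c in combinations(replicas, size)]
    return all(a & b for a in groups for b in groups)

for rf in (3, 5):
    print(rf, quorum(rf), quorums_always_overlap(rf))
```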

Page 15: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Details
• 4 round trips vs 1 for normal updates
• Paxos state is durable
• Immediate consistency with no leader election or failover
• ConsistencyLevel.SERIAL
• http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0

Page 17: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Syntax

UPDATE USERS
SET email = '[email protected]', ...
WHERE username = 'jbellis'
IF email = '[email protected]';

INSERT INTO USERS (username, email, ...)
VALUES ('jbellis', '[email protected]', ... )
IF NOT EXISTS;
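
The semantics of the two conditional statements can be modeled as compare-and-set operations that report whether they were applied. The Python sketch below is a single-process toy, so the check and the write are trivially atomic; in Cassandra the same guarantee has to be reached across replicas. `ToyUsers`, its methods, and the example.com addresses are all hypothetical, since the addresses in the extracted slide are obscured.

```python
class ToyUsers:
    """In-memory stand-in for a users table keyed by username."""

    def __init__(self):
        self.rows = {}

    def insert_if_not_exists(self, username, email):
        """Model INSERT ... IF NOT EXISTS; return True when applied."""
        if username in self.rows:
            return False
        self.rows[username] = {"email": email}
        return True

    def update_email_if(self, username, expected, new):
        """Model UPDATE ... IF email = expected; return True when applied."""
        row = self.rows.get(username)
        if row is None or row["email"] != expected:
            return False
        row["email"] = new
        return True

table = ToyUsers()
first = table.insert_if_not_exists("jbellis", "old@example.com")
second = table.insert_if_not_exists("jbellis", "other@example.com")
stale = table.update_email_if("jbellis", "other@example.com", "new@example.com")
fresh = table.update_email_if("jbellis", "old@example.com", "new@example.com")
print(first, second, stale, fresh)
```

The second insert and the update with a stale expectation are both rejected, so the losing session finds out instead of silently overwriting.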

Page 18: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Triggers

CREATE TRIGGER <name> ON <table> USING <classname>;

Page 19: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Trigger implementation

class MyTrigger implements ITrigger
{
    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
    {
        ...
    }
}

Page 20: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Experimental!
• Relies on internal RowMutation, ColumnFamily classes
• [partition] key is a ByteBuffer
• Expect changes in 2.1

Page 21: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Cursors (before)

CREATE TABLE timeline (
  user_id uuid,
  tweet_id timeuuid,
  tweet_author uuid,
  tweet_body text,
  PRIMARY KEY (user_id, tweet_id)
);

SELECT *
FROM timeline
WHERE (user_id = :last_key AND tweet_id > :last_tweet)
   OR token(user_id) > token(:last_key)
LIMIT 100

Page 22: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Cursors (after)

SELECT *
FROM timeline
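
The "before" query is a hand-written cursor: each page resumes after the last row seen, either later in the same partition or in any partition with a larger token. A rough Python sketch of that client-side loop is below. It assumes a toy `token` function based on CRC32 and a small in-memory `ROWS` list; none of these names come from Cassandra or its drivers.

```python
import zlib

def token(user_id):
    """Toy stand-in for the partitioner's token function."""
    return zlib.crc32(user_id.encode())

# (user_id, tweet_id) rows, stored in token order, then clustering order.
ROWS = sorted(
    [(u, t) for u in ("alice", "bob", "carol") for t in range(1, 6)],
    key=lambda r: (token(r[0]), r[1]),
)

def next_page(last_key, last_tweet, limit):
    """Apply the slide's WHERE clause, then LIMIT."""
    if last_key is None:  # first page: no predicate yet
        matches = ROWS
    else:
        matches = [
            (u, t) for (u, t) in ROWS
            if (u == last_key and t > last_tweet) or token(u) > token(last_key)
        ]
    return matches[:limit]

def scan_all(limit):
    """Client-side paging loop that the 'after' form no longer requires."""
    seen, last_key, last_tweet = [], None, None
    while True:
        page = next_page(last_key, last_tweet, limit)
        if not page:
            return seen
        seen.extend(page)
        last_key, last_tweet = page[-1]

print(len(scan_all(4)))
```

With the "after" form the application issues the plain SELECT and leaves the resume logic to the server and driver.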

Page 23: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Other CQL improvements
• SELECT DISTINCT pk
• CREATE TABLE IF NOT EXISTS table
• SELECT ... AS
  • SELECT event_id, dateOf(created_at) AS creation_date
• ALTER TABLE DROP column

Page 28: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

On-Heap/Off-Heap

[Diagram: Java process memory split into On-Heap (managed by GC) and Off-Heap (not managed by GC).]

Page 29: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Read path (per sstable)

[Diagram, built up over six slides: the read path for one sstable, split into Memory and Disk. Components are added in this order: Bloom filter, Partition key cache, Partition summary, Partition index, Compression offsets, Data.]

Page 35: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Off heap in 2.0
Partition key bloom filter: 1-2GB per billion partitions

[Read-path diagram repeated from the preceding slides.]
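
The "1-2GB per billion partitions" figure is in line with the textbook sizing formula for a Bloom filter, which needs about -ln(p) / (ln 2)^2 bits per key for a false-positive rate p. The sketch below works that out; the two rates tried (1% and 0.1%) are illustrative assumptions, not numbers from the slides.

```python
import math

def bloom_bits_per_key(false_positive_rate):
    """Optimal bits per key for a standard Bloom filter."""
    return -math.log(false_positive_rate) / (math.log(2) ** 2)

def bloom_gigabytes(keys, false_positive_rate):
    """Filter size in GB (10^9 bytes) for the given number of keys."""
    return keys * bloom_bits_per_key(false_positive_rate) / 8 / 1e9

for p in (0.01, 0.001):  # illustrative target rates, not from the slides
    print(p, round(bloom_gigabytes(1_000_000_000, p), 2))
```

Both results fall between 1 and 2 GB for a billion keys, which is why keeping this structure off the Java heap is attractive.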

Page 36: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Off heap in 2.0
Compression metadata: ~1-3GB per TB compressed

[Read-path diagram repeated from the preceding slides.]

Page 37: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Off heap in 2.0
Partition index summary (depends on rows per partition)

[Read-path diagram repeated from the preceding slides.]

Page 38: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Compaction
• Single-pass, always
• LCS performs STCS in L0

Page 39: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Healthy leveled compaction

[Diagram: sstable levels L0 through L5.]

Page 40: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Sad leveled compaction

[Diagram: sstable levels L0 through L5.]

Page 41: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

STCS in L0

[Diagram: sstable levels L0 through L5.]
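
Size-tiered compaction groups sstables of similar size and merges a group once it has enough members. The Python sketch below is a simplified model of that bucketing applied to a backed-up L0, not Cassandra's implementation. The thresholds (`low`, `high`, `min_threshold`) and the sample sizes are illustrative values chosen for the example.

```python
def size_tiered_buckets(sizes, low=0.5, high=1.5):
    """Group sizes so each stays within [low, high] x its bucket's average."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            if low * average <= size <= high * average:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    return buckets

def compaction_candidates(sizes, min_threshold=4):
    """Buckets with enough similar-sized sstables to be worth merging."""
    return [b for b in size_tiered_buckets(sizes) if len(b) >= min_threshold]

l0_sizes_mb = [10, 11, 9, 12, 160, 150, 10]  # hypothetical L0 sstable sizes
print(size_tiered_buckets(l0_sizes_mb))
print(compaction_candidates(l0_sizes_mb))
```

Here the five small sstables form one bucket and would be merged together, while the two large ones wait for more peers.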

Page 42: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Rapid Read Protection

[Chart; surviving label: NONE]

Page 43: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Cassandra 2.1

Page 44: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

User defined types

CREATE TYPE address (
  street text,
  city text,
  zip_code int,
  phones set<text>
)

CREATE TABLE users (
  id uuid PRIMARY KEY,
  name text,
  addresses map<text, address>
)

SELECT id, name, addresses.city, addresses.phones FROM users;

 id       | name    | addresses.city | addresses.phones
----------+---------+----------------+--------------------------
 63bf691f | jbellis | Austin         | {'512-4567', '512-9999'}

Page 45: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Collection indexing

CREATE TABLE songs (
  id uuid PRIMARY KEY,
  artist text,
  album text,
  title text,
  data blob,
  tags set<text>
);

CREATE INDEX song_tags_idx ON songs(tags);

SELECT * FROM songs WHERE 'blues' IN tags;

 id       | album         | artist            | tags                  | title
----------+---------------+-------------------+-----------------------+------------------
 5027b27e | Country Blues | Lightnin' Hopkins | {'acoustic', 'blues'} | Worrying My Mind

Page 46: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

Inefficient bloom filters

[Diagram sequence over four slides; only the symbols "+", "=" and "?" survive.]

Page 50: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

HyperLogLog applied

Page 51: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

HLL and compaction
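
One plausible reading of these slides is that a HyperLogLog sketch kept per sstable lets compaction estimate how many distinct partitions the merged output will hold, so its bloom filter can be sized for the union instead of the sum of the inputs. The Python below is a small, generic HyperLogLog written for illustration, not the code Cassandra uses. The register count, the SHA-1 based hash, and the integer keys standing in for partition keys are all assumptions of the sketch.

```python
import hashlib
import math

P = 12                      # index bits; 2**12 registers (sketch-only choice)
M = 1 << P
ALPHA = 0.7213 / (1 + 1.079 / M)

def _hash64(value):
    """64-bit hash taken from the first 8 bytes of SHA-1."""
    digest = hashlib.sha1(str(value).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def hll_build(values):
    """Build the register array for a collection of keys."""
    registers = [0] * M
    for value in values:
        h = _hash64(value)
        index = h >> (64 - P)                    # top P bits pick a register
        rest = h & ((1 << (64 - P)) - 1)         # remaining 64-P bits
        rank = (64 - P) - rest.bit_length() + 1  # leading zeros in rest, plus 1
        registers[index] = max(registers[index], rank)
    return registers

def hll_merge(a, b):
    """Sketch of the union: element-wise maximum of the registers."""
    return [max(x, y) for x, y in zip(a, b)]

def hll_estimate(registers):
    """Cardinality estimate with the small-range (linear counting) correction."""
    raw = ALPHA * M * M / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * M and zeros:
        return M * math.log(M / zeros)
    return raw

# Two hypothetical sstables whose partition keys overlap by 20,000.
sstable_a = hll_build(range(0, 60_000))
sstable_b = hll_build(range(40_000, 100_000))
union_estimate = hll_estimate(hll_merge(sstable_a, sstable_b))
naive_sum = hll_estimate(sstable_a) + hll_estimate(sstable_b)
print(round(union_estimate), round(naive_sum))
```

The merged estimate tracks the true union of 100,000 keys, while adding the two per-sstable estimates overcounts the overlap and would lead to an oversized filter.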

Page 54: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

More-efficient repair

[Diagram sequence over nine slides; no text other than the title survives.]

Page 63: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1

2.1 roadmap
• January 2014

Page 64: C* Summit EU 2013: Keynote by Jonathan Ellis — Cassandra 2.0 & 2.1


Questions?