Cassandra - lesson learned Andrzej Ludwikowski

Page 1: Cassandra  - lesson learned

Cassandra - lesson learned

Andrzej Ludwikowski

Page 2: Cassandra  - lesson learned

About me?
- http://aludwikowski.blogspot.com/
- https://github.com/aludwiko
- @aludwikowski
- SoftwareMill

Page 3: Cassandra  - lesson learned

Why Cassandra?
- BigData!!!
  - Volume (petabytes of data, trillions of entities)
  - Velocity (real-time, streams, millions of transactions per second)
  - Variety (un-, semi-, structured)
- Near-linear horizontal scaling (in proper use cases)
- Fully distributed, with no single point of failure
- Data replication
  - By default

Page 4: Cassandra  - lesson learned

What is Cassandra vs CAP?
- CAP Theorem - pick two

Page 7: Cassandra  - lesson learned

Origins?

2010

Page 8: Cassandra  - lesson learned

Name?

Page 10: Cassandra  - lesson learned

Write path

- Any node can coordinate any request (NSPOF)
- Replication Factor
- Consistency Level

[Diagram: a client (driver) connected to a ring of four nodes (Node 1 to Node 4), with RF=3 and CL=2]
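
How RF and CL interact on a write can be sketched in a few lines of Python. This is a toy model, not driver or server code: `write_succeeds` is a hypothetical helper, and each boolean stands for "this replica acknowledged the write".

```python
from typing import Sequence


def write_succeeds(replica_acks: Sequence[bool], consistency_level: int) -> bool:
    """Return True when at least `consistency_level` replicas acknowledged the write."""
    return sum(replica_acks) >= consistency_level


# RF=3: the coordinator forwards the write to three replicas.
# CL=2: the client gets a success once two of them have acknowledged.
print(write_succeeds([True, True, False], consistency_level=2))   # True
print(write_succeeds([True, False, False], consistency_level=2))  # False
```

With RF=3 and CL=2 one replica can be down and the write still succeeds; with two replicas down it fails.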

Page 14: Cassandra  - lesson learned

Write path - consistent hashing

- Token ring from -2^63 to 2^63 - 1
- Partitioner: partition key -> token

[Diagram: client, partitioner and a ring of four nodes (Node 1 to Node 4) drawn with simplified token ranges 0-25, 25-50, 51-75 and 76-100; the partitioner maps the example key to token 77, which is then marked on the ring]
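
The token lookup can be sketched as follows, using the simplified 0-100 ring from the slides. Several things here are assumptions made for illustration: which node owns which range, the use of MD5 as a stand-in for the real partitioner hash, and placing the extra replicas on the next nodes clockwise.

```python
import bisect
import hashlib

# Assumed ownership: each node owns the tokens up to and including its upper bound.
UPPER_BOUNDS = [25, 50, 75, 100]
NODES = ["Node 1", "Node 2", "Node 3", "Node 4"]


def token_for(partition_key: str) -> int:
    """Map a partition key onto the toy 0..100 ring (MD5 is only a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 101


def replicas_for(token: int, rf: int) -> list:
    """Return the owner of `token` followed by the next rf - 1 nodes on the ring."""
    first = bisect.bisect_left(UPPER_BOUNDS, token)
    return [NODES[(first + i) % len(NODES)] for i in range(rf)]


print(replicas_for(77, rf=3))  # ['Node 4', 'Node 1', 'Node 2']
```

Because the token depends only on the partition key, any coordinator can compute the same replica list without asking a central directory.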

Page 19: Cassandra  - lesson learned

DEMO

Page 20: Cassandra  - lesson learned

Write path - problems?

- Hinted handoff
- Retry idempotent inserts
  - built-in policies
- Lightweight transactions (Paxos)
- Batches

[Diagram: client and the four-node ring with token ranges 0-25, 25-50, 51-75 and 76-100, writing token 77]
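
Hinted handoff, the first item above, can be illustrated with a toy coordinator. This is a single-process sketch under simplifying assumptions, not how Cassandra stores or expires hints; `ToyCoordinator` and its methods are hypothetical names.

```python
class ToyCoordinator:
    """Keeps a hint for every write that could not reach a replica."""

    def __init__(self, replicas):
        self.data = {name: {} for name in replicas}    # per-replica storage
        self.up = {name: True for name in replicas}    # liveness flags
        self.hints = []                                # (replica, key, value)

    def write(self, key, value):
        """Write to every live replica, store a hint for the others, return the ack count."""
        acks = 0
        for name in self.data:
            if self.up[name]:
                self.data[name][key] = value
                acks += 1
            else:
                self.hints.append((name, key, value))
        return acks

    def replay_hints(self):
        """Deliver stored hints to replicas that are back up; keep the rest."""
        remaining = []
        for name, key, value in self.hints:
            if self.up[name]:
                self.data[name][key] = value
            else:
                remaining.append((name, key, value))
        self.hints = remaining


cluster = ToyCoordinator(["Node 1", "Node 2", "Node 3"])
cluster.up["Node 3"] = False
print(cluster.write("k", "v"))      # 2 acks, one hint stored
cluster.up["Node 3"] = True
cluster.replay_hints()
print(cluster.data["Node 3"]["k"])  # v
```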

Page 25: Cassandra  - lesson learned

Write path - node level

Page 26: Cassandra  - lesson learned

Write path - why so fast?

- Commit log - append only
- Periodic (10s) or batch sync to disk
- Network topology aware

50,000 t/s = 50 t/ms = 5 t/100us = 1 t/20us

[Diagram: client writing to four nodes (Node 1 to Node 4) with RF=2 and CL=2; the nodes are shown split across Rack 1 and Rack 2, and then across an Asia DC and a Europe DC]
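
The throughput figures on these slides are one rate expressed at different scales. A small check, with `microseconds_per_transaction` as a hypothetical helper name:

```python
def microseconds_per_transaction(transactions_per_second: int) -> float:
    """Time budget for a single transaction at the given sustained rate."""
    return 1_000_000 / transactions_per_second


print(microseconds_per_transaction(50_000))  # 20.0
```

At 50,000 transactions per second each write has about 20 microseconds, which is why an append-only commit log with deferred sync matters.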

Page 32: Cassandra  - lesson learned

Read path

- Most recent wins
- Eager retries
- In-memory
  - MemTable
  - Row Cache
  - Bloom Filters
  - Key Caches
  - Partition Summaries
- On disk
  - Partition Indexes
  - SSTables

[Diagram: client reading from four nodes (Node 1 to Node 4) with RF=3 and CL=3; the three replicas answer with timestamp 67, timestamp 99 and timestamp 88]
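
"Most recent wins" means the coordinator keeps the replica response with the highest write timestamp. A minimal sketch using the three timestamps from the diagram; the values attached to them are made up for illustration.

```python
def most_recent(responses):
    """Return the response carrying the highest write timestamp."""
    return max(responses, key=lambda response: response["timestamp"])


# Timestamps come from the slide; the values are hypothetical.
responses = [
    {"timestamp": 67, "value": "old"},
    {"timestamp": 99, "value": "newest"},
    {"timestamp": 88, "value": "newer"},
]

print(most_recent(responses)["value"])  # newest
```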

Page 33: Cassandra  - lesson learned

Immediate vs. Eventual Consistency

- if (writeCL + readCL) > replication_factor then immediate consistency
  - writeCL=ALL, readCL=1
  - writeCL=1, readCL=ALL
  - writeCL=QUORUM, readCL=QUORUM
- If "stale" is measured in milliseconds, how much are those milliseconds worth?

[Diagram: client and four nodes (Node 1 to Node 4), RF=3]
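
The rule above is plain arithmetic, so it is easy to check for RF=3. In this sketch consistency levels are expressed as replica counts; `quorum` and `is_immediately_consistent` are hypothetical helper names.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def is_immediately_consistent(write_cl: int, read_cl: int, replication_factor: int) -> bool:
    """True when every read set must overlap every write set in at least one replica."""
    return write_cl + read_cl > replication_factor


RF = 3
print(is_immediately_consistent(RF, 1, RF))                    # ALL + ONE -> True
print(is_immediately_consistent(1, RF, RF))                    # ONE + ALL -> True
print(is_immediately_consistent(quorum(RF), quorum(RF), RF))   # QUORUM + QUORUM -> True
print(is_immediately_consistent(1, 1, RF))                     # ONE + ONE -> False
```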

Page 34: Cassandra  - lesson learned

Modeling - new mindset
- QDD, Query Driven Development
- Nesting is ok
- Duplication is ok
- Writes are cheap

Page 35: Cassandra  - lesson learned

QDD - Conceptual model
- Technology independent
- Chen notation

Page 36: Cassandra  - lesson learned

QDD - Application workflow

Page 37: Cassandra  - lesson learned

QDD - Logical model

- Chebotko diagram

Page 38: Cassandra  - lesson learned

QDD - Physical model

- Technology dependent
- Analysis and validation (finding problems)
- Physical optimization (fixing problems)
- Data types

Page 39: Cassandra  - lesson learned

Physical storage

- Primary key
- Partition key

CREATE TABLE videos (
    id int,
    title text,
    runtime int,
    year int,
    PRIMARY KEY (id)
);

 id | title               | runtime | year
----+---------------------+---------+------
  1 | dzien swira         |      93 | 2002
  2 | chlopaki nie placza |      96 | 2000
  3 | psy                 |     104 | 1992
  4 | psy 2               |      96 | 1994

[Diagram: four partitions keyed by id (1 to 4), each storing title, runtime and year]

-- title is not part of the primary key, so this query is rejected
-- unless ALLOW FILTERING or a secondary index is used.
SELECT * FROM videos WHERE title = 'dzien swira';

Page 40: Cassandra  - lesson learned

Physical storage

- Primary key (could be compound)
- Partition key
- Clustering column (order, uniqueness)

CREATE TABLE videos_with_clustering (
    title text,
    runtime int,
    year int,
    PRIMARY KEY ((title), year)
);

 title       | year | runtime
-------------+------+---------
 godzilla    | 1954 |      98
 godzilla    | 1998 |     140
 godzilla    | 2014 |     123
 psy         | 1992 |     104

[Diagram: partition "godzilla" holds three rows ordered by year (1954 -> runtime 98, 1998 -> runtime 140, 2014 -> runtime 123); partition "psy" holds one row (1992 -> runtime 104)]

SELECT * FROM videos_with_clustering WHERE title = 'godzilla';

Page 41: Cassandra  - lesson learned

Physical storage

- Primary key (could be compound)
- Partition key (could be composite)
- Clustering column (order, uniqueness)

CREATE TABLE videos_with_composite_pk (
    title text,
    runtime int,
    year int,
    PRIMARY KEY ((title, year))
);

 title       | year | runtime
-------------+------+---------
 godzilla    | 1954 |      98
 godzilla    | 1998 |     140
 godzilla    | 2014 |     123
 psy         | 1992 |     104

[Diagram: four separate partitions: godzilla:1954 (runtime 98), godzilla:1998 (runtime 140), godzilla:2014 (runtime 123) and psy:1992 (runtime 104)]

SELECT * FROM videos_with_composite_pk WHERE title = 'godzilla' AND year = 1954;
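
The difference between a clustering column and a composite partition key can be mimicked with plain Python dictionaries. This is only a mental model of the layout, not Cassandra's on-disk format; the variable names are made up.

```python
from collections import defaultdict

rows = [
    ("godzilla", 1954, 98),
    ("godzilla", 1998, 140),
    ("godzilla", 2014, 123),
    ("psy", 1992, 104),
]

# PRIMARY KEY ((title), year): one partition per title, rows keyed by year inside it.
by_title = defaultdict(dict)
for title, year, runtime in rows:
    by_title[title][year] = runtime

# PRIMARY KEY ((title, year)): one partition per (title, year) pair.
by_title_and_year = {(title, year): runtime for title, year, runtime in rows}

# A title alone locates a whole partition in the first layout ...
print(sorted(by_title["godzilla"].items()))
# ... while the second layout needs the full (title, year) pair.
print(by_title_and_year[("godzilla", 1954)])
print(len(by_title), len(by_title_and_year))  # 2 partitions vs 4 partitions
```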

Page 42: Cassandra  - lesson learned

Modeling - clustering column(s)

Q: Retrieve videos an actor has appeared in (newest first).

CREATE TABLE videos_by_actor (
    actor text,
    added_date timestamp,
    video_id timeuuid,
    character_name text,
    description text,
    encoding frozen<video_encoding>,
    tags set<text>,
    title text,
    user_id uuid,
    PRIMARY KEY ((actor), added_date, video_id, character_name)
) WITH CLUSTERING ORDER BY (added_date DESC);

The primary key is built up step by step:
- PRIMARY KEY ((actor), added_date): one partition per actor, rows sorted by added_date, newest first
- PRIMARY KEY ((actor), added_date, video_id): two videos with the same added_date no longer overwrite each other
- PRIMARY KEY ((actor), added_date, video_id, character_name): an actor playing several characters in one video gets one row per character

Page 47: Cassandra  - lesson learned

Modeling - compound partition key

Q: Retrieve the last 1000 measurements from a given day.

First attempt - partition by weather station only:

CREATE TABLE temperature_by_day (
    weather_station_id text,
    date text,
    event_time timestamp,
    temperature text,
    PRIMARY KEY ((weather_station_id), date, event_time)
) WITH CLUSTERING ORDER BY (date DESC, event_time DESC);

Rows in one station's partition (the numbers imply one measurement per second):
1 day = 86 400 rows
1 week = 604 800 rows
1 month = 2 592 000 rows
1 year = 31 536 000 rows

Better - make the date part of the partition key:

CREATE TABLE temperature_by_day (
    weather_station_id text,
    date text,
    event_time timestamp,
    temperature text,
    PRIMARY KEY ((weather_station_id, date), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
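
The partition-size figures for this table follow from one assumption: each weather station reports once per second. The sketch below reproduces them; `rows_in_partition` is a hypothetical helper, and the month and year use 30 and 365 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def rows_in_partition(days: int, measurements_per_second: int = 1) -> int:
    """Rows accumulated in one partition that covers `days` days of measurements."""
    return days * SECONDS_PER_DAY * measurements_per_second


for label, days in [("1 day", 1), ("1 week", 7), ("1 month", 30), ("1 year", 365)]:
    print(label, "=", rows_in_partition(days), "rows")
```

With (weather_station_id, date) as the partition key a partition never covers more than one day, so it stays at the "1 day" figure no matter how long the station runs.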

Page 50: Cassandra  - lesson learned

Modeling - TTL

CREATE TABLE temperature_by_day (
    weather_station_id text,
    date text,
    event_time timestamp,
    temperature text,
    PRIMARY KEY ((weather_station_id, date), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

Retention policy - keep data only from the last week.

INSERT INTO temperature_by_day … USING TTL 604800;
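
The TTL is given in seconds, so 604800 is simply one week. A small sketch for deriving such values instead of hard-coding them; `ttl_seconds` is a hypothetical helper.

```python
from datetime import timedelta


def ttl_seconds(retention: timedelta) -> int:
    """Convert a retention period into the whole number of seconds USING TTL expects."""
    return int(retention.total_seconds())


print(ttl_seconds(timedelta(weeks=1)))  # 604800
```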

Page 51: Cassandra  - lesson learned

Modeling - bit map index

CREATE TABLE car (
    year text,
    model text,
    color text,
    vehicle_id int,
    // other columns
    PRIMARY KEY ((year, model, color), vehicle_id)
);

Q: Find car by year and/or model and/or color.

INSERT INTO car (year, model, color, vehicle_id, ...) VALUES ('2000', 'Multipla', 'blue', 13, ...);
INSERT INTO car (year, model, color, vehicle_id, ...) VALUES ('2000', 'Multipla', '', 13, ...);
INSERT INTO car (year, model, color, vehicle_id, ...) VALUES ('2000', '', 'blue', 13, ...);
INSERT INTO car (year, model, color, vehicle_id, ...) VALUES ('2000', '', '', 13, ...);
INSERT INTO car (year, model, color, vehicle_id, ...) VALUES ('', 'Multipla', 'blue', 13, ...);
INSERT INTO car (year, model, color, vehicle_id, ...) VALUES ('', 'Multipla', '', 13, ...);
INSERT INTO car (year, model, color, vehicle_id, ...) VALUES ('', '', 'blue', 13, ...);

SELECT * FROM car WHERE year = '2000' AND model = '' AND color = 'blue';
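
The seven inserts are every combination of "real value" and "empty string" for the three attributes, minus the all-empty one. They can be generated rather than written by hand; in this sketch `index_rows` is a hypothetical helper and all three attributes are passed as text, which is an assumption.

```python
from itertools import product


def index_rows(year: str, model: str, color: str):
    """All (year, model, color) key combinations to insert for one car.

    Each attribute is either its real value or '' (meaning "not filtered on");
    the combination with all three empty is skipped.
    """
    options = [(year, ""), (model, ""), (color, "")]
    return [combo for combo in product(*options) if any(combo)]


for combo in index_rows("2000", "Multipla", "blue"):
    print(combo)
```

With n searchable attributes this produces 2^n - 1 rows per car, so the approach only suits a small, fixed set of attributes.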

Page 52: Cassandra  - lesson learned

Modeling - wide rows

CREATE TABLE user (
    email text,
    name text,
    age int,
    PRIMARY KEY (email)
);

Q: Find user by email.

Page 53: Cassandra  - lesson learned

Modeling - wide rows

CREATE TABLE user (
    domain text,
    user text,
    name text,
    age int,
    PRIMARY KEY ((domain), user)
);

Q: Find user by email.
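
With the table partitioned by domain, a lookup by email has to split the address into its two key parts first. A minimal sketch, assuming the part after the last "@" is the domain; `split_email` and the query string are illustrative, not taken from the slides.

```python
def split_email(email: str):
    """Split 'user@domain' into (domain, user), the partition key and clustering column."""
    user, separator, domain = email.rpartition("@")
    if not separator or not user or not domain:
        raise ValueError(f"not an email address: {email!r}")
    return domain, user


# Hypothetical parameterised query against the (domain, user) table.
QUERY = "SELECT name, age FROM user WHERE domain = ? AND user = ?"

print(split_email("andrzej@example.com"))  # ('example.com', 'andrzej')
```

All users of one domain then live in a single partition, sorted by user name.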

Page 54: Cassandra  - lesson learned

Modeling - versioning with lightweight transactions

CREATE TABLE document (
    id text,
    content text,
    version int,
    locked_by text,
    PRIMARY KEY ((id))
);

INSERT INTO document (id, content, version)
VALUES ('my doc', 'some content', 1)
IF NOT EXISTS;

UPDATE document
SET locked_by = 'andrzej'
WHERE id = 'my doc'
IF locked_by = null;

UPDATE document
SET content = 'better content', version = 2, locked_by = null
WHERE id = 'my doc'
IF locked_by = 'andrzej';
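
Each IF clause above is a compare-and-set: the change is applied only when the condition still holds. The same flow in a single-process Python sketch; in Cassandra the check and the write are made atomic across replicas by Paxos, which this toy does not model. `Document`, `try_lock` and `update_and_unlock` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    content: str
    version: int = 1
    locked_by: Optional[str] = None


def try_lock(doc: Document, user: str) -> bool:
    """Mirror of: UPDATE ... SET locked_by = user IF locked_by = null."""
    if doc.locked_by is None:
        doc.locked_by = user
        return True
    return False


def update_and_unlock(doc: Document, user: str, content: str) -> bool:
    """Mirror of: UPDATE ... SET content, version, locked_by = null IF locked_by = user."""
    if doc.locked_by == user:
        doc.content = content
        doc.version += 1
        doc.locked_by = None
        return True
    return False


doc = Document("some content")
print(try_lock(doc, "andrzej"))                              # True
print(try_lock(doc, "someone else"))                         # False, already locked
print(update_and_unlock(doc, "andrzej", "better content"))   # True
print(doc.version, doc.locked_by)                            # 2 None
```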

Page 55: Cassandra  - lesson learned

Modeling - JSON with UDT and tuples

{
  "title": "Example Schema",
  "type": "object",
  "properties": {
    "firstName": "andrzej",
    "lastName": "ludwikowski",
    "age": {
      "description": "Age in years",
      "type": "integer",
      "minimum": 0
    }
  },
  "x_dimension": "1",
  "y_dimension": "2"
}

CREATE TYPE age (
    description text,
    type text,
    minimum int
);

CREATE TYPE prop (
    firstName text,
    lastName text,
    age frozen<age>
);

CREATE TABLE json (
    title text,
    type text,
    properties list<frozen<prop>>,
    dimensions tuple<int, int>,
    PRIMARY KEY (title)
);

Page 56: Cassandra  - lesson learned

Common use cases

- Sensor data (Zonar)
- Fraud detection (Barracuda)
- Playlist and collections (Spotify)
- Personalization and recommendation engines (Ebay)
- Messaging (Instagram)

Page 57: Cassandra  - lesson learned

Common anti use cases

- Queue
- Search engine
