Cassandra Data Modeling - Practical Considerations @ Netflix

Cassandra Data modeling

Practical considerations

Nitish Korla

Why Cassandra?

• High availability / fully distributed
• Scalability (linear)
• Write performance
• Simple to install and operate
• Multi-region replication support (bi-directional)

Cassandra footprint @ Netflix

• 60+ Cassandra clusters
• 1600+ nodes holding 100+ TB data
• AWS 500 IOPS -> 100,000 IOPS
• Streaming data completely persisted in Cassandra

• Related Open Source Projects
  – Cassandra/Astyanax : in-house committer
  – Priam : Cassandra Automation
  – Test Tools : jmeter
  – http://github.com/netflix


Data Model

keyspace
  column family
    row
      column
        • name
        • value
        • timestamp

Cassandra           RDBMS Equivalent
KEYSPACE            DATABASE/SCHEMA
COLUMN FAMILY       TABLE
ROW                 ROW
FLEXIBLE COLUMNS    DEFINED COLUMNS

Data Model: columns sorted by comparator

Row key   Columns (name = value)
356       name = Paul, group = 34567, sex = male
54        name = kim, group = 34566, sex = female

Other columns shown on the slide: status = single, zip = 94538

Composite columns: columns sorted by composite comparators

Row key      Columns (name = value)
population   US:CA:Fremont = 54353, US:CA:Hayward = 34343, US:CA:San Jose = 987556
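The layout above can be sketched in a few lines of Python. This is only a toy model, not Cassandra's implementation: the `Row` class and its methods are hypothetical names, Python's default ordering stands in for the comparator, and composite column names are modeled as tuples.

```python
from collections import namedtuple

# A column is a (name, value, timestamp) triple, as on the slide.
Column = namedtuple("Column", ["name", "value", "timestamp"])


class Row:
    """Toy row (hypothetical class): columns come back sorted by name."""

    def __init__(self):
        self._columns = {}

    def insert(self, name, value, timestamp):
        current = self._columns.get(name)
        # Last write wins: keep the column with the newest timestamp.
        if current is None or timestamp >= current.timestamp:
            self._columns[name] = Column(name, value, timestamp)

    def columns(self):
        # Python's default ordering stands in for the comparator.
        return [self._columns[name] for name in sorted(self._columns)]


# Simple column names (strings).
user = Row()
user.insert("name", "Paul", 1)
user.insert("sex", "male", 1)
user.insert("group", 34567, 1)
simple_names = [c.name for c in user.columns()]

# Composite column names as tuples; tuples sort component by component.
population = Row()
population.insert(("US", "CA", "San Jose"), 987556, 1)
population.insert(("US", "CA", "Fremont"), 54353, 1)
population.insert(("US", "CA", "Hayward"), 34343, 1)
composite_names = [c.name for c in population.columns()]
```

Regardless of insertion order, the columns are returned in comparator order, which is what makes slices over a range of column names cheap.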

Do your Homework

① Understand your application requirements

② Identify your access patterns

③ Model around these access patterns

④ Denormalization is your new friend but…

⑤ Benchmark – Avoid Surprises

Cost of getting it wrong is high

Example 1 : Edge Service

Edge Services Data Model

Column family on the EDGE SERVICE CLUSTER:

ROWID                     ALLOCATION   ACTIVE      SCRIPT
Script_location_version   000          YES OR NO   text

Example rows:

ROWID        alloc   active   script
/xyz/jkl_1   000     yes      text
/xyl/jkl_2   111     yes      text
/xyl/jkl_3   222     yes      text

Edge Service Anti patterns

• High concurrency: Edge servers auto scale
• Range scans: Read all data
• Large payload: ~1MB of data

Very high read latency / unstable Cassandra

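Solution: inverted index

[Diagram: the client issues two reads, labeled 1 and 2 on the slide, against the scripts and Script_index column families.]

scripts

ROWID        alloc   active   script
/xyz/jkl_1   000     yes      text
/xyl/jkl_2   111     yes      text
/xyl/tml_3   222     yes      text

Script_index

ROWID     Columns (script location = version)
Index_1   /xyz/jkl = 1, /xyz/jzp = 2, /xyz/plm = 1, /xyz/tml = 3, /xyz/urs = 1, /xyz/zjkl = 2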

Inverted Index considerations

• Column name can be used as a row key placeholder

• Hotspots!!
• Sharding
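A minimal sketch of a sharded inverted index, using plain dictionaries in place of column families. The shard count, the `crc32`-based shard choice, and all function names are assumptions made for illustration; the talk does not say how the index rows were sharded.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count, not from the talk

scripts = {}       # row key "<location>_<version>" -> columns
script_index = {}  # row key "Index_<shard>" -> {location: active version}


def index_row_key(location):
    # A stable hash spreads locations across index rows to avoid one hot row.
    shard = zlib.crc32(location.encode("utf-8")) % NUM_SHARDS
    return f"Index_{shard}"


def put_script(location, version, text):
    scripts[f"{location}_{version}"] = {"active": "yes", "script": text}
    # The index column name is the location; its value points at the version.
    script_index.setdefault(index_row_key(location), {})[location] = version


def load_active_scripts():
    # Read the small index rows, then fetch scripts by key: no range scan.
    found = {}
    for shard in range(NUM_SHARDS):
        row = script_index.get(f"Index_{shard}", {})
        for location, version in row.items():
            found[location] = scripts[f"{location}_{version}"]["script"]
    return found


put_script("/xyz/jkl", 1, "script text v1")
put_script("/xyz/tml", 3, "script text v3")
put_script("/xyz/jkl", 2, "script text v2")  # index now points at version 2
active = load_active_scripts()
```

The index columns act as row key placeholders: a reader only touches a fixed, small number of index rows and then does direct key lookups for the large script payloads.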

Other possible improvement

• Textual Data
• Think compression

Upcoming features
- Hadoop integration
- Solr

Example 2: Ratings

RDBMS -> CASSANDRA

user
  id (primary key)
  name
  alias
  email

movie
  id (primary key)
  title
  description

user_movie_rating
  id (primary key)
  userId (foreign key)
  movieId (foreign key)
  rating

Relationships: user 1 : ∞ user_movie_rating, movie 1 : ∞ user_movie_rating

Queries

• Get email of userid 123
• Get title and description of movieId 222
• List all movie names and corresponding ratings for userId 123
• List all users and corresponding rating for movieId 222

CASSANDRA MODEL

user (row key: userId)
123   name = Nitish Korla, alias = buckwild, email = [email protected]

movie (row key: movieId)
223   title = Find Nemo, description = Good luck with that

ratingsByMovie (row key: movieId, column name: userId, value: rating)
222   334 = 2, 455 = 5, 544 = 1, 633 = 2, 789 = 2, 999 = 3

ratingsByUser (row key: userId)
123   222:rating = 4, 222:title = rockstar, 534:rating = 2, 534:title = Finding Nemo, 888:rating = 1, 888:title = Top Guns

Sequence?
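The denormalized model can be sketched with dictionaries standing in for the column families. This is an illustration under assumptions: the function names are hypothetical, composite column names such as `222:rating` are modeled as tuples, and the second rating by user 334 is taken from the ratingsByMovie row on the slide.

```python
# Movie titles as they appear in the ratingsByUser row on the slide.
movies = {
    222: {"title": "rockstar"},
    534: {"title": "Finding Nemo"},
    888: {"title": "Top Guns"},
}

ratings_by_user = {}   # userId -> {(movieId, "rating" | "title"): value}
ratings_by_movie = {}  # movieId -> {userId: rating}


def rate(user_id, movie_id, rating):
    """One logical write fans out to both query-specific column families."""
    user_row = ratings_by_user.setdefault(user_id, {})
    user_row[(movie_id, "rating")] = rating
    # Denormalized copy of the title, so the read needs no join.
    user_row[(movie_id, "title")] = movies[movie_id]["title"]
    ratings_by_movie.setdefault(movie_id, {})[user_id] = rating


def movies_rated_by(user_id):
    # "List all movie names and corresponding ratings for userId"
    row = ratings_by_user.get(user_id, {})
    movie_ids = sorted({movie_id for movie_id, _ in row})
    return [(row[(m, "title")], row[(m, "rating")]) for m in movie_ids]


def ratings_for(movie_id):
    # "List all users and corresponding rating for movieId"
    return sorted(ratings_by_movie.get(movie_id, {}).items())


rate(123, 222, 4)
rate(123, 534, 2)
rate(123, 888, 1)
rate(334, 222, 2)
```

Each of the two list queries is answered from a single row. The price is that every rating is written twice and a title change would have to be propagated to every copy.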

Example 3 : Viewing History

Viewing History

ROWID           Column name              Column value
Subscriber_id   <Timeuuid> : <movieid>   Playback/Bookmark related SERIALIZED DATA
                e.g. 1234454545 : 5466

[Figure: grid of subscriber rows, each holding many columns named <timestamp>_<movieid> (for example 3454545_5634534, 3454546_5, 3454547_5, 3454555_9) whose values are JSON.]

Viewing History compression

ROWID           Column name            Column value
Subscriber_id   <Timeuuid>_<movieid>   Playback/Bookmark related SERIALIZED DATA
                e.g. 1234454545_5466, 1234454546_5466, 1234454547_5466, 1234454548_5466

Steps shown on the slide (numbered 1 to 4):

• Re-sort by movie id
  Movie_id:[{playbackevent1,playbackevent2 ...... } ],
  Movie_id:[{playbackevent1,playbackevent2 ...... } ],
  Movie_id:[{playbackevent1,playbackevent2 ...... } ],
  Movie_id:[{playbackevent1,playbackevent2 ...... } ],
• Compress data
• Store in separate column family

Results

• Reduced data size by 7 times
• Operational processes improved by 10 times
• Money saved: $,$$$,$$$
• Improvement in app read latency
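A rough sketch of the re-sort and compress steps, assuming JSON events and `zlib` compression. The talk does not name the codec or the event fields, so the `position` and `device` fields, the integer timestamps standing in for TimeUUIDs, and the function names are all hypothetical.

```python
import json
import zlib
from collections import defaultdict


def compress_history(columns):
    """columns maps (timestamp, movie_id) -> playback event dict."""
    by_movie = defaultdict(list)
    # Re-sort: group events by movie id, in timestamp order within each movie.
    for (timestamp, movie_id), event in sorted(columns.items()):
        by_movie[str(movie_id)].append({"ts": timestamp, **event})
    # Compress: one blob per subscriber, meant for a separate column family.
    payload = json.dumps(by_movie, sort_keys=True).encode("utf-8")
    return zlib.compress(payload)


def decompress_history(blob):
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical sample data: 200 events alternating between two movies.
history = {
    (1234454545 + i, 5466 if i % 2 else 9545): {"position": i * 10, "device": "tv"}
    for i in range(200)
}
blob = compress_history(history)
restored = decompress_history(blob)

# Size of the uncompressed, one-column-per-event representation.
raw_size = len(json.dumps({f"{ts}_{m}": e for (ts, m), e in history.items()}))
```

Grouping by movie id places similar events next to each other, which tends to help the compressor. The actual ratio depends on the data, so the sketch makes no claim about reproducing the figure quoted in the talk.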

Think Data Archival

• Data stores in Netflix grow exponentially
• Have a process in place to archive data
  – DSE
  – Moving to a separate column family
  – Moving to a separate cluster (non SSD)
  – Setting right expectations w.r.t. latencies with historical data
• Cassandra TTLs

Example 4 : Personalized recommendations

read-modify-write pattern

• Data read and written back (even if data was not modified)
• Large BLOBs

Cassandra under IO pressure
Peak traffic – compaction yet to run – high read latency

read-modify-write pattern

• Do you really need to read data?
• Avoid write if data has not changed – SSTable creation – immutable SSTables created at backend
• Write with a new row key (limit SSTable scans). TTL data
• If a batch process, throttle the write rate to let compactions catch up
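Two of these points, skipping unchanged writes and throttling a batch writer, can be sketched as follows. This is an illustration only: `ChangeAwareWriter` is a hypothetical helper, a dictionary stands in for the column family, and keeping the digests in application memory is an assumption rather than anything described in the talk.

```python
import hashlib
import time


class ChangeAwareWriter:
    """Hypothetical helper: skips unchanged blobs and paces real writes."""

    def __init__(self, store, max_writes_per_second=None, sleep=time.sleep):
        self.store = store
        self.digests = {}  # row key -> digest of the last blob written
        self.min_interval = (
            1.0 / max_writes_per_second if max_writes_per_second else 0.0
        )
        self.sleep = sleep
        self.writes = 0
        self.skipped = 0

    def write(self, row_key, blob):
        digest = hashlib.sha1(blob).hexdigest()
        if self.digests.get(row_key) == digest:
            # Unchanged data: no write, so no extra SSTable data to compact.
            self.skipped += 1
            return False
        if self.min_interval:
            # Crude throttle: a fixed pause before every real write.
            self.sleep(self.min_interval)
        self.store[row_key] = blob
        self.digests[row_key] = digest
        self.writes += 1
        return True


store = {}
pauses = []  # records requested pauses instead of actually sleeping
writer = ChangeAwareWriter(store, max_writes_per_second=100, sleep=pauses.append)
writer.write("user:123", b"recs-v1")  # written
writer.write("user:123", b"recs-v1")  # skipped, identical payload
writer.write("user:123", b"recs-v2")  # written, payload changed
```

Comparing digests avoids reading the large BLOB back just to decide whether it changed. A production batch job would more likely use a token-bucket limiter than a fixed pause.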

Useful Tools

• Cassandra real-time metrics
• Capture schema changes (automatically)

Observations

• Cassandra scales linearly without any noticeable degradation to running cluster
• Self-healing : minimal operational noise
• Developers
  – Mindset needs to shift from normalization to denormalization
  – Need to have reasonable understanding of Cassandra architecture
  – Enjoy the schema change flexibility. No more DDL locks / DBA dependency

Questions

Reading from Cassandra

[Diagram: a client read involves the row cache and key cache, the memtable, and multiple sstables.]

Writing to Cassandra

[Diagram: a client write goes to the commit log (disk) and the memtable (memory); the memtable is flushed to an sstable. Replication factor: 3 (three sstables shown).]
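The write and read paths in these two diagrams can be approximated with a toy single-node model. It is a deliberate simplification: `ToyNode` is a hypothetical class, caches and replication are left out, and reads simply check the newest data first, whereas Cassandra reconciles columns by timestamp.

```python
class ToyNode:
    """Hypothetical single node: commit log, memtable, immutable sstables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []  # append-only, stands in for the on-disk log
        self.memtable = {}    # in-memory, mutable
        self.sstables = []    # oldest first; never modified after a flush
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if self.memtable:
            # A flush produces a new sstable; existing ones stay untouched.
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def read(self, key):
        """Return (value, number of sstables scanned)."""
        if key in self.memtable:
            return self.memtable[key], 0
        scanned = 0
        for sstable in reversed(self.sstables):  # newest first
            scanned += 1
            if key in sstable:
                return sstable[key], scanned
        return None, scanned


node = ToyNode(memtable_limit=2)
node.write("a", 1)
node.write("b", 2)  # memtable full: flushed to the first sstable
node.write("a", 3)
node.write("c", 4)  # flushed to the second sstable
node.write("d", 5)  # still in the memtable
```

In this model, a key that was last written long ago is found only after scanning several sstables, which is the cost that compaction, and writing under a new row key, are meant to limit.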
