An introduction to Apache Cassandra, covering the clustering model and the data model. Presented by Tyler Hobbs at the October 2011 Austin NoSQL meetup.
Intro to Cassandra
Tyler Hobbs
History
Dynamo (clustering) + BigTable (data model) → Cassandra
Users
Clustering
Every node plays the same role
– No masters, slaves, or special nodes
– No single point of failure
Consistent Hashing
[Figure sequence: a ring of nodes at tokens 0, 10, 20, 30, 40, 50. The key “www.google.com” is hashed with md5(“www.google.com”) to token 14 and placed on the ring. With Replication Factor = 3, the key is stored on three nodes.]
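A minimal sketch of the ring lookup pictured above, not Cassandra's actual code. The 0 to 59 ring size, the function names, and the "walk clockwise to the next node tokens" placement rule are assumptions made for illustration. The slide's token 14 is illustrative too, so the toy hash will generally land somewhere else.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 60                        # assumption: toy ring matching the slide's 0..50 scale
NODE_TOKENS = [0, 10, 20, 30, 40, 50]  # node positions from the slide


def token_for(key: str) -> int:
    """Hash a key with MD5 and fold the result onto the toy ring."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % RING_SIZE


def replicas_for(token: int, node_tokens=NODE_TOKENS, rf: int = 3):
    """Return the rf node tokens found by walking clockwise from `token`.

    Assumption: a node owns every token after the previous node's
    position, up to and including its own position.
    """
    ring = sorted(node_tokens)
    start = bisect_left(ring, token) % len(ring)  # first node at or after the token
    return [ring[(start + i) % len(ring)] for i in range(min(rf, len(ring)))]


if __name__ == "__main__":
    print(token_for("www.google.com"))
    print(replicas_for(14))  # the slide's example token
```

Under these assumptions, a key at token 14 lands on the node at 20 and is replicated to the next two nodes on the ring.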
Clustering
Client can talk to any node
Scaling
[Figure sequence: a ring with nodes at 50, 0, 10, 20, 30 and RF = 2. The node at 50 owns the red portion. A new node is then added at 40.]
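To make the effect of adding a node concrete, here is a small sketch that compares primary ownership before and after a node joins at 40. The 0 to 59 token range and the `owner` helper are assumptions for illustration, and replica placement for RF = 2 is ignored.

```python
from bisect import bisect_left

RING_SIZE = 60  # assumption: toy token range 0..59


def owner(token: int, node_tokens) -> int:
    """Return the node token that primarily owns `token` (next node clockwise)."""
    ring = sorted(node_tokens)
    return ring[bisect_left(ring, token) % len(ring)]


before = [0, 10, 20, 30, 50]
after = before + [40]  # add a new node at 40

# Tokens whose primary owner changes when the new node joins.
moved = [t for t in range(RING_SIZE) if owner(t, before) != owner(t, after)]

if __name__ == "__main__":
    print(moved)
```

In this toy model only the tokens between 30 (exclusive) and 40 (inclusive) change hands, all of them taken from the node at 50. Every other node keeps its range.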
Node Failures
[Figure sequence: a ring with nodes at 0, 10, 20, 30, 40, 50 and RF = 2, with the replicas highlighted.]
Consistency, Availability
Consistency
– Can I read stale data?
Availability
– Can I write/read at all?
Tunable Consistency
Consistency
N = Total number of replicas
R = Number of replicas read from
– (before the response is returned)
W = Number of replicas written to
– (before the write is considered a success)
W + R > N gives strong consistency
N = 3, W = 2, R = 2
2 + 2 > 3 ==> strongly consistent
Only 2 of the 3 replicas must be available.
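The rule W + R > N can be checked by brute force: it holds exactly when every possible set of W written replicas shares at least one replica with every possible set of R read replicas. The sketch below is illustrative only and the function names are made up here.

```python
from itertools import combinations


def rule_says_consistent(n: int, r: int, w: int) -> bool:
    """The slide's rule: W + R > N."""
    return w + r > n


def read_always_sees_write(n: int, r: int, w: int) -> bool:
    """Brute force: does every read set overlap every write set?"""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


if __name__ == "__main__":
    print(rule_says_consistent(3, 2, 2), read_always_sees_write(3, 2, 2))
    print(rule_says_consistent(3, 1, 1), read_always_sees_write(3, 1, 1))
```

With N = 3, W = 2, R = 2 any read touches at least one replica that took the latest write. With W = R = 1 a read can miss it.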
Consistency
Tunable Consistency
– Specify N (Replication Factor) per data set
– Specify R, W per operation
– Quorum: N/2 + 1 (integer division)
• R = W = Quorum
• Strong consistency
• Tolerate the loss of N - Quorum replicas
– R, W can also be 1 or N
Availability
Can tolerate the loss of:
– N - R replicas for reads
– N - W replicas for writes
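The quorum size and the loss tolerances above are simple arithmetic. A short sketch, with hypothetical helper names:

```python
def quorum(n: int) -> int:
    """Quorum size for n replicas: N/2 + 1 using integer division."""
    return n // 2 + 1


def tolerated_losses(n: int, r: int, w: int) -> dict:
    """Replicas that can be down while reads and writes still succeed."""
    return {"reads": n - r, "writes": n - w}


if __name__ == "__main__":
    n = 3
    q = quorum(n)
    print(q, tolerated_losses(n, r=q, w=q))
    print(tolerated_losses(n, r=1, w=n))
```

For N = 3, quorum reads and writes each survive the loss of one replica. Choosing R = 1, W = N makes reads survive two losses, while writes survive none.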
CAP Theorem
[Figure sequence: a plot of Availability against Consistency during node or network failure, each axis running to 100%, with regions labeled “Possible” and “Not Possible”. The second build places Cassandra on the plot.]
Clustering
No single point of failure
Replication that works
Scales linearly
– 2x nodes = 2x performance
• For both writes and reads
– Up to hundreds of nodes
Operationally simple
Multi-Datacenter Replication
Data Model
Comes from Google BigTable
Goals
– Minimize disk seeks
– High throughput
– Low latency
– Durable
Data Model
Keyspace
– A collection of Column Families
– Controls replication settings
Column Family
– Kinda resembles a table
Column Families
Static
– Object data
– Similar to a table in a relational database
Dynamic
– Pre-calculated query results
– Materialized views
Static Column Families
Users
zznate   password: *   name: Nate
driftx   password: *   name: Brandon
thobbs   password: *   name: Tyler
jbellis  password: *   name: Jonathan   site: riptano.com
Rows
– Each row has a unique primary key
– Sorted list of (name, value) tuples
• Like a sorted map or dictionary
– The (name, value) tuple is called a “column”
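One way to picture a row is as a small sorted map from column name to value. The `Row` class below is a hypothetical in-memory model for illustration, not how Cassandra stores data on disk.

```python
from bisect import insort


class Row:
    """A row modeled as a sorted list of (name, value) columns."""

    def __init__(self, key: str):
        self.key = key      # the row's unique primary key
        self._names = []    # column names, kept sorted
        self._values = {}   # column name -> value

    def set(self, name: str, value: str) -> None:
        if name not in self._values:
            insort(self._names, name)  # insert while keeping names sorted
        self._values[name] = value

    def columns(self):
        """Return the columns as (name, value) tuples in name order."""
        return [(name, self._values[name]) for name in self._names]


if __name__ == "__main__":
    row = Row("jbellis")
    row.set("site", "riptano.com")
    row.set("name", "Jonathan")
    row.set("password", "*")
    print(row.columns())
```

Columns come back ordered by name regardless of insertion order, which is what makes range reads over column names cheap.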
Dynamic Column Families
[Figure: the “Following” column family. Row keys are zznate, driftx, thobbs, jbellis. The column names in each row are other user names (driftx, thobbs, mdennis, zznate, pcmanus, xedin), shown without values.]
Column Timestamps
– Each column (tuple) has a timestamp
– In the case of a collision, the latest timestamp wins
– Client specifies timestamp with write
– Writes are idempotent
• Infinite retries allowed
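A rough sketch of why client-supplied timestamps make writes idempotent. The `apply_write` helper and the dict-based row are assumptions for illustration. On an exact timestamp tie this sketch simply keeps the existing value, which is a simplification.

```python
def apply_write(row: dict, name: str, value, timestamp: int) -> dict:
    """Apply a column write; the column with the latest timestamp wins."""
    current = row.get(name)  # (value, timestamp) or None
    if current is None or timestamp > current[1]:
        row[name] = (value, timestamp)
    return row


if __name__ == "__main__":
    row = {}
    apply_write(row, "age", 24, timestamp=1)
    apply_write(row, "age", 34, timestamp=2)
    # A retry of the first write arrives late; it changes nothing.
    apply_write(row, "age", 24, timestamp=1)
    print(row)
```

Because the outcome depends only on the timestamps, replaying a write any number of times, in any order, leaves the row in the same state.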
Dynamic Column Families
Other Examples:
– Timeline of tweets by a user
– Timeline of tweets by all of the people a user is following
– List of comments sorted by score
– List of friends grouped by state
The Data API
Two choices
– RPC-based API
– CQL
• Cassandra Query Language
Inserting Data
INSERT INTO users (KEY, "name", "age") VALUES ("thobbs", "Tyler", 24);
Updating Data
Updates are the same as inserts:
INSERT INTO users (KEY, "age") VALUES ("thobbs", 34);
Or
UPDATE users SET "age" = 34 WHERE KEY = "thobbs";
Fetching Data
Whole row select:
SELECT * FROM users WHERE KEY = "thobbs";
Explicit column select:
SELECT "name", "age" FROM users WHERE KEY = "thobbs";
Get a slice of columns:
UPDATE letters SET 1='a', 2='b', 3='c', 4='d', 5='e' WHERE KEY = "key";
SELECT 1..3 FROM letters WHERE KEY = "key";
Returns [(1, a), (2, b), (3, c)]
SELECT FIRST 2 FROM letters WHERE KEY = "key";
Returns [(1, a), (2, b)]
SELECT FIRST 2 REVERSED FROM letters WHERE KEY = "key";
Returns [(5, e), (4, d)]
SELECT 3..'' FROM letters WHERE KEY = "key";
Returns [(3, c), (4, d), (5, e)]
SELECT FIRST 2 REVERSED 4..'' FROM letters WHERE KEY = "key";
Returns [(4, d), (3, c)]
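The slice queries above can be mimicked in a few lines of Python to show how the start, end, FIRST, and REVERSED parts interact. `get_slice` is a hypothetical helper, not part of any Cassandra client, and `None` stands in for the open-ended `''` bound.

```python
def get_slice(columns: dict, start=None, finish=None, first=None, reverse=False):
    """Return (name, value) pairs from a row, ordered by column name.

    start/finish bound the range in iteration order, first caps the
    number of columns, and reverse walks the names from high to low.
    """
    names = sorted(columns, reverse=reverse)

    def in_range(name):
        if reverse:
            # Walking downward: start is the upper bound, finish the lower.
            return (start is None or name <= start) and (finish is None or name >= finish)
        return (start is None or name >= start) and (finish is None or name <= finish)

    selected = [(name, columns[name]) for name in names if in_range(name)]
    return selected if first is None else selected[:first]


if __name__ == "__main__":
    letters = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
    print(get_slice(letters, start=1, finish=3))               # SELECT 1..3
    print(get_slice(letters, first=2))                         # SELECT FIRST 2
    print(get_slice(letters, first=2, reverse=True))           # SELECT FIRST 2 REVERSED
    print(get_slice(letters, start=3))                         # SELECT 3..''
    print(get_slice(letters, start=4, first=2, reverse=True))  # SELECT FIRST 2 REVERSED 4..''
```

Each call reproduces the result listed for the corresponding query on the slides.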
Deleting Data
Delete a whole row:
DELETE FROM users WHERE KEY = "thobbs";
Delete specific columns:
DELETE "age" FROM users WHERE KEY = "thobbs";
Secondary Indexes
Built-in basic indexes
CREATE INDEX ageIndex ON users (age);
SELECT name FROM users WHERE age = 24 AND state = "TX";
Performance
Writes
– 10k to 30k per second per node
– Sub-millisecond latency
Reads
– 1k to 10k per second per node
– Depends on data set, caching
– Usually 0.1 to 10 ms latency
Other Features
Distributed Counters
– Can support millions of high-volume counters
Excellent Multi-datacenter Support
– Disaster recovery
– Locality
Hadoop Integration
– Isolation of resources
– Hive and Pig drivers
Compression
What Cassandra Can't Do
Transactions
– Unless you use a distributed lock
– Atomicity, Isolation
– These aren't needed as often as you'd think
Limited support for ad-hoc queries
– Know what you want to do with the data
Not One-size-fits-all
Use alongside an RDBMS
– Use the RDBMS for highly-transactional or highly-relational data
• Usually a small set of data
– Let Cassandra scale to handle the rest
Language Support
Good:
– Java
– Python
– Ruby
– PHP
– C#
Coming Soon:
– Everything else, now that we have CQL