Don’t Reinvent the Big-Data Wheel! Clint Kelly - @clintwkelly WibiData Building real-time, Big Data applications on Cassandra with the open-source Kiji project Big Data Camp LA 14 June 2014

Kiji cassandra la june 2014 - v02 clint-kelly

DESCRIPTION

Big Data Camp LA 2014, Don't re-invent the Big-Data Wheel, Building real-time, Big Data applications on Cassandra with the open-source Kiji project by Clint Kelly of Wibidata

Page 1: Kiji cassandra la   june 2014 - v02 clint-kelly

Don’t Reinvent the Big-Data Wheel!

Clint Kelly - @clintwkelly

WibiData

Building real-time, Big Data applications on Cassandra with the open-source Kiji project

Big Data Camp LA

14 June 2014

Page 6: Kiji cassandra la   june 2014 - v02 clint-kelly

Agenda

The problem

How Kiji works

Kiji in production

Kiji on Cassandra

Page 7: Kiji cassandra la   june 2014 - v02 clint-kelly

The problem.

Page 17: Kiji cassandra la   june 2014 - v02 clint-kelly

Open source software

Page 26: Kiji cassandra la   june 2014 - v02 clint-kelly

Data in

REST

Page 27: Kiji cassandra la   june 2014 - v02 clint-kelly

Inspect

Page 38: Kiji cassandra la   june 2014 - v02 clint-kelly

Train

“Trained model”

Page 39: Kiji cassandra la   june 2014 - v02 clint-kelly

Model

Page 46: Kiji cassandra la   june 2014 - v02 clint-kelly

Score

Batch

Page 50: Kiji cassandra la   june 2014 - v02 clint-kelly

Data out

REST

Page 69: Kiji cassandra la   june 2014 - v02 clint-kelly

Experiments / Deployment

Page 73: Kiji cassandra la   june 2014 - v02 clint-kelly

3

Page 75: Kiji cassandra la   june 2014 - v02 clint-kelly

Data in / out (REST)

Page 76: Kiji cassandra la   june 2014 - v02 clint-kelly

Inspect and train

Page 78: Kiji cassandra la   june 2014 - v02 clint-kelly

Score (real-time)

Page 80: Kiji cassandra la   june 2014 - v02 clint-kelly

Kiji

Page 81: Kiji cassandra la   june 2014 - v02 clint-kelly

How Kiji works

Page 82: Kiji cassandra la   june 2014 - v02 clint-kelly

Kiji History

Page 121: Kiji cassandra la   june 2014 - v02 clint-kelly

How does it work?

[Architecture diagram, built up step by step over slides 85-121. Labels that appear across the builds:]

KijiSchema (Cassandra): rows for User 1, User 2, User 3, with cells marked C and R

Engineering: Channels; Write (Logs, DBs, KijiMR; Stream, KijiREST); Read (KijiREST)

Data Science: Query (KijiHive); Data (KijiExpress, KijiMR); Scorer

KijiScoring

Kiji Model Repository

Page 122: Kiji cassandra la   june 2014 - v02 clint-kelly

3

Page 123: Kiji cassandra la   june 2014 - v02 clint-kelly

Data in / out

KijiREST

KijiMR

Page 124: Kiji cassandra la   june 2014 - v02 clint-kelly

Inspect and train

KijiHive

KijiMR

KijiExpress

Page 125: Kiji cassandra la   june 2014 - v02 clint-kelly

Score (real-time)

KijiModelRepository

KijiScoring

Page 126: Kiji cassandra la   june 2014 - v02 clint-kelly

Modular

Page 127: Kiji cassandra la   june 2014 - v02 clint-kelly

Kiji in production

Page 128: Kiji cassandra la   june 2014 - v02 clint-kelly

In production now

Fortune 500 retailer: Personalized recommendations

Opower: Energy usage and analytics reporting

Page 129: Kiji cassandra la   june 2014 - v02 clint-kelly

Fortune 500 retailer

Serving personalized recommendations

Page 130: Kiji cassandra la   june 2014 - v02 clint-kelly

Bulk load

[Diagram labels: Engineering, Channels, Write, Logs, DBs, KijiMR, Kiji]

Page 131: Kiji cassandra la   june 2014 - v02 clint-kelly

Train

[Diagram labels: KijiSchema (Cassandra) with rows for User 1, User 2, User 3 and cells marked C; Data Science, Data, KijiExpress, KijiMR]

Page 132: Kiji cassandra la   june 2014 - v02 clint-kelly

Score

[Diagram labels: KijiSchema (Cassandra) with rows for User 1, User 2, User 3 and cells marked C and R; Engineering, Channels, Write, Read, Stream, KijiREST; KijiScoring, Scorer, Kiji Model Repository; Data Science]

Page 133: Kiji cassandra la   june 2014 - v02 clint-kelly

Kiji on Cassandra

Page 136: Kiji cassandra la   june 2014 - v02 clint-kelly

KijiSchema

Cassandra

Page 138: Kiji cassandra la   june 2014 - v02 clint-kelly

KijiSchema

HBase

Page 139: Kiji cassandra la   june 2014 - v02 clint-kelly

Kiji ~ BigTable

Page 141: Kiji cassandra la   june 2014 - v02 clint-kelly

table

[Figure: a table drawn as a stack of rows]

Page 142: Kiji cassandra la   june 2014 - v02 clint-kelly

row

Page 143: Kiji cassandra la   june 2014 - v02 clint-kelly

Row key = entity ID

entity ID | data

Page 144: Kiji cassandra la   june 2014 - v02 clint-kelly

Composite entity IDs

0xfa | “bob” | data

Page 145: Kiji cassandra la   june 2014 - v02 clint-kelly

Column families

0xfa | “bob” | payment | interactions | recommendations

Page 146: Kiji cassandra la   june 2014 - v02 clint-kelly

Columns

0xfa | “bob” | payment:cardnum | payment:address | inter:clicks | inter:search | rec:scorer1 | rec:scorer2

Page 147: Kiji cassandra la   june 2014 - v02 clint-kelly

Timestamped versions

[Figure: the row for 0xfa | “bob” with columns payment:cardnum, payment:address, inter:clicks, inter:search, songs:let it be, rec:scorer1, rec:scorer2 and rec:scorer3. The songs:let it be and rec:scorer3 cells appear several times, as stacked versions. Example timestamps: 1396560123 and 1395650231.]

Page 148: Kiji cassandra la   june 2014 - v02 clint-kelly

Complex data types

record Search {
  string search_term;
  long session_id;
  device_type device;
}

[Figure: the same row for 0xfa | “bob” as on the "Timestamped versions" slide]

Page 159: Kiji cassandra la   june 2014 - v02 clint-kelly

Locality group

[Figure, built up over slides 149-159: column families labeled Batch or Real-time are sorted into two locality groups, locality_group_batch and locality_group_real_time]

On disk. Compressed.

In memory.

Page 160: Kiji cassandra la   june 2014 - v02 clint-kelly

Row ➔ transactional consistency

Page 161: Kiji cassandra la   june 2014 - v02 clint-kelly

Locality group ➔ Column family

CREATE TABLE loc_grp

Page 162: Kiji cassandra la   june 2014 - v02 clint-kelly

Entity ID ➔ Primary key

CREATE TABLE loc_grp (
  city text,
  user text,
  PRIMARY KEY (city, user)
) WITH CLUSTERING ORDER BY (user ASC);

Page 163: Kiji cassandra la   june 2014 - v02 clint-kelly

Family, Qualifier, Version ➔ Clustering Columns

CREATE TABLE loc_grp (
  city text,
  user text,
  family text,
  qualifier text,
  version bigint,
  PRIMARY KEY (city, user, family, qualifier, version)
) WITH CLUSTERING ORDER BY (user ASC, family ASC, qualifier ASC, version DESC);

Page 164: Kiji cassandra la   june 2014 - v02 clint-kelly

Column values ➔ Blobs

CREATE TABLE loc_grp (
  city text,
  user text,
  family text,
  qualifier text,
  version bigint,
  value blob,
  PRIMARY KEY (city, user, family, qualifier, version)
) WITH CLUSTERING ORDER BY (user ASC, family ASC, qualifier ASC, version DESC);
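
The mapping on this slide can be pictured with a small in-memory model. The Python sketch below is an illustration only, not Kiji or Cassandra code: the `Cell` type, the helper names and the sample values ("LA", "alice", the byte strings) are assumptions made for the example. It sorts cells the way the CLUSTERING ORDER BY clause describes, so the newest version of a column is the first one encountered.

```python
from typing import List, NamedTuple, Optional


class Cell(NamedTuple):
    """One row of the hypothetical loc_grp table."""
    city: str       # partition key (first entity ID component)
    user: str       # clustering column (second entity ID component)
    family: str     # Kiji column family
    qualifier: str  # Kiji column qualifier
    version: int    # cell timestamp
    value: bytes    # serialized cell value, stored as a blob


def clustering_key(cell: Cell):
    # user ASC, family ASC, qualifier ASC, version DESC
    return (cell.user, cell.family, cell.qualifier, -cell.version)


def partition(cells: List[Cell], city: str) -> List[Cell]:
    """Return the cells of one partition in clustering order."""
    return sorted((c for c in cells if c.city == city), key=clustering_key)


def latest(cells: List[Cell], city: str, user: str,
           family: str, qualifier: str) -> Optional[Cell]:
    """Return the most recent version of one column, or None."""
    for c in partition(cells, city):
        if (c.user, c.family, c.qualifier) == (user, family, qualifier):
            return c  # first match is the newest because version sorts DESC
    return None


cells = [
    Cell("LA", "bob", "inter", "search", 1395650231, b"old search"),
    Cell("LA", "bob", "inter", "search", 1396560123, b"new search"),
    Cell("LA", "bob", "payment", "address", 1395650231, b"addr"),
    Cell("LA", "alice", "inter", "clicks", 1396560123, b"click"),
]
ordered = partition(cells, "LA")
newest = latest(cells, "LA", "bob", "inter", "search")
```

Keeping the version last in the key and sorting it in descending order is what makes "latest value of a column" a cheap read of the first matching cell.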

Page 169: Kiji cassandra la   june 2014 - v02 clint-kelly

Implementation notes

DataStax Java driver

Cassandra 2.0.6

Async API

New MapReduce InputFormat

Page 170: Kiji cassandra la   june 2014 - v02 clint-kelly

Issues

Page 180: Kiji cassandra la   june 2014 - v02 clint-kelly

Operations across locality groups

Kiji locality group ➔ C* column family

Read across locality groups ➔ multiple C* reads (async API!)

Compare-and-set across locality groups ➔ not allowed in C* Kiji

Lose transactional consistency
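
The "multiple C* reads (async API!)" point can be sketched as one concurrent read per locality group, merged on the client. The Python below is a minimal stand-in using `asyncio`, not the DataStax Java driver: the `TABLES` dictionary, `read_locality_group` and `read_row` are hypothetical names, and the "reads" are simulated in memory.

```python
import asyncio
from typing import Dict, Iterable

# Hypothetical in-memory stand-in for two Cassandra tables,
# one per Kiji locality group.
TABLES: Dict[str, Dict[str, Dict[str, bytes]]] = {
    "locality_group_real_time": {"bob": {"rec:scorer1": b"r1"}},
    "locality_group_batch": {"bob": {"inter:clicks": b"c1"}},
}


async def read_locality_group(table: str, entity_id: str) -> Dict[str, bytes]:
    """Simulate one asynchronous read against a single table."""
    await asyncio.sleep(0)  # yield control, as a network round trip would
    return dict(TABLES[table].get(entity_id, {}))


async def read_row(entity_id: str, tables: Iterable[str]) -> Dict[str, bytes]:
    """Issue one read per locality group concurrently and merge the results."""
    parts = await asyncio.gather(
        *(read_locality_group(t, entity_id) for t in tables)
    )
    row: Dict[str, bytes] = {}
    for part in parts:
        row.update(part)
    return row


row = asyncio.run(read_row("bob", list(TABLES)))
```

Because each locality group is read separately, the merged row is not a single consistent snapshot, which is the consistency trade-off the slide names.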

Page 182: Kiji cassandra la   june 2014 - v02 clint-kelly

Filters

HBase ➔ Rich server-side filters

Cassandra ➔ WHERE clauses

Client-side filtering
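
One way to picture the split between WHERE clauses and client-side filtering is a two-stage pipeline: a coarse selection that a WHERE clause on a clustering column could express, followed by an arbitrary predicate applied after the data reaches the client. The Python sketch below only illustrates that idea. The `Cell` type, both function names and the sample data are assumptions, and the regular-expression match stands in for any filter that is assumed not to be expressible in a WHERE clause.

```python
import re
from typing import Callable, Iterable, Iterator, NamedTuple


class Cell(NamedTuple):
    family: str
    qualifier: str
    version: int
    value: bytes


def server_side(cells: Iterable[Cell], family: str) -> Iterator[Cell]:
    """Stand-in for the selection a WHERE clause on `family` would perform."""
    return (c for c in cells if c.family == family)


def client_side(cells: Iterable[Cell],
                predicate: Callable[[Cell], bool]) -> Iterator[Cell]:
    """Apply an arbitrary predicate after the cells reach the client."""
    return (c for c in cells if predicate(c))


cells = [
    Cell("rec", "scorer1", 1396560123, b"a"),
    Cell("rec", "scorer2", 1396560123, b"b"),
    Cell("rec", "scorer3", 1396560123, b"c"),
    Cell("inter", "clicks", 1395650231, b"d"),
]

# Keep only rec:scorer1 and rec:scorer2 by matching the qualifier on the client.
pattern = re.compile(r"^scorer[12]$")
matches = list(
    client_side(server_side(cells, "rec"),
                lambda c: pattern.match(c.qualifier) is not None)
)
```

The cost of this design is that every cell passing the coarse selection is shipped to the client before it can be discarded.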

Page 183: Kiji cassandra la   june 2014 - v02 clint-kelly

Project status

Page 184: Kiji cassandra la   june 2014 - v02 clint-kelly

Components working with Cassandra

KijiSchema

KijiMR

KijiREST

KijiExpress

Page 186: Kiji cassandra la   june 2014 - v02 clint-kelly

All code available with tutorial within 1-2 months

Page 187: Kiji cassandra la   june 2014 - v02 clint-kelly

Summary

Page 188: Kiji cassandra la   june 2014 - v02 clint-kelly

3

Page 189: Kiji cassandra la   june 2014 - v02 clint-kelly

Data in / out

KijiREST

KijiMR

Page 190: Kiji cassandra la   june 2014 - v02 clint-kelly

Inspect and train

KijiHive

KijiMR

KijiExpress

Page 191: Kiji cassandra la   june 2014 - v02 clint-kelly

Score (real-time)

KijiModelRepository

KijiScoring

Page 192: Kiji cassandra la   june 2014 - v02 clint-kelly

Thanks to Cassandra community

Mailing lists

Meetups, webinars, conferences
