HBase schema design Big Data TechCon Boston


HBase schema design
Amandeep Khurana | Solutions Architect
Big Data TechCon, Boston, April 2013

Friday, April 12, 13

About me

•Solutions Architect, Cloudera Inc
•Amazon Web Services
•Interested in large scale distributed systems
•Co-author, HBase In Action
•Twitter: amansk

HBase in Action (Manning), by Nick Dimiduk and Amandeep Khurana


About the talk

•Data model recap
•Data modeling thought process
•Tools and techniques


HBase is ...

•Column family oriented database
  •Column family oriented
  •Tables consisting of rows and columns

•Persisted Map
  •Sparse
  •Multi dimensional
  •Sorted
  •Indexed by rowkey, column and timestamp

•Key Value store
  •[rowkey, col family, col qualifier, timestamp] -> cell value


HBase is not ...

•A relational database
  •No SQL query language
  •No joins
  •No secondary indexing
  •No transactions



It's not a relational database system

Data Model recap


Important terms

•Table
  •Consists of rows and columns

•Row
  •Has a bunch of columns
  •Identified by a rowkey (primary key)

•Column Qualifier
  •Dynamic column name

•Column Family
  •Column groups - logical and physical (similar access pattern)

•Cell
  •The actual element that contains the data for a row-column intersection

•Version
  •Every cell has multiple versions


Data coordinates

•Row is addressed using rowkey
•Cell is addressed using [rowkey + family + qualifier]


Tabular representation

Column Family - Info

Rowkey        name                    email              password
GrandpaD      Fyodor Dostoyevsky      [email protected]   abc123
HMS_Surprise  Patrick O'Brien         [email protected]   abc123
SirDoyle      Sir Arthur Conan Doyle  [email protected]   abc123
TheRealMT     Mark Twain              [email protected]   abc123 (ts2), Langhorne (ts1)

ts1=1329088321289 ts2=1329088818321

•The table is lexicographically sorted on the rowkeys
•Each row-column intersection is a cell
•Each cell has multiple versions, typically represented by the timestamp of when they were inserted into the table (ts2>ts1)
•The coordinates used to identify data in an HBase table are: (1) rowkey, (2) column family, (3) column qualifier, (4) version


Key-Value store

Keys                                          Values
[TheRealMT, info, password, 1329088818321] -> abc123
[TheRealMT, info, password, 1329088321289] -> Langhorne

Each key together with its value is a single KeyValue instance


Key-Value store

1. Start with coordinates of full precision

[TheRealMT, info, password, 1329088818321] -> "abc123"

2. Drop version and you're left with a map of version to values

[TheRealMT, info, password] -> {
  1329088818321 : "abc123",
  1329088321289 : "Langhorne"
}

3. Omit qualifier and you have a map of qualifiers to the previous maps

[TheRealMT, info] -> {
  "name" : { 1329088321289 : "Mark Twain" },
  "email" : { 1329088321289 : "[email protected]" },
  "password" : { 1329088818321 : "abc123", 1329088321289 : "Langhorne" }
}

4. Finally, drop the column family and you have a row, a map of maps

[TheRealMT] -> {
  "info" : {
    "name" : { 1329088321289 : "Mark Twain" },
    "email" : { 1329088321289 : "[email protected]" },
    "password" : { 1329088818321 : "abc123", 1329088321289 : "Langhorne" }
  }
}
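The four steps above can be sketched with nested sorted maps from the Java standard library. This is only an in-memory illustration of the logical model, not the HBase client API; the class and method names (SortedMapOfMapsSketch, put, versions, latest) are made up for this example.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory illustration only; this is not the HBase client API.
final class SortedMapOfMapsSketch {
    // rowkey -> column family -> column qualifier -> version -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>>> rows =
            new TreeMap<>();

    void put(String rowkey, String family, String qualifier, long version, String value) {
        rows.computeIfAbsent(rowkey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .computeIfAbsent(qualifier, k -> new TreeMap<>())
            .put(version, value);
    }

    // Dropping the version coordinate leaves a map of version -> value
    // (kept in ascending version order here).
    NavigableMap<Long, String> versions(String rowkey, String family, String qualifier) {
        return rows.getOrDefault(rowkey, new TreeMap<>())
                   .getOrDefault(family, new TreeMap<>())
                   .getOrDefault(qualifier, new TreeMap<>());
    }

    // Without an explicit version, return the value with the highest timestamp.
    String latest(String rowkey, String family, String qualifier) {
        NavigableMap<Long, String> v = versions(rowkey, family, qualifier);
        return v.isEmpty() ? null : v.lastEntry().getValue();
    }

    public static void main(String[] args) {
        SortedMapOfMapsSketch users = new SortedMapOfMapsSketch();
        users.put("TheRealMT", "info", "password", 1329088321289L, "Langhorne");
        users.put("TheRealMT", "info", "password", 1329088818321L, "abc123");
        users.put("TheRealMT", "info", "name", 1329088321289L, "Mark Twain");
        System.out.println(users.latest("TheRealMT", "info", "password"));
        System.out.println(users.versions("TheRealMT", "info", "password"));
    }
}
```

Each level of nesting corresponds to one coordinate being dropped from the key.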


Sorted map of maps

{
  "TheRealMT" : {                                      <- Rowkey
    "info" : {                                         <- Column family
      "name" : { 1329088321289 : "Mark Twain" },       <- Column qualifier : { Version : Value }
      "email" : { 1329088321289 : "[email protected]" },
      "password" : { 1329088818321 : "abc123", 1329088321289 : "Langhorne" }
    }
  }
}


HFiles and physical data model

•HFiles are
  •Immutable
  •Sorted on rowkey + qualifier + timestamp
  •In the context of a column family per region

HFile for the info column family in the users table

"TheRealMT", "info", "email",    1329088321289, "[email protected]"
"TheRealMT", "info", "name",     1329088321289, "Mark Twain"
"TheRealMT", "info", "password", 1329088818321, "abc123"
"TheRealMT", "info", "password", 1329088321289, "Langhorne"
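As a rough sketch of the sort order described above, the comparator below orders cells by rowkey, then qualifier, then timestamp with the newest version first. The Kv class is a hypothetical stand-in for a KeyValue, and it compares Java strings where HBase compares raw bytes, so treat it as an approximation that holds for plain ASCII keys.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class HFileOrderSketch {
    // Hypothetical stand-in for a KeyValue within one column family.
    static final class Kv {
        final String row;
        final String qualifier;
        final long timestamp;
        final String value;

        Kv(String row, String qualifier, long timestamp, String value) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Rowkey ascending, then qualifier ascending, then timestamp descending.
    static final Comparator<Kv> ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) return c;
        return Long.compare(b.timestamp, a.timestamp); // newest version first
    };

    // Returns a sorted copy and leaves the input untouched.
    static List<Kv> sorted(List<Kv> cells) {
        List<Kv> out = new ArrayList<>(cells);
        out.sort(ORDER);
        return out;
    }

    public static void main(String[] args) {
        List<Kv> cells = new ArrayList<>();
        cells.add(new Kv("TheRealMT", "password", 1329088321289L, "Langhorne"));
        cells.add(new Kv("TheRealMT", "password", 1329088818321L, "abc123"));
        cells.add(new Kv("TheRealMT", "name", 1329088321289L, "Mark Twain"));
        for (Kv kv : sorted(cells)) {
            System.out.println(kv.row + ", " + kv.qualifier + ", " + kv.timestamp + ", " + kv.value);
        }
    }
}
```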


... it's a database after all

Thinking through the design


But isn't HBase schema-less?

•Number of tables
•Rowkey design
•Number of column families per table. What goes into what column family
•Column qualifier names
•What goes into the cells
•Number of versions


Rowkeys

•Rowkey design is the single most important aspect of HBase table designs
•The only way to address rows in HBase


TwitBase relationships

•Users follow users
•Relationships need to be persisted for usage later on
•Model tables for the expected access patterns
•Read pattern
  •Who does A follow?
  •Who follows A?
  •Does A follow B?
•Write pattern
  •A follows B
  •A unfollows B


Start simple

•Adjacency list

Column Family : follows
row key: userid
column qualifier: followed user number
cell value: followed userid

Rowkey      follows (Col Qualifier:Cell value)
TheFakeMT   1:TheRealMT  2:MTFanBoy  3:Olivia  4:HRogers
TheRealMT   1:HRogers    2:Olivia


Optimizing the adjacency list

•We need a count
•Where does a new followed user go?

Rowkey      follows
TheFakeMT   1:TheRealMT  2:MTFanBoy  3:Olivia  4:HRogers  count:4
TheRealMT   1:HRogers    2:Olivia    count:2


Adding a new user

Row that needs to be updated: TheFakeMT

Rowkey      follows
TheFakeMT   1:TheRealMT  2:MTFanBoy  3:Olivia  4:HRogers  count:4
TheRealMT   1:HRogers    2:Olivia    count:2

Client code:
Step 1: Get current count
    TheFakeMT : follows: {count -> 4}
Step 2: Update count (increment count)
    TheFakeMT : follows: {count -> 5}
Step 3: Add new entry
    TheFakeMT : follows: {5 -> MTFanBoy2, count -> 5}
Step 4: Write the new data to HBase

Rowkey      follows
TheFakeMT   1:TheRealMT  2:MTFanBoy  3:Olivia  4:HRogers  5:MTFanBoy2  count:5
TheRealMT   1:HRogers    2:Olivia    count:2
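To see why the four client-side steps are fragile, here is a small in-memory sketch of the counter-based row. It is not HBase code; FollowsCounterSketch, readCount and write are hypothetical names, and the second user id in main is invented for the demonstration. Two clients that both read the count before either writes end up claiming the same column qualifier.

```java
import java.util.Map;
import java.util.TreeMap;

// One row of the "follows" family, modeled as qualifier -> value.
final class FollowsCounterSketch {
    final Map<String, String> cells = new TreeMap<>();

    // Step 1: get the current count.
    int readCount() {
        return Integer.parseInt(cells.getOrDefault("count", "0"));
    }

    // Steps 2 to 4: increment the count the client saw, add the entry, write both.
    void write(int countSeen, String followed) {
        int next = countSeen + 1;
        cells.put("count", Integer.toString(next));
        cells.put(Integer.toString(next), followed);
    }

    public static void main(String[] args) {
        FollowsCounterSketch theFakeMT = new FollowsCounterSketch();
        for (String u : new String[] {"TheRealMT", "MTFanBoy", "Olivia", "HRogers"}) {
            theFakeMT.write(theFakeMT.readCount(), u);
        }

        // Two clients interleave: both read before either writes.
        int seenByA = theFakeMT.readCount();
        int seenByB = theFakeMT.readCount();
        theFakeMT.write(seenByA, "MTFanBoy2");
        theFakeMT.write(seenByB, "AnotherUser"); // hypothetical user id

        // Both writes targeted the same qualifier, so the first one is gone.
        System.out.println(theFakeMT.cells);
    }
}
```

Without a transaction around the read and the write, the second client silently overwrites the first client's entry, which is the motivation for simplifying the schema.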


Transactions == not good

•HBase doesn't have native support (think scale)
•Don't want to complicate client side logic
•Only solution -> simplify schema

Rowkey      follows
TheFakeMT   TheRealMT:1  MTFanBoy:1  Olivia:1  HRogers:1
TheRealMT   HRogers:1    Olivia:1


Revisit the questions

•Read pattern
  •Who all does A follow?
  •Who all follows A?
  •Does A follow B?
•Write pattern
  •A follows B
  •A unfollows B


Denormalization

•Second table for reverse relationship
•Otherwise scan across entire table and affect read performance

[Figure: chart of read performance against write performance, with regions labeled Normalization, Denormalization, Dreamland and Poor design]
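A minimal in-memory sketch of the second-table idea follows. Two maps stand in for the two tables, and every write updates both so that either direction can be read without scanning everything. The class and method names are hypothetical, and this does not use the HBase API.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class ReverseRelationshipSketch {
    // "follows" table: user -> users they follow
    private final Map<String, Set<String>> follows = new TreeMap<>();
    // "followedBy" table: user -> users who follow them (the reverse relationship)
    private final Map<String, Set<String>> followedBy = new TreeMap<>();

    // A follows B: one logical write becomes two physical writes.
    void follow(String a, String b) {
        follows.computeIfAbsent(a, k -> new TreeSet<>()).add(b);
        followedBy.computeIfAbsent(b, k -> new TreeSet<>()).add(a);
    }

    // A unfollows B: both tables are cleaned up.
    void unfollow(String a, String b) {
        Set<String> out = follows.get(a);
        if (out != null) out.remove(b);
        Set<String> in = followedBy.get(b);
        if (in != null) in.remove(a);
    }

    // Who does A follow?
    Set<String> whoDoesFollow(String a) {
        return follows.getOrDefault(a, Collections.emptySet());
    }

    // Who follows A? Answered from the second table, no full scan needed.
    Set<String> whoFollows(String a) {
        return followedBy.getOrDefault(a, Collections.emptySet());
    }

    public static void main(String[] args) {
        ReverseRelationshipSketch twitBase = new ReverseRelationshipSketch();
        twitBase.follow("TheFakeMT", "TheRealMT");
        twitBase.follow("TheRealMT", "Olivia");
        twitBase.follow("TheFakeMT", "Olivia");
        System.out.println(twitBase.whoFollows("Olivia"));
    }
}
```

The cost is on the write path: every follow or unfollow touches two tables, which is the read versus write trade-off the chart describes.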


More optimizations

•Convert into tall-narrow table
•Leverage rowkey indexing better
•Gets -> short Scans

CF : f
row key: follower + followed
CQ: followed user's name
cell value: 1

The + in the row key refers to concatenating the two values. You could delimit using any character you like, e.g. A-B or A,B.

Keeping the column family and column qualifier names short reduces the data transferred over the network back to the client. The KeyValue objects become smaller.


Tall-narrow table example

•Denormalization is the way to go

Rowkey               f
TheFakeMT+HRogers    Henry Rogers:1
TheFakeMT+MTFanBoy   Amandeep Khurana:1
TheFakeMT+Olivia     Olivia Clemens:1
TheFakeMT+TheRealMT  Mark Twain:1
TheRealMT+HRogers    Henry Rogers:1
TheRealMT+Olivia     Olivia Clemens:1

Putting the user name in the column qualifier saves you from looking up the users table for the name of the user given an id. You can simply list out names or ids while looking at relationships just from this table.

The downside of this is that you need to update the name in all the cells if the user updates their name in their profile. This is classic denormalization.
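The "Gets -> short Scans" idea can be sketched with a sorted map keyed by the concatenated rowkey. This is an in-memory stand-in rather than HBase code, the names are hypothetical, and it assumes user ids never contain the "+" delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

final class TallNarrowSketch {
    // rowkey "follower+followed" -> followed user's name
    private final NavigableMap<String, String> table = new TreeMap<>();

    static String rowkey(String follower, String followed) {
        return follower + "+" + followed;
    }

    // A follows B: a single put, no counter to maintain.
    void follow(String follower, String followed, String followedName) {
        table.put(rowkey(follower, followed), followedName);
    }

    // A unfollows B: a single delete.
    void unfollow(String follower, String followed) {
        table.remove(rowkey(follower, followed));
    }

    // Does A follow B? A point lookup on the full rowkey.
    boolean doesFollow(String follower, String followed) {
        return table.containsKey(rowkey(follower, followed));
    }

    // Who does A follow? A short scan over the rowkey prefix "A+".
    List<String> followedNames(String follower) {
        String start = follower + "+";
        String stop = follower + ","; // ',' is the character right after '+'
        return new ArrayList<>(table.subMap(start, true, stop, false).values());
    }

    public static void main(String[] args) {
        TallNarrowSketch t = new TallNarrowSketch();
        t.follow("TheFakeMT", "TheRealMT", "Mark Twain");
        t.follow("TheRealMT", "HRogers", "Henry Rogers");
        t.follow("TheRealMT", "Olivia", "Olivia Clemens");
        System.out.println(t.followedNames("TheRealMT"));
        System.out.println(t.doesFollow("TheFakeMT", "Olivia"));
    }
}
```

Because the rows for one follower are adjacent in sort order, the scan touches only that follower's rows.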


Uniform rowkey length

•MD5 the userids -> 16 bytes + 16 bytes rowkeys
•Better distribution of load

CF : f
row key: md5(follower)md5(followed)
CQ: followed userid
cell value: followed user's name

Using MD5 of the user ids gives you fixed lengths instead of variable length user ids. You don't need concatenation logic anymore.
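A small sketch of building such a rowkey with the JDK's MessageDigest. The class and method names are hypothetical; only the 16 + 16 byte layout comes from the slide.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5RowkeySketch {
    // MD5 always produces 16 bytes, whatever the length of the user id.
    static byte[] md5(String userId) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(userId.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // rowkey = md5(follower) followed by md5(followed), 32 bytes in total.
    static byte[] rowkey(String follower, String followed) {
        byte[] key = new byte[32];
        System.arraycopy(md5(follower), 0, key, 0, 16);
        System.arraycopy(md5(followed), 0, key, 16, 16);
        return key;
    }

    public static void main(String[] args) {
        byte[] key = rowkey("TheRealMT", "HRogers");
        System.out.println(key.length);
    }
}
```

All rows for one follower still share the same 16-byte prefix, so a prefix scan on md5(follower) keeps working. The original user id can no longer be read back from the rowkey, which is why the slide stores the followed userid in the column qualifier.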


Uniform rowkey length (continued)

Rowkey                         f
MD5(TheFakeMT) MD5(HRogers)    HRogers:Henry Rogers
MD5(TheFakeMT) MD5(MTFanBoy)   MTFanBoy:Amandeep Khurana
MD5(TheFakeMT) MD5(Olivia)     Olivia:Olivia Clemens
MD5(TheFakeMT) MD5(TheRealMT)  TheRealMT:Mark Twain
MD5(TheRealMT) MD5(HRogers)    HRogers:Henry Rogers
MD5(TheRealMT) MD5(Olivia)     Olivia:Olivia Clemens


Tall v/s Wide tables storage footprint

Logical representation of an HBase table. We'll look at what it means to Get() row r5 from this table.

      CF1            CF2
r1    c1:v1          c1:v9  c6:v2
r2    c1:v2  c3:v6
r3    c2:v3          c5:v6
r4    c2:v4
r5    c1:v1  c3:v5   c7:v8

Actual physical storage of the table

HFile for CF1      HFile for CF2
r1:CF1:c1:t1:v1    r1:CF2:c1:t1:v9
r2:CF1:c1:t2:v2    r1:CF2:c6:t4:v2
r2:CF1:c3:t3:v6    r3:CF2:c5:t4:v6
r3:CF1:c2:t1:v3    r5:CF2:c7:t3:v8
r4:CF1:c2:t1:v4
r5:CF1:c1:t2:v1
r5:CF1:c3:t3:v5

Result object returned for a Get() on row r5 (KeyValue objects)

r5:CF1:c1:t2:v1
r5:CF1:c3:t3:v5
r5:CF2:c7:t3:v8

Structure of a KeyValue object

Key:   Row Key | Col Fam | Col Qual | TimeStamp
Value: Cell Value


Rowkey design

•Single most important aspect of designing tables
•Depends on expected access patterns
•HFiles are sorted on Key part of KeyValue objects

HFile for the info column family in the users table

"TheRealMT", "info", "email",    1329088321289, "[email protected]"
"TheRealMT", "info", "name",     1329088321289, "Mark Twain"
"TheRealMT", "info", "password", 1329088818321, "abc123"
"TheRealMT", "info", "password", 1329088321289, "Langhorne"


Write optimized

•Distribute writes across the cluster
  •Issue most pronounced with time series data
•Hashing
    hash("TheRealMT") -> random byte[]
•Salting
    // numRegionServers: the number of region servers in the cluster
    // floorMod keeps the salt non-negative even when the hash is negative
    int salt = Math.floorMod(Long.valueOf(timestamp).hashCode(), numRegionServers);
    byte[] rowkey = Bytes.add(Bytes.toBytes(salt), Bytes.toBytes("|"), Bytes.toBytes(timestamp));
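The salting idea can be tried out without an HBase dependency. The sketch below uses only the JDK, with ByteBuffer standing in for the byte concatenation; the class name, method names and the bucket count of 4 are assumptions made for the example.

```java
import java.nio.ByteBuffer;

final class SaltedRowkeySketch {
    // Salt in the range [0, buckets). floorMod avoids negative results.
    static int salt(long timestamp, int buckets) {
        return Math.floorMod(Long.hashCode(timestamp), buckets);
    }

    // rowkey = 4-byte salt, a '|' delimiter, then the 8-byte timestamp.
    static byte[] rowkey(long timestamp, int buckets) {
        return ByteBuffer.allocate(Integer.BYTES + 1 + Long.BYTES)
                .putInt(salt(timestamp, buckets))
                .put((byte) '|')
                .putLong(timestamp)
                .array();
    }

    public static void main(String[] args) {
        int buckets = 4; // assumed number of region servers
        long start = 1329088321289L;
        // Consecutive timestamps land in different salt buckets.
        for (long ts = start; ts < start + 8; ts++) {
            System.out.println(ts + " -> salt " + salt(ts, buckets));
        }
    }
}
```

The trade-off is on the read side: a time-range read now has to issue one scan per salt value and merge the results.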


Read optimized

•Data to be accessed together should be stored together
•eg: twit streams - last 10 twits by the users I follow

Rowkeys sorted with the user first (each user's twits are adjacent):
Olivia1, Olivia2, Olivia5, Olivia7, Olivia9, TheFakeMT2, TheFakeMT3, TheFakeMT4, TheFakeMT5, TheFakeMT6, TheRealMT1, TheRealMT2, TheRealMT5, TheRealMT8

Rowkeys sorted with the number first (users are interleaved):
1Olivia, 1TheRealMT, 2Olivia, 2TheFakeMT, 2TheRealMT, 3TheFakeMT, 4TheFakeMT, 5Olivia, 5TheFakeMT, 5TheRealMT, 6TheFakeMT, 7Olivia, 8TheRealMT, 9Olivia
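As an illustration of why the user-first ordering suits this read, the sketch below keeps the rowkeys in a sorted set and reads one user's most recent twits as a single contiguous range. The names are hypothetical, and it assumes single-digit sequence numbers as on the slide; real keys would need fixed-width numbers and a delimiter to sort correctly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

final class TwitStreamKeySketch {
    // Returns up to n of the user's keys, newest (highest number) first.
    static List<String> lastTwits(NavigableSet<String> keys, String user, int n) {
        // All keys that start with the user id form one contiguous range.
        NavigableSet<String> mine =
                keys.subSet(user, true, user + Character.MAX_VALUE, false);
        List<String> out = new ArrayList<>();
        for (String key : mine.descendingSet()) {
            if (out.size() == n) break;
            out.add(key);
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableSet<String> keys = new TreeSet<>(Arrays.asList(
                "Olivia1", "Olivia2", "Olivia5", "Olivia7", "Olivia9",
                "TheFakeMT2", "TheFakeMT3", "TheFakeMT4", "TheFakeMT5", "TheFakeMT6",
                "TheRealMT1", "TheRealMT2", "TheRealMT5", "TheRealMT8"));
        System.out.println(lastTwits(keys, "TheRealMT", 2));
    }
}
```

With the number-first ordering the same read would have to walk keys belonging to every user and filter, which is the point the two lists make.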


Relational to Non relational

•Relational concepts
  •Entities
  •Attributes
  •Relationships

•Entities
  •Table is a table. Not much going on there
  •Users table contains... users. Those are entities
  •Good place to start


Relational to Non relational (continued)

•Attributes
  •Identifying
    •Primary keys. Compound keys
    •Maps to rowkeys
  •Non-identifying
    •Other columns
    •Maps to column qualifiers and cells

•Relationships
  •Foreign keys, junction tables, joins
  •Non-existent in HBase. Instead try to denormalize


Nested Entities

•Column Qualifiers can contain data instead of just a column name

hbase table
  row key
    column family
      fixed qualifier → timestamp → value
      repeating entity (nested entities)
        variable qualifier → timestamp → value


Schema design summary

•Schema can make or break the performance you get
•Rowkey is the single most important thing
  •Use tricks like hashing and salting
•Denormalize to your advantage
  •There are no joins
•Isolate access patterns
  •Separate CFs or even separate tables
•Shorter names -> lower storage footprint
•Column qualifiers can be used to store data and not just column names
  •Big difference as compared to RDBMS


