YCSB++: Benchmarking and Performance Debugging Advanced Features in Scalable Table Stores
Swapnil Patil, Milo Polte, Wittawat Tantisiriroj, Kai Ren, Lin Xiao, Julio Lopez, Garth Gibson, Adam Fuchs *, Billie Rinaldi *
Carnegie Mellon University * National Security Agency
Open Cirrus Summit, October 2011, Atlanta GA
Scalable table stores are critical systems
• For data processing & analysis (e.g., Pregel, Hive)
• For systems services (e.g., Google Colossus metadata)
Evolution of scalable table stores
Simple, lightweight stores -> complex, feature-rich stores
• Supports a broader range of applications and services
• Hard to debug and understand performance problems
• Complex behavior and interactions among various components
Growing set of HBase features
[Figure: Timeline of HBase releases, 2008 to 2011+, showing features added over time: batch updates, bulk load tools, RangeRowFilters, RegEx filtering, scan optimizations, co-processors, and access control]
YCSB++ FUNCTIONALITY
• ZooKeeper-based distributed and coordinated testing
• API extensions for the new Apache ACCUMULO DB
• Fine-grained, correlated monitoring using OTUS
FEATURES TESTED USING YCSB++
• Batch writing
• Table pre-splitting
• Bulk loading
• Weak consistency
• Server-side filtering
• Fine-grained security
Tool released at http://www.pdl.cmu.edu/ycsb++
Need richer tools for understanding advanced features in table stores …
Outline
• Problem
• YCSB++ design
• Illustrative examples
• Ongoing work and summary
Yahoo Cloud Serving Benchmark [Cooper2010]
• For CRUD (create-read-update-delete) benchmarking
• Single-node system with an extensible API (a binding sketch follows the figure below)
[Figure: YCSB architecture. A workload executor runs client threads with statistics collection and DB clients (HBASE and other DBs), configured by command-line parameters and a workload parameter file, against the storage servers.]
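To show what the extensible API looks like, here is a minimal sketch of a database binding. The method names and int return codes approximate the com.yahoo.ycsb.DB abstract class of this era, and NoopDB itself is a hypothetical placeholder binding, not one shipped with the tool.

import java.util.HashMap;
import java.util.Set;
import java.util.Vector;

import com.yahoo.ycsb.ByteIterator;
import com.yahoo.ycsb.DB;

// Hypothetical no-op binding: shows the shape of a YCSB DB client.
// Each method returns 0 to signal success, as bindings of this era did.
public class NoopDB extends DB {
  @Override
  public int read(String table, String key, Set<String> fields,
                  HashMap<String, ByteIterator> result) {
    return 0; // a real binding would issue a get() against the store
  }

  @Override
  public int scan(String table, String startkey, int recordcount,
                  Set<String> fields,
                  Vector<HashMap<String, ByteIterator>> result) {
    return 0; // a real binding would issue a range scan
  }

  @Override
  public int update(String table, String key,
                    HashMap<String, ByteIterator> values) {
    return 0;
  }

  @Override
  public int insert(String table, String key,
                    HashMap<String, ByteIterator> values) {
    return 0;
  }

  @Override
  public int delete(String table, String key) {
    return 0;
  }
}

A real binding, like the HBASE client shipped with YCSB, translates each call into the store's native API.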
YCSB++: New extensions
Added support for the new Apache ACCUMULO DB
− New parameters and workload executors
[Figure: YCSB++ architecture. The YCSB workload executor and DB clients, extended with new parameters, workload executors, and an ACCUMULO client binding.]
YCSB++: Distributed & parallel tests
Multi-client, multi-phase coordination using ZooKeeper
− Enables testing at large scales and testing asymmetric features (a coordination sketch follows the figure below)
[Figure: YCSB++ architecture with multiple YCSB clients coordinated through ZooKeeper for multi-phase tests against the table store servers.]
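As one concrete coordination pattern, here is a minimal sketch of a ZooKeeper-based phase barrier in which each client checks in under a phase znode and waits for all peers. The paths, class name, and polling loop are illustrative assumptions, not YCSB++'s actual implementation.

import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Hypothetical phase barrier: blocks until numClients have checked in.
// Assumes the phasePath parent znode already exists.
public class PhaseBarrier {
  private final ZooKeeper zk;
  private final String phasePath; // e.g., "/ycsb/phase-3" (illustrative)
  private final int numClients;

  public PhaseBarrier(ZooKeeper zk, String phasePath, int numClients) {
    this.zk = zk;
    this.phasePath = phasePath;
    this.numClients = numClients;
  }

  public void enter(String clientId)
      throws KeeperException, InterruptedException {
    // Announce this client's arrival with an ephemeral znode.
    zk.create(phasePath + "/" + clientId, new byte[0],
              ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    // Poll until every client has arrived (a watch would avoid polling).
    while (true) {
      List<String> arrived = zk.getChildren(phasePath, false);
      if (arrived.size() >= numClients) return;
      Thread.sleep(100);
    }
  }
}

Multi-phase tests can enter such a barrier between phases so that, for example, all clients finish loading before any client starts measuring.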
YCSB++: Collective monitoring
OTUS monitor built on Ganglia [Ren2011]
− Collects information from YCSB, the table stores, HDFS, and the OS
[Figure: YCSB++ architecture with OTUS monitoring collecting correlated metrics across the YCSB clients, table store servers, HDFS, and the OS.]
Example of YCSB++ debugging
OTUS collects fine-grained information
− Both the HDFS process and the TabletServer process on the same node
[Figure: Monitoring resource usage and table-store metrics over time: HDFS DataNode CPU usage (%), Accumulo TabletServer CPU usage (%), and the average number of StoreFiles per tablet.]
Outline
• Problem
• YCSB++ design
• Illustrative examples
− YCSB++ on HBASE and ACCUMULO (Bigtable-like stores)
• Ongoing work and summary
Recap of Bigtable-like table stores
[Figure: Tablet servers on HDFS nodes. Data insertion (1) goes to a write-ahead log and an in-memory Memtable; minor compaction (2) flushes Memtables to sorted, indexed files in HDFS; major compaction (3) merges them into fewer sorted, indexed files.]
Write path: in-memory buffering & async FS writes
1) Mutations logged in memory tables (in unsorted order)
2) Minor compaction: Memtables -> sorted, indexed files in HDFS
3) Major compaction: LSM-tree-based file merging in the background
Read path: look up both memtables and on-disk files (a toy sketch follows)
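To make the write path concrete, here is a toy, self-contained sketch of a memtable with minor compaction. It is illustrative only (real stores write the WAL first, buffer mutations unsorted, and sort at flush time) and is not code from HBASE or ACCUMULO.

import java.io.IOException;
import java.io.PrintWriter;
import java.util.Map;
import java.util.TreeMap;

// Toy LSM write path: buffer mutations in a sorted in-memory map,
// then "minor compact" by flushing it to a sorted file on disk.
public class ToyMemtable {
  private TreeMap<String, String> memtable = new TreeMap<>();
  private final int flushThreshold;
  private int flushCount = 0;

  public ToyMemtable(int flushThreshold) {
    this.flushThreshold = flushThreshold;
  }

  public void put(String key, String value) throws IOException {
    memtable.put(key, value); // TreeMap keeps keys sorted
    if (memtable.size() >= flushThreshold) {
      minorCompaction();
    }
  }

  // Flush the memtable to a sorted, immutable file (a "StoreFile").
  private void minorCompaction() throws IOException {
    String file = "storefile-" + (flushCount++) + ".txt";
    try (PrintWriter out = new PrintWriter(file)) {
      for (Map.Entry<String, String> e : memtable.entrySet()) {
        out.println(e.getKey() + "\t" + e.getValue());
      }
    }
    memtable = new TreeMap<>(); // start a fresh memtable
  }
}

Major compaction would then merge several of these sorted files into fewer ones, which is exactly the background work whose cost the later examples measure.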
Apache ACCUMULO
Started at NSA; now an Apache Incubator project
− Designed for high-speed ingest and scan workloads
− http://incubator.apache.org/projects/accumulo.html
New features in ACCUMULO
− Iterator framework for user-specified programs placed between different stages of the DB pipeline (a sketch follows below)
  E.g., supports joins and stream processing using iterators
− Also supports fine-grained, cell-level access control
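As a sketch of what an iterator can look like, the filter below drops cells with empty values. It assumes Accumulo's Filter base class (org.apache.accumulo.core.iterators.Filter); the class name and predicate are hypothetical.

import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.Filter;

// Hypothetical server-side iterator: keeps only cells with non-empty
// values. Accumulo applies such filters at scan and compaction time.
public class NonEmptyValueFilter extends Filter {
  @Override
  public boolean accept(Key k, Value v) {
    return v.getSize() > 0; // drop empty cells server-side
  }
}

Because such iterators run on the tablet servers, filtering happens before any data crosses the network to the client.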
ILLUSTRATIVE EXAMPLE #1
Analyzing the fast-inserts vs. weak-consistency tradeoff using YCSB++
Client-side batch writing
Feature: clients batch inserts, delaying writes to the server
• Improves insert throughput and latency
• Newly inserted data may not be immediately visible to other clients (a BatchWriter sketch follows the figure below)
[Figure: Two YCSB++ store clients and the table store servers, coordinated through a ZooKeeper cluster manager. Client #1 buffers its inserts in a batch while client #2 attempts to read the key set {K}.]
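For reference, here is a minimal sketch of client-side batching using Accumulo's BatchWriter. The instance name, credentials, table, and buffer settings are illustrative assumptions, and the exact createBatchWriter signature differs across Accumulo versions.

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.io.Text;

public class BatchInsertExample {
  public static void main(String[] args) throws Exception {
    // Illustrative connection parameters.
    Connector conn = new ZooKeeperInstance("instance", "zkhost:2181")
        .getConnector("user", "password".getBytes());

    // Buffer up to 1 MB of mutations client-side, flush at most every
    // 60 s, using 2 writer threads. maxMemory is the batch-size knob.
    BatchWriter bw = conn.createBatchWriter("usertable",
        1024 * 1024L, 60 * 1000L, 2);

    Mutation m = new Mutation(new Text("row1"));
    m.put(new Text("cf"), new Text("field0"), new Value("v".getBytes()));
    bw.addMutation(m); // queued client-side, not yet visible to readers

    bw.close(); // flushes any buffered mutations to the tablet servers
  }
}

The buffer size passed here is the batch-size parameter varied in the throughput experiment on the next slide.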
Batch writing improves throughput
6 clients creating 9 million 1-KB records on 6 servers
− Small batches: high client CPU utilization limits throughput
− Large batches: servers saturate, yielding limited benefit from further batching
[Figure: Insert throughput (thousands of inserts per second) vs. batch size (10 KB, 100 KB, 1 MB, 10 MB) for HBase and Accumulo.]
Batch writing causes weak consistency
Test setup: ZooKeeper-based client coordination
• Shared producer-consumer queue between readers and writers
• R-W lag = delay before client C2 can read client C1's most recent write (a queue sketch follows the figure below)
[Figure: Client #1 inserts {K:V} (10^6 records) through its batch buffer (1) and enqueues a 1% sample of keys K in the ZooKeeper queue (2); client #2 polls and dequeues K (3) and then reads {K} from the table store servers (4).]
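Here is a minimal sketch of the shared queue, assuming persistent-sequential znodes under a pre-created /queue parent. The paths and details are illustrative rather than YCSB++'s exact implementation.

import java.util.Collections;
import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Hypothetical producer-consumer queue over sequential znodes.
public class KeyQueue {
  private final ZooKeeper zk;

  public KeyQueue(ZooKeeper zk) { this.zk = zk; }

  // Writer side: publish a sampled key after inserting it.
  public void enqueue(String key) throws Exception {
    zk.create("/queue/key-", key.getBytes(),
              ZooDefs.Ids.OPEN_ACL_UNSAFE,
              CreateMode.PERSISTENT_SEQUENTIAL);
  }

  // Reader side: take the oldest key, or null if the queue is empty.
  public String dequeue() throws Exception {
    List<String> children = zk.getChildren("/queue", false);
    if (children.isEmpty()) return null;
    Collections.sort(children); // sequence numbers give FIFO order
    String node = "/queue/" + children.get(0);
    byte[] data = zk.getData(node, false, null);
    zk.delete(node, -1);
    return new String(data);
  }
}

The reader then re-issues read(K) until the value becomes visible; the elapsed time is the read-after-write lag plotted on the next slide.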
Batch writing causes weak consistency
Deferred writes win on throughput, but the read-after-write lag can reach ~100 seconds
− (N%) = fraction of requests that needed multiple read()s
− The implementation of batching affects the median latency
[Figure: CDFs of read-after-write time lag (ms) for different buffer sizes.
(a) HBase: 10 KB (<1%), 100 KB (7.4%), 1 MB (17%), 10 MB (23%)
(b) Accumulo: 10 KB (<1%), 100 KB (1.2%), 1 MB (14%), 10 MB (33%)]
ILLUSTRATIVE EXAMPLE #2
Benchmarking high-speed ingest features using YCSB++
Features for high-speed insertions
Most table stores have high-speed ingest features
− Periodically insert large amounts of data, or migrate old data in bulk
− Classic relational DB techniques applied to the new stores
Two features: bulk loading and table pre-splitting
• Less data migration during inserts
• Engages more tablet servers immediately
• Needs careful tuning and configuration [Shasha2002]
8-phase test setup: table bulk loading
Bulk loading involves two steps
− Hadoop-based data formatting
− Importing the store files into the table store
Pre-load phases (1 and 2)
− Bulk load 6M rows into an empty table
− Goal: parallelism by engaging all servers
Load phases (4 and 5)
− Load 48M new rows
− Goal: study rebalancing during ingest
R/U measurement phases (3, 6, and 8)
− Correlate latency with rebalancing work (an import sketch follows the phase list below)
Phases: 1) Pre-Load (re-formatting), 2) Pre-Load (importing), 3) Read/Update workload, 4) Load (re-formatting), 5) Load (importing), 6) Read/Update workload, 7) Sleep, 8) Read/Update workload
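As a sketch of the importing step (phases 2 and 5), here is a minimal example assuming Accumulo's TableOperations.importDirectory call; the paths are illustrative and the exact signature varies by Accumulo version.

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;

public class BulkImportExample {
  public static void main(String[] args) throws Exception {
    // Illustrative connection parameters.
    Connector conn = new ZooKeeperInstance("instance", "zkhost:2181")
        .getConnector("user", "password".getBytes());

    // Hand the pre-formatted, sorted store files (written by the
    // Hadoop job in the re-formatting phase) to the tablet servers.
    // Files that cannot be imported land in the failure directory.
    conn.tableOperations().importDirectory(
        "usertable",      // target table
        "/bulk/files",    // HDFS dir of formatted store files
        "/bulk/failures", // HDFS dir for rejected files
        false);           // setTime: keep the timestamps in the files
  }
}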
Read latency affected by rebalancing work
[Figure: Accumulo read latency (ms, log scale) over each 300-second measurement phase, for R/U 1 (Phase 3), R/U 2 (Phase 6), and R/U 3 (Phase 8), shown alongside the 8-phase timeline.]
• Latency is high after heavy insertion periods that force servers to rebalance (compactions)
• Latency drops once the store reaches a steady state
Rebalancing on ACCUMULO servers
• The OTUS monitor shows the server-side compactions during the post-ingest measurement phases
[Figure: Counts of StoreFiles, Tablets, and Compactions (log scale) over the experiment's 1800-second run, shown alongside the 8-phase timeline.]
HBASE is slower: Different compaction policies
[Figure: Side-by-side comparison for Accumulo and HBase: read latency (ms, log scale) per 300-second measurement phase (R/U 1, Phase 3; R/U 2, Phase 6; R/U 3, Phase 8), and StoreFiles, Tablets, and Compactions over the 1800-second experiment run.]
Extending to table pre-splitting
[Figure: Two phase pipelines compared. The bulk loading test (pre-load re-formatting and importing, R/U, load re-formatting and importing, R/U, sleep, R/U) next to the table pre-splitting test, which replaces the re-formatting and importing phases with a pre-split of the table into N ranges followed by direct pre-load and load phases.]
Pre-split a key range into N partitions to avoid tablet splitting during insertion (see the sketch below)
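Here is a minimal pre-splitting sketch, assuming Accumulo's TableOperations.addSplits API; the table name, N, and split-point scheme are illustrative.

import java.util.SortedSet;
import java.util.TreeSet;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.hadoop.io.Text;

public class PreSplitExample {
  public static void main(String[] args) throws Exception {
    // Illustrative connection parameters.
    Connector conn = new ZooKeeperInstance("instance", "zkhost:2181")
        .getConnector("user", "password".getBytes());

    // Carve the key space into N ranges up front so inserts engage
    // all tablet servers immediately instead of splitting on demand.
    int n = 16;
    SortedSet<Text> splits = new TreeSet<Text>();
    for (int i = 1; i < n; i++) {
      // Evenly spaced split points over a hex-suffixed key space.
      splits.add(new Text(String.format("user%02x", i * 256 / n)));
    }
    conn.tableOperations().addSplits("usertable", splits);
  }
}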
Outline
• Problem
• YCSB++ design
• Illustrative examples
• Ongoing work and summary
Things not covered in this talk
More features: function shipping to the servers
− Data filtering at the servers
− Fine-grained, cell-level access control
More details in the ACM SOCC 2011 paper
Ongoing work
− Analyze more table stores: Cassandra, CouchDB, MongoDB
− Continue this research through the new Intel Science and Technology Center for Cloud Computing at CMU (with GaTech)
Summary: YCSB++ tool
• Tool for performance debugging and benchmarking advanced features, built as new extensions to YCSB
• Two case studies: Apache HBASE and ACCUMULO
• Tool available at http://www.pdl.cmu.edu/ycsb++
Features tested and the YCSB++ extensions used:
• Weak consistency -> distributed clients using ZooKeeper
• Fast insertions (pre-splits & bulk loads) -> multi-phase testing (with Hadoop)
• Server-side filtering and fine-grained access control -> new workload generators and database client API extensions