51
© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. Vertica’s Design: Basics, Successes, and Failures Chuck Bear CIDR 2015 January 5, 2015

Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Vertica’s Design: Basics, Successes, and Failures Chuck Bear

CIDR 2015 ‟ January 5, 2015

Page 2: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

1. Vertica Basics: Storage Format

Page 3: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 3

Design Goals „ SQL (for the ecosystem and knowledge pool)

„ Clusters of commodity hardware (for cost)

„ Linux, x86, Ethernet

„ Software-only solution (for flexibility)

„ Special purpose hardware has poor track record in databases

„ Shared Nothing MPP

„ Cheaper, but puts more complexity in the software

„ Analytics: Run large queries many times faster than a legacy DB, load as fast, but feel free to snarl and growl at small UPDATEs and DELETEs

„ Work smart, and work hard.

„ Robust algorithms, query optimizer, vectorize, JIT, etc.

Page 4: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 4

Start from how data is stored on disk…

SELECT SUM(volume) FROM trades WHERE symbol = 'HPQ' AND date = '5/13/2011'

SYMBOL DATE TIME PRICE VOLUME ETC

… … … … … …

HPQ 05/13/11 01:02:02 PM 40.01 100 …

IBM 05/13/11 01:02:03 PM 171.22 10 …

AAPL 05/13/11 01:02:03 PM 338.02 5 …

GOOG 05/13/11 01:02:04 PM 524.03 150 …

HPQ 05/13/11 01:02:05 PM 39.97 40 …

AAPL 05/13/11 01:02:07 PM 338.02 20 …

GOOG 05/13/11 01:02:07 PM 524.02 40 …

… … … … … …

Page 5: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 5

Sorted Data

Sort by Symbol, Date, and Time

SYMBOL DATE TIME PRICE VOLUME ETC

… … … … … …

AAPL 05/13/11 01:02:07 PM 338.02 20 …

AAPL 05/13/11 01:02:03 PM 338.02 5 …

… … … … … …

GOOG 05/13/11 01:02:04 PM 524.03 150 …

GOOG 05/13/11 01:02:07 PM 524.02 40 …

… … … … … …

HPQ 05/13/11 01:02:02 PM 40.01 100 …

HPQ 05/13/11 01:02:05 PM 39.97 40 …

… … … … … …

IBM 05/13/11 01:02:03 PM 171.22 10 …

… … … … … …

Page 6: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 6

Column Files

Split into columns

SYMBOL DATE TIME PRICE VOLUME ETC

… … … … … …

AAPL 05/13/11 01:02:07 PM 338.02 20 …

AAPL 05/13/11 01:02:03 PM 338.02 5 …

… … … … … …

GOOG 05/13/11 01:02:04 PM 524.03 150 …

GOOG 05/13/11 01:02:07 PM 524.02 40 …

… … … … … …

HPQ 05/13/11 01:02:02 PM 40.01 100 …

HPQ 05/13/11 01:02:05 PM 39.97 40 …

… … … … … …

IBM 05/13/11 01:02:03 PM 171.22 10 …

… … … … … …

Page 7: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 7

Compression + RLE

SYMBOL DATE VOLUME

(8K Distinct) (250/yr)

… 22

GOOG (x18M) 05/13/2011 (x150K) 150

… 40

… 99

HPQ (x22M) 05/13/2011 (x220K) 100

… 40

… 200

IBM (x19M) 05/13/2011 (x150K) 10

… 18

Page 8: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 8

Position Index (NOT Row ID) Last PSize

Comp SzCRC

Min/MaxNull Count

Last PSize

Comp SzCRC

Min/MaxNull Count

Last PSize

Comp SzCRC

Min/MaxNull Count

CmpData

CmpData

CmpData

Page 9: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

2. Vertica Basics: Updates & Deletes

Page 10: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 10

Q: How do you update this?

SYMBOL DATE VOLUME

(8K Distinct) (250/yr)

… 22

GOOG (x18M) 05/13/2011 (x150K) 150

… 40

… 99

HPQ (x22M) 05/13/2011 (x220K) 100

… 40

… 200

IBM (x19M) 05/13/2011 (x150K) 10

… 18

Page 11: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 11

A: You Do Not!

„ Multiple sets of sorted files loaded

‟ Or keep things in memory for a while

„ Update is INSERT+DELETE

„ Delete is just a mark ‟ nice sorted list of positions

Page 12: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 12

So you need to compaction, or whatever. We call ours the tuple mover.

It’ll Get Dirty….

Maximum ROS size (~1/2 disk or less)

Negligible ROS size (1 MB/column or less)

ROS Size (Sum of all columns). Log scale.

Merge Strata (Blue dots represent ROSs)

Start Epoch, End Epoch

Full Stratum

(ready to merge)

...up to the number of strata

Stratum

Height

Stratum 0

Stratum 1

Stratum 3

Stratum 4

Page 13: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 13

Not by glamour, etc.

How Do You Judge a Tuple Mover?

„ Magical: no problems, no backlogs, no errors

„ Latency and freshness: How much batching is needed?

„ Sustained load rate (consider machine capacity + retention interval)

„ Efficiency will be required

Page 14: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

3. Vertica Basics: Transactions & Recovery

Page 15: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 15

Transactions

Vertica offers full ACID ( just at low TPS)

Queries take a snapshot of the relevant list of files, and need no locks at READ COMMITTED isolation

Tuple Mover (etc.) doesn’t interfere

Loads do not conflict with each other

COMMIT ‟ keep the new files

ROLLBACK ‟ discard them

Table level locks for SERIALIZABLE

All Operations are On-Line

Database is essentially its own undo / redo log Recovery can be as simple as file copies

Page 16: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 16

Node 1

1A 2B

Node 2

1B 2C

Node 4

1D 2A

Node 3

1C 2D

a) All nodes up

Page 17: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 17

Node 1

1A 2B

Node 2

1B 2C

Node 4

1D 2A

Node 3

1C 2D

a) All nodes up

Node 1

1A 2B

Node 2

1B 2C

Node 4

1D 2A

Node 3

1C 2D

b) Node 2 down

All data still available, in several combinations: 2A, 2B, 1C, 1D (shown) 1A, 2B, 1C, 1D 2A, 2B, 1C, 2D 1A, 2B, 1C, 2D (never chosen)

Page 18: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 18

Node 1

1A 2B

Node 2

1B 2C

Node 4

1D 2A

Node 3

1C 2D

a) All nodes up

Node 1

1A 2B

Node 2

1B 2C

Node 4

1D 2A

Node 3

1C 2D

b) Node 2 down

Node 1

1A 2B

Node 2

1B 2C

Node 4

1D 2A

Node 3

1C 2D

c) Recovery

All data still available, in several combinations: 2A, 2B, 1C, 1D (shown) 1A, 2B, 1C, 1D 2A, 2B, 1C, 2D 1A, 2B, 1C, 2D (never chosen)

Page 19: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

4. Mistake: Execution Engine Design

Page 20: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 20

Simple Design

„ Use iterators

‟ open

‟ getNext

‟ close

„ If there’s trouble, use a temp relation

Page 21: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 21

(You have to vectorize. And do JIT compiling.)

Too Slow!

0 2 4 6 8 10 12

0

1000

2000

3000

4000

5000

6000

Original (ms)

Vectorized Copy (ms)

JIT Compiled (ms)

Number of Merge Streams

Merg

e T

ime (

ms,

Nehale

m)

Page 22: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 22

“Push Model” DAG Executor You might even get parallelism for free

Page 23: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 23

“Push Model” DAG Executor

Page 24: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 24

“Push Model” DAG Executor

Page 25: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 25

“Push Model” DAG Executor

Page 26: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 26

“Push Model” DAG Executor

Page 27: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 27

“Push Model” DAG Executor

Page 28: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 28

Problems with DAG Execution

„ Free-for-all

‟ And that parallelism thing didn’t pan out after all

„ Resource usage: could do better

„ Diamond problem

„ Need to give clues to upstream operations

‟ Imagine subqueries?

Page 29: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 29

We threw it away, and went back to the “pull” model

End Result?

„ Block iterators

‟ open

‟ getNextBatch (w/ optimizations to avoid tuple copies)

‟ close

‟ Also, send information back upstream

„ When it gets tricky, use coroutines or other tactics

„ We still push data when there are multiple targets

‟ Such as loading multiple projections, UPDATEs, etc.

Page 30: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

5. Evolution of Joins in Vertica: The Good, Bad, and Ugly

Page 31: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 31

Scan all columns

Perform Join of 'fk' against 'pk' IN list

Final 'sv' SUM

Pre-SUM 'sv' data Against 'fk' join key

(optional)

a) No SIPS, EMJ

SELECT SUM(sv) FROM fact WHERE fk IN (SELECT pk FROM d)

Early Materialized Joins

Page 32: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 32

Scan all columns

Perform Join of 'fk' against 'pk' IN list

Final 'sv' SUM

Pre-SUM 'sv' data Against 'fk' join key

(optional)

Scan 'fk' key column

Perform Join of 'fk' against 'pk' IN list

Materialize columns from rows

that joined

SUM 'sv' from rows

a) No SIPS, EMJ b) No SIPS, LMJ

SELECT SUM(sv) FROM fact WHERE fk IN (SELECT pk FROM d)

Late Materialized Joins

Page 33: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 33

SCANfact

SCANd

JOIN

1

23

Sideways Information Passing (SIPS)

Page 34: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 34

Scan 'fk' key column (Filter out keys that

will not join)

Perform Join of 'fk' against 'pk' IN list

Materialize columns from rows that joined

SUM 'sv' from rows

Scan 'fk' key column (Filter out keys that

will not join)

Perform Join of 'fk' against 'pk' IN list

Final 'sv' SUM

Pre-SUM 'sv' data against 'fk' join key

(optional)

c) SIPS, EMJ d) SIPS, LMJ

SELECT SUM(sv) FROM fact WHERE fk IN (SELECT pk FROM d)

Late Materialized Joins

Page 35: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 35

Outcome?

Selectivity Neither Feature LMJ only SIPS only SIPS+LMJ

0.00% 1206 39 23 271.00% 1202 63 33 39

2.00% 1200 75 50 57

3.00% 1208 121 75 79

5.00% 1207 151 93 116

10.00% 1200 195 141 191

20.00% 1202 362 405 360

50.00% 1202 1050 1086 1047

100.00% 1204 1720 1222 1724

Page 36: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 36

a) Good join order, no SIPS

A B

CJoin

Join

10M

1M

10

100

10M

Robustness to Join Order Errors

Page 37: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 37

b) Bad join order, no SIPS

A C

BJoin

Join

10M

100M

100

10

10M

Robustness to Join Order Errors

Page 38: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 38

c) Good join order, w/ SIPS d) Bad join order, w/ SIPS

A B

CJoin

Join

1M

1M

10

100

10M

A C

BJoin

Join

1M

10M

100

10

10M

Robustness to Join Order Errors

Page 39: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

6. Mistake: Partitioned Hash Join

Page 40: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 40

Partitioned Hash Join vs. Sort Merge Join

„ (There are papers about these)

„ PHJ was the first one tried

„ SMJ was simpler to implement

„ Sometimes one relation is sorted already

„ Sometimes, you need to sort for other reasons

„ Much more compatible with SIPS

Page 41: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 41

Also, There’s Performance

0 1 2 3 4 5 6 7 8 9

0

5000

10000

15000

20000

25000

PHJ

SMJ

Page 42: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

7. Good Idea: Data Collection

Page 43: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 43

A database that doesn’t self-collect is hypocrisy at its worst

Big Data Mentality

„ How busy is the machine compared to historical trends?

„ What have my users been doing?

„ How long will this job take to finish?

„ What is the most common error?

„ When was the last time we made a backup?

„ My request’s run-time changed… why?

„ Have there been changes from the standard configuration?

„ Are there problems that the customer hasn’t called about?

„ Which features have been used?

„ Where do customer machines burn the most CPU cycles?

Page 44: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 44

Unexpected Questions

Page 45: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 45

Don’t Compromise on the Design

„ Data collector can’t kill the system

„ Like a log, lots of little appends

„ Shouldn’t accidentally monitor itself

„ Should be able to analyze off-line

„ Result: Separate data management scheme

Page 46: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

8. Good Idea: Dynamic Workload Management

Page 47: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 47

Static (Known) Workload Management

Don't want reports to take over the entire system, preventing loads or tactical queries

Keep some resources (e.g. memory) reserved so that high-priority queries can always begin

Apply run-time prioritization to manage CPU and I/O

Page 48: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 48

Unpredictable Workload: Short Query Bias

Page 49: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 49

Q: Are optimizer cost model estimates really that bad?

Dynamic Prioritization

Page 50: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. 50

Q: Are optimizer cost model estimates really that bad?

A: Doesn’t matter!

Dynamic Prioritization

0 50 100 150 200 0

20

40

60

80

100

120

Time (s)

Cum

ula

tive

Co

mp

leti

on

(%

)

Unprioritized

Dynamic Priority

Page 51: Vertica’s Design: Basics, Successes, and Failurescidrdb.org/cidr2015/Slides/33_CIDR15_Slides_Paper33.pdfTitle Vertica’s Design: Basics, Successes, and Failures Author Chuck Edward

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Thank you

Please come visit our development team in:

Boston (Cambridge and Andover), MA

Pittsburgh, PA

Sunnyvale, CA