70
CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU www.cs.cmu.edu/~christos

CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

Embed Size (px)

Citation preview

Page 1: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826: Multimedia Databases and Data Mining

Lecture#2: Primary key indexing – B-trees

Christos Faloutsos - CMU www.cs.cmu.edu/~christos

Page 2: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

Reading Material

[Ramakrishnan & Gehrke, 3rd ed, ch. 10]

15-826 Copyright: C. Faloutsos (2012) 2

Page 3: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 3

Problem

Given a large collection of (multimedia) records, find similar/interesting things, ie:

• Allow fast, approximate queries, and

• Find rules/patterns

Page 4: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 4

Outline

Goal: ‘Find similar / interesting things’

• Intro to DB

• Indexing - similarity search

• Data Mining

Page 5: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 5

Indexing - Detailed outline

• primary key indexing– B-trees and variants– (static) hashing– extendible hashing

• secondary key indexing• spatial access methods• text• ...

Page 6: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 6

In even more detail:

• B – trees

• B+ - trees, B*-trees

• hashing

Page 7: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 7

Primary key indexing

• find employee with ssn=123

Page 8: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 8

B-trees

• the most successful family of index schemes (B-trees, B+-trees, B*-trees)

• Can be used for primary/secondary, clustering/non-clustering index.

• balanced “n-way” search trees

Page 9: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 9

Citation

• Rudolf Bayer and Edward M. McCreight, Organization and Maintenance of Large Ordered Indices, Acta Informatica, 1:173-189, 1972.

• Received the 2001 SIGMOD innovations award• among the most cited db publications

•www.informatik.uni-trier.de/~ley/db/about/top.html

Page 10: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 10

B-trees

Eg., B-tree of order 3:

1 3

6

7

9

13

<6

>6 <9 >9

Page 11: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 11

B - tree properties:

• each node, in a B-tree of order n:– Key order– at most n pointers– at least n/2 pointers (except root)– all leaves at the same level– if number of pointers is k, then node has exactly k-1

keys– (leaves are empty)

v1 v2 … vn-1

p1 pn

Page 12: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 12

Properties

• “block aware” nodes: each node -> disk page

• O(log (N)) for everything! (ins/del/search)

• typically, if n = 50 - 100, then 2 - 3 levels

• utilization >= 50%, guaranteed; on average 69%

Page 13: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 13

Queries

• Algo for exact match query? (eg., ssn=8?)

1 3

6

7

9

13

<6

>6 <9 >9

Page 14: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 14

Queries

• Algo for exact match query? (eg., ssn=8?)

1 3

6

7

9

13

<6

>6 <9 >9

Page 15: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 15

Queries

• Algo for exact match query? (eg., ssn=8?)

1 3

6

7

9

13

<6

>6 <9 >9

Page 16: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 16

Queries

• Algo for exact match query? (eg., ssn=8?)

1 3

6

7

9

13

<6

>6 <9 >9

Page 17: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 17

Queries

• Algo for exact match query? (eg., ssn=8?)

1 3

6

7

9

13

<6

>6 <9 >9H steps (= disk accesses)

Page 18: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 18

Queries

• what about range queries? (eg., 5<salary<8)

• Proximity/ nearest neighbor searches? (eg., salary ~ 8 )

Page 19: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 19

Queries• what about range queries? (eg., 5<salary<8)

• Proximity/ nearest neighbor searches? (eg., salary ~ 8 )

1 3

6

7

9

13

<6

>6 <9 >9

Page 20: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 20

Queries• what about range queries? (eg., 5<salary<8)

• Proximity/ nearest neighbor searches? (eg., salary ~ 8 )

1 3

6

7

9

13

<6

>6 <9 >9

Page 21: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 21

B-trees: Insertion

• Insert in leaf; on overflow, push middle up (recursively)

• split: preserves B - tree properties

Page 22: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 22

B-trees

Easy case: Tree T0; insert ‘8’

1 3

6

7

9

13

<6

>6 <9 >9

Page 23: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 23

B-trees

Tree T0; insert ‘8’

1 3

6

7

9

13

<6

>6 <9 >9

8

Page 24: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 24

B-trees

Hardest case: Tree T0; insert ‘2’

1 3

6

7

9

13

<6

>6 <9 >9

2

Page 25: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 25

B-trees

Hardest case: Tree T0; insert ‘2’

1 2

6

7

9

133

push middle up

Page 26: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 26

B-trees

Hardest case: Tree T0; insert ‘2’

6

7

9

131 3

22Ovf; push middle

Page 27: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 27

B-trees

Hardest case: Tree T0; insert ‘2’

7

9

131 3

2

6

Final state

Page 28: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 28

B-trees: Insertion

• Q: What if there are two middles? (eg, order 4)

• A: either one is fine

Page 29: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 29

B-trees: Insertion

• Insert in leaf; on overflow, push middle up (recursively – ‘propagate split’)

• split: preserves all B - tree properties (!!)• notice how it grows: height increases when

root overflows & splits• Automatic, incremental re-organization

Page 30: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 30

Overview

• B – trees

– Dfn, Search, insertion, deletion

• B+ - trees

• hashing

Page 31: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 31

Deletion

Rough outline of algo:• Delete key;• on underflow, may need to merge

In practice, some implementors just allow underflows to happen…

Page 32: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 32

B-trees – Deletion

Easiest case: Tree T0; delete ‘3’

1 3

6

7

9

13

<6

>6 <9 >9

Page 33: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 33

B-trees – Deletion

Easiest case: Tree T0; delete ‘3’

1

6

7

9

13

<6

>6 <9 >9

Page 34: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 34

B-trees – Deletion

Easiest case: Tree T0; delete ‘3’

1

6

7

9

13

<6

>6 <9 >9

Page 35: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 35

B-trees – Deletion

• Case1: delete a key at a leaf – no underflow

• Case2: delete non-leaf key – no underflow

• Case3: delete leaf-key; underflow, and ‘rich sibling’

• Case4: delete leaf-key; underflow, and ‘poor sibling’

Page 36: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 36

B-trees – Deletion

• Case2: delete a key at a non-leaf – no underflow (eg., delete 6 from T0)

1 3

6

7

9

13

<6

>6 <9 >9

Delete & promote, ie:

Page 37: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 37

B-trees – Deletion

• Case2: delete a key at a non-leaf – no underflow (eg., delete 6 from T0)

1 3 7

9

13

<6

>6 <9 >9

Delete & promote, ie:

Page 38: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 38

B-trees – Deletion

• Case2: delete a key at a non-leaf – no underflow (eg., delete 6 from T0)

1 7

9

13

<6

>6 <9 >9

Delete & promote, ie:3

Page 39: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 39

B-trees – Deletion

• Case2: delete a key at a non-leaf – no underflow (eg., delete 6 from T0)

1 7

9

13

<3

>3 <9 >9

3FINAL TREE

Page 40: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 40

B-trees – Deletion

• Case2: delete a key at a non-leaf – no underflow (eg., delete 6 from T0)

• Q: How to promote? • A: pick the largest key from the left sub-tree

(or the smallest from the right sub-tree)• Observation: every deletion eventually

becomes a deletion of a leaf key

Page 41: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 41

B-trees – Deletion

• Case1: delete a key at a leaf – no underflow

• Case2: delete non-leaf key – no underflow

• Case3: delete leaf-key; underflow, and ‘rich sibling’

• Case4: delete leaf-key; underflow, and ‘poor sibling’

Page 42: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 42

B-trees – Deletion

• Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0)

1 3

6

7

9

13

<6

>6 <9 >9

Delete & borrow, ie:

Page 43: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 43

B-trees – Deletion

• Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0)

1 3

6 9

13

<6

>6 <9 >9

Delete & borrow, ie:

Rich sibling

Page 44: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 44

B-trees – Deletion

• Case3: underflow & ‘rich sibling’

• ‘rich’ = can give a key, without underflowing

• ‘borrowing’ a key: THROUGH the PARENT!

Page 45: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 45

B-trees – Deletion

• Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0)

1 3

6 9

13

<6

>6 <9 >9

Delete & borrow, ie:

Rich sibling

NO!!

Page 46: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 46

B-trees – Deletion

• Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0)

1 3

6 9

13

<6

>6 <9 >9

Delete & borrow, ie:

Page 47: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 47

B-trees – Deletion

• Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0)

1 3

9

13

<6

>6 <9 >9

Delete & borrow, ie:

6

Page 48: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 48

B-trees – Deletion

• Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0)

1

3 9

13

<6

>6 <9 >9

Delete & borrow, ie:

6

Page 49: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 49

B-trees – Deletion

• Case3: underflow & ‘rich sibling’ (eg., delete 7 from T0)

1

3 9

13

<3

>3 <9 >9

Delete & borrow, through the parent

6

FINAL TREE

Page 50: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 50

B-trees – Deletion

• Case1: delete a key at a leaf – no underflow

• Case2: delete non-leaf key – no underflow

• Case3: delete leaf-key; underflow, and ‘rich sibling’

• Case4: delete leaf-key; underflow, and ‘poor sibling’

Page 51: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 51

B-trees – Deletion

• Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0)

1 3

6

7

9

13

<6

>6 <9 >9

Page 52: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 52

B-trees – Deletion

• Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0)

1 3

6

7

9<6

>6 <9 >9

Page 53: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 53

B-trees – Deletion

• Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0)

1 3

6

7

9<6

>6 <9 >9

A: merge w/ ‘poor’ sibling

Page 54: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 54

B-trees – Deletion

• Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0)

• Merge, by pulling a key from the parent

• exact reversal from insertion: ‘split and push up’, vs. ‘merge and pull down’

• Ie.:

Page 55: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 55

B-trees – Deletion

• Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0)

1 3

6

7

<6

>6

A: merge w/ ‘poor’ sibling

9

Page 56: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 56

B-trees – Deletion

• Case4: underflow & ‘poor sibling’ (eg., delete 13 from T0)

1 3

6

7

<6

>69

FINAL TREE

Page 57: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 57

B-trees – Deletion

• Case4: underflow & ‘poor sibling’

• -> ‘pull key from parent, and merge’

• Q: What if the parent underflows?

Page 58: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 58

B-trees – Deletion

• Case4: underflow & ‘poor sibling’

• -> ‘pull key from parent, and merge’

• Q: What if the parent underflows?

• A: repeat recursively

Page 59: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 59

Overview

• B – trees

• B+ - trees, B*-trees

• hashing

Page 60: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 60

B+ trees - Motivationif we want to store the whole record with the key –> problems (what?)

1 3

6

7

9

13

<6

>6 <9 >9

Page 61: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 61

Solution: B+ - trees

• They string all leaf nodes together

• AND

• replicate keys from non-leaf nodes, to make sure every key appears at the leaf level

Page 62: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 62

B+ trees

1 3

6

6

9

9

<6

>=6 <9 >=9

7 13

Page 63: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 63

B+ trees - insertion

1 3

6

6

9

9

<6

>=6 <9 >=9

7 13

Eg., insert ‘8’

Page 64: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 64

Overview

• B – trees

• B+ - trees, B*-trees

• hashing

Page 65: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 65

B*-trees

• splits drop util. to 50%, and maybe increase height

• How to avoid them?

Page 66: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 66

B*-trees: deferred split!• Instead of splitting, LEND keys to sibling!

(through PARENT, of course!)

1 3

6

7

9

13

<6

>6 <9 >9

2

Page 67: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 67

B*-trees: deferred split!• Instead of splitting, LEND keys to sibling!

(through PARENT, of course!)

1 2

3

6

9

13

<3

>3 <9 >9

2

7

FINAL TREE

Page 68: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 68

B*-trees: deferred split!• Notice: shorter, more packed, faster tree

• It’s a rare case, where space utilization and speed improve together

• BUT: What if the sibling has no room for our ‘lending’?

Page 69: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 69

B*-trees: deferred split!• BUT: What if the sibling has no room for

our ‘lending’?

• A: 2-to-3 split: get the keys from the sibling, pool them with ours (and a key from the parent), and split in 3.

• Details: too messy (and even worse for deletion)

Page 70: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU christos

CMU SCS

15-826 Copyright: C. Faloutsos (2012) 70

Conclusions• Main ideas: recursive; block-aware; on

overflow -> split; defer splits

• All B-tree variants have excellent, O(logN) worst-case performance for ins/del/search

• B+ tree is the prevailing indexing method

• More details: [Knuth vol 3.] or [Ramakrishnan & Gehrke, 3rd ed, ch. 10]