CS 245Notes 41 CS 245: Database System Principles Notes 4: Indexing Hector Garcia-Molina

CS 245 Notes 4 1

CS 245: Database System Principles

Notes 4: Indexing

Hector Garcia-Molina

CS 245 Notes 4 2

Indexing & Hashing

value

Chapter 4

value

record

CS 245 Notes 4 3

Topics

• Conventional indexes• B-trees• Hashing schemes

CS 245 Notes 4 4

Sequential File

2010

4030

6050

8070

10090

CS 245 Notes 4 5

Sequential File

2010

4030

6050

8070

10090

Dense Index

10203040

50607080

90100110120

CS 245 Notes 4 6

Sequential File

2010

4030

6050

8070

10090

Sparse Index

10305070

90110130150

170190210230

CS 245 Notes 4 7

Sequential File

2010

4030

6050

8070

10090

Sparse 2nd level

10305070

90110130150

170190210230

1090

170250

330410490570

CS 245 Notes 4 8

• Comment:{FILE,INDEX} may be contiguous

or not (blocks chained)

CS 245 Notes 4 9

Question:

• Can we build a dense, 2nd level index for a dense index?

CS 245 Notes 4 10

Notes on pointers:

(1) Block pointer (sparse index) can be smaller than record pointer

BP

RP

CS 245 Notes 4 11

(2) If file is contiguous, then we can omitpointers (i.e., compute them)

Notes on pointers:

CS 245 Notes 4 12

K1

K3

K4

K2

R1

R2

R3

R4

CS 245 Notes 4 13

K1

K3

K4

K2

R1

R2

R3

R4

say:1024 Bper block

• if we want K3 block: get it at offset (3-1)1024 = 2048 bytes

CS 245 Notes 4 14

Sparse vs. Dense Tradeoff

• Sparse: Less index space per record can keep more of

index in memory• Dense: Can tell if any record exists

without accessing file

(Later: – sparse better for insertions– dense needed for secondary indexes)

CS 245 Notes 4 15

Terms

• Index sequential file• Search key ( primary key)• Primary index (on Sequencing field)• Secondary index• Dense index (all Search Key values in)• Sparse index• Multi-level index

CS 245 Notes 4 16

Next:

• Duplicate keys

• Deletion/Insertion

• Secondary indexes

CS 245 Notes 4 17

Duplicate keys

1010

2010

3020

3030

4540

CS 245 Notes 4 18

1010

2010

3020

3030

4540

10101020

20303030

1010

2010

3020

3030

4540

10101020

20303030

Dense index, one way to implement?

Duplicate keys

CS 245 Notes 4 19

1010

2010

3020

3030

4540

10203040

Dense index, better way?

Duplicate keys

CS 245 Notes 4 20

1010

2010

3020

3030

4540

10102030

Sparse index, one way?

Duplicate keys

CS 245 Notes 4 21

1010

2010

3020

3030

4540

10102030

Sparse index, one way?

Duplicate keys

care

ful if lookin

gfo

r 2

0 o

r 3

0!

CS 245 Notes 4 22

1010

2010

3020

3030

4540

10203030

Sparse index, another way?

Duplicate keys

– place first new key from block

CS 245 Notes 4 23

1010

2010

3020

3030

4540

10203030

Sparse index, another way?

Duplicate keys

– place first new key from block

shouldthis be40?

CS 245 Notes 4 24

Duplicate values, primary index

• Index may point to first instance ofeach value only

File Index

Summary

aaa

b

CS 245 Notes 4 25

Deletion from sparse index

2010

4030

6050

8070

10305070

90110130150

CS 245 Notes 4 26


2010

4030

6050

8070

10305070

90110130150

– delete record 40

CS 245 Notes 4 27


2010

4030

6050

8070

10305070

90110130150


CS 245 Notes 4 28


2010

4030

6050

8070

10305070

90110130150


CS 245 Notes 4 29


2010

4030

6050

8070

10305070

90110130150


4040

CS 245 Notes 4 30


2010

4030

6050

8070

10305070

90110130150

– delete records 30 & 40

CS 245 Notes 4 31


2010

4030

6050

8070

10305070

90110130150


CS 245 Notes 4 32


2010

4030

6050

8070

10305070

90110130150


5070

CS 245 Notes 4 33

Deletion from dense index

2010

4030

6050

8070

10203040

50607080

CS 245 Notes 4 34


2010

4030

6050

8070

10203040

50607080


CS 245 Notes 4 35


2010

4030

6050

8070

10203040

50607080


40

CS 245 Notes 4 36


2010

4030

6050

8070

10203040

50607080


4040

CS 245 Notes 4 37

Insertion, sparse index case

2010

30

5040

60

10304060

CS 245 Notes 4 38


2010

30

5040

60

10304060

– insert record 34

CS 245 Notes 4 39


2010

30

5040

60

10304060


34

• our lucky day! we have free space where we need it!

CS 245 Notes 4 40


2010

30

5040

60

10304060


CS 245 Notes 4 41


2010

30

5040

60

10304060


15

2030

20

CS 245 Notes 4 42


2010

30

5040

60

10304060


15

2030

20

• Illustrated: Immediate reorganization• Variation:

– insert new block (chained file)– update index

CS 245 Notes 4 43


2010

30

5040

60

10304060


CS 245 Notes 4 44


2010

30

5040

60

10304060


25

overflow blocks(reorganize later...)

CS 245 Notes 4 45

Insertion, dense index case

• Similar

• Often more expensive . . .

CS 245 Notes 4 46

Secondary indexesSequencefield

5030

7020

4080

10100

6090

CS 245 Notes 4 47


5030

7020

4080

10100

6090

• Sparse index

302080

100

90...

CS 245 Notes 4 48


5030

7020

4080

10100

6090

• Sparse index

302080

100

90...

does not make sense!

CS 245 Notes 4 49


5030

7020

4080

10100

6090

• Dense index

CS 245 Notes 4 50


5030

7020

4080

10100

6090

• Dense index10203040

506070...

CS 245 Notes 4 51


5030

7020

4080

10100

6090

• Dense index10203040

506070...

105090...

sparsehighlevel

CS 245 Notes 4 52

With secondary indexes:

• Lowest level is dense• Other levels are sparse

Also: Pointers are record pointers

(not block pointers; not computed)

CS 245 Notes 4 53

Duplicate values & secondary indexes

1020

4020

4010

4010

4030

CS 245 Notes 4 54


1020

4020

4010

4010

4030

10101020

20304040

4040...

one option...

CS 245 Notes 4 55


1020

4020

4010

4010

4030

10101020

20304040

4040...

one option...

Problem:excess overhead!

• disk space• search time

CS 245 Notes 4 56


1020

4020

4010

4010

4030

10

another option...

4030

20

CS 245 Notes 4 57


1020

4020

4010

4010

4030

10

another option...

4030

20Problem:variable sizerecords inindex!

CS 245 Notes 4 58


1020

4020

4010

4010

4030

10203040

5060...

Another idea (suggested in class):Chain records with same key?

CS 245 Notes 4 59


1020

4020

4010

4010

4030

10203040

5060...

Another idea (suggested in class):Chain records with same key?

Problems:• Need to add fields to records• Need to follow chain to know records

CS 245 Notes 4 60


1020

4020

4010

4010

4030

10203040

5060...

buckets

CS 245 Notes 4 61

Why “bucket” idea is useful

Indexes RecordsName: primary EMP

(name,dept,floor,...)

Dept: secondaryFloor: secondary

CS 245 Notes 4 62

Query: Get employees in

(Toy Dept) ^ (2nd floor)

Dept. index EMP Floor index

Toy 2nd

CS 245 Notes 4 63

Query: Get employees in

(Toy Dept) ^ (2nd floor)

Dept. index EMP Floor index

Toy 2nd

Intersect toy bucket and 2nd Floor

bucket to get set of matching EMP’s

CS 245 Notes 4 64

This idea used in text information retrieval

Documents

...the cat is fat ...

...was raining cats and dogs...

...Fido the dog ...

CS 245 Notes 4 65

This idea used in text information retrieval

Documents

...the cat is fat ...

...was raining cats and dogs...

...Fido the dog ...

Inverted lists

cat

dog

CS 245 Notes 4 66

IR QUERIES

• Find articles with “cat” and “dog”• Find articles with “cat” or “dog”• Find articles with “cat” and not “dog”

CS 245 Notes 4 67

IR QUERIES

• Find articles with “cat” and “dog”• Find articles with “cat” or “dog”• Find articles with “cat” and not “dog”

• Find articles with “cat” in title• Find articles with “cat” and “dog”

within 5 words

CS 245 Notes 4 68

Common technique: more info in inverted

list

cat Title 5

Title 100

Author 10

Abstract 57

Title 12

d3d2

d1

dog

typeposit

ion

locatio

n

CS 245 Notes 4 69

Posting: an entry in inverted list.Represents occurrence

ofterm in article

Size of a list: 1 Rare words or (in postings) miss-spellings

106 Common words

Size of a posting: 10-15 bits (compressed)

CS 245 Notes 4 70

IR DISCUSSION

• Stop words• Truncation• Thesaurus• Full text vs. Abstracts• Vector model

CS 245 Notes 4 71

Vector space model

w1 w2 w3 w4 w5 w6 w7 …

DOC = <1 0 0 1 1 0 0 …>

Query= <0 0 1 1 0 0 0 …>

CS 245 Notes 4 72

Vector space model

w1 w2 w3 w4 w5 w6 w7 …

DOC = <1 0 0 1 1 0 0 …>

Query= <0 0 1 1 0 0 0 …>

PRODUCT = 1 + ……. = score

CS 245 Notes 4 73

• Tricks to weigh scores + normalize

e.g.: Match on common word not asuseful as match on rare

words...

CS 245 Notes 4 74

• How to process V.S. Queries?

w1 w2 w3 w4 w5 w6 …Q = < 0 0 0 1 1 0 … >

CS 245 Notes 4 75

• Try Stanford Libraries• Try Google, Yahoo, ...

CS 245 Notes 4 76

Summary so far

• Conventional index– Basic Ideas: sparse, dense, multi-

level…– Duplicate Keys– Deletion/Insertion– Secondary indexes

– Buckets of Postings List

CS 245 Notes 4 77

Conventional indexes

Advantage:- Simple- Index is sequential file

good for scans

Disadvantage:- Inserts expensive,

and/or- Lose sequentiality &

balance

CS 245 Notes 4 78

Example Index (sequential)

continuous

free space

102030

405060

708090

CS 245 Notes 4 79

Example Index (sequential)

continuous

free space

102030

405060

708090

39313536

323834

33

overflow area(not sequential)

CS 245 Notes 4 80

Outline:

• Conventional indexes• B-Trees NEXT• Hashing schemes

CS 245 Notes 4 81

• NEXT: Another type of index– Give up on sequentiality of index– Try to get “balance”

CS 245 Notes 4 82

Root

B+Tree Example n=3

100

120

150

180

30

3 5 11

30

35

100

101

110

120

130

150

156

179

180

200

CS 245 Notes 4 83

Sample non-leaf

to keys to keys to keys to keys

< 57 57 k<81 81k<95 95

57

81

95

CS 245 Notes 4 84

Sample leaf node:

From non-leaf node

to next leafin

sequence5

7

81

95

To r

eco

rd

wit

h k

ey 5

7

To r

eco

rd

wit

h k

ey 8

1

To r

eco

rd

wit

h k

ey 8

5

CS 245 Notes 4 85

In textbook’s notationn=3

Leaf:

Non-leaf:

30

35

30

30 35

30

CS 245 Notes 4 86

Size of nodes: n+1 pointersn keys

(fixed)

CS 245 Notes 4 87

Don’t want nodes to be too empty

• Use at least

Non-leaf: (n+1)/2pointers

Leaf: (n+1)/2 pointers to data

CS 245 Notes 4 88

Full nodemin. node

Non-leaf

Leaf

n=3

12

01

50

18

0

30

3 5 11

30

35

counts

even if

null

CS 245 Notes 4 89

B+tree rules tree of order n

(1) All leaves at same lowest level(balanced tree)

(2) Pointers in leaves point to records except for “sequence pointer”

CS 245 Notes 4 90

(3) Number of pointers/keys for B+tree

Non-leaf(non-root) n+1 n (n+1)/2 (n+1)/2- 1

Leaf(non-root) n+1 n

Root n+1 n 1 1

Max Max Min Min ptrs keys ptrsdata keys

(n+1)/2 (n+1)/2

CS 245 Notes 4 91

Insert into B+tree

(a) simple case– space available in leaf

(b) leaf overflow(c) non-leaf overflow(d) new root

CS 245 Notes 4 92

(a) Insert key = 32 n=33 5 11

30

31

30

100

CS 245 Notes 4 93

(a) Insert key = 32 n=33 5 11

30

31

30

100

32

CS 245 Notes 4 94

(a) Insert key = 7 n=3

3 5 11

30

31

30

100

CS 245 Notes 4 95


3 5 11

30

31

30

100

3 5

7

CS 245 Notes 4 96


3 5 11

30

31

30

100

3 5

7

7

CS 245 Notes 4 97

(c) Insert key = 160 n=3

10

0

120

150

180

150

156

179

180

200

CS 245 Notes 4 98


10

0

120

150

180

150

156

179

180

200

160

179

CS 245 Notes 4 99


10

0

120

150

180

150

156

179

180

200

18

0

160

179

CS 245 Notes 4 100


10

0

120

150

180

150

156

179

180

200

160

18

0

160

179

CS 245 Notes 4 101

(d) New root, insert 45 n=3

10

20

30

1 2 3 10

12

20

25

30

32

40

CS 245 Notes 4 102


10

20

30

1 2 3 10

12

20

25

30

32

40

40

45

CS 245 Notes 4 103


10

20

30

1 2 3 10

12

20

25

30

32

40

40

45

40

CS 245 Notes 4 104


10

20

30

1 2 3 10

12

20

25

30

32

40

40

45

40

30new root

CS 245 Notes 4 105

(a) Simple case - no example

(b) Coalesce with neighbor (sibling)

(c) Re-distribute keys(d) Cases (b) or (c) at non-leaf

Deletion from B+tree

CS 245 Notes 4 106

(b) Coalesce with sibling– Delete 50

10

40

100

10

20

30

40

50

n=4

CS 245 Notes 4 107

(b) Coalesce with sibling– Delete 50

10

40

100

10

20

30

40

50

n=4

40

CS 245 Notes 4 108

(c) Redistribute keys– Delete 50

10

40

100

10

20

30

35

40

50

n=4

CS 245 Notes 4 109

(c) Redistribute keys– Delete 50

10

40

100

10

20

30

35

40

50

n=4

35

35

CS 245 Notes 4 110

40

45

30

37

25

26

20

22

10

141 3

10

20

30

40

(d) Non-leaf coalese– Delete 37

n=4

25

CS 245 Notes 4 111

40

45

30

37

25

26

20

22

10

141 3

10

20

30

40


n=4

30

25

CS 245 Notes 4 112

40

45

30

37

25

26

20

22

10

141 3

10

20

30

40


n=4

40

30

25

CS 245 Notes 4 113

40

45

30

37

25

26

20

22

10

141 3

10

20

30

40


n=4

40

30

25

25

new root

CS 245 Notes 4 114

B+tree deletions in practice

– Often, coalescing is not implemented– Too hard and not worth it!

CS 245 Notes 4 115

Comparison: B-trees vs. static indexed sequential

fileRef #1: Held & Stonebraker

“B-Trees Re-examined”CACM, Feb. 1978

CS 245 Notes 4 116

Ref # 1 claims:- Concurrency control harder in B-Trees

- B-tree consumes more space

For their comparison:block = 512 byteskey = pointer = 4 bytes4 data records per block

CS 245 Notes 4 117

Example: 1 block static index

127 keys

(127+1)4 = 512 Bytes-> pointers in index implicit! up to 127

blocks

k1

k2

k3

k1

k2

k3

1 datablock

CS 245 Notes 4 118

Example: 1 block B-tree

63 keys

63x(4+4)+8 = 512 Bytes-> pointers needed in B-tree up to 63

blocks because index is blocksnot contiguous

k1

k2

...

k63

k1

k2

k3

1 datablock

next

-

CS 245 Notes 4 119

Size comparison Ref. #1Size comparison Ref. #1

Static Index B-tree# data # datablocks height blocks

height

2 -> 127 2 2 -> 63 2128 -> 16,129 3 64 -> 3968 316,130 -> 2,048,383 4 3969 -> 250,047 4

250,048 -> 15,752,961 5

CS 245 Notes 4 120

Ref. #1 analysis claims

• For an 8,000 block file,after 32,000 inserts

after 16,000 lookups Static index saves enough accesses

to allow for reorganization

CS 245 Notes 4 121

Ref. #1 analysis claims

• For an 8,000 block file,after 32,000 inserts

after 16,000 lookups Static index saves enough accesses

to allow for reorganization

Ref. #1 conclusion Static index better!!

CS 245 Notes 4 122

Ref #2: M. Stonebraker, “Retrospective on a

database system,” TODS, June 1980

Ref. #2 conclusion B-trees better!!

CS 245 Notes 4 123

• DBA does not know when to reorganize• DBA does not know how full to load

pages of new index


CS 245 Notes 4 124

• Buffering– B-tree: has fixed buffer requirements– Static index: must read several

overflow blocks to be efficient (large & variable size buffers needed for this)


CS 245 Notes 4 125

• Speaking of buffering… Is LRU a good policy for B+tree

buffers?

CS 245 Notes 4 126

• Speaking of buffering… Is LRU a good policy for B+tree

buffers? Of course not! Should try to keep root in memory

at all times(and perhaps some nodes from second level)

CS 245 Notes 4 127

Interesting problem:

For B+tree, how large should n be?

…

n is number of keys / node

CS 245 Notes 4 128

Sample assumptions:(1) Time to read node from disk is

(S+Tn) msec.

CS 245 Notes 4 129


(S+Tn) msec.(2) Once block in memory, use

binary search to locate key:(a + b LOG2

n) msec.For some constants a,b; Assume a <<

S

CS 245 Notes 4 130


(S+Tn) msec.(2) Once block in memory, use

binary search to locate key:(a + b LOG2

n) msec.For some constants a,b; Assume a <<

S(3) Assume B+tree is full, i.e.,

# nodes to examine is LOGn N where N = # records

CS 245 Notes 4 131

Can get: f(n) = time to find a record

f(n)

nopt n

CS 245 Notes 4 132

FIND nopt by f’(n) = 0

Answer is nopt = “few hundred”

(see homework for details)

CS 245 Notes 4 133

FIND nopt by f’(n) = 0

Answer is nopt = “few hundred”

(see homework for details)

What happens to nopt as

• Disk gets faster?• CPU get faster?

Exercise

• f(n)= lognN*[S+T*n+a+b*log2(n)]

CS 245 Notes 4 134

S= 14000

T= 0.2

b= 0.002

a= 0

N= 10000000

N=10 million records

CS 245 Notes 4 135

S= 14000T= 0.2b= 0.002a= 0

N=1000000

0

times in microseconds

N=100 million records

CS 245 Notes 4 136

S= 14000T= 0.2b= 0.002a= 0

N=1000000

0

times in microseconds

CS 245 Notes 4 137

Variation on B+tree: B-tree (no +)• Idea:

– Avoid duplicate keys– Have record pointers in non-leaf nodes

CS 245 Notes 4 138

to record to record to record with K1 with K2 with K3

to keys to keys to keys to keys < K1 K1<x<K2 K2<x<k3 >k3

K1 P1 K2 P2 K3 P3

CS 245 Notes 4 139

B-tree example n=2

65

125

145

165

85

105

25

45

10

20

30

40

110

120

90

100

70

80

170

180

50

60

130

140

150

160

CS 245 Notes 4 140

B-tree example n=2

65

125

145

165

85

105

25

45

10

20

30

40

110

120

90

100

70

80

170

180

50

60

130

140

150

160

• sequence pointers not useful now! (but keep space for simplicity)

CS 245 Notes 4 141

Note on inserts

• Say we insert record with key = 25

10

20

30 n=3

leaf

CS 245 Notes 4 142

Note on inserts

• Say we insert record with key = 25

10

20

30 n=3

leaf

10

– 20 –

25

30

• Afterwards:

CS 245 Notes 4 143

So, for B-trees:

MAX MINTree Rec Keys Tree Rec KeysPtrs Ptrs Ptrs Ptrs

Non-leafnon-root n+1 n n (n+1)/2 (n+1)/2-1

(n+1)/2-1Leafnon-root 1 n n 1 n/2 n/2

Rootnon-leafn+1 n n 2 1 1

RootLeaf 1 n n 1 1 1

CS 245 Notes 4 144

Tradeoffs:

B-trees have faster lookup than B+trees

in B-tree, non-leaf & leaf different sizes in B-tree, deletion more complicated

CS 245 Notes 4 145

Tradeoffs:

B-trees have faster lookup than B+trees

in B-tree, non-leaf & leaf different sizes in B-tree, deletion more complicated

B+trees preferred!

CS 245 Notes 4 146

But note:

• If blocks are fixed size (due to disk and buffering restrictions)

Then lookup for B+tree is actually better!!

CS 245 Notes 4 147

Example:- Pointers 4 bytes- Keys 4 bytes- Blocks 100 bytes (just example)- Look at full 2 level tree

CS 245 Notes 4 148

Root has 8 keys + 8 record pointers+ 9 son pointers

= 8x4 + 8x4 + 9x4 = 100 bytes

B-tree:

CS 245 Notes 4 149


= 8x4 + 8x4 + 9x4 = 100 bytes

B-tree:

Each of 9 sons: 12 rec. pointers (+12 keys)

= 12x(4+4) + 4 = 100 bytes

CS 245 Notes 4 150


= 8x4 + 8x4 + 9x4 = 100 bytes

B-tree:

Each of 9 sons: 12 rec. pointers (+12 keys)

= 12x(4+4) + 4 = 100 bytes2-level B-tree, Max # records =

12x9 + 8 = 116

CS 245 Notes 4 151

Root has 12 keys + 13 son pointers= 12x4 + 13x4 = 100 bytes

B+tree:

CS 245 Notes 4 152


B+tree:

Each of 13 sons: 12 rec. ptrs (+12 keys)

= 12x(4 +4) + 4 = 100 bytes

CS 245 Notes 4 153


B+tree:

Each of 13 sons: 12 rec. ptrs (+12 keys)

= 12x(4 +4) + 4 = 100 bytes2-level B+tree, Max # records

= 13x12 = 156

CS 245 Notes 4 154

So...

ooooooooooooo ooooooooo 156 records 108 records

Total = 116

B+ B

8 records

CS 245 Notes 4 155

So...

ooooooooooooo ooooooooo 156 records 108 records

Total = 116

B+ B

8 records

• Conclusion:– For fixed block size,– B+ tree is better because it is bushier

CS 245 Notes 4 156

An Interesting Problem...

• What is a good index structure when:– records tend to be inserted with keys

that are larger than existing values?(e.g., banking records with growing data/time)

– we want to remove older data

CS 245 Notes 4 157

One Solution: Multiple Indexes

• Example: I1, I2

day days indexed days indexed I1 I210 1,2,3,4,5 6,7,8,9,1011 11,2,3,4,5 6,7,8,9,1012 11,12,3,4,5 6,7,8,9,1013 11,12,13,4,5 6,7,8,9,10

•advantage: deletions/insertions from smaller index•disadvantage: query multiple indexes

CS 245 Notes 4 158

Another Solution (Wave Indexes)

day I1 I2 I3 I410 1,2,3 4,5,6 7,8,9 1011 1,2,3 4,5,6 7,8,9 10,1112 1,2,3 4,5,6 7,8,9 10,11, 1213 13 4,5,6 7,8,9 10,11, 1214 13,14 4,5,6 7,8,9 10,11, 1215 13,14,15 4,5,6 7,8,9 10,11, 1216 13,14,15 16 7,8,9 10,11, 12

•advantage: no deletions•disadvantage: approximate windows

CS 245 Notes 4 159

Outline/summary

• Conventional Indexes• Sparse vs. dense• Primary vs. secondary

• B trees• B+trees vs. B-trees• B+trees vs. indexed sequential

• Hashing schemes --> Next