70
Wolf-Tilo Balke Denis Nagel Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de Relational Database Systems 2 3. Indexing and Access Paths

Relational Database Systems 2 - ifis.cs.tu-bs.de

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Relational Database Systems 2 - ifis.cs.tu-bs.de

Wolf-Tilo Balke

Denis Nagel

Institut für Informationssysteme

Technische Universität Braunschweig

http://www.ifis.cs.tu-bs.de

Relational Database Systems 23. Indexing and Access Paths

Page 2: Relational Database Systems 2 - ifis.cs.tu-bs.de

3.1 Introduction to access paths

3.2 Files and blocks

3.3 Indexing

– Single-level indexes

– Multi-level indexes

– Hash indexes

3.4 Physical Tuning

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 2

3 Indexing and Access Paths

Page 3: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Databases persistently store data

• Major problem: efficiency of data access

– Obviously depends on the hardware used

• But also depends to a large degree

– On the allocation of data on the disks

– On intelligent buffer management

– On creating access paths and indexing data

• Physical tuning of the database is a main task of

database administrators

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 3

3.1 Introduction to Access Paths

Page 4: Relational Database Systems 2 - ifis.cs.tu-bs.de

• For persistent storage databases are mapped into a

number of files

– Located in specially protected parts of the file system

(tablespaces, etc.)

– Actually maintained by the operating system

• Each file is partitioned into fixed length blocks (pages)

– Smallest unit transferred from/to storage

– Block size is specified on DB creation

– Is a multiple of the OS block size

– A block usually contains several data records

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 4SKS 10.5

3.1 Introduction to Access Paths

Page 5: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Harddisk Sectors are abstracted by the file system to blocks

• DBMS abstracts FS blocks to DBMS blocks

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 5

3.1 Sectors and Blocks

DISK

FS Block 1 DBMS Block 1

DBMS Block 2

DBMS Block 5

DBMS Block 3

DBMS Block 4

DBMS Block 6

FS Block 2

FS Block 3

FS Block 4

FS Block 5

FS Block 6

Page 6: Relational Database Systems 2 - ifis.cs.tu-bs.de

• For data access always complete blocks (pages) are transferred from disk into the main memory

– The part used for holding copies of blocks is referred to as DB buffer and managed by the buffer manager

– Optimized (pre-)fetching of blocks greatly improves performance

• If a data record is requested by the DB

– If the block is buffered, return main memory address

– Else allocate space in the buffer, fetch the block from disk, and return main memory address

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 6SKS 10.5

3.1 Buffer Management

Page 7: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Once new blocks are fetched, generally currently buffered blocks have to be evicted

– If blocks have been modified, they must be written back to disk (which is not always possible)

– Writing of blocks depends on the recovery strategy:

• Blocks that cannot yet be written are called pinned blocks

• Before checkpoints there might be a forced output of blocks

• Several buffer replacement strategies can be applied

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 7SKS 10.5

3.1 Buffer Management

Page 8: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Least recently used (LRU): the ‘oldest’ block is

replaced

– Usually used in OS buffer management

– Simple, yet effective strategy

– Does not use semantics of the data

• Toss immediate: After the DB has finished to process

a block, the block is immediately replaced

– Example: block-nested loop join between tables R, S

• Take one block from R and compare against all blocks in S

• The block from R can be thrown out, but no block from S

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 8SKS 10.5

3.1 Replacement Strategies

Page 9: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Expected Reuse: statistics about query frequencies

assess how useful a buffered block is

– The block with least expected usefulness is evicted

– Example: index blocks are more often

addressed than data blocks

• In any case, whatever the strategy:

Once a block has been modified and has not yet been

written to disk, it cannot be evicted!

– Note: recovery subsystem has to agree before writing a block

to disk

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 9SKS 10.5

3.1 Replacement Strategies

Page 10: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Overhead & Payload (e.g., ORACLE data

blocks)

– Header contains general block information like

block address, type of segment (data, index,…)

– Table directory contains

tables that have rows

in the block

– Row directory contains

information about the

actual rows in the block

(IDs, addresses,…)

– Row data is actual data

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 10

3.2 Blocks (Pages)

Page 11: Relational Database Systems 2 - ifis.cs.tu-bs.de

• An Extent is a logical collection

of blocks (usually adjacent)

– Fixed size, once more data space is

needed for rows a new extent is

allocated

– Remember: reading adjacent

blocks improves access time

• Segments are collections of

logically connected extents

– For instance tables, index segments,

or rollback segments

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 11

3.2 Extents and Segments

Page 12: Relational Database Systems 2 - ifis.cs.tu-bs.de

• A tablespace is the logical storage space needed

for the data in a table

– A grouping of multiple files allocated on disk

• Example: Data1.ora, system.ora, test.dbf

– Good practice to have one tablespace for tables, and a

different one for indexes

CREATE TABLESPACE user_data

DATAFILE ‘udata.ora’ SIZE 10M

EXTENT MANAGEMENT LOCAL

SEGMENT SPACE MANAGEMENT AUTO

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 12

3.2 Tablespaces

Page 13: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Records of different relations should be stored in

individual files

– Storing related records in the same block minimizes

disk accesses

• Multitable clustering file organization may

store records of different tables in the same block

– Good e.g. for some often occurring joins

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 13

3.2 File Organization

Page 14: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Files reserve space in the file system

– File size can be specified or change dynamically

– Default values strongly differ (e.g., DB2 allocates 100 MB)

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 14

3.2 File Organization

Schema file

Index file with index blocks

Files related to table “content_type_lecture”

Data file with data blocks

Page 15: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Organization of records in the file

– Heap – a record can be placed anywhere in the file

where there is space

– Sequential – store records in sequential order, based

on the value of the search key of each record

– Hashing – a hash function is computed on some

attribute of each record. The result specifies in which

block of the file the record should be placed

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 15

3.2 File Organization

Page 16: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Data records have to be written in a file such that the

entire record can be accessed with minimal disk accesses

– Fixed length records (easy to implement)

– Variable length records (storage space efficient)

TYPE deposit = RECORD

name : CHAR(22);

accNo : CHAR(10);

balance : REAL;

END

– If each char is 1 byte and a real

8 bytes, a record takes 40 bytes

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 16SKS 10.6

3.2 File Organization

record Name AccNo Balance

0 Burg 864442 654,55

1 Myers 967531 56,45

2 Smith 145288 457,75

Page 17: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Fixed length records

– 40 bytes for the first record, 40 bytes for the second,…

– Problem: if the block size is not a multiple of 40, records may

cross boundaries

– Problem: deleting a record can be either done by marking it

as deleted, or by replacing it with some other record of the file

• Keeping deleted items? Reading slow!

• Move up all other items? Efficiency!

• Fill space with next inserted item?

Changes sequence!

• Pointer lists? Wastes space in each

record!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 17SKS 10.6

3.2 File Organization

record Name AccNo Balance

0 Burg 864442 654,55

1 Myers 967531 56,45

2 Smith 145288 457,75

Page 18: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Variable length records

– Necessary for multi-table files or records that allow variable

length attributes

– Problem: How to know when a record ends?

• Introduce ‘end-of-record’ symbols or store record length at

beginning of the record. But what if a record is updated?

• Slotted page structure: header contains number of entries, end of

free space and pointers to location/size of entries. Records are

moved to use

up space: no

fragmentation!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 18SKS 10.6

3.2 File Organization

Page 19: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Typical database records like in our banking example

easily fit into one block

– Name, account number, balance,…

– Usually multiple records per block

• Unspanned record organization

fills blocks only with

complete records

• Spanned organization uses

pointers to divide records

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 19EN 4.4

3.2 File Organization

i

i+1

record 1 record 2 record 3

record 4 record 5

i

i+1

record 1 record 2 record 3 4

record 54

p

Page 20: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Necessary for large objects: binary large objects

(blobs) and character large objects (clobs)

• Large objects are not interpreted in databases

– Text documents, images, audio and video data

– Need to be stored in a contiguous sequence of bytes

– If an object is bigger than a block, contiguous pages

of the buffer pool have to be allocated for storage

– Sometimes preferable to

disallow direct access, but

only allow access through

file-system-like API to allow

for fragmentation

20

3.2 Non-Standard Blocks

Page 21: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Indexes help to locate records in a DB file

– Creation of indexes is part of the physical tuning task of

database administrators

– Indexes often influence the actual

location of storage for a record

• Example: sequential storage,

storage via a hash function

• If the location is determined by

the index

– Not all attributes can

be directly indexed (but secondary

access paths may be used)

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 21

3.3 Indexing

Page 22: Relational Database Systems 2 - ifis.cs.tu-bs.de

• When items have to be found quickly, indexing

mechanisms are used to

speed up access

– Alphabetical author

catalog in libraries

– Lexicographic ordering

– …

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 22

3.3 Basic Concepts

Page 23: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Ordered by indexing field (search key)

– Attribute(s) used to look up records in a file

• An index file consists of index entries

– Records of the form

• Two basic kinds of queries

– Exact matches

• Locating a record(s) with a certain value

– Range queries

• Locating all records in a certain value range

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 23

3.3 Basic Concepts

search key record location(s)

Page 24: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Two basic kinds of indexes

– Ordered indexes: search keys are stored in sorted

order

• Single-level ordered indexes

• Multi-level ordered indexes

– Hash indexes: search keys are distributed uniformly

across blocks using some hash function

• Hash function distributes records uniformly over blocks

• Hashed value of search key decides for storage block

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 24

3.3 Basic Concepts

Page 25: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Index links indexing fields to logical file/block locations

– Dense indexes index all items, sparse indexes only selected ones

– Much smaller file than the data file

• Hence, a binary search on the index file requires fewer block accesses than a binary search on the data file

– Works exactly like a book’s index

• Only few pages at beginning/end, alphabetically ordered, references to respective page(s)

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 25

3.3 Single-Level Ordered Indexes

Page 26: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Primary indexes

– Order data by some usually unique attribute as indexing field

(primary key), store database records in this order

– Index record contains pointer to the respective storage place

(block address)

• To save entries usually there

is only a single index entry

for each block (block anchor)

– But sparse indexes do not directly

show, whether some record is in the DB

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 26EN 14.1

3.3 Single-Level Ordered Indexes

AccNo 1 1, Adams, $ 887,00

2, Bertram, $19,99AccNo 4

AccNo 73, Behaim, $ 167,00

4, Cesar, $ 1866,00

5, Miller, $179,99

6, Naders, $ 682,56

7, Ruth, $ 8642,78

8, Smith, $675,99

9, Tarrens, $ 99,00

Page 27: Relational Database Systems 2 - ifis.cs.tu-bs.de

– Advantages

• Number of blocks needed for storing the index is small

compared to data

– Not all records are indexed (non-dense)

– Index entries are smaller than data records

• Can often be kept in buffer

– Disadvantages

• Insertions and Deletions need to move data

in storage and to update index entries

affected by the shifts

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 27

3.3 Single-Level Ordered Indexes

Page 28: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Clustering indexes store data records in the

order of a non-unique indexing field

– One entry in the index per distinct value of the

indexing field

– Search keys are linked to the block address of the

containing the search key

– Is a sparse index

• But existence of records with a certain key can be assessed

by index look-up

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 28

3.3 Single-Level Ordered Indexes

Page 29: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Still problems with insertion/deletion

– Often one or more complete

blocks are reserved for each

single search key to allow for

block anchors

– Blocks do not have to be

adjacent, but can use block

pointers, if more space is needed

• Structurally very similar to a hash index

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 29

3.3 Single-Level Ordered Indexes

Adams 1, Adams, $ 887,00

2, Adams, $19,99Miller

Tarrans

4, Miller, $ 1866,00

5, Miller, $179,99

6, Miller, $ 682,56

7, Miller, $ 8642,78

8, Tarrens, $ 856,78

Page 30: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Secondary indexes point to locations of

records regarding a non-ordering attribute

– Indexing does not affect the storage order

– There can be multiple secondary indexes for the same

DB file

• Secondary indexes are usually dense

– Objects with same or adjacent values are usually not

adjacent on disk

– If the indexing field has unique values (secondary

key) definitely all records have to be indexed

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 30

3.3 Single-Level Ordered Indexes

Page 31: Relational Database Systems 2 - ifis.cs.tu-bs.de

• If the indexing field is not unique there are several possibilities to create a secondary index

– Create a dense index by including duplicate search keys (one for each record)

– Use variable-length index entries, where each search key is assigned a list of pointers

– Keep fixed-length index entries, but point to a blockcontaining (multiple) pointers to the actual records

• Introduces a level of indirection, but allows for a sparse index

• Usually used in practice

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 31

3.3 Single-Level Ordered Indexes

Page 32: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Dense secondary index or sparse with indirection

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 32

3.3 Single-Level Ordered Indexes

Adams

1, Cesar, $ 887,00

2, Tarrens, $19,99Cesar

Miller

4, Orwell, $ 1866,00

5, Miller, $179,99

6, Tarrens, $ 682,56

7, Post, $ 8642,78

8, Adams, $ 856,78

Miller

3, Miller, $19,99

9, Snyder, $ 856,78

Orwell

Post

Tarrens

Tarrens

Snyder

Adams

1, Cesar, $ 887,00

2, Tarrens, $19,99Cesar

Miller

4, Orwell, $ 1866,00

5, Miller, $ 179,99

6, Tarrens, $ 682,56

7, Post, $ 8642,78

8, Adams, $ 856,78

3, Miller, $19,99

9, Snyder, $ 856,78

Orwell

Post

Tarrens

Snyder

p

p

p

p

p

p

p

p

p

Page 33: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Characteristics of secondary indexes

– Speeds up retrieval, because if it does not exist, the

entire file would have to be scanned linearly

• For non-existent primary indexes files can still be scanned in

a binary search fashion

– Uses more search time and space, because it is dense

– Secondary indexes provides a logical ordering

• Accessing records in that order might not be the most

efficient way regarding block accesses

• Each record access may fetch a new block into the buffer

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 33

3.3 Single-Level Ordered Indexes

Page 34: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Improving secondary indexes for range queries

– Same block may be (unnecesarrily) accessed multiple times

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 34

3.3 Single-Level Ordered Indexes

Adams

1, Cesar, $ 887,00

2, Tarrens, $19,99Cesar

Miller

4, Orwell, $ 1866,00

5, Miller, $179,99

6, Tarrens, $ 682,56

7, Post, $ 8642,78

8, Adams, $ 856,78

Miller

3, Miller, $19,99

9, Snyder, $ 856,78

Orwell

Post

Tarrens

Tarrens

Snyder

• Simple Example: Cesar-Orwell(buffer holds only 1 block)− Cesar: fetch first block− Miller: evict first block,

fetch second block − Miller: evict second

block, fetch first block− Orwell: evict first block,

fetch second block

Page 35: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Improving secondary indexes for range queries

– Remember: working on blocks in main memory is

cheap, fetching blocks from disk is expensive!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 35

3.3 Single-Level Ordered Indexes

Adams 1, Cesar, $ 887,00

2, Tarrens, $19,99Cesar

Miller

4, Orwell, $ 1866,00

5, Miller, $179,99

6, Tarrens, $ 682,56

7, Post, $ 8642,78

8, Adams, $ 856,78

Miller

3, Miller, $19,99

9, Snyder, $ 856,78

Orwell

Post

Tarrens

Tarrens

Snyder

• Improved Example: Cesar-Orwell• Fetch ‘Cesar’ and scan whole block

• Output ‘1’ & ‘3’• Mark block as completed and

evict• Fetch ‘Miller’ and scan whole block

• Output ‘4’ & ‘5’• Mark block as completed and

evict• Skip second ‘Miller’• Skip ‘Orwell’

Done

Page 36: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Short Summary

– Primary and cluster indexes affect storage order, secondary indexes don’t

– Primary and clustering indexes may be sparse, secondary indexes are usually dense

• Primary and clustering indexes can use block anchors

– Index sizes

• Primary index – number of blocks

• Clustering index – number of distinct search keys

• Secondary index – up to number of records in table

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 36

3.3 Single-Level Ordered Indexes

Page 37: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Example: What if a primary index does not fit in

main memory?

– Bad look-up efficiency!

• Simple multi-level solution

– Treat primary index on disk as a sequential file and

construct a sparse index on it

• outer index – sparse index of primary index

• inner index – primary index file

– If even outer index is too large to fit in main memory,

yet another level of index can be created, and so on

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 37

3.3 Multi-Level Ordered Indexes

Page 38: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Multi-level indexes can further improve search

speed

– In single level indexes a binary search can be applied

to locate pointers to blocks

• Efficiency of log2N

– Multi-level indexes allow for higher search efficiency

• Fan-out of the index should be higher than 2

• Efficiency of logfan-outN

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 38EN 14.2

3.3 Multi-Level Ordered Indexes

Page 39: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Concept of Multi-Level Indexes can be applied to primary,

clustering and secondary indexes as long as values are distinct

• Example: 2-Level Primary

Index

– Level 1 is Primary Index

– Level 2 is Primary Index of

level 1

– Efficiency of logfan-outN

• fan-out: Number of 1st level

blocks per 2nd level block

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 39EN 14.2

3.3 Multi-Level Ordered Indexes

1

1, Adams, $ 887,00

2, Bertram, $19,99

3, Behaim, $ 167,00

4, Cesar, $ 1866,00

5, Miller, $179,99

6, Naders, $ 682,56

7, Ruth, $ 8642,78

8, Smith, $675,99

9, Tarrens, $ 99,00

10, Arnold, $ 3442,76

11, Marks, $435,19

12, Black, $ 319,44

3

5

7

9

11

1

7

11

Level 1

Level 2

Page 40: Relational Database Systems 2 - ifis.cs.tu-bs.de

• There is a huge amount of different tree

structures for multi-level indexing

– Classical structures B-tree, B*-tree, etc.

– Multidimensional structures R-trees, Quad-trees, etc.

– Lots of literature

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 40

3.3 Multi-Level Ordered Indexes

Page 41: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Index Structure based on Hashing– Idealized Access Speed: O(1)

• Basic Idea: – Fixed-Size directly addressable Index Space [0..M] containing

M+1 buckets• Buckets contain links to data

• Single link in internal hashing (i.e. in memory)

• Multiple links for external hashing (i.e. on harddisk)

– Hash Function h: ℤ → [0..M]• Maps any number to [0..M]

• Should be uniform (i.e.: same probability for any x ∈ [0..M])– Especially, should also be surjective (i.e. use the full range of [0..M])

• Needs to be deterministic (i.e.: same input → same output)

• Should be simple and fast

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 41EN 13.8.1

3.3 Hash Indexes

Page 42: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Basic Idea– Convert value which is to be indexed to numeric representation

– Hash the value

– Store block anchor of value in the bucket with computed hash index

• Example: M=8

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 42EN 13.8.1

3.3 Hash Indexes

1, Cesar, $ 887.00

5, Miller, $179.99

9, Snyder, $ 856.78

Miller

Cesar

Snyder

h(Miller) = 4

h(Cesar) = 7

H(Snyder) = 1

0

1

2

3

4

5

6

7

8

Page 43: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Problem: Collision– Hash functions are not injective – collision may happen

• Block pointers for full blocks (overflow list)

– Probability of collision increases with load factor• Deteriorates to linear behavior in worst case

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 43EN 13.8.1

3.3 Hash Indexes

1, Cesar, $ 887.00

5, Miller, $179.99

9, Snyder, $ 856.78

Miller

Cesar

Snyder

h(Miller) = 4

h(Cesar) = 1

h(Snyder) = 1

0

1

2

3

4

5

6

7

8

Smith h(Smith) = 1

Rogers h(Rogers) = 1

3, Rogers, $ 1866.58

7, Smith, $ 0.29

Page 44: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Problem: Static Address Range– Addressable range is fixed to [0..M]

– What happens if address range is exhausted?• E.g., load factor too high, too many collisions occur

– Possible solutions• Rehashing :

– Create a new larger hashmap and rehash all values

– For example, java hashmap follow that policy

• Extendible Hashing:– Uses dynamic directory to double and half number of buckets

• Linear Hashing:– Buckets are split into two one after another and are individually

rehashed with an additional hash-function

– Not in this lecture

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 44EN 13.8.3

3.3 Hash Indexes

Page 45: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Extendible Hashing

– Hash values are positive integers between [0..2n]

• Numbers can be represented in binary

• Uses a so-called trie structure

• Choose large n

– Create a directory of depth d with 2d entries for the first d bits of values

• d is global depth of directory

• Directory cells link to buckets containing links to data with hash value starting with given bit pattern

– Adjacent cells may link to the same bucket depending on local depth of bucket

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 45EN 13.8.3

3.3 Hash Indexes

Page 46: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Extendible Hashing (Bucket size 3)

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 46EN 13.8.3

3.3 Hash Indexes

000

001

010

011

100

101

110

111

Global Depth = 3

Local depth = 1

Local depth = 2

Local depth = 3

Local depth = 3

All hash values starting with 0

All hash values starting with 10

All hash values starting with 110

All hash values starting with 111

0_0010

0_0100

0_1010

10_000

10_101

10_110

110_000

110_010

110_100

111_001

111_010

111_111

Page 47: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Insert 010000

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 47EN 13.8.3

3.3 Hash Indexes

000

001

010

011

100

101

110

111

Global Depth = 3

Local depth = 1

Local depth = 2

Local depth = 3

Local depth = 3

All hash values starting with 0

All hash values starting with 10

All hash values starting with 110

All hash values starting with 111

0_0010

0_0100

0_1010

10_000

10_101

10_110

110_000

110_010

110_100

111_001

111_010

111_111

Split this bucket, add 010000

Page 48: Relational Database Systems 2 - ifis.cs.tu-bs.de

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 48EN 13.8.3

3.3 Hash Indexes

000

001

010

011

100

101

110

111

Global Depth = 3

Local depth = 2

Local depth = 2

Local depth = 3

Local depth = 3

All hash values starting with 00

All hash values starting with 10

All hash values starting with 110

All hash values starting with 111

00_010

00_100

01_010

10_000

10_101

10_110

110_000

110_010

110_100

111_001

111_010

111_111

01_000All hash values starting with 01

Local depth = 2

Page 49: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Extendible Hashing

– Allows to dynamically split and coalesce buckets as database grows and shrinks

• If bucket is full and local depth is lower than global depth:

– More than one cell links to this bucket

– Bucket is split into two with increased local depth and values are distributed accordingly

– Links in cells are adjusted accordingly

• If local depth already equals global depth: – Global depth is increased by one and thus the size of the directory

is doubled

– Each cell is replaced by two cells, both of which contain the same pointer as the original cell

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 49EN 13.8.3

3.3 Hash Indexes

Page 50: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Summary Hash Indexes: May perform in O(1)

– But: Management overhead can be quite large for

growing data collections

• Overflow lists have small management overhead, but may

decrease access performance to O(n)

• Rehashing is very expensive for each growth stage

• Extendible hashing has a smaller overhead and is especially

suitable for external storage hashing

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 50EN 13.8

3.3 Hash Indexes

Page 51: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Why not simply index every attribute?

– Physical index on primary key, logical indexes on every

other attribute

• Results in good read efficiency

• But… whenever a DB file is modified, every

index on the file has to be updated

– Updating indices imposes overhead on database

modification (insert, delete, update)

• Especially for multi-level indexes all levels may have to be

updated

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 51

3.3 Indexing

Page 52: Relational Database Systems 2 - ifis.cs.tu-bs.de

• One of the tasks of the database administrator

• Black Magic…

– Get DB performance statistics

– Try something that seems sensible

– Get new performance statistics

– If it did not improve, reset it to previous configuration

– Try something different until satisfied...

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 52

3.4 Physical Tuning

Page 53: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Black Magic today needs advanced

tools to operate correctly

• For database tuning, you need

– Snapshot Monitors

• Continuously collect cumulative statistics for

the DB within a given time span

– Event Monitors

• Hook into certain events and analyze them

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 53

3.4 Physical Tuning

Page 54: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Snapshot Monitors

– Monitor collects cumulative statistics

– Internal counters set at global or session levels

– Collected statistics can be evaluated afterwards

for system tuning on general level

– Data collection is explicitly enabled or disabled

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 54

3.4 Diagnostic Database Tools

Page 55: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Useful for collecting point-in-time information regarding overall DB/DBMS behavior – Locking –lists all locks & types held by all applications

– Bufferpools – cumulative stats for memory, physical & logical I/O’s, synch/asynch

– Sorts – provides detailed statistics regard sort-heap usage, overflows, # active etc.

– Tablespaces – detailed I/O and activity statistics for a given tablespace

– UOW – displays status of application Unit of Work at point-in-time

– Dynamic SQL – shows contents of package cache and related stats at point-in-time

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 55

3.4 Diagnostic Database Tools

Page 56: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Snapshot Monitors

– Overhead continuously is in the 3% - 5% range

• But no I/O since counters are usually RAM resident, not written to disk

– Quick list of useful metrics to identify particular problems include:

• bufferpool hit ratios

• sort overflows/problems

• effectiveness of prefetch

• need for indexing

• transaction logging issues

• single query/application resource domination, etc.

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 56

3.4 Diagnostic Database Tools

Page 57: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Snapshot Monitor: IBM DB2 Performance Expert

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 57

3.4 Diagnostic Database Tools

http://www.redbooks.ibm.com/abstracts/sg246470.html

Page 58: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Event Monitors

– Collects information regarding events or chains of

events

• No continuously collected data as with snapshots

– Must be explicitly created and activated via DBMS

commands or API’s

– Are the best way to effectively diagnose and resolve

transaction problems (e.g., deadlock issues)

– Output may be directed to a database table and

results then analyzed using SQL

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 58

3.4 Diagnostic Database Tools

Page 59: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Event Monitor can be used to

– Identify and rank most frequently executed or most

time consuming SQL

– Track the sequence of statements leading up to a lock

timeout or deadlock

– Track connections to the database

– Breakdown overall SQL statement execution time

into sub-operations such as sort time and use or

system CPU time

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 59

3.4 Diagnostic Database Tools

Page 60: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Event Monitor: DBI Brother Panther

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 60

3.4 Diagnostic Database Tools

http://www.dbisoftware.com

Page 61: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Physical DB Tuning is a complicated and intransparenttask

• Usually trial and error– Measure some hopefully meaningful metrics of your DB

usage statistics

– Adjust around on something• Mostly indexes and tablespace properties

– Measure again• If results better : Great! Continue tuning!

• If result worse : Bad! Undo everything you did and try something else.

• But what to measure and what to do?– Use best practices or try yourself!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 61

3.4 The Black Art of Physical Tuning

Page 62: Relational Database Systems 2 - ifis.cs.tu-bs.de

• What to measure?

• First, what kind of database you have?

– OLTP (Online Transaction Processing) : Database with many small, concurrent read / write queries

• Optimize for fast read write accesses

• Keep an eye on “real-time” response times

• Optimize for concurrency

– Warehouse (OLAP – Online Analytical Processing): Using larger and more complex queries for primary retrieving data

• Optimize for read accesses

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 62

3.4 Physical Tuning

Page 63: Relational Database Systems 2 - ifis.cs.tu-bs.de

• There is a huge amount of knobs and screws to

turn

– DB planning and tuning is for professionals

– Special certificates by each database vendor

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 63

3.4 Tuning Metrics

http://www.microsoft.com/learning/mcp/mcdba/

http://www-03.ibm.com/certify/certs/dm_index.shtml

IBM DB2 IBM Certified Database AssociateIBM Certified Database Administrator IBM Certified Application Developer DB2IBM Certified Advanced Database Administrator

http://education.oracle.com

Page 64: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Scott Hayes

– President & CEO of Database-Brothers Inc (DBI)

• If I only had five minutes to assess the status of a

database, I would look at…

– Average Result Set Size

– Index Read Efficiency

– Synchronous Read Percentage

– Average Rows Read per DB Transaction per Table

– Average Read Time

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 64

3.4 Tuning Metrics

Page 65: Relational Database Systems 2 - ifis.cs.tu-bs.de

• ARSS - Average Result Set Size

– ARSS = RowsSelected / NumSelectQueries

– Rule of Thumb:

• ARSS ≤ 10 indicates a OLTP database

– If you think you have OLTP and ARSS>>10, there is something

wrong…

• ARSS > 10 indicates a warehouse database (OLAP)

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 65

3.4 Tuning Metrics

Page 66: Relational Database Systems 2 - ifis.cs.tu-bs.de

• IREF - Index Read Efficiency

– “How many rows have to be read to retrieve one row?”• i.e. How much overhead is within the where predicates?

– IREF = RowsRead / RowsReturned• Well-designed indexes may evaluate a where-clause without

reading additional rows!

– Rule of Thumb:• IREF should be less than 10 for OLTP Databases

• May be higher for Warehouses (>100)

– What to do?• Introduce more efficient indexes!

• User simpler queries!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 66

3.4 Tuning Metrics

Page 67: Relational Database Systems 2 - ifis.cs.tu-bs.de

• SRP – Synchronous Read Percentage

– Synchronous Read means the DBMS reads just the index file and the required data page (good)

– Asynchronous Read means the DBMS employs prefetching to scan for an index or data (BAD)

– Rule of Thumb:

• >90% for OLTP (>50% for warehouses) : Good

• 80%-90% (25%-50%) : Ok

• 50%-80% (<25%) Bad

– What to do?

• Improve index structures!

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 67

3.4 Tuning Metrics

Page 68: Relational Database Systems 2 - ifis.cs.tu-bs.de

• TBRRTX – Average Rows Read per DB Transaction per Table

– TBRRTXtable = rowsRead / (attemptedCommits + attemptedRollBacks)

– Rule of Thumb:• <10 for OLTP : Good

• 10-100 : Ok

• >100 : Bad

• >1000 : Horrific

– What to do? • Improve index structures!

• Improve Queries

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 68

3.4 Tuning Metrics

Page 69: Relational Database Systems 2 - ifis.cs.tu-bs.de

• ART – Average Read Time

– The average time per read

– RT = ReadTime / (NumDataRead + NumIndexRead)

– If you think you did everything within your application

and indices, optimize read time

• Adjust size of tablespace containers

• Distribute containers across disks

– Avoid OS paging device

• Move highly access tablespaces to faster devices

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 69

3.4 Tuning Metrics

Page 70: Relational Database Systems 2 - ifis.cs.tu-bs.de

• Tree-Index Structures

– Binary Tree

• Naïve Binary Tree

• Balancing Tree

– B-Trees

• B

• B+

• B*

Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 70

Next Lecture