Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Wolf-Tilo Balke
Denis Nagel
Institut für Informationssysteme
Technische Universität Braunschweig
http://www.ifis.cs.tu-bs.de
Relational Database Systems 23. Indexing and Access Paths
3.1 Introduction to access paths
3.2 Files and blocks
3.3 Indexing
– Single-level indexes
– Multi-level indexes
– Hash indexes
3.4 Physical Tuning
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 2
3 Indexing and Access Paths
• Databases persistently store data
• Major problem: efficiency of data access
– Obviously depends on the hardware used
• But also depends to a large degree
– On the allocation of data on the disks
– On intelligent buffer management
– On creating access paths and indexing data
• Physical tuning of the database is a main task of
database administrators
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 3
3.1 Introduction to Access Paths
• For persistent storage databases are mapped into a
number of files
– Located in specially protected parts of the file system
(tablespaces, etc.)
– Actually maintained by the operating system
• Each file is partitioned into fixed length blocks (pages)
– Smallest unit transferred from/to storage
– Block size is specified on DB creation
– Is a multiple of the OS block size
– A block usually contains several data records
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 4SKS 10.5
3.1 Introduction to Access Paths
• Harddisk Sectors are abstracted by the file system to blocks
• DBMS abstracts FS blocks to DBMS blocks
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 5
3.1 Sectors and Blocks
DISK
FS Block 1 DBMS Block 1
DBMS Block 2
DBMS Block 5
DBMS Block 3
DBMS Block 4
DBMS Block 6
FS Block 2
FS Block 3
FS Block 4
FS Block 5
FS Block 6
• For data access always complete blocks (pages) are transferred from disk into the main memory
– The part used for holding copies of blocks is referred to as DB buffer and managed by the buffer manager
– Optimized (pre-)fetching of blocks greatly improves performance
• If a data record is requested by the DB
– If the block is buffered, return main memory address
– Else allocate space in the buffer, fetch the block from disk, and return main memory address
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 6SKS 10.5
3.1 Buffer Management
• Once new blocks are fetched, generally currently buffered blocks have to be evicted
– If blocks have been modified, they must be written back to disk (which is not always possible)
– Writing of blocks depends on the recovery strategy:
• Blocks that cannot yet be written are called pinned blocks
• Before checkpoints there might be a forced output of blocks
• Several buffer replacement strategies can be applied
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 7SKS 10.5
3.1 Buffer Management
• Least recently used (LRU): the ‘oldest’ block is
replaced
– Usually used in OS buffer management
– Simple, yet effective strategy
– Does not use semantics of the data
• Toss immediate: After the DB has finished to process
a block, the block is immediately replaced
– Example: block-nested loop join between tables R, S
• Take one block from R and compare against all blocks in S
• The block from R can be thrown out, but no block from S
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 8SKS 10.5
3.1 Replacement Strategies
• Expected Reuse: statistics about query frequencies
assess how useful a buffered block is
– The block with least expected usefulness is evicted
– Example: index blocks are more often
addressed than data blocks
• In any case, whatever the strategy:
Once a block has been modified and has not yet been
written to disk, it cannot be evicted!
– Note: recovery subsystem has to agree before writing a block
to disk
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 9SKS 10.5
3.1 Replacement Strategies
• Overhead & Payload (e.g., ORACLE data
blocks)
– Header contains general block information like
block address, type of segment (data, index,…)
– Table directory contains
tables that have rows
in the block
– Row directory contains
information about the
actual rows in the block
(IDs, addresses,…)
– Row data is actual data
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 10
3.2 Blocks (Pages)
• An Extent is a logical collection
of blocks (usually adjacent)
– Fixed size, once more data space is
needed for rows a new extent is
allocated
– Remember: reading adjacent
blocks improves access time
• Segments are collections of
logically connected extents
– For instance tables, index segments,
or rollback segments
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 11
3.2 Extents and Segments
• A tablespace is the logical storage space needed
for the data in a table
– A grouping of multiple files allocated on disk
• Example: Data1.ora, system.ora, test.dbf
– Good practice to have one tablespace for tables, and a
different one for indexes
CREATE TABLESPACE user_data
DATAFILE ‘udata.ora’ SIZE 10M
EXTENT MANAGEMENT LOCAL
SEGMENT SPACE MANAGEMENT AUTO
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 12
3.2 Tablespaces
• Records of different relations should be stored in
individual files
– Storing related records in the same block minimizes
disk accesses
• Multitable clustering file organization may
store records of different tables in the same block
– Good e.g. for some often occurring joins
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 13
3.2 File Organization
• Files reserve space in the file system
– File size can be specified or change dynamically
– Default values strongly differ (e.g., DB2 allocates 100 MB)
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 14
3.2 File Organization
Schema file
Index file with index blocks
Files related to table “content_type_lecture”
Data file with data blocks
• Organization of records in the file
– Heap – a record can be placed anywhere in the file
where there is space
– Sequential – store records in sequential order, based
on the value of the search key of each record
– Hashing – a hash function is computed on some
attribute of each record. The result specifies in which
block of the file the record should be placed
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 15
3.2 File Organization
• Data records have to be written in a file such that the
entire record can be accessed with minimal disk accesses
– Fixed length records (easy to implement)
– Variable length records (storage space efficient)
TYPE deposit = RECORD
name : CHAR(22);
accNo : CHAR(10);
balance : REAL;
END
– If each char is 1 byte and a real
8 bytes, a record takes 40 bytes
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 16SKS 10.6
3.2 File Organization
record Name AccNo Balance
0 Burg 864442 654,55
1 Myers 967531 56,45
2 Smith 145288 457,75
• Fixed length records
– 40 bytes for the first record, 40 bytes for the second,…
– Problem: if the block size is not a multiple of 40, records may
cross boundaries
– Problem: deleting a record can be either done by marking it
as deleted, or by replacing it with some other record of the file
• Keeping deleted items? Reading slow!
• Move up all other items? Efficiency!
• Fill space with next inserted item?
Changes sequence!
• Pointer lists? Wastes space in each
record!
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 17SKS 10.6
3.2 File Organization
record Name AccNo Balance
0 Burg 864442 654,55
1 Myers 967531 56,45
2 Smith 145288 457,75
• Variable length records
– Necessary for multi-table files or records that allow variable
length attributes
– Problem: How to know when a record ends?
• Introduce ‘end-of-record’ symbols or store record length at
beginning of the record. But what if a record is updated?
• Slotted page structure: header contains number of entries, end of
free space and pointers to location/size of entries. Records are
moved to use
up space: no
fragmentation!
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 18SKS 10.6
3.2 File Organization
• Typical database records like in our banking example
easily fit into one block
– Name, account number, balance,…
– Usually multiple records per block
• Unspanned record organization
fills blocks only with
complete records
• Spanned organization uses
pointers to divide records
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 19EN 4.4
3.2 File Organization
i
i+1
record 1 record 2 record 3
record 4 record 5
i
i+1
record 1 record 2 record 3 4
record 54
p
• Necessary for large objects: binary large objects
(blobs) and character large objects (clobs)
• Large objects are not interpreted in databases
– Text documents, images, audio and video data
– Need to be stored in a contiguous sequence of bytes
– If an object is bigger than a block, contiguous pages
of the buffer pool have to be allocated for storage
– Sometimes preferable to
disallow direct access, but
only allow access through
file-system-like API to allow
for fragmentation
20
3.2 Non-Standard Blocks
• Indexes help to locate records in a DB file
– Creation of indexes is part of the physical tuning task of
database administrators
– Indexes often influence the actual
location of storage for a record
• Example: sequential storage,
storage via a hash function
• If the location is determined by
the index
– Not all attributes can
be directly indexed (but secondary
access paths may be used)
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 21
3.3 Indexing
• When items have to be found quickly, indexing
mechanisms are used to
speed up access
– Alphabetical author
catalog in libraries
– Lexicographic ordering
– …
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 22
3.3 Basic Concepts
• Ordered by indexing field (search key)
– Attribute(s) used to look up records in a file
• An index file consists of index entries
– Records of the form
• Two basic kinds of queries
– Exact matches
• Locating a record(s) with a certain value
– Range queries
• Locating all records in a certain value range
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 23
3.3 Basic Concepts
search key record location(s)
• Two basic kinds of indexes
– Ordered indexes: search keys are stored in sorted
order
• Single-level ordered indexes
• Multi-level ordered indexes
– Hash indexes: search keys are distributed uniformly
across blocks using some hash function
• Hash function distributes records uniformly over blocks
• Hashed value of search key decides for storage block
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 24
3.3 Basic Concepts
• Index links indexing fields to logical file/block locations
– Dense indexes index all items, sparse indexes only selected ones
– Much smaller file than the data file
• Hence, a binary search on the index file requires fewer block accesses than a binary search on the data file
– Works exactly like a book’s index
• Only few pages at beginning/end, alphabetically ordered, references to respective page(s)
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 25
3.3 Single-Level Ordered Indexes
• Primary indexes
– Order data by some usually unique attribute as indexing field
(primary key), store database records in this order
– Index record contains pointer to the respective storage place
(block address)
• To save entries usually there
is only a single index entry
for each block (block anchor)
– But sparse indexes do not directly
show, whether some record is in the DB
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 26EN 14.1
3.3 Single-Level Ordered Indexes
AccNo 1 1, Adams, $ 887,00
2, Bertram, $19,99AccNo 4
AccNo 73, Behaim, $ 167,00
4, Cesar, $ 1866,00
5, Miller, $179,99
6, Naders, $ 682,56
7, Ruth, $ 8642,78
8, Smith, $675,99
9, Tarrens, $ 99,00
– Advantages
• Number of blocks needed for storing the index is small
compared to data
– Not all records are indexed (non-dense)
– Index entries are smaller than data records
• Can often be kept in buffer
– Disadvantages
• Insertions and Deletions need to move data
in storage and to update index entries
affected by the shifts
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 27
3.3 Single-Level Ordered Indexes
• Clustering indexes store data records in the
order of a non-unique indexing field
– One entry in the index per distinct value of the
indexing field
– Search keys are linked to the block address of the
containing the search key
– Is a sparse index
• But existence of records with a certain key can be assessed
by index look-up
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 28
3.3 Single-Level Ordered Indexes
• Still problems with insertion/deletion
– Often one or more complete
blocks are reserved for each
single search key to allow for
block anchors
– Blocks do not have to be
adjacent, but can use block
pointers, if more space is needed
• Structurally very similar to a hash index
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 29
3.3 Single-Level Ordered Indexes
Adams 1, Adams, $ 887,00
2, Adams, $19,99Miller
Tarrans
4, Miller, $ 1866,00
5, Miller, $179,99
6, Miller, $ 682,56
7, Miller, $ 8642,78
8, Tarrens, $ 856,78
• Secondary indexes point to locations of
records regarding a non-ordering attribute
– Indexing does not affect the storage order
– There can be multiple secondary indexes for the same
DB file
• Secondary indexes are usually dense
– Objects with same or adjacent values are usually not
adjacent on disk
– If the indexing field has unique values (secondary
key) definitely all records have to be indexed
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 30
3.3 Single-Level Ordered Indexes
• If the indexing field is not unique there are several possibilities to create a secondary index
– Create a dense index by including duplicate search keys (one for each record)
– Use variable-length index entries, where each search key is assigned a list of pointers
– Keep fixed-length index entries, but point to a blockcontaining (multiple) pointers to the actual records
• Introduces a level of indirection, but allows for a sparse index
• Usually used in practice
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 31
3.3 Single-Level Ordered Indexes
• Dense secondary index or sparse with indirection
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 32
3.3 Single-Level Ordered Indexes
Adams
1, Cesar, $ 887,00
2, Tarrens, $19,99Cesar
Miller
4, Orwell, $ 1866,00
5, Miller, $179,99
6, Tarrens, $ 682,56
7, Post, $ 8642,78
8, Adams, $ 856,78
Miller
3, Miller, $19,99
9, Snyder, $ 856,78
Orwell
Post
Tarrens
Tarrens
Snyder
Adams
1, Cesar, $ 887,00
2, Tarrens, $19,99Cesar
Miller
4, Orwell, $ 1866,00
5, Miller, $ 179,99
6, Tarrens, $ 682,56
7, Post, $ 8642,78
8, Adams, $ 856,78
3, Miller, $19,99
9, Snyder, $ 856,78
Orwell
Post
Tarrens
Snyder
p
p
p
p
p
p
p
p
p
• Characteristics of secondary indexes
– Speeds up retrieval, because if it does not exist, the
entire file would have to be scanned linearly
• For non-existent primary indexes files can still be scanned in
a binary search fashion
– Uses more search time and space, because it is dense
– Secondary indexes provides a logical ordering
• Accessing records in that order might not be the most
efficient way regarding block accesses
• Each record access may fetch a new block into the buffer
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 33
3.3 Single-Level Ordered Indexes
• Improving secondary indexes for range queries
– Same block may be (unnecesarrily) accessed multiple times
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 34
3.3 Single-Level Ordered Indexes
Adams
1, Cesar, $ 887,00
2, Tarrens, $19,99Cesar
Miller
4, Orwell, $ 1866,00
5, Miller, $179,99
6, Tarrens, $ 682,56
7, Post, $ 8642,78
8, Adams, $ 856,78
Miller
3, Miller, $19,99
9, Snyder, $ 856,78
Orwell
Post
Tarrens
Tarrens
Snyder
• Simple Example: Cesar-Orwell(buffer holds only 1 block)− Cesar: fetch first block− Miller: evict first block,
fetch second block − Miller: evict second
block, fetch first block− Orwell: evict first block,
fetch second block
• Improving secondary indexes for range queries
– Remember: working on blocks in main memory is
cheap, fetching blocks from disk is expensive!
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 35
3.3 Single-Level Ordered Indexes
Adams 1, Cesar, $ 887,00
2, Tarrens, $19,99Cesar
Miller
4, Orwell, $ 1866,00
5, Miller, $179,99
6, Tarrens, $ 682,56
7, Post, $ 8642,78
8, Adams, $ 856,78
Miller
3, Miller, $19,99
9, Snyder, $ 856,78
Orwell
Post
Tarrens
Tarrens
Snyder
• Improved Example: Cesar-Orwell• Fetch ‘Cesar’ and scan whole block
• Output ‘1’ & ‘3’• Mark block as completed and
evict• Fetch ‘Miller’ and scan whole block
• Output ‘4’ & ‘5’• Mark block as completed and
evict• Skip second ‘Miller’• Skip ‘Orwell’
Done
• Short Summary
– Primary and cluster indexes affect storage order, secondary indexes don’t
– Primary and clustering indexes may be sparse, secondary indexes are usually dense
• Primary and clustering indexes can use block anchors
– Index sizes
• Primary index – number of blocks
• Clustering index – number of distinct search keys
• Secondary index – up to number of records in table
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 36
3.3 Single-Level Ordered Indexes
• Example: What if a primary index does not fit in
main memory?
– Bad look-up efficiency!
• Simple multi-level solution
– Treat primary index on disk as a sequential file and
construct a sparse index on it
• outer index – sparse index of primary index
• inner index – primary index file
– If even outer index is too large to fit in main memory,
yet another level of index can be created, and so on
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 37
3.3 Multi-Level Ordered Indexes
• Multi-level indexes can further improve search
speed
– In single level indexes a binary search can be applied
to locate pointers to blocks
• Efficiency of log2N
– Multi-level indexes allow for higher search efficiency
• Fan-out of the index should be higher than 2
• Efficiency of logfan-outN
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 38EN 14.2
3.3 Multi-Level Ordered Indexes
• Concept of Multi-Level Indexes can be applied to primary,
clustering and secondary indexes as long as values are distinct
• Example: 2-Level Primary
Index
– Level 1 is Primary Index
– Level 2 is Primary Index of
level 1
– Efficiency of logfan-outN
• fan-out: Number of 1st level
blocks per 2nd level block
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 39EN 14.2
3.3 Multi-Level Ordered Indexes
1
1, Adams, $ 887,00
2, Bertram, $19,99
3, Behaim, $ 167,00
4, Cesar, $ 1866,00
5, Miller, $179,99
6, Naders, $ 682,56
7, Ruth, $ 8642,78
8, Smith, $675,99
9, Tarrens, $ 99,00
10, Arnold, $ 3442,76
11, Marks, $435,19
12, Black, $ 319,44
3
5
7
9
11
1
7
11
Level 1
Level 2
• There is a huge amount of different tree
structures for multi-level indexing
– Classical structures B-tree, B*-tree, etc.
– Multidimensional structures R-trees, Quad-trees, etc.
– Lots of literature
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 40
3.3 Multi-Level Ordered Indexes
• Index Structure based on Hashing– Idealized Access Speed: O(1)
• Basic Idea: – Fixed-Size directly addressable Index Space [0..M] containing
M+1 buckets• Buckets contain links to data
• Single link in internal hashing (i.e. in memory)
• Multiple links for external hashing (i.e. on harddisk)
– Hash Function h: ℤ → [0..M]• Maps any number to [0..M]
• Should be uniform (i.e.: same probability for any x ∈ [0..M])– Especially, should also be surjective (i.e. use the full range of [0..M])
• Needs to be deterministic (i.e.: same input → same output)
• Should be simple and fast
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 41EN 13.8.1
3.3 Hash Indexes
• Basic Idea– Convert value which is to be indexed to numeric representation
– Hash the value
– Store block anchor of value in the bucket with computed hash index
• Example: M=8
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 42EN 13.8.1
3.3 Hash Indexes
1, Cesar, $ 887.00
5, Miller, $179.99
9, Snyder, $ 856.78
Miller
Cesar
Snyder
h(Miller) = 4
h(Cesar) = 7
H(Snyder) = 1
0
1
2
3
4
5
6
7
8
• Problem: Collision– Hash functions are not injective – collision may happen
• Block pointers for full blocks (overflow list)
– Probability of collision increases with load factor• Deteriorates to linear behavior in worst case
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 43EN 13.8.1
3.3 Hash Indexes
1, Cesar, $ 887.00
5, Miller, $179.99
9, Snyder, $ 856.78
Miller
Cesar
Snyder
h(Miller) = 4
h(Cesar) = 1
h(Snyder) = 1
0
1
2
3
4
5
6
7
8
Smith h(Smith) = 1
Rogers h(Rogers) = 1
3, Rogers, $ 1866.58
7, Smith, $ 0.29
• Problem: Static Address Range– Addressable range is fixed to [0..M]
– What happens if address range is exhausted?• E.g., load factor too high, too many collisions occur
– Possible solutions• Rehashing :
– Create a new larger hashmap and rehash all values
– For example, java hashmap follow that policy
• Extendible Hashing:– Uses dynamic directory to double and half number of buckets
• Linear Hashing:– Buckets are split into two one after another and are individually
rehashed with an additional hash-function
– Not in this lecture
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 44EN 13.8.3
3.3 Hash Indexes
• Extendible Hashing
– Hash values are positive integers between [0..2n]
• Numbers can be represented in binary
• Uses a so-called trie structure
• Choose large n
– Create a directory of depth d with 2d entries for the first d bits of values
• d is global depth of directory
• Directory cells link to buckets containing links to data with hash value starting with given bit pattern
– Adjacent cells may link to the same bucket depending on local depth of bucket
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 45EN 13.8.3
3.3 Hash Indexes
• Extendible Hashing (Bucket size 3)
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 46EN 13.8.3
3.3 Hash Indexes
000
001
010
011
100
101
110
111
Global Depth = 3
Local depth = 1
Local depth = 2
Local depth = 3
Local depth = 3
All hash values starting with 0
All hash values starting with 10
All hash values starting with 110
All hash values starting with 111
0_0010
0_0100
0_1010
10_000
10_101
10_110
110_000
110_010
110_100
111_001
111_010
111_111
• Insert 010000
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 47EN 13.8.3
3.3 Hash Indexes
000
001
010
011
100
101
110
111
Global Depth = 3
Local depth = 1
Local depth = 2
Local depth = 3
Local depth = 3
All hash values starting with 0
All hash values starting with 10
All hash values starting with 110
All hash values starting with 111
0_0010
0_0100
0_1010
10_000
10_101
10_110
110_000
110_010
110_100
111_001
111_010
111_111
Split this bucket, add 010000
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 48EN 13.8.3
3.3 Hash Indexes
000
001
010
011
100
101
110
111
Global Depth = 3
Local depth = 2
Local depth = 2
Local depth = 3
Local depth = 3
All hash values starting with 00
All hash values starting with 10
All hash values starting with 110
All hash values starting with 111
00_010
00_100
01_010
10_000
10_101
10_110
110_000
110_010
110_100
111_001
111_010
111_111
01_000All hash values starting with 01
Local depth = 2
• Extendible Hashing
– Allows to dynamically split and coalesce buckets as database grows and shrinks
• If bucket is full and local depth is lower than global depth:
– More than one cell links to this bucket
– Bucket is split into two with increased local depth and values are distributed accordingly
– Links in cells are adjusted accordingly
• If local depth already equals global depth: – Global depth is increased by one and thus the size of the directory
is doubled
– Each cell is replaced by two cells, both of which contain the same pointer as the original cell
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 49EN 13.8.3
3.3 Hash Indexes
• Summary Hash Indexes: May perform in O(1)
– But: Management overhead can be quite large for
growing data collections
• Overflow lists have small management overhead, but may
decrease access performance to O(n)
• Rehashing is very expensive for each growth stage
• Extendible hashing has a smaller overhead and is especially
suitable for external storage hashing
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 50EN 13.8
3.3 Hash Indexes
• Why not simply index every attribute?
– Physical index on primary key, logical indexes on every
other attribute
• Results in good read efficiency
• But… whenever a DB file is modified, every
index on the file has to be updated
– Updating indices imposes overhead on database
modification (insert, delete, update)
• Especially for multi-level indexes all levels may have to be
updated
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 51
3.3 Indexing
• One of the tasks of the database administrator
• Black Magic…
– Get DB performance statistics
– Try something that seems sensible
– Get new performance statistics
– If it did not improve, reset it to previous configuration
– Try something different until satisfied...
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 52
3.4 Physical Tuning
• Black Magic today needs advanced
tools to operate correctly
• For database tuning, you need
– Snapshot Monitors
• Continuously collect cumulative statistics for
the DB within a given time span
– Event Monitors
• Hook into certain events and analyze them
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 53
3.4 Physical Tuning
• Snapshot Monitors
– Monitor collects cumulative statistics
– Internal counters set at global or session levels
– Collected statistics can be evaluated afterwards
for system tuning on general level
– Data collection is explicitly enabled or disabled
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 54
3.4 Diagnostic Database Tools
• Useful for collecting point-in-time information regarding overall DB/DBMS behavior – Locking –lists all locks & types held by all applications
– Bufferpools – cumulative stats for memory, physical & logical I/O’s, synch/asynch
– Sorts – provides detailed statistics regard sort-heap usage, overflows, # active etc.
– Tablespaces – detailed I/O and activity statistics for a given tablespace
– UOW – displays status of application Unit of Work at point-in-time
– Dynamic SQL – shows contents of package cache and related stats at point-in-time
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 55
3.4 Diagnostic Database Tools
• Snapshot Monitors
– Overhead continuously is in the 3% - 5% range
• But no I/O since counters are usually RAM resident, not written to disk
– Quick list of useful metrics to identify particular problems include:
• bufferpool hit ratios
• sort overflows/problems
• effectiveness of prefetch
• need for indexing
• transaction logging issues
• single query/application resource domination, etc.
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 56
3.4 Diagnostic Database Tools
• Snapshot Monitor: IBM DB2 Performance Expert
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 57
3.4 Diagnostic Database Tools
http://www.redbooks.ibm.com/abstracts/sg246470.html
• Event Monitors
– Collects information regarding events or chains of
events
• No continuously collected data as with snapshots
– Must be explicitly created and activated via DBMS
commands or API’s
– Are the best way to effectively diagnose and resolve
transaction problems (e.g., deadlock issues)
– Output may be directed to a database table and
results then analyzed using SQL
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 58
3.4 Diagnostic Database Tools
• Event Monitor can be used to
– Identify and rank most frequently executed or most
time consuming SQL
– Track the sequence of statements leading up to a lock
timeout or deadlock
– Track connections to the database
– Breakdown overall SQL statement execution time
into sub-operations such as sort time and use or
system CPU time
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 59
3.4 Diagnostic Database Tools
• Event Monitor: DBI Brother Panther
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 60
3.4 Diagnostic Database Tools
http://www.dbisoftware.com
• Physical DB Tuning is a complicated and intransparenttask
• Usually trial and error– Measure some hopefully meaningful metrics of your DB
usage statistics
– Adjust around on something• Mostly indexes and tablespace properties
– Measure again• If results better : Great! Continue tuning!
• If result worse : Bad! Undo everything you did and try something else.
• But what to measure and what to do?– Use best practices or try yourself!
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 61
3.4 The Black Art of Physical Tuning
• What to measure?
• First, what kind of database you have?
– OLTP (Online Transaction Processing) : Database with many small, concurrent read / write queries
• Optimize for fast read write accesses
• Keep an eye on “real-time” response times
• Optimize for concurrency
– Warehouse (OLAP – Online Analytical Processing): Using larger and more complex queries for primary retrieving data
• Optimize for read accesses
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 62
3.4 Physical Tuning
• There is a huge amount of knobs and screws to
turn
– DB planning and tuning is for professionals
– Special certificates by each database vendor
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 63
3.4 Tuning Metrics
http://www.microsoft.com/learning/mcp/mcdba/
http://www-03.ibm.com/certify/certs/dm_index.shtml
IBM DB2 IBM Certified Database AssociateIBM Certified Database Administrator IBM Certified Application Developer DB2IBM Certified Advanced Database Administrator
http://education.oracle.com
• Scott Hayes
– President & CEO of Database-Brothers Inc (DBI)
• If I only had five minutes to assess the status of a
database, I would look at…
– Average Result Set Size
– Index Read Efficiency
– Synchronous Read Percentage
– Average Rows Read per DB Transaction per Table
– Average Read Time
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 64
3.4 Tuning Metrics
• ARSS - Average Result Set Size
– ARSS = RowsSelected / NumSelectQueries
– Rule of Thumb:
• ARSS ≤ 10 indicates a OLTP database
– If you think you have OLTP and ARSS>>10, there is something
wrong…
• ARSS > 10 indicates a warehouse database (OLAP)
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 65
3.4 Tuning Metrics
• IREF - Index Read Efficiency
– “How many rows have to be read to retrieve one row?”• i.e. How much overhead is within the where predicates?
– IREF = RowsRead / RowsReturned• Well-designed indexes may evaluate a where-clause without
reading additional rows!
– Rule of Thumb:• IREF should be less than 10 for OLTP Databases
• May be higher for Warehouses (>100)
– What to do?• Introduce more efficient indexes!
• User simpler queries!
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 66
3.4 Tuning Metrics
• SRP – Synchronous Read Percentage
– Synchronous Read means the DBMS reads just the index file and the required data page (good)
– Asynchronous Read means the DBMS employs prefetching to scan for an index or data (BAD)
– Rule of Thumb:
• >90% for OLTP (>50% for warehouses) : Good
• 80%-90% (25%-50%) : Ok
• 50%-80% (<25%) Bad
– What to do?
• Improve index structures!
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 67
3.4 Tuning Metrics
• TBRRTX – Average Rows Read per DB Transaction per Table
– TBRRTXtable = rowsRead / (attemptedCommits + attemptedRollBacks)
– Rule of Thumb:• <10 for OLTP : Good
• 10-100 : Ok
• >100 : Bad
• >1000 : Horrific
– What to do? • Improve index structures!
• Improve Queries
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 68
3.4 Tuning Metrics
• ART – Average Read Time
– The average time per read
– RT = ReadTime / (NumDataRead + NumIndexRead)
– If you think you did everything within your application
and indices, optimize read time
• Adjust size of tablespace containers
• Distribute containers across disks
– Avoid OS paging device
• Move highly access tablespaces to faster devices
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 69
3.4 Tuning Metrics
• Tree-Index Structures
– Binary Tree
• Naïve Binary Tree
• Balancing Tree
– B-Trees
• B
• B+
• B*
Relational Database Systems 2 – Wolf-Tilo Balke– Institut für Informationssysteme 70
Next Lecture