  Indexing By: Arnold Mesa

  • Indexing. By: Arnold Mesa

  • Indexing: You can think of an index to a file as being like the catalogue of a library.

  • There are two kinds...

    Ordered Indices - based on a sorted ordering of the values.

    Hash Indices - based on a uniform distribution of values across a range of buckets. The bucket a value is assigned to is determined by a hash function.
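The contrast between the two kinds can be sketched in a few lines. This is a minimal illustration, not a real storage engine: the record values, `NUM_BUCKETS`, and the toy hash function `bucket_of` are all assumptions made for the example.

```python
import bisect

# Hypothetical records: (search key, room number). Values are illustrative only.
records = [("Hilton", 101), ("Hotel Sofitel", 77), ("Marriot", 532),
           ("The Ritz", 5), ("Westin", 9)]

# Ordered index: keys kept in sorted order, so binary search and range scans work.
ordered_keys = sorted(key for key, _ in records)

def ordered_lookup(key):
    """Return True if key is present, using binary search on the sorted keys."""
    pos = bisect.bisect_left(ordered_keys, key)
    return pos < len(ordered_keys) and ordered_keys[pos] == key

def ordered_range(low, high):
    """Return all keys k with low <= k <= high, in sorted order."""
    lo = bisect.bisect_left(ordered_keys, low)
    hi = bisect.bisect_right(ordered_keys, high)
    return ordered_keys[lo:hi]

# Hash index: a hash function assigns each key to one of a fixed number of buckets.
NUM_BUCKETS = 4  # assumed bucket count

def bucket_of(key):
    # Toy deterministic hash (sum of character codes); an assumption for this sketch.
    return sum(ord(c) for c in key) % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]
for key, room in records:
    buckets[bucket_of(key)].append((key, room))

def hash_lookup(key):
    """Return the room for key, or None, by searching only one bucket."""
    for k, room in buckets[bucket_of(key)]:
        if k == key:
            return room
    return None
```

The ordered index supports range queries such as `ordered_range("H", "N")`, while the hash index only answers exact-match lookups such as `hash_lookup("Marriot")`.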

  • Key Concepts

    Access Types - the types of access that are supported efficiently

    Access Time - the time it takes to access a particular data item

    Insertion Time - the time it takes to insert a data item

    Deletion Time - the time it takes to delete a data item

    Space Overhead - the additional space occupied by an index structure

  • There are two kinds of ordered indices

    Dense Index - An index record appears for every search-key value in the file. The index record contains the search-key value and a pointer to the first data record with that value. Any other records with the same search-key value are stored sequentially after the first record.

    Sparse Index - An index record appears for only some of the search-key values, so there are fewer index records. Each index record contains a search-key value and a pointer to the first data record with that value, as with the dense index. To find a record, we locate the index record with the largest search-key value that is less than or equal to the one we want, then scan the file sequentially from there.
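The two definitions above can be sketched as follows. This is only an illustration under stated assumptions: the file is a Python list already sorted by search key, "pointers" are list positions, and the record contents and the `every` parameter are invented for the example.

```python
import bisect

# Hypothetical data file, sorted by search key: (search_key, record).
data_file = [
    ("Hilton", "#12"),
    ("Hilton", "#40"),
    ("Hotel Sofitel", "#3"),
    ("Marriot", "#532"),
    ("The Ritz", "#7"),
    ("Westin", "#88"),
]

def build_dense_index(file):
    """One entry per distinct search-key value -> position of its first record."""
    index = {}
    for pos, (key, _) in enumerate(file):
        index.setdefault(key, pos)  # keep only the first position for each key
    return index

def build_sparse_index(file, every=3):
    """Keep only every `every`-th entry of the dense index, in key order."""
    return sorted(build_dense_index(file).items())[::every]

def dense_lookup(file, dense, key):
    """Jump straight to the first record for key, then read its duplicates."""
    pos = dense.get(key)
    matches = []
    while pos is not None and pos < len(file) and file[pos][0] == key:
        matches.append(file[pos][1])
        pos += 1
    return matches

def sparse_lookup(file, sparse, key):
    """Start at the largest indexed key <= key, then scan the file sequentially."""
    i = bisect.bisect_right([k for k, _ in sparse], key) - 1
    if i < 0:
        return []  # key sorts before the first record in the file
    pos = sparse[i][1]
    matches = []
    while pos < len(file) and file[pos][0] <= key:
        if file[pos][0] == key:
            matches.append(file[pos][1])
        pos += 1
    return matches
```

With `every=3` the sparse index holds only the entries for Hilton and The Ritz, so looking up Marriot starts at the Hilton entry and scans forward. That is the trade-off: fewer index records, at the cost of a short sequential scan.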

  • Dense Index (figure: index entries for Hotel Sofitel, Hilton, Westin, Marriot, The Ritz)

  • Sparse Index (figure: index entries for Hotel Sofitel, Westin, The Ritz)

  • Suppose we want to find the Marriot #532... (figure: sparse index with entries for Hotel Sofitel, Westin, The Ritz)

  • Efficiency Issues

    Even if we use a sparse index, the index itself may become too large for efficient processing.

    If an index is small enough to be kept in main memory, the search time is low.

    If the index is so large that it must be kept on disk, a search may require several disk block reads.

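  • How to deal ...

    With a large index, we should construct a sparse index on the primary index.

    (figure: a primary index with entries for Hotel Sofitel, Hilton, Westin, Marriot, The Ritz, and a sparse outer index above it with entries for Hotel Sofitel and Marriot)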
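A two-level arrangement of this kind might look like the sketch below. It assumes the primary (inner) index is sorted and split into fixed-size blocks, and that the small outer index fits in memory. `BLOCK_SIZE`, the record pointers, and the function name `lookup` are hypothetical.

```python
import bisect

BLOCK_SIZE = 2  # assumed number of index entries per disk block

# Inner (primary) index: sorted (search_key, record_pointer) entries.
inner_entries = [("Hilton", 0), ("Hotel Sofitel", 1), ("Marriot", 2),
                 ("The Ritz", 3), ("Westin", 4)]

# Split the inner index into blocks, as it would be laid out on disk.
inner_blocks = [inner_entries[i:i + BLOCK_SIZE]
                for i in range(0, len(inner_entries), BLOCK_SIZE)]

# Outer index: sparse, one entry per inner block (the first key in that block).
outer_index = [block[0][0] for block in inner_blocks]

def lookup(key):
    """Return (record_pointer or None, number of inner-index blocks read)."""
    b = bisect.bisect_right(outer_index, key) - 1  # search the in-memory outer index
    if b < 0:
        return None, 0  # key sorts before every indexed key
    for k, ptr in inner_blocks[b]:  # read exactly one inner-index block
        if k == key:
            return ptr, 1
    return None, 1
```

In this sketch `lookup("Marriot")` reads a single inner block regardless of how many blocks the primary index spans, because the outer index narrows the search to one block first.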

  • Is this looking familiar? Remember B+-trees

    B+-trees are said to be of order m, where m is a number of the designer's choosing.

    Each internal node (other than the root) has between ⌈m/2⌉ and m children.

    All data is stored at the leaf level.

    All leaves are at the same depth.
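As a rough illustration of these shape rules, the sketch below checks a small tree. It assumes the common textbook convention that a non-root internal node has between ⌈m/2⌉ and m children and that the root has at least 2. The tree encoding (tuples for leaves, lists for internal nodes) and the name `check_bplus_shape` are made up for the example.

```python
import math

def check_bplus_shape(node, m, is_root=True):
    """Return the depth of the leaves below `node`, or raise ValueError.

    A leaf is a tuple of keys; an internal node is a list of child nodes.
    """
    if isinstance(node, tuple):
        return 0  # leaves hold the data and have no children
    low = 2 if is_root else math.ceil(m / 2)  # assumed lower bound on children
    if not low <= len(node) <= m:
        raise ValueError(f"node has {len(node)} children, expected {low}..{m}")
    depths = {check_bplus_shape(child, m, is_root=False) for child in node}
    if len(depths) != 1:
        raise ValueError("leaves are at different depths")
    return depths.pop() + 1

# A small order-4 tree: the root has 2 children, each with 2 or 3 leaves.
good_tree = [[(1, 2), (3, 4)], [(5, 6), (7, 8), (9,)]]

# A malformed tree: one leaf hangs directly off the root, so depths differ.
bad_tree = [[(1,), (2,)], (3,)]
```

Calling `check_bplus_shape(good_tree, 4)` returns the common leaf depth, while the same call on `bad_tree` raises `ValueError` because its leaves sit at different depths.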

  • Example?