7/31/2019 data structure notes for computer engineering
http://slidepdf.com/reader/full/data-structure-notes-for-computer-engineering 1/17
Hashing
• Hashing, or hash addressing, is a searching technique whose search time is essentially independent of the number n of elements.
• Hashing is used to index and retrieve items in a database because it is faster to find an item using the shorter hashed key than to find it using the original value.
• A bucket in a hash file is a unit of storage (typically a disk block) that can hold one or more records.
Hash function
• Hash functions are mostly used in hash tables,
to quickly locate a data record given its search
key.
– If element e has key k and h is the hash function, then e is stored in position h(k) of the table.
– To search for e, compute h(k) to locate its position. If no element is stored there, the dictionary does not contain e.
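The store and search steps above can be sketched in Python as follows. This is a minimal illustration, not a complete dictionary: the class name SimpleTable, the table size 11, and the choice h(k) = k mod size are assumptions made for the example, and collisions are deliberately not handled.

```python
class SimpleTable:
    """Stores each element at position h(k). Collisions are NOT handled here."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size

    def h(self, k):
        # Assumed hash function for illustration: remainder by the table size.
        return k % self.size

    def store(self, k, e):
        # Element e with key k goes to position h(k); an older entry is overwritten.
        self.slots[self.h(k)] = (k, e)

    def search(self, k):
        entry = self.slots[self.h(k)]
        # An empty position, or a different key there, means k is not in the table.
        if entry is None or entry[0] != k:
            return None
        return entry[1]


table = SimpleTable(11)
table.store(25, "record-25")
print(table.search(25))  # the stored record
print(table.search(14))  # 14 maps to the same position as 25, but is absent
```

Keys 25 and 14 both map to position 3 in this sketch, which is exactly the situation that a collision resolution scheme has to deal with.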
• Specifically, the hash function is used to map the search key to the index of a slot in the table where the corresponding record is supposedly stored.
• What are the characteristics of a good hash function?
– A good hash function avoids collisions.
– A good hash function tends to spread keys evenly in the array.
– A good hash function is easy to compute.
Popular hash functions
• Division Method:
Choose a number m larger than the number n of keys in K. The number m is usually chosen to be a prime number or a number without small divisors, since this frequently minimizes the number of collisions.
The hash function H is defined by
H(k) = k (mod m) or H(k) = k (mod m) + 1
The first form gives addresses from 0 to m-1; the second is used when addresses should range from 1 to m.
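A short Python sketch of the division method follows. The function names, the sample keys, and the choice m = 97 are assumptions made for illustration only.

```python
def division_hash(k, m):
    """H(k) = k (mod m): addresses range from 0 to m-1."""
    return k % m


def division_hash_from_one(k, m):
    """H(k) = k (mod m) + 1: addresses range from 1 to m."""
    return k % m + 1


keys = [3205, 7148, 2345]  # sample keys, chosen for illustration
m = 97                     # a prime close to the desired table size

print([division_hash(k, m) for k in keys])           # [4, 67, 17]
print([division_hash_from_one(k, m) for k in keys])  # [5, 68, 18]
```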
• Midsquare method:
The key k is squared. Then the hash function H is defined by
H(k) = I
where I is obtained by deleting digits from both ends of k². The same digit positions of k² must be used for all keys.
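The midsquare method might be sketched in Python as below. The function name and its two parameters are hypothetical: drop_right is the number of low-order digits deleted from the square, and width is the number of digits kept, so the same positions are used for every key.

```python
def midsquare_hash(k, drop_right, width):
    """Square k, drop `drop_right` low-order digits, keep the next `width` digits."""
    square = k * k
    return (square // 10 ** drop_right) % 10 ** width


# 3205 squared is 10272025; dropping the last three digits and keeping two gives 72.
print(midsquare_hash(3205, drop_right=3, width=2))  # 72
print(midsquare_hash(7148, drop_right=3, width=2))  # 93
```

Selecting the digits arithmetically, counted from the right, keeps the positions fixed even when squares of different keys have different lengths.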
• Folding Method: The key k is partitioned into a number of parts k1, k2, …, kr, where each part, except possibly the last, has the same number of digits as the required address. The parts are then added together, ignoring the last carry, that is,
H(k) = k1 + k2 + … + kr
where the leading-digit carries, if any, are ignored. Sometimes, for extra "milling", the even-numbered parts k2, k4, … are each reversed before the addition. For example, with two-digit addresses, 4502 splits into 45 and 02, so H(4502) = 45 + 02 = 47, or, with the even-numbered part reversed, H(4502) = 45 + 20 = 65.
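A possible Python sketch of the folding method is shown below. The function name, the part_len parameter, and the sample keys are assumptions for illustration; the key is split from the left into parts of part_len digits, and the carry out of the leading digit is discarded with a modulo.

```python
def folding_hash(k, part_len, reverse_even=False):
    """Split the decimal digits of k into parts, add them, and drop the last carry."""
    digits = str(k)
    parts = [digits[i:i + part_len] for i in range(0, len(digits), part_len)]
    if reverse_even:
        # Parts are numbered from 1, so k2, k4, ... sit at list indices 1, 3, ...
        parts = [p[::-1] if i % 2 == 1 else p for i, p in enumerate(parts)]
    total = sum(int(p) for p in parts)
    # Keeping only part_len digits ignores the carry out of the leading digit.
    return total % 10 ** part_len


print(folding_hash(3205, 2))                     # 32 + 05 = 37
print(folding_hash(7148, 2))                     # 71 + 48 = 119, carry dropped: 19
print(folding_hash(7148, 2, reverse_even=True))  # 71 + 84 = 155, carry dropped: 55
```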
There are two broad kinds of hashing:
open hashing and closed hashing.
1. Open Hashing
• Each bucket in the hash table is the head of a
linked list
• All elements that hash to a particular bucket
are placed on that bucket’s linked list
• Records within a bucket can be ordered in
several ways
– by order of insertion, by key value order, or by
frequency of access order
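The bucket structure described above could look roughly like this in Python. The class names, the default of 7 buckets, and the hash function key % number of buckets are assumptions for the sketch; records in a bucket are kept in order of insertion, which is one of the orderings listed.

```python
class Node:
    """One record on a bucket's linked list."""

    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next_node = next_node


class ChainedHashTable:
    def __init__(self, num_buckets=7):
        # Each bucket holds the head of a linked list (None while empty).
        self.heads = [None] * num_buckets

    def _index(self, key):
        # Assumed hash function for illustration.
        return key % len(self.heads)

    def insert(self, key, value):
        i = self._index(key)
        if self.heads[i] is None:
            self.heads[i] = Node(key, value)
            return
        node = self.heads[i]
        while True:
            if node.key == key:  # key already present: update in place
                node.value = value
                return
            if node.next_node is None:
                break
            node = node.next_node
        node.next_node = Node(key, value)  # append: order of insertion

    def search(self, key):
        node = self.heads[self._index(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next_node
        return None

    def chain_keys(self, i):
        """Keys on bucket i's list, from head to tail."""
        keys, node = [], self.heads[i]
        while node is not None:
            keys.append(node.key)
            node = node.next_node
        return keys


t = ChainedHashTable()
for key in (10, 17, 24):  # all three keys hash to bucket 3
    t.insert(key, f"rec-{key}")
print(t.chain_keys(3))    # [10, 17, 24]
print(t.search(17))       # rec-17
```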
Example
(Figure: a bucket table with entries 0, 1, 2, 3, 4, ..., D-1, each bucket heading a linked list.)
2. Closed Hashing
• A closed hash table keeps the members of the
set in the bucket table rather than using that
table to store list headers.
• At most one element is stored in any bucket.
Collision
• Multiple keys can hash to the same slot
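(Figure: the actual keys K = {k1, ..., k5}, a subset of the universe of keys U, are mapped by h into table slots 0 to m-1. Slots h(k1), h(k3) and h(k4) each receive one key, while h(k2) = h(k5) is a collision.)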
Collision Resolution Techniques
• There are two broad ways of collision resolution:
1. Separate Chaining: An array-of-linked-lists implementation.
2. Open Addressing: Array-based implementation.
(i) Linear probing (linear search)
(ii) Quadratic probing (nonlinear search)
(iii) Double hashing (uses two hash functions)
Collision Resolution by Chaining
(Figure: the actual keys K = {k1, ..., k8}, a subset of the universe of keys U, are hashed into table slots 0 to m-1. Keys that hash to the same slot are linked together in that slot's list.)
The hash table is implemented as an array of linked lists.
Open addressing
• A method in which a hash collision is resolved by probing, or searching through alternate locations in the array (the probe sequence), until either the target record is found or an unused slot is reached, which indicates that the key is not in the table.
• Linear probing resolves hash collisions by sequentially searching the hash table for a free location, starting from the slot given by the hash function.
– newLocation = (startingValue + stepSize) % arraySize
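The linear probing rule above can be sketched in Python as follows. The class name, the default array size of 11, the step size of 1, and the hash function key % array_size are assumptions for illustration. Deletion is left out, since it needs extra care in open addressing.

```python
class LinearProbingTable:
    def __init__(self, array_size=11, step_size=1):
        self.array_size = array_size
        self.step_size = step_size  # should share no common factor with array_size
        self.slots = [None] * array_size

    def _probe(self, key):
        """Yield the probe sequence for key, at most array_size locations."""
        location = key % self.array_size  # assumed hash function
        for _ in range(self.array_size):
            yield location
            # newLocation = (startingValue + stepSize) % arraySize
            location = (location + self.step_size) % self.array_size

    def insert(self, key, value):
        for location in self._probe(key):
            slot = self.slots[location]
            if slot is None or slot[0] == key:
                self.slots[location] = (key, value)
                return location
        raise RuntimeError("hash table is full")

    def search(self, key):
        for location in self._probe(key):
            slot = self.slots[location]
            if slot is None:  # a free location ends the search: key is absent
                return None
            if slot[0] == key:
                return slot[1]
        return None


table = LinearProbingTable()
# 5, 16 and 27 all hash to slot 5, so the later keys move to slots 6 and 7.
print([table.insert(k, f"rec-{k}") for k in (5, 16, 27)])  # [5, 6, 7]
print(table.search(27))                                    # rec-27
```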
• Quadratic probing operates by taking the
original hash value and adding successive
values of an arbitrary quadratic polynomial to
the starting value.
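• Double hashing: a second hash function is used to compute the step size of the probe sequence, so keys that collide in the first hash function usually follow different probe sequences.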
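The probe sequences of the two schemes above can be compared with a small Python sketch. Everything here is an assumption for illustration: the function names, the polynomial c1*i + c2*i*i with default coefficients c1 = 0 and c2 = 1, and the second hash function 1 + key % (array_size - 1), which is one common choice and never yields a step of zero.

```python
def quadratic_probe_sequence(start, array_size, num_probes, c1=0, c2=1):
    """Locations start + c1*i + c2*i*i (mod array_size) for i = 0, 1, 2, ..."""
    return [(start + c1 * i + c2 * i * i) % array_size for i in range(num_probes)]


def double_hash_sequence(key, array_size, num_probes):
    """Locations h1 + i*h2 (mod array_size); the step h2 depends on the key."""
    h1 = key % array_size            # first hash function: starting location
    h2 = 1 + key % (array_size - 1)  # second hash function: step size, never 0
    return [(h1 + i * h2) % array_size for i in range(num_probes)]


print(quadratic_probe_sequence(3, 11, 5))  # [3, 4, 7, 1, 8]
# Keys 27 and 16 both start at slot 5 but then diverge.
print(double_hash_sequence(27, 11, 4))     # [5, 2, 10, 7]
print(double_hash_sequence(16, 11, 4))     # [5, 1, 8, 4]
```

With a prime array size, the double hashing sequence in this sketch reaches every slot, whereas a quadratic sequence may revisit some slots and never reach others.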