Slides for the Data Structures (Estrutura de Dados) course, taught by Prof. Dr. Christian Pagot at the Universidade Federal da Paraíba.
Universidade Federal da Paraíba
Centro de Informática
Hash Tables II
Lecture 11
1107186 – Data Structures (Estrutura de Dados) – Section (Turma) 02
Prof. Christian Azambuja Pagot
CI / UFPB
Arbitrary Hash Functions
● Suppose the following hash table structure:
– 10 buckets (numbered 0 to 9).
– Collisions resolved through separate chaining.
● Suppose the following key source:
– Keys are integers that are (for some obscure reason) multiples of 5.
● Suppose the following hash and compression functions:
hash(x) = x
compressed hash(y) = y mod 10
Arbitrary Hash Functions
● Results:
Key    Compressed hash
5      5
10     0
15     5
20     0
25     5
30     0
35     5
40     0
45     5
50     0
The number of buckets (10) and the keys share a common factor (5), so only buckets 0 and 5 are ever used!
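The clustering above can be reproduced with a minimal Python sketch. The function name `compressed_hash` and the comparison against 11 buckets are assumptions added for illustration; the slides only show the 10-bucket case.

```python
from collections import Counter

def compressed_hash(key, num_buckets=10):
    """Identity hash followed by mod compression, as on the slide."""
    return key % num_buckets

keys = range(5, 55, 5)  # the keys 5, 10, ..., 50 from the table

# With 10 buckets, every key lands in bucket 0 or bucket 5.
usage = Counter(compressed_hash(k) for k in keys)
print(sorted(usage.items()))  # [(0, 5), (5, 5)]

# Assumption for comparison: 11 buckets share no factor with the keys,
# so the same ten keys spread over ten different buckets.
prime_usage = Counter(compressed_hash(k, 11) for k in keys)
print(len(prime_usage))  # 10
```

Eight of the ten buckets stay empty in the first case, so every chain is five times longer than it would be under an even spread.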
Arbitrary Hash Functions
● Hashing strings:
– The straightforward approach is to sum the ASCII values of the characters.
– However, since words are reasonably short, the sum won't be that large!
The hashes of thousands of words will be concentrated in the first buckets!
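A small Python sketch of the character-sum approach shows the concentration. The word list and the table size of 10,000 buckets are arbitrary choices made for this illustration, not values from the slides.

```python
def ascii_sum_hash(word):
    """Sum of the character codes of the word (the naive string hash)."""
    return sum(ord(ch) for ch in word)

# Hypothetical sample words and table size, chosen only for illustration.
words = ["hash", "table", "bucket", "key", "chain", "prime"]
num_buckets = 10_000

codes = [ascii_sum_hash(w) % num_buckets for w in words]
print(codes)  # [420, 520, 638, 329, 515, 541]
print(max(codes) < 1000)  # True: only the first tenth of the table is used
```

A lowercase ASCII word of eight letters can sum to at most 8 × 122 = 976, so in a table this large most buckets can never be reached by short words.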
Good Hash Functions
● In a good hash function, any key is equally likely to hash to any bucket:
– Minimize collisions!
● How well a hash function behaves also depends on the distribution of the keys:
– Which we usually do not know!
The Division Method
● A key k is mapped into one of the N buckets by taking the remainder of k divided by N:
h(k) = k mod N
k    k mod 2    k mod 3    k mod 4    k mod 5
1    1          1          1          1
2    0          2          2          2
3    1          0          3          3
4    0          1          0          4
5    1          2          1          0
The Division Method
● Good candidates for N are prime numbers not too close to a power of two:
– Suppose we want to create a hash table to store 2000 items.
– We don't mind if we have to read 3 elements in an unsuccessful search.
– A good value for N is 701, a prime close to 2000/3 ≈ 667 and far from the nearest powers of two (512 and 1024):
h(k) = k mod 701
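The choice of N = 701 can be checked with a short Python sketch. The helper names `is_prime` and `division_hash` are assumptions made for this example, and the average chain length it prints assumes the keys spread evenly over the buckets.

```python
def is_prime(n):
    """Trial division; adequate for small table sizes."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def division_hash(key, num_buckets=701):
    """Division method: h(k) = k mod N."""
    return key % num_buckets

num_items = 2000
num_buckets = 701

print(is_prime(num_buckets))                   # True
print(round(num_items / num_buckets, 2))       # 2.85 items per bucket on average
print(num_buckets - 512, 1024 - num_buckets)   # 189 323: distance to powers of two
print(division_hash(2000))                     # 598
```

With about 2.85 items per chain on average, an unsuccessful search reads roughly 3 elements, which matches the target stated above.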
The Multiplication Method
● Operates in two steps:
– 1) Multiply the key k by a constant A (0 < A < 1) and extract the fractional part of the product.
– 2) Multiply the result by N (the number of buckets) and take the floor.
h(k) = ⌊N (kA mod 1)⌋
The Multiplication Method
k     ⌊N(kA mod 1)⌋ with N=5, A=0.1     ⌊N(kA mod 1)⌋ with N=5, A=0.6180
1     0                                 3
2     1                                 1
3     1                                 4
4     2                                 2
5     2                                 0
6     3                                 3
7     3                                 1
8     4                                 4
9     4                                 2
10    0                                 0
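The table can be regenerated with a minimal Python sketch of the multiplication method. The function name `multiplication_hash` is an assumption, and exact rational arithmetic (`fractions.Fraction`) is used here only to keep floating-point rounding from changing a floor; a production implementation would more likely use integer arithmetic.

```python
from fractions import Fraction
from math import floor

def multiplication_hash(key, num_buckets, a):
    """Multiplication method: h(k) = floor(N * (k*A mod 1))."""
    fractional_part = (key * a) % 1
    return floor(num_buckets * fractional_part)

keys = range(1, 11)
n = 5

# A = 0.1: consecutive keys fall into the same bucket in pairs.
col_a = [multiplication_hash(k, n, Fraction("0.1")) for k in keys]
print(col_a)  # [0, 1, 1, 2, 2, 3, 3, 4, 4, 0]

# A = 0.6180: consecutive keys are scattered across the buckets.
col_b = [multiplication_hash(k, n, Fraction("0.6180")) for k in keys]
print(col_b)  # [3, 1, 4, 2, 0, 3, 1, 4, 2, 0]
```

Both constants use every bucket exactly twice for these ten keys; the difference is that A = 0.6180 sends neighboring keys to different buckets.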
Universal Hashing
● A randomized algorithm H for constructing hash functions
h : U → {0, …, N−1}
is universal if, for all x ≠ y in U, we have
Pr[h(x) = h(y)] ≤ 1/N
where the probability is taken over the random choice of h made by H.
Universal Hashing
● Example:
– Choose a prime number p large enough so that any key k is in Z_p = {0, 1, ..., p−1}.
– Let Z*_p = {1, ..., p−1}.
– a is a number chosen at random from Z*_p and b is a number chosen at random from Z_p.
– A universal hash function h_{a,b} can then be defined as
h_{a,b}(k) = ((ak + b) mod p) mod N
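As a rough check of the universality bound, the Python sketch below enumerates every pair (a, b) for a small prime and counts how often two fixed keys collide. The values p = 101, N = 10 and the keys 3 and 8 are arbitrary assumptions chosen for illustration, as are the function names.

```python
import random

def universal_hash(a, b, p, num_buckets, key):
    """h_{a,b}(k) = ((a*k + b) mod p) mod N."""
    return ((a * key + b) % p) % num_buckets

def random_universal_hash(p, num_buckets, rng=random):
    """Draw a in Z*_p and b in Z_p at random and return the resulting h_{a,b}."""
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return lambda key: universal_hash(a, b, p, num_buckets, key)

def exact_collision_probability(x, y, p, num_buckets):
    """Fraction of all (a, b) pairs for which x and y land in the same bucket."""
    collisions = sum(
        universal_hash(a, b, p, num_buckets, x)
        == universal_hash(a, b, p, num_buckets, y)
        for a in range(1, p)
        for b in range(p)
    )
    return collisions / ((p - 1) * p)

p, num_buckets = 101, 10  # hypothetical small parameters
prob = exact_collision_probability(3, 8, p, num_buckets)
print(prob <= 1 / num_buckets)  # True

h = random_universal_hash(p, num_buckets, random.Random(0))
print(0 <= h(42) < num_buckets)  # True
```

The enumeration is only feasible because p is tiny here; in practice one function is drawn at random, as `random_universal_hash` does, and the bound holds in expectation over that draw.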
Perfect Hashing
● A hash function is perfect if the complexity of a search is O(1) (constant) in the worst case.
● Perfect hashing is accomplished by using universal hashing in two levels:
– In the first level, a hash function selected from a family of universal hash functions is used.
– Collisions are resolved by inserting the keys that collide in a bucket into a secondary hash table with its own associated hash function.
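The two-level idea can be sketched in Python for a fixed set of keys. Several details below are assumptions not stated in the slides: the secondary table for a bucket holding m keys gets m² slots, its hash function is redrawn until no two keys collide (the usual textbook construction), keys are distinct non-negative integers smaller than the prime P, and all names are hypothetical.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; assumed larger than every key

def random_hash(num_buckets, rng):
    """Draw h_{a,b}(k) = ((a*k + b) mod P) mod num_buckets at random."""
    a, b = rng.randrange(1, P), rng.randrange(P)
    return lambda k: ((a * k + b) % P) % num_buckets

def build_perfect_table(keys, seed=0):
    """Build a static two-level table for a non-empty collection of keys."""
    rng = random.Random(seed)
    keys = list(set(keys))
    n = len(keys)
    h1 = random_hash(n, rng)            # first level: n buckets
    groups = [[] for _ in range(n)]
    for k in keys:
        groups[h1(k)].append(k)

    secondary = []
    for group in groups:
        size = len(group) ** 2          # assumption: quadratic secondary size
        while True:                     # redraw h2 until it is collision-free
            h2 = random_hash(size, rng) if size else None
            slots = [None] * size
            if all(_place(slots, h2(k), k) for k in group):
                break
        secondary.append((h2, slots))
    return h1, secondary

def _place(slots, index, key):
    """Store key at index; report False if the slot is already taken."""
    if slots[index] is not None:
        return False
    slots[index] = key
    return True

def contains(table, key):
    """Worst-case O(1) lookup: one first-level and one second-level probe."""
    h1, secondary = table
    h2, slots = secondary[h1(key)]
    return bool(slots) and slots[h2(key)] == key

table = build_perfect_table([10, 22, 37, 40, 52, 60, 70, 72, 75])
print(contains(table, 37), contains(table, 11))  # True False
```

Because each secondary table is collision-free, a lookup never walks a chain, which is what gives the constant worst-case search time described above. The sketch covers only a static key set; insertions after construction would require rebuilding.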