08 Hash Tables


Analysis of Algorithms: Hash Tables

Andres Mendez-Vazquez

October 3, 2014


Outline

1 Basic data structures and operations
2 Hash tables
   - Hash tables: Concepts
   - Analysis of hashing under Chaining
3 Hashing Methods
   - The Division Method
   - The Multiplication Method
   - Clustering Analysis of Hashing Functions
   - A Possible Solution: Universal Hashing
4 Open Addressing
   - Linear Probing
   - Quadratic Probing
   - Double Hashing
5 Exercises


First: About Basic Data Structures

Remark
It is quite interesting to notice that many data structures actually share similar operations!

Yes
If you think of them as ADTs.


Examples

Search(S, k)
Example: Search in a BST

[Figure: a BST with root 8; searching for k = 7 follows the path 8 → 3 → 6 → 7.]


Examples

Insert(S, x)
Example: Insert in a linked list

[Figure: inserting a new node K into a singly linked list pointed to by firstNode and terminated by NULL.]


And Again

Delete(S, x)
Example: Delete in a BST

[Figure: a BST shown before and after deleting the key 6.]


Basic data structures and operations.

Therefore
These are basic structures; it is up to you to read about them.

See Chapter 10 of Cormen's book.


Hash tables: Concepts

Definition
A hash table or hash map T is a data structure, most commonly an array, that uses a hash function to efficiently map identifiers, or keys (e.g. person names), to associated values.

Advantages
They have the advantage of an expected complexity of operations of O(1 + α).
  - Still, be aware of α.

However, if you have a large universe of keys U
Then it is impractical to store a table of the size of |U|. Thus, you can use a hash function h : U → {0, 1, ..., m − 1}.


When you have a small universe of keys, U

Remarks
It is not necessary to map the key values: key values are direct addresses into the array.
This is a direct implementation, or a direct-address table.

Operations
1 Direct-Address-Search(T, k)
   - return T[k]
2 Direct-Address-Insert(T, x)
   - T[x.key] = x
3 Direct-Address-Delete(T, x)
   - T[x.key] = NIL

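As a minimal sketch (not from the slides), a direct-address table in C; keys are assumed to be small non-negative integers below a bound M, and the names are illustrative:

#include <stddef.h>

#define M 100                        /* size of the key universe U = {0, ..., M-1} */

typedef struct Item { int key; /* satellite data would go here */ } Item;

static Item *T[M];                   /* slot k holds the element with key k, or NULL */

Item *direct_address_search(int k)   { return T[k]; }
void  direct_address_insert(Item *x) { T[x->key] = x; }
void  direct_address_delete(Item *x) { T[x->key] = NULL; }

Each operation is a single array access, hence O(1) worst-case time; the price is Θ(|U|) space.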

When you have a large universe of keys, U

Then
It is impractical to store a table of the size of |U|.

You can use a special function for mapping

h : U → {0, 1, ..., m − 1}   (1)

Problem
With a large enough universe U, two keys can hash to the same value.

This is called a collision.


Collisions

This is a problem
We might try to avoid it by using a suitable hash function h.

Idea
Make h appear "random" enough to avoid collisions altogether (highly improbable) or to minimize their probability.

You still have the problem of collisions
Possible solutions to the problem:
1 Chaining
2 Open Addressing


Hash tables: Chaining

A Possible Solution
Insert the elements that hash to the same slot into a linked list.

[Figure: keys from the universe U hashed into table slots, with colliding keys chained together in linked lists.]


Analysis of hashing with Chaining: Assumptions

Assumptions
We have a load factor α = n/m, where m is the size of the hash table T and n is the number of elements to store.
Simple uniform hashing property:
  - This means that any of the m slots is equally likely to be selected.
  - This means that if n = n₀ + n₁ + ... + nₘ₋₁, we have that E[nⱼ] = α.

To simplify the analysis, you need to consider two cases
Unsuccessful search
Successful search


Why?

After all
You are always looking for keys when:
  - Searching
  - Inserting
  - Deleting

It is clear that we have two possibilities
Finding the key or not finding the key.


For this, we have the following theorems

Theorem 11.1
In a hash table in which collisions are resolved by chaining, an unsuccessful search takes average-case time Θ(1 + α), under the assumption of simple uniform hashing.

Theorem 11.2
In a hash table in which collisions are resolved by chaining, a successful search takes average-case time Θ(1 + α), under the assumption of simple uniform hashing.


Analysis of hashing: Constant time.

Finally
These two theorems tell us that if n = O(m), then

α = n/m = O(m)/m = O(1)

That is, the search time is constant on average.


Analysis of hashing: Which hash function?

Consider that:
Good hash functions should maintain the property of simple uniform hashing!
  - Each key has the same probability 1/m of being hashed to any bucket.
  - A uniform hash function minimizes the likelihood of an overflow when keys are selected at random.

Then: what should we use?
If we know that the keys are distributed uniformly in the interval 0 ≤ k < 1, then h(k) = ⌊km⌋.


What if...
Question: What about keys that follow a normal distribution?


Possible hash functions when the keys are natural numbers

The division method
h(k) = k mod m.
Good choices for m are primes not too close to a power of 2.

The multiplication method
h(k) = ⌊m(kA mod 1)⌋ with 0 < A < 1.
The value of m is not critical.
Easy to implement in a computer.


When they are not, we need to interpret the keys as natural numbers

Keys interpreted as natural numbers
Given the string "pt", we can say p = 112 and t = 116 (their ASCII codes).
ASCII has 128 possible symbols.
  - Then (128¹ × 112) + (128⁰ × 116) = 14452.

Nevertheless
This is highly dependent on the origins of the keys!!!

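A small sketch of this interpretation in C, assuming ASCII input and that the key fits in an unsigned long (a real implementation would reduce mod m while accumulating, to avoid overflow):

unsigned long key_from_string(const char *s)
{
    unsigned long k = 0;
    for (; *s != '\0'; s++)
        k = k * 128 + (unsigned char)*s;   /* radix-128 accumulation */
    return k;                              /* "pt" -> 112*128 + 116 = 14452 */
}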


Hashing methods: The division method

Hash function
h(k) = k mod m

Problems with some selections
m = 2ᵖ: h(k) is just the p lowest-order bits of k.
m = 2ᵖ − 1: when k is a character string interpreted in radix 2ᵖ, permuting the characters of k does not change the value.

It is better to select
Prime numbers not too close to an exact power of two.
  - For example, given n = 2000 elements, we can use m = 701, because it is near 2000/3 but not near a power of two.

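The method itself is a one-liner in C; a sketch using the slide's example m = 701:

#define M 701u   /* a prime not too close to a power of two */

unsigned hash_division(unsigned long k)
{
    return (unsigned)(k % M);   /* h(k) = k mod m */
}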


Hashing methods: The multiplication method

The multiplication method for creating hash functions has two steps
1 Multiply the key k by a constant A in the range 0 < A < 1 and extract the fractional part of kA.
2 Then multiply that value by m and take the floor: h(k) = ⌊m(kA mod 1)⌋.

The mod extracts the fractional part!!!
kA mod 1 = kA − ⌊kA⌋, 0 < A < 1.

Advantages:
m is not critical; normally m = 2ᵖ.


Implementing in a computer

First
Imagine that a machine word has w bits and that k fits in a single word.

Second
Then select an s in the range 0 < s < 2ʷ and let A = s/2ʷ.

Third
Now we multiply k by the number s = A·2ʷ.


Example

Fourth
The result is k·s = r₁·2ʷ + r₀, a 2w-bit value, where the p most significant bits of r₀ form the desired hash value.

Graphically

[Figure: the 2w-bit product k·s split into a high word r₁ and a low word r₀; the hash extracts the top p bits of r₀.]

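A sketch of this fixed-point trick for w = 32. The constant s = 2654435769 ≈ ((√5 − 1)/2) · 2³² (Knuth's suggested A) is an assumption of this sketch, not taken from the slides; 1 ≤ p ≤ 32:

#include <stdint.h>

uint32_t hash_multiplication(uint32_t k, unsigned p)
{
    const uint32_t s = 2654435769u;  /* s = A * 2^w with A ≈ (sqrt(5) - 1)/2 */
    uint32_t r0 = k * s;             /* low word r0 of the 2w-bit product k*s */
    return r0 >> (32 - p);           /* the p most significant bits of r0 */
}

With m = 2ᵖ, the final shift extracts exactly the hash value described above.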


However

Sooner or later
We may pick a hash function that does not give us the desired uniform randomization property.

Thus
We are required to analyze the possible clustering of the data by the hash function.


Measuring Clustering through a metric C

Definition
If bucket i contains nᵢ elements, then

C = (m / (n − 1)) · [ (Σᵢ₌₁ᵐ nᵢ²) / n − 1 ]   (2)

Properties
1 If C = 1, then you have uniform hashing.
2 If C > 1, the performance of the hash table is slowed down by clustering by approximately a factor of C.
3 If C < 1, the spread of the elements is more even than uniform!!! Not going to happen!!!

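A sketch of how one might compute C from per-bucket counts (counts[i] = nᵢ; assumes n > 1):

double clustering_metric(const unsigned counts[], unsigned m)
{
    double n = 0.0, sum_sq = 0.0;
    for (unsigned i = 0; i < m; i++) {
        n      += counts[i];
        sum_sq += (double)counts[i] * counts[i];   /* accumulate n_i^2 */
    }
    return (m / (n - 1.0)) * (sum_sq / n - 1.0);   /* Equation (2) */
}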

However

Unfortunately
Hash tables do not provide a way to measure clustering.

Thus, table designers
They should provide some clustering estimation as part of the interface.

Thus
The clustering measure works because it is based on an estimate of the variance of the distribution of bucket sizes.


Thus

First
If clustering is occurring, some buckets will have more elements than they should, and some will have fewer.

Second
There will be a wider range of bucket sizes than one would expect from a random hash function.


Analysis of C : First, keys are uniformly distributed

Consider the following random variable
Consider bucket i containing nᵢ elements, and let Xᵢⱼ = I{element j lands in bucket i}.

Then, given

nᵢ = Σⱼ₌₁ⁿ Xᵢⱼ   (3)

We have that

E[Xᵢⱼ] = 1/m,  E[Xᵢⱼ²] = 1/m   (4)


Next

We look at the dispersion of Xᵢⱼ

Var[Xᵢⱼ] = E[Xᵢⱼ²] − (E[Xᵢⱼ])² = 1/m − 1/m²   (5)

What about the expected number of elements at each bucket?

E[nᵢ] = E[Σⱼ₌₁ⁿ Xᵢⱼ] = n/m = α   (6)


Then, we have

Because of the independence of {Xᵢⱼ}, the scattering of nᵢ is

Var[nᵢ] = Var[Σⱼ₌₁ⁿ Xᵢⱼ] = Σⱼ₌₁ⁿ Var[Xᵢⱼ] = n·Var[Xᵢⱼ]


Then
What about the range of the possible number of elements at each bucket?

Var[nᵢ] = n/m − n/m² = α − α/m

But we have that

E[nᵢ²] = E[Σⱼ₌₁ⁿ Xᵢⱼ² + Σⱼ₌₁ⁿ Σₖ₌₁,ₖ≠ⱼⁿ XᵢⱼXᵢₖ]   (7)

Or

E[nᵢ²] = n/m + Σⱼ₌₁ⁿ Σₖ₌₁,ₖ≠ⱼⁿ 1/m²   (8)


Thus

We re-express this in terms of expected values of nᵢ

E[nᵢ²] = n/m + n(n − 1)/m²   (9)

Then

E[nᵢ²] − E[nᵢ]² = n/m + n(n − 1)/m² − n²/m² = n/m − n/m² = α − α/m


Then

Finally, we have that

E[nᵢ²] = α(1 − 1/m) + α²   (10)


Then, we have that

Now we build an estimator of the mean of nᵢ², which is part of C

(1/n) Σᵢ₌₁ᵐ nᵢ²   (11)

Thus

E[(1/n) Σᵢ₌₁ᵐ nᵢ²] = (1/n) Σᵢ₌₁ᵐ E[nᵢ²]
                  = (m/n) [α(1 − 1/m) + α²]
                  = 1 − 1/m + α


Finally

We can plug this back into C using the expected value

E[C] = (m/(n − 1)) [ E[(Σᵢ₌₁ᵐ nᵢ²)/n] − 1 ]
     = (m/(n − 1)) [ 1 − 1/m + α − 1 ]
     = (m/(n − 1)) [ n/m − 1/m ]
     = (m/(n − 1)) [ (n − 1)/m ]
     = 1


Explanation

Using a hash function that enforces a uniform distribution over the buckets
We get that C = 1, i.e., the best distribution of keys.


Now, suppose we have a really horrible hash function ≡ it hits only one of every b buckets

Thus

E[Xᵢⱼ] = E[Xᵢⱼ²] = b/m   (12)

Thus, we have

E[nᵢ] = αb   (13)

Then, we have

E[(1/n) Σᵢ₌₁ᵐ nᵢ²] = (1/n) Σᵢ₌₁ᵐ E[nᵢ²] = αb − b/m + 1


Finally

We can plug this back into C using the expected value

E[C] = (m/(n − 1)) [ E[(Σᵢ₌₁ᵐ nᵢ²)/n] − 1 ]
     = (m/(n − 1)) [ αb − b/m + 1 − 1 ]
     = (m/(n − 1)) [ nb/m − b/m ]
     = (m/(n − 1)) [ b(n − 1)/m ]
     = b


Explanation

Using a hash function that hits only one of every b buckets
We get that C = b > 1, i.e., a really bad distribution of the keys!!!

Thus, you only need the following to evaluate a hash function:

(1/n) Σᵢ₌₁ᵐ nᵢ²   (14)


A Possible Solution: Universal Hashing

Issues
In practice, keys are not randomly distributed.
Any fixed hash function might yield Θ(n) retrieval time.

Goal
To find hash functions that produce uniform random table indexes irrespective of the keys.

Idea
To select a hash function at random, from a designed class of functions, at the beginning of the execution.


Hashing methods: Universal hashing

Example

[Figure: a set of hash functions, from which one is chosen at random at the beginning of the execution and used for the hash table.]


Definition of universal hash functions

Definition
Let H = {h : U → {0, 1, ..., m − 1}} be a family of hash functions. H is called a universal family if

∀x, y ∈ U, x ≠ y : Pr_{h∈H}(h(x) = h(y)) ≤ 1/m   (15)

Main result
With universal hashing, the chance of collision between distinct keys k and l is no more than the 1/m chance of collision if the locations h(k) and h(l) were randomly and independently chosen from the set {0, 1, ..., m − 1}.


Hashing methods: Universal hashing

Theorem 11.3
Suppose that a hash function h is chosen randomly from a universal collection of hash functions and has been used to hash n keys into a table T of size m, using chaining to resolve collisions. If key k is not in the table, then the expected length E[n_{h(k)}] of the list that key k hashes to is at most the load factor α = n/m. If key k is in the table, then the expected length E[n_{h(k)}] of the list containing key k is at most 1 + α.

Corollary 11.4
Using universal hashing and collision resolution by chaining in an initially empty table with m slots, it takes expected time Θ(n) to handle any sequence of n INSERT, SEARCH, and DELETE operations containing O(m) INSERT operations.


Example of Universal Hashing
Proceed as follows:

Choose a prime number p large enough so that every possible key k is in the range [0, ..., p − 1], and let

Z_p = {0, 1, ..., p − 1} and Z_p* = {1, ..., p − 1}

Define the following hash function:

h_{a,b}(k) = ((ak + b) mod p) mod m,  ∀a ∈ Z_p* and b ∈ Z_p

The family of all such hash functions is:

H_{p,m} = {h_{a,b} : a ∈ Z_p* and b ∈ Z_p}

Important
a and b are chosen randomly at the beginning of execution.
The class H_{p,m} of hash functions is universal.


Example: Universal hash functions

Example
p = 977, m = 50, a and b random numbers:
  - h_{a,b}(k) = ((ak + b) mod p) mod m

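A sketch of this family in C with the slide's parameters p = 977 and m = 50; rand() stands in for a proper randomness source, and keys are assumed to lie in [0, p − 1]:

#include <stdlib.h>

enum { P = 977, M = 50 };               /* prime p and table size m */
static unsigned long a, b;              /* drawn once per execution */

void universal_hash_init(void)
{
    a = 1 + (unsigned long)rand() % (P - 1);   /* a in Z_p* = {1, ..., p-1} */
    b = (unsigned long)rand() % P;             /* b in Z_p  = {0, ..., p-1} */
}

unsigned universal_hash(unsigned long k)
{
    return (unsigned)(((a * k + b) % P) % M);  /* h_{a,b}(k) */
}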

Example of key distribution

[Figure: example distribution of the keys, with mean = 488.5 and dispersion = 5.]


Examples with 10, 50, 100, and 200 keys

[Figures: bucket occupancy of Universal Hashing vs. the Division Method for 10, 50, 100, and 200 keys.]

Another Example: The Matrix Method

Setup
Let us say keys are u bits long.
Say the table size M is a power of 2, so an index is b bits long with M = 2ᵇ.

The h function
Pick h to be a random b-by-u 0/1 matrix, and define h(x) = hx, where after the inner products we apply mod 2.

Example
With b = 3 and u = 4,

h = [1 0 0 0; 0 1 1 1; 1 1 1 0],  x = (1, 0, 1, 0)ᵀ  ⇒  h(x) = (1, 1, 0)ᵀ


Proof of being a Universal Family

First, assume that you have two different keys l ≠ m
Without loss of generality, assume the following:
1 lᵢ ≠ mᵢ, with lᵢ = 0 and mᵢ = 1
2 lⱼ = mⱼ ∀j ≠ i

Thus
Column i of h does not contribute to the final answer h(l), because of the zero!!!

Now
Imagine that we fix all the other columns of h; then h(l) is completely determined.


Now, for the ith column
There are 2ᵇ possible columns when varying the ones and zeros.

Thus, given the randomness of the zeros and ones
h(l) = h(m) exactly when column i equals the zero column

(0, 0, ..., 0)ᵀ   (16)

which happens with probability

1/2ᵇ   (17)


Then

We get the probability

P(h(l) = h(m)) ≤ 1/2ᵇ   (18)


Implementation of the row · vector product mod 2

Code

int product(int row, int vector)
{
    /* Keep only the positions where both bit vectors have a 1. */
    int i = row & vector;

    /* Parallel popcount: i ends up holding the number of 1 bits. */
    i = i - ((i >> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    i = (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;

    /* The inner product mod 2 is the parity of that count. */
    return i & 0x00000001;
}

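Building on product(), the full matrix hash h(x) = hx mod 2 can be assembled one row at a time; a sketch, assuming rows[] holds the b random u-bit rows of h:

unsigned matrix_hash(const int rows[], int b, int x)
{
    unsigned hash = 0;
    for (int r = 0; r < b; r++)
        hash = (hash << 1) | (unsigned)product(rows[r], x);  /* one output bit per row */
    return hash;   /* a b-bit index into a table of size M = 2^b */
}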

Advantages of universal hashing

Advantages
Universal hashing provides good results on average, independently of the keys to be stored.
It guarantees that no input will always elicit the worst-case behavior.
Poor performance occurs only when the random choice returns an inefficient hash function; this has a small probability.


Open addressing

Definition
All the elements occupy the hash table itself.

What is it?
We systematically examine table slots until either we find the desired element or we have ascertained that the element is not in the table.

Advantages
The advantage of open addressing is that it avoids pointers altogether.


Insert in Open addressing

Extended hash function to probe
Instead of the probe order being fixed as 0, 1, 2, ..., m − 1 (which gives Θ(n) search time), extend the hash function to

h : U × {0, 1, ..., m − 1} → {0, 1, ..., m − 1}

This gives the probe sequence ⟨h(k, 0), h(k, 1), ..., h(k, m − 1)⟩
  - A permutation of ⟨0, 1, 2, ..., m − 1⟩


Hashing methods in Open Addressing

HASH-INSERT(T, k)
    i = 0
    repeat
        j = h(k, i)
        if T[j] == NIL
            T[j] = k
            return j
        else i = i + 1
    until i == m
    error "hash table overflow"


Hashing methods in Open Addressing

HASH-SEARCH(T, k)
    i = 0
    repeat
        j = h(k, i)
        if T[j] == k
            return j
        i = i + 1
    until T[j] == NIL or i == m
    return NIL

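The two procedures above translate almost directly to C. A sketch, with NIL represented as −1 and linear probing standing in for h(k, i) until the probing schemes of the following sections are introduced:

#include <string.h>

enum { M_SLOTS = 13, NIL = -1 };
static int T[M_SLOTS];

/* Probe sequence h(k, i); linear probing as a placeholder. */
static int h(int k, int i) { return (k % M_SLOTS + i) % M_SLOTS; }

void hash_init(void) { memset(T, 0xFF, sizeof T); }   /* every int slot becomes -1 = NIL */

int hash_insert(int k)                 /* returns the slot used, or -1 on overflow */
{
    for (int i = 0; i < M_SLOTS; i++) {
        int j = h(k, i);
        if (T[j] == NIL) { T[j] = k; return j; }
    }
    return -1;                         /* hash table overflow */
}

int hash_search(int k)                 /* returns the slot of k, or -1 if absent */
{
    for (int i = 0; i < M_SLOTS; i++) {
        int j = h(k, i);
        if (T[j] == k) return j;
        if (T[j] == NIL) break;        /* empty slot: k cannot be in T */
    }
    return -1;
}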


Linear probing: Definition and properties

Hash function
Given an ordinary hash function h′ : U → {0, 1, ..., m − 1}, for i = 0, 1, ..., m − 1 we get the extended hash function

h(k, i) = (h′(k) + i) mod m   (19)

Sequence of probes
Given key k, we first probe T[h′(k)], then T[h′(k) + 1], and so on until T[m − 1]. Then we wrap around through T[0] up to T[h′(k) − 1].

Distinct probes
Because the initial probe determines the entire probe sequence, there are only m distinct probe sequences.


Linear probing: Definition and properties

Disadvantages
Linear probing suffers from primary clustering:
Long runs of occupied slots build up, increasing the average search time.
  - Clusters arise because an empty slot preceded by i full slots gets filled next with probability (i + 1)/m.
Long runs of occupied slots tend to get longer, and the average search time increases.

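The extended hash function (19) is a one-liner in C; a sketch, with the division method standing in for the ordinary hash function h′:

enum { M = 701 };                             /* table size, a prime */

static int h_prime(int k) { return k % M; }   /* ordinary hash h'(k) */

int h_linear(int k, int i)                    /* i-th probe for key k */
{
    return (h_prime(k) + i) % M;
}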

Examples

[Figures: linear-probing table occupancy for uniformly distributed keys and for Gaussian keys, both generated using the division method.]


Quadratic probing: Definition and properties

Hash function
Given an auxiliary hash function h′ : U → {0, 1, ..., m − 1}, for i = 0, 1, ..., m − 1 we get the extended hash function

h(k, i) = (h′(k) + c₁i + c₂i²) mod m   (20)

where c₁, c₂ are auxiliary constants.

Sequence of probes
Given key k, we first probe T[h′(k)]; later positions probed are offset by amounts that depend in a quadratic manner on the probe number i.
The initial probe determines the entire sequence, and so only m distinct probe sequences are used.


Quadratic probing: Definition and properties

Advantages
This method works much better than linear probing, but to make full use of the hash table, the values of c₁, c₂, and m are constrained.

Disadvantages
If two keys have the same initial probe position, then their probe sequences are the same, since h(k₁, 0) = h(k₂, 0) implies h(k₁, i) = h(k₂, i). This property leads to a milder form of clustering, called secondary clustering.

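A sketch of quadratic probing in C. The constants are one common way to satisfy the constraint just mentioned (an assumption of this sketch, not from the slides): with c₁ = c₂ = 1/2 the offsets are the triangular numbers i(i + 1)/2, which visit every slot when m is a power of 2:

enum { M = 128 };                             /* table size m = 2^p */

static int h_prime(int k) { return k % M; }   /* auxiliary hash h'(k) */

int h_quadratic(int k, int i)                 /* i-th probe for key k */
{
    return (h_prime(k) + i * (i + 1) / 2) % M;   /* c1 = c2 = 1/2 */
}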


Double hashing: Definition and properties

Hash function
Double hashing uses a hash function of the form

h(k, i) = (h₁(k) + i·h₂(k)) mod m   (21)

where i = 0, 1, ..., m − 1 and h₁, h₂ are auxiliary hash functions (normally from a universal family).

Sequence of probes
Given key k, we first probe T[h₁(k)]; successive probe positions are offset from previous positions by the amount h₂(k) mod m.
Thus, unlike the case of linear or quadratic probing, the probe sequence here depends in two ways upon the key k, since the initial probe position, the offset, or both may vary.


Double hashing: Definition and properties

Advantages
When m is prime or a power of 2, double hashing improves over linear or quadratic probing in that Θ(m²) probe sequences are used, rather than Θ(m), since each possible (h₁(k), h₂(k)) pair yields a distinct probe sequence.
The performance of double hashing appears to be very close to the performance of the "ideal" scheme of uniform hashing.

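A sketch of double hashing in C, using Cormen's suggestion h₂(k) = 1 + (k mod m′) with m′ = m − 1: since m is prime, h₂(k) is nonzero and relatively prime to m, so the probe sequence reaches every slot:

enum { M = 701 };                                 /* table size, a prime */

static int h1(int k) { return k % M; }            /* initial probe position */
static int h2(int k) { return 1 + k % (M - 1); }  /* offset, never zero */

int h_double(int k, int i)                        /* i-th probe for key k */
{
    return (h1(k) + i * h2(k)) % M;
}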


Analysis of Open Addressing

Theorem 11.6
Given an open-address hash table with load factor α = n/m < 1, the expected number of probes in an unsuccessful search is at most 1/(1 − α), assuming uniform hashing.

Corollary
Inserting an element into an open-address hash table with load factor α requires at most 1/(1 − α) probes on average, assuming uniform hashing.

Theorem 11.8
Given an open-address hash table with load factor α < 1, the expected number of probes in a successful search is at most (1/α) ln(1/(1 − α)), assuming uniform hashing and assuming that each key in the table is equally likely to be searched for.


Exercises

From Cormen's book, Chapter 11: 11.1-2, 11.2-1, 11.2-2, 11.2-3, 11.3-1, 11.3-3.
