COSC 1030 Lecture 10: Hash Table

Topics

– Table
– Hash Concept
– Hash Function
– Resolve collision
– Complexity Analysis

Table

Table
– A collection of entries
– Entry: <key, info>
– Insert, search and delete
– Update and retrieve

Array representation
– Indexed
– Maps key to index
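
As a minimal sketch of the array representation, assume integer keys small enough to be used directly as array indices. The names `Entry` and `DirectTable` are illustrative and do not come from the lecture.

```java
// Illustrative sketch: a table of <key, info> entries stored in an array,
// where the key itself is used as the array index.
class Entry {
    final int key;
    String info;
    Entry(int key, String info) { this.key = key; this.info = info; }
}

class DirectTable {
    private final Entry[] slots;

    DirectTable(int capacity) { slots = new Entry[capacity]; }

    // Insert or update: the key maps straight to its index.
    void insert(Entry e) { slots[e.key] = e; }

    // Search/retrieve: one array access, null if absent.
    Entry search(int key) { return slots[key]; }

    // Delete: returns true if an entry was present.
    boolean remove(int key) {
        boolean present = slots[key] != null;
        slots[key] = null;
        return present;
    }
}
```

This only works while the whole key range fits in the array, which is the restriction a hash table lifts.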

Hash Table

Hash Table
– A table
– Key range >> table size
– Many-to-one mapping (hashing)
– Indexed, with the hash code as index

Tabbed Address Book
– Map names to A:Z
– Multiple names start with the same letter
  Same tab, sequential slots
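
The address-book analogy can be sketched as follows. This is only an illustration: the class `TabbedAddressBook` and its methods are made-up names, and names are assumed to start with a letter A to Z.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the tabbed address book: 26 tabs, one per letter,
// with names under the same tab kept in sequential slots.
class TabbedAddressBook {
    private final List<List<String>> tabs = new ArrayList<>();

    TabbedAddressBook() {
        for (int i = 0; i < 26; i++) tabs.add(new ArrayList<>());
    }

    // Many-to-one mapping: every name starting with the same letter gets the same tab.
    // Assumes the name starts with a letter A-Z or a-z.
    static int tabOf(String name) {
        return Character.toUpperCase(name.charAt(0)) - 'A';
    }

    void add(String name) { tabs.get(tabOf(name)).add(name); }

    boolean contains(String name) { return tabs.get(tabOf(name)).contains(name); }

    List<String> tab(char letter) { return tabs.get(Character.toUpperCase(letter) - 'A'); }
}
```

Here the first letter plays the role of the hash code, and the list under each tab holds every name that maps to it.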

Hash Table ADT

    interface Hashtable {
        void insert(Item anItem);     // add an entry to the table
        Item search(Key aKey);        // the entry with this key, or null if absent
        boolean remove(Key aKey);     // true if an entry was removed
        boolean isFull();
        boolean isEmpty();
    }

Hash Function

Maps key to index evenly
For any n in N,
    hash(n) = n mod M
where M is the size of the hash table.
    hash(k*M + n) = n, where 0 <= n < M and k is a non-negative integer
Map to an integer first if the key is not an integer
– A:Z → 0:25
– String s of length n → h(s[0]) + h(s[1])*26 + … + h(s[n-1])*26^(n-1)
– String s of length n → h(s[0])*26^(n-1) + … + h(s[n-1])

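Hash Function

String s → h(s[0])*26^(n-1) + … + h(s[n-1])

    // Each step multiplies the running value by 26 and adds the next letter.
    int toInt(String s) {
        assert s != null;
        int c = 0;
        for (int i = 0; i < s.length(); i++) {
            c = c * 26 + toInt(s.charAt(i));   // toInt(char) maps A:Z to 0:25
        }
        return c;
    }

    // hash(int n) is n mod M, as defined earlier.
    int hash(String s) { return hash(toInt(s)); }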
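
A self-contained version of the same idea might look like the sketch below. The class name `StringHash`, the `toInt(char)` helper and the explicit table-size parameter `m` are assumptions added for illustration. `Math.floorMod` is used instead of `%` because `toInt` can overflow to a negative `int` on long strings.

```java
// Illustrative sketch; StringHash and its method names are assumptions, not lecture code.
class StringHash {
    // Map a letter A:Z (or a:z) to 0:25. Assumes the input is a letter.
    static int toInt(char ch) {
        return Character.toUpperCase(ch) - 'A';
    }

    // h(s[0])*26^(n-1) + ... + h(s[n-1]), evaluated left to right.
    static int toInt(String s) {
        int c = 0;
        for (int i = 0; i < s.length(); i++) {
            c = c * 26 + toInt(s.charAt(i));
        }
        return c;
    }

    // hash(n) = n mod M. floorMod keeps the result in 0..M-1 even if
    // toInt overflowed to a negative int on a long string.
    static int hash(String s, int m) {
        return Math.floorMod(toInt(s), m);
    }

    public static void main(String[] args) {
        System.out.println(toInt("BA"));      // 1*26 + 0 = 26
        System.out.println(hash("CAB", 7));   // (2*676 + 0*26 + 1) mod 7 = 1353 mod 7 = 2
    }
}
```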

Example

Table[7], HASHTABLE_SIZE = 7
Insert ‘B2’, ‘H7’, ‘M12’, ‘D4’, ‘Z26’ into the table
– Hash codes (numeric key mod 7): 2, 0, 5, 4, 5

Collision
– The slot indexed by the hash code is already occupied (here ‘Z26’ hashes to 5, which ‘M12’ already holds)

A simple solution
– Sequentially decrease the index until an empty slot is found or the table is full
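
The example can be replayed with a short sketch. It assumes the numeric part of each label (2, 7, 12, 4, 26) is the key, which is what reproduces the hash codes listed on the slide. The class name `ProbeExample` is illustrative.

```java
// Illustrative sketch replaying the slide's example; the class name is an assumption.
class ProbeExample {
    static final int HASHTABLE_SIZE = 7;

    // Insert each label at (key mod size), stepping the index down by one
    // (wrapping past 0) while the slot is occupied. Returns the filled table.
    static String[] fill(String[] labels, int[] keys) {
        String[] table = new String[HASHTABLE_SIZE];
        for (int i = 0; i < keys.length; i++) {
            int index = keys[i] % HASHTABLE_SIZE;
            int tries = 0;
            while (table[index] != null && tries < HASHTABLE_SIZE) {
                index--;
                if (index < 0) index += HASHTABLE_SIZE;
                tries++;
            }
            if (table[index] == null) table[index] = labels[i];  // skipped if the table is full
        }
        return table;
    }

    public static void main(String[] args) {
        String[] table = fill(new String[] {"B2", "H7", "M12", "D4", "Z26"},
                              new int[] {2, 7, 12, 4, 26});
        System.out.println(java.util.Arrays.toString(table));
        // prints [H7, null, B2, Z26, D4, M12, null]
    }
}
```

‘Z26’ finds slot 5 taken by ‘M12’ and slot 4 taken by ‘D4’, so it settles in slot 3.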

Collision Possibility

How often may a collision occur?
Insert 100 random numbers into a table of 200 slots
    P(at least one collision) = 1 - Π ((200 - i)/200), i = 0:99
                              = 1 - 6.66E-14 > 0.99999999999993

Load factor → probability of at least one collision
– 100/200 = 0.5 = 50% → 0.99999999999993
– 20/200 = 0.1 = 10% → 0.63
– 10/200 = 0.05 = 5% → 0.2

Default load factor is 75% in Java's Hashtable
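
These figures can be checked numerically. The sketch below assumes each key lands in a uniformly random slot, independently of the others. `CollisionOdds` and its method names are illustrative.

```java
// Illustrative sketch; assumes keys hash to uniformly random, independent slots.
class CollisionOdds {
    // Probability that n random keys all land in distinct slots:
    // the product of (slots - i)/slots for i = 0..n-1.
    static double noCollision(int n, int slots) {
        double p = 1.0;
        for (int i = 0; i < n; i++) {
            p *= (double) (slots - i) / slots;
        }
        return p;
    }

    // Probability of at least one collision.
    static double collision(int n, int slots) {
        return 1.0 - noCollision(n, slots);
    }

    public static void main(String[] args) {
        System.out.println(noCollision(100, 200)); // close to the slide's 6.66E-14
        System.out.println(collision(20, 200));    // close to 0.63
        System.out.println(collision(10, 200));    // close to 0.2
    }
}
```

Even at a 10% load factor a collision is more likely than not, so a collision-resolution strategy is needed at any practical load.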

Primary Cluster

The biggest solid block in the hash table
Neighbouring clusters join into bigger ones
The bigger the primary cluster is, the easier it is for it to grow
Distribute keys evenly to avoid a primary cluster

Probe Method

What can we do when a collision occurs?
– A consistent way of searching for an empty slot
– Probe

Linear probe: decrease the index by 1, wrap around when it passes 0
Double hash: use the quotient to calculate the decrement
– Max(1, (Key / M) % M)
Separate chaining: linked list to store colliding items
Hash tree: link to another hash table (A4)
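
Linear probing, double hashing and the hash tree are coded later in the lecture. Separate chaining is not, so here is a minimal sketch of it, assuming integer keys. The class `ChainedTable` and its methods are illustrative names.

```java
import java.util.LinkedList;

// Illustrative sketch of separate chaining with int keys; names are assumptions.
class ChainedTable {
    private final LinkedList<Integer>[] chains;

    @SuppressWarnings("unchecked")
    ChainedTable(int m) {
        chains = new LinkedList[m];
        for (int i = 0; i < m; i++) chains[i] = new LinkedList<>();
    }

    private int hash(int key) { return Math.floorMod(key, chains.length); }

    // Colliding keys simply share the list at their slot.
    void insert(int key) {
        if (!chains[hash(key)].contains(key)) chains[hash(key)].add(key);
    }

    boolean search(int key) { return chains[hash(key)].contains(key); }

    boolean remove(int key) {
        // Integer.valueOf selects remove(Object), not remove(int index).
        return chains[hash(key)].remove(Integer.valueOf(key));
    }

    int chainLength(int slot) { return chains[slot].size(); }
}
```

With a table of size 7, keys 12 and 26 both hash to slot 5 and end up on the same list, so no probing is needed.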

Probe sequence coverage

Ensure the probe sequence covers the whole table
– Utilizes the whole table
– Even distribution
– M and the probe decrement are relatively prime
  No common factor except 1
– Make M a prime number
  M and any decrement (< M) are then relatively prime
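
The effect of a common factor can be seen by counting how many slots a probe sequence reaches. This is a small illustrative sketch: `ProbeCoverage` and `slotsVisited` are made-up names, and the decrement is assumed to satisfy 1 <= dec < m.

```java
// Illustrative sketch; assumes 1 <= dec < m so a single wrap-around suffices.
class ProbeCoverage {
    // Count the distinct slots visited when starting at `start` and repeatedly
    // stepping down by `dec` (with wrap-around) in a table of size m.
    static int slotsVisited(int m, int start, int dec) {
        boolean[] seen = new boolean[m];
        int count = 0;
        int index = start;
        while (!seen[index]) {
            seen[index] = true;
            count++;
            index -= dec;
            if (index < 0) index += m;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(slotsVisited(7, 5, 3)); // 7: prime size, every slot is reached
        System.out.println(slotsVisited(8, 5, 2)); // 4: 8 and 2 share the factor 2
    }
}
```

With M = 8 and a decrement of 2, half of the table is never probed, so an insert can fail while empty slots remain.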

Probe Method

    void insert(Item item) {
        if (!isFull()) {
            int index = probe(item.key);   // first empty slot on the key's probe sequence
            assert index >= 0 && index < M;
            table[index] = item;
            count++;
        }
    }

Linear Probe Method

    int probe(int key) {
        int hashcode = key % HASHTABLE_SIZE;
        if (table[hashcode] == null) {
            return hashcode;
        } else {
            int index = hashcode;
            do {
                index--;                                  // step down one slot
                if (index < 0) index += HASHTABLE_SIZE;   // wrap around past slot 0
            } while (index != hashcode && table[index] != null);
            if (index == hashcode) return -1;             // back at the start: no empty slot
            else return index;
        }
    }

Double Hash Probe Method

    int probe(int key) {
        int hashcode = key % HASHTABLE_SIZE;
        if (table[hashcode] == null) {
            return hashcode;
        } else {
            int index = hashcode;
            int dec = (key / HASHTABLE_SIZE) % HASHTABLE_SIZE;   // decrement from the quotient
            dec = Math.max(1, dec);                              // never step by 0
            do {
                index -= dec;
                if (index < 0) index += HASHTABLE_SIZE;          // wrap around past slot 0
            } while (index != hashcode && table[index] != null);
            if (index == hashcode) return -1;                    // back at the start: no empty slot
            else return index;
        }
    }

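Search Method

    Item search(int key) {
        int hashcode = key % HASHTABLE_SIZE;
        int dec = Math.max(1, (key / HASHTABLE_SIZE) % HASHTABLE_SIZE);
        int index = hashcode;
        while (table[index] != null) {
            if (table[index].key == key) return table[index];   // found
            index -= dec;
            if (index < 0) index += HASHTABLE_SIZE;              // wrap around, as in probe()
            if (index == hashcode) break;                        // full cycle: key is absent
        }
        return null;   // reached an empty slot or cycled through the table
    }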

Delete Method

Difficulty with delete under open addressing
– Emptying a slot destroys the hash probe chain of items stored past it

Solution
– Set a deleted flag
– Search takes it as occupied
– Insert takes it as deleted (free to reuse)
– Flagged slots still form part of the primary cluster

Separate chaining
– Move the next item up from the chained structure
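
The deleted-flag approach could be sketched like this for linear probing. The class `FlaggedTable`, its fields and the three state constants are assumptions for illustration, and keys are assumed to be non-negative ints.

```java
// Illustrative sketch of deletion with a "deleted" flag under linear probing.
// Class, field and method names are assumptions; keys are non-negative ints.
class FlaggedTable {
    private static final int EMPTY = 0, OCCUPIED = 1, DELETED = 2;
    private final int[] keys;
    private final int[] state;

    FlaggedTable(int m) { keys = new int[m]; state = new int[m]; }

    // Index of the key, or -1. Deleted slots count as occupied, so the probe
    // continues past them and the chain stays intact.
    private int find(int key) {
        int m = keys.length, index = key % m;
        for (int tries = 0; tries < m && state[index] != EMPTY; tries++) {
            if (state[index] == OCCUPIED && keys[index] == key) return index;
            index = (index == 0) ? m - 1 : index - 1;
        }
        return -1;
    }

    boolean search(int key) { return find(key) >= 0; }

    // Insert reuses the first slot that is empty or flagged deleted.
    boolean insert(int key) {
        if (find(key) >= 0) return true;            // already present
        int m = keys.length, index = key % m;
        for (int tries = 0; tries < m; tries++) {
            if (state[index] != OCCUPIED) {
                keys[index] = key;
                state[index] = OCCUPIED;
                return true;
            }
            index = (index == 0) ? m - 1 : index - 1;
        }
        return false;                               // table is full
    }

    // Flag the slot instead of emptying it.
    boolean remove(int key) {
        int index = find(key);
        if (index < 0) return false;
        state[index] = DELETED;
        return true;
    }
}
```

With a table of size 7, inserting 12 and 26 puts them in slots 5 and 4. After removing 12, a search for 26 still succeeds because the flagged slot 5 does not stop the probe.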

Efficiency

Successful search
– Best case: first hit, one comparison
– Average
  Half of the average length of a probe sequence
  Load factor dependent
  O(1) if load factor < 0.5
– Worst case: longest probe sequence
  Load factor dependent

Unsuccessful search
– Average: average length of a probe sequence
– Worst case: longest probe sequence

Advanced Topics

Choosing Hash Functions
– Generate hash code randomly and uniformly
– Use all bits of the key
– Assume K = b0 b1 b2 b3
– Division
  h(k) = k % M; p(k) = max(1, (k / M) % M)
– Folding
  h(k) = (b1 ^ b3) % M; p(k) = (b0 ^ b2) % M; // ^ is XOR
– Middle squaring
  h(k) = (b1b2)^2 // here ^2 means squared
– Truncating
  h(k) = b3;
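
The four choices can be sketched in Java as below. The slide does not say what b0 to b3 are. This sketch assumes they are the four bytes of a 32-bit key with b0 the most significant, and it adds a final `% m` to middle squaring and truncating so that every result fits a table of size `m`. Both choices are assumptions, as is the class name `HashChoices`.

```java
// Illustrative sketch; treats a 32-bit key as four bytes b0 b1 b2 b3 (b0 most
// significant). That byte layout and the final "% m" on middle squaring and
// truncating are assumptions added so every result fits a table of size m.
class HashChoices {
    // Byte i of the key, as a value in 0..255.
    static int b(int key, int i) { return (key >>> (8 * (3 - i))) & 0xFF; }

    static int division(int key, int m) { return Math.floorMod(key, m); }

    // XOR binds looser than % in Java, so the parentheses matter.
    static int folding(int key, int m) { return (b(key, 1) ^ b(key, 3)) % m; }

    static int middleSquare(int key, int m) {
        long mid = (b(key, 1) << 8) | b(key, 2);   // the 16-bit number b1b2
        return (int) ((mid * mid) % m);            // long avoids overflow when squaring
    }

    static int truncating(int key, int m) { return b(key, 3) % m; }
}
```

For the key 0x12345678 and m = 13, the bytes are 0x12, 0x34, 0x56, 0x78, and the four functions give 2, 11, 12 and 3 respectively.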

Advanced Topics

Hash Tree
– Separate chained collision resolution
– Recursively hashing the key

[Figure: a tree of hash tables, each linking to further hash tables]

Hash Tree

    void insert(int key, Item item) {
        int h = h(key);
        int k = g(key);   // one-to-one mapping Key → Key
        if (table[h] == null) {
            table[h] = item;
        } else {
            // Slot taken: hand the item to the hash table linked from this slot.
            if (table[h].link == null)
                table[h].link = new HashTree();
            table[h].link.insert(k, item);
        }
    }