Evolving Universal Hash Functions Using Genetic Algorithms Ramprasad Joshi, Mustafa Safdari BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI GOA CAMPUS 2009 INTERNATIONAL CONFERENCE ON FUTURE COMPUTER AND COMMUNICATION

Evolving Universal Hash Function using Genetic Algorithms


DESCRIPTION

Slides presented at the 2009 International Conference on Future Computer and Communication, Kuala Lumpur, Malaysia. They cover the early work done in the project "Evolving Universal Hash Functions using Genetic Algorithms". A revised version of this project was presented at GECCO 2009.


Page 1: Evolving Universal Hash Function using Genetic Algorithms

Evolving Universal Hash Functions Using Genetic Algorithms

Ramprasad Joshi, Mustafa Safdari

BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI

GOA CAMPUS

2009 INTERNATIONAL CONFERENCE ON FUTURE COMPUTER AND COMMUNICATION

Page 2: Evolving Universal Hash Function using Genetic Algorithms

Outline
Introduction
Implementation of Genetic Algorithms
Simulation and Results
Conclusion and Future Work


Page 3: Evolving Universal Hash Function using Genetic Algorithms

Introduction

Universal Hash Functions
Selecting h randomly

Page 4: Evolving Universal Hash Function using Genetic Algorithms

Universal Hash Functions

Mapping integers in the range [0, M-1] to [0, N-1].

A set H of hash functions is universal if, for any two distinct keys j and k and a hash function h chosen at random from H,

$$\Pr[h(j) = h(k)] \le \frac{1}{N}$$

The expected number of collisions for any key is then at most n/N, where n is the number of keys.


Page 5: Evolving Universal Hash Function using Genetic Algorithms

Selecting h randomly

One such type of hash function:

$$h_{a,b}(k) = ((ak + b) \bmod p) \bmod N$$

where p is a prime number with $M \le p < 2M$, and a, b are random integers with $0 < a < p$, $0 \le b < p$.

How do we select a, b, p? Minimize collisions as much as possible.
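As a minimal sketch of the hash family on this slide, the following Python code builds h for given a, b, p, N and counts collisions over a key set. The helper names `make_hash` and `count_collisions` are hypothetical, and the collision metric (a key collides when its bucket is already occupied) is an assumption, not something the slides define.

```python
from typing import Callable, Iterable, Tuple


def make_hash(a: int, b: int, p: int, N: int) -> Callable[[int], int]:
    """Return h(k) = ((a*k + b) mod p) mod N for the given parameters."""
    if not (0 < a < p and 0 <= b < p):
        raise ValueError("expected 0 < a < p and 0 <= b < p")
    return lambda k: ((a * k + b) % p) % N


def count_collisions(h: Callable[[int], int], keys: Iterable[int]) -> Tuple[int, int]:
    """Return (n_collisions, n_filled).

    Assumption: a key counts as a collision when its bucket is already
    occupied; n_filled is the number of distinct occupied buckets.
    """
    filled = set()
    collisions = 0
    for k in keys:
        bucket = h(k)
        if bucket in filled:
            collisions += 1
        else:
            filled.add(bucket)
    return collisions, len(filled)


# Parameters a=3, b=2, p=11, N=10 with an assumed key set 0..9.
h = make_hash(3, 2, 11, 10)
print(count_collisions(h, range(10)))
```

With this assumed key set, every key lands in its own bucket, so the pair (a, b) = (3, 2) is an example of a choice that "minimizes collisions" in the sense of the slide.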


Page 6: Evolving Universal Hash Function using Genetic Algorithms

Implementation of GA

Chromosome, Fitness Function, Crossover, Mutation

p_values, p_Array

Page 7: Evolving Universal Hash Function using Genetic Algorithms

Elements of the GA

Chromosome: a (32 bits) followed by b (32 bits), 64 bits in total

a: 010010000111010111 (32 bits)
b: 010010000111010111 (32 bits)

Fitness function:

$$\text{Fitness} = \frac{n_{filled}}{n_{collisions} + 1}$$

Crossover types: single point, 2 point, midway and random

Mutation: single point, multi point

Selection: Roulette Wheel Selection
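The GA elements listed on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the fitness is taken to be n_filled / (n_collisions + 1), a is placed in the high 32 bits of the chromosome, mutation flips randomly chosen bits, and all function names are hypothetical.

```python
import random

BITS = 32                      # bits per gene (a and b), as on the slide
MASK = (1 << BITS) - 1
FULL = (1 << (2 * BITS)) - 1   # mask for the whole 64-bit chromosome


def encode(a: int, b: int) -> int:
    """Pack a and b into one 64-bit chromosome (assumed layout: a high, b low)."""
    return ((a & MASK) << BITS) | (b & MASK)


def decode(chrom: int):
    """Unpack a 64-bit chromosome into (a, b)."""
    return (chrom >> BITS) & MASK, chrom & MASK


def fitness(n_filled: int, n_collisions: int) -> float:
    """Assumed fitness: more filled buckets and fewer collisions score higher."""
    return n_filled / (n_collisions + 1)


def single_point_crossover(x: int, y: int, rng: random.Random):
    """Swap the low-order bits of two chromosomes below a random cut point."""
    point = rng.randrange(1, 2 * BITS)   # cut position in 1..63
    low = (1 << point) - 1
    high = FULL ^ low
    return (x & high) | (y & low), (y & high) | (x & low)


def mutate(chrom: int, rng: random.Random, points: int = 2) -> int:
    """Flip `points` distinct, randomly chosen bits (multi-point mutation)."""
    for pos in rng.sample(range(2 * BITS), points):
        chrom ^= 1 << pos
    return chrom


def roulette_select(population, fitnesses, rng: random.Random):
    """Pick one chromosome with probability proportional to its fitness."""
    return rng.choices(population, weights=fitnesses, k=1)[0]
```

The +1 in the denominator keeps the assumed fitness finite when a chromosome produces no collisions at all.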


Page 8: Evolving Universal Hash Function using Genetic Algorithms

p_values, p_Array

p is any prime number such that M ≤ p < 2M.

An array called p_values keeps track of the allowable values of p so that they can be used in the above steps. p_values can be constructed and populated using any sieve algorithm (from primality testing) that finds the prime numbers within a range. Our implementation uses the Sieve of Eratosthenes.
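A minimal sketch of how p_values could be populated with the Sieve of Eratosthenes is shown below. The function name `build_p_values` is hypothetical; the range M ≤ p < 2M is the one stated on the slide.

```python
def build_p_values(M: int):
    """Return all primes p with M <= p < 2M, in ascending order."""
    limit = 2 * M
    if limit < 3:
        return []
    # is_prime[i] == 1 means i is still considered prime.
    is_prime = bytearray([1]) * limit
    is_prime[0:2] = b"\x00\x00"
    i = 2
    while i * i < limit:
        if is_prime[i]:
            # Cross out every multiple of i, starting at i*i.
            is_prime[i * i::i] = bytearray(len(range(i * i, limit, i)))
        i += 1
    return [p for p in range(M, limit) if is_prime[p]]


p_values = build_p_values(10)
print(p_values)
```

Sieving the whole range below 2M once and then slicing keeps the code short; a segmented sieve would use less memory for large M.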


Page 9: Evolving Universal Hash Function using Genetic Algorithms

p_Array

For every chromosome (a, b) there is an associated value of p such that

$$0 < a < p, \quad 0 \le b < p \qquad (1)$$

To store this information, we create a separate array called p_Array, which stores, for each chromosome, the index of its prime number in p_values. For example, if a chromosome in the population has a=9, b=7, p=4, the value of p assigned to this chromosome is the one found in p_values at index 4.

Index values of p do not undergo crossover or mutation; only a and b do. After each such operation, a suitable p is found for the resulting (a, b) pair if the one associated with the parent chromosome does not satisfy (1).
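The repair step after crossover or mutation might look like the sketch below. The slides do not say which "suitable p" is chosen or what happens when none exists, so two assumptions are made here: the smallest prime in p_values that satisfies condition (1), read as 0 < a < p and 0 ≤ b < p, is used, and `None` is returned when no prime fits. The name `repair_p_index` is hypothetical.

```python
from bisect import bisect_right


def repair_p_index(a: int, b: int, p_index: int, p_values):
    """Return an index into p_values whose prime satisfies 0 < a < p, 0 <= b < p.

    Keeps the parent's index when it is still valid. Otherwise picks the
    smallest suitable prime (an assumption), or None if no prime fits.
    p_values must be sorted in ascending order.
    """
    if a <= 0:
        return None  # condition (1) cannot hold for any p
    if max(a, b) < p_values[p_index]:
        return p_index
    # First prime strictly greater than both a and b.
    new_index = bisect_right(p_values, max(a, b))
    return new_index if new_index < len(p_values) else None


p_values = [11, 13, 17, 19]
print(repair_p_index(9, 7, 0, p_values))    # parent's p = 11 still satisfies (1)
print(repair_p_index(12, 7, 0, p_values))   # p = 11 fails, so move to a larger prime
```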


Page 10: Evolving Universal Hash Function using Genetic Algorithms

Simulations and Results

Simulation settings
Results

Page 11: Evolving Universal Hash Function using Genetic Algorithms

Simulation Settings

No. of generations = 30
Size of population = 50
pc = 0.8, pm = 0.01
Input set of keys: uniformly randomly generated in (0, 50000); different sets of size 10, 100, 1000, 10000
N: taking N as prime gives better results
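The settings above could be captured as in the following sketch, which also generates a key set and picks a prime N close to the number of keys. The names `GASettings`, `generate_keys` and `nearest_prime` are hypothetical, and the rule "closest prime, ties going to the smaller one" is only one possible reading of "taking N as prime"; the slides do not state how N was chosen.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GASettings:
    generations: int = 30
    population_size: int = 50
    pc: float = 0.8    # crossover probability
    pm: float = 0.01   # mutation probability


def generate_keys(n: int, low: int, high: int, rng: random.Random):
    """Draw n keys uniformly at random from the open interval (low, high)."""
    return [rng.randrange(low + 1, high) for _ in range(n)]


def is_prime(x: int) -> bool:
    """Trial-division primality test, adequate for small table sizes."""
    if x < 2:
        return False
    i = 2
    while i * i <= x:
        if x % i == 0:
            return False
        i += 1
    return True


def nearest_prime(n: int) -> int:
    """Return a prime closest to n (assumed rule: ties go to the smaller prime)."""
    d = 0
    while True:
        if is_prime(n - d):
            return n - d
        if is_prime(n + d):
            return n + d
        d += 1


settings = GASettings()
keys = generate_keys(100, 0, 50000, random.Random(1))
N = nearest_prime(len(keys))
print(settings, len(keys), N)
```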


Page 12: Evolving Universal Hash Function using Genetic Algorithms

Results

TABLE I. RESULTS OF RUNNING THE ALGORITHM FOR RANDOM INPUT DISTRIBUTIONS

| Sr. No. | Range of input | Crossover type * | Mutation type * | No. of keys n | No. of buckets N | No. of initial collisions | n_collisions | n_filled | p | a | b |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0-10 | 1 | 2 | 10 | 10 | 0 | 0 | 10 | 11 | 3 | 2 |
| 2 | 0-500 | 1 | 2 | 10 | 11 | 1 | 4 | 6 | 701 | 67 | 452 |
| 3 | 0-600 | 1 | 2 | 20 | 23 | 2 | 2 | 18 | 1013 | 626 | 635 |
| 4 | 0-100 | 1 | 1 | 100 | 100 | 0 | 0 | 100 | 179 | 109 | 114 |
| 5 | 0-50000 | 1 | 2 | 100 | 101 | 8 | 21 | 79 | 98869 | 54339 | 35059 |
| 6 | 0-1000 | 1 | 2 | 500 | 499 | 0 | 1 | 499 | 1823 | 747 | 581 |
| 7 | 0-50000 | 1 | 2 | 500 | 499 | 37 | 108 | 392 | 69313 | 46631 | 9950 |
| 8 | 5×10^4 - 6×10^4 | 1 | 2 | 10000 | 10000 | 0 | 0 | 10000 | 14153 | 9347 | 517 |
| 9 | 2×10^4 - 5×10^4 | 1 | 2 | 10000 | 10000 | 0 | 0 | 10000 | 57203 | 25869 | 37769 |
| 10 | 0-50000 | 1 | 2 | 10000 | 10000 | 911 | 2397 | 6692 | 79063 | 33068 | 31178 |

* Indices refer to the crossover and mutation types listed in the previous section.


Page 13: Evolving Universal Hash Function using Genetic Algorithms

Case 1

Multi-point mutation at 2 points gave much better results in fewer generations than single-point mutation or mutation at more than 2 points. Single-point random crossover was found to produce much better results.

In every case the algorithm converged within 7-8 generations, even in the worst case.

In some cases, where the range of the input distribution was very large and did not coincide with [0, N-1], the number of collisions was relatively high. However, this number dropped drastically when N was taken as a prime number in the nearby range.


Page 14: Evolving Universal Hash Function using Genetic Algorithms

Case 2 (Comparative Runs)

In the next type of simulation, the algorithm was tested against randomly selecting h.

The algorithm performed better than random selection, giving fewer collisions on all ten input files.

Table 2. Results of Comparative Run 1

| Input file | n_collisions by random selection | n_collisions by GA-generated function |
|---|---|---|
| 1 | 286 | 251 |
| 2 | 273 | 256 |
| 3 | 267 | 245 |
| 4 | 285 | 244 |
| 5 | 285 | 255 |
| 6 | 285 | 262 |
| 7 | 281 | 259 |
| 8 | 273 | 255 |
| 9 | 273 | 258 |
| 10 | 304 | 259 |

Setting for GA: P=100, N=1423, pc=0.75 (1), pm=0.01 (1)
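The random-selection baseline in this comparison can be sketched as follows. Only N=1423 comes from the slide. The key set (1000 distinct keys below 50000), the prime p=65537 and the collision metric (a key collides when its bucket is already occupied) are assumptions made for illustration, and the function names are hypothetical.

```python
import random


def count_collisions(a: int, b: int, p: int, N: int, keys) -> int:
    """Count keys that land in an already occupied bucket (assumed metric)."""
    occupied = set()
    collisions = 0
    for k in keys:
        bucket = ((a * k + b) % p) % N
        if bucket in occupied:
            collisions += 1
        else:
            occupied.add(bucket)
    return collisions


def random_selection_collisions(keys, p: int, N: int, rng: random.Random) -> int:
    """Baseline: draw a and b uniformly at random, then count collisions."""
    a = rng.randrange(1, p)   # 0 < a < p
    b = rng.randrange(0, p)   # 0 <= b < p
    return count_collisions(a, b, p, N, keys)


rng = random.Random(2009)
keys = rng.sample(range(50000), 1000)   # hypothetical input file
baseline = random_selection_collisions(keys, p=65537, N=1423, rng=rng)
print("collisions with a randomly selected h:", baseline)
```

A GA-evolved (a, b) pair would be scored with the same `count_collisions` call, so the two columns of the table are directly comparable.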


Page 15: Evolving Universal Hash Function using Genetic Algorithms

In the End…

Conclusion
Future Work
Acknowledgement

Page 16: Evolving Universal Hash Function using Genetic Algorithms

Conclusion

The proposed algorithm produces an efficient universal hash function for a given distribution of keys, resulting in relatively few collisions.

The problem of clustering is avoided by generating the hash function with a metaheuristic, in this case a genetic algorithm.

It performs better than random selection of h.

The algorithm is well suited to scenarios where the input distribution changes frequently and the hash function must be changed dynamically to rehash the input.


Page 17: Evolving Universal Hash Function using Genetic Algorithms

Future Work

The scope for future work on this algorithm includes:
selection of an efficient sieve algorithm
an efficient encoding of the chromosome
understanding the effect of the various types of crossover and mutation on the result
a better design of the fitness function, so that the few exceptional cases are also taken care of
testing the algorithm against some standard hash functions


Page 18: Evolving Universal Hash Function using Genetic Algorithms

Acknowledgment

My sincere thanks to Mr. Ramprasad Joshi, my mentor and guide for this project.

I also thank my colleague Miss Joanna Mary Oommen for assistance with the paper and presentation.


Page 19: Evolving Universal Hash Function using Genetic Algorithms

Thank You!

Any Questions?

BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI

GOA CAMPUS
