Peacock Hash: Deterministic and Updatable Hashing for High Performance Networking
Sailesh Kumar, Jonathan Turner, Patrick Crowley
2 - Sailesh Kumar - 04/20/23
Overview
Overview of Hash Tables and Segmented Hash Table
Analysis and Limitations
» Increased memory references
Adding Bloom Filters per segment
Selective Filter Insertion Algorithm
Simulation Results and Analysis
Conclusion
Hash Tables
Consider the problem of searching an array for a given value
» If the array is not sorted, the search requires O(n) time
» If the array is sorted, we can do a binary search – O(lg n) time
» Can we do it in O(1) time?
– Hash table
– Use a hash function to map elements to table cells
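The idea above can be sketched as follows. This is a minimal illustrative example, not the paper's implementation; the table size and the use of Python's built-in `hash` are arbitrary choices.

```python
# A minimal direct-mapped hash table: a key's cell is computed in O(1),
# independent of the number of stored elements.
TABLE_SIZE = 10

def bucket_index(key: str) -> int:
    # Map a key to a table cell using a hash function.
    return hash(key) % TABLE_SIZE

table = [None] * TABLE_SIZE
table[bucket_index("apple")] = "apple"

# Lookup is a single index computation, not a scan of the array.
assert table[bucket_index("apple")] == "apple"
```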
Hash Tables
Suppose our hash function gave us the following values:
» hash("apple") = 5
» hash("watermelon") = 3
» hash("grapes") = 8
» hash("cantaloupe") = 7
» hash("kiwi") = 0
» hash("strawberry") = 9
» hash("mango") = 6
» hash("banana") = 2
» hash("honeydew") = 6
This is called a collision
» Now what?

[Figure: a 10-cell table with kiwi in cell 0, banana in 2, watermelon in 3, apple in 5, mango in 6, cantaloupe in 7, grapes in 8, and strawberry in 9; "honeydew" also hashes to cell 6, which mango already occupies.]
Collision Resolution Policies
Linear Probing
» Successively search subsequent table entries for the first empty one
Linear Chaining
» Link all entries that collide at a bucket into a linked list
Double Hashing
» Uses a second hash function to successively index the table
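The first policy above can be sketched as follows. This is an illustrative linear-probing insert and search, with an arbitrary table size; it is not the paper's code, and for brevity it assumes the table never becomes completely full.

```python
# Linear probing: on a collision, scan forward (wrapping around) to the
# first empty cell. Search follows the same probe sequence and stops at
# the first empty cell.
TABLE_SIZE = 8
table = [None] * TABLE_SIZE

def lp_insert(key: str) -> int:
    i = hash(key) % TABLE_SIZE
    while table[i] is not None:      # probe subsequent entries
        i = (i + 1) % TABLE_SIZE
    table[i] = key
    return i

def lp_search(key: str):
    i = hash(key) % TABLE_SIZE
    while table[i] is not None:      # an empty cell ends the probe chain
        if table[i] == key:
            return i
        i = (i + 1) % TABLE_SIZE
    return None

lp_insert("apple")
lp_insert("mango")
assert lp_search("apple") is not None
assert lp_search("pear") is None
```

Note that a key displaced by collisions sits several probes away from its home cell, which is exactly the "key distance" the next slide quantifies.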
Performance Analysis
Average performance is O(1)
However, worst-case performance is O(n)
In fact, the likelihood that a key sits at distance > 1 from its home bucket is fairly high

[Figure: probability that key distance exceeds 1 and exceeds 2, plotted against load m/n from 10% to 100%; the probabilities grow to roughly 0.4.]

Keys at distance > 1 take twice the time to be probed; keys at distance > 2 take three times as long. So there is a fairly high probability that throughput drops to one half or one third of the peak throughput.
Hashing in Network Processors
High query latency (memory access)
» Hide latency with multiple threads

[Figure: multiple query-request threads issuing lookups into an off-chip hash table.]
Segmented Hashing
Uses the power of multiple choices
» proposed earlier by Azar et al.
An N-way segmented hash
» Logically divides the hash table array into N equal segments
» Maps each incoming key onto one bucket in each segment
» Picks the bucket which is either empty or has the fewest keys

[Figure: a 4-way segmented hash table; key k_i hashes to one candidate bucket per segment and is placed in the least-loaded one, as is the next key k_i+1.]
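The insertion rule above can be sketched as follows. This is a simplified illustration, not the paper's implementation; the segment and bucket counts are arbitrary, and a salted built-in hash stands in for the per-segment hash functions.

```python
# N-way segmented hash: hash the key once per segment, then insert into
# the candidate bucket with the fewest keys across all segments.
N_SEGMENTS = 4
BUCKETS_PER_SEGMENT = 8

# Each segment is an array of buckets; each bucket is a collision chain.
segments = [[[] for _ in range(BUCKETS_PER_SEGMENT)]
            for _ in range(N_SEGMENTS)]

def seg_hash(key: str, seg: int) -> int:
    # One hash function per segment (salted hash as a stand-in).
    return hash((seg, key)) % BUCKETS_PER_SEGMENT

def seg_insert(key: str) -> None:
    candidates = [segments[s][seg_hash(key, s)] for s in range(N_SEGMENTS)]
    min(candidates, key=len).append(key)   # emptiest candidate wins

def seg_search(key: str) -> bool:
    # Without filters, a query probes one bucket in every segment.
    return any(key in segments[s][seg_hash(key, s)]
               for s in range(N_SEGMENTS))

for k in ["apple", "mango", "kiwi"]:
    seg_insert(k)
assert seg_search("kiwi")
assert not seg_search("pear")
```

Note that `seg_search` touches all N segments per query; that cost is exactly the deficiency the next slides address with per-segment Bloom filters.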
Segmented Hash Performance
More segments improve the probabilistic performance
» With 64 segments, the probability that a key is inserted at distance > 2 is nearly zero, even at 100% load
» The improvement in average-case performance is still modest

[Figure: probability that key distance exceeds 1 (left panel) and exceeds 2 (right panel) versus load m/n from 10% to 100%, for 1, 4, 8, 16, 32, and 64 segments; probabilities span 1E-15 to 1E+00.]
An Obvious Deficiency
Even when every key is at distance one, every query requires at least N memory probes
» Average probes are O(N), compared to O(1) for a naive table
– If the system is bandwidth limited, this means N times lower throughput
To ensure O(1) operations, the segmented hash table uses on-chip Bloom filters
» On-chip memory requirements are quite modest: 1-2 bytes per hash table bucket
Each segment has a Bloom filter, which supports membership queries
» These on-chip filters are queried before making an off-chip hash table memory reference
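The filtering step above can be sketched as follows. This is an illustrative model, not the paper's design: filter size, hash count, and the salted-hash construction are assumptions, and the bitmaps stand in for on-chip memory.

```python
# Per-segment Bloom filters gate off-chip probes: each segment keeps a
# small bit array; its hash-table bucket is probed only if the filter
# reports the key as (possibly) present.
N_SEGMENTS = 4
FILTER_BITS = 64
K_HASHES = 3

filters = [0] * N_SEGMENTS          # one Bloom-filter bitmap per segment

def filter_positions(key: str):
    # k bit positions per key (salted hash as a stand-in).
    return [hash((j, key)) % FILTER_BITS for j in range(K_HASHES)]

def filter_insert(seg: int, key: str) -> None:
    for b in filter_positions(key):
        filters[seg] |= 1 << b

def segments_to_probe(key: str):
    # Only segments whose filter matches cost a memory reference;
    # a match can be a false positive, but never a false negative.
    return [s for s in range(N_SEGMENTS)
            if all(filters[s] >> b & 1 for b in filter_positions(key))]

filter_insert(2, "apple")
assert 2 in segments_to_probe("apple")   # the true segment always matches
```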
Adding per Segment Filters
[Figure: key k_i can go to any of 3 candidate buckets; each segment has an m_b-bit Bloom filter indexed by hash functions h1(ki), h2(ki), …, hk(ki).]

We can select any of the three candidate segments and insert the key into the corresponding filter
False Positive Rates
With Bloom filters, there is a likelihood of false positives
» A filter might report that a key is present in its segment when the key is actually absent
With N segments, the false positive rate will clearly be at least N times higher
» In fact, it will be even higher, because we must also consider the permutations of false positives across segments
We propose the Selective Filter Insertion algorithm, which reduces the false positive rate by several orders of magnitude
Selective Filter Insertion Algorithm
[Figure: key k_i can go to any of 3 candidate buckets; each segment has an m_b-bit Bloom filter indexed by h1(ki), h2(ki), …, hk(ki).]

Insert the key into segment 4, since fewer bits are set there. Fewer set bits => lower false positive rate
With more segments (more choices), our algorithm sets far fewer bits in the Bloom filters
Selective Filter Insertion Results
[Figure: false positive probability versus bits per entry m_b/n_i (8 to 64), comparing a normal Bloom filter against Selective Filter Insertion, for k = 8 and for the optimum k; probabilities span 1E-11 to 1E-01.]
Selective Filter Insertion Details
First we build the set of segments into which the arriving key can be inserted; we call it {minSet}
» i.e., these segments have the minimum (and equal) collision-chain length at the corresponding hash index
A naive, greedy algorithm would choose the segment whose Bloom filter has the fewest bits set
» This leads to unbalanced segments
» An already loaded segment is likely to receive further keys, because its filter array is more likely to require fewer new bit transitions
» Our simulations suggest that an enhanced insertion algorithm reduces the false positive rate by up to a further order of magnitude
Selective Filter Insertion Enhancement
Our aim is to keep the segments balanced while also reducing the bit transitions in the Bloom filters
1. Label a segment in {minSet} eligible if its occupancy is less than (1+δ) times the occupancy of the least occupied segment. The parameter δ is typically set between 0.01 and 0.1.
2. If no segment is eligible, select the least occupied segment from {minSet}
3. Otherwise, choose the eligible segment with the minimum bit transitions
4. If multiple such segments exist, choose the least occupied one
5. If multiple such segments are still tied, break the tie with a round-robin arbitration policy
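The steps above can be sketched as follows. This is an illustrative selection function, not the paper's code: the input dictionaries (occupancy per segment, new filter-bit transitions per segment) are assumed precomputed, and the final round-robin tie-break is omitted for brevity.

```python
# Enhanced selection among {minSet}: prefer segments whose occupancy is
# within (1+delta) of the least occupied, then the fewest new filter-bit
# transitions, then the lowest occupancy.
def select_segment(min_set, occupancy, new_bit_transitions, delta=0.1):
    least = min(occupancy[s] for s in min_set)
    # Step 1: eligibility keeps the segments balanced.
    eligible = [s for s in min_set
                if occupancy[s] < (1 + delta) * least]
    if not eligible:
        # Step 2: fall back to the least occupied segment.
        return min(min_set, key=lambda s: occupancy[s])
    # Steps 3-4: fewest new bits set in the Bloom filter, then occupancy.
    return min(eligible,
               key=lambda s: (new_bit_transitions[s], occupancy[s]))

# Segments 0 and 1 are equally loaded, but inserting into segment 1
# would set fewer new filter bits, so it is chosen.
occ = {0: 10, 1: 10, 2: 30}
trans = {0: 3, 1: 1, 2: 0}
assert select_segment([0, 1, 2], occ, trans) == 1
```

Segment 2 is skipped despite setting zero new bits, because its occupancy exceeds the (1+δ) bound; that is the balancing behavior the enhancement adds over the naive greedy choice.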
Simulation Results
64K buckets, 32 bits/entry Bloom filter. The simulation runs for 500 phases.
» During every phase, 100,000 random searches are performed. Between phases, 10,000 random keys are deleted and inserted.

[Figure: average search time versus load (0-100%) for 1, 4, 16, and 64 segments, under Double Hashing (left panel) and Linear Chaining (right panel); average search time ranges from 1 to 1.5.]
Effectiveness of Modified Bloom Filters
Plotting average memory references at different successful search rates.
» Fewer memory references reflect the effectiveness of the filters. Load is kept at 80%.

[Figure: average memory references (1E-07 to 1E+01) versus probability of successful search (1E+00 down to 1E-10), comparing a single hash table, segmented hash with naive Bloom filters, and segmented hash with Selective Filter Insertion, at 32 filter bits per element (left panel) and 16 filter bits per element (right panel).]
Questions?