Blooming Trees: Space-Efficient Structures for Data Representation. Authors: Domenico Ficara, Stefano Giordano, Gregorio Procissi, Fabio Vitucci. Publisher: ICC 2008. Presenter: Yu-Ping Chiang. Date: 2009/05/20.


Page 1: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Trees: Space-Efficient Structures for Data Representation

Authors: Domenico Ficara, Stefano Giordano, Gregorio Procissi, Fabio Vitucci

Publisher: ICC 2008
Presenter: Yu-Ping Chiang
Date: 2009/05/20

Page 2: Blooming Trees:  Space-Efficient Structures  for Data Representation

Outline

Blooming Tree
    Lookup
    Insert
    Delete
Optimized Blooming Tree
    Lookup
    Insert
    Delete
Simulations

Page 3: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree

[Figure: an item is passed through the HASH FUNCTION to produce a bit string. The bit string is split into 3 bits, 1 bit, and 1 bit, and each part is used as an index into the tree. The tree has layers B0, B1, B2, and B3, with annotations "3 items", "2 items", and "1 item".]

Page 4: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree

n items, k0 hash functions, L+2 layers
    Layer 0 (B0): m = n*k0 / ln 2 bits
    Layers 1 to L (B1 to BL): 2^b bits per block (b = 1 in the following examples); the number of blocks changes as items are inserted and deleted
    Layer L+1 (BL+1): composed of c-bit counters

Hash function
    k0 hash functions, each with an output of log m + L*b bits
    log m bits for layer 0
    b bits at each of the following layers, to walk from layer 1 to layer L+1

[Figure: the example tree with layers B0, B1, B2, and B3 and annotations "3 items", "2 items", and "1 item".]
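The sizing rule and the split of the hash output can be sketched in a few lines of Python. This is only an illustration under stated assumptions: log m is taken to be an integer (as with m = 8 and a 3-bit index in the slides' example), indices are 0-based, and the names `layer0_bits`, `split_bits`, and `hashed_bits` are hypothetical. Salted SHA-256 is a stand-in for the k0 hash functions, which the slides do not specify.

```python
import hashlib
import math


def layer0_bits(n: int, k0: int) -> int:
    """Size of layer 0 from the slide: m = n * k0 / ln 2 (rounded up here)."""
    return math.ceil(n * k0 / math.log(2))


def split_bits(bits: str, log_m: int, b: int):
    """Split a hashed bit string into the layer-0 index and the b-bit groups."""
    index0 = int(bits[:log_m], 2)
    rest = bits[log_m:]
    groups = [int(rest[i:i + b], 2) for i in range(0, len(rest), b)]
    return index0, groups


def hashed_bits(item: bytes, seed: int, log_m: int, L: int, b: int) -> str:
    """Stand-in for one of the k0 hash functions (assumption: salted SHA-256)."""
    total = log_m + L * b  # output width given on the slide
    digest = hashlib.sha256(seed.to_bytes(4, "big") + item).digest()
    value = int.from_bytes(digest, "big") >> (256 - total)  # keep the top bits
    return format(value, f"0{total}b")


print(layer0_bits(1000, 4))       # layer-0 size for n = 1000 items, k0 = 4
print(split_bits("01010", 3, 1))  # item1 from the slides: (2, [1, 0])
```

With L = 2 and b = 1, the 5-bit string 01010 yields the layer-0 index 2 and one bit for each of the two following layers.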

Page 5: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree - lookup

Algorithm:
    Use the first log m bits of the hashed bit string as the layer 0 index.
    Compute a popcount on layer i; it gives the index of the couple (a 2-bit block, since b = 1) in layer i+1.
    Check the bit of the hashed bit string that corresponds to this layer: 0 selects the first bit of the couple, 1 selects the second bit.
    If the selected bit is 0, the result is NOT FOUND. Otherwise, continue the search in the next layer.

Time complexity: k0 [ hash + L ( popcount + 2 * check ) ]

[Figure: lookup example. item1 is hashed to the bit string 01010 and searched in the tree with layers B0 to B3.]

Page 6: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree - lookup (continued)

[Figure: lookup of item1, hashed bit string 01010, continued in the tree with layers B0 to B3; a bit with value 1 is marked.]

Page 7: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree - lookup (continued)

[Figure: lookup of item1, hashed bit string 01010, continued in the tree with layers B0 to B3; bits with values 0 and 1 are marked. Result: Match!]

Page 8: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree - lookup (continued)

[Figure: lookup of item2, hashed bit string 10000, in the tree with layers B0 to B3. Result: NOT FOUND!]
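The lookup steps can be sketched in Python for a single hash function. This is a minimal sketch, not the authors' implementation. It assumes each layer is stored as a flat Python list of bits, blocks of 2^b bits are laid out back to back, and indices are 0-based. The name `lookup_one` and the hand-built example tree are hypothetical. A complete lookup would repeat this for all k0 hash functions and report a match only if every one succeeds.

```python
def lookup_one(layers, counters, index0, groups, b=1):
    """Follow one hashed bit string through B0..BL down to the counter layer.

    layers   -- [B0, B1, ..., BL], each a flat list of 0/1 values
    counters -- the last layer (BL+1), one counter per set bit of BL
    index0   -- value of the first log m hash bits
    groups   -- the L following b-bit groups of the hash output
    """
    block_size = 1 << b
    pos = index0                       # position inside the current layer
    for level, layer in enumerate(layers):
        if layer[pos] == 0:            # check: an unset bit means NOT FOUND
            return False
        rank = sum(layer[:pos])        # popcount of the bits before pos
        if level == len(layers) - 1:   # BL points directly at a counter
            return counters[rank] > 0
        pos = rank * block_size + groups[level]
    return False


# Hand-built tree holding only the bit string 01010 (m = 8, L = 2, b = 1).
layers = [[0, 0, 1, 0, 0, 0, 0, 0], [0, 1], [1, 0]]
counters = [1]

print(lookup_one(layers, counters, 2, [1, 0]))  # 01010 -> True
print(lookup_one(layers, counters, 4, [0, 0]))  # 10000 -> False
```

The popcount is written as `sum(layer[:pos])` for readability; an implementation on packed machine words would use a hardware population-count instruction instead.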

Page 9: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree - insert

Algorithm:
    Use the first log m bits of the hashed bit string as the layer 0 index.
    In layer 1 to layer L+1, use the popcount of the layer above (layer 0 to layer L) and the hash bit for each layer as the index.
    If the bit in layer i is already set (a COLLISION), directly set the bit in layer i+1.
    Otherwise, allocate a new block (2^b bits) and insert it among the existing blocks of layer i+1.
    Increase the counter at layer L+1.

Time complexity: k0 [ hash + L ( popcount + shift + bitset ) ]

[Figure: insertion of item1, hashed bit string 01010, into a tree with layers B0 to B3; the annotation "allocate a new block (2^b bits)" appears at each of the three lower layers.]

Page 10: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree - insert (continued)

[Figure: insertion of item2, hashed bit string 00101; a second bit is set in B0 and the annotation "allocate a new block (2^b bits)" appears at each of the three lower layers.]

Page 11: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree - insert (continued)

[Figure: insertion of item3, hashed bit string 00101, the same bit string as item2. A collision occurs; the counters shown are 2 and 1.]
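The insert steps above can be sketched as follows for one hash function. The sketch assumes the same flat-list layout as a plain Python model of the layers (blocks of 2^b bits stored back to back, 0-based indices); `insert_one` is a hypothetical name, and list slicing stands in for the shift operation of the complexity formula.

```python
def insert_one(layers, counters, index0, groups, b=1):
    """Insert one hashed bit string, allocating blocks where no bit was set."""
    block_size = 1 << b
    last = len(layers) - 1
    pos = index0
    for level, layer in enumerate(layers):
        rank = sum(layer[:pos])            # popcount of the bits before pos
        if layer[pos] == 0:                # no collision: open a new branch
            layer[pos] = 1                 # bitset
            if level == last:
                counters.insert(rank, 0)   # new counter in layer L+1
            else:
                start = rank * block_size  # shift later blocks to make room
                layers[level + 1][start:start] = [0] * block_size
        if level == last:
            counters[rank] += 1            # increase the count at layer L+1
        else:
            pos = rank * block_size + groups[level]


# Empty tree with m = 8, L = 2, b = 1.
layers = [[0] * 8, [], []]
counters = []

insert_one(layers, counters, 2, [1, 0])  # item1: 01010
insert_one(layers, counters, 1, [0, 1])  # item2: 00101
insert_one(layers, counters, 1, [0, 1])  # item3: 00101, collides with item2

print(layers)    # [[0, 1, 1, 0, 0, 0, 0, 0], [1, 0, 0, 1], [0, 1, 1, 0]]
print(counters)  # [2, 1]
```

Inserting item2 places its new blocks in front of item1's blocks, because its layer-0 bit precedes item1's bit; this is why blocks must be inserted among the existing ones rather than appended.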

Page 12: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree - delete

Algorithm:
    Trace the item down to the last layer and decrease its counter.
    If the counter is not equal to 0, terminate processing.
    Otherwise, remove the block, then check the upper layer: if this item was the only one in its block, remove that block too. Process the upper layers recursively.

[Figure: deletion of item1, hashed bit string 00100, from the tree with layers B0 to B3; the annotation "Remove empty block" marks a block that is removed.]

Page 13: Blooming Trees:  Space-Efficient Structures  for Data Representation

Blooming Tree - delete (continued)

[Figure: deletion of item2, hashed bit string 01001, from the tree with layers B0 to B3.]
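The delete steps can be sketched in Python for one hash function. As with counting Bloom filters, deletion is only meaningful for items that were actually inserted. The sketch assumes a flat-list layout (each layer a Python list of bits, blocks of 2^b bits back to back, 0-based indices); `delete_one` and the starting state are hypothetical.

```python
def delete_one(layers, counters, index0, groups, b=1):
    """Delete one hashed bit string; return False if its path is not present."""
    block_size = 1 << b
    last = len(layers) - 1
    path = []                          # position visited in each layer
    pos = index0
    for level, layer in enumerate(layers):
        if layer[pos] == 0:
            return False               # nothing to delete
        rank = sum(layer[:pos])        # popcount of the bits before pos
        path.append(pos)
        if level < last:
            pos = rank * block_size + groups[level]
    counters[rank] -= 1                # rank now indexes the counter layer
    if counters[rank] > 0:
        return True                    # other items still share this path
    del counters[rank]
    for level in range(last, -1, -1):  # walk back up, pruning empty blocks
        pos = path[level]
        layers[level][pos] = 0
        if level == 0:
            break                      # B0 has a fixed size, nothing to remove
        start = pos - pos % block_size
        if any(layers[level][start:start + block_size]):
            break                      # block still used by another item
        del layers[level][start:start + block_size]
    return True


# State after inserting 01010 once and 00101 twice (m = 8, L = 2, b = 1).
layers = [[0, 1, 1, 0, 0, 0, 0, 0], [1, 0, 0, 1], [0, 1, 1, 0]]
counters = [2, 1]

delete_one(layers, counters, 1, [0, 1])  # counter 2 -> 1, no block removed
delete_one(layers, counters, 1, [0, 1])  # counter reaches 0, blocks removed
print(layers)    # [[0, 0, 1, 0, 0, 0, 0, 0], [0, 1], [1, 0]]
print(counters)  # [1]
```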

Page 14: Blooming Trees:  Space-Efficient Structures  for Data Representation

Outline

Blooming Tree
    Lookup
    Insert
    Delete
Optimized Blooming Tree
    Lookup
    Insert
    Delete
Simulations

Page 15: Blooming Trees:  Space-Efficient Structures  for Data Representation

Optimized Blooming Tree

[Figure: Optimized Blooming Tree with layers B0, B1, B2, and B3 and annotations "3 items", "2 items", and "1 item". Alongside the layers, the figure shows a "bitmap" and "Hash substrings" (values shown: 01100, 01, 1).]

Page 16: Blooming Trees:  Space-Efficient Structures  for Data Representation

Optimized Blooming Tree - lookup

Algorithm:
    Access B0.
    Check the bitmap.
        If the bitmap bit is 1, directly compare the last L*b bits of the hashed bit string with the stored hash substring, and terminate processing.
        Otherwise, the lookup proceeds as previously defined.
    Repeat recursively at each level.

[Figure: Optimized Blooming Tree with layers B0 to B3, a "bitmap", and "Hash substrings" (values shown: 01100, 01, 1); annotations "3 items", "2 items", and "1 item".]

Page 17: Blooming Trees:  Space-Efficient Structures  for Data Representation

Optimized Blooming Tree - lookup (continued)

[Figure: lookup of item1, hashed bit string 10001, in the Optimized Blooming Tree with layers B0 to B3, a "bitmap", and "Hash substrings" (values shown: 01100, 01, 1); annotations "Popcount = 3" and "Popcount = 2".]
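The shortcut in the optimized lookup can be illustrated with a toy model. This is not the bit-level layout of the structure, which these slides do not spell out. It is a pointer-based sketch under loud assumptions: each entry is either `None` (bit not set), a `("suffix", groups)` pair standing for a bitmap bit of 1 with a stored hash substring, or a `("block", children)` pair standing for a regular block that must be descended. All names are hypothetical.

```python
def optimized_lookup(entry, groups):
    """Toy model: compare a stored hash substring, or descend one level."""
    if entry is None:                 # bit not set: NOT FOUND
        return False
    kind, payload = entry
    if kind == "suffix":              # bitmap bit is 1: compare and terminate
        return payload == list(groups)
    if not groups:                    # regular path followed to the bottom
        return True
    child = payload[groups[0]]        # regular block: pick the branch
    return optimized_lookup(child, groups[1:])


# m = 8, L = 2, b = 1. Entry 2 stores a whole substring; entry 1 is shared
# by two items that diverge at the first group.
table = [None] * 8
table[2] = ("suffix", [1, 0])
table[1] = ("block", [("suffix", [1]), ("suffix", [0])])

print(optimized_lookup(table[2], [1, 0]))  # 01010 -> True
print(optimized_lookup(table[1], [0, 1]))  # 00101 -> True
print(optimized_lookup(table[1], [1, 1]))  # 00111 -> False
print(optimized_lookup(table[4], [0, 0]))  # 10000 -> False
```

In this model, a collision-free item is answered with one comparison instead of a popcount per layer, which is the saving the slide describes.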

Page 18: Blooming Trees:  Space-Efficient Structures  for Data Representation

Optimized Blooming Tree - insert

Without collision:
    Add a zero-block.
    Set the bit string and the hash substring.

[Figure: insertion of item1, hashed bit string 01101, into the Optimized Blooming Tree with layers B0 to B3; the "bitmap" and "Hash substrings" are updated.]

Page 19: Blooming Trees:  Space-Efficient Structures  for Data Representation

Optimized Blooming Tree - insert

With collision:
    Set the corresponding branches.

[Figure: insertion of item2, hashed bit string 01001, into the Optimized Blooming Tree with layers B0 to B3; the "bitmap" and "Hash substrings" are updated.]

Page 20: Blooming Trees:  Space-Efficient Structures  for Data Representation

Optimized Blooming Tree - delete

[Figure: deletion of item2, hashed bit string 01001, from the Optimized Blooming Tree with layers B0 to B3.]

Page 21: Blooming Trees:  Space-Efficient Structures  for Data Representation

Outline

Blooming Tree
    Lookup
    Insert
    Delete
Optimized Blooming Tree
    Lookup
    Insert
    Delete
Simulations

Page 22: Blooming Trees:  Space-Efficient Structures  for Data Representation

Simulation

Size comparison

Page 23: Blooming Trees:  Space-Efficient Structures  for Data Representation

Simulation

Built on the Intel IXP2800 network processor (NP)