
Beyond Bloom Filters: From Approximate Membership Checks to Approximate State Machines

By F. Bonomi et al.

Presented by Kenny Cheng, Tonny Mak Yui Kuen

Introduction

A) Motivation

B) Objectives

C) Problem statements

A) Motivation

• Increasing trend to keep flow state in routers

• Storing state for a large number of flows needs a lot of memory (~100 bits per flow)

• If the memory needed per flow can be reduced, the state can be kept in fast on-chip memory to improve performance

B) Objectives

• Introduce the idea of an Approximate Concurrent State Machine (ACSM), which sacrifices some accuracy for a smaller memory size

• Introduce and compare several solutions to the ACSM problem

• Find the approach with the highest accuracy-to-memory ratio

C) Problem statements

• Describe 3 techniques based on Bloom filters and hashing, and evaluate them using both theoretical analysis and simulation

Bloom Filter

• A data structure proposed by Bloom in 1970

• Designed for membership tests, i.e. to test whether an element exists in a set

• Fast and compact

• Chance of false positives, i.e. an element not in the set may be wrongly reported as a member

• No false negatives, i.e. an element in the set is always reported as a member
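
For reference, the chance of a false positive can be estimated with the standard textbook approximation below. The symbols m (number of bits in the array) and n (number of inserted elements) do not appear on the slides and are introduced here only for illustration; k is the number of hash functions, assumed independent and uniform.

```latex
% Approximate false positive probability after inserting n elements
% into an m-bit Bloom filter with k hash functions:
p \approx \left(1 - e^{-kn/m}\right)^{k}

% This is minimized at k = (m/n) \ln 2, which gives
p_{\min} \approx \left(\tfrac{1}{2}\right)^{k} \approx 0.6185^{\,m/n}
```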

How a Bloom Filter Works

• A bit array with all zeros initially

• k hash functions

[Figure: hash functions 1, 2, 3, ..., k above a bit array of all zeros]

How a Bloom Filter Works

Insertion

• Hash the element using the hash functions to get k indices in the bit array

• Set those bits to 1

[Figure: inserting x changes the bit array from 0 0 0 0 0 0 0 0 0 0 0 0 0 0 to 0 0 1 0 0 0 0 0 1 1 0 0 0 1]

How a Bloom Filter Works

Lookup

• Hash the element using the hash functions

• If all corresponding bits are 1, it’s in the set

[Figure: looking up x in the bit array 0 0 1 0 0 1 0 0 1 1 1 0 0 1; all bits at the hashed positions of x are 1]
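
The insertion and lookup steps above can be sketched as follows. This is a minimal illustration, not the presenters' code: the class name, the string-typed items, and the way k indices are derived from salted SHA-256 digests are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Bit array of m bits probed by k hash functions (illustrative sketch)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = [0] * m  # all zeros initially

    def _indices(self, item: str) -> list[int]:
        # Assumption: emulate k hash functions by salting one hash
        # with the function number 0..k-1.
        result = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            result.append(int.from_bytes(digest[:8], "big") % self.m)
        return result

    def insert(self, item: str) -> None:
        # Set the k hashed bits to 1.
        for idx in self._indices(item):
            self.bits[idx] = 1

    def lookup(self, item: str) -> bool:
        # True only if every hashed bit is 1; may be a false positive,
        # but is never a false negative.
        return all(self.bits[idx] for idx in self._indices(item))
```

An empty filter reports nothing as present, and any inserted element is always found afterwards.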

How a Bloom Filter Works

Deletion

• Sorry, no deletion

• You don’t know whether the bits are used by other elements, so you cannot simply clear them

[Figure: deleting x from the bit array 0 0 1 0 0 1 0 0 1 1 1 0 0 1 leaves its hashed bits undetermined: 0 0 ? 0 0 1 0 0 ? ? 1 0 0 ?]

Counting Bloom Filter

• Use a counter to replace each bit

• For insertion, increment the counters

• For deletion, decrement the counters

• Problems: more space, counter overflow

[Figure: inserting x changes the counters from 0 0 0 0 1 0 0 0 0 0 3 0 0 2 to 0 0 1 0 1 0 0 0 1 1 3 0 0 3]
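
A counting Bloom filter along these lines might look like the sketch below. The names, the salted SHA-256 hashing, and the choice to saturate counters at `max_count` (15, as for a 4-bit counter) are assumptions for illustration; the slides only name overflow as a problem and do not say how to handle it.

```python
import hashlib


def hashed_indices(item: str, k: int, m: int) -> list[int]:
    # Assumption: k hash functions emulated by salting SHA-256.
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big") % m
        for i in range(k)
    ]


class CountingBloomFilter:
    def __init__(self, m: int, k: int, max_count: int = 15) -> None:
        self.m, self.k, self.max_count = m, k, max_count
        self.counters = [0] * m

    def insert(self, item: str) -> None:
        for idx in hashed_indices(item, self.k, self.m):
            # Saturate instead of wrapping around on overflow.
            if self.counters[idx] < self.max_count:
                self.counters[idx] += 1

    def lookup(self, item: str) -> bool:
        return all(self.counters[idx] > 0
                   for idx in hashed_indices(item, self.k, self.m))

    def delete(self, item: str) -> bool:
        if not self.lookup(item):
            return False
        for idx in hashed_indices(item, self.k, self.m):
            # A saturated counter has lost its true count, so leave it stuck.
            if self.counters[idx] < self.max_count:
                self.counters[idx] -= 1
        return True
```

Unlike the plain bit array, deleting one element here does not disturb another element that shares a cell, because the shared counter stays above zero.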

3 Approaches to ACSM

• Approaches:

1. Direct Bloom Filter
2. Stateful Bloom Filter
3. Fingerprint-compressed Filter

• Operations to implement:

1. Insert(flow, state)
2. Lookup(flow) returns (state)
3. Delete(flow)
4. Update(flow, new_state)

Direct Bloom Filter Approach

• Use a counting Bloom filter

• 4 operations:

Insert: insert the (flow_id, state) pair
Lookup: if the state is not provided, look up every state; return “don’t know” if more than one state is found
Delete: lookup + decrement counters
Update: delete old + insert new

• Improvement: use timing-based deletion to handle non-terminated flows
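
The four operations can be sketched on top of a counting Bloom filter keyed by the (flow_id, state) pair. Everything named here (`DirectBloomACSM`, `DONT_KNOW`, the `"flow|state"` key encoding, the salted SHA-256 hashing) is a hypothetical choice for illustration, and counter overflow and timing-based deletion are left out to keep the sketch short.

```python
import hashlib

DONT_KNOW = "DK"  # hypothetical sentinel for the "don't know" answer


class DirectBloomACSM:
    def __init__(self, m: int, k: int, states) -> None:
        self.m, self.k = m, k
        self.states = list(states)   # every state a flow may be in
        self.counters = [0] * m      # counting Bloom filter cells

    def _indices(self, flow: str, state) -> list[int]:
        key = f"{flow}|{state}"      # hash the (flow_id, state) pair
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big") % self.m
            for i in range(self.k)
        ]

    def _present(self, flow: str, state) -> bool:
        return all(self.counters[i] > 0 for i in self._indices(flow, state))

    def insert(self, flow: str, state) -> None:
        for i in self._indices(flow, state):
            self.counters[i] += 1

    def lookup(self, flow: str):
        # The state is not stored, so every possible state must be tried.
        found = [s for s in self.states if self._present(flow, s)]
        if not found:
            return None              # flow not found
        return found[0] if len(found) == 1 else DONT_KNOW

    def delete(self, flow: str) -> bool:
        state = self.lookup(flow)
        if state is None or state == DONT_KNOW:
            return False             # cannot tell which pair to remove
        for i in self._indices(flow, state):
            self.counters[i] -= 1
        return True

    def update(self, flow: str, new_state) -> bool:
        # Delete old + insert new.
        if not self.delete(flow):
            return False
        self.insert(flow, new_state)
        return True
```

The cost of this design shows up in `lookup`: one membership test per possible state, which is what the Stateful Bloom Filter approach later avoids.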

Timing-based Deletion

• Add a timing bit to each cell

• Set the bit if the cell is touched

• Periodically clear untouched cells and reset the timing bits

• Alternative to DBF: use a standard Bloom filter instead of a counting one, and delete elements only by timing-based deletion

[Figure: counters 0 0 3 3 0 1 2 0 1 1 0 1 0 2 with timing bits 0 0 1 0 0 0 0 0 1 1 0 0 0 1 (cells touched by x); after the periodic sweep the counters are 0 0 3 0 0 0 0 0 1 1 0 0 0 2 and the timing bits are all 0]
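
A minimal sketch of timing-based deletion on a standard (non-counting) Bloom filter is given below. The class and method names are hypothetical, and treating a successful lookup as a "touch" is an assumption: the slides only say that a cell's timing bit is set when the cell is touched.

```python
import hashlib


class TimedBloomFilter:
    def __init__(self, m: int, k: int) -> None:
        self.m, self.k = m, k
        self.bits = [0] * m
        self.timing = [0] * m   # one timing bit per cell

    def _indices(self, item: str) -> list[int]:
        # Assumption: k hash functions emulated by salting SHA-256.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big") % self.m
            for i in range(self.k)
        ]

    def insert(self, item: str) -> None:
        for idx in self._indices(item):
            self.bits[idx] = 1
            self.timing[idx] = 1        # cell touched

    def lookup(self, item: str) -> bool:
        idxs = self._indices(item)
        hit = all(self.bits[idx] for idx in idxs)
        if hit:
            for idx in idxs:
                self.timing[idx] = 1    # assumption: a hit also touches the cells
        return hit

    def sweep(self) -> None:
        # Periodic pass: clear cells not touched since the last sweep,
        # then reset all timing bits.
        for idx in range(self.m):
            if not self.timing[idx]:
                self.bits[idx] = 0
            self.timing[idx] = 0
```

A flow that keeps sending packets keeps touching its cells and survives each sweep, while a flow that never terminates cleanly simply ages out after one idle period.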

Stateful Bloom Filter Approach

• The Direct Bloom Filter doesn’t store the state of a flow, so a lookup has to try every state

• Improvement: add a state value to each cell for faster lookup

• Hash flow_id only, instead of the (flow_id, state) pair

• Introduce a “don’t know” (DK) state for when a collision occurs

• Keep timing-based deletion

Stateful Bloom Filter Approach

• Insert, modify, delete: similar to the Direct Bloom Filter, but set the cell value to DK on a collision (counter > 1)

• Lookup:

If all cells are DK, return DK
If all cells are either state i or DK, return state i
If more than one state other than DK is found, return “not found”
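
The lookup rules above can be sketched as a small function over the cell values a flow hashes to. This is a simplified illustration: the names `EMPTY`, `DK`, `resolve`, and `StatefulBloomFilter` are hypothetical, per-cell counters and deletion are omitted, and returning "not found" for an empty cell is an assumption the slide does not state explicitly.

```python
import hashlib

EMPTY = None   # hypothetical marker for an unused cell
DK = "DK"      # hypothetical marker for the "don't know" state


def resolve(values):
    """Apply the lookup rules to the cell values of one flow."""
    if any(v is EMPTY for v in values):
        return None                  # assumption: an empty cell means not found
    states = {v for v in values if v != DK}
    if not states:
        return DK                    # all cells are DK
    if len(states) == 1:
        return states.pop()          # all cells are state i or DK
    return None                      # conflicting states: not found


class StatefulBloomFilter:
    def __init__(self, m: int, k: int) -> None:
        self.m, self.k = m, k
        self.cells = [EMPTY] * m     # each cell holds a state value

    def _indices(self, flow: str) -> list[int]:
        # Hash flow_id only, not the (flow_id, state) pair.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{flow}".encode()).digest()[:8], "big") % self.m
            for i in range(self.k)
        ]

    def insert(self, flow: str, state) -> None:
        for idx in self._indices(flow):
            if self.cells[idx] is EMPTY:
                self.cells[idx] = state
            elif self.cells[idx] != state:
                self.cells[idx] = DK  # collision with a different state

    def lookup(self, flow: str):
        return resolve([self.cells[idx] for idx in self._indices(flow)])
```

A single lookup now reads k cells once, instead of running one membership test per possible state.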

Fingerprint-compressed Filter Approach

• Store a fingerprint of the flow + its state in a d-left hash table

[Figure: x is hashed into d subtables (1, 2, ..., d) and stored as the entry (fingerprint 1110001000, state 1); the buckets hold (fingerprint, state) entries such as (1001010110, 1), (1100110000, 4), (0110111010, 2), (0111010100, 1), (1110011101, 3), (1100000110, 3), (0000111101, 3)]

Fingerprint-compressed Filter Approach

• Insert: hash the element to find the corresponding bucket in each hash table, then insert the fingerprint + state into the bucket with the fewest elements (choose the left-most one to break ties)

• Lookup: retrieve the state stored with the fingerprint

• Delete: remove the fingerprint

• Update: direct update, or remove old + add new

• Use DK when a fingerprint is found in multiple buckets

• Timing-based deletion can still be applied
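
The operations above can be sketched with d subtables whose buckets map fingerprints to states. This is a simplified illustration under several assumptions: the names are hypothetical, buckets are unbounded Python dicts rather than fixed-size cells, the fingerprint and the bucket indices come from separately salted SHA-256 digests, and timing-based deletion is omitted.

```python
import hashlib

DK = "DK"  # hypothetical marker for the "don't know" answer


class FingerprintFilter:
    def __init__(self, d: int, buckets: int, fp_bits: int = 10) -> None:
        self.d, self.n_buckets, self.fp_bits = d, buckets, fp_bits
        # tables[t][b] is a bucket mapping fingerprint -> state
        self.tables = [[{} for _ in range(buckets)] for _ in range(d)]

    def _hash(self, salt: str, flow: str) -> int:
        digest = hashlib.sha256(f"{salt}:{flow}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def _candidates(self, flow: str):
        # One fingerprint per flow and one candidate bucket per subtable.
        fp = self._hash("fp", flow) % (1 << self.fp_bits)
        return fp, [self.tables[t][self._hash(str(t), flow) % self.n_buckets]
                    for t in range(self.d)]

    def insert(self, flow: str, state) -> None:
        fp, buckets = self._candidates(flow)
        # min() returns the first minimum, i.e. the left-most least-loaded bucket.
        min(buckets, key=len)[fp] = state

    def lookup(self, flow: str):
        fp, buckets = self._candidates(flow)
        hits = [b[fp] for b in buckets if fp in b]
        if not hits:
            return None
        return hits[0] if len(hits) == 1 else DK

    def update(self, flow: str, new_state) -> bool:
        fp, buckets = self._candidates(flow)
        holders = [b for b in buckets if fp in b]
        if len(holders) != 1:
            return False             # absent or ambiguous
        holders[0][fp] = new_state   # direct update in place
        return True

    def delete(self, flow: str) -> bool:
        fp, buckets = self._candidates(flow)
        holders = [b for b in buckets if fp in b]
        if len(holders) != 1:
            return False
        del holders[0][fp]
        return True
```

Because the state sits next to the fingerprint, lookup, update, and delete each touch only the d candidate buckets of the flow.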

Simulation

• To investigate the size/accuracy trade-off for the 3 approaches

• State machine: 10 states

• Legal state changes: 1 → 2 → 3 → … → 10

• Run for 1 million flows

• About 60000 simultaneous flows

• 100 ± 40 packets for each flow

• Some packets trigger a state change

Simulation

• 3 kinds of simulation flows

• Interesting flows (30%) – flows with legal state changes only, always complete

• Noise flows (30%) – flows with random (can be legal or illegal) state changes, never complete

• Random flows (40%) – flows without state change

Simulation

False positive rate: % of completed flows that are not interesting

False negative rate: % of interesting flows that do not complete
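
As a sketch of how these two rates could be computed from simulation output, assume the simulator yields the set of flow IDs that were truly interesting and the set the ACSM reported as completed. The function name and the set-based inputs are hypothetical.

```python
def error_rates(interesting: set, completed: set) -> tuple:
    """Return (false_positive_rate, false_negative_rate) in percent."""
    # False positives: flows reported as completed that are not interesting.
    false_pos = len(completed - interesting)
    # False negatives: interesting flows that were never reported as completed.
    false_neg = len(interesting - completed)
    fp_rate = 100.0 * false_pos / len(completed) if completed else 0.0
    fn_rate = 100.0 * false_neg / len(interesting) if interesting else 0.0
    return fp_rate, fn_rate


# Example: 4 interesting flows, 4 reported completions, one error each way.
print(error_rates({"a", "b", "c", "d"}, {"a", "b", "c", "x"}))  # (25.0, 25.0)
```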

Applications

Place in application-level QoS:

• Video congestion control

• Peer-to-Peer (P2P) traffic identification

Video congestion control

• Apply to MPEG video streaming

• 3 kinds of frames for MPEG video:

I frame: scene information
P frame: differential information
B frame: least important information

• Up to 30% of B frames can be dropped with acceptable quality

• Need to keep track of the current frame

Video congestion control

• Use FCF ACSM to keep track of state

• Experimentally the highest false positive rate acceptable is 0.37%

• This requires a memory size of 27 bits per flow (about ¼ compared to original 100 bits)

P2P Traffic Identification

• To limit P2P flows to increase quality for other applications

• One possible way to identify a P2P flow: concurrent TCP and UDP flows

• Use ACSM for real-time P2P identification

Conclusion

• ACSMs are feasible

• The FCF approach is the best of the three

• Two potential applications of ACSM are introduced

• ACSM may be beneficial to QoS applications, which are fault-tolerant

Comments

• Authors focus on accuracy and memory size, but not real performance

• FCF approach may not perform well on hardware

- End -

Question & Answer