Twitter Frenzy FPGA Data Stream Processing

Cory Kleinheksel (Team Leader), Tim Meyer, David Graziano, Josh Clausman


Page 1: Twitter Frenzy  FPGA Data Stream Processing

Twitter Frenzy  FPGA Data Stream Processing

Cory Kleinheksel (Team Leader)

Tim Meyer

David Graziano

Josh Clausman

Page 2: Twitter Frenzy  FPGA Data Stream Processing

Project Idea

• Twitter Frenzy - A way to filter tweets as a set of frequencies, using an FPGA to perform packet analysis.

• Accelerate the stream processing of Twitter data queries.

• Specifically, accelerate computationally intensive, long-lived queries over short-lived data.

• The design/implementation of a frequency-based query will be the primary focus (interesting application of signal processing).

 

Page 3: Twitter Frenzy  FPGA Data Stream Processing

Details

• Input: Live (or simulated) Twitter stream data
    • Java program used to simulate the Twitter feed by reading from a dataset

• Processing:
    1. Extract tweets from input stream
    2. Filter tweets based on query parameters
        • Text Matching
    3. Determine tweet frequency components
        • Frequency Analysis
    4. Apply signal filter (signal processing)

• Output: Tweets matching filter
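
The four processing steps above can be sketched in software. This is a minimal sketch under stated assumptions, not the team's simulator: the one-JSON-object-per-line input, the `text` and `timestamp` field names, and the reading of "frequency" as matching tweets per time bucket are all assumptions, and a plain count threshold stands in for the signal filter.

```python
import json
from collections import Counter


def extract_tweets(lines):
    """Step 1: parse one JSON object per line, skipping blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            yield obj


def matches_query(tweet, keywords):
    """Step 2: case-insensitive substring match on the (assumed) 'text' field."""
    text = str(tweet.get("text", "")).lower()
    return any(k.lower() in text for k in keywords)


def bucket_counts(tweets, bucket_seconds):
    """Step 3, under one interpretation: matching tweets per time bucket.

    Assumes every tweet carries an integer 'timestamp' in seconds.
    """
    return Counter(t["timestamp"] // bucket_seconds for t in tweets)


def run_pipeline(lines, keywords, bucket_seconds=60, min_count=2):
    """Run all four steps and return the tweets that survive."""
    matched = [t for t in extract_tweets(lines) if matches_query(t, keywords)]
    counts = bucket_counts(matched, bucket_seconds)
    # Step 4 stand-in: keep tweets from buckets with at least min_count matches.
    return [t for t in matched
            if counts[t["timestamp"] // bucket_seconds] >= min_count]
```

Calling `run_pipeline(lines, ["fpga"])` on an iterable of JSON lines returns only the matching tweets that arrived in sufficiently busy time buckets.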

Page 4: Twitter Frenzy  FPGA Data Stream Processing

Design Issues

• Ability to acquire data from Twitter at a useful speed

• Determining packet usefulness (send/drop) in an efficient manner

• Managing concurrently arriving packets and multi-fragment packets

• How to calculate frequency and filter corresponding packets
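
One way to reason about concurrently arriving, multi-fragment packets is a reassembly buffer keyed by stream. The sketch below is illustrative only: the `stream_id`, `seq`, and `is_last` fields are hypothetical and do not come from the project's packet format.

```python
class FragmentReassembler:
    """Buffers fragments per stream until every piece of a message has arrived."""

    def __init__(self):
        self._pending = {}   # stream_id -> {seq: payload}
        self._last_seq = {}  # stream_id -> seq of the final fragment

    def add(self, stream_id, seq, is_last, payload):
        """Store one fragment; return the full message when complete, else None."""
        parts = self._pending.setdefault(stream_id, {})
        parts[seq] = payload
        if is_last:
            self._last_seq[stream_id] = seq
        last = self._last_seq.get(stream_id)
        # Incomplete until the final fragment is known and no sequence number is missing.
        if last is None or not all(i in parts for i in range(last + 1)):
            return None
        del self._pending[stream_id]
        del self._last_seq[stream_id]
        return b"".join(parts[i] for i in range(last + 1))
```

Fragments from different streams can be interleaved and can arrive out of order; each call returns `None` until the message for that stream is whole.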

Page 5: Twitter Frenzy  FPGA Data Stream Processing

Implementation Issues

• How to properly buffer and send fragmented tweets

• Time/clock cycles needed to perform frequency calculations

• Time to perform hashing (addressed by creating a lookup-table-based hashing block)

• Modules consuming data at different rates

• Debugging HW
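
The slides do not say which hash function the lookup-table block implements. As an illustration of the general idea, the sketch below uses Pearson hashing, a well-known table-driven 8-bit hash that needs one table read per input byte. The table contents here are an assumption chosen for the example.

```python
# Assumed table: any permutation of 0..255 works. An odd multiplier modulo 256
# guarantees that every value appears exactly once.
TABLE = [(i * 167 + 13) % 256 for i in range(256)]


def pearson_hash(data: bytes) -> int:
    """Return an 8-bit hash using one table lookup per input byte."""
    h = 0
    for byte in data:
        h = TABLE[h ^ byte]
    return h
```

A table-driven hash like this maps naturally onto a small ROM plus an XOR, which is one reason such schemes suit hardware; for example, `pearson_hash("fpga".encode()) % 16` would pick one of 16 buckets.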

Page 6: Twitter Frenzy  FPGA Data Stream Processing

System Architecture Diagram

 

Page 7: Twitter Frenzy  FPGA Data Stream Processing

Breakdown: Network Data Flow

 

Page 8: Twitter Frenzy  FPGA Data Stream Processing

Breakdown: Text Matching

Page 9: Twitter Frenzy  FPGA Data Stream Processing

Breakdown: Frequency Analysis

Page 10: Twitter Frenzy  FPGA Data Stream Processing

Algorithms

• Hashing

• String Matching

• Frequency Analysis

• Filtering (FIR)
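
For the FIR filtering step, a direct-form filter computes each output as a weighted sum of the most recent inputs, y[n] = sum over k of h[k] * x[n-k]. The sketch below is a generic software version with zero initial state; the tap values are left to the caller, since the slides do not give the project's coefficients.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum(taps[k] * x[n-k]), with x[n] = 0 for n < 0."""
    if not taps:
        raise ValueError("taps must not be empty")
    history = [0.0] * len(taps)  # history[k] holds x[n-k]
    out = []
    for x in samples:
        # Shift in the newest sample and drop the oldest.
        history = [x] + history[:-1]
        out.append(sum(h * v for h, v in zip(taps, history)))
    return out
```

Feeding a unit impulse through the filter returns the taps themselves, and four equal taps of 0.25 give a four-sample moving average.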

Page 11: Twitter Frenzy  FPGA Data Stream Processing

Project Results

• Analyzed the problem

• Implemented full simulator in software

• Implemented in VHDL

• Simulated in ModelSim

• Tested on hardware, confirmed results against software implementation

Dataset: JSON_29493.txt

Processed 29493 tweets

192 passed string filter

133 passed frequency filter
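
The counts above give the selectivity of each filter stage, worked out below as a small calculation. The function and key names are illustrative and are not part of the project.

```python
def pass_rates(total, passed_string, passed_frequency):
    """Return the fraction of tweets surviving each stage of the pipeline."""
    return {
        "string_filter": passed_string / total,
        "frequency_filter_given_string": passed_frequency / passed_string,
        "overall": passed_frequency / total,
    }


rates = pass_rates(29493, 192, 133)
for name, rate in rates.items():
    print(f"{name}: {rate:.2%}")
```

With the reported counts, roughly 0.65% of tweets pass the string filter, about 69% of those also pass the frequency filter, and about 0.45% of all tweets pass both.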

Page 12: Twitter Frenzy  FPGA Data Stream Processing

Software Simulator Example

Page 13: Twitter Frenzy  FPGA Data Stream Processing

Demo

Page 14: Twitter Frenzy  FPGA Data Stream Processing

References

Berinde, Indyk, Cormode, Strauss. "Space-optimal Heavy Hitters with Strong Error Bounds"

Cormode, Korn, Tirthapura. "Time-Decaying Aggregates in Out-of-order Streams"

Charikar, Chen, Farach-Colton. "Finding Frequent Items in Data Streams"