Twitter Frenzy FPGA Data Stream Processing - PowerPoint PPT Presentation

Presentation Transcript

  1. Twitter Frenzy FPGA Data Stream Processing
     Cory Kleinheksel (Team Leader), Tim Meyer, David Graziano, Josh Clausman

  2. Project Idea
     • Twitter Frenzy: a way to filter tweets as a set of frequencies, using an FPGA to perform packet analysis.
     • Accelerate the stream processing of Twitter data queries.
     • Specifically, accelerate computationally intensive, long-lived queries over data with short lifetimes.
     • The design and implementation of a frequency-based query is the primary focus (an interesting application of signal processing).

  3. Details
     • Input: live (or simulated) Twitter stream data
        • Java program simulates the Twitter feed by reading from a dataset
     • Processing:
        • Extract tweets from the input stream
        • Filter tweets based on query parameters (text matching)
        • Determine tweet frequency components (frequency analysis)
        • Apply signal filter (signal processing)
     • Output: tweets matching the filter
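The first two processing steps above (extract tweets, then filter by text matching) can be sketched in software as follows. This is a minimal illustration, not the project's simulator: it assumes the dataset is newline-delimited JSON and that each record carries hypothetical `timestamp` and `text` fields, and it uses a plain case-insensitive substring test as the query.

```python
import json


def extract_tweets(lines):
    """Yield (timestamp, text) pairs from newline-delimited JSON.

    The field names "timestamp" and "text" are assumptions made for this
    sketch; malformed or incomplete records are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        ts = record.get("timestamp")
        text = record.get("text")
        if isinstance(ts, (int, float)) and isinstance(text, str):
            yield ts, text


def matches_query(text, keywords):
    """Return True if any keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def filter_stream(lines, keywords):
    """Keep only the tweets whose text matches the query keywords."""
    return [
        (ts, text)
        for ts, text in extract_tweets(lines)
        if matches_query(text, keywords)
    ]
```

Calling `filter_stream(open_file, ["fpga"])` on an open dataset file would return the matching tweets in arrival order, ready for the frequency stage.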

  4. Design Issues
     • Acquiring data from Twitter at a useful speed
     • Determining packet usefulness (send/drop) efficiently
     • Managing concurrently arriving packets and multi-fragment packets
     • Calculating frequency and filtering the corresponding packets
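One way to picture the multi-fragment issue is a reassembly buffer keyed by packet identifier, so that fragments from different packets can interleave. The sketch below is a software illustration under stated assumptions: the `packet_id`, `index`, and `is_last` fields are hypothetical and do not describe the project's actual packet format or hardware buffering.

```python
class Reassembler:
    """Buffer fragments per packet until every fragment has arrived."""

    def __init__(self):
        # packet_id -> {"parts": {index: bytes}, "total": int or None}
        self._pending = {}

    def add(self, packet_id, index, is_last, payload):
        """Store one fragment; return the full payload once it is complete."""
        entry = self._pending.setdefault(packet_id, {"parts": {}, "total": None})
        entry["parts"][index] = payload
        if is_last:
            # The last fragment tells us how many fragments to expect.
            entry["total"] = index + 1
        total = entry["total"]
        if total is not None and all(i in entry["parts"] for i in range(total)):
            del self._pending[packet_id]
            return b"".join(entry["parts"][i] for i in range(total))
        return None
```

A hardware version would need a bounded buffer and a timeout or eviction policy for packets that never complete; this sketch omits both.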

  5. Implementation Issues
     • How to properly buffer and send fragmented tweets
     • Time/clock cycles needed to perform frequency calculations
     • Time to perform hashing
        • Created a lookup-table-based hashing block
     • Modules consuming data at different rates
     • Debugging hardware
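The slides do not say which hash function the lookup-table block implements. As an illustration of the general idea, Pearson hashing is a well-known table-driven scheme that needs one 256-entry table lookup per input byte, which suits a small ROM. The table contents and the seed below are arbitrary choices made for this sketch.

```python
import random


def make_table(seed=0):
    """Build a 256-entry permutation table (the seed is arbitrary)."""
    table = list(range(256))
    random.Random(seed).shuffle(table)
    return table


def pearson_hash(data, table):
    """Pearson hash: one XOR and one table lookup per input byte."""
    h = 0
    for byte in data:
        h = table[h ^ byte]
    return h


TABLE = make_table()
```

For example, `pearson_hash(b"fpga", TABLE)` returns an 8-bit value that could index a small table of query keywords.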

  6. System Architecture Diagram

  7. Breakdown: Network Data Flow

  8. Breakdown: Text Matching

  9. Breakdown: Frequency Analysis

  10. Algorithms
     • Hashing
     • String matching
     • Frequency analysis
     • Filtering (FIR)
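The frequency analysis and FIR filtering steps can be sketched together. The slides do not define how frequency is measured or which filter taps were used, so this sketch assumes that frequency means the number of matching tweets per fixed time window, and it applies a direct-form FIR filter, y[n] = Σ h[k]·x[n−k], to that sequence of counts.

```python
def counts_per_window(timestamps, window):
    """Count events in consecutive windows starting at the earliest timestamp."""
    if not timestamps:
        return []
    start = min(timestamps)
    n_windows = int((max(timestamps) - start) // window) + 1
    counts = [0] * n_windows
    for t in timestamps:
        counts[int((t - start) // window)] += 1
    return counts


def fir_filter(samples, taps):
    """Direct-form FIR filter; samples before the start are treated as zero."""
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]
        output.append(acc)
    return output
```

With taps `[0.5, 0.5]` the filter is a two-point moving average that smooths bursts in the per-window counts. How the project turns the filter output into a pass/drop decision for individual tweets is not stated in the slides.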

  11. Project Results
     • Analyzed the problem
     • Implemented full simulator in software
     • Implemented in VHDL
     • Simulated in ModelSim
     • Tested on hardware, confirmed results against software implementation
     • Dataset: JSON_29493.txt
        • Processed 29493 tweets
        • 192 passed string filter
        • 133 passed frequency filter

  12. Software Simulator Example

  13. Demo

  14. References
     • Berinde, Indyk, Cormode, Strauss. "Space-optimal Heavy Hitters with Strong Error Bounds"
     • Cormode, Korn, Tirthapura. "Time-Decaying Aggregates in Out-of-order Streams"
     • Charikar, Chen, Farach-Colton. "Finding Frequent Items in Data Streams"