Semantics and Evaluation Techniques for Window Aggregates in Data Streams

Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, Peter Tucker. Presented by: Venkatesh Raghvan and Charudatta Wad. CS 525 class discussion.


Presentation Transcript


  1. Semantics and Evaluation Techniques for Window Aggregates in Data Streams • Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, Peter Tucker • Presented by: Venkatesh Raghvan and Charudatta Wad • CS 525 class discussion

  2. Overview • Background • Problem Statement • Window semantics • WID approach • Discussion

  3. Background • Handling disorder: punctuations. • Aggregate queries: • In SQL? • In CQL (without WIDs)? • In sliding windows, what triggers an output?

  4. Problem Statement • Lack of explicit window semantics. • Implementation efficiency. • Out-of-order arrival of data.

  5. Running Example • Consider the example from the paper: • Schema: <seg-id, speed, ts> • Query: SELECT seg-id, max(speed), min(speed) FROM Traffic [RANGE 300 seconds SLIDE 60 seconds WATTR ts] GROUP BY seg-id
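
To make the running query concrete, here is a minimal Python sketch that evaluates it by brute force over a small, made-up stream. The window convention (window w covers ts in [w*60 - 300, w*60)), the function name `evaluate`, and the sample tuples are all assumptions for illustration; they are not taken from the paper, whose window indexing may differ.

```python
from collections import defaultdict

RANGE, SLIDE = 300, 60  # seconds, from the query's window specification

def evaluate(tuples, last_window):
    """Return {(window_id, seg_id): (max_speed, min_speed)} by scanning every window.

    Assumed convention: window w covers timestamps in [w*SLIDE - RANGE, w*SLIDE).
    """
    results = {}
    for w in range(1, last_window + 1):
        start, end = w * SLIDE - RANGE, w * SLIDE
        speeds = defaultdict(list)
        for seg_id, speed, ts in tuples:
            if start <= ts < end:
                speeds[seg_id].append(speed)
        # GROUP BY seg-id: one (max, min) pair per segment seen in this window.
        for seg_id, values in speeds.items():
            results[(w, seg_id)] = (max(values), min(values))
    return results

# Hypothetical sample stream of (seg-id, speed, ts) tuples.
traffic = [(1, 50, 10), (1, 65, 70), (2, 40, 75), (1, 30, 310)]
out = evaluate(traffic, last_window=6)
```

Each tuple is rescanned for every window here, which is exactly the cost that a smarter evaluation strategy tries to avoid.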

  6. Running Example • [Figure taken from the paper; not reproduced in this transcript.]

  7. Big Picture • Mapping tuples to window extents and vice versa. • New window semantics. • Window specification parameters: RANGE, SLIDE and WATTR.
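
The two-way mapping between tuples and window extents can be sketched in a few lines of Python. This is a sketch under an assumed convention, not the paper's exact definition: window w is taken to cover WATTR values in the half-open interval [w*SLIDE - RANGE, w*SLIDE). The function names `extent` and `wids` are chosen here for illustration.

```python
def extent(w, range_, slide):
    """Half-open interval [start, end) of WATTR values covered by window w."""
    end = w * slide
    return (end - range_, end)

def wids(value, range_, slide):
    """All window-ids whose extent contains the given WATTR value.

    Solving w*slide - range_ <= value < w*slide for integer w gives
    value // slide + 1 <= w <= (value + range_) // slide.
    """
    first = value // slide + 1
    last = (value + range_) // slide
    return list(range(first, last + 1))

# With RANGE 300 and SLIDE 60, a tuple with ts = 130 falls into five windows.
ids = wids(130, 300, 60)
```

When RANGE is a multiple of SLIDE, every tuple belongs to exactly RANGE / SLIDE windows under this convention, so the set of window-ids can be computed from the tuple alone without looking at any other tuple.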

  8. Window specification • Time-based query: • Count the number of vehicles in each segment over the past hour, updating the result every 20 minutes. • SELECT seg-id, count(*) FROM Traffic [RANGE 60 minutes SLIDE 20 minutes WATTR ts] GROUP BY seg-id

  9. Window specification • Tuple-based query: • Count the number of vehicles in each segment over the past 100 rows, updating the result every 10 rows. • SELECT seg-id, count(*) FROM Traffic [RANGE 100 rows SLIDE 10 rows WATTR row-num] GROUP BY seg-id
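
One plain reading of the tuple-based query is a sliding buffer over the most recent rows. The Python sketch below follows that reading; it is not the WID technique from the paper, and the function name `count_per_segment` and the sample stream are assumptions for illustration. Row numbers are assumed to start at 1 and follow arrival order.

```python
from collections import Counter, deque

def count_per_segment(seg_ids, range_rows=100, slide_rows=10):
    """Yield (row_num, Counter) every slide_rows rows over the last range_rows rows."""
    window = deque(maxlen=range_rows)  # oldest rows fall out automatically
    for row_num, seg_id in enumerate(seg_ids, start=1):
        window.append(seg_id)
        if row_num % slide_rows == 0:
            # count(*) ... GROUP BY seg-id over the buffered rows
            yield row_num, Counter(window)

# Hypothetical stream: 120 rows alternating between segments 1 and 2.
results = list(count_per_segment([1, 2] * 60))
```

Until 100 rows have arrived, the early results cover fewer rows than the full range.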

  10. Window specification • Can we specify RANGE and SLIDE on different attributes? • Yes: SELECT seg-id, count(*) FROM Traffic [RANGE 300 seconds SLIDE 10 rows RATTR ts SATTR row-num] GROUP BY seg-id
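
A query with RANGE on ts and SLIDE on row-num can be read as "every 10 rows, report on the last 300 seconds". The Python sketch below implements that reading under loudly stated assumptions: tuples arrive in ts order, each window covers (ts_now - 300, ts_now] where ts_now is the timestamp of the row that triggers the output, and tuples are simplified to (seg-id, ts) pairs. The paper's formal definition may differ, and the function name `mixed_window_counts` is hypothetical.

```python
from collections import Counter, deque

def mixed_window_counts(tuples, range_seconds=300, slide_rows=10):
    """Yield (row_num, Counter of seg-ids) every slide_rows rows.

    Assumptions: tuples arrive in ts order, and each window covers
    (ts_now - range_seconds, ts_now], with ts_now the triggering row's ts.
    """
    window = deque()
    for row_num, (seg_id, ts) in enumerate(tuples, start=1):
        window.append((seg_id, ts))
        # Evict tuples that have fallen out of the time range (RATTR ts).
        while window[0][1] <= ts - range_seconds:
            window.popleft()
        # Emit a result only on row-count boundaries (SATTR row-num).
        if row_num % slide_rows == 0:
            yield row_num, Counter(seg for seg, _ in window)

# Hypothetical stream: segment 1 reports once per minute for 20 minutes.
stream = [(1, 60 * i) for i in range(1, 21)]
results = list(mixed_window_counts(stream))
```

With out-of-order arrival the eviction step above would no longer be safe, which is one reason explicit window semantics matter.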

  11. WID Approach • Explained by Venky.
