Semantics and Evaluation Techniques for Window Aggregates in Data Streams

Authors: Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, Peter Tucker. Presented by: Venkatesh Raghvan, Charudatta Wad. CS 525 class discussion.

Overview • Background • Problem Statement • Window semantics • WID approach • Discussion

Background • Disorder handling: punctuations. • Aggregate queries: • In SQL? • In CQL (without WIDs)? • In sliding windows, what triggers an output?

Problem Statement • Lack of explicit window semantics. • Implementation efficiency. • Out-of-order arrival of data.

Running Example • Consider the example from the paper: • Schema: <seg-id, speed, ts> • Query: SELECT seg-id, max(speed), min(speed) FROM Traffic [RANGE 300 seconds SLIDE 60 seconds WATTR ts] GROUP BY seg-id
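To make the running query concrete, here is a minimal brute-force sketch in Python. It assumes that the window ending at time `window_end` covers the half-open interval `[window_end - RANGE, window_end)`. That convention, the function name `window_results`, and the sample tuples are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict

RANGE = 300  # seconds, from the query
SLIDE = 60   # seconds, from the query

def window_results(tuples, window_end):
    """Return {seg_id: (max_speed, min_speed)} for one window.

    Assumed convention: the window covers ts in
    [window_end - RANGE, window_end).
    """
    by_seg = defaultdict(list)
    for seg_id, speed, ts in tuples:
        if window_end - RANGE <= ts < window_end:
            by_seg[seg_id].append(speed)
    return {seg: (max(v), min(v)) for seg, v in by_seg.items()}

# Hypothetical sample data: (seg-id, speed, ts in seconds)
traffic = [
    (1, 55, 10), (1, 62, 70), (2, 40, 95),
    (1, 48, 250), (2, 45, 310), (1, 70, 330),
]

# One result set per SLIDE boundary.
for end in range(SLIDE, 421, SLIDE):
    print(end, window_results(traffic, end))
```

Each window is recomputed from scratch here, which is simple but repeats work across overlapping windows; that cost is the "implementation efficiency" concern raised in the problem statement.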

Running Example - This picture is taken from the paper itself.

Big Picture • Mapping of tuples to window extents and vice versa. • New Window semantics. • Window specifications: RANGE, SLIDE and WATTR.
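The two-way mapping between tuples and window extents can be sketched as a pair of functions. This is a sketch under an assumed convention: window-ids are 0, 1, 2, ..., and window `w` covers WATTR values in `[(w + 1) * SLIDE - RANGE, (w + 1) * SLIDE)`. The function names `extent` and `wids` are chosen for illustration.

```python
import math

def extent(w, range_, slide):
    """Half-open WATTR interval [lo, hi) covered by window-id w (assumed convention)."""
    hi = (w + 1) * slide
    return hi - range_, hi

def wids(value, range_, slide):
    """Window-ids whose extent contains a tuple with this WATTR value.

    Assumes value >= 0, so the resulting window-ids are non-negative.
    """
    first = math.floor(value / slide)                 # smallest w with value < (w + 1) * slide
    last = math.floor((value + range_) / slide) - 1   # largest w with (w + 1) * slide - range_ <= value
    return list(range(first, last + 1))

# A tuple with ts = 70 under RANGE 300, SLIDE 60 falls into RANGE / SLIDE = 5 windows.
ids = wids(70, 300, 60)
print(ids)
print([extent(w, 300, 60) for w in ids])
```

The two functions are inverses in the sense that a value belongs to `extent(w, ...)` exactly when `w` appears in `wids(value, ...)`.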

Window specification • Time-based query: • Count the number of vehicles in each segment over the past hour, updating the result every 20 minutes. SELECT seg-id, count(*) FROM Traffic [RANGE 60 minutes SLIDE 20 minutes WATTR ts] GROUP BY seg-id
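One way to evaluate this count query in a single pass is to tag each tuple with the ids of the windows it belongs to and keep one counter per (window-id, seg-id) pair. The sketch below assumes timestamps in whole minutes, window `w` covering `[(w + 1) * SLIDE - RANGE, (w + 1) * SLIDE)`, and hypothetical sample data; it is an illustration in the spirit of window-ids, not the presenters' implementation.

```python
from collections import Counter

RANGE, SLIDE = 60, 20  # minutes, from the query

def count_by_window(tuples):
    """Count tuples per (window_id, seg_id) in one pass over the stream."""
    counts = Counter()
    for seg_id, ts in tuples:
        first = ts // SLIDE                  # first window whose extent contains ts
        last = (ts + RANGE) // SLIDE - 1     # last window whose extent contains ts
        for w in range(first, last + 1):
            counts[(w, seg_id)] += 1
    return counts

# Hypothetical sample data: (seg-id, ts in minutes)
traffic = [("A", 5), ("A", 25), ("B", 30), ("A", 65)]
counts = count_by_window(traffic)
for key in sorted(counts):
    print(key, counts[key])
```

Because RANGE / SLIDE = 3, every tuple updates exactly three counters, and no tuple needs to be buffered once it has been counted.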

Window specification • Tuple-based query: • Count the number of vehicles in each segment over the last 100 rows, updating the result every 10 rows. SELECT seg-id, count(*) FROM Traffic [RANGE 100 rows SLIDE 10 rows WATTR row-num] GROUP BY seg-id
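A tuple-based window can be treated like a time-based one once each tuple carries a row number. The sketch below assumes that `row-num` is assigned in arrival order starting from 0 across the whole stream (not per segment), and that window `w` covers row numbers in `[(w + 1) * SLIDE - RANGE, (w + 1) * SLIDE)`. Both choices, and the synthetic arrival sequence, are assumptions for illustration.

```python
from collections import Counter

RANGE, SLIDE = 100, 10  # rows, from the query

def tuple_window_counts(seg_ids, w):
    """Count tuples per seg-id among the rows that fall in window w."""
    hi = (w + 1) * SLIDE
    lo = max(hi - RANGE, 0)          # early windows hold fewer than RANGE rows
    numbered = enumerate(seg_ids)    # row-num assigned in arrival order, from 0
    return Counter(seg for row_num, seg in numbered if lo <= row_num < hi)

# Synthetic stream: 120 arrivals cycling over three segments.
arrivals = [i % 3 for i in range(120)]

print(tuple_window_counts(arrivals, 0))   # rows 0..9
print(tuple_window_counts(arrivals, 9))   # rows 0..99
print(tuple_window_counts(arrivals, 11))  # rows 20..119
```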

Window specification • Can RANGE and SLIDE be specified on different attributes? • Yes: SELECT seg-id, count(*) FROM Traffic [RANGE 300 seconds SLIDE 10 rows RATTR ts SATTR row-num] GROUP BY seg-id
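The mixed specification can be read as "every 10 rows, report on the last 300 seconds". The sketch below follows that reading: a window closes at every 10th row, and its extent holds the tuples seen so far whose `ts` lies within 300 seconds up to and including the closing row's `ts`. This interpretation, the 1-based row numbering, and the sample data are assumptions; the paper should be consulted for the exact definition.

```python
RANGE_SECONDS = 300  # RANGE on RATTR ts
SLIDE_ROWS = 10      # SLIDE on SATTR row-num

def mixed_window_extents(tuples):
    """Return a list of (closing_row_num, extent) pairs.

    tuples: list of (seg_id, ts) in arrival order with non-decreasing ts.
    A window closes whenever SLIDE_ROWS more rows have arrived (assumed reading).
    """
    results = []
    for row_num, (seg_id, ts) in enumerate(tuples, start=1):
        if row_num % SLIDE_ROWS == 0:
            extent = [t for t in tuples[:row_num]
                      if ts - RANGE_SECONDS < t[1] <= ts]
            results.append((row_num, extent))
    return results

# Hypothetical stream: one vehicle every 40 seconds on segment "A".
sample = [("A", 40 * i) for i in range(1, 21)]
results = mixed_window_extents(sample)
for row_num, extent in results:
    print(row_num, len(extent))
```

With arrivals 40 seconds apart, each 300-second extent holds 8 tuples even though a result is produced only every 10 rows, which shows that the two attributes vary independently.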

WID Approach • Explained by Venky.
