This presentation explores the PSoup model for processing continuous queries over streaming data, aiming to address limitations in existing systems like Aurora and STREAM. By treating queries and data as duals, PSoup allows for new queries to be applied to previously received data and vice versa. The model supports landmark, sliding window, and hybrid continuous queries, facilitating efficient multiquery processing and enabling operations with intermittent connectivity, particularly useful for mobile applications. Key insights include separating computation from delivery and optimizing results retrieval.
PSoup Kevin Menard CS 561 4/11/2005
Slides are modified versions of the following original presentation: "Streaming Queries over Streaming Data," Sirish Chandrasekaran with Michael J. Franklin, UC Berkeley, VLDB 2002, August 20, 2002
PSoup Insight #1
• Queries and data are duals
  • Store new queries, apply them to data that arrived earlier
  • Store new data, apply it to queries that arrived earlier
• Multiquery processing = "join" of queries and data
• Supports all three types of queries: queries over the past, continuous queries (landmark and sliding window), and hybrid queries
[Diagram: a new Query or new Data item is joined against the indexed Data and Queries stores to produce a Result]
Motivation?
• Why another model for continuous queries?
• What is wrong with how Aurora and STREAM supply responses?
Motivation: Disconnected Operation
• Previous solutions stream out answers immediately
  • Not feasible or suitable for all applications
• Intermittent connectivity: e.g., applications on hand-held devices (as in this morning's keynote address)
• Even if connected, clients are not always interested in streaming answers
PSoup Insight #2
• Separate computation from delivery
  • Query answers are continuously generated in the background
  • Windows are applied on demand to transmit "current" results
• Efficient support for disconnected operation
• Low response time; shared computation and storage across invocations
[Diagram, with phases labeled Register and Invoke: the Queries store (ID, Predicate) and the Data store (ID, R.a, R.b) feed a Results Structure, a Data × Queries matrix of T/F entries]
PSoup Query Model
SELECT select_list
FROM from_list
WHERE where_clause
BEGIN begin_time
END end_time
• WHERE clause: conjunction of boolean factors
• BEGIN-END clause: system clock or sequence numbers
• (begin_time, end_time):
  • (constant, constant): snapshot query
  • (constant, variable): landmark window query
  • (variable, variable): sliding window query
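The three (begin_time, end_time) combinations can be distinguished mechanically. The following is a minimal sketch, assuming each bound is either an absolute constant or an offset from the current time; `Bound` and `classify_window` are hypothetical names used for illustration and are not part of PSoup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bound:
    """One end of a BEGIN-END clause (hypothetical representation)."""
    value: int
    relative: bool  # True: offset from "now" (variable); False: absolute constant

    def resolve(self, now: int) -> int:
        # A variable bound moves with the clock; a constant bound does not.
        return now + self.value if self.relative else self.value


def classify_window(begin: Bound, end: Bound) -> str:
    """Map a (begin_time, end_time) combination to the query type on the slide."""
    if not begin.relative and not end.relative:
        return "snapshot"
    if not begin.relative and end.relative:
        return "landmark"
    if begin.relative and end.relative:
        return "sliding"
    # (variable, constant) is not one of the three combinations listed.
    raise ValueError("unsupported window: variable begin with constant end")


# A sliding window over the last 10 time units, evaluated at time 100.
begin, end = Bound(-10, relative=True), Bound(0, relative=True)
window = (begin.resolve(100), end.resolve(100))
print(classify_window(begin, end), window)
```

Because a variable bound is resolved only when the query is invoked, the same registered query can yield a different window on each invocation.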
Query Registration
• SELECT select_list FROM from_list WHERE where_clause: the Standing Query Clause (SQC), sent to the Symmetric Join
• BEGIN begin_time END end_time: sent to the Windows_Table
• QueryID: handle for future query invocations
Selections over Single Stream: Arrival of New Query Specification

(a) Initial state

Query Store
ID  Predicate
20  0<R.a<=5
21  R.a>4 and R.b=3
22  0>R.b>4
23  R.a=4 and R.b=3

Data Store
ID  R.a  R.b
48  4    3
49  7    3
50  3    8
51  0    0
52  8    4

(b) Arrival of new query: Select * From R Where R.a<=4 and R.b>=3
(c) BUILD: the new query is inserted into the Query Store as ID 24 (R.a<=4 and R.b>=3)
(d) PROBE: the new query probes the Data Store; tuples 48 (4, 3) and 50 (3, 8) match
(e) Inserting results: the column for query 24 is filled in the Results Structure

Results Structure
Data ID  Query 24
48       T
49       F
50       T
51       F
52       F
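The build, probe, and insert steps for a newly arrived query can be sketched as follows. This is an illustrative sketch only: predicates are modeled as Python functions and the probe is a linear scan, whereas the slides indicate that PSoup uses indexes. The name `register_query` and the dictionary layout are assumptions.

```python
# Data Store from the slide: tuple ID -> (R.a, R.b)
data_store = {48: (4, 3), 49: (7, 3), 50: (3, 8), 51: (0, 0), 52: (8, 4)}

query_store = {}  # query ID -> predicate over (a, b)
results = {}      # (data ID, query ID) -> True/False


def register_query(query_id, predicate):
    """BUILD the query into the Query Store, then PROBE the Data Store with it."""
    query_store[query_id] = predicate             # BUILD
    for data_id, (a, b) in data_store.items():    # PROBE (linear scan, no index)
        results[(data_id, query_id)] = predicate(a, b)  # insert result


# New query 24: Select * From R Where R.a<=4 and R.b>=3
register_query(24, lambda a, b: a <= 4 and b >= 3)

column_24 = {d: results[(d, 24)] for d in sorted(data_store)}
print(column_24)
```

The resulting column reproduces the T, F, T, F, F entries shown for data tuples 48 through 52.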
Selections over Single Stream: Arrival of New Data

(a) Initial state

Query Store
ID  Predicate
20  0<R.a<=5
21  R.a>4 and R.b=3
22  0>R.b>4
23  R.a=4 and R.b=3
24  R.a<=4 and R.b>=3

Data Store
ID  R.a  R.b
48  4    3
49  7    3
50  3    8
51  0    0
52  8    4

(b) Arrival of new data: tuple 53 with R.a=3, R.b=6
(c) BUILD: tuple 53 is inserted into the Data Store
(d) PROBE: the new tuple probes the Query Store; queries 20 (0<R.a<=5) and 24 (R.a<=4 and R.b>=3) match
(e) Inserting results: the row for tuple 53 is filled in the Results Structure

Results Structure
Data ID  Q20  Q21  Q22  Q23  Q24
53       T    F    F    F    T
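The symmetric case, where a new data tuple probes the stored queries, can be sketched the same way. As before, this is a simplified illustration with a linear scan in place of a query index; `insert_data` is a hypothetical name. Query 22 is transcribed exactly as it appears on the slide.

```python
# Query Store from the slide: query ID -> predicate over (R.a, R.b).
# Query 22 is copied as written (0>R.b>4); no value can satisfy it.
query_store = {
    20: lambda a, b: 0 < a <= 5,
    21: lambda a, b: a > 4 and b == 3,
    22: lambda a, b: 0 > b > 4,
    23: lambda a, b: a == 4 and b == 3,
    24: lambda a, b: a <= 4 and b >= 3,
}
data_store = {48: (4, 3), 49: (7, 3), 50: (3, 8), 51: (0, 0), 52: (8, 4)}
results = {}  # (data ID, query ID) -> True/False


def insert_data(data_id, a, b):
    """BUILD the tuple into the Data Store, then PROBE the Query Store with it."""
    data_store[data_id] = (a, b)                      # BUILD
    matches = []
    for query_id, predicate in query_store.items():   # PROBE (linear scan)
        hit = predicate(a, b)
        results[(data_id, query_id)] = hit            # insert result
        if hit:
            matches.append(query_id)
    return matches


# New data tuple 53 with R.a=3, R.b=6
matches = insert_data(53, 3, 6)
row_53 = [results[(53, q)] for q in sorted(query_store)]
print(matches, row_53)
```

The same results table is written in both directions, which is what lets a query see data that arrived before it and data see queries that arrived before it.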
Query Invocation
• System returns the results corresponding to the current value of the BEGIN-END clause (BEGIN begin_time END end_time)
[Diagram: the Results Structure with data rows 48 to 53 and query columns 20 to 24. Entries shown: the query 24 column reads T, F, T, F, F for rows 48 to 52, and row 53 reads T, F, F, F, T. A brace marks the data rows in the Current Window]
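Invocation then reduces to filtering already computed results by the current window. The sketch below makes two assumptions that the slides do not state: data IDs double as sequence numbers for the BEGIN-END clause, and both window bounds are inclusive. `invoke` is a hypothetical name.

```python
# Results for query 24 as shown on the slides: data ID -> T/F.
# Assumption: data IDs double as sequence numbers for the BEGIN-END clause.
results_q24 = {48: True, 49: False, 50: True, 51: False, 52: False, 53: True}


def invoke(results_column, begin, end):
    """Return IDs of matching tuples whose sequence number lies in [begin, end]."""
    return [d for d in sorted(results_column)
            if begin <= d <= end and results_column[d]]


# A sliding window covering the last four tuples, invoked after tuple 53 arrives.
latest = max(results_q24)
current = invoke(results_q24, latest - 3, latest)
print(current)
```

No predicate is evaluated at invocation time here; the work was done in the background when each query or tuple arrived.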
Joins over R and S: Arrival of New Query Specification

(a) Initial state

S-Data Store
ID  S.a  S.b
21  2    2
25  3    3
36  4    4
49  5    5

R-Data Store
ID  R.a  R.b
10  2    5
14  3    3
31  4    1
48  9    7

Query Store
ID  Predicate
20  R.a=5 and R.b<S.b
21  R.a>4 and R.b<S.b and S.a<10
22  R.b=4 and R.a+5>S.a and S.b>2

(b) Arrival of new query: 23  R.a<5 and R.a>S.a and S.b>1
(c) BUILD: query 23 is inserted into the Query Store
(d) PROBE the R-Data Store: R tuples 10, 14, and 31 match
(e) Constructing hybrid structs: each matching R tuple is combined with query 23, leaving a predicate over S only

Hybrid Structs
R.ID  Q.ID  Q.Predicate
10    23    2>S.a and S.b>1
14    23    3>S.a and S.b>1
31    23    4>S.a and S.b>1

(f) PROBE the S-Data Store with the hybrid structs

Results (R, S, Q)
14, 21, 23
31, 21, 23
31, 25, 23
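The hybrid-struct steps for a new join query can be sketched as below. This is an illustration under simplifying assumptions: query 23 is split by hand into its R-only factor and the remaining factors, a hybrid struct is modeled as a tuple holding a closure over the bound R values, and both probes are linear scans. The names `r_only` and `residual_for` are hypothetical.

```python
r_store = {10: (2, 5), 14: (3, 3), 31: (4, 1), 48: (9, 7)}  # ID -> (R.a, R.b)
s_store = {21: (2, 2), 25: (3, 3), 36: (4, 4), 49: (5, 5)}  # ID -> (S.a, S.b)

QUERY_ID = 23  # R.a<5 and R.a>S.a and S.b>1


def r_only(ra, rb):
    """The factor of query 23 that mentions only R."""
    return ra < 5


def residual_for(ra, rb):
    """Bind the R values into the remaining factors, leaving a predicate over S."""
    return lambda sa, sb: ra > sa and sb > 1


# PROBE the R-Data Store and construct one hybrid struct per matching R tuple.
hybrids = [(r_id, QUERY_ID, residual_for(ra, rb))
           for r_id, (ra, rb) in r_store.items() if r_only(ra, rb)]

# PROBE the S-Data Store with each hybrid struct.
results = [(r_id, s_id, q_id)
           for r_id, q_id, residual in hybrids
           for s_id, (sa, sb) in s_store.items() if residual(sa, sb)]
print(results)
```

R tuple 10 produces a hybrid struct but no result, since no S tuple has S.a below 2.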
Joins over R and S: Arrival of New Data

(a) Initial state

S-Data Store
ID  S.a  S.b
48  4    4
49  5    3
52  3    2

R-Data Store
ID  R.a  R.b
47  4    3
50  5    3
51  3    8

Query Store
ID  Predicate
20  R.a=5 and R.b<S.b
21  R.a>4 and R.b<S.b and S.a<10
22  R.b=4 and R.a+5>S.a and S.b>2
23  R.a<4 and R.b<S.b

(b) Arrival of new data: R tuple 53 with R.a=5, R.b=4
(c) BUILD: tuple 53 is inserted into the R-Data Store
(d) PROBE the Query Store: queries 20, 21, and 22 match
(e) Constructing hybrid structs: the new R tuple is combined with each matching query, leaving a predicate over S only

Hybrid Structs
R.ID  Q.ID  Q.Predicate
53    20    4<S.b
53    21    4<S.b and S.a<10
53    22    10>S.a and S.b>2

(f) PROBE the S-Data Store with the hybrid structs

Results (R, S, Q)
53, 48, 22
53, 49, 22
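When a data tuple arrives instead, the roles reverse: the tuple probes the Query Store, and each partially matching query yields a hybrid struct. A minimal sketch, assuming each query has been split by hand into an R-only part and a part that also needs S; `insert_r` and the tuple layout are hypothetical.

```python
r_store = {47: (4, 3), 50: (5, 3), 51: (3, 8)}  # ID -> (R.a, R.b)
s_store = {48: (4, 4), 49: (5, 3), 52: (3, 2)}  # ID -> (S.a, S.b)

# Each query is split into (R-only part, part that also needs S).
query_store = {
    20: (lambda ra, rb: ra == 5, lambda ra, rb, sa, sb: rb < sb),
    21: (lambda ra, rb: ra > 4, lambda ra, rb, sa, sb: rb < sb and sa < 10),
    22: (lambda ra, rb: rb == 4, lambda ra, rb, sa, sb: ra + 5 > sa and sb > 2),
    23: (lambda ra, rb: ra < 4, lambda ra, rb, sa, sb: rb < sb),
}


def insert_r(r_id, ra, rb):
    """BUILD into the R-Data Store, PROBE queries, then PROBE S with the hybrids."""
    r_store[r_id] = (ra, rb)                                   # BUILD
    hybrids = [(r_id, q_id, joint)                             # PROBE Query Store
               for q_id, (r_only, joint) in query_store.items()
               if r_only(ra, rb)]
    results = [(r_id, s_id, q_id)                              # PROBE S-Data Store
               for _, q_id, joint in hybrids
               for s_id, (sa, sb) in s_store.items()
               if joint(ra, rb, sa, sb)]
    return [q_id for _, q_id, _ in hybrids], results


hybrid_queries, results = insert_r(53, 5, 4)
print(hybrid_queries, results)
```

Queries 20 and 21 produce hybrid structs but no results, because no stored S tuple has S.b greater than 4.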
Other Queries
• N-way joins
  • Similar to 2-way joins
  • Probe, generate hybrid structs, repeat
  • Can be executed without intermediate tables
• Aggregations
  • Performed at query invocation
  • Use an n-ary ranked tree, clustered on time
Telegraph Background: CACQ
• CACQ [MSHR02]
  • Shared execution of multiple queries with one Eddy
  • Tuple lineage
  • Query indices
• Queries and data treated very differently
• Only landmark continuous queries
• No support for disconnected operation
PSoup in Telegraph
• Leverage SteMs to store and index queries
• Changes to Eddies
• Encode queries as tuples
  • Break the WHERE clause into individual boolean factors (BFs)
  • Encode each BF as R.a relop [R.b|S.b] [+|-] constant
• Stream prefix consistency
  • A new query or data tuple is completely processed before any other tuple: no holes in the Results Structure
• Results Structure: buffers the results
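The boolean-factor encoding can be illustrated with a small data type. This is a sketch of the idea only, not Telegraph's actual tuple layout: the `BooleanFactor` class, its field names, and the `where_holds` helper are assumptions, and the signed `constant` field stands in for the [+|-] constant part of the encoding.

```python
import operator
from dataclasses import dataclass
from typing import Optional

RELOPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
          ">=": operator.ge, ">": operator.gt}


@dataclass(frozen=True)
class BooleanFactor:
    """One factor of the form: lhs relop [rhs_attr] [+|-] constant."""
    lhs: str                 # attribute name, e.g. "R.a"
    relop: str               # a key of RELOPS
    rhs_attr: Optional[str]  # e.g. "S.b", or None for a constant-only factor
    constant: int = 0        # signed, so it covers both + and -

    def holds(self, row: dict) -> bool:
        rhs = self.constant + (row[self.rhs_attr] if self.rhs_attr else 0)
        return RELOPS[self.relop](row[self.lhs], rhs)


def where_holds(factors, row):
    """A WHERE clause is a conjunction of its boolean factors."""
    return all(f.holds(row) for f in factors)


# Query 22 from the join slides: R.b=4 and R.a+5>S.a and S.b>2.
# R.a+5>S.a is rewritten as R.a > S.a - 5 to fit the encoding.
query_22 = [
    BooleanFactor("R.b", "=", None, 4),
    BooleanFactor("R.a", ">", "S.a", -5),
    BooleanFactor("S.b", ">", None, 2),
]
row = {"R.a": 5, "R.b": 4, "S.a": 4, "S.b": 4}
print(where_holds(query_22, row))
```

Storing factors as plain records rather than code is what allows queries to be kept and indexed like data.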
Experiments and Results
• Alternatives
  • NoMat: no background processing
  • PSoup-Partial: background processing; apply current window on invocation
  • PSoup-Complete: current windows are also continuously applied in the background
• Experimental parameters
  • Unloaded server with two 666 MHz Intel Pentium III processors and 768 MB RAM
  • Data arrives as fast as possible, in domain [0,255]
  • Queries of the form R.a relop C, where C is in [0,255]
  • Join queries of the form R.a relop S.b +/- C
Experiments: Response Time vs. Window Size
• Interval predicates, selection queries
[Figure: response time vs. window size]

Experiments: Response Time vs. Window Size
• Equality predicates, selection queries
[Figure: response time vs. window size]

Experiments: Max Data Arrival Rate vs. #SQCs
• Window size = 1000 tuples
[Figure: maximum data arrival rate vs. number of SQCs]
PSoup in a Traditional Query Processor
• PSoup = SQL query over data and client query streams?
• Joins = expression evaluators
• Notes
  • Conventional QPs do not have tuple lineage
  • Conventional QPs always use intermediate tables
Conclusions
• Treating queries and data the same
  • Combines approaches for previously studied queries
    • Queries over the past and continuous queries
  • Allows new functionality: hybrid queries
• Separating result generation and delivery
  • Makes disconnected operation feasible
  • Efficient support for repeated query invocations