This document explores the transition from traditional database systems to Continuous Query Processing, focusing on the CAPE (Constraint-exploiting Adaptive Processing Engine) project. Highlighting emerging applications like traffic management and network monitoring, it identifies the need for effective online processing of data streams. The paper emphasizes the development of a high-level query language akin to SQL for continuous data streams, addressing CAPE's current limitations. Methodologies for implementing and optimizing this language are discussed, augmenting CAPE's capabilities for real-time query execution.
Continuous Query Language: From CQL to CAPE Algebra Plans Lee Chu Che Wai Kwan MQP 2004/2005
Continuous Query Processing • Emerging Applications: • Traffic management • Network monitoring • Require: • Online processing of data streams • But: • Traditional databases handle persistent data
Database Systems vs. Data Stream Systems • Database System: • One-time queries • Random access • Data Stream System: • Continuous queries • Sequential access
CAPE: Constraint-exploiting Adaptive Processing Engine • An on-going project at WPI
CAPE’s limitation • Desire: • High-level query language, such as SQL • Instead: • Enter queries as low-level execution plan • Problems: • Tedious to enter • Error prone
Algebra Plan vs. SQL
SQL: select S.A from R, S, Q where R.A = S.A
Algebra plan (CAPE XML):
<queryplan>
  <operator root = "true" id = "1" className = " …">
    <classVariables>
      <variable name="group_pos" value="0"/>
      <variable name="function" value="null"/>
      <variable name="function_pos" value="0"/>
      <variable name="function" value="count"/>
      <variable name="function_pos" value="0"/>
      <variable name="propagate" value="false"/>
      <variable name="debug" value="true"/>
    </classVariables>
    <properties> </properties>
    <parents> </parents>
    <children> <child id = "2"/> </children>
    <streams> </streams>
  </operator>
  <operator root…..> . . . </operator>
  . . .
</queryplan>
[Diagram: operator ID = 1 (Group By) above operator ID = 2]
Objective • Define and implement a high-level query language for CAPE
Methodology • Study existing Continuous Processing Language proposals • Identify one, adopt and adapt if appropriate • Implement it for CAPE
Requirements on Language • SQL-like • Data streams • Windows on streams
Continuous Processing Languages • UDA – UCLA • TelegraphCQ – Berkeley • STREAM-CQL – Stanford
STREAM-CQL • Well-defined semantics • Open source available • Query example: query : rstream (select S.A from R, Q, S[range 1 minute] where R.A = S.A);
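The window clause `S[range 1 minute]` in the query example limits S, at any instant, to the tuples that arrived within the last minute. A minimal Python sketch of that idea follows. It assumes timestamps in seconds, tuples arriving in timestamp order, and an inclusive lower bound; the class and method names are illustrative and are not part of STREAM or CAPE.

```python
from collections import deque


class RangeWindow:
    """Time-based sliding window (illustrative, not STREAM or CAPE code)."""

    def __init__(self, range_seconds):
        self.range_seconds = range_seconds
        self.buffer = deque()  # (timestamp, tuple) pairs, oldest first

    def insert(self, timestamp, tup):
        """Add a tuple and return the current window contents."""
        self.buffer.append((timestamp, tup))
        # Assumption: a tuple stays while its timestamp >= now - range.
        while self.buffer and self.buffer[0][0] < timestamp - self.range_seconds:
            self.buffer.popleft()
        return [t for _, t in self.buffer]
```

With `RangeWindow(60)`, a tuple inserted at time 0 is still visible when a tuple arrives at time 60 and is dropped when one arrives at time 61.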
Our Query Plan Generator: Big Picture • CQL → STREAM Parser → STREAM Plan Generator → CAPE Plan Rewriter → CAPE XML Plan Writer → CAPE Engine
Step 1: STREAM Parser • Input: CQL • Built with Yacc and Lex • Generates a parse tree for the STREAM Plan Generator
Step 2: STREAM Plan Generator • Input: parse tree • Plan transformations: • t_rstreamNow • t_removeIstream • t_streamCross • t_removeProject • t_makeCrossBinary • t_makeStreamCrossBinary • t_pushSelect • Output: modified plan for the CAPE Plan Rewriter
STREAM Plan Generator: Default Query Plan
query : rstream (select S.A from R, S [range 1 minute], Q where R.A = S.A);
RStream (ID = 7)
  Project [1, 0] (ID = 6)
    Select[0,0]==[1,0] (ID = 5)
      Cross (1, 3, 4) (ID = 0)
        Stream Source[0] (ID = 1)
        Range Window[60] (ID = 3)
          Stream Source[1] (ID = 2)
        Stream Source[2] (ID = 4)
STREAM Plan Generator: Cleaned Query Plan
query : rstream (select S.A from R, S [range 1 minute], Q where R.A = S.A);
RStream (ID = 7)
  Project [1, 0] (ID = 6)
    Cross (10, 4) (ID = 9)
      Select[0,0]==[1,1] (ID = 10)
        Cross (1, 3) (ID = 8)
          Stream Source[0] (ID = 1)
          Range Window[60] (ID = 3)
            Stream Source[1] (ID = 2)
      Stream Source[2] (ID = 4)
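The step from the default plan to the cleaned plan can be sketched in Python as two tree rewrites: split the three-way Cross into binary Crosses, then push the Select down to the lowest Cross that still covers both inputs it compares. This is a rough illustration inferred from the two plans, not the STREAM implementation of t_makeCrossBinary or t_pushSelect. The `Op` class, the `inputs` field, and all function names are hypothetical, and attribute renumbering is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str
    children: list = field(default_factory=list)
    inputs: tuple = ()  # for Select: positions of the Cross inputs it reads


def width(op):
    """Number of stream inputs an operator covers."""
    if op.name == "Cross":
        return sum(width(c) for c in op.children)
    return width(op.children[0]) if op.children else 1


def make_cross_binary(op):
    """Rewrite every n-ary Cross into a left-deep chain of binary Crosses."""
    kids = [make_cross_binary(c) for c in op.children]
    if op.name == "Cross" and len(kids) > 2:
        left = Op("Cross", kids[:2])
        for k in kids[2:]:
            left = Op("Cross", [left, k])
        return left
    return Op(op.name, kids, op.inputs)


def push_select(op):
    """Move a Select below a binary Cross if it reads only the left side."""
    if (op.name == "Select" and op.inputs and op.children
            and op.children[0].name == "Cross"
            and len(op.children[0].children) == 2):
        left, right = op.children[0].children
        if max(op.inputs) < width(left):
            pushed = push_select(Op("Select", [left], op.inputs))
            return Op("Cross", [pushed, push_select(right)])
    return Op(op.name, [push_select(c) for c in op.children], op.inputs)


def show(op):
    """Render a plan as a nested string for inspection."""
    if not op.children:
        return op.name
    return f"{op.name}({', '.join(show(c) for c in op.children)})"


# Default plan: Select over a three-way Cross of R, S[range 1 minute], Q.
r = Op("Source[0]")
s = Op("Window[60]", [Op("Source[1]")])
q = Op("Source[2]")
select = Op("Select", [Op("Cross", [r, s, q])], inputs=(0, 1))
default_plan = Op("RStream", [Op("Project", [select])])

cleaned = push_select(make_cross_binary(default_plan))
```

Calling `show(cleaned)` gives a plan with the same shape as the cleaned query plan: the Select sits over the Cross of R and the windowed S, and Q joins at the outer Cross.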
Step 3: CAPE Plan Rewriter • Input: cleaned tree • Rewrite rules: • ThetaJoin rule • WindowPushUp rule • Output: optimized tree
ThetaJoin Rule
Before: Cross (10, 4) (ID = 9) has children Select[0,0]==[1,1] (ID = 10) and Stream Source[2] (ID = 4); the Select sits over Cross (1, 3) (ID = 8)
After:
RStream (ID = 7)
  Project [1, 0] (ID = 6)
    Cross (11, 4) (ID = 9)
      ThetaJoin[0,0]==[1,1] (ID = 11)
        Stream Source[0] (ID = 1)
        Range Window[60] (ID = 3)
          Stream Source[1] (ID = 2)
      Stream Source[2] (ID = 4)
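A rewrite of this kind can be sketched in Python as a bottom-up tree transformation: wherever a Select sits directly over a binary Cross, replace the pair with one ThetaJoin that carries the Select's predicate. This is a sketch under the assumption that the rule matches exactly that pattern. The `Op` class and `theta_join_rule` are hypothetical names, not CAPE code.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str
    children: list = field(default_factory=list)
    predicate: str = ""


def theta_join_rule(op):
    """Replace Select-over-binary-Cross with a single ThetaJoin."""
    kids = [theta_join_rule(c) for c in op.children]
    if (op.name == "Select" and len(kids) == 1
            and kids[0].name == "Cross" and len(kids[0].children) == 2):
        # The join takes over the Cross inputs and the Select predicate.
        return Op("ThetaJoin", kids[0].children, op.predicate)
    return Op(op.name, kids, op.predicate)


# Cleaned plan from the STREAM Plan Generator (operator IDs omitted).
inner = Op("Cross", [
    Op("Stream Source[0]"),
    Op("Range Window[60]", [Op("Stream Source[1]")]),
])
select = Op("Select", [inner], "[0,0]==[1,1]")
outer = Op("Cross", [select, Op("Stream Source[2]")])
plan = Op("RStream", [Op("Project", [outer])])

optimized = theta_join_rule(plan)
```

The outer Cross is left alone because no Select sits directly above it.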
WindowPushUp Rule • [Diagram: successive plan states as Range Window[60] (ID = 3) is moved upward in the plan made of Stream Source[0] (ID = 1), Stream Source[1] (ID = 2), Stream Source[2] (ID = 4), ThetaJoin[0,0]==[1,1] (ID = 11), Cross (11, 4) (ID = 9), Project [1, 0] (ID = 6) and RStream (ID = 7)]
Step 4: CAPE XML Plan Writer • Input: optimized tree • Output: XML plan for the CAPE Engine • Plan skeleton: <queryplan> <operator root> <classVariables> </classVariables> <properties> </properties> <parents> </parents> <children> </children> <streams> </streams> </operator> </queryplan>
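The writer step can be sketched with Python's standard `xml.etree.ElementTree` module. Element and attribute names follow the plan shown earlier in these slides. The `<parent id="…"/>` entries, `root="false"` on non-root operators, and the `example.*` class names are assumptions made for illustration, since the slides do not show them.

```python
import xml.etree.ElementTree as ET


def write_plan(operators):
    """Serialize operators to a queryplan XML string.

    operators: list of dicts with keys id, class_name, variables (list of
    (name, value) pairs) and children (list of ids); first entry is the root.
    """
    parent_of = {c: op["id"] for op in operators for c in op["children"]}
    plan = ET.Element("queryplan")
    for index, op in enumerate(operators):
        attrs = {
            "root": "true" if index == 0 else "false",  # "false" is assumed
            "id": str(op["id"]),
            "className": op["class_name"],
        }
        el = ET.SubElement(plan, "operator", attrs)
        variables = ET.SubElement(el, "classVariables")
        for name, value in op["variables"]:
            ET.SubElement(variables, "variable", {"name": name, "value": value})
        ET.SubElement(el, "properties")
        parents = ET.SubElement(el, "parents")
        if op["id"] in parent_of:
            # Assumed format, mirroring <child id="..."/>.
            ET.SubElement(parents, "parent", {"id": str(parent_of[op["id"]])})
        children = ET.SubElement(el, "children")
        for child in op["children"]:
            ET.SubElement(children, "child", {"id": str(child)})
        ET.SubElement(el, "streams")
    return ET.tostring(plan, encoding="unicode")


# Hypothetical two-operator plan; the class names are placeholders.
plan_xml = write_plan([
    {"id": 1, "class_name": "example.GroupBy",
     "variables": [("function", "count"), ("debug", "true")],
     "children": [2]},
    {"id": 2, "class_name": "example.StreamSource",
     "variables": [], "children": []},
])
```

Variables are passed as a list of pairs and not as a dict because the sample plan repeats variable names such as `function`.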
Evaluation Methodology • Query test bed: • Test individual operators • Test complex query plans • Evaluation • Manual inspection of generated XML plan • Test XML file on CAPE
Evaluation of Individual Operators • Regular Project • Function Project • Select • Stream Source • Range Window • Partition • Distinct
CQL: rstream (select A from S where A = 5);
CQL: rstream (select A + B from S);
Conclusion • Identified a query language for CAPE • Designed a loosely coupled translation framework from CQL to CAPE: • Rewrite algebra tree • Generate CAPE XML plans • Evaluated the generated query plans
Future Work • Implement Relations • Which will maximize CAPE’s capability • Research on the window size • Support different time-range variations • Implement a Graphical User Interface • Drag-and-drop feature to input CQL
Acknowledgements • Prof. Rundensteiner • Yali Zhu • Luping Ding