Enhancing Continuous Query Language with CAPE Algebra for Real-Time Data Processing

Continuous Query Language: From CQL to CAPE Algebra Plans Lee Chu Che Wai Kwan MQP 2004/2005

Continuous Query Processing • Emerging Applications: • Traffic management • Network monitoring • Require: • Online processing of data streams • But: • Traditional databases handle persistent data

Database Systems VS Data Stream Systems • Database System: one-time queries, random access • Data Stream System: continuous queries, sequential access

CAPE: Constraint-exploiting Adaptive Processing Engine • An on-going project at WPI

CAPE’s limitation • Desire: • High-level query language, such as SQL • Instead: • Queries are entered as low-level execution plans • Problems: • Tedious to enter • Error-prone

Algebra Plan VS SQL

SQL:

    select S.A from R, S, Q where R.A = S.A

Algebra plan (XML):

    <queryplan>
      <operator root = "true" id = "1" className = " …">
        <classVariables>
          <variable name="group_pos" value="0"/>
          <variable name="function" value="null"/>
          <variable name="function_pos" value="0"/>
          <variable name="function" value="count"/>
          <variable name="function_pos" value="0"/>
          <variable name="propagate" value="false"/>
          <variable name="debug" value="true"/>
        </classVariables>
        <properties> </properties>
        <parents> </parents>
        <children>
          <child id = "2"/>
        </children>
        <streams> </streams>
      </operator>
      <operator root…..> . . . </operator>
      . . .
    </queryplan>

Diagram: Group By (ID = 1) with child operator ID = 2

Objective • Define and implement a high-level query language for CAPE

Methodology • Study existing Continuous Processing Language proposals • Identify one, adopt and adapt if appropriate • Implement it for CAPE

Requirements on Language • SQL-like • Data Streams • Windows on streams

Continuous Processing Languages • UDA – UCLA • TelegraphCQ – Berkeley • STREAM-CQL – Stanford

STREAM-CQL • Well-defined semantics • Open source available • Query example: query : rstream (select S.A from R, Q, S[range 1 minute] where R.A = S.A);
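The `[range 1 minute]` clause in the query example is a time-based sliding window. A minimal Python sketch of that idea follows. The class name `RangeWindow`, integer timestamps in seconds, and the inclusive boundary handling are assumptions for illustration, not code from STREAM or CAPE.

```python
from collections import deque
from typing import Deque, List, Tuple


class RangeWindow:
    """Hypothetical time-based window: keeps tuples from the last `range_seconds`."""

    def __init__(self, range_seconds: int) -> None:
        self.range_seconds = range_seconds
        # (timestamp, value) pairs in arrival order; timestamps must not decrease.
        self.buffer: Deque[Tuple[int, object]] = deque()

    def insert(self, timestamp: int, value: object) -> List[object]:
        """Add a tuple, expire old ones, and return the current window contents."""
        self.buffer.append((timestamp, value))
        # Assumption: a tuple exactly `range_seconds` old is still in the window.
        while self.buffer and self.buffer[0][0] < timestamp - self.range_seconds:
            self.buffer.popleft()
        return [v for _, v in self.buffer]
```

For example, with `RangeWindow(60)`, a tuple that arrived at second 0 is still visible when a tuple arrives at second 30, and is dropped once a tuple arrives at second 61.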

Our Query Plan Generator: Big Picture • CQL → STREAM Parser → STREAM Plan Generator → CAPE Plan Rewriter → CAPE XML Plan Writer → CAPE Engine

Step 1: STREAM Parser • Yacc and Lex • Input: CQL • Generates a parse tree for the STREAM Plan Generator

Step 2: STREAM Plan Generator • Input: parse tree • Transformations: t_rstreamNow, t_removeIstream, t_streamCross, t_removeProject, t_makeCrossBinary, t_makeStreamCrossBinary, t_pushSelect • Output: modified plan, passed to the CAPE Plan Rewriter

STREAM Plan Generator: Default Query Plan

query : rstream (select S.A from R, S [range 1 minute], Q where R.A = S.A);

    RStream (ID = 7)
      Project [1, 0] (ID = 6)
        Select[0,0]==[1,0] (ID = 5)
          Cross (1, 3, 4) (ID = 0)
            Stream Source[0] (ID = 1)
            Range Window[60] (ID = 3)
              Stream Source[1] (ID = 2)
            Stream Source[2] (ID = 4)
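The default plan on this slide is an operator tree, which could be held in memory roughly as sketched below. `PlanNode`, `render`, and `default_plan` are hypothetical names. The parent/child layout is read off the operator IDs on the slide (for instance, `Cross (1, 3, 4)` naming its three inputs) and is not taken from the STREAM or CAPE source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical plan operator: an ID, a label, and input operators."""
    op_id: int
    label: str
    children: List["PlanNode"] = field(default_factory=list)


def render(node: PlanNode, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, parents before children."""
    lines = ["  " * depth + f"{node.label} (ID = {node.op_id})"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def default_plan() -> PlanNode:
    """Build the default plan for the example query, as read from the slide."""
    r = PlanNode(1, "Stream Source[0]")
    s = PlanNode(3, "Range Window[60]", [PlanNode(2, "Stream Source[1]")])
    q = PlanNode(4, "Stream Source[2]")
    cross = PlanNode(0, "Cross (1, 3, 4)", [r, s, q])
    select = PlanNode(5, "Select[0,0]==[1,0]", [cross])
    project = PlanNode(6, "Project [1, 0]", [select])
    return PlanNode(7, "RStream", [project])
```

Calling `print("\n".join(render(default_plan())))` lists the eight operators with indentation showing which operator feeds which.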

STREAM Plan Generator: Cleaned Query Plan

query : rstream (select S.A from R, S [range 1 minute], Q where R.A = S.A);

    RStream (ID = 7)
      Project [1, 0] (ID = 6)
        Cross (10, 4) (ID = 9)
          Select[0,0]==[1,1] (ID = 10)
            Cross (1, 3) (ID = 8)
              Stream Source[0] (ID = 1)
              Range Window[60] (ID = 3)
                Stream Source[1] (ID = 2)
          Stream Source[2] (ID = 4)
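In the cleaned plan the three-way Cross has become a nesting of two-way Cross operators. The sketch below shows one way such a step could work. The name `make_cross_binary` only echoes the `t_makeCrossBinary` transformation listed in Step 2; the `Node` class and the left-deep strategy are assumptions, and pushing the Select down is not covered here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical plan operator with a label and input operators."""
    label: str
    children: List["Node"] = field(default_factory=list)


def make_cross_binary(node: Node) -> Node:
    """Rewrite every n-ary Cross (n > 2) into a left-deep chain of binary Crosses."""
    children = [make_cross_binary(c) for c in node.children]
    if node.label != "Cross" or len(children) <= 2:
        return Node(node.label, children)
    # Combine the first two inputs, then fold in the remaining ones one by one.
    left = Node("Cross", children[:2])
    for extra in children[2:]:
        left = Node("Cross", [left, extra])
    return left
```

Applied to `Cross(R, S, Q)`, this yields `Cross(Cross(R, S), Q)`, which has the same shape as the two Cross operators (ID = 8 and ID = 9) in the cleaned plan.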

Step 3: CAPE Plan Rewriter • Input: cleaned tree • Rules: • ThetaJoin rule • WindowPushUp rule • Output: optimized tree

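ThetaJoin Rule • Before: Select[0,0]==[1,1] (ID = 10) sits on top of Cross (1, 3) (ID = 8), under Cross (10, 4) (ID = 9) • After: ThetaJoin[0,0]==[1,1] (ID = 11) replaces the Select and Cross pair, and the parent becomes Cross (11, 4) (ID = 9)

    RStream (ID = 7)
      Project [1, 0] (ID = 6)
        Cross (11, 4) (ID = 9)
          ThetaJoin[0,0]==[1,1] (ID = 11)
            Stream Source[0] (ID = 1)
            Range Window[60] (ID = 3)
              Stream Source[1] (ID = 2)
          Stream Source[2] (ID = 4)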
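The ThetaJoin rule on this slide merges a Select that sits directly above a Cross into a single join operator. A minimal sketch of such a rewrite follows, assuming a hypothetical `Node` class with a `kind`, an optional `predicate`, and `children`. The actual CAPE rewriter may match and rebuild operators differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical plan operator."""
    kind: str
    predicate: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def apply_theta_join(node: Node) -> Node:
    """Replace each Select-over-Cross pair with one ThetaJoin carrying the predicate."""
    children = [apply_theta_join(c) for c in node.children]
    if node.kind == "Select" and len(children) == 1 and children[0].kind == "Cross":
        # The join takes over the Cross inputs and the Select predicate.
        return Node("ThetaJoin", node.predicate, children[0].children)
    return Node(node.kind, node.predicate, children)
```

Run on the cleaned plan, the Select (ID = 10) and Cross (ID = 8) collapse into one ThetaJoin, while the outer Cross with Stream Source[2] is left untouched.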

WindowPushUp Rule • [Animated diagram: successive frames show Range Window[60] (ID = 3) at different positions in the plan relative to ThetaJoin[0,0]==[1,1] (ID = 11), Cross (11, 4) (ID = 9), and Project [1, 0] (ID = 6), all under RStream (ID = 7); the leaves are Stream Source[0] (ID = 1), Stream Source[1] (ID = 2), and Stream Source[2] (ID = 4)]

Step 4: CAPE XML Plan Writer • Input: optimized tree • Output: XML plan for the CAPE Engine • Plan skeleton:

    <queryplan>
      <operator root>
        <classVariables> </classVariables>
        <properties> </properties>
        <parents> </parents>
        <children> </children>
        <streams> </streams>
      </operator>
    </queryplan>
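Writing a plan in this shape can be sketched with Python's standard `xml.etree.ElementTree`. The element and attribute names follow the XML shown on the slides. The helper names, the `"GroupBy"` class name used in the usage note, and the choice to emit `root="false"` for non-root operators are assumptions, since the full CAPE plan schema is not given here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def operator_element(op_id: int, class_name: str, variables: Dict[str, str],
                     child_ids: List[int], root: bool = False) -> ET.Element:
    """Build one <operator> element with the sub-elements shown on the slide."""
    op = ET.Element("operator", {
        "root": "true" if root else "false",  # assumption for non-root operators
        "id": str(op_id),
        "className": class_name,
    })
    class_vars = ET.SubElement(op, "classVariables")
    for name, value in variables.items():
        ET.SubElement(class_vars, "variable", {"name": name, "value": value})
    ET.SubElement(op, "properties")
    ET.SubElement(op, "parents")
    children = ET.SubElement(op, "children")
    for child_id in child_ids:
        ET.SubElement(children, "child", {"id": str(child_id)})
    ET.SubElement(op, "streams")
    return op


def write_plan(operators: List[ET.Element]) -> str:
    """Wrap the operators in <queryplan> and serialize to a string."""
    plan = ET.Element("queryplan")
    plan.extend(operators)
    return ET.tostring(plan, encoding="unicode")
```

For instance, `write_plan([operator_element(1, "GroupBy", {"debug": "true"}, [2], root=True)])` returns a `<queryplan>` string containing one root operator whose `<children>` element references operator 2.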

Evaluation Methodology • Query test bed: • Test individual operators • Test complex query plans • Evaluation • Manual inspection of generated XML plan • Test XML file on CAPE

Evaluation of Individual Operators • Regular Project • Function Project • Select • Stream Source • Range Window • Partition • Distinct

CQL: rstream (select A from S where A = 5);

CQL: rstream (select A + B from S);
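The two test queries above exercise the Select and Function Project operators. The sketch below shows only what each operator computes per tuple, assuming stream tuples are dictionaries with integer fields `A` and `B`. The function names are hypothetical, and the sketch does not model CQL's relation-to-stream semantics or the CAPE operator classes.

```python
from typing import Dict, Iterable, Iterator

Row = Dict[str, int]


def select_a_equals_5(stream: Iterable[Row]) -> Iterator[Row]:
    """Per-tuple view of: select A from S where A = 5."""
    for row in stream:
        if row["A"] == 5:
            yield {"A": row["A"]}


def project_a_plus_b(stream: Iterable[Row]) -> Iterator[Row]:
    """Per-tuple view of: select A + B from S."""
    for row in stream:
        yield {"A+B": row["A"] + row["B"]}
```

Both functions are generators, so they consume the input sequentially and emit results as tuples arrive rather than waiting for the whole input.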

Conclusion • Identified a query language for CAPE • Designed a loosely coupled translation framework from CQL to CAPE: • Rewrites the algebra tree • Generates CAPE XML plans • Evaluated the generated query plans

Future Work • Implement Relations • Which will maximize CAPE’s capability • Research on the window size • Support different time range variations • Implement a Graphical User Interface • Drag-and-drop feature to input CQL

Acknowledgements • Prof. Rundensteiner • Yali Zhu • Luping Ding

Questions or Comments?
