
Safety Guarantee of Continuous Join Queries over Punctuated Data Streams



Presentation Transcript


  1. Safety Guarantee of Continuous Join Queries over Punctuated Data Streams. Hua-Gang Li*, Songting Chen, Junichi Tatemura, Divyakant Agrawal, K. Selcuk Candan and Wang-Pin Hsiung. NEC Laboratories America; *University of California, Santa Barbara

  2. Stream Query Processing • Online transaction management • Network analysis • Sensor network monitoring • … [Figure labels: Continuous Queries, Stream Query Engine, Streaming Data] VLDB' 2006. Seoul, Korea

  3. Motivating Example • Window approach • However, window size may be hard to determine • Exploiting stream constraints • Uniqueness, sorted input, etc. • Punctuations

  4. Punctuation • Punctuation • A predicate that must evaluate to false for every element following the punctuation • Representation [Tucker et al. TKDE 2003] • A special tuple (*, c, *, *) • E.g., Item(sellerid, itemid, name, initialprice): a punctuation “no more items with itemid = 1” is denoted as (*, 1, *, *)
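The tuple representation above can be sketched in a few lines of Python. This is only an illustration: the `WILDCARD` constant and the `matches` helper are hypothetical names, not part of the presentation or of the cited work, and the sample item tuples are made up.

```python
from typing import Any, Tuple

WILDCARD = "*"  # assumed marker for "any value" in a punctuation instance


def matches(punctuation: Tuple[Any, ...], tup: Tuple[Any, ...]) -> bool:
    """Return True if `tup` is described by `punctuation`.

    A tuple that matches a punctuation already seen on the stream
    should never arrive afterwards.
    """
    if len(punctuation) != len(tup):
        raise ValueError("punctuation and tuple must have the same arity")
    return all(p == WILDCARD or p == v for p, v in zip(punctuation, tup))


# Item(sellerid, itemid, name, initialprice)
no_more_item_1 = (WILDCARD, 1, WILDCARD, WILDCARD)

print(matches(no_more_item_1, (7, 1, "lamp", 20)))   # True
print(matches(no_more_item_1, (7, 2, "chair", 35)))  # False
```

A stream engine could use such a check to decide which stored tuples a newly arrived punctuation refers to.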

  5. State of the Art • Semantic modeling of punctuations [Tucker et al. TKDE 2003] • Punctuation-aware query optimization • Binary join [Ding et al. EDBT 2004] • Group By [Li et al. SIGMOD 2005] • Generation of useful punctuations, i.e., heartbeats, from the time domain [Srivastava et al. PODS 2004] • However, one fundamental problem has not been addressed • Whether a query can benefit from the available punctuations, referred to as the “safety checking” problem

  6. Outline • Formulate safety checking problem for continuous join queries • Sound and complete safety condition for simple punctuations • Sound and complete safety condition for complex punctuations • Conclusion and future work

  7. Punctuation Scheme • Punctuation scheme • Describes the types of punctuation instances that a data stream can have at runtime • Can be viewed as metadata of punctuation instances • Representation • Simple punctuation schemes: e.g., Item(sellerid, itemid, name, initialprice), punctuation scheme (–, +, –, –), instance (*, 1, *, *) • Complex punctuation schemes: e.g., Bid(bidderid, itemid, increase), punctuation scheme (+, +, –), instance (1, 1, *) • Determined by application semantics
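The relationship between a scheme and its instances can be illustrated with a small check. This sketch assumes that an instance conforms to a scheme exactly when it carries a constant at every "+" position and a wildcard at every "–" position; the slides do not spell this rule out, so treat it as an interpretation. The function name `conforms` is hypothetical, and ASCII "-" stands in for the "–" used on the slide.

```python
from typing import Any, Tuple

WILDCARD = "*"  # assumed wildcard marker in punctuation instances


def conforms(scheme: Tuple[str, ...], instance: Tuple[Any, ...]) -> bool:
    """Check an instance against a scheme of '+' / '-' markers.

    Assumption: '+' positions must hold a constant, '-' positions
    must hold the wildcard.
    """
    if len(scheme) != len(instance):
        raise ValueError("scheme and instance must have the same arity")
    return all((s == "+") == (v != WILDCARD) for s, v in zip(scheme, instance))


# Item(sellerid, itemid, name, initialprice), simple scheme (-, +, -, -)
print(conforms(("-", "+", "-", "-"), ("*", 1, "*", "*")))  # True

# Bid(bidderid, itemid, increase), complex scheme (+, +, -)
print(conforms(("+", "+", "-"), (1, 1, "*")))    # True
print(conforms(("+", "+", "-"), (1, "*", "*")))  # False
```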

  8. Safety Checking Problem • Given a continuous join query (CJQ) Q and a set of punctuation schemes, • determine whether Q still requires unbounded memory no matter what punctuation instances (described by the punctuation schemes) may occur • For example: • Unsafe if we only have the following two punctuation schemes • Item(sellerid, itemid, name, initialprice) (–, +, –, –) • Bid(bidderid, itemid, increase) (+, +, –) • Safety vs. runtime memory consumption • An unsafe query always requires unbounded runtime memory • However, a safe query does not guarantee low runtime memory consumption

  9. Concepts • Join State • Refers to the space used for storing the inputs of each join operator • Purgeability • Purgeability of a join state: for every tuple t in the state, there exists a finite set of punctuation instances such that t will not produce any join results with any new tuples • Purgeability of a join operator • Safe Execution Plan • Every join operator involved is purgeable • Safe CJQ • There exists at least one safe execution plan

  10. Purging for Binary Join Operator • Purging S2 is similar • Hence we need punctuation schemes S1 (–, +), S2 (+, –)
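A minimal sketch of how a punctuation from one input can purge the join state of the other input follows. It assumes S1's second attribute is joined with S2's first attribute, which is consistent with the schemes S1 (–, +), S2 (+, –) on the slide; the `JoinState` class and the sample tuples are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class JoinState:
    """Stored tuples of one join input, indexed by join-attribute value."""

    def __init__(self, join_pos: int) -> None:
        self.join_pos = join_pos
        self.by_key: Dict[Any, List[Tuple[Any, ...]]] = defaultdict(list)

    def insert(self, tup: Tuple[Any, ...]) -> None:
        self.by_key[tup[self.join_pos]].append(tup)

    def purge(self, key: Any) -> int:
        """Drop all tuples with the given join value; return how many."""
        return len(self.by_key.pop(key, []))

    def size(self) -> int:
        return sum(len(tuples) for tuples in self.by_key.values())


# State for S1, joined on its second attribute.
s1_state = JoinState(join_pos=1)
for t in [("a1", 10), ("a2", 10), ("a3", 20)]:
    s1_state.insert(t)

# A punctuation (10, *) on S2 says no future S2 tuple has join value 10,
# so the stored S1 tuples with that value can no longer produce results.
s2_punctuation = (10, "*")
purged = s1_state.purge(s2_punctuation[0])
print(purged, s1_state.size())  # 2 1
```

Purging the S2 state works the same way, driven by punctuations on S1's join attribute.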

  11. A CJQ with no Safe Binary Join Plan • Punctuation schemes: S1 (A–, B+), S2 (B–, C+), S3 (C–, A+) • [Figure: the CJQ and an unsafe plan; the one join predicate that survives in the transcript is S1.A = S3.A]

  12. Purging for M-Way Join Operator

  13. Chained Purge Strategy • There is a punctuation propagation effect for the M-way join operator!

  14. Punctuation Graph (simple punctuation scheme) • Captures the punctuation propagation effect

  15. Purgeability of a Join Operator • THEOREM 1. The join state S is purgeable iff there exists a path from S to every other node Si in the punctuation graph • COROLLARY 1. A join operator is purgeable iff its punctuation graph is a strongly connected graph. [Figure: punctuation graph over nodes S, S1, S2, S3, …, S′]
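Theorem 1 and Corollary 1 reduce purgeability to graph reachability, which can be sketched as below. The transcript does not preserve how edges are drawn, so this sketch assumes an edge Si → Sj whenever Si joins Sj on an attribute that Sj's scheme punctuates; the function names are hypothetical. The first graph encodes the three-stream example with schemes S1 (A–, B+), S2 (B–, C+), S3 (C–, A+) under that assumption.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # every node must appear as a key


def reachable(graph: Graph, start: str) -> Set[str]:
    """Nodes reachable from `start` (including `start`), by iterative DFS."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def state_purgeable(graph: Graph, s: str) -> bool:
    """Theorem 1: S is purgeable iff S reaches every other node."""
    return reachable(graph, s) == set(graph)


def operator_purgeable(graph: Graph) -> bool:
    """Corollary 1: purgeable iff the graph is strongly connected."""
    return all(state_purgeable(graph, s) for s in graph)


# Assumed edges: S1 -> S3 (on A), S3 -> S2 (on C), S2 -> S1 (on B).
cyclic = {"S1": ["S3"], "S3": ["S2"], "S2": ["S1"]}
print(operator_purgeable(cyclic))  # True

# Same query if S2 offered no punctuation on C: the edge S3 -> S2 disappears.
broken = {"S1": ["S3"], "S3": [], "S2": ["S1"]}
print(state_purgeable(broken, "S1"))  # False
print(operator_purgeable(broken))     # False
```

This version runs one traversal per node for clarity; a single-pass strongly-connected-components algorithm such as Tarjan's would be the usual choice for a linear-time check.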

  16. Safety for CJQ • A safe CJQ requires at least one safe execution plan • However, the number of execution plans is exponential • THEOREM 2. A CJQ is safe iff its M-join plan is safe → If the M-join plan is unsafe, no other safe plan exists → Linear safety checking for simple punctuation schemes

  17. Handling Complex Punctuation Schemes • S3 (A, C) • A punctuation on S3 with scheme (+, +) cannot purge either S1 or S2 alone, but can purge the join result S1 ⋈ S2

  18. Generalized Punctuation Graph • [Figure legend: intermediate result; purge of intermediate result; purge of raw data stream]

  19. CJQ Safety under Complex Punctuation Schemes • Intuition: intermediate results have to be purgeable as well • Transformed Punctuation Graph • 1. Identify each strongly connected subgraph and merge it into a single merged node • 2. Take the generalized punctuation edges of the merged node into account and repeat Step 1 • THEOREM 3. A CJQ is safe iff the transformed punctuation graph ends up as a single merged node • Polynomial safety checking for complex punctuation schemes
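The two-step merging procedure can be sketched as a fixed-point loop. The transcript does not define generalized punctuation edges precisely, so this sketch makes a loud assumption: a generalized edge is a pair (set of source streams, target stream) that only becomes usable once all of its sources sit inside the same merged node. The function names and the example schemes are hypothetical, and the presentation's actual construction may differ.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Node = FrozenSet[str]  # a merged node is the set of streams it contains


def _reach(adj: Dict[Node, Set[Node]], start: Node) -> Set[Node]:
    """Merged nodes reachable from `start`, including `start`."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_safe(
    streams: Iterable[str],
    simple_edges: List[Tuple[str, str]],
    generalized_edges: List[Tuple[FrozenSet[str], str]],
) -> bool:
    group: Dict[str, Node] = {s: frozenset([s]) for s in streams}
    while True:
        groups = set(group.values())
        adj: Dict[Node, Set[Node]] = {g: set() for g in groups}
        for u, v in simple_edges:
            if group[u] != group[v]:
                adj[group[u]].add(group[v])
        # Assumption: a generalized edge is active only when all of its
        # sources have already been merged into one node.
        for sources, target in generalized_edges:
            owners = {group[s] for s in sources}
            if len(owners) == 1:
                (src,) = owners
                if src != group[target]:
                    adj[src].add(group[target])
        reach = {g: _reach(adj, g) for g in groups}
        merged = False
        for g in groups:
            # Step 1: nodes that reach each other are strongly connected.
            scc = {h for h in reach[g] if g in reach[h]}
            if len(scc) > 1:
                union = frozenset().union(*scc)
                for s in union:
                    group[s] = union
                merged = True
                break  # Step 2: rebuild edges and repeat
        if not merged:
            return len(groups) == 1  # Theorem 3: one merged node left


# Hypothetical example: S1 and S2 purge each other on B, S1 also
# punctuates A (edge S3 -> S1), and S3's (+, +) scheme on (A, C)
# purges the intermediate result S1 join S2.
simple = [("S1", "S2"), ("S2", "S1"), ("S3", "S1")]
generalized = [(frozenset({"S1", "S2"}), "S3")]
print(is_safe(["S1", "S2", "S3"], simple, generalized))  # True
print(is_safe(["S1", "S2", "S3"], simple, []))           # False
```

Each round either merges at least two nodes or stops, so the loop runs at most once per stream, which keeps the check polynomial in the size of the graph.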

  20. Conclusion & Future Work • Formulate the safety checking problem for CJQ • Sound and complete safety conditions • Based on novel punctuation graph • Linear for simple punctuation schemes • Polynomial for complex punctuation schemes • Future work • Optimization of Chained Purge Strategy for M-join • M-join purge vs. a tree binary-join purge • Optimization of CJQ • Purge plan vs. join plan • Adaptive purge plan • Generation of Punctuations
