Sven Bittner and Annika Hinze, 10 June 2005

Talk at the 4th International Workshop on Distributed Event-Based Systems at the conference ICDCS 2005: On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems.

Presentation Transcript


  1. On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems
     Talk at the 4th International Workshop on Distributed Event-Based Systems at the conference ICDCS 2005
     Sven Bittner and Annika Hinze, 10 June 2005

  2. Structure
     • Motivation
     • Canonical Transformation
     • Non-Canonical Filtering
     • Experiments
     • Summary and Future Work
     Annika Hinze – On the Benefits of Non-Canonical Filtering in Publish/Subscribe Systems

  4. Motivation: Current Assumptions
     • Expressive filtering
       • All subscriptions might be transformed to DNFs (or are purely conjunctive)
     • Efficient filtering
       • Utilisation of indexes
       • Filtering on conjunctions (DNFs) is most efficient
       • Main memory solutions are most efficient
     • Scalable filtering
       • Filtering is performed on designated servers
       • Large main memories are available
     Motivation Canonical Transformation Non-Canonical Filtering Experiments Summary

  5. Motivation: Our Point of View
     • Main memory algorithms are only as scalable as the resources provided
       → Efficiency is only one quality measure
       → Matching algorithms should consider their memory usage (scalability)
     • Claim
       • Filtering on arbitrary Boolean subscriptions is
         • More expressive (i.e., richer subscription language)
         • More scalable (i.e., requires less memory)
         • Only slightly less efficient (i.e., slower matching times)

  7. Transformations: Example
     [Figure: example transformation]

  8. Transformations: Implications
     • Efficiency
       • Faster filtering algorithms applicable
       • Filtering on more subscriptions, common sub-expressions
     • Scalability
       • Storage of Boolean formulae not required
       • More subscriptions to store
     → Which influences prevail?
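A minimal sketch of the transformation discussed above, to make the "more subscriptions to store" point concrete. It assumes subscriptions are nested tuples such as `("and", ...)` and `("or", ...)` with predicate names as string leaves; this representation and the name `to_dnf` are illustrative assumptions, not taken from the talk, and negation is left out.

```python
from itertools import product

def to_dnf(expr):
    """Return the DNF of expr as a list of conjunctions (frozensets of predicate names)."""
    if isinstance(expr, str):            # leaf: a single predicate
        return [frozenset([expr])]
    op, *children = expr
    child_dnfs = [to_dnf(c) for c in children]
    if op == "or":                       # OR: concatenate the children's conjunctions
        return [conj for dnf in child_dnfs for conj in dnf]
    if op == "and":                      # AND: distribute over every combination
        return [frozenset().union(*combo) for combo in product(*child_dnfs)]
    raise ValueError(f"unknown operator: {op}")

# Hypothetical subscription with four predicates: (a OR b) AND (c OR d)
subscription = ("and", ("or", "a", "b"), ("or", "c", "d"))
dnf = to_dnf(subscription)
print(len(dnf))                    # 4 conjunctive subscriptions instead of 1 Boolean one
print(sum(len(c) for c in dnf))    # 8 predicate references instead of 4
```

Each additional OR group under the AND multiplies the number of conjunctions, which is the growth the scalability bullet refers to.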

  9. Transformations: Origin - DBMS
     • Utilised for query execution
       • Transform to canonical expression (e.g. DNF)
       • Simplify each element in disjunction separately
       • Create access plans and execute cheapest
       → Useful, since efficient data access is crucial
     • Several advantages
       • Only few queries at one time → no memory problems
       • Data storage is known in advance → data access might be optimised

  10. Transformations: Why in ENS?
      • ENSs show converse problem definition
        • Large subscription numbers (queries)
        • Events not known in advance (data)
      • Subscriptions are not optimised (in current approaches)
        • Memory usage even higher
        • Computations for more subscriptions
      → Is a transformation useful in ENSs?

  12. Non-Canonical Filtering: Trees
      • (almost) as shown:
      • Internal representation
        • Predicate identifiers in leaves (indexes for predicates)
      • Space efficient encoding (in future)
      • Actually encoded on byte level, i.e.,
        • 1 byte each: No. of children, operator
        • 2 bytes: width of children
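The byte-level layout named on this slide could look roughly like the sketch below. Only the inner-node header (1 byte for the number of children, 1 byte for the operator, 2 bytes for the width of the children) comes from the slide. The leaf format (a zero child count followed by a 4-byte predicate identifier), the operator codes, the big-endian byte order, and the function names are assumptions made for illustration.

```python
import struct

OPS = {"and": 0, "or": 1}   # operator codes are an assumption of this sketch

def encode(expr):
    """Encode a nested-tuple subscription tree (int leaves = predicate ids) into bytes."""
    if isinstance(expr, int):                   # leaf: predicate identifier
        return struct.pack(">BI", 0, expr)      # assumed: 0 children, then a 4-byte id
    op, *children = expr
    body = b"".join(encode(c) for c in children)
    # 1 byte: number of children, 1 byte: operator, 2 bytes: width of children
    return struct.pack(">BBH", len(children), OPS[op], len(body)) + body

def evaluate(buf, matched, pos=0):
    """Evaluate the encoded tree starting at buf[pos]; return (truth value, next position)."""
    n_children = buf[pos]
    if n_children == 0:                         # leaf: look up the predicate id
        (pred_id,) = struct.unpack_from(">I", buf, pos + 1)
        return pred_id in matched, pos + 5
    op, width = struct.unpack_from(">BH", buf, pos + 1)
    child, end = pos + 4, pos + 4 + width
    is_and = op == OPS["and"]
    while child < end:
        value, child = evaluate(buf, matched, child)
        if value != is_and:     # AND met a false child, or OR met a true child
            return value, end   # the stored width allows skipping the remaining children
    return is_and, end

tree = ("and", ("or", 1, 2), ("or", 3, 4), ("or", 5, 6))
buf = encode(tree)
print(len(buf))                        # 46 bytes under the assumed leaf format
print(evaluate(buf, {1, 4, 5})[0])     # True
print(evaluate(buf, {1, 2})[0])        # False
```

Storing the width of the children is what makes short-circuit evaluation cheap here: once the result of a node is known, the decoder can jump past the rest of its subtree without parsing it.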

  13. Non-Canonical Filtering: 2 Steps
      • Predicate matching
        • Determine matching predicates
      • Subscription matching
        • Determine candidate subscriptions (min 1 match)
        • Evaluate their Boolean combinations
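The two steps can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes equality predicates only, uses a plain dictionary as the predicate index, and represents subscriptions as nested tuples with integer predicate identifiers in the leaves. The class and method names (`Matcher`, `predicate`, `subscribe`, `match`) are hypothetical.

```python
def evaluate(expr, matched):
    """Evaluate a Boolean subscription tree against the set of matched predicate ids."""
    if isinstance(expr, int):
        return expr in matched
    op, *children = expr
    results = (evaluate(c, matched) for c in children)
    return all(results) if op == "and" else any(results)

def leaves(expr):
    """Yield every predicate id that occurs in the tree."""
    if isinstance(expr, int):
        yield expr
    else:
        for child in expr[1:]:
            yield from leaves(child)

class Matcher:
    def __init__(self):
        self.predicate_index = {}      # (attribute, value) -> predicate id
        self.subs_by_predicate = {}    # predicate id -> ids of subscriptions using it
        self.subscriptions = {}        # subscription id -> Boolean tree

    def predicate(self, attribute, value):
        """Register an equality predicate and return its id (shared if already known)."""
        return self.predicate_index.setdefault((attribute, value), len(self.predicate_index))

    def subscribe(self, sub_id, expr):
        self.subscriptions[sub_id] = expr
        for pred_id in leaves(expr):
            self.subs_by_predicate.setdefault(pred_id, set()).add(sub_id)

    def match(self, event):
        # Step 1: predicate matching via the index
        matched = {self.predicate_index[item] for item in event.items()
                   if item in self.predicate_index}
        # Step 2a: candidate subscriptions have at least one matching predicate
        candidates = set()
        for pred_id in matched:
            candidates.update(self.subs_by_predicate.get(pred_id, ()))
        # Step 2b: evaluate the Boolean combination of each candidate
        return {s for s in candidates if evaluate(self.subscriptions[s], matched)}

m = Matcher()
a = m.predicate("type", "stock")
b = m.predicate("symbol", "IBM")
c = m.predicate("symbol", "SUN")
m.subscribe("s1", ("and", a, ("or", b, c)))
m.subscribe("s2", ("and", a, b))
print(m.match({"type": "stock", "symbol": "SUN"}))   # {'s1'}
```

Both subscriptions become candidates for the sample event because they share the `type` predicate, but only `s1` survives the Boolean evaluation.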

  15. Experiments: Initial Evaluation
      [Figure: subscription tree, AND over three ORs with leaves p1 to p6, i.e. (p1 OR p2) AND (p3 OR p4) AND (p5 OR p6)]
      • Comparison of Step 2 of matching approaches
        • Step 1 utilises same indexes
      • Canonical counting algorithm (count no. of predicates)
        • Original – compare for all subscriptions
        • Variant – compare for candidate subscriptions only
      • Our non-canonical approach
      • Subscription characterisation
        • Number of predicates P (=6)
        • DNF consists of disjunctive elements (8)
        • Each element contains predicates (3)
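A rough sketch of the counting approach on the slide's example, under stated assumptions: the tree is read as (p1 OR p2) AND (p3 OR p4) AND (p5 OR p6), which is consistent with the 8 disjunctive elements of 3 predicates each; predicates are represented by the integers 1 to 6; and the data structures and function names are illustrative rather than the authors' code.

```python
from collections import Counter
from itertools import product

# Canonical (DNF) form of the example: one conjunction per choice from each OR group
groups = [(1, 2), (3, 4), (5, 6)]
conjunctions = [frozenset(c) for c in product(*groups)]
required = [len(c) for c in conjunctions]       # predicates needed per conjunction

# Association table: predicate id -> conjunctions that contain it
subs_by_predicate = {}
for idx, conj in enumerate(conjunctions):
    for p in conj:
        subs_by_predicate.setdefault(p, []).append(idx)

def count_hits(matched):
    """Count, per conjunction, how many of its predicates are matched."""
    hits = Counter()
    for p in matched:
        for idx in subs_by_predicate.get(p, ()):
            hits[idx] += 1
    return hits

def match_original(matched):
    """Original: compare the counter of every stored conjunction."""
    hits = count_hits(matched)
    return [i for i in range(len(conjunctions)) if hits[i] == required[i]]

def match_variant(matched):
    """Variant: compare only candidates, i.e. conjunctions with at least one hit."""
    hits = count_hits(matched)
    return sorted(i for i, n in hits.items() if n == required[i])

print(len(conjunctions), set(required))   # 8 {3}
print(match_original({1, 4, 5}))          # [2]
print(match_variant({1, 4, 5}))           # [2]
```

The two functions return the same result; they differ only in how many counters are inspected, which is the difference between the original and the variant named on the slide.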

  16. Experiments: Setting
      [Table of experiment settings not preserved; parameters P and M]

  17. Experiments: Results - Scalability
      [Figure: two panels, P=6; M=5,000 and P=10; M=5,000]
      • Counting algorithm
        • Sharp bends mark the point where the available main memory is exhausted
        • The fewer subscriptions are created, the better the scalability
      • Non-canonical approach
        • Available main memory is sufficient
        • Scalability is independent of transformations

  18. Experiments: Results - Efficiency
      [Figure: two panels, P=6; M=5,000 and P=10; M=5,000]
      • Counting algorithm
        • Original approach shows linearly increasing matching times
        • Variant becomes more efficient for large subscription numbers
      • Non-canonical approach
        • Filtering is more efficient than the variant of the counting algorithm
        • Difference becomes more pronounced when DNFs become larger

  19. Experiments: Results - Summary
      • Transformations to DNFs drastically reduce the scalability of filter algorithms
        → Memory requirements of the transformed conjunctive subscriptions outweigh the storage space needed for Boolean ones
      • Filtering on several conjunctive subscriptions instead of arbitrary Boolean ones decreases efficiency
        → The impact of more conjunctive (simpler) subscriptions on filtering performance outweighs the higher matching costs of Boolean ones

  20. Structure
      • Motivation
      • Canonical Transformation
      • Non-Canonical Filtering
      • Experiments
      • Future Work

  21. Future Work
      • Theoretical evaluation of memory requirements
        • Characterisation of subscriptions
        • Statements like “when to use which approach”
      • Further practical experiments
        • Prove correctness of theoretical evaluation
        • Analyse more sophisticated settings

  22. Thank you for your attention! Contact: Sven Bittner, Annika Hinze {s.bittner, a.hinze}@cs.waikato.ac.nz