
IVOX: Incremental View Maintenance for Ordered XML

IVOX: Incremental View Maintenance for Ordered XML. DSRG Talk, WPI, February 20th, 2003. Students: Katica Dimitrova & Maged El Sayed. Advisor: Prof. Elke Rundensteiner. Outline: Motivation, Problem Description, Background, XML Algebra, Order in XML Algebra, The IVOX Approach.


Presentation Transcript


  1. IVOX: Incremental View Maintenance for Ordered XML • DSRG Talk, WPI, February 20th, 2003 • Students: Katica Dimitrova & Maged El Sayed • Advisor: Prof. Elke Rundensteiner

  2. Outline • Motivation • Problem Description • Background • XML Algebra • Order in XML Algebra • The IVOX Approach • Order Encoding • Overall strategy • System Architecture • Related Work • Future Work

  3. Outline • Motivation • Problem Description • Background • XML Algebra • Order in XML Algebra • The IVOX Approach • Order Encoding • Overall strategy • System Architecture • Related Work • Future Work

  4. Motivation • Views in general • Data warehouses • Information integration • Access control, privacy, etc. • XML Views (extra useful) • Information inter-portability • Crossing gaps between different data models • Materialized Views • Speed up data retrieval • Query optimization • Increased availability [Figure: a view defined by a view definition query over RDB, XML, and other sources]

  5. Maintaining Materialized Views • When sources are updated, a materialized view may become inconsistent. • Methods of view maintenance • Recomputation: recompute the view from scratch from base data • Incremental view maintenance: compute changes to the view in response to changes to the base sources • Heuristic: incremental view maintenance is usually cheaper than full recomputation.

  6. Outline • Motivation • Problem Description • Background • The XAT Algebra • XML order in the XAT Context • The IVOX Approach • Order Encoding • Overall strategy • System Architecture • Related Work • Future Work

  7. The Problem • Previous work for: • Relational [GMS93], bag semantics [GL95], [ZGHW95], [PSCP02] • Object-Relational [LVM00] • Object-Oriented [AFP02] • Structured data models [AMRVW98], [ZM98] • XML data model not handling order [LD00] • Can techniques for other data models be reused for XML?

  8. Is Maintaining XML Views Different? • XML features • Hierarchical • Optional elements • Self-typed • References • Ordered • Expressiveness of view definition language • Complex operations • tagging, unnesting, aggregation, .. • Expected large auxiliary information

  9. Example • Bib.xml: <bib> <book> <price>65.95</price> <title>Advanced Programming in the Unix environment</title> </book> <book> <title>TCP/IP Illustrated</title> </book> <book> <price>39.95</price> <title>Data on the Web</title> </book> </bib> • View definition query (list all books that cost less than $60, including their title and price): <result> for $b in document("bib.xml")/bib/book where $b/price/text() < 60 return <book> $b/title, $b/price </book> </result> • View extent: <result> <book> <title>Data on the Web</title> <price>39.95</price> </book> </result>

  10. Example • Bib.xml: <bib> <book> <price>65.95</price> <title>Advanced Programming in the Unix environment</title> </book> <book> <title>TCP/IP Illustrated</title> </book> <book> <price>39.95</price> <title>Data on the Web</title> </book> </bib> • Update: insert element <price>55.48</price> into second book • View definition query: <result> for $b in document("bib.xml")/bib/book where $b/price/text() < 60 return <book> $b/title, $b/price </book> </result> • View extent before the update: <result> <book> <title>Data on the Web</title> <price>39.95</price> </book> </result> • Element added to the view extent by the update: <book> <title>TCP/IP Illustrated</title> <price>55.48</price> </book>
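To make the example concrete, here is a minimal Python sketch (not the IVOX implementation) that evaluates the view with the standard library's `xml.etree.ElementTree`, applies the insert from the slide, and then patches the view for the one affected book instead of recomputing everything. The helper names `view_book`, `recompute`, and `titles` are hypothetical. The position lookup scans the preceding books, which is the kind of scan an order encoding is meant to avoid.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book><price>65.95</price><title>Advanced Programming in the Unix environment</title></book>
  <book><title>TCP/IP Illustrated</title></book>
  <book><price>39.95</price><title>Data on the Web</title></book>
</bib>"""

def view_book(book):
    """Return the view's <book> element for one source book, or None if price >= 60 or missing."""
    price = book.find("price")
    if price is None or float(price.text) >= 60:
        return None
    out = ET.Element("book")
    # The view definition returns title first, then price.
    for child in (book.find("title"), price):
        if child is not None:
            ET.SubElement(out, child.tag).text = child.text.strip()
    return out

def recompute(bib):
    """Full recomputation: evaluate the view definition over every book."""
    result = ET.Element("result")
    for book in bib.findall("book"):
        v = view_book(book)
        if v is not None:
            result.append(v)
    return result

def titles(view):
    return [b.findtext("title") for b in view.findall("book")]

bib = ET.fromstring(BIB)
view = recompute(bib)  # initial materialization

# Source update from the slide: insert <price>55.48</price> into the second book.
books = bib.findall("book")
idx = 1
ET.SubElement(books[idx], "price").text = "55.48"

# Incremental maintenance: re-evaluate only the touched book, then place it
# after the qualifying books that precede it in document order.
delta = view_book(books[idx])
position = sum(1 for b in books[:idx] if view_book(b) is not None)
view.insert(position, delta)
```

After the patch, the incrementally maintained view holds the same books, in the same order, as a fresh `recompute(bib)`.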

  11. Our Goal • Design an incremental view maintenance strategy for XQuery views that: • Correctly updates the view • Is order sensitive • Returns the view in proper order • Allows for updates that specify order • Covers at least the "core" of XQuery language views • Minimizes auxiliary information requirements

  12. Basics of IVOX Approach: Algebraic • Update propagation rules for each algebra operator and each update type [Figure: the XQuery view definition becomes an algebra tree over the XML sources. At execution time each operator maps its input D1 to its output D2; at view maintenance time the same operator maps an update on D1 to an update on D2, up to the XML view.]

  13. Why Algebraic? • Robust: easily adaptable to changes in operator semantics • Extensible: new operators can be added • Allows for reuse of techniques for known operators • Language independent: unaffected by syntax changes (of XQuery by W3C) • Formal: basis for provable correctness

  14. Outline • Motivation • Problem Description • Background • XML Algebra • Order in XML Algebra • The IVOX Approach • Order Encoding • Overall strategy • System Architecture • Related Work • Future Work

  15. Background on XML Algebra XAT • XAT Operators • SQL Operators: Select, Project, ... • Special Operators: Source, FOR, ... • XML Operators: Navigate, Tagger, ... • XAT Data Model (XAT Table) • Order-sensitive table of tuples • Columns denote user-specified or internally generated variable bindings • A cell in a tuple holds an XML node or a sequence of XML nodes [Figure: example XAT table for a Navigate operator from $col1 along path price into $col3]

  16. Order in XAT Context • Order among tuples • Order among XML nodes in a cell [Figure: XAT table for a Navigate operator from $col1 along path price into $col3]

  17. Order in the XAT Context • Order among the tuples • Order among XML nodes in a single cell [Figure: XAT table for an Agg operator on $col5, whose cell holds a sequence of nodes]

  18. Order in XAT Context: View Maintenance • On update, worry about: • Order among tuples • Order among XML nodes in a cell [Figure: XAT table for a Navigate operator from $col1 along path price into $col3]

  19. Order in XAT Context & View Maintenance • On update, worry about: • Order among the tuples • Order among XML nodes in a single cell [Figure: XAT table for an Agg operator on $col5, whose cell holds a sequence of nodes]

  20. Duplicate Information in XAT Context • Complex operations require auxiliary information • Auxiliary information can be too large in XAT context • May be expensive to maintain it [Figure: XAT table for a Navigate operator from $col1 along path price into $col3, with a callout marking duplicated storage]

  21. Outline • Motivation • Problem Description • Background • XML Algebra • Order in XML Algebra • The IVOX Approach • Order Encoding • Overall strategy • System Architecture • Related Work • Future Work

  22. Possible Solutions to Order Preservation (I) • Sequential storage (XPROP approach by Maged, Ling & Luping) • Assume intermediate results stored sequentially • Inserts and deletes are performed in physical order • No order encoding • Special support required for secondary storage • May require iteration over many tuples to determine order [Figure: XAT tables before and after a Navigate operator from $col1 along path price into $col3; the tuple for the inserted <price>55.48</price> is placed physically between the tuples for prices 65.95 and 39.95]

  23. Possible Solutions to Order Preservation (II) • Naïve order encoding for tuples and sequences of XML nodes • Assign order numbers to tuples and to XML nodes in a sequence • Requires frequent renumbering on inserts. [Figure: XAT table for a Navigate operator from $col1 along path price into $col3 with an Ord column; inserting the tuple for <price>55.48</price> at order 2 forces the existing tuple at order 2 to be renumbered to 3]
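The renumbering cost can be illustrated with a small Python sketch. It assumes consecutive integer order numbers starting at 1; the function name `insert_with_renumbering` and the row payloads are hypothetical and only serve the illustration.

```python
def insert_with_renumbering(rows, new_payload, position):
    """Insert a row at the given order number.

    rows is a list of (ord, payload) pairs with ord = 1..n.
    Returns the new rows and how many existing rows had to be renumbered.
    """
    shifted = 0
    out = []
    for ord_, payload in rows:
        if ord_ >= position:
            ord_ += 1          # every row at or after the insert point moves down
            shifted += 1
        out.append((ord_, payload))
    out.append((position, new_payload))
    out.sort(key=lambda r: r[0])
    return out, shifted

rows = [(1, "price 65.95"), (2, "price 39.95")]
new_rows, shifted = insert_with_renumbering(rows, "price 55.48", 2)
```

Inserting at the front of an n-row table renumbers all n existing rows, which is the cost the slide refers to.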

  24. Using Node Identity • Idea: Use node identity • Usage: • For encoding order and structure • As a reference to base data

  25. What Encoding For Node Identity? • Existing techniques for encoding order for XML • Global Order (UW) • Local Order (UW) • Dewey Order (UW) • Lexicographical Order (MASS) [Figure: the bib.xml tree (bib, three book elements, their price and title children) labeled with Global Order numbers, with bib = 1]

  26. What Encoding For Node Identity? • Existing techniques for encoding order for XML • Global Order (UW) • Local Order (UW) • Dewey Order (UW) • Lexicographical Order (MASS) [Figure: the bib.xml tree labeled with Local Order numbers: bib = 1, the three books = 1, 2, 3 among their siblings, and each book's children numbered from 1]

  27. What Encoding For Node Identity? • Existing techniques for encoding order for XML • Global Order (UW) • Local Order (UW) • Dewey Order (UW) • Lexicographical Order (MASS) [Figure: the bib.xml tree labeled with Dewey Order keys: bib = 1, the three books = 1.1, 1.2, 1.3, and their children with keys such as 1.1.1, 1.1.2, 1.3.1, 1.3.2]

  28. What Encoding For Node Identity? • Existing techniques for encoding order for XML • Global Order (UW) • Local Order (UW) • Dewey Order (UW) • Lexicographical Order (MASS): the winner [Figure: the bib.xml tree labeled with lexicographical keys: bib = b, the three books = b.b, b.d, b.f, and their children with keys such as b.b.b, b.b.cd, b.d.f, b.f.cm, b.f.l]

  29. Lexicographical Keys: LexKeys • What are LexKeys? • Multi-level lexicographical keys • Examples: c, ba.c.b • Examples of comparison: b < b.c, bab < bd.cc, b.b < b.b.c • Advantages • All LexKeys form a totally ordered set with respect to < • It is always possible to generate a key between two keys • The deletion of a LexKey in a sequence does not affect other LexKeys • Usage • Reference to XML nodes • Encoding order
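The properties listed above can be sketched in Python. This is only an illustration under stated assumptions, not the MASS key generator: a key is split on "." into levels, levels are compared as strings, each level uses the letters a to z, and no level ends in "a". The names `parse`, `level_between`, and `sibling_between` are hypothetical. A level is read as a base-26 fraction, so the midpoint of two levels always exists and has a finite expansion.

```python
from fractions import Fraction

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)

def parse(key):
    """Split 'ba.c.b' into levels ('ba', 'c', 'b'); tuples compare level by level."""
    return tuple(key.split("."))

def _to_fraction(level):
    # Read a level such as 'bd' as the base-26 fraction 0.bd (a = 0, ..., z = 25).
    return sum(Fraction(ALPHABET.index(ch), BASE ** (i + 1))
               for i, ch in enumerate(level))

def _from_fraction(x):
    # Emit base-26 digits until the remainder is zero; the last digit is never 'a'.
    digits = []
    while x:
        x *= BASE
        d = int(x)
        digits.append(ALPHABET[d])
        x -= d
    return "".join(digits)

def level_between(lo, hi):
    """Return a level strictly between lo and hi (lo < hi, neither ending in 'a')."""
    return _from_fraction((_to_fraction(lo) + _to_fraction(hi)) / 2)

def sibling_between(left, right):
    """Key for a new node between two sibling keys that share the same parent."""
    l, r = parse(left), parse(right)
    if l[:-1] != r[:-1] or not l[-1] < r[-1]:
        raise ValueError("expected two ordered sibling keys")
    return ".".join(l[:-1] + (level_between(l[-1], r[-1]),))

# Existing keys stay untouched when a key is generated between two of them.
keys = ["b.b", "b.d", "b.f"]
new_key = sibling_between("b.b", "b.d")
ordered_keys = sorted(keys + [new_key], key=parse)
```

Because a new key is derived only from its two neighbours, inserting or deleting a node never rewrites any other key.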

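  30. LexKeys in XAT Tables [Figure: XAT tables for a Navigate operator from $b along path price into $col2, shown once with XML nodes in the cells and once with the corresponding LexKeys]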

  31. Order Among XAT Tuples • Notion: assign an order schema to each XAT table • Sorting the tuples by the LexKeys in the order-schema columns yields the correct tuple order. [Figure: two XAT tables with their order schemas and the resulting tuple order]
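As a rough Python sketch of this idea, tuple order can be derived on demand by sorting on the order-schema columns. The table contents, the column names, and the order schema `["$b", "$col2"]` are illustrative assumptions, and `parse` and `ordered` are hypothetical helpers.

```python
def parse(key):
    """Split a LexKey such as 'b.d.b' into levels for level-by-level comparison."""
    return tuple(key.split("."))

def ordered(table, order_schema):
    """Return the tuples sorted by the LexKeys in the order-schema columns."""
    return sorted(table, key=lambda t: tuple(parse(t[c]) for c in order_schema))

# Tuples are stored in arbitrary physical order; each cell holds a LexKey.
table = [
    {"$b": "b.f", "$col2": "b.f.cm"},
    {"$b": "b.b", "$col2": "b.b.b"},
]
order_schema = ["$b", "$col2"]

# A newly inserted tuple is simply appended; no existing tuple is touched.
table.append({"$b": "b.d", "$col2": "b.d.b"})
in_order = [t["$b"] for t in ordered(table, order_schema)]
```

The design choice illustrated here is that order lives in the keys rather than in the storage layout, so inserts need neither physical placement nor renumbering.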

  32. Calculating Order Schema • Rules for each operator • Calculated in a postorder traversal of the tree • Sample Rules

  33. Order Among Tuples Example [Figure: XAT tables before and after a Navigate operator from $b along path price into $col2, annotated with their order-schema column positions]

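  34. Order in Collection within a cell? [Figure: XAT tables before and after an Agg operator on $col5; the nodes collected into one cell carry positions 1 and 2 that need not match the order of their keys]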

  35. SmartKeys • What is a SmartKey? • Key part: by default also represents order • Order part: optional, only represents order, when present • Notation: key(order) • Examples • b.c.b (h) • b.c.b
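A SmartKey in the key(order) notation can be modeled with a small Python sketch. The class layout, the parser, and the rule "sort by the order part if present, otherwise by the key part" are one reading of the slide, labeled here as an assumption; the cell contents are illustrative.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class SmartKey:
    key: Tuple[str, ...]                      # identifies the node
    order: Optional[Tuple[str, ...]] = None   # overrides the position when present

    @classmethod
    def parse(cls, text):
        """Parse 'b.c.b' or 'b.c.b (h)' into a SmartKey."""
        m = re.fullmatch(r"\s*([a-z.]+)\s*(?:\(\s*([a-z.]+)\s*\))?\s*", text)
        if m is None:
            raise ValueError(f"not a SmartKey: {text!r}")
        key = tuple(m.group(1).split("."))
        order = tuple(m.group(2).split(".")) if m.group(2) else None
        return cls(key, order)

    def sort_key(self):
        # Assumption: the order part wins when present, else the key doubles as order.
        return self.order if self.order is not None else self.key

# A view that returns title before price, although the price key sorts first:
cell = [SmartKey.parse("b.f.cm (c)"), SmartKey.parse("b.f.l (b)")]
in_view_order = [".".join(k.key) for k in sorted(cell, key=SmartKey.sort_key)]
```

Keeping the key part intact preserves the reference to the base node even when the view imposes a different order.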

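  36. SmartKeys in XAT Tables [Figure: XAT tables before and after an Agg operator on $col5, with SmartKeys recording positions 1 and 2 of the nodes collected into one cell]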

  37. The Impact of SmartKeys on View Maintenance

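  38. Order Among XAT Tuples during View Maintenance • Other tuples in the XAT table are not touched • No reordering ever needed. • Gains distributivity with respect to bag union at the tuple level [Figure: XAT tables before and after a Navigate operator from $col1 along path price into $col3; the inserted tuple takes position 2 without changing the other tuples]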

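  39. Order in a Sequence during View Maintenance • Other members of the sequence are not touched • No reordering ever needed. • Gains distributivity with respect to bag union at the cell level [Figure: XAT tables before and after an Agg operator on $col5; a node is added to the sequence in a cell without changing the other members]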

  40. Update Propagation Rules • Use distributivity with respect to bag union • Reuse rules from relational view maintenance for most SQL XAT operators [Figure: at execution time an operator maps XAT table 1 to XAT table 2; at view maintenance time the same operator maps an update to XAT table 1 to an update to XAT table 2]

  41. Update Propagation Rules Example (Navigate Unnest on Insert Tuple) • T2old = φ_{$col,path}^{$col'}(T1old) • T1new = T1old + ΔT1 • T2new = φ_{$col,path}^{$col'}(T1old + ΔT1) = φ_{$col,path}^{$col'}(T1old) + φ_{$col,path}^{$col'}(ΔT1) = T2old + ΔT2 • + represents bag union [Figure: at execution time φ_{$col,path}^{$col'} maps T1 to T2; at view maintenance time it maps ΔT1 to ΔT2]
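The derivation on this slide can be checked on a toy example. The following Python sketch assumes an XAT table is a list of dictionaries that map column names to `xml.etree.ElementTree` elements; `navigate_unnest` and `as_bag` are hypothetical stand-ins for the operator and for bag comparison, not the Rainbow implementation.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def navigate_unnest(table, col, path, new_col):
    """For each tuple, emit one output tuple per node reached from row[col] via path."""
    out = []
    for row in table:
        for node in row[col].findall(path):
            out.append({**row, new_col: node})
    return out

def as_bag(table):
    """Order-insensitive view of a table; nodes are compared by identity."""
    return Counter(tuple(sorted((c, id(n)) for c, n in row.items())) for row in table)

bib = ET.fromstring(
    "<bib>"
    "<book><price>65.95</price></book>"
    "<book><price>39.95</price></book>"
    "<book><price>55.48</price></book>"
    "</bib>"
)
books = bib.findall("book")
t1_old = [{"$b": books[0]}, {"$b": books[1]}]
delta_t1 = [{"$b": books[2]}]          # inserted tuple

t2_old = navigate_unnest(t1_old, "$b", "price", "$col2")
delta_t2 = navigate_unnest(delta_t1, "$b", "price", "$col2")
t2_new = navigate_unnest(t1_old + delta_t1, "$b", "price", "$col2")

# Distributivity over bag union: only the delta has to be pushed through the operator.
distributes = as_bag(t2_new) == as_bag(t2_old) + as_bag(delta_t2)
```

In this sketch the maintenance step evaluates the operator on one tuple instead of three, which is the saving the rule is meant to capture.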

  42. Update Propagation Strategy [Figure: diagram with the labels Update XQuery, Translator, xmlup, keyup, xatup, XAT, XML View, Storage Manager, and three XML Sources]

  43. Update Primitives (The Format of Delta) • XML Update Primitives (xup), applied to the original XML document • Insert (xmlFragment, path) • Delete (path) • InsertAtt (name, value, path) • DeleteAtt (name, path) • Replace (oldValue, newValue, path) • XML Key Update Primitives (keyup), expressing the update on the original XML data in terms of LexKeys • Insert (el, path) • Delete (path) • Replace (el, pos) • XAT Update Primitives (xatup), applied to the XAT table • InsertTuple (tuple) • DeleteTuple (tupleId) • ChangeTuple (Keyup, columnName, tupleId)

  44. A Complete Example

  45. [Figure: execution of the example view over bib.xml. Operator tree, from source to view: S "bib.xml" → $S1; Navigate $S1, bib → $col1; Navigate $col1, book → $b; Navigate $b, price → $col2; Navigate $b, title → $col4; Select $col3 < 60; Tagger <book>$col4 $col2</book> → $col5; Agg $col5; Tagger <result>$col5</result> → $col6. The XAT tables hold LexKeys (b for bib; b.b, b.d, b.f for the books; b.b.b, b.b.cd, b.d.f, b.f.cm, b.f.l for their children). The Storage Manager holds the constructed XDOMs for the tagged nodes: tb..b.f.l..b.f.cm for the view's book and tr for result. The final $col5 cell is { tb..b.f.l..b.f.cm(b.f.l..b.f.cm) }.]

  46. [Figure: view maintenance over the same operator tree for the source update Insert (price, bib[1].book[2]), expressed with LexKeys as Insert (price[b.d.b], bib[b].book[b.d]). Update primitives shown along the tree include: ChangeTuple(insert(price[b.d.b], bib[b].book[b.d]), $col1, b); changeTuple(insert(price[b.d.b], book[b.d]), $b, b.d); insertTuple({b.d, b.d.b}); insertTuple({b.d.b, b.d.f}); insertTuple({tb..b.d.f..b.d.b}); ChangeTuple(insert(tb..b.d.f..b.d.b, null), $col5, ); ChangeTuple(insert(tb..b.d.f..b.d.b, result[tr]), $col6, tr). The $col5 cell grows from { tb..b.f.l..b.f.cm(b.f.l..b.f.cm) } to { tb..b.f.l..b.f.cm(b.f.l..b.f.cm) tb..b.d.f..b.d.b(..b.d.f..b.d.b) }.]

  47. Outline • Motivation • Problem Description • Background on XAT • XML Algebra • Order in XML Algebra • The IVOX Approach • Order Encoding • Overall strategy • System Architecture • Related Work • Future Work

  48. System Architecture [Figure: architecture diagram with an Execution side (one-time occurrence) and a View Maintenance side (on-update occurrence). Labeled components: User, View Definition XQuery, Update XQuery, XML Query Engine, VM Initializer, Update Primitive Generator, XML View Maintainer, Update Propagation Rules Repository, XML Algebra Tree, IVOX Executer, Rainbow, XTUP, Materialized XML View, Materialized Auxiliary Views, Storage Manager, and three XML Sources. The legend distinguishes processes, data, and persistent data storage.]

  49. Outline • Motivation • Problem Description • Background on XAT • XML Algebra • Order in XML Algebra • The IVOX Approach • Order Encoding • Overall strategy • System Architecture • Related Work • Future Work

  50. Related Work • A. Gupta, I. S. Mumick. Maintenance of Materialized Views: Problems, Techniques, and Applications. In Bulletin of the Technical Committee on Data Engineering, 1995. • T. Griffin, L. Libkin. Incremental Maintenance of Views with Duplicates. In SIGMOD 1995. • H. Liefke, S. Davidson. View Maintenance for Hierarchical Semistructured Data. In DAWAK 2000. • S. Abiteboul, J. McHugh, M. Rys, V. Vassalos, J. Wiener. Incremental Maintenance for Materialized Views over Semistructured Data. In VLDB 1998.
