1 / 43

XQuery to SQL by XML Algebra Tree

XQuery to SQL by XML Algebra Tree. Brad Pielech, Brian Murphy Thanks: Xin. Outline. Overview of Rainbow System Process of translating XQuery -> SQL XML Operators Partial translation walkthrough with running example. Rainbow System. Complete XML <-> SQL system

Télécharger la présentation

XQuery to SQL by XML Algebra Tree

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. XQuery to SQL by XML Algebra Tree Brad Pielech, Brian Murphy Thanks: Xin

  2. Outline • Overview of Rainbow System • Process of translating XQuery -> SQL • XML Operators • Partial translation walkthrough with running example

  3. Rainbow System • Complete XML <-> SQL system • Uses some ideas from XPERANTO, Niagara, and other systems • Several main subsystems: • Document Shredder • View Generator • Query Translation, Query Rewrite • Result Generation • Work in progress

  4. Steps in Translation • User inputs XQuery query • User Query is converted into an XML Algebra Tree (XAT) • Database Mapping Query’s XAT generated • Queries are Decorrelated • Trees are merged, unnecessary branches cut

  5. Steps Continued • Computation Pushdown (presentation concludes here) • SQL Generation • Query Execution • Tagging of Results

  6. What is the difference between the two queries? • The user query is executed over a view of the XML document and specifies what to return and how to return it • The mapping query specifies how the view the user is querying “maps” to the database • Therefore, combining the two queries into one is necessary in order to correctly process the user’s request

  7. XAT Operators • Each XAT is comprised of XAT Operators. • Similar in concepts to Relational Algebra • Operator set is combination between Niagara and Xperanto papers

  8. Set of Operators • SQL like (9): • Project, Select, Join (Theta, Outer, Semi), Groupby, Orderby, Union (Node, Outer), Cartesian Product. • XML like (4): • Tagger, Navigate, is(Element, Text), Aggregate. • Special: • SQL, Function, Source, NameColumn, FOR

  9. SQL like Operators (9)

  10. XML like Operators

  11. Special Operators

  12. Sports XML Document <sports> <organization> <team> Boston Red Sox </team> <stadium sname = "Fenway Park"/> <starPlayer> <pname> Nomar </pname> <position> Shortstop </position> </starPlayer> <starPlayer> <pname> Pedro </pname> <position> Pitcher </position> </starPlayer> <starPlayer> <pname> Manny </pname> <position> Outfield </position> </starPlayer> </organization> <organization> … </organization> <stadium> <sname> Fenway Park </sname> <capacity> 33,000 </capacity> <yearBuilt> 1912 </yearBuilt> <ticket_high rate = "55"/> <ticket_low rate = "18"/> </stadium> <stadium> … <stadium> <player name="Pedro" number="45" rookieYear = "1991" /> <player name="Nomar" number="5" rookieYear = "1997" /> <player name="Manny" number="24" rookieYear = "1993" /> </sports>

  13. Example XQuery <bestPlayers> { For $p in document("sports.xml")/sports/organization Let $a = $p/team/text() Where $a = "Boston Red Sox" Return <playerName> $p/starPlayer/pname/text() <playerName> } </bestPlayers> List all of the star players’ names on the Boston Red Sox

  14. Tagger(<bestPlayers> V1 </bestPlayers> $pname = Navigate($p, starPlayer/pname/text()) V1 := Aggregate Tagger(<playerName> $pname </playerName> $a := Navigate($p, team/text()) $p := Navigate(“/”, sports/organization) Select($a = "Boston Red Sox") Source(“sports.xml”) XAT Tree for Example Query

  15. RDBMS Tables of Sports Info PlayerInfo Organization Stadium StarPlayer

  16. Partial Default XML View <Organization> <row> <organizationId> 1 </organizationId> <teamName> Boston Red Sox </teamName> <stadiumName> Fenway Park </stadiumName> </row> </Organization> <Stadium> <row> <stadiumID> 1 </stadiumID> <sname> Fenway Park </sname> … </row> </Stadium>

  17. Challenge Question I What is the XQuery that converts the document on the left (default XML view) to the document on the right (user view)? <Organization> <row> <organizationId> 1 </organizationId> <teamName> Boston Red Sox </teamName> <stadiumName> Fenway Park </stadiumName> </row> </Organization> … <StarPlayer> <row> <starPlayerName> Nomar </starPlayerName> <starPlayerPosition> shortstop</starPlayerPosition> <organizationId> 1 </organizationId> </row> </StarPlayer> … <organization> <team> Boston Red Sox </team> <stadium sname = "Fenway Park"/> <starPlayer> <pname> Nomar </pname> <position> Shortstop </position> </starPlayer> <starPlayer> <pname> Pedro </pname> <position> Pitcher </position> </starPlayer> <starPlayer> <pname> Manny </pname> <position> Outfield </position> </starPlayer> </organization>

  18. Mapping Query Part I B2 Create view invoice as ( <sports> FOR $organization IN view ("default") /Organization/row RETURN <organization> <team> $organization/teamName/text() </team> <stadium sname = $organization/stadiumName/text() /> FOR$starPlayer IN view ("default") /StarPlayer/row WHERE$starPlayer/organizationID = $organization/organizationID RETURN <starPlayer> <pname> $starPlayer/starPlayerName/text() </pname> <position> $starPlayer/starPlayerPosition/text() </position> </starPlayer> </organization> B1

  19. Mapping Query Part II FOR$stadium IN view ("default") /Stadium/row RETURN <stadium> <sname> $stadium/sname/text() </sname> <capacity> $stadium/capacity/text() </capacity> <yearBuilt> $stadium/yearBuilt/text() </yearBuilt> <ticket_high rate = $stadium/ticket_high_rate/text() /> <ticket_low rate = $stadium/ticket_low_rate/text() /> </stadium> FOR$player IN view ("default") /PlayerInfo/row RETURN <player name = $player/playerName/text() number = $player/playerNumber/text() rookieYear = $player/rookieYear/text() /> </sports> ) B3 B4

  20. Cutting Mapping Query • The mapping query has data that is unused by the user query, so we can get rid of it • B3 and B4 are completely removed • Remove stadium from B1 • Remove position from B2

  21. Mapping Query XAT General Form B1 FOR $organization Some Stuff $organization := Navigate("/",Organization/row) FOR $starPlayer $starPlayer := Navigate("/", StarPlayer/row) Source(“default.xml”) More Stuff Source(“default.xml”) B2 • Some Stuff will be shown in Part I • More Stuff in Part II

  22. O := Tagger(<sports> All </sports) All = Aggregate Tagger(<organization> V0 </organization) V0 := Aggregate Tagger (<team>$tname </team> ) Mapping Query XAT Part I B1 FOR $organization $tname := Navigate($organization, teamName/text()) FOR $starPlayer $starPlayer := Navigate("/", StarPlayer/row) To: Part II Source("default.xml") Some Stuff

  23. Select($starPlayerID = $ID) $sname := Navigate($starPlayer, starPlayerName) $starPlayerID := Navigate($starPlayer, OrganizationID) $ID := Navigate($organization, organizationID) Mapping Query XAT Part II More Stuff To: Part I Aggregate Tagger(<starPlayer> <pname> $sname </pname> </starPlayer) B2

  24. Decorrelated Mapping XAT Part I <sports> <organization> <team> Boston Red Sox </team> <starPlayer> <pname> Nomar </pname> </starPlayer> <starPlayer> <pname>Pedro </pname> </starPlayer> <starPlayer> <pname> Manny </pname> </starPlayer> </organization> </sports> O:= Tagger(<sports> All </sports>) All = Aggregate Tagger(<organization> V0 </organization) V0 := Aggregate Tagger (<team>$tname </team> $tname := Navigate($organization, teamName/text()) From Part II

  25. $ID := Navigate($organization, organizationID) Select($starPlayerID = $ID) Cartesian Product $sname := Navigate($starPlayer, starPlayerName) $starPlayerID := Navigate($starPlayer, organizationID) Decorrelated Mapping XAT Part II To Part I Aggregate Tagger(<starPlayer> <pname> $sname </pname> </starPlayer) $starPlayer := (Navigate"/", StarPlayer/row) $organization = Navigate("/", Organization/row) Source("default.xml") Source("default.xml")

  26. Progress Report • User inputs XQuery query • User Query is converted into an XML Algebra Tree (XAT) • Database Mapping Query’s XAT generated • Queries are Decorrelated • Trees are merged, unnecessary branches cut

  27. XAT merging • Input: • User Query XAT + Mapping Query XAT • Output: • Simplified composite XAT • Approach: • The Tagger from the top of the Mapping Query is linked to the bottom of the User Query. • The Source Operator at the bottom of the User Query is deleted • Pushdown Navigation • By using the commutative rules • Cancel out the navigation operators • By using the composition rules

  28. O:= Tagger(<sports> All </sports>) All = Aggregate Tagger(<organization> V0 </organization) V0 := Aggregate Tagger (<team>$tname </team> $tname := Navigate($organization, teamName/text()) Combined XAT Top of Mapping Query Tagger(<bestPlayers> V1 </bestPlayers> V1 := Aggregate Tagger(<playerName> $pname </playerName> Select($a = "Boston Red Sox") $a := Navigate($p, team/text()) User Query $pname = Navigate($p, starPlayer/pname/text()) $p := Navigate(O, sports/organization) Rest of Mapping Query

  29. Computation Pushdown Part I What is PushDown? • After merging the 2 XATs, there may be redundancies in the larger tree. • Ex: The user query and mapping query may navigate to the same thing • The decorrelated query tree may be unorganized and inefficient • Pushdown aims to eliminate these problems

  30. Computation Pushdown Part II • XPERANTO mentions pushdown as a means of pushing computation to relational engine • Niagara defines equivalence rules and specifies several different heuristics for using the rules

  31. XAT Pushdown Example Part I Top of Mapping Query Tagger(<bestPlayers> V1 </bestPlayers> V1 := Aggregate O:= Tagger(<sports> All </sports>) Tagger(<playerName> $pname </playerName> All = Aggregate Tagger(<organization> V0 </organization) Select($a = "Boston Red Sox") V0 := Aggregate $a := Navigate($p, team/text()) Tagger (<team>$tname </team> User Query $pname = Navigate($p, starPlayer/pname/text()) $tname := Navigate($organization, teamName/text()) $p := Navigate(O, sports/organization) Rest of Mapping Query

  32. XAT Pushdown Example Part II Top of Mapping Query Tagger(<bestPlayers> V1 </bestPlayers> V1 := Aggregate O:= Tagger(<sports> All </sports>) Tagger(<playerName> $pname </playerName> All = Aggregate Tagger(<organization> V0 </organization) Select($a = "Boston Red Sox") V0 := Aggregate $a := Navigate($p, team/text()) Tagger (<team>$tname </team> User Query $pname = Navigate($p, starPlayer/pname/text()) $tname := Navigate($organization, teamName/text()) $p := Navigate(O, sports/organization) Rest of Mapping Query

  33. XAT Pushdown Example Part III Tagger(<starPlayer> <pname> $sname </pname> </starPlayer) Tagger(<bestPlayers> V1 </bestPlayers> Select($starPlayerID = $ID) V1 := Aggregate $sname := Navigate($starPlayer, starPlayerName) Tagger(<playerName> $pname </playerName> $starPlayerID := Navigate($starPlayer, organizationID) Select($a = "Boston Red Sox") $ID := Navigate($organization, organizationID) Cartesian Product $a := Navigate($p, team/text()) $pname = Navigate($p, starPlayer/pname/text()) $starPlayer := (Navigate"/", StarPlayer/row) $organization = Navigate("/", Organization/row) User Query $p := Navigate(O, sports/organization) Source("default.xml") Source("default.xml")

  34. XAT Pushdown Example Part IV Tagger(<starPlayer> <pname> $sname </pname> </starPlayer) Tagger(<bestPlayers> V1 </bestPlayers> Select($starPlayerID = $ID) V1 := Aggregate $sname := Navigate($starPlayer, starPlayerName) Tagger(<playerName> $pname </playerName> $starPlayerID := Navigate($starPlayer, organizationID) Select($a = "Boston Red Sox") $ID := Navigate($organization, organizationID) Cartesian Product $a := Navigate($p, team/text()) $pname = Navigate($p, starPlayer/pname/text()) $starPlayer := (Navigate"/", StarPlayer/row) $organization = Navigate("/", Organization/row) User Query $p := Navigate(O, sports/organization) Source("default.xml") Source("default.xml")

  35. Challenge Questions II & III • What are some of the heuristics we could use during Pushdown? • What can / should we try to accomplish? • What should the tree look like afterwards? • How could we go about pushing things down? • What would the algorithm be? • How do we know if an operator can be pushed down? • When do we stop pushing an operator down?

  36. Computation Pushdown Part III • Goal: Tagger + SQL operators + XML operators • Use Equivalence rules repository to swap operators • Step 1: Navigation Pushdown. • Cancel Mapping Query Taggers and corresponding Aggregates • Delete redundant Navigates from User Query • Rename columns in Mapping Query • Step 2: SQL Computation Pushdown. • By commutative and composition rules.

  37. Equivalence Rules • Pair-wise rules that determine if one operator (parent) may be pushed through another (child) • Navigate / Navigate rule: If the parent depends on the child, they may not be swapped • Navigate / Join: Navigate is pushed to the side of the join that its entry point comes from • And many, many more

  38. Pushdown Results • Push Navigates to the correct side of Cartesian Product • Create a NameColumn operator that renames $tname into $a • Create a 2nd NameColumn operator that renames $pname into $sname • Get rid of all Taggers and Aggregates from Mapping Query and Navigates that were crossed out from User Query • Merge Select($starPlayerID = $ID) and Cartesian into a Join

  39. Tagger(<bestPlayers> V1 </bestPlayers> V1 := Aggregate Tagger(<playerName> $pname </playerName> Select($a = "Boston Red Sox") NameColumn( $a = $tname) NameColumn( $pname = $sname) XAT After Computation PushDown Part I From Part II

  40. $tname := Navigate($organization, teamName/text()) $starPlayerID := Navigate($starPlayer, OrganizationID) $ID := Navigate($organization, organizationID) $sname := Navigate($starPlayer, starPlayerName) $organization := Navigate("/",Organization/row) $starPlayer := Navigate("/", StarPlayer/row) Source("default.xml") Source("default.xml") XAT After Computation PushDown Part II To Part I Join on ($ID = $starPlayerID)

  41. Rest of the Process • Take the Combined XAT from the previous slide and generate a single SQL query. • Execute query on local RDBMS • Format result tuples according to Tagger • Return XML document to user

  42. Summary • Created XAT of the user query • Created XAT for mapping query • Cut information unused by user query • Decorrelated Mapping query • Merged two queries into 1 larger XAT • Identified weaknesses in combined tree • Walked through pushdown steps • Displayed final, optimized tree

  43. The End!!!

More Related