
XML, XML Schema, XPath and XQuery Query Languages



Presentation Transcript


  1. XML, XML Schema, XPath and XQuery Query Languages. CS561. Slides collated from several sources, including D. Suciu at Univ. of Washington.

  2. XML Data

  3. XML: W3C standard to complement HTML • origins: structured text (SGML) • motivation: HTML describes presentation, XML describes content • HTML and XML both derive from SGML (XML is a subset of SGML) CS561 - Spring 2007.

  4. From HTML to XML: HTML describes the presentation

  5. HTML <h1> Bibliography </h1> <p> <i> Foundations of Databases </i> Abiteboul, Hull, Vianu <br> Addison Wesley, 1995 <p> <i> Data on the Web </i> Abiteboul, Buneman, Suciu <br> Morgan Kaufmann, 1999

  6. XML <bibliography> <book> <title> Foundations… </title> <author> Abiteboul </author> <author> Hull </author> <author> Vianu </author> <publisher> Addison Wesley </publisher> <year> 1995 </year> </book> … </bibliography> XML describes the content

  7. XML Terminology • tags: book, title, author, … • start tag: <book>, end tag: </book> • elements: <book>…</book>, <author>…</author> • elements are nested • empty element: <red></red>, abbreviated <red/> • an XML document has a single root element • a well-formed XML document has matching tags
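A minimal sketch of the well-formedness rules above, assuming Python's standard `xml.etree.ElementTree` parser as the checker. The helper name `is_well_formed` and the sample strings are illustrative and not part of the slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text parses as XML: one root element, matching tags."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical inputs for illustration.
good = "<book><title>Foundations</title><red/></book>"
bad_unmatched = "<book><title>Foundations</book>"  # <title> is never closed
bad_two_roots = "<book/><book/>"                   # more than one root element

results = [is_well_formed(s) for s in (good, bad_unmatched, bad_two_roots)]
```

A parser rejects both failure cases before any schema or DTD is consulted, which is the difference between well-formedness and validity.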

  8. XML: Attributes <book price="55" currency="USD"> <title> Foundations of Databases </title> <author> Abiteboul </author> … <year> 1995 </year> </book> attributes are alternative ways to represent data

  9. More XML: Oids and References <person id="o555"> <name> Jane </name> </person> <person id="o456"> <name> Mary </name> <children idref="o123 o555"/> </person> <person id="o123" mother="o456"> <name> John </name> </person> oids and references in XML are just syntax
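Because oids and references are just syntax, an application has to resolve them itself. The sketch below, in Python with the standard library, builds an id index and follows the references by hand. The wrapping `<people>` root element is an assumption added so the fragment parses as a single document.

```python
import xml.etree.ElementTree as ET

# The <people> wrapper is hypothetical; the slide shows only the person elements.
doc = ET.fromstring("""
<people>
  <person id="o555"><name>Jane</name></person>
  <person id="o456"><name>Mary</name><children idref="o123 o555"/></person>
  <person id="o123" mother="o456"><name>John</name></person>
</people>
""")

# Index every person by its id attribute.
by_id = {p.get("id"): p for p in doc.iter("person")}

# Follow Mary's children references (a space-separated list of ids).
mary = by_id["o456"]
child_ids = mary.find("children").get("idref").split()
child_names = [by_id[i].findtext("name") for i in child_ids]

# Follow John's mother reference back to Mary.
mother_name = by_id[by_id["o123"].get("mother")].findtext("name")
```

Nothing in the parser links `idref` to `id`; the dictionary lookup is the only thing that makes the reference meaningful here.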

  10. So Far • Differences between “xml data” versus “relational data”? • Data model? • Typed? • Homogeneity? • Correctness? • Usage/Purpose?

  11. “XML Data Model” Numerous competing models: • Document Object Model (DOM): • class hierarchy (node, element, attribute, …) • defines API to inspect/modify the document • XML query data model (formal)

  12. XML Namespaces • http://www.w3.org/TR/REC-xml-names • name ::= [prefix:]localpart <book xmlns:isbn="www.isbn-org.org/def"> <title> … </title> <number> 15 </number> <isbn:number> …. </isbn:number> </book>

  13. XML Namespaces • syntactic: <number>, <isbn:number> • semantic: provide URL for “shared” schema <tag xmlns:mystyle="http://…"> (the prefix mystyle is defined here) … <mystyle:title> … </mystyle:title> <mystyle:number> … </tag>
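As a rough illustration of how a prefix separates two elements with the same local name, the following Python sketch parses a version of the book example with `xml.etree.ElementTree`. The element contents that the slides elide are replaced by placeholder values, which are assumptions.

```python
import xml.etree.ElementTree as ET

# Title and ISBN text are hypothetical placeholders for the elided content.
xml_text = """
<book xmlns:isbn="www.isbn-org.org/def">
  <title>Some title</title>
  <number>15</number>
  <isbn:number>0-000-00000-0</isbn:number>
</book>
"""
book = ET.fromstring(xml_text)

# Map the prefix used in queries to the namespace name declared in the document.
ns = {"isbn": "www.isbn-org.org/def"}

plain = book.find("number")               # unqualified <number>
qualified = book.find("isbn:number", ns)  # <isbn:number>

plain_tag = plain.tag          # "number"
qualified_tag = qualified.tag  # "{www.isbn-org.org/def}number"
```

The parser expands the prefix into the namespace name, so the two `number` elements end up with different names even though their local parts match.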

  14. So Far • What are “namespaces” good for? • Are they typically available for relational databases?

  15. Schemas for XML

  16. DTD - Element Type Definitions <!ELEMENT paper (title, author*, year, (journal|conference))>
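A DTD content model is a regular expression over child element names. One way to see this, sketched here in Python, is to flatten an element's children into a string of tag names and match it against an ordinary regex. This is only an illustration of the idea and not how a validating parser is implemented; `PAPER_MODEL` and `matches_paper_model` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# (title, author*, year, (journal|conference)) written over "tag " tokens.
PAPER_MODEL = re.compile(r"title (author )*year (journal|conference) ")

def matches_paper_model(paper: ET.Element) -> bool:
    """Check the sequence of child tags of <paper> against the content model."""
    child_tags = "".join(child.tag + " " for child in paper)
    return PAPER_MODEL.fullmatch(child_tags) is not None

ok_paper = ET.fromstring(
    "<paper><title/><author/><author/><year/><journal/></paper>")
no_authors = ET.fromstring("<paper><title/><year/><conference/></paper>")
missing_year = ET.fromstring("<paper><title/><author/><journal/></paper>")
both_venues = ET.fromstring(
    "<paper><title/><year/><journal/><conference/></paper>")
```

The sketch checks only the order and multiplicity of children, which is all an element type definition of this form constrains.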

  17. XML Schemas • generalize DTDs (an SGML derivative) • use XML syntax instead of DTD syntax • two main documents: structure and data types • XML Schema is more powerful but more complex

  18. XML Schema
      <xsd:element name="paper" type="papertype"/>
      <xsd:complexType name="papertype">
        <xsd:sequence>
          <xsd:element name="title" type="xsd:string"/>
          <xsd:element name="author" minOccurs="0" maxOccurs="unbounded"/>
          <xsd:element name="year"/>
          <xsd:choice>
            <xsd:element name="journal"/>
            <xsd:element name="conference"/>
          </xsd:choice>
        </xsd:sequence>
      </xsd:complexType>
      DTD: <!ELEMENT paper (title, author*, year, (journal|conference))>

  19. So Far • Differences between “xml schema” versus “relational schema”? • Purpose? Do we need it? • Definition time? • Strictness of typing? • Underlying model?

  20. Elements versus Types in XML Schema
      DTD: <!ELEMENT person (name, address)>
      <xsd:element name="person">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="name" type="xsd:string"/>
            <xsd:element name="address" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="person" type="ttt"/>
      <xsd:complexType name="ttt">
        <xsd:sequence>
          <xsd:element name="name" type="xsd:string"/>
          <xsd:element name="address" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>

  21. Elements versus Types in XML Schema • Types: • Simple types (integers, strings, ...) • Complex types (regular expressions, like in DTDs) • Element-type-element alternation: • Root element has a complex type • Complex type is a regular expression of elements • Those elements have their complex types ... • ... • Leaves have simple types

  22. Local and Global Types in XML Schema • Local type: <xsd:element name="person"> [define locally the person’s type] </xsd:element> • Global type: <xsd:element name="person" type="ttt"/> <xsd:complexType name="ttt"> [define here the type ttt] </xsd:complexType> • Global types can be reused in other elements

  23. Local vs. Global Elements in XML Schema • Local element: <xsd:complexType name="ttt"> <xsd:sequence> <xsd:element name="address" type="..."/> ... </xsd:sequence> </xsd:complexType> • Global element: <xsd:element name="address" type="..."/> <xsd:complexType name="ttt"> <xsd:sequence> <xsd:element ref="address"/> ... </xsd:sequence> </xsd:complexType> • Global elements: like in DTDs

  24. Regular Expressions in XML Schema Recall the element-type-element alternation: <xsd:complexType name="...."> [regular expression on elements] </xsd:complexType> Regular expressions: • <xsd:sequence> A B C </...> • <xsd:choice> A B C </...> • <xsd:group> A B C </...> • <xsd:... minOccurs="0" maxOccurs="unbounded"> .. </...> • <xsd:... minOccurs="0" maxOccurs="1"> .. </...>

  25. Regular Expressions in XML Schema Regular expressions: • <xsd:sequence> A B C </...> = A B C • <xsd:choice> A B C </...> = A | B | C • <xsd:group> A B C </...> = (A B C) • <xsd:... minOccurs="0" maxOccurs="unbounded"> .. </...> = (...)* • <xsd:... minOccurs="0" maxOccurs="1"> .. </...> = (...)?
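The correspondence between occurrence constraints and regular-expression operators can be written down as a small lookup. The Python helper below is a sketch with a hypothetical name, `occurs_to_quantifier`; it relies on the XML Schema defaults of `minOccurs="1"` and `maxOccurs="1"` when the attributes are omitted.

```python
def occurs_to_quantifier(min_occurs=1, max_occurs=1):
    """Map minOccurs/maxOccurs to the regex quantifier with the same meaning.

    max_occurs is an int or the string "unbounded", as in XML Schema.
    """
    if max_occurs == "unbounded":
        if min_occurs == 0:
            return "*"
        if min_occurs == 1:
            return "+"
        return "{%d,}" % min_occurs
    if (min_occurs, max_occurs) == (1, 1):
        return ""   # exactly once: no quantifier needed
    if (min_occurs, max_occurs) == (0, 1):
        return "?"
    return "{%d,%d}" % (min_occurs, max_occurs)

star = occurs_to_quantifier(0, "unbounded")  # (...)*
optional = occurs_to_quantifier(0, 1)        # (...)?
```

Bounded ranges such as 2 to 5 occurrences have no DTD operator, which is one place where XML Schema is strictly more expressive than the DTD notation.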

  26. Derived Types by Extensions
      <complexType name="Address">
        <sequence>
          <element name="street" type="string"/>
          <element name="city" type="string"/>
        </sequence>
      </complexType>
      <complexType name="USAddress">
        <complexContent>
          <extension base="ipo:Address">
            <sequence>
              <element name="state" type="ipo:USState"/>
              <element name="zip" type="positiveInteger"/>
            </sequence>
          </extension>
        </complexContent>
      </complexType>
      Corresponds to inheritance
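The inheritance analogy can be made concrete with a pair of Python dataclasses. This is only an analogy sketched under assumptions: `ipo:USState` is modeled as a plain string, `positiveInteger` as `int`, and the sample address values are invented.

```python
from dataclasses import dataclass, fields

@dataclass
class Address:
    street: str
    city: str

@dataclass
class USAddress(Address):
    # Extension appends new members after the inherited ones,
    # just as the extended sequence follows the base sequence.
    state: str
    zip: int

# Hypothetical sample value.
addr = USAddress("1 Main St", "Springfield", "IL", 62701)
field_order = [f.name for f in fields(USAddress)]
```

As with type extension in XML Schema, a `USAddress` can be used wherever an `Address` is expected, and the derived type lists the base members first.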

  27. Key Constraints in XML

  28. Keys in XML Schema
      XML:
      <purchaseReport>
        <regions>
          <zip code="95819">
            <part number="872-AA" quantity="1"/>
            <part number="926-AA" quantity="1"/>
            <part number="833-AA" quantity="1"/>
            <part number="455-BX" quantity="1"/>
          </zip>
          <zip code="63143">
            <part number="455-BX" quantity="4"/>
          </zip>
        </regions>
        <parts>
          <part number="872-AA">Lawnmower</part>
          <part number="926-AA">Baby Monitor</part>
          <part number="833-AA">Lapis Necklace</part>
          <part number="455-BX">Sturdy Shelves</part>
        </parts>
      </purchaseReport>
      XML Schema for Key:
      <key name="NumKey">
        <selector xpath="parts/part"/>
        <field xpath="@number"/>
      </key>
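What the NumKey constraint demands can be checked by hand. The Python sketch below selects nodes with a path and verifies that the chosen attribute exists on every selected node and is unique. The helper `check_key` is hypothetical and, as a simplification, accepts only a single attribute name as the field.

```python
import xml.etree.ElementTree as ET

report = ET.fromstring("""
<purchaseReport>
  <regions>
    <zip code="95819">
      <part number="872-AA" quantity="1"/>
      <part number="926-AA" quantity="1"/>
      <part number="833-AA" quantity="1"/>
      <part number="455-BX" quantity="1"/>
    </zip>
    <zip code="63143">
      <part number="455-BX" quantity="4"/>
    </zip>
  </regions>
  <parts>
    <part number="872-AA">Lawnmower</part>
    <part number="926-AA">Baby Monitor</part>
    <part number="833-AA">Lapis Necklace</part>
    <part number="455-BX">Sturdy Shelves</part>
  </parts>
</purchaseReport>
""")

def check_key(root, selector, field_attr):
    """Return (ok, values): ok means every selected node has the attribute
    and no value occurs twice."""
    values = [node.get(field_attr) for node in root.findall(selector)]
    ok = None not in values and len(values) == len(set(values))
    return ok, values

parts_ok, part_numbers = check_key(report, "parts/part", "number")
regions_ok, _ = check_key(report, "regions/zip/part", "number")
```

The same attribute is a key under `parts/part` but not under `regions/zip/part`, where 455-BX appears twice, so the selector path is part of what the constraint means.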

  29. Keys in XML Schema • In general, syntax is: <key name="someDummyNameHere"> <selector xpath="p"/> <field xpath="p1"/> <field xpath="p2"/> . . . <field xpath="pk"/> </key> Notes: • All XPath expressions “start” at the element currently being defined • The fields must identify a single “node”

  30. Keys in XML Schema • Unique = guarantees uniqueness • Key = guarantees uniqueness and existence • All XPath expressions are “restricted”: • /a/b | /a/c OK for selector • //a/b/*/c OK for field • Note: better than DTD’s ID mechanism

  31. Examples of Keys in XML Schema • Examples: <key name="fullName"> <selector xpath=".//person"/> <field xpath="firstname"/> <field xpath="surname"/> </key> <unique name="nearlyID"> <selector xpath=".//*"/> <field xpath="@id"/> </unique> Note: each person must have a single firstname and a single surname

  32. Foreign Keys in XML Schema • Example: <keyref name="personRef" refer="fullName"> <selector xpath=".//personPointer"/> <field xpath="@first"/> <field xpath="@last"/> </keyref>
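The personRef constraint says that every (first, last) pair on a personPointer must match some (firstname, surname) pair selected by the fullName key. A hand-rolled check might look like the following Python sketch; the sample document, including its `<root>` element and all names in it, is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; the slides give only the constraints.
doc = ET.fromstring("""
<root>
  <person><firstname>Jane</firstname><surname>Doe</surname></person>
  <person><firstname>John</firstname><surname>Roe</surname></person>
  <personPointer first="Jane" last="Doe"/>
  <personPointer first="Jim" last="Poe"/>
</root>
""")

# Values of the key "fullName": one (firstname, surname) tuple per person.
key_values = {
    (p.findtext("firstname"), p.findtext("surname"))
    for p in doc.findall(".//person")
}

# References that do not match any key value violate the keyref.
dangling = [
    (r.get("first"), r.get("last"))
    for r in doc.findall(".//personPointer")
    if (r.get("first"), r.get("last")) not in key_values
]
```

Here the second pointer is dangling, which is the XML counterpart of a foreign-key violation in a relational database.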

  33. So Far • Differences between “keys/foreign-keys” in xml versus relational model? • Purpose? • Underlying model?

  34. XPath “The Basic Building Block”

  35. XPath • Goal = Permit access to some nodes of the document • XPath main construct: axis navigation • Navigation step: axis + node-test + predicates • Examples • descendant::node() • child::author • attribute::booktitle = "XML"

  36. XPath • An XPath path consists of one or more navigation steps, separated by “/” • Navigation step: axis + node-test + predicates • Examples • /descendant::node()/child::author • /descendant::node()/child::author[parent/attribute::booktitle = "XML"][2] • XPath offers shortcuts: • no axis means child • // ≡ /descendant-or-self::node()/

  37. XPath - Child Axis Navigation [Figure: example tree below the context node, with nodes labeled aaa, bbb, ccc and numbered 1-7] • author is shorthand for child::author • Examples: • aaa -- all the children nodes labeled aaa • aaa/bbb -- all the bbb grandchildren of aaa children • */bbb -- all the bbb grandchildren of any child • Notes: • . -- the context node • / -- the root node

  38. XPath - Child Axis Navigation • /doc -- all doc children of the root • ./aaa -- all aaa children of the context node (equivalent to aaa) • text() -- all text children of the context node • node() -- all children of the context node (includes text nodes, but not attribute nodes, which are reached through the attribute axis) • .. -- parent of the context node • .// -- the context node and all its descendants • // -- the root node and all its descendants • //text() -- all the text nodes in the document
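Several of these child-axis paths can be tried out with Python's `xml.etree.ElementTree`, which supports a limited subset of XPath. The sketch below uses an invented document and treats its root element as the context node.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; its root element <doc> plays the role of context node.
ctx = ET.fromstring("""
<doc>
  <aaa><bbb>1</bbb><bbb>2</bbb></aaa>
  <ccc><bbb>3</bbb></ccc>
  <aaa>text</aaa>
</doc>
""")

aaa_children = ctx.findall("aaa")                   # aaa: children labeled aaa
aaa_bbb = [e.text for e in ctx.findall("aaa/bbb")]  # bbb grandchildren via aaa
any_bbb = [e.text for e in ctx.findall("*/bbb")]    # bbb grandchildren via any child
all_bbb = [e.text for e in ctx.findall(".//bbb")]   # bbb at any depth below context
```

ElementTree covers only part of XPath (for example, it has no `text()` or `node()` node tests), so a full XPath engine is needed for the remaining expressions on the slide.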

  39. Predicates • [2] -- the second child node of the context node • chapter[5] -- the fifth chapter child of the context node • [last()] -- the last child node of the context node • chapter[title = "introduction"] -- the chapter children of the context node that have one or more title children whose string-value is “introduction” (the string-value is the concatenation of all text in descendant text nodes) • person[.//firstname = "joe"] -- the person children of the context node that have among their descendants a firstname element with string-value “joe”
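Positional and value predicates of this kind are among the few that Python's `xml.etree.ElementTree` supports, so they can be tried directly. The chapter titles in this sketch are invented sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical book with three chapters.
book = ET.fromstring("""
<book>
  <chapter><title>introduction</title></chapter>
  <chapter><title>data model</title></chapter>
  <chapter><title>queries</title></chapter>
</book>
""")

second = book.find("chapter[2]")                       # second chapter child
last = book.find("chapter[last()]")                    # last chapter child
intro = book.findall("chapter[title='introduction']")  # chapters with that title

second_title = second.findtext("title")
last_title = last.findtext("title")
```

Positions count from 1, and the value predicate compares the text content of the title child, matching the string-value description above.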

  40. Axis navigation • So far, our expressions have moved us down by moving to children nodes. • Exceptions are: • . stay where you are • / go to the root • // all descendants of the root • .// all descendants of the context node

  41. Axis navigation • XPath has several axes: ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, self • Some of these describe single nodes: • self, parent • Some describe sequences of nodes: • All others

  42. XPath Navigation Axes [Figure: diagram of the navigation axes relative to the self node: ancestor, preceding-sibling, following-sibling, child, attribute, preceding, following, namespace, descendant]

  43. XPath Abbreviated Syntax • (nothing) -- child:: • @ -- attribute:: • // -- /descendant-or-self::node() • . -- self::node() • .// -- descendant-or-self::node() • .. -- parent::node() • / -- (document root)

  44. So Far Differences between SQL and XPath? • What are similar query capabilities? • What features does SQL have, but not XPath? • What features does XPath support, but not SQL? • Is XPath a full-fledged query language?

  45. Query Languages - XQuery

  46. Summary of XQuery • FLWR expressions • FOR and LET expressions • Collections and sorting • Resources: • XQuery: A Query Language for XML, Chamberlin, Florescu, et al. • W3C recommendation: www.w3.org/TR/xquery/

  47. XQuery • Designed based on Quilt (which is based on XML-QL) • http://www.w3.org/TR/xquery/ (2/2001) • XML Query data model (ordered)

  48. FLWR (“Flower”) Expressions FOR ... LET ... FOR ... LET ... WHERE ... RETURN ...

  49. XQuery Find the titles of all books published after 1995: FOR $x IN document("bib.xml")/bib/book WHERE $x/year > 1995 RETURN $x/title What does the result look like?

  50. XQuery Find the titles of all books published after 1995: FOR $x IN document("bib.xml")/bib/book WHERE $x/year > 1995 RETURN $x/title Result: <title> abc </title> <title> def </title> <title> ghi </title>
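The FOR / WHERE / RETURN structure of this query maps closely onto a list comprehension. The Python sketch below mimics it over an inline document, since the contents of bib.xml are not given; the sample books are invented, and this is an analogy to the query rather than an XQuery implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
bib = ET.fromstring(
    "<bib>"
    "<book><title>abc</title><year>1999</year></book>"
    "<book><title>old</title><year>1995</year></book>"
    "<book><title>def</title><year>2001</year></book>"
    "</bib>"
)

titles = [
    ET.tostring(x.find("title"), encoding="unicode")  # RETURN $x/title
    for x in bib.findall("book")                      # FOR $x IN .../bib/book
    if int(x.findtext("year")) > 1995                 # WHERE $x/year > 1995
]
```

Each binding of `$x` is one book element, the WHERE clause filters bindings, and the result is a sequence of title elements in document order.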
