1 / 18

Storing and Querying Ordered XML Using a Relational Database System

Storing and Querying Ordered XML Using a Relational Database System. By Khang Nguyen Based on the paper of Igor Tatarinov and Statis Viglas. Introduction.

damian-snow
Télécharger la présentation

Storing and Querying Ordered XML Using a Relational Database System

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Storing and Querying Ordered XML Using a Relational Database System By Khang Nguyen Based on the paper of Igor Tatarinov and Statis Viglas

  2. Introduction • Researchers proposed using relational database system to decompose XML documents into relations and translate XML queries into SQL queries over these relation. • The paper will answer the question of how (and whether) the ordered XML data model can be effectively supported by the unordered relational data model.

  3. Introduction (Cont.) • The paper proposes three order encoding methods (Global Order, Local Order, and Dewey Order) to represent XML order in the relational data model. • The paper will also answer the question of when and why to use which encoding method.

  4. 1. Ordered XML: Data Model, Query Language and Query Dimensions • 1.1 The XML Data Model • An XML can be viewed as a tree. • Leaf Nodes = Data Values (text). • Internal Nodes = XML elements. • Document Order = the orders of the elements in the XML document.

  5. 1. Ordered XML: Data Model, Query Language and Query Dimensions • 1.2 Order in XML Query Language • 1.2.1 XPath • is a language for specifying navigation within an XML document. • An XPath expression’s syntax: • Path ::= /Step1/Step2/…/Stepn • Path ::= /films/film/writers/writer/Murray Burnett • An XPath expression is evaluated sequentially, “step” by “step”.

  6. 1. Ordered XML: Data Model, Query Language and Query Dimensions (Cont.) • 1.2.2 XQuery • Is a more complex language based on XPath. • Has all the functionalities of XPath. • Includes Before and After operators that take two node sequences (XPath expressions) and • Return the nodes from the first sequence that are before or after some node in the second sequence.

  7. 1. Ordered XML: Data Model, Query Language and Query Dimensions (Cont.) • 1.3 Evaluation Modes for XML Queries • 1.3.1 Select Mode: • The nodes in an input XML document are assumed to have unique identifiers (IDs). • The results of executing an XPath expression is an ordered set of node IDs. • 1.3.2 Reconstruct Mode: • Combines selection and extraction. • The result of evaluating an XPath expression in reconstruct mode is an ordered set of XML elements.

  8. 2. XML Order Encoding Methods • To capture the document order in the relational data model is accomplished by encoding each node’s position in an XML document as a data value. • Unfortunately, there is no single encoding method, which is optimal for both queries and updates.

  9. 2. XML Order Encoding Methods (Cont.) • 2.1 Global Order Encoding: • Each node is assigned a number that represents the node’s absolute position in the document. • Poor insertion performance is the primary weakness. • To improve updates’ performance, use sparse numbering which does not require the remaining nodes be numbered when deleting XML fragments.

  10. 1 play 2 4 title act act 3 5 7 #text title scene 6 #text Global Order Global Order Encoding

  11. 2. XML Order Encoding Methods (Cont.) • 2.3 Local (Sibling) Order Encoding: • Each node is assigned a number that represents its relative position among its siblings. • Combining a node’s position with that of its ancestors yields a path vector that uniquely identifies the absolute position of the node within the document. • The advantage is the low overhead incurred by updates.

  12. 1 play 3 1 2 title act act 1 1 2 #text title scene 1 #text Local Order Local Order Encoding

  13. 2. XML Order Encoding Methods (Cont.) • 2.3 Dewey Order Encoding: • Each node is assigned a vector that represents the path from the document’s root to the node. • Is “lossless” because each path uniquely identifies the absolute position of the node within the document. • One potential disadvantage is the extra space required to store paths from the root to each node.

  14. 1 play 1.3 1.1 1.2 title act act 1.1.1 1.2.1 1.2.2 #text title scene 1.2.1.1 #text Dewey Order Dewey Order Encoding

  15. 3. Shredding Ordered XML into Relations • 3.1 The Schema-less Case: • The schema of input document is unknown. • The Edge shredding approach is proposed. • A single relation, the Edge table, is used to store an entire document. • The Edge table is defined • Edge(id, parent, name, value) • To reduce storage, a separate relation (the Path table) can be used to store paths and their identifiers.

  16. 3. Shredding Ordered XML into Relations (Cont.) • 3.1.1 Storing Order Information • Global Order • Edge(id, parent_id, end_des_id, path_id, value) • Local Order • Edge(id, parent_id, sIndex, path_id, value) • Dewey Order • Edge(dewey, path_id, value)

  17. 3. Shredding Ordered XML into Relations (Cont.) • 3.2 The Schema-aware Case: • Inlining shredding technique can be used when an XML Schema (DTD) is known. • One advantage is the possibility of more efficient navigation from an element to its sub-elements. • Another advantage is Inlining shreds and stores XML documents into a set of tables. Queries tend to access less data and perform better.

  18. Summary • Relational database systems can support most ordered XML queries efficiently. • Global Order is best for query-mostly workloads. Dewey Order is best for a mix of queries and updates. Local Order is best for update-intensive environments. • Edge and Inlining methods are well known for translating ordered XML queries into SQL queries.

More Related