Processing XML data using a relational database: Schema-Based XML Storage

By Khang Nguyen. Based on the paper by Rajasekar Krishnamurthy.

Three main points on the query translation problem
• Developing query translation algorithms for the case when the XML Schema and/or the XML query may be recursive.
• Designing algorithms that make better use of the XML-to-Relational mapping information during the query translation process.
• Studying the interaction between the two sub problems: choosing a good relational decomposition for storing the XML data and choosing a query translation algorithm.

Recursive Schemas and Recursive Queries
• There has been a lot of work on alternative relational decompositions for XML data, but not much on query translation algorithms.
• [Choi02]: out of 60 XML schemas analyzed, 35 were recursive, so recursive XML schemas are important.
• The descendant operator (//) specifies ancestor-descendant relationships.
• For example, the query //section/title is a recursive query.

Recursive Schemas and Recursive Queries (Cont.): Interesting Issues
• How do we translate path expression queries over arbitrary XML-to-Relational mappings into equivalent SQL queries?
• Is the support for recursion in SQL3 sufficient for supporting path expression queries over arbitrary XML-to-Relational mappings?
• Are there any issues in the translation process when the XML schema is non-recursive?
• Does XPath semantics introduce any interesting challenges?
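To make the SQL3 recursion question concrete, here is a minimal sketch of how a query like //section/title could be answered with a recursive common table expression when sections may nest inside sections. It is an illustration under stated assumptions, not the paper's translation algorithm. The Book and Section tables follow the slides' mapping-aware example; the sample rows, the meaning parentcode = 2 (parent is a Section), and the use of SQLite's WITH RECURSIVE as a stand-in for SQL3 recursion are all assumptions.

```python
import sqlite3

# In-memory database with a hypothetical recursive decomposition:
# a Section's parent is either a Book (parentcode = 1, as in the slides)
# or another Section (parentcode = 2, an assumption made for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book (id INTEGER PRIMARY KEY);
CREATE TABLE Section (
    id INTEGER PRIMARY KEY,
    parentid INTEGER,
    parentcode INTEGER,
    title TEXT
);
INSERT INTO Book VALUES (1);
INSERT INTO Section VALUES (1, 1, 1, 'Introduction');
INSERT INTO Section VALUES (2, 1, 1, 'Storage');
INSERT INTO Section VALUES (3, 2, 2, 'Shredding');
INSERT INTO Section VALUES (4, 3, 2, 'Inlining');
""")

# Start from the sections directly under a book, then repeatedly
# follow section-to-section parent links until no new rows appear.
recursive_sql = """
WITH RECURSIVE AllSections(id, title) AS (
    SELECT S.id, S.title
    FROM Book B JOIN Section S ON S.parentid = B.id
    WHERE S.parentcode = 1
    UNION ALL
    SELECT C.id, C.title
    FROM AllSections P JOIN Section C ON C.parentid = P.id
    WHERE C.parentcode = 2
)
SELECT title FROM AllSections ORDER BY id
"""

descendant_titles = [row[0] for row in conn.execute(recursive_sql)]
print(descendant_titles)
```

The recursion walks the document from the root downward, which is what a translation that ignores the mapping would have to do. Whether such recursion is sufficient for arbitrary mappings is exactly the open question the slide raises.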

Mapping-aware Query Translation Algorithm

Mapping-aware Query Translation Algorithm (Cont.)
• Query: retrieve all the top-level section titles.
• XQuery:
    for $title in document(*)/book/section/title
• SQL query:
    Select S.title
    From Book B, Section S
    Where B.id = S.parentid and S.parentcode = 1
• Mapping-aware algorithm query:
    Select title
    From Section
    Where parentcode = 1
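The equivalence of the two SQL queries can be checked on a small instance. The sketch below loads hypothetical sample rows into SQLite and runs both queries; the column types, the sample data, and the meaning parentcode = 2 (parent is a Section) are assumptions added for illustration. Section ids deliberately overlap with the Book id so that the parentcode filter actually matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book (id INTEGER PRIMARY KEY);
CREATE TABLE Section (
    id INTEGER PRIMARY KEY,
    parentid INTEGER,
    parentcode INTEGER,  -- 1 = parent is a Book (from the slides); 2 = parent is a Section (assumed)
    title TEXT
);
INSERT INTO Book VALUES (1);
INSERT INTO Section VALUES (1, 1, 1, 'Introduction');
INSERT INTO Section VALUES (2, 1, 1, 'Storage');
-- Nested section: its parentid 1 refers to Section 1, not Book 1.
INSERT INTO Section VALUES (3, 1, 2, 'Shredding');
""")

# Query with the join against Book, as written in the slides
# (ORDER BY added only to make the comparison deterministic).
naive_sql = """
SELECT S.title
FROM Book B, Section S
WHERE B.id = S.parentid AND S.parentcode = 1
ORDER BY S.id
"""

# Mapping-aware query: parentcode = 1 already implies a Book parent,
# so the join with Book is dropped.
mapping_aware_sql = """
SELECT title
FROM Section
WHERE parentcode = 1
ORDER BY id
"""

naive_titles = [row[0] for row in conn.execute(naive_sql)]
mapping_aware_titles = [row[0] for row in conn.execute(mapping_aware_sql)]
print(naive_titles)
print(mapping_aware_titles)
```

Dropping the join is only safe if every Section row with parentcode = 1 really has a matching Book row. That guarantee comes from the XML-to-Relational mapping, which is the information the mapping-aware algorithm exploits.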

Are the two sub problems independent?
• One sub problem is to pick a good relational decomposition; the other is to translate queries over the resulting XML-to-Relational mapping.
• The two sub problems cannot be solved in isolation.
• There exist query translation algorithms T1 and T2 and relational decompositions D1 and D2 such that D1 is better than D2 when T1 is used, while D2 is better than D1 when T2 is used.

Yes, the two sub problems are dependent

Yes, the two sub problems are dependent (Cont.)
• On the 100MB XMark dataset [11], we noticed that XQ2fg was about three times faster than XQ2fp.
• So, for query Q, the fully partitioned strategy is better with algorithm NaiveTranslation, whereas the fully grouped strategy is better with algorithm MultipleScan.
• As a result, the quality of a decomposition is closely related to the query translation algorithm used.
