
Managing XML and Semistructured Data

Lecture 1: Preliminaries and Overview. Prof. Dan Suciu. Spring 2001.



  1. Managing XML and Semistructured Data
     Lecture 1: Preliminaries and Overview
     Prof. Dan Suciu
     Spring 2001

  2. In this lecture
     • Goals of the course
     • Prerequisites
     • Resources
       • textbooks
       • research papers
     • Overview of the course

  3. Goals of the Course
     Purpose:
     • Foundations of semistructured data
     • Issues in semistructured data management
     • Glimpse at current XML standards and technology

  4. Prerequisites
     • A graduate course in database systems
     • Logic
     • Programming languages
     • Complexity theory
     • Algorithms and data structures

  5. Textbooks
     • Data on the Web: from Relations, to Semistructured Data and XML, Abiteboul, Buneman, Suciu
       • For foundations
     • W3C homepage, www.w3.org
       • For current standards
     • Professional XML Databases, Kevin Williams
       • For current XML technologies

  6. Other Useful Texts
     • A first course in database systems (2 vols), Ullman, Widom and Garcia-Molina
     • Data and Knowledge based Systems (2 vols), Ullman
     • Foundations of Databases, Abiteboul, Hull, Vianu
     • Proceedings of SIGMOD, VLDB, PODS conferences

  7. Papers: Data Models
     • XML, Java, and the future of the Web, by Jon Bosak, Sun Microsystems.
     • W3C XML Query Data Model, by Mary Fernandez, Jonathan Robie.
     • Adding structure to semistructured data, by Buneman, Davidson, Fernandez, Suciu, in ICDT 97.
     • Object Exchange Across Heterogeneous Information Sources, by Y. Papakonstantinou, H. Garcia-Molina, and J. Widom, Data Engineering 95.

  8. Papers: Query Languages
     • A formal semantics of patterns in XSLT, by Phil Wadler.
     • XQuery: A Query Language for XML, by Chamberlin, Florescu, et al.
     • XML-QL: A Query Language for XML, by Deutsch, Fernandez, Florescu, Levy, Suciu, in WWW8.
     • Catching the boat with Strudel, VLDBJ 2001.
     • UnQL: A Query Language and Algebra for Semistructured Data Based on Structural Recursion, by Buneman, Fernandez, Suciu, VLDBJ 2000.
     • The Lorel Query Language for Semistructured Data, by Abiteboul, Quass, McHugh, Widom, Wiener, in International Journal on Digital Libraries, 1997.

  9. Papers: Schemas
     • MSL: A Model for W3C XML Schema, by Brown, Fuchs, Robie, Wadler, in WWW10, 2001.
     • Keys for XML, by Buneman, Davidson, Fan, Hara, Tan, in WWW10, 2001.
     • Subsumption for XML Types, by Kuper and Simeon, ICDT'2001.
     • Extracting Schema from Semistructured Data, by Nestorov, Abiteboul, Motwani, SIGMOD 98.

  10. Papers: Query Analysis, Typechecking
     • Optimizing Regular Path Expressions Using Graph Schemas, by Fernandez, Suciu, ICDE'98.
     • XDuce: A typed XML processing language, by Hosoya and Pierce.
     • Regular Expression Pattern Matching for XML, by Hosoya and Pierce (in POPL 2001).
     • Typechecking for XML Transformers, by Milo, Vianu, Suciu.

  11. Papers: Indexing
     • Index Structures for Path Expressions, by Milo and Suciu, in ICDT'99.

  12. Papers: Publishing
     • Efficiently Publishing Relational Data as XML Documents, by Shanmugasundaram, Shekita, Barr, Carey, Lindsay, Pirahesh, Reinwald, in VLDB'2000.
     • SilkRoute: Trading between relations and XML, by Fernandez, Suciu, Tan R, in WWW9, 2000.
     • Efficient Evaluation of XML Middle-ware Queries, in SIGMOD'2001.

  13. Papers: Compression
     • XMILL: An Efficient Compressor for XML Data, by Liefke and Suciu, in SIGMOD'2001.

  14. Overview
     • Semistructured Data
       • Model
       • Syntax
       • Comparison with relational data
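The contrast between semistructured and relational data named on this slide can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `people` records, their labels, and the naive flattening step are hypothetical and are not taken from the lecture.

```python
# Two self-describing "person" objects with different shapes: the labels
# travel with the data, and no schema is declared up front.
# (All names and values here are made up for illustration.)
people = [
    {"name": "Alice", "phone": ["555-0100", "555-0101"]},
    {"name": {"first": "Bob", "last": "Smith"}, "email": "bob@example.org"},
]

# A relational table needs one fixed list of columns for every row.
columns = sorted({label for person in people for label in person})

# Naive flattening: a missing label becomes None (a NULL), and nested or
# repeated values do not fit into a single atomic column.
rows = [tuple(person.get(column) for column in columns) for person in people]

for row in rows:
    print(dict(zip(columns, row)))
```

The flattened rows show the two mismatches that a semistructured model tolerates and a flat table does not: absent fields and values that are themselves structured.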

  15. Overview
     • XML
       • Motivation
       • Syntax:
         • Basic stuff: elements, attributes, content
         • Esoteric stuff: PIs, entities, CDATA, comments
       • DTDs
       • Data model (XQuery)
       • Miscellaneous: Namespaces, XPointer, XLink
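As a rough illustration of the syntax items listed on this slide, the sketch below parses a tiny document with Python's standard `xml.etree.ElementTree` module. The document itself (the `bib`, `book`, and `note` elements, the `id` attribute, and the `render` processing instruction) is a made-up example, not material from the course.

```python
import xml.etree.ElementTree as ET

# Hypothetical document showing elements, an attribute, a comment,
# a processing instruction (PI), entity references, and a CDATA section.
DOC = """<?xml version="1.0"?>
<!-- a comment: dropped by the default ElementTree parser -->
<bib>
  <?render style="compact"?>
  <book id="b1">
    <title>Data on the Web</title>
    <note>uses &lt;tags&gt; and <![CDATA[raw <text> & symbols]]></note>
  </book>
</bib>"""

root = ET.fromstring(DOC)
book = root.find("book")

summary = {
    "root": root.tag,                 # element name
    "id": book.get("id"),             # attribute value
    "title": book.findtext("title"),  # element content
    # Entity references are decoded, and the CDATA section is delivered
    # as ordinary character data.
    "note": book.findtext("note"),
}
print(summary)
```

With the default parser, the comment and the PI do not appear in the resulting tree, which is one reason the slide files them under the less common parts of the syntax.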

  16. Overview
     • Query Languages
       • Lorel: extends OQL
       • UnQL: structural recursion, patterns
       • StruQL: Skolem functions
       • XML-QL: everything for XML
       • Quilt/XQuery: the standard
       • XSL: the standard
       • XDuce: a general-purpose language

  17. Overview
     • Schemas
       • Theory: lower bound, upper bound
       • XML-Schema
       • “XML-Schema are regular tree languages”
       • Constraints (keys for XML)

  18. Overview
     • Query analysis
       • Query pruning
       • Query containment

  19. Overview
     • XML Publishing from Relational Databases
       • Virtual XML publishing: SilkRoute, Microsoft’s XDR
       • Materialized XML publishing: XPERANTO, SilkRoute, Microsoft’s “for XML”

  20. Overview
     • Indexes
       • Indexes for semistructured data: data guides, T-indexes
       • Indexes for XML: we are still waiting for them...
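To give a feel for what a path-based index summarizes, here is a minimal sketch in the spirit of data guides, restricted to tree-shaped data: it merges all label paths of a small database into one trie in which each distinct path appears once. The function name `path_summary` and the sample `db` are hypothetical, and the sketch does not handle the graph-shaped data that real data guides and T-indexes cover.

```python
def path_summary(node, summary=None):
    """Merge every label path of a tree-shaped value into one nested dict."""
    if summary is None:
        summary = {}
    if isinstance(node, dict):
        for label, child in node.items():
            # Reuse the entry for a label already seen under this path.
            path_summary(child, summary.setdefault(label, {}))
    elif isinstance(node, list):
        # A list models repeated edges with the same label.
        for item in node:
            path_summary(item, summary)
    return summary


# Hypothetical database: two "book" objects with different structure.
db = {
    "book": [
        {"title": "T1", "author": ["A", "B"]},
        {"title": "T2", "publisher": {"name": "P", "city": "C"}},
    ]
}

guide = path_summary(db)
print(guide)
```

A query for a path such as `book.publisher.city` can be checked against `guide` first, and a path absent from the summary cannot match anything in the data.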

  21. Overview
     • Miscellaneous
       • XML compression (XMill)
