
Schema Mediated Exchange of Temporal XML Data

Curtis Dyreson – Washington State University; Richard T. Snodgrass – University of Arizona; Sabah Currim – University of Arizona; Faiz Currim – University of Iowa.


Presentation Transcript


  1. Schema Mediated Exchange of Temporal XML Data • Curtis Dyreson – Washington State University • Richard T. Snodgrass – University of Arizona • Sabah Currim – University of Arizona • Faiz Currim – University of Iowa • ER 2006 - Tucson

  2. Scenario • Genomic data from NCBI • Data collection is growing/changing • Want data and data provenance (who, what, when, …)

  3. Obtaining Web Data (Diagram labels: Request D; NCBI; D; Write D; Overwrite D)

  4. Data Evolves • Download XML formatted data (as of January 1) <gene name="TRY4"> <desc>trypsin 4</desc> <ontology ref="MGI" function="unknown"/> </gene> • Download again (as of March 6) <gene name="TRY4"> <desc>trypsin 4, beta-cell receptor</desc> <ontology ref="MGI" function="synthesizes trypsinogen"/> </gene>

  5. Refreshing the Data (using SDOs) • What about versions between D and Dold? • Did I download “valid” data? • My DB is pretty big… (Diagram labels: Request updates to D since time t; NCBI; Copy D; Update D; D; Dold; XMLDiff; Change summary of D)

  6. Did I Download the Right Data? • Validate against schema (Diagram labels: Schema; Namespace; XML Data; Validating Parser; Valid)

  7. Fragment of the Genomic Schema … <element name="gene"> <complexType> <sequence> <element name="desc" type="string"/> <element ref="ontology" minOccurs="0" maxOccurs="unbounded"/> </sequence> <attribute name="name" type="text" use="required"/> </complexType> </element> …

  8. Uses of an XML Schema • Validation • XML editors • Guides query formulation • Query optimization • Provides a web service binding

  9. A Temporal Data Collection • Validate the “delta” with the temporal schema • Cost is the size of the change (Diagram labels: Request updates to D since time t; NCBI; Extend history of D; t; now; D; Temporal D; ΔD[t,now]; Temporal Schema: which elements vary over time)

  10. Outline • Motivation • tXSchema • Architecture • Summary

  11. Goals for a Temporal Schema • Make it easy to create a schema for temporal data • Identify which data is temporal • Upwards compatibility • Minimal extensions of XML Schema • Reuse off-the-shelf parsers/tools • Support • Valid and transaction time • Data (element) versioning • Schema versioning • Logical/physical independence • Flexible timestamp representation and location

  12. Persistent Elements • An item is an element that persists across snapshots. • Item identifier (like a temporally-invariant key) <txs:itemIdentifier> <txs:field path="@name"/> </txs:itemIdentifier> (Diagram labels: January snapshot; March snapshot; …)
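
A minimal sketch of how an item identifier such as `@name` could be used to pair up the same item across two snapshots. The `<data>` wrapper element, the helper names `item_key` and `index_items`, and the restriction to attribute-only paths are assumptions made for illustration; this is not the tXSchema implementation.

```python
import xml.etree.ElementTree as ET

# Two snapshots of the same collection (the <data> wrapper is an assumption).
JANUARY = '<data><gene name="TRY4"><desc>trypsin 4</desc></gene></data>'
MARCH = '<data><gene name="TRY4"><desc>trypsin 4, beta-cell receptor</desc></gene></data>'


def item_key(element, field_paths=("@name",)):
    """Build a temporally-invariant key from attribute-only field paths."""
    values = []
    for path in field_paths:
        if not path.startswith("@"):
            raise ValueError("this sketch only handles attribute paths such as @name")
        values.append(element.get(path[1:]))
    return (element.tag, tuple(values))


def index_items(snapshot_xml, tag="gene"):
    """Map each item key to its element within one snapshot."""
    root = ET.fromstring(snapshot_xml)
    return {item_key(e): e for e in root.iter(tag)}


jan = index_items(JANUARY)
mar = index_items(MARCH)

# Keys present in both snapshots identify items that persist over time.
same_items = sorted(jan.keys() & mar.keys())
```

Here the gene keyed by `TRY4` is treated as one persistent item even though its `desc` text differs between the two snapshots.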

  13. Extend a Snapshot Schema • Specify which elements are temporal • Temporal elements have • Item identifiers • Simple constraints (state/event, existence/content-varying) <element name="gene"> <txs:temporal> <txs:itemIdentifier> <txs:field path="@name"/> </txs:itemIdentifier> <txs:transactionTime kind="state" contentVarying="true" existenceVarying="no gaps"/> </txs:temporal> ….definition of gene from the snapshot schema omitted for space… </element>

  14. Versions • A version is a change in an item. • DOM inequivalence (Diagram labels: January snapshot …; March snapshot …)
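
One way to read "DOM inequivalence" is as a recursive comparison of tags, attributes, text, and ordered children. The sketch below takes that reading; the function name `dom_equal` and the choice to ignore surrounding whitespace are assumptions, and the slides do not give the exact equivalence test used by the tools.

```python
import xml.etree.ElementTree as ET


def dom_equal(a, b):
    """Return True if two elements match in tag, attributes, text, and ordered children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    # Whitespace-only differences are ignored in this sketch.
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if (a.tail or "").strip() != (b.tail or "").strip():
        return False
    if len(a) != len(b):
        return False
    return all(dom_equal(x, y) for x, y in zip(a, b))


january = ET.fromstring(
    '<gene name="TRY4"><desc>trypsin 4</desc>'
    '<ontology ref="MGI" function="unknown"/></gene>'
)
march = ET.fromstring(
    '<gene name="TRY4"><desc>trypsin 4, beta-cell receptor</desc>'
    '<ontology ref="MGI" function="synthesizes trypsinogen"/></gene>'
)

# The item changed between the snapshots, so March starts a new version.
is_new_version = not dom_equal(january, march)
```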

  15. Temporal Genomic Data <dataTemporal> <data><geneTemporal itemRef="1"/></data> <geneItem itemId="1"> <geneVersion><time start="2005-01-01" end="2005-03-05"/> <gene name="TRY4"> <desc>trypsin 4</desc> <ontologyTemporal itemRef="2"/> </gene> </geneVersion> <geneVersion><time start="2005-03-06" end="now"/> …next version of gene… </geneVersion> </geneItem> <ontologyItem itemId="2"> …ontology item… </ontologyItem> </dataTemporal>
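
To illustrate how a snapshot could be read back out of a representation like the one on this slide, here is a hedged sketch. The element names follow the slide, but the sample document replaces the elided parts with made-up content and omits the ontology item, and the helpers `version_at` and `snapshot` are hypothetical names, not the project's tools. Periods are treated as closed intervals at day granularity, with `now` meaning "still current".

```python
import xml.etree.ElementTree as ET

TEMPORAL = """
<dataTemporal>
  <data><geneTemporal itemRef="1"/></data>
  <geneItem itemId="1">
    <geneVersion><time start="2005-01-01" end="2005-03-05"/>
      <gene name="TRY4"><desc>trypsin 4</desc></gene>
    </geneVersion>
    <geneVersion><time start="2005-03-06" end="now"/>
      <gene name="TRY4"><desc>trypsin 4, beta-cell receptor</desc></gene>
    </geneVersion>
  </geneItem>
</dataTemporal>
"""


def version_at(item, when):
    """Return the payload of the version whose [start, end] period contains `when`."""
    for version in item:
        time = version.find("time")
        start, end = time.get("start"), time.get("end")
        # ISO dates compare correctly as strings.
        if start <= when and (end == "now" or when <= end):
            return next(child for child in version if child.tag != "time")
    return None


def snapshot(temporal_root, when):
    """Rebuild the snapshot at `when` by replacing item references with versions."""
    items = {e.get("itemId"): e for e in temporal_root if e.get("itemId") is not None}

    def resolve(element):
        result = ET.Element(element.tag, dict(element.attrib))
        result.text = element.text
        for child in element:
            ref = child.get("itemRef")
            if ref is None:
                resolved = resolve(child)
                resolved.tail = child.tail
                result.append(resolved)
            else:
                payload = version_at(items[ref], when)
                if payload is not None:  # the item may not exist at `when`
                    result.append(resolve(payload))
        return result

    return resolve(temporal_root.find("data"))


root = ET.fromstring(TEMPORAL)
jan = snapshot(root, "2005-02-01")
apr = snapshot(root, "2005-04-01")
```

The two calls return `<data>` trees whose `gene/desc` text is the January and March description respectively; a date before 2005-01-01 yields a `<data>` element with no gene.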

  16. Outline • Motivation • tXSchema • Architecture • Summary

  17. Validating Temporal Data • Snapshot data validated with a snapshot schema • Construct a representational schema (details in paper) • Can also validate the “delta” (Diagram labels: Snapshot Schema; Construction Process; Representational Schema; Namespace; XML Data; Temporal Data; Validating Parser; Valid; Not valid)

  18. Property of a “Good” Construction • Every snapshot must conform to the snapshot schema (Diagram labels: Temporal data; Temporal Schema; (Temporal) Validating Parser; Valid; At time T; Snapshot; Snapshot Schema; Validating Parser; Valid)

  19. Outline • Motivation • tXSchema • Architecture • Summary

  20. Related Work – Temporal XML • Change detection and management • Nguyen, Abiteboul, Cobena, Preda, SIGMOD 2001 • Xyleme’s Alerter, described in Data Engineering Bulletin, 2001 • Dyreson, Lin, Wang, WWW 2004 • Leonardi, Bhowmick, ER 2006 • Representing time-varying XML documents (versioning) • Chawathe, Abiteboul, Widom, ICDE 1998 • Dyreson, Böhlen, Jensen, VLDB 1999 • Chien, Tsotras, Zaniolo, VLDB 2000 • Marian, Abiteboul, Cobena, Mignet, VLDB 2001 • Buneman, Khanna, Tajima, Tan, SIGMOD 2002, TODS 2004 • Rosado, Marquez, Gonzalez, ECDM 2006 • XML Versioning Use Cases (W3C)

  21. Related Work – XML Schemas • XML Schema languages • Many, but XML Schema is backed by the W3C • Incremental XML validation • Bouchou & Halfeld-Ferrari, DBPL 2003 • Papakonstantinou & Vianu, ICDT 2003 • Barbosa, Mendelzon, Libkin, Mignet, Arenas, ICDE 2004 • Temporal XML schemas • Currim, Currim, Dyreson, Snodgrass, EDBT 2004 • Dyreson, Snodgrass, Currim, Currim, Joshi, XSDM 2006

  22. An Overarching Vision • Aspect-oriented programming • Cross-cutting concerns • Augment behavior without changing the code • Example aspects: logging, garbage collection (Diagram labels: Program .java; Aspect .java; Cut points; weaver; Aspect Enhanced .java; javac)

  23. Aspects for Data? • What are cross-cutting concerns? • Milieu of metadata • Time is an aspect (Diagram labels: time; security; reliability)

  24. Aspects in Schema Design • Schema for aspect + schema for data • Our paper describes the “plumbing” for a temporal aspect (Diagram labels: data (snapshot) schema; aspect schema; schema weaver; schema tapestry; imports schema; validation aspect + XML data; conventional validating parser; snapshot data; snapshot gluer; aspect validator)

  25. Our Contributions • Temporal schema specification: what is time-varying, plus some simple constraints • Validate temporal data at ΔD[t,now] cost • Upwards compatible with XML Schema • Handle schema evolution (Dyreson et al., XSDM ’06) • Suite of tools that reuse and extend existing tools • www.cs.arizona.edu/tau

  26. tXSchema Project Tools (Beta) • tVALIDATOR – Validates a temporal XML document against conventional and temporal constraints • SQUASH – Generates a temporal document from a sequence of snapshot documents • UNSQUASH – Extracts snapshot documents from a temporal document • RESQUASH – Changes a document's representation to be consistent with a new physical annotation
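
The idea behind turning a sequence of snapshots into a temporal document can be sketched as follows. This is not the SQUASH tool: the function `squash_gene`, the single-gene input, the serialization-based equality check, and the day-granularity closed periods are all assumptions chosen to mirror the representation shown on slide 15.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta


def canonical(element):
    """Serialize an element for a crude equality check (assumes consistent formatting)."""
    return ET.tostring(element, encoding="unicode").strip()


def squash_gene(dated_snapshots, item_id="1"):
    """Fold (date, '<gene>…</gene>') pairs, given in date order, into one geneItem."""
    item = ET.Element("geneItem", {"itemId": item_id})
    previous = None
    open_time = None
    for day, xml_text in dated_snapshots:
        gene = ET.fromstring(xml_text)
        current = canonical(gene)
        if current == previous:
            continue  # unchanged snapshot: the open version simply continues
        if open_time is not None:
            # Close the earlier version the day before the new one starts.
            open_time.set("end", (day - timedelta(days=1)).isoformat())
        version = ET.SubElement(item, "geneVersion")
        open_time = ET.SubElement(
            version, "time", {"start": day.isoformat(), "end": "now"}
        )
        version.append(gene)
        previous = current
    return item


item = squash_gene([
    (date(2005, 1, 1), '<gene name="TRY4"><desc>trypsin 4</desc></gene>'),
    (date(2005, 2, 1), '<gene name="TRY4"><desc>trypsin 4</desc></gene>'),
    (date(2005, 3, 6), '<gene name="TRY4"><desc>trypsin 4, beta-cell receptor</desc></gene>'),
])
periods = [(v.find("time").get("start"), v.find("time").get("end")) for v in item]
```

The unchanged February snapshot does not create a version, so only the January and March states are stored, each with its own period.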
