
End-to-End XML Data Description: XDF

End-to-End XML Data Description: XDF. PI. Ed Shaya Brian Thomas, Zhenping Huang, James Gass, James Blackwell, Gail Schneider (RITSS) NASA ATR: Cynthia Cheung http://xml.gsfc.nasa.gov. Introduction. Time to move beyond just moving data arrays from machine to machine.


Presentation Transcript


  1. End-to-End XML Data Description: XDF
     PI: Ed Shaya
     Brian Thomas, Zhenping Huang, James Gass, James Blackwell, Gail Schneider (RITSS)
     NASA ATR: Cynthia Cheung
     http://xml.gsfc.nasa.gov

  2. Introduction
     • Time to move beyond just moving data arrays from machine to machine.
     • Develop a data model that is rich enough to standardize:
       • Complex structures, axial/coordinate information, error bars, named subsets/sections/tiles, arbitrary tuples, physical units.
     • Allow browsers, visualization programs, and image processing packages to automatically display complex data properly.
     • Use OO inheritance so that ES, Astro, Space Physics, Biology, etc. all use the same basic schema.
       • Share/reuse software across disciplines.
       • Exchange data across disciplines.
     • Use one basic schema from spacecraft to publishing house, thereby maintaining metadata.

  3. Introduction (cont'd)
     • Use XML.
       • Get in sync with the worldwide programming effort and the standardization efforts of the W3C:
         • Browsers, XML tools (parsing, transformation, styling, management, validation, web publishing), and the Semantic Web effort (RDF, DAML).
       • Get in sync with publishing houses.
         • They use SGML, a superset of XML.
     • Make use of XML Schema inheritance/extension.
       • E.g., basic XDF underlies CDFML and FITSML.
       • E.g., FITSML underlies GalaxyDB.

  4. XDF - eXtensible Data Format
     • An XML data description/format.
       • Can create 100% XML-tagged data documents.
       • Can wrap legacy data formats, including fixed-width ASCII, delimited ASCII, or binary data.
       • Data can be included in the document file or kept separate.
     • An XML logical document can be physically many files.
       • Package them together with zip, tar, jar, Web Folder, or SOAP.
     • Designed with a Data Object Model.
       • Hierarchical structure of N-dimensional arrays.
       • Axis/coordinate information is standardized.
       • Tuples (vectors, complex) are handled by adding a dimension.
       • Tables and images are conceptually merged.
       • Regions within a cube can be named/addressed.

  5. XDF - eXtensible Data Format (cont'd)
     • Error bars, limits, and uncertainties are standardized.
     • Notes can be added to any item, including a datum in an external binary file. Tokens are defined.
     • Tables are a special case of N-dimensional dataCubes.
       • Slices intercepting a fieldAxis can have local units, notes, and special values.
       • Traditional 2D tables are special cases of these.
     • Perl and Java packages implement an API.
     • Conversion from XDF to legacy formats (FITS, CDF, HDF) is made easy by connecting a Java IO package.

  6. XDF - Arrays within Arrays
     • Array cells can contain references to other arrays.
     • Allows for a hierarchical mesh for rapid spatial search. At the bottom are files, each pertaining to a small part of space.
     • Arrays can also be appended to arrays by using the align and append attributes of array.

  7. XDF document skeleton

         <XDF>
           <structure name="structure 1">
             <array name="array 1">
               <units><unit>Jansky</unit></units>
               <dataFormat><float width="5" precision="2"/></dataFormat>
               <axis name="x">…</axis>
               <axis name="y">…</axis>
               …
               <data href="data.dat"/>
             </array>
             <array name="array 2"> ... </array>
           </structure>
           <array name="array 3"> ... </array>
         </XDF>

         <!ELEMENT structure (parameter, array
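
As a rough illustration of how a generic XML tool can walk such a document, the sketch below uses the standard Java DOM parser to list the `name` attribute of every `array` element. It is not the XDF Java package; the class name `XdfArrayLister` is hypothetical, and the input is assumed to be a complete, well-formed document (the elided `…` parts of the slide would have to be filled in or dropped).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the XDF API: lists array names with plain DOM.
final class XdfArrayLister {
    /** Returns the name attribute of every <array> element, in document order. */
    static List<String> arrayNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // getElementsByTagName searches all descendants, so arrays nested
            // inside <structure> and arrays directly under <XDF> are both found.
            NodeList arrays = doc.getElementsByTagName("array");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < arrays.getLength(); i++) {
                names.add(((Element) arrays.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XDF document", e);
        }
    }
}
```

Calling `XdfArrayLister.arrayNames(...)` on a document shaped like the skeleton above would return the names of the arrays inside the structure followed by the top-level array.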

  8. (Not Your Father's) Arrays: Axis Element
     An axis has a name, a unique id (used later to establish read/write ordering), units, and values. The values are a list of numbers or names, but there is a shortcut (used here) for generating a series. Later versions will have arbitrary functions.

         <axis name="x" axisId="x-axis" axisDatatype="integer">
           <axisUnits>
             <unit>km</unit>
             <unit power="-1">s</unit>
           </axisUnits>
           <valueList size="10" step="-2" start="23" />
         </axis>
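
The series shortcut can be pictured with a small sketch. It assumes, without confirmation from the slides, that `valueList` means "`size` values beginning at `start` and advancing by `step`"; the class and method names are hypothetical and not part of the XDF packages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the valueList shortcut (assumed semantics:
// value[i] = start + i * step for i = 0 .. size-1).
final class ValueListSeries {
    static List<Integer> expand(int start, int step, int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be non-negative");
        }
        List<Integer> values = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            values.add(start + i * step);
        }
        return values;
    }
}
```

Under that assumption, the attributes on the slide (`start="23"`, `step="-2"`, `size="10"`) would expand to the ten axis values 23, 21, 19, and so on down to 5.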

  9. (Not Your Father's) Arrays: Read Element
     Read elements give the general layout of the data. 'For' elements tell how to loop through the data by specifying the 'inner' (faster) axis and the 'outer' (slow-changing) axis. The dataFormats are pre-specified, so only spaces and CRs need to be specified.

         <read>
           <for axisIdRef="y-axis">
             <for axisIdRef="x-axis">
               <repeat count="9">
                 <readCell />
                 <skipChars count="1" />
               </repeat>
               <readCell />
               <skipChars count="1" output="&#10;" />
             </for>
           </for>
         </read>
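
A minimal sketch of what such a read description amounts to for fixed-width text: the outer loop runs over y, the inner (faster) loop over x, and each cell is followed by one skipped character (a space within a row, a newline at the end). The class `FixedWidthReader` and its parameters are hypothetical; this is not the XDF reader itself.

```java
// Hypothetical illustration of the nested for/readCell/skipChars layout.
final class FixedWidthReader {
    /**
     * Reads ySize rows of xSize fixed-width cells from text.
     * Every cell is cellWidth characters wide and is followed by exactly
     * one separator character, which is skipped without being inspected.
     */
    static double[][] read(String text, int ySize, int xSize, int cellWidth) {
        double[][] data = new double[ySize][xSize];
        int pos = 0;
        for (int y = 0; y < ySize; y++) {          // outer, slow-changing axis
            for (int x = 0; x < xSize; x++) {      // inner, faster axis
                String cell = text.substring(pos, pos + cellWidth); // readCell
                data[y][x] = Double.parseDouble(cell.trim());
                pos += cellWidth + 1;              // skipChars count="1"
            }
        }
        return data;
    }
}
```

With the `float width="5"` format from the earlier skeleton, a two-by-two block such as `" 1.00  2.50\n-3.25 40.00\n"` would be read row by row into a 2x2 array.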

  10. (Not Your Father's) Arrays: Tuples and Tables
      One fieldAxis is allowed in an array. DataFormat and units are now specified for each field. Related fields are grouped by fieldGroup.

          <fieldAxis axisId="fields">
            <fieldGroup name="position">
              <field name="right ascension">
                <units> <unit>degrees</unit> </units>
                <dataFormat> <float width="11" precision="7" /> </dataFormat>
              </field>
              <field name="declination">
                <units> <unit>degrees</unit> </units>
                <dataFormat> <float width="11" precision="7" /> </dataFormat>
              </field>
            </fieldGroup>
            ...

  11. Parameters
      A single value is carried by a parameter element. Attributes:
      special = ( infinite | infiniteNegative | noData | notANumber | underflow | overflow )
      inequality = ( lessThan | lessThanOrEqual | greaterThan | greaterThanOrEqual | upperErrorValue | lowerErrorValue )

          <parameterGroup name="Properties">
            <parameter name="last date" datatype="string">
              <units> <unit>xs:date</unit> </units>
              <value inequality="lessThanOrEqual">1999-12-01</value>
              <note>It happened on or before this date.</note>
            </parameter>

  12. Using the XDF API

          // create 2D (10x10) structure
          import gov.nasa.gsfc.adc.xdf.*;

          public Structure create2DXDFStructure () {
              Array XDFarray = new Array();
              Structure XDFObject = new Structure();

              // tack array into structure
              XDFObject.addArray(XDFarray);

              // add axes to the array
              Axis axis0 = XDFarray.addAxis(new Axis());
              Axis axis1 = XDFarray.addAxis(new Axis());

              // add Axis values
              for (int i = 0; i < 10; i++) {
                  // String has no int constructor; convert explicitly
                  String axisValueName = String.valueOf(i);
                  axis0.addAxisValue(new Value(axisValueName));
                  axis1.addAxisValue(new Value(axisValueName));
              }

  13. Using the XDF API (cont'd)

              // we will add integer data to the whole array:
              // simply set the data format to IntegerDataFormat
              XDFarray.setDataFormat(new IntegerDataFormat());

              // now add our integer data
              Locator myLocation = XDFarray.createLocator();
              for (int i = 0; i < 100; i++) {
                  XDFarray.setData(myLocation, i);
                  myLocation.next();
              }

              // finished! return the structure with the array inside
              return XDFObject;
          }
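
The locator idea in the API example can be sketched independently of the XDF package: an index that steps through every cell of an N-dimensional grid, odometer style. Everything below is hypothetical (the class `GridLocator`, its methods, and especially the choice that the first axis varies fastest); the real XDF `Locator` may order its traversal differently.

```java
// Hypothetical stand-in for a locator: steps through an N-dimensional grid.
// Assumption: the first axis varies fastest.
final class GridLocator {
    private final int[] sizes;
    private final int[] index;

    GridLocator(int... sizes) {
        for (int s : sizes) {
            if (s <= 0) throw new IllegalArgumentException("axis sizes must be positive");
        }
        this.sizes = sizes.clone();
        this.index = new int[sizes.length]; // starts at the origin
    }

    /** Returns a copy of the current position, one index per axis. */
    int[] current() {
        return index.clone();
    }

    /** Advances one cell; returns false (after wrapping to the origin) past the last cell. */
    boolean next() {
        for (int axis = 0; axis < sizes.length; axis++) {
            if (++index[axis] < sizes[axis]) return true;
            index[axis] = 0; // this axis rolled over; carry into the next one
        }
        return false;
    }

    /** Fills a ySize-by-xSize grid with 0, 1, 2, ... in locator order. */
    static int[][] fill(int xSize, int ySize) {
        int[][] data = new int[ySize][xSize];
        GridLocator loc = new GridLocator(xSize, ySize);
        int value = 0;
        do {
            int[] at = loc.current();
            data[at[1]][at[0]] = value++;
        } while (loc.next());
        return data;
    }
}
```

`GridLocator.fill(10, 10)` mirrors the 100-step loop on the slide: the caller never computes offsets, it only asks the locator for the next cell.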

  14. FITSML, CDFML, and IML inherit XDF
      • Upgrades for FITS (astronomy) and CDF (space physics) using XDF as core.
      • Preserve keywords and vocabularies (just add tags).
      • XDF is rich enough to preserve familiar structures.
      • Wrap, not disassemble, FITS files.
      • FITS-to-FITSML converter (Java FITSIO-to-jXDF package).

  15. FITSML

          <XDF type="FITSML">
            <observer>John Smith</observer>
            <telescope>VLA</telescope>
            <position>
              <rightAscension>300.012</rightAscension>
              <declination>-45.03</declination>
              <equinox system="Besselian">1950</equinox>
            </position>
            <array> ... </array>
          </XDF>

  16. FITSML After Parse

          <XDF type="FITSML">
            <observation>
              <observer keyword="OBSERVER"
                        description="Observer name or other identification">
                Perl Smith
              </observer>
              <telescope keyword="TELESCOP"
                         description="Data acquisition telescope">
                VLA
              </telescope>

  17. XML Dataset Archive
      • Metadata about published data sets:
        • abstract, title, description, authors, references, etc.
      • Inherits XDF for tables, spectra, and images.
      • XSLT allows transformation to:
        • web pages, PDF pages, the next schema version, and presentable query responses.
      • Investigating XQL, XQuery, PDOM, XML-DB.
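
To make the XSLT point concrete, here is a minimal sketch that turns a dataset-metadata record into a web page using the XSLT 1.0 processor bundled with Java (`javax.xml.transform`). The element names `dataset`, `title`, and `abstract`, the stylesheet, and the class `DatasetToHtml` are all assumptions made for illustration; the actual archive schema is not shown in the slides.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example: dataset metadata -> HTML page via XSLT 1.0.
final class DatasetToHtml {
    // Illustrative stylesheet; element names are assumed, not taken from the archive schema.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\" indent=\"no\"/>"
        + "<xsl:template match=\"/dataset\">"
        + "<html><body><h1><xsl:value-of select=\"title\"/></h1>"
        + "<p><xsl:value-of select=\"abstract\"/></p></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies STYLESHEET to the given XML text and returns the resulting HTML. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transformation failed", e);
        }
    }
}
```

Swapping in a different stylesheet is all it would take to target another output, such as the next schema version, which is the reuse the slide is pointing at.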

  18. Article Markup Language
      • Replace LaTeX for manuscript preparation.
      • Separates content from display.
      • A more natural interface with SGML at publishers.
      • Equation editors for MathML.
      • XML tables, images, and graphs (SVG, XDF).
      • Retain metadata throughout the process.
      • Use CSS2 or XSLT to convert back to LaTeX, pretty HTML, or PDF hardcopy.

  19. Summary
      • ADC is doing its best to keep up with tremendous advancements in IT, particularly XML.
      • It is reasonable now to have an end-to-end XML-based data handling system from ground receivers to archive to publication. Complete metadata historical records can be carried along.
      • Real reuse of tools, especially visualization, is finally practical through common XML bases.
      • True interoperability between scientific disciplines and the public is within reach.

  20. Where to get more information
      • http://adc.gsfc.nasa.gov
      • http://xml.gsfc.nasa.gov
      • http://xml.gsfc.nasa.gov/XDF_home/
