
The BinX Language


kaye-ball

Presentation Transcript


  1. The BinX Language

  2. What is BinX? • Binary in XML • Use XML to mark up binary data • Mark up data types • Mark up sequences • Mark up arrays • Complex structures

  3. Primitive Data Types • Mark up data types
     Bytes (hex) and their BinX markup:
     1. FF 7F         <short-16 byteOrder="littleEndian">32767</short-16>
     2. 7F FF FF FF   <integer-32 byteOrder="bigEndian">2147483647</integer-32>
     3. 00 00 C8 42   <float-32 byteOrder="littleEndian">100.0</float-32>
     4. 42 C8 00 00   <float-32 byteOrder="bigEndian">100.0</float-32>
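
The four byte patterns on this slide can be reproduced with Python's standard `struct` module. This is only an illustrative check, not part of BinX; the mapping from BinX type names to `struct` format codes is an assumption of the sketch.

```python
import struct

# Format codes: "<" little-endian, ">" big-endian;
# h = 16-bit short, i = 32-bit integer, f = 32-bit IEEE 754 float.
encoded = {
    "short-16 littleEndian 32767": struct.pack("<h", 32767),
    "integer-32 bigEndian 2147483647": struct.pack(">i", 2147483647),
    "float-32 littleEndian 100.0": struct.pack("<f", 100.0),
    "float-32 bigEndian 100.0": struct.pack(">f", 100.0),
}

for label, raw in encoded.items():
    # bytes.hex(sep) needs Python 3.8 or later
    print(f"{label}: {raw.hex(' ').upper()}")
```

The same value yields different byte sequences depending on byte order, which is why the markup records `byteOrder` explicitly.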

  4. Abstract “struct” types • Mark up a sequence
     Screen descriptor in GIF:
       Screen width: unsigned short
       Screen height: unsigned short
       Packed field: byte
       Background colour index: byte
       Pixel aspect ratio: byte
     <struct>
       <unsignedShort-16 />
       <unsignedShort-16 />
       <byte-8 />
       <byte-8 />
       <byte-8 />
     </struct>
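
As a rough illustration of what this struct describes, the sketch below reads the same five fields with Python's `struct` module. The names `ScreenDescriptor` and `parse_screen_descriptor` are hypothetical; the offset of 6 assumes the descriptor directly follows the 6-byte GIF signature, and the little-endian layout reflects how GIF stores multi-byte values.

```python
import struct
from collections import namedtuple

# Hypothetical record type mirroring the five fields on the slide.
ScreenDescriptor = namedtuple(
    "ScreenDescriptor",
    "width height packed background_index pixel_aspect_ratio",
)

# Two unsigned 16-bit shorts followed by three unsigned bytes, little-endian.
LAYOUT = struct.Struct("<HHBBB")

def parse_screen_descriptor(data, offset=6):
    """Unpack the screen descriptor that follows the 6-byte signature."""
    return ScreenDescriptor(*LAYOUT.unpack_from(data, offset))

# Synthetic header bytes for demonstration, not a real image.
sample = b"GIF89a" + struct.pack("<HHBBB", 640, 480, 0xF7, 0, 0)
desc = parse_screen_descriptor(sample)
print(desc)
```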

  5. Abstract “array” types • Mark up an array
     A 2-dimensional, 10-by-100 array of 32-bit integers:
     <arrayFixed>
       <integer-32 />
       <dim indexTo="99">
         <dim indexTo="9" />
       </dim>
     </arrayFixed>
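
The size of the array follows from the `indexTo` attributes, which give the last zero-based index of each dimension (so `indexTo="9"` means 10 elements). A small sketch of that arithmetic, assuming big-endian data purely for the sake of the example:

```python
import struct

def dim_count(index_to):
    """Number of elements in a dimension whose last zero-based index is index_to."""
    return index_to + 1

dims = [dim_count(99), dim_count(9)]   # the two nested <dim> elements

element_count = 1
for size in dims:
    element_count *= size

# One 32-bit integer per element; ">" (big-endian) is an assumption here.
layout = struct.Struct(f">{element_count}i")
print(dims, element_count, layout.size)
```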

  6. Embedded abstract types • Complex structures
     <struct>
       <short-16 />
       <arrayFixed>
         <byte-8 />
         <dim indexTo="7" />
       </arrayFixed>
       <struct>
         <integer-32 />
         <float-32 />
         <double-64 />
       </struct>
     </struct>

  7. User-defined metadata • Label the data types and structures
     <struct varName="Data Sample">
       <short-16 varName="ID" />
       <arrayFixed varName="List of 10 complex numbers">
         <struct varName="Complex">
           <float-32 varName="Real" />
           <float-32 varName="Imaginary" />
         </struct>
         <dim indexTo="9" />
       </arrayFixed>
     </struct>
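
To show how the `varName` labels might be used by a reader of the data, here is a minimal sketch that unpacks one "Data Sample" record into a dictionary keyed by those labels. The function name `read_data_sample` and the big-endian byte order are assumptions; the slide does not specify either.

```python
import struct

# One 16-bit ID followed by 10 (Real, Imaginary) pairs of 32-bit floats.
RECORD = struct.Struct(">h20f")

def read_data_sample(raw):
    """Unpack one record and label the values with the slide's varNames."""
    fields = RECORD.unpack(raw)
    ident, floats = fields[0], fields[1:]
    numbers = [complex(re, im) for re, im in zip(floats[0::2], floats[1::2])]
    return {"ID": ident, "List of 10 complex numbers": numbers}

# Synthetic record: ID 7, then the floats 0.0 through 19.0.
raw = struct.pack(">h20f", 7, *[float(i) for i in range(20)])
sample = read_data_sample(raw)
print(sample["ID"], sample["List of 10 complex numbers"][:2])
```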

  8. Reusable type definitions • Define macros for reuse
     <definitions>
       <defineType typeName="FourCC">
         <arrayFixed>
           <character-8 />
           <dim count="4" />
         </arrayFixed>
       </defineType>
     </definitions>
     <struct varName="Wave_Header">
       <useType typeName="FourCC" varName="Keyword" />
       <integer-32 varName="Chunk_Size" />
     </struct>
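
The idea of a reusable type can be mimicked in plain Python by defining the format fragment once and composing it into larger layouts. This is only an analogy; `FOURCC`, `WAVE_HEADER` and `read_wave_header` are hypothetical names, and the little-endian order is assumed because RIFF/WAVE files store integers that way.

```python
import struct

# Reusable fragment: four 8-bit characters kept together as one bytes value.
FOURCC = "4s"

# Wave_Header = FourCC keyword followed by a 32-bit chunk size.
WAVE_HEADER = struct.Struct("<" + FOURCC + "i")

def read_wave_header(raw):
    """Unpack the keyword and chunk size from the start of raw."""
    keyword, chunk_size = WAVE_HEADER.unpack_from(raw)
    return {"Keyword": keyword.decode("ascii"), "Chunk_Size": chunk_size}

# Synthetic header bytes for demonstration.
raw = b"RIFF" + struct.pack("<i", 36)
header = read_wave_header(raw)
print(header)
```

Any other layout that starts with a four-character code can reuse `FOURCC` in the same way, which is the role `useType` plays in the markup.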

  9. Linking to binary data • Reference the binary data file
     <definitions>
       <defineType typeName="Header">… …</defineType>
       <defineType typeName="Format_Chunk">… …</defineType>
       <defineType typeName="Data_Chunk">… …</defineType>
     </definitions>
     <dataset src="myfile.wav">
       <useType typeName="Header" />
       <useType typeName="Format_Chunk" />
       <useType typeName="Data_Chunk" />
     </dataset>

  10. A BinX document
      <binx byteOrder="bigEndian">
        <definitions>
          <defineType typeName="myTyp">
            <arrayFixed>
              <character-8/>
              <dim indexTo="9"/>
            </arrayFixed>
          </defineType>
        </definitions>
        <dataset src="myfile.bin">
          <useType typeName="myTyp"/>
          <integer-32 varName="X" />
        </dataset>
      </binx>
      Callouts: root element (<binx>); data class section (<definitions>); abstract data type (<arrayFixed>); data instance section (<dataset>)
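
To make the structure of such a document concrete, the following sketch interprets this small example with Python's standard library. It is not the BinX library: it handles only the handful of elements used here, the `PRIMITIVES` mapping and the function names are assumptions, and it reads from an in-memory byte string instead of opening `myfile.bin`.

```python
import struct
import xml.etree.ElementTree as ET

BINX_DOC = """
<binx byteOrder="bigEndian">
  <definitions>
    <defineType typeName="myTyp">
      <arrayFixed>
        <character-8/>
        <dim indexTo="9"/>
      </arrayFixed>
    </defineType>
  </definitions>
  <dataset src="myfile.bin">
    <useType typeName="myTyp"/>
    <integer-32 varName="X"/>
  </dataset>
</binx>
"""

# Assumed mapping from a few BinX primitive names to struct format codes.
PRIMITIVES = {"character-8": "c", "byte-8": "b", "short-16": "h",
              "integer-32": "i", "float-32": "f", "double-64": "d"}

def to_format(node, types):
    """Translate one BinX element into a struct format fragment."""
    if node.tag in PRIMITIVES:
        return PRIMITIVES[node.tag]
    if node.tag == "useType":
        return to_format(types[node.get("typeName")], types)
    if node.tag == "struct":
        return "".join(to_format(child, types) for child in node)
    if node.tag == "arrayFixed":
        element, dim = node[0], node[1]
        count = 1
        while dim is not None:          # nested <dim> sizes multiply together
            explicit = dim.get("count")
            size = int(explicit) if explicit is not None else int(dim.get("indexTo")) + 1
            count *= size
            dim = dim.find("dim")
        return to_format(element, types) * count
    raise ValueError(f"unsupported element: {node.tag}")

def read_dataset(doc, raw):
    """Decode raw bytes according to the <dataset> section of doc."""
    root = ET.fromstring(doc)
    types = {d.get("typeName"): d[0] for d in root.iter("defineType")}
    order = ">" if root.get("byteOrder") == "bigEndian" else "<"
    body = "".join(to_format(child, types) for child in root.find("dataset"))
    return struct.unpack(order + body, raw)

# Synthetic data: ten characters followed by one big-endian 32-bit integer.
raw = b"HelloBinX!" + struct.pack(">i", 42)
values = read_dataset(BINX_DOC, raw)
print(b"".join(values[:10]), values[10])
```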

  11. DataBinX • DataBinX = BinX with Data
      BinX description:
      <dataset src="myfile.bin">
        <struct>
          <short-16 />
          <long-64 />
          <double-64 />
        </struct>
        <arrayFixed>
          <integer-32 />
          <dim count="2" />
        </arrayFixed>
      </dataset>
      Corresponding DataBinX:
      <dataset>
        <struct>
          <short-16>100</short-16>
          <long-64>1000</long-64>
          <double-64>5.257</double-64>
        </struct>
        <arrayFixed>
          <dim> <integer-32>1</integer-32> </dim>
          <dim> <integer-32>2</integer-32> </dim>
        </arrayFixed>
      </dataset>
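
The step from a binary file to a DataBinX document can be sketched as follows for this one fixed layout. The function `to_databinx` is hypothetical and hard-codes the structure shown on the slide; the real DataBinX utility is driven by the BinX document. Big-endian byte order is assumed.

```python
import struct
import xml.etree.ElementTree as ET

def to_databinx(raw):
    """Build a DataBinX-style <dataset> from one record of binary data."""
    # short-16, long-64, double-64, then an array of two integer-32 values
    short, long_, double, first, second = struct.unpack(">hqd2i", raw)
    dataset = ET.Element("dataset")
    record = ET.SubElement(dataset, "struct")
    for tag, value in (("short-16", short), ("long-64", long_), ("double-64", double)):
        ET.SubElement(record, tag).text = str(value)
    array = ET.SubElement(dataset, "arrayFixed")
    for value in (first, second):
        dim = ET.SubElement(array, "dim")
        ET.SubElement(dim, "integer-32").text = str(value)
    return ET.tostring(dataset, encoding="unicode")

# Synthetic binary record holding the values shown on the slide.
raw = struct.pack(">hqd2i", 100, 1000, 5.257, 1, 2)
xml_text = to_databinx(raw)
print(xml_text)
```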

  12. The BinX Library

  13. BinX Components • The library has core functionality to support generic utilities and applications
      [Diagram: three layers]
      • BinX library core (BinX core functionality): parse/generate BinX documents; read/write binary data; parse/generate DataBinX
      • Utilities (generic tools): DataBinX pack/unpack; extractor; viewer; BinX editor
      • Applications (domain-specific)

  14. BinX application models • Data catalogue model • Data manipulation model • Data query model • Data service model • Data transportation model

  15. Data catalogue model
      [Diagram: a hierarchy of BinX metadata documents, from abstract (BinX 1) to detailed (BinX 1.1, BinX 1.2, then BinX 1.2.1, 1.2.2, 1.2.3), describing binary data files held in primary storage]
      • Primary storage: binary data files
      • Metadata: syntactic annotation; semantic annotation; classification (domain specific); cross-reference (XLink)

  16. Data manipulation model
      • Extraction
          • Subset of a dataset
      • Combination
          • Merge several datasets
      • Transformation
          • Conversion of data types
          • Change of sequence order
          • Transposition of array dimensions
      • Transparency
          • Automatic change of byte order
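
Two of the operations listed on this slide, a change of byte order and a transposition of array dimensions, are sketched below in plain Python. The function names are hypothetical and the transposition assumes a row-major flat layout; neither is taken from the BinX library.

```python
import struct

def swap_byte_order(raw, code="i"):
    """Re-encode a run of big-endian values (struct code `code`) as little-endian."""
    count = len(raw) // struct.calcsize(">" + code)
    values = struct.unpack(f">{count}{code}", raw)
    return struct.pack(f"<{count}{code}", *values)

def transpose(flat, rows, cols):
    """Transpose a rows x cols array stored row-major in a flat list."""
    return [flat[r * cols + c] for c in range(cols) for r in range(rows)]

swapped = swap_byte_order(bytes.fromhex("7FFFFFFF"))
print(swapped.hex(" ").upper())

# [[1, 2, 3], [4, 5, 6]] becomes [[1, 4], [2, 5], [3, 6]]
transposed = transpose([1, 2, 3, 4, 5, 6], rows=2, cols=3)
print(transposed)
```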

  17. Data query model
      • In-dataset query
          • XPath against virtual XML
      • Cross-dataset query
          • Link into multiple datasets
      • Defining result format
          • XQuery-based return fragment
      • Output interface
          • SAX events
      [Diagram: several BinX data sources (binary files) feed a utility built on the BinX library; XLink, XPath, Transform and XQuery are applied; results are delivered as SAX events in custom, DataBinX or VOTable form to the corresponding applications]
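
An in-dataset query of the kind listed here can be imitated on a materialised DataBinX document with the limited XPath support in Python's `xml.etree.ElementTree`. This is only an approximation: the slide describes querying a virtual XML view of the binary data, whereas the sketch below parses a small literal document.

```python
import xml.etree.ElementTree as ET

DATABINX = """
<dataset>
  <struct>
    <short-16>100</short-16>
    <long-64>1000</long-64>
    <double-64>5.257</double-64>
  </struct>
  <arrayFixed>
    <dim><integer-32>1</integer-32></dim>
    <dim><integer-32>2</integer-32></dim>
  </arrayFixed>
</dataset>
"""

root = ET.fromstring(DATABINX)

# All array elements, in document order.
array_values = [int(n.text) for n in root.findall("./arrayFixed/dim/integer-32")]

# A single element selected by position (XPath positions start at 1).
second = root.find("./arrayFixed/dim[2]/integer-32").text

# One member of the struct.
double_value = float(root.find("./struct/double-64").text)

print(array_values, second, double_value)
```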

  18. Data service model • Publishing logical datasets in BinX
      [Diagram: a client reaches, through the Grid, BinX-described datasets of three kinds: a dataset from one binary file, a dataset from several binary files, and a dataset from multiple data sources including a database]

  19. Data transportation model • DataBinX as interlingua
      [Diagram: the sending and receiving sides each use XSLT, the BinX utility and a ZIP tool; labels: XML document, DataBinX, Schema, BinX, BinX + Binary, ZIP (MIME), Send, Receive]

  20. Application in Astronomy • Case Study 1: Data Conversion Between FITS and VOTable

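  21. Application in astronomy • FITS and VOTable conversion
      [Diagram: the DataBinX utility, built on the BinX library core, sits between a FITS file (text header "SIMPLE = T … … END" followed by binary data) and a VOTable document (<?xml version=… <VOTABLE> … … </VOTABLE>)]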

  22. FITS file
      [Diagram: header records span character columns 0 to 79; the file consists of a primary HDU (header, data) followed by an extension (header, data)]

  23. VOTable
      <VOTABLE>
        <RESOURCE>
          <PARAM name="Obs" value="Bob"/>
          <TABLE name="Stars">
            <FIELD name="Star-name" datatype="char" arraysize="10" />
            <FIELD name="RA" datatype="float" />
            <FIELD name="Dec" datatype="float" />
            <FIELD name="Counts" datatype="int" arraysize="2x3x*" />
            <DATA>
              <TABLEDATA>
                <TR>
                  <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
                  <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD>
                </TR>
              </TABLEDATA>
            </DATA>
          </TABLE>
        </RESOURCE>
      </VOTABLE>
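
For readers unfamiliar with VOTable, the sketch below pairs each FIELD name with the matching TD cell of the row and works out the size of the variable ("*") dimension of the Counts array. It uses only Python's standard library and the document from this slide; it is not a general VOTable parser.

```python
import xml.etree.ElementTree as ET

VOTABLE = """
<VOTABLE>
  <RESOURCE>
    <PARAM name="Obs" value="Bob"/>
    <TABLE name="Stars">
      <FIELD name="Star-name" datatype="char" arraysize="10" />
      <FIELD name="RA" datatype="float" />
      <FIELD name="Dec" datatype="float" />
      <FIELD name="Counts" datatype="int" arraysize="2x3x*" />
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
            <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD>
          </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""

root = ET.fromstring(VOTABLE)
fields = [f.get("name") for f in root.iter("FIELD")]

# One dictionary per table row, keyed by field name.
rows = [dict(zip(fields, (td.text for td in tr))) for tr in root.iter("TR")]

# arraysize="2x3x*": the last dimension is variable, so derive it from the data.
counts = [int(v) for v in rows[0]["Counts"].split()]
variable_dim = len(counts) // (2 * 3)

print(rows[0]["Star-name"], float(rows[0]["RA"]), variable_dim)
```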

  24. FITS → DataBinX → VOTable • FITS to VOTable conversion
      [Diagram labels: FITS; preprocessor; BinX; Schema; DataBinX utility; DataBinX; XSLT; XSLT transformer; VOTable]

  25. VOTable → DataBinX → FITS • VOTable to FITS conversion
      [Diagram labels: VOTable; preprocessor; XSLT; XSLT transformer; DataBinX; Schema; BinX; DataBinX utility; binary data; FITS header; post processor; FITS]

  26. FITS-VOTable experiment
      • Sample FITS file
          • A data table of 82 rows × 20 fields
          • File size: 37KB
      • Generated DataBinX by DataBinX utility
          • Time spent: 268 ms
          • DataBinX document size: 1.2MB
      • VOTable transformed by MSXML
          • Time spent: about 1 second
          • VOTable document size: 51KB
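
The relative sizes reported on this slide can be put side by side with a few lines of arithmetic. The figures are those quoted above; treating 1 MB as 1024 KB is an assumption of the sketch, and the resulting ratios are approximate.

```python
# Sizes quoted on the slide, in kilobytes (1 MB taken as 1024 KB).
fits_kb = 37
databinx_kb = 1.2 * 1024
votable_kb = 51

databinx_ratio = databinx_kb / fits_kb   # growth from FITS to DataBinX
votable_ratio = votable_kb / fits_kb     # growth from FITS to VOTable

print(f"DataBinX is about {databinx_ratio:.0f}x the FITS file")
print(f"VOTable is about {votable_ratio:.1f}x the FITS file")
```

The intermediate DataBinX document is far larger than either end format, which is consistent with its role as a transient interchange form.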

  27. Application in Astronomy • Case Study 2: Data Transportation by pipelining BinX and VOTable

  28. The Problem
      • Three kinds of VOTable data sources
          • Pure XML VOTable (large)
          • VOTable + FITS (small)
          • VOTable + Binary (smaller)
      • Difficulties
          • Additional parser for VOTable+Binary
          • Limited binary format
          • Byte order and data types

  29. The Solution: VOTable + BinX • No coding necessary • Smaller data files • Easy to separate and restore • Pipelined to work in the background • Platform independent

  30. Approaches • Embedded BinX • BinX document linking
      Perhaps another method?

  31. Embedded BinX • Example:
      <VOTABLE xmlns:bx="http://www.edikt.org/binx/2003/06/binx">
        <TABLE name="stars">
          <FIELD name="star-name" datatype="char" arraysize="*"/>
          <FIELD name="RA" datatype="float"/>
          <DATA>
            <bx:dataset src="bin-file.dat">
              <bx:array>
                <bx:struct>
                  <bx:string varName="star-name" />
                  <bx:float-32 varName="RA" />
                </bx:struct>
              </bx:array>
            </bx:dataset>
          </DATA>
        </TABLE>
      </VOTABLE>
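
A consumer of the embedded form has to look up the BinX elements by namespace. The sketch below does this with Python's `xml.etree.ElementTree` and also compares the BinX `varName` values with the VOTable FIELD names, as one possible consistency check; the check itself is an illustration, not something the slides prescribe.

```python
import xml.etree.ElementTree as ET

NS = {"bx": "http://www.edikt.org/binx/2003/06/binx"}

VOTABLE = """
<VOTABLE xmlns:bx="http://www.edikt.org/binx/2003/06/binx">
  <TABLE name="stars">
    <FIELD name="star-name" datatype="char" arraysize="*"/>
    <FIELD name="RA" datatype="float"/>
    <DATA>
      <bx:dataset src="bin-file.dat">
        <bx:array>
          <bx:struct>
            <bx:string varName="star-name" />
            <bx:float-32 varName="RA" />
          </bx:struct>
        </bx:array>
      </bx:dataset>
    </DATA>
  </TABLE>
</VOTABLE>
"""

root = ET.fromstring(VOTABLE)

# Locate the embedded BinX dataset and the binary file it points to.
dataset = root.find(".//bx:dataset", NS)
src = dataset.get("src")

# Compare the labels in the BinX part with the VOTable field definitions.
binx_vars = [el.get("varName") for el in dataset.iter() if el.get("varName")]
votable_fields = [f.get("name") for f in root.iter("FIELD")]
consistent = binx_vars == votable_fields

print(src, binx_vars, consistent)
```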

  32. BinX Document Linking • Example:
      <VOTABLE>
        <TABLE name="stars">
          <FIELD name="star-name" datatype="char" arraysize="*"/>
          <FIELD name="RA" datatype="float"/>
          <DATA>
            <BINX href="stars-data-binx.xml" type="TABLEDATA"/>
          </DATA>
        </TABLE>
      </VOTABLE>

  33. Comparison of the two approaches
      • Embedded BinX
          • Advantages:
              • One annotation file
              • Consistency with VOTable definitions
          • Disadvantages:
              • Spoils the VOTable document
              • Difficult to parse
      • BinX document linking
          • Advantages:
              • Keeps the VOTable clean
              • Easy to parse
          • Disadvantages:
              • Needs a separate BinX document
              • Difficult to keep consistent

  34. BinX Software Today and the Future

  35. Future releases • Utilities (GUI BinX editor) • XPath-based data query • DFDL support • Text file support • Output through SAX events • Output as XQuery return • Database interfacing • Java wrapper for utilities

  36. Support
      • Information and software download: http://www.edikt.org/binx (coming soon)
      • Questions: support@edikt.org
      • Requirements and suggestions: tedwen@edikt.org, robertc@edikt.org
