The BinX Language
What is BinX? • Binary in XML • Use XML to mark up binary data • Mark up data types • Mark up sequences • Mark up arrays • Mark up complex structures
Primitive Data Types • Mark up data types • Byte sequences: (1) FF 7F (2) 7F FF FF FF (3) 00 00 C8 42 (4) 42 C8 00 00 • (1) <short-16 byteOrder="littleEndian">32767</short-16> • (2) <integer-32 byteOrder="bigEndian">2147483647</integer-32> • (3) <float-32 byteOrder="littleEndian">100.0</float-32> • (4) <float-32 byteOrder="bigEndian">100.0</float-32>
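The byte sequences on this slide can be checked with a short Python sketch. It uses only the standard `struct` module and is not part of the BinX library; the helper name `encode` is made up for illustration.

```python
import struct

# (BinX element and byte order from the slide, struct format code, value)
CASES = [
    ("short-16 littleEndian", "<h", 32767),
    ("integer-32 bigEndian", ">i", 2147483647),
    ("float-32 littleEndian", "<f", 100.0),
    ("float-32 bigEndian", ">f", 100.0),
]

def encode(fmt, value):
    """Pack one value and return its bytes as spaced, upper-case hex."""
    return struct.pack(fmt, value).hex(" ").upper()

for name, fmt, value in CASES:
    print(f"{name:24} {value!r:>12} -> {encode(fmt, value)}")
```

The same value yields mirrored byte sequences under the two byte orders, which is why each BinX primitive carries a byteOrder attribute.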
Abstract “struct” types • Mark up a sequence • Screen descriptor in GIF: screen width (unsigned short); screen height (unsigned short); packed field (byte); background colour index (byte); pixel aspect ratio (byte) • <struct> <unsignedShort-16 /> <unsignedShort-16 /> <byte-8 /> <byte-8 /> <byte-8 /> </struct>
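To see what this struct describes, here is a minimal Python sketch that reads the same five fields from a GIF byte stream with the standard `struct` module. The names `ScreenDescriptor` and `read_screen_descriptor` are illustrative, and the sample bytes are constructed in the sketch rather than taken from a real image.

```python
import struct
from collections import namedtuple

ScreenDescriptor = namedtuple(
    "ScreenDescriptor",
    "width height packed background_index pixel_aspect_ratio",
)

# Two unsigned 16-bit values and three bytes; GIF stores integers little-endian.
LAYOUT = struct.Struct("<HHBBB")

def read_screen_descriptor(gif_bytes):
    """Decode the screen descriptor that follows the 6-byte GIF signature."""
    return ScreenDescriptor(*LAYOUT.unpack_from(gif_bytes, 6))

# Constructed sample: signature followed by a 320 x 200 screen descriptor.
sample = b"GIF89a" + LAYOUT.pack(320, 200, 0xF7, 0, 0)
print(read_screen_descriptor(sample))
```

The format string plays the role of the BinX struct: it fixes the order and width of the fields, while the values come from the binary data.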
Abstract “array” types • Mark up an array • A 2-dimensional, 10-by-100 array of 32-bit integers • <arrayFixed> <integer-32 /> <dim indexTo="99"> <dim indexTo="9" /> </dim> </arrayFixed>
Embedded abstract types • Complex structures • <struct> <short-16 /> <arrayFixed> <byte-8 /> <dim indexTo="7" /> </arrayFixed> <struct> <integer-32 /> <float-32 /> <double-64 /> </struct> </struct>
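Because structs and arrays nest, the size of the binary data a description covers can be computed recursively. The sketch below is one way to do that in Python with `xml.etree.ElementTree`; it is not the BinX library. It assumes that primitive sizes follow the numeric suffix of the element name and that indexTo is the last zero-based index (so indexTo="9" means 10 elements), which matches the 10-by-100 array example.

```python
import xml.etree.ElementTree as ET

# Byte sizes taken from the numeric suffix of the primitive names in the slides.
PRIMITIVE_SIZES = {
    "byte-8": 1, "character-8": 1,
    "short-16": 2, "unsignedShort-16": 2,
    "integer-32": 4, "float-32": 4,
    "long-64": 8, "double-64": 8,
}

def dim_length(dim):
    """Number of elements spanned by a <dim>, including any nested <dim>."""
    # Assumption: indexTo is the last zero-based index, so the extent is indexTo + 1.
    if "count" in dim.attrib:
        extent = int(dim.get("count"))
    else:
        extent = int(dim.get("indexTo")) + 1
    nested = dim.find("dim")
    return extent * (dim_length(nested) if nested is not None else 1)

def byte_size(elem):
    """Size in bytes of the data described by a primitive, struct or arrayFixed."""
    if elem.tag in PRIMITIVE_SIZES:
        return PRIMITIVE_SIZES[elem.tag]
    if elem.tag == "struct":
        return sum(byte_size(child) for child in elem)
    if elem.tag == "arrayFixed":
        element_type = next(child for child in elem if child.tag != "dim")
        return byte_size(element_type) * dim_length(elem.find("dim"))
    raise ValueError(f"unsupported element: {elem.tag}")

EMBEDDED = ET.fromstring(
    '<struct><short-16/>'
    '<arrayFixed><byte-8/><dim indexTo="7"/></arrayFixed>'
    '<struct><integer-32/><float-32/><double-64/></struct></struct>'
)
GRID = ET.fromstring(
    '<arrayFixed><integer-32/><dim indexTo="99"><dim indexTo="9"/></dim></arrayFixed>'
)

print(byte_size(EMBEDDED))  # 2 + 8 + (4 + 4 + 8)
print(byte_size(GRID))      # 4 bytes * 100 * 10
```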
User-defined metadata • Label the data types and structures • <struct varName="Data Sample"> <short-16 varName="ID" /> <arrayFixed varName="List of 10 complex numbers"> <struct varName="Complex"> <float-32 varName="Real" /> <float-32 varName="Imaginary" /> </struct> <dim indexTo="9" /> </arrayFixed> </struct>
Reusable type definitions • Define macros for reuse • <definitions> <defineType typeName="FourCC"> <arrayFixed> <character-8 /> <dim count="4" /> </arrayFixed> </defineType> </definitions> <struct varName="Wave_Header"> <useType typeName="FourCC" varName="Keyword" /> <integer-32 varName="Chunk_Size" /> </struct>
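The macro behaviour of defineType and useType can be sketched as a simple substitution. The Python below is illustrative only: the wrapping `<fragment>` root exists just so the two pieces parse as one document, `expand_use_types` is a hypothetical helper, and copying varName onto the expanded type is an assumption about how the label carries over.

```python
import copy
import xml.etree.ElementTree as ET

# <fragment> is a hypothetical wrapper so the slide's two pieces parse together.
FRAGMENT = """<fragment>
  <definitions>
    <defineType typeName="FourCC">
      <arrayFixed><character-8/><dim count="4"/></arrayFixed>
    </defineType>
  </definitions>
  <struct varName="Wave_Header">
    <useType typeName="FourCC" varName="Keyword"/>
    <integer-32 varName="Chunk_Size"/>
  </struct>
</fragment>"""

def expand_use_types(root):
    """Replace every <useType> with a copy of the type it names."""
    types = {d.get("typeName"): d[0] for d in root.iter("defineType")}

    def expand(parent):
        for index, child in enumerate(list(parent)):
            if child.tag == "useType":
                replacement = copy.deepcopy(types[child.get("typeName")])
                # Assumption: the label given at the point of use wins.
                if child.get("varName") is not None:
                    replacement.set("varName", child.get("varName"))
                parent[index] = replacement
                child = replacement
            expand(child)

    expand(root)
    return root

root = expand_use_types(ET.fromstring(FRAGMENT))
header = root.find("struct")
print([(child.tag, child.get("varName")) for child in header])
```

After expansion the Wave_Header struct holds a four-character array labelled Keyword followed by the 32-bit Chunk_Size, so later processing only has to handle primitives, structs and arrays.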
Linking to binary data • Reference the binary data file • <definitions> <defineType typeName="Header">… …</defineType> <defineType typeName="Format_Chunk">… …</defineType> <defineType typeName="Data_Chunk">… …</defineType> </definitions> <dataset src="myfile.wav"> <useType typeName="Header" /> <useType typeName="Format_Chunk" /> <useType typeName="Data_Chunk" /> </dataset>
A BinX document
<binx byteOrder="bigEndian">
  <definitions>
    <defineType typeName="myTyp">
      <arrayFixed>
        <character-8/>
        <dim indexTo="9"/>
      </arrayFixed>
    </defineType>
  </definitions>
  <dataset src="myfile.bin">
    <useType typeName="myTyp"/>
    <integer-32 varName="X" />
  </dataset>
</binx>
• Callouts: root element (<binx>); data class section (<definitions>); abstract data type (<arrayFixed>); data instance section (<dataset>)
DataBinX • DataBinX = BinX with data • BinX: <dataset src="myfile.bin"> <struct> <short-16 /> <long-64 /> <double-64 /> </struct> <arrayFixed> <integer-32 /> <dim count="2" /> </arrayFixed> </dataset> • DataBinX: <dataset> <struct> <short-16>100</short-16> <long-64>1000</long-64> <double-64>5.257</double-64> </struct> <arrayFixed> <dim> <integer-32>1</integer-32> </dim> <dim> <integer-32>2</integer-32> </dim> </arrayFixed> </dataset>
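The step from BinX plus binary data to DataBinX can be sketched for this one layout. The Python below hard-codes the slide's structure (a short-16, a long-64, a double-64, then two integer-32 values) rather than interpreting a BinX document, and it assumes big-endian data; `to_databinx` is a hypothetical name, not the DataBinX utility.

```python
import struct
import xml.etree.ElementTree as ET

# short-16, long-64, double-64, then an array of two integer-32 values.
LAYOUT = "hqd2i"

def to_databinx(data, byte_order=">"):
    """Decode the fixed layout and emit it as a DataBinX-style XML string."""
    short, long_, double, first, second = struct.unpack(byte_order + LAYOUT, data)
    dataset = ET.Element("dataset")
    record = ET.SubElement(dataset, "struct")
    for tag, value in (("short-16", short), ("long-64", long_), ("double-64", double)):
        ET.SubElement(record, tag).text = str(value)
    array = ET.SubElement(dataset, "arrayFixed")
    for value in (first, second):
        dim = ET.SubElement(array, "dim")
        ET.SubElement(dim, "integer-32").text = str(value)
    return ET.tostring(dataset, encoding="unicode")

# Constructed sample standing in for the contents of myfile.bin.
sample = struct.pack(">" + LAYOUT, 100, 1000, 5.257, 1, 2)
print(to_databinx(sample))
```

The 26 bytes of input become a much longer XML string, which is the size trade-off that the FITS-VOTable experiment later in the deck quantifies.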
BinX Components • The library provides core functionality that supports generic utilities and domain-specific applications • BinX library (core): parse/generate BinX documents; read/write binary data; parse/generate DataBinX • Utilities (generic tools): DataBinX pack/unpack; extractor; viewer; BinX editor • Applications (domain-specific)
BinX application models • Data catalogue model • Data manipulation model • Data query model • Data service model • Data transportation model
Data catalogue model • Primary storage: binary data files • Metadata: syntactic annotation; semantic annotation; classification (domain specific); cross-reference (XLink) • [Diagram: a metadata hierarchy of BinX documents, from abstract (BinX 1) through BinX 1.1 and BinX 1.2 to detailed (BinX 1.2.1, 1.2.2, 1.2.3), layered over the binary data]
Data manipulation model • Extraction: subset of a dataset • Combination: merge several datasets • Transformation: conversion of data types; change of sequence order; transposition of array dimensions • Transparency: automatic change of byte order
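Two of these manipulations, a change of byte order and a transposition of array dimensions, can be sketched in a few lines of Python. Both helpers are illustrative and are not BinX library functions; the transposition assumes the array is stored row by row.

```python
import struct

def change_byte_order(data, code, count, source="<", target=">"):
    """Re-encode `count` values of struct type `code` in another byte order."""
    values = struct.unpack(f"{source}{count}{code}", data)
    return struct.pack(f"{target}{count}{code}", *values)

def transpose(flat, rows, cols):
    """Transpose a rows x cols array stored row by row in a flat list."""
    return [flat[r * cols + c] for c in range(cols) for r in range(rows)]

little = bytes.fromhex("0000C842")               # 100.0 as little-endian float-32
print(change_byte_order(little, "f", 1).hex())   # the big-endian encoding
print(transpose([1, 2, 3, 4, 5, 6], 2, 3))       # 2 x 3 becomes 3 x 2
```

With the byte order recorded in the BinX description, a reader can apply this kind of conversion automatically, which is what the transparency bullet refers to.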
Data query model • In-dataset query: XPath against virtual XML • Cross-dataset query: link into multiple datasets • Defining result format: XQuery-based return fragment • Output interface: SAX events • [Diagram: several BinX data sources feed a utility built on the BinX library; queries use XLink, XPath, XQuery and transforms; results are delivered as SAX events in custom, DataBinX or VOTable form to the corresponding applications]
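An in-dataset query can be illustrated on a small DataBinX-style document. This Python sketch materialises the XML and uses the XPath subset of `xml.etree.ElementTree`, whereas the slide describes querying a virtual XML view through the BinX library. The varName attributes on the data elements and the sample values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative DataBinX-style document; carrying varName here is an assumption.
DATABINX = """<dataset>
  <arrayFixed varName="stars">
    <dim><struct>
      <string varName="star-name">Procyon</string>
      <float-32 varName="RA">114.827</float-32>
    </struct></dim>
    <dim><struct>
      <string varName="star-name">Sirius</string>
      <float-32 varName="RA">101.287</float-32>
    </struct></dim>
  </arrayFixed>
</dataset>"""

def right_ascensions(dataset):
    """Select every RA value with one XPath expression."""
    matches = dataset.findall(".//struct/float-32[@varName='RA']")
    return [float(m.text) for m in matches]

def stars_with_ra_above(dataset, threshold):
    """Names of the records whose RA exceeds the threshold."""
    names = []
    for record in dataset.findall(".//struct"):
        ra = float(record.find("float-32[@varName='RA']").text)
        if ra > threshold:
            names.append(record.find("string[@varName='star-name']").text)
    return names

root = ET.fromstring(DATABINX)
print(right_ascensions(root))
print(stars_with_ra_above(root, 110.0))
```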
Data service model • Publishing logical datasets in BinX • Dataset from one binary file • Dataset from several binary files • Dataset from multiple data sources (binary files and a database) • [Diagram: a client accesses the BinX-described datasets over the Grid]
Data transportation model • DataBinX as interlingua • [Diagram: on both the sending and the receiving side, XSLT, the BinX utility and a ZIP tool convert between an XML document, DataBinX, and a zipped (MIME) package of BinX + binary data, guided by a BinX schema]
Application in Astronomy • Case Study 1: Data Conversion Between FITS and VOTable
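Application in astronomy • FITS and VOTable conversion • [Diagram: the DataBinX utility, built on the BinX library core, converts between a FITS file (a header running from SIMPLE = T … … to END, followed by binary data) and a VOTable XML document (<VOTABLE> … … </VOTABLE>)]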
FITS file • Primary HDU: header, data • Extension: header, data • [Diagram: header records drawn with character columns 0 to 79]
VOTable • <VOTABLE> <RESOURCE> <PARAM name="Obs" value="Bob"/> <TABLE name="Stars"> <FIELD name="Star-name" datatype="char" arraysize="10" /> <FIELD name="RA" datatype="float" /> <FIELD name="Dec" datatype="float" /> <FIELD name="Counts" datatype="int" arraysize="2x3x*" /> <DATA> <TABLEDATA> <TR> <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD> <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD> </TR> </TABLEDATA> </DATA> </TABLE> </RESOURCE> </VOTABLE>
FITS → DataBinX → VOTable • FITS to VOTable conversion • [Diagram components: FITS, preprocessor, BinX, schema, DataBinX utility, DataBinX, XSLT, XSLT transformer, VOTable]
VOTable → DataBinX → FITS • VOTable to FITS conversion • [Diagram components: VOTable, XSLT, XSLT transformer, DataBinX, schema, BinX, preprocessor, DataBinX utility, binary data, FITS header, post processor, FITS]
FITS-VOTable experiment • Sample FITS file: a data table of 82 rows × 20 fields; file size 37KB • DataBinX generated by the DataBinX utility: time spent 268 ms; document size 1.2MB • VOTable transformed by MSXML: time spent about 1 second; document size 51KB
Application in Astronomy • Case Study 2: Data Transportation by Pipelining BinX and VOTable
The Problem • Three kinds of VOTable data sources • Pure XML VOTable (large) • VOTable + FITS (small) • VOTable + Binary (smaller) • Difficulties • Additional parser for VOTable+Binary • Limited binary format • Byte order and data types
The Solution: VOTable + BinX • No coding necessary • Smaller data files • Easy to separate and restore • Pipelined to work in the background • Platform independent
Approaches • Embedded BinX • BinX document linking • Perhaps another method?
Embedded BinX • Example: <VOTABLE xmlns:bx="http://www.edikt.org/binx/2003/06/binx"> <TABLE name="stars"> <FIELD name="star-name" datatype="char" arraysize="*"/> <FIELD name="RA" datatype="float"/> <DATA> <bx:dataset src="bin-file.dat"> <bx:array> <bx:struct> <bx:string varName="star-name" /> <bx:float-32 varName="RA" /> </bx:struct> </bx:array> </bx:dataset> </DATA> </TABLE> </VOTABLE>
BinX Document Linking • Example: <VOTABLE> <TABLE name="stars"> <FIELD name="star-name" datatype="char" arraysize="*"/> <FIELD name="RA" datatype="float"/> <DATA> <BINX href="stars-data-binx.xml" type="TABLEDATA"/> </DATA> </TABLE> </VOTABLE>
Comparison of the two approaches • Embedded BinX • Advantages: one annotation file; consistency with VOTable definitions • Disadvantages: spoils the VOTable document; difficult to parse • BinX document linking • Advantages: keeps the VOTable clean; easy to parse • Disadvantages: needs a separate BinX document; difficult to keep consistent
BinX Software Today and the Future
Future releases • Utilities (GUI BinX editor) • XPath-based data query • DFDL support • Text file support • Output through SAX events • Output as XQuery return • Database interfacing • Java wrapper for utilities
Support • Information and software download: • http://www.edikt.org/binx (coming soon) • Questions: • support@edikt.org • Requirements and suggestions: • tedwen@edikt.org • robertc@edikt.org