The Knowledge Network for Biocomplexity (KNB), established in 1999 under NCEAS and LTER auspices, provides extensive resources for ecologists, particularly through EML (Ecological Metadata Language). EML, designed to facilitate data integration, offers XML format schemas for ecological data management. Key tools include Morpho for data structure setup and Metacat for backend storage, supporting data and metadata authentication and transformation. EML is advantageous for archiving and transfer but has limitations regarding relational data interfaces and visualization software integration.
KNB Overview • “Knowledge Network for Biocomplexity” • Started in 1999 • Auspices/people from NCEAS and LTER • http://knb.ecoinformatics.org • Lots of stuff to help ecologists integrate data • Open source code and process, available at http://cvs.ecoinformatics.org/cvs/cvsweb.cgi/eml/
EML/XML • XML format for ecological data and metadata • XML is a “markup” language • An XML “schema” describes a set of “base” XML documents containing the “real” information • EML is a set of schemas that describes that format • Data Markup • Metadata Markup
Other KNB products • Morpho • GUI application for setting up an EML data structure and entering data into it • Written in Java, network enabled • Metacat • Back end for storing data/metadata in EML • Security/Authentication • Transforms between EML/SQL/HTML • Validates EML
XML/EML Example
<?xml version="1.0"?>
<eml:eml packageId="eml.1.1" system="http://knb.ecoinformatics.org"
         xmlns:eml="eml://ecoinformatics.org/eml-2.0.0">
  <dataset>
    <title>Biodiversity surveys for Lesser Tree Frogs… </title>
    <creator id="23445" scope="document">
      <individualName>
        <givenName>Jane</givenName>
        <surName>Smith</surName>
      </individualName>
      <electronicMailAddress>jane@data.org</electronicMailAddress>
    </creator>
    <contact>
      <references>23445</references>
    </contact>
  </dataset>
</eml:eml>
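A minimal sketch of how a program might read the document above, assuming Python's standard `xml.etree.ElementTree` rather than any KNB tool. The `summarize` function and its output keys are hypothetical names chosen for illustration. It shows the one non-obvious step: `<contact>` holds only a `<references>` value, which has to be resolved against the `id` attribute of `<creator>`.

```python
import xml.etree.ElementTree as ET

# The slide's example document, copied as-is (including the elided title).
EML_DOC = """<?xml version="1.0"?>
<eml:eml packageId="eml.1.1" system="http://knb.ecoinformatics.org"
         xmlns:eml="eml://ecoinformatics.org/eml-2.0.0">
  <dataset>
    <title>Biodiversity surveys for Lesser Tree Frogs… </title>
    <creator id="23445" scope="document">
      <individualName>
        <givenName>Jane</givenName>
        <surName>Smith</surName>
      </individualName>
      <electronicMailAddress>jane@data.org</electronicMailAddress>
    </creator>
    <contact>
      <references>23445</references>
    </contact>
  </dataset>
</eml:eml>
"""


def summarize(eml_text):
    """Hypothetical helper: pull a few fields out of an EML document."""
    root = ET.fromstring(eml_text)
    # Only the root carries the eml: prefix; <dataset> is un-namespaced here.
    dataset = root.find("dataset")
    # Index every element that has an id so <references> can be resolved.
    by_id = {el.get("id"): el for el in dataset.iter() if el.get("id")}
    contact = by_id[dataset.findtext("contact/references")]
    return {
        "package_id": root.get("packageId"),
        "title": dataset.findtext("title").strip(),
        "contact_name": " ".join([
            contact.findtext("individualName/givenName"),
            contact.findtext("individualName/surName"),
        ]),
        "contact_email": contact.findtext("electronicMailAddress"),
    }


summary = summarize(EML_DOC)
print(summary)
```

The id lookup is the part a flat key/value reader would miss: the contact's name and address live only on the referenced `<creator>` element.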
XML Schema Example
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="character" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="friend-of" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
Schema Cont’d
              <xs:element name="since" type="xs:date"/>
              <xs:element name="qualification" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="isbn" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
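To make the rules in the book schema concrete, here is a hand-written check in Python that mirrors them. This is not XSD validation: the standard library has no schema validator, so `check_book` is a hypothetical stand-in that hard-codes the same constraints (element order, optional repeated `friend-of`, a date in `since`). The sample documents and their values are invented for illustration, and `date.fromisoformat` is only an approximation of `xs:date`, which also allows a time zone suffix.

```python
import xml.etree.ElementTree as ET
from datetime import date


def check_book(xml_text):
    """Return a list of problems; an empty list means the document passed."""
    book = ET.fromstring(xml_text)
    if book.tag != "book":
        return ["root element must be <book>"]
    problems = []
    children = [child.tag for child in book]
    # xs:sequence: title, author, then zero or more character elements.
    if children[:2] != ["title", "author"]:
        problems.append("book must start with <title> then <author>")
    if any(tag != "character" for tag in children[2:]):
        problems.append("only <character> may follow <author>")
    for character in book.findall("character"):
        tags = [child.tag for child in character]
        friends = [tag for tag in tags if tag == "friend-of"]
        # name, any number of friend-of, then since and qualification.
        if tags != ["name"] + friends + ["since", "qualification"]:
            problems.append("character children out of order or missing")
            continue
        try:
            date.fromisoformat(character.findtext("since"))
        except ValueError:
            problems.append("<since> is not a valid xs:date")
    return problems


# Illustrative documents; the isbn attribute is optional in the schema.
GOOD = (
    '<book isbn="000"><title>Example Title</title><author>A. Author</author>'
    "<character><name>Rex</name><friend-of>Sam</friend-of>"
    "<since>1999-01-31</since><qualification>loyal</qualification>"
    "</character></book>"
)
BAD = GOOD.replace("1999-01-31", "last year")

print(check_book(GOOD))
print(check_book(BAD))
```

A real schema validator does this generically from the `.xsd` file, which is the point of the "automatic validation" claim: none of these checks has to be written by hand per format.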
EML Advantages • Automatic transformation using XSLT or stylesheets • XML to HTML • XML to text or other format • XML to different XML • Automatic validation • Correct Hierarchical Structure • Correct Range and Type for data • Structure can help discovery
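The XML-to-HTML transformation listed above is normally done with an XSLT stylesheet. Python's standard library has no XSLT processor, so the sketch below swaps in plain `xml.etree.ElementTree` code to show the same idea: walk the EML tree and emit a readable HTML page. The function name `eml_to_html` and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Small illustrative EML document (values are made up for the sketch).
EML_SAMPLE = """<eml:eml packageId="eml.1.1"
    xmlns:eml="eml://ecoinformatics.org/eml-2.0.0">
  <dataset>
    <title>Frog survey</title>
    <creator id="23445">
      <individualName>
        <givenName>Jane</givenName>
        <surName>Smith</surName>
      </individualName>
      <electronicMailAddress>jane@data.org</electronicMailAddress>
    </creator>
  </dataset>
</eml:eml>"""


def eml_to_html(eml_text):
    """Hypothetical stand-in for a stylesheet: EML dataset to an HTML page."""
    dataset = ET.fromstring(eml_text).find("dataset")
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = dataset.findtext("title")
    creators = ET.SubElement(body, "ul")
    for creator in dataset.findall("creator"):
        given = creator.findtext("individualName/givenName")
        sur = creator.findtext("individualName/surName")
        email = creator.findtext("electronicMailAddress")
        item = ET.SubElement(creators, "li")
        item.text = f"{given} {sur} "
        # Each creator's address becomes a mailto link.
        link = ET.SubElement(item, "a", href=f"mailto:{email}")
        link.text = email
    return ET.tostring(html, encoding="unicode", method="html")


page = eml_to_html(EML_SAMPLE)
print(page)
```

With a real stylesheet the mapping lives in a separate `.xsl` file, so the same EML can be rendered as HTML, text, or a different XML vocabulary without changing any program code.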
EML Disadvantages • Complex • Verbose • Hierarchical data hard to interface with relational data • Hard to read without stylesheet
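The difficulty of interfacing hierarchical data with relational data can be seen in a small sketch, assuming Python's standard `sqlite3` and `xml.etree.ElementTree`. The table layout (`dataset`, `party`, `contact`), the function `load_into_sql`, and the sample values are all hypothetical. Each level of nesting has to be mapped to its own table by hand, and the `<references>` pointer becomes a foreign key.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Small illustrative EML document (values are made up for the sketch).
EML_SAMPLE = """<eml:eml packageId="eml.1.1"
    xmlns:eml="eml://ecoinformatics.org/eml-2.0.0">
  <dataset>
    <title>Frog survey</title>
    <creator id="23445">
      <individualName>
        <givenName>Jane</givenName>
        <surName>Smith</surName>
      </individualName>
      <electronicMailAddress>jane@data.org</electronicMailAddress>
    </creator>
    <contact><references>23445</references></contact>
  </dataset>
</eml:eml>"""


def load_into_sql(eml_text, conn):
    """Hypothetical loader: one table per nesting level of the EML tree."""
    root = ET.fromstring(eml_text)
    package_id = root.get("packageId")
    dataset = root.find("dataset")
    conn.execute("CREATE TABLE dataset (package_id TEXT PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE party (id TEXT PRIMARY KEY, package_id TEXT, "
        "given_name TEXT, sur_name TEXT, email TEXT)"
    )
    conn.execute("CREATE TABLE contact (package_id TEXT, party_id TEXT)")
    conn.execute(
        "INSERT INTO dataset VALUES (?, ?)",
        (package_id, dataset.findtext("title")),
    )
    for creator in dataset.findall("creator"):
        conn.execute(
            "INSERT INTO party VALUES (?, ?, ?, ?, ?)",
            (
                creator.get("id"),
                package_id,
                creator.findtext("individualName/givenName"),
                creator.findtext("individualName/surName"),
                creator.findtext("electronicMailAddress"),
            ),
        )
    # <references> holds the id of a party, so it maps to a foreign key.
    for ref in dataset.findall("contact/references"):
        conn.execute("INSERT INTO contact VALUES (?, ?)", (package_id, ref.text))


conn = sqlite3.connect(":memory:")
load_into_sql(EML_SAMPLE, conn)
rows = conn.execute(
    "SELECT d.title, p.sur_name, p.email "
    "FROM contact c "
    "JOIN party p ON p.id = c.party_id "
    "JOIN dataset d ON d.package_id = c.package_id"
).fetchall()
print(rows)
```

Even this three-element document needs three tables and two joins to answer "who is the contact for this dataset", which illustrates why a relational store is easier to query once the mapping work is done.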
Other Similar Systems • FGDC (Federal Geographic Data Committee metadata standard) • Similarities: • Fine grained • Somewhat extensible • Uses some XML tooling • Differences: • Its own format • Different internal development process
PEEIR Data Needs • Need format for data transfer and archiving • Need data store that is easily queried and optimized for ad hoc analysis • Need data store that can integrate with visualization software
Suitability of EML • Good for archive/transfer. • OK for query friendly store, but not as good as relational DB • Probably not so good for integrating with visualization software
Summary • Probably advocate for a limited use of EML for metadata only • Keep data per se in combination of SQL, text, and binary files