
NGDA Format Registry



Presentation Transcript


  1. NGDA Format Registry
     • Why do we need a FR?
     • We are designing with long-term storage in mind (> 100 years)
     • Cannot depend on format spec to be available via URL or even a format registry that might not still be up to date or in existence
     • Thus semantic definition of format must be archived with the object itself
     • This semantic definition must be comprehensive so that format can be accessed even if current access mechanisms no longer exist!
     Catherine Masi, National Geospatial Digital Archive

  2. NGDA Format Registry
     • Two major tasks
     • Analyze and define spatial data formats (Meredith Williams)
     • Develop local format registry with programmatic interface to existing authoritative/collaborative FR (Catherine Masi)

  3. Analyze and define spatial data formats
     • Is there a comprehensive list of geospatial formats? Are they defined? How?
     • List of Spatial Data Formats - MW
     • Digital Map Formats
     • Vector File Formats
     • Raster File Formats
     • Other categories - TIN, ASCII, 3D, Tabular Databases
     • Unacceptable Formats

  4. Analyze and define spatial data formats
     • What formats do we have in ADL? How do we define them?
     • ADL format documentation
     • ADL website: http://www.alexandria.ucsb.edu/adl/Collection%20Development/BucketDescrip.htm
     • MIME types: http://www.iana.org/assignments/media-types/
     • ADL literature/presentations:
     • Format
     • type: hierarchical
     • vocabulary: ADL Object Format Thesaurus
     • loosely based on MIME
     • multiple values: union
     • compare: DC.Format
     • ADL Webclient list: http://webclient.alexandria.ucsb.edu/mw/index.jsp

  5. Analyze and define spatial data formats
     • What are our preferred formats for NGDA, if any?
     • MW tested three geospatial formats using Sustainability Test derived from LCDF
     • GJ - "we can ingest anything if we have the definition representation information"
     • Decided to limit allowed formats to a few the first year: the CASIL test suite (geotiff, shapefile)
     • What if free proprietary software, such as from ESRI, allows one to look at the files? Should we request and archive that as well? No (UCSB)

  6. Analyze and define spatial data formats
     • How will we define our formats?
     • Using Meredith's list of Spatial Data Formats
     • Begin defining using LoC Digital Formats as an example
     • How do we know that we have sufficient semantic information to define each geospatial format?
     • What information is required to make the format usable? Ask the users.
     • What information is required to programmatically access the format if current access mechanisms become obsolete?
     • Prioritize and start with most important/ubiquitous formats for our archive
     • Coordinate with format definitions in JHOVE

  7. Develop local format registry with programmatic interface to existing authoritative/collaborative FR
     • What format registries are out there?
     • Library of Congress Digital Formats (LCDF)
     • Global Digital Format Registry (GDFR) - Harvard
     • Global Digital Format Registry Description
     • Ockerbloom's Format Registry Demonstrator (FRED)
     • PRONOM - File format registry - UK archives
     • Practical, in use, not geo-spatial

  8. Develop local format registry with programmatic interface to existing authoritative/collaborative FR
     • Coordinate our efforts with the LCDF, GDFR, FRED, TOM
     • NH initiated contact (Stephen Abrams, John Ockerbloom, Steve Morris, etc.) at DLF
     • Questions for DLF meeting to get discussion started.
     • The questions we formulated showed that we have to solve a lot of these problems on our own, especially the technical aspects of building a FR and the interaction mechanisms between LC, GDFR and our local FR

  9. Develop local format registry with programmatic interface to existing authoritative/collaborative FR
     • Do the existing format registries contain geospatial formats?
     • No. In the future we will contribute geospatial formats to an existing registry effort such as LCDF or GDFR

  10. Develop local format registry with programmatic interface to existing authoritative/collaborative FR
     • Do the existing format registries support access and contribution mechanisms?
     • No.

  11. Develop local format registry with programmatic interface to existing authoritative/collaborative FR
     • How are Library of Congress Digital Formats stored internally? Database? XML? Directory structure?
     • In MS Word files

  12. Develop local format registry with programmatic interface to existing authoritative/collaborative FR
     • Is there a data dictionary or other mechanism for defining fields in LCDF?
     • FDD

  13. Develop local format registry with programmatic interface to existing authoritative/collaborative FR
     • CM contacted Steve Morris (NCSU - NDIIPP), Stephen Abrams (Harvard - GDFR) and John Mark Ockerbloom (Penn - FRED) to open up a discussion on the technical aspects of developing a geospatial format registry.
     • S. Abrams responded that GDFR is still only an idea rather than a reality and that a technical discussion of how our GIS formats should be managed in a GDFR-conformant way is a bit premature

  14. Develop local format registry with programmatic interface to existing authoritative/collaborative FR
     • What are the requirements for the NGDA Format Registry?
     • independent
     • contains sufficient semantic information to programmatically access format (UCSB)
     • contains geospatial reference information
     • definitions exist in simple documented format in simple directory structure
     • access/search mechanism not necessary for access
     • interfaces with collaborative authoritative FR for updates and contributions
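The "simple directory structure" and "access/search mechanism not necessary" requirements on slide 14 can be illustrated with a minimal Python sketch. It assumes one directory per format, each holding a record file; the file name `record.html` and the function `list_formats` are hypothetical and do not come from the slides.

```python
from pathlib import Path
import tempfile

# Hypothetical name for the per-format record file (an assumption, not from the slides).
RECORD_NAME = "record.html"


def list_formats(registry_root):
    """Return the sorted names of format directories that contain a record file.

    Only the file system is consulted: no database or search service is needed.
    """
    root = Path(registry_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and (entry / RECORD_NAME).is_file()
    )


if __name__ == "__main__":
    # Build a throwaway registry with the two CASIL formats named in the slides.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("geotiff", "shapefile"):
            (Path(tmp) / name).mkdir()
            (Path(tmp) / name / RECORD_NAME).write_text("<html></html>")
        (Path(tmp) / "notes").mkdir()  # has no record file, so it is skipped
        print(list_formats(tmp))
```

Because the listing needs nothing beyond directory traversal, a future reader could reproduce it by hand or in any language, which is the point of the requirement.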

  15. First steps:
     • CM began prototyping the physical structure of format registry using 2 CASIL formats, geotiff and shapefile.
     • Created directory based registry.
     • Incorporated info from MW's documents Spatial Data Formats and Sustainability Test
     • Created record layout loosely based on Library of Congress Digital Formats but including spatial reference information.
     • Included format spec as local website (in the case of geotiff) and as local pdf file (in the case of shapefile).
     • All links on record referred to local copies of format information.
     • All documentation about the format is located locally in that format's directory
     • Entries are not complete. This is just a first pass at what the html-rendered format entries will look like. Focus here is on physical structure rather than content.
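The rule on slide 15 that all links on a record refer to local copies lends itself to an automated check. The following is a sketch only, assuming records are HTML; `LinkCollector` and `external_links` are illustrative names, and "external" is taken here to mean any link that carries a URL scheme or a host.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def external_links(html_text):
    """Return links that point outside the registry (have a scheme or a host)."""
    collector = LinkCollector()
    collector.feed(html_text)
    flagged = []
    for link in collector.links:
        parts = urlparse(link)
        if parts.scheme or parts.netloc:
            flagged.append(link)
    return flagged


if __name__ == "__main__":
    record = (
        '<a href="spec/index.html">local spec</a>'
        '<a href="http://example.org/spec">remote spec</a>'
    )
    print(external_links(record))
```

An empty result for every record would indicate that the entry can be read without network access.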

  16. First steps:
     • Refining content using input from DV, MW and from actual data users as to what is needed to adequately define a format.
     • Determine sufficient semantic info to define geospatial formats
     • Review CASIL formats. Began to flesh out sufficient semantic info. Started with geotiff, shapefile.
     • Review record layout and add, change and delete fields.

  17. Next steps
     • Make sure format spec is complete and all information is located locally where possible.
     • Determine where we draw the line between format registry information/policy/higher level descriptive metadata. Format registry will stick to format spec and a few other important fields only.
     • Develop XML stylesheet of record layout. Decided that HTML, XML and PDF are acceptable archivable formats for format registry information.
     • Flatten the directory structure (hierarchy) because tfw, for example, is not a subtype of geotiff but can be attached to a tiff or another format. Work more on finding a sensible organization for the files in our FR.
     • Link to other parts of Archive (Descriptive Metadata) from within FR
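One way to picture the flattening described on slide 17 is to keep every format at the same level and record "can be attached to" as an explicit relation instead of encoding it in the directory hierarchy. The sketch below is an assumption about how that might look; the `may_accompany` field and the `companions_of` function are hypothetical, and only the tfw-to-tiff relation is taken from the slide.

```python
# Flat registry: every format is a top-level entry, none nested under another.
# "may_accompany" is a hypothetical field listing formats an entry can be attached to.
REGISTRY = {
    "tiff": {"may_accompany": []},
    "geotiff": {"may_accompany": []},
    "shapefile": {"may_accompany": []},
    # Per the slide, tfw is not a subtype of geotiff but can be attached to a tiff.
    "tfw": {"may_accompany": ["tiff"]},
}


def companions_of(format_id, registry=REGISTRY):
    """Return the sorted ids of formats that may be attached to format_id."""
    return sorted(
        name
        for name, record in registry.items()
        if format_id in record["may_accompany"]
    )


if __name__ == "__main__":
    print(companions_of("tiff"))
```

Adding another host format for tfw then means extending one list, with no directories to move.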

  18. Later
     • Develop method of search, retrieval, update
     • Begin to develop programmatic interface to LoC Digital Formats or other authoritative/collaborative format registry
