This presentation by R. Duerr at the HDF and HDF-EOS Workshop VIII discusses the implications of HDF and HDF-EOS formats for long-term data archiving and access. It explores the Open Archival Information System (OAIS) model, addressing the needs of data users, producers, and archivists. Key topics include the structure and semantics of data packages, preservation descriptions, and the importance of context and provenance. The presentation emphasizes the necessity of robust archival formats that support usability, integrity, and data longevity in an evolving digital landscape.
HDF and HDF-EOS: Implications for Long-Term Archiving and Data Access
Outline • Open Archival Information System (OAIS) reference model • Needs of data users, producers, and archivists HDF and HDF-EOS: Implications for Long-Term Archiving and Data Access Presented by R. Duerr at the HDF and HDF-EOS Workshop VIII, October 26-28, 2004, Aurora, CO
OAIS Information Packages [Figure: diagram with the labels Content Information, Preservation Description Information, Packaging Information, Package 1, and Descriptive Information About Package 1. Adapted from CCSDS 650.0-B-1]
OAIS Information Packages - Content Info. • Data Object - the information to be preserved • Representational Information - allows a user in the designated community to understand the data without consulting an expert; it covers both Structure / Syntax and Content / Semantics
OAIS Info. Packages - Preservation Description • Provenance - documents the history of the object (e.g., Instrument descriptions, Processing history, sensor descriptions, Instrument & mode, Software interface specs, etc.) • Reference - documents object identifiers and their generation mechanisms (e.g., Journal references, OID, Mission, Instrument, Data set Title, Parameters, etc.) • Fixity - documents methods used to ensure there are no undocumented changes (Method, Algorithm, checksum values, etc.) • Context - the relationship of the object to its environment (Calibration history, related data sets, funding history, etc.)
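The Fixity bullet above (method, algorithm, checksum values) can be illustrated with a minimal sketch. The function names and the record layout are hypothetical choices made for this example, and SHA-256 is an assumed algorithm; the slides do not name one.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_fixity_record(path, algorithm="sha256"):
    """Build a hypothetical fixity record: method, algorithm, checksum value."""
    return {
        "method": "checksum",
        "algorithm": algorithm,
        "value": file_checksum(path, algorithm),
    }


def verify_fixity(path, record):
    """Recompute the checksum and compare it with the stored value."""
    return file_checksum(path, record["algorithm"]) == record["value"]
```

Storing the algorithm name next to the value lets an archive change algorithms later without making older records unverifiable.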
OAIS Info. Packages - Packaging Information • Either logically or physically binds the Content Information and the Preservation Description Information into an object stored in a defined location • Generally changes when migration occurs
OAIS Info. Package - Descriptive Information • Information that allows users to find, assess, and retrieve/order information of interest (e.g., the catalog, indexes, etc.)
OAIS Information Packages [Figure from CCSDS 650.0-B-1]
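The four parts of an information package described on the preceding slides can be sketched as a small data structure. This is only an illustration: the class and field names are hypothetical and are not taken from CCSDS 650.0-B-1, and the completeness check is an assumed convenience rather than anything the reference model prescribes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContentInformation:
    data_object: bytes      # the information to be preserved
    structure: str = ""     # representation info: structure / syntax
    semantics: str = ""     # representation info: content / semantics


@dataclass
class PreservationDescription:
    provenance: List[str] = field(default_factory=list)
    reference: Dict[str, str] = field(default_factory=dict)
    fixity: Dict[str, str] = field(default_factory=dict)
    context: List[str] = field(default_factory=list)


@dataclass
class InformationPackage:
    content: ContentInformation
    preservation: PreservationDescription
    packaging: str = ""     # how and where the parts are bound together
    descriptive: Dict[str, str] = field(default_factory=dict)


def missing_components(package):
    """List the names of package components that are still empty."""
    checks = {
        "structure": package.content.structure,
        "semantics": package.content.semantics,
        "provenance": package.preservation.provenance,
        "reference": package.preservation.reference,
        "fixity": package.preservation.fixity,
        "context": package.preservation.context,
        "packaging": package.packaging,
        "descriptive": package.descriptive,
    }
    return [name for name, value in checks.items() if not value]
```

A check like `missing_components` could run at ingest so that gaps in the supporting information surface while the data producer is still available to fill them.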
Data User Needs • Searchable on their terms • Amounts of data ranging from the minuscule to the entire contents of the archive and anything in between • Data formatted their way • With enough supporting information to be useful to them without having to consult an expert
Data Producer Needs • An archive ready, willing, and able to accept their data in whatever format, on whatever media, with whatever packaging can best be accommodated by both parties • High-volume data producers need to be able to work with the archive to define automated interfaces
Archive Needs • Data producers willing to help develop the information needed by users • A good long-term archive format
What is a Good Long-Term Archive Format? • Per a recent paper by Mike Folk and Bruce Barkstrom • Ease of archival storage • Ease of archival access • Usability • Data scholarship enablement • Support for data integrity • Maintainability and durability
What Makes a Good File Format? • Per Eric Raymond, in “The Art of Unix Programming”, Addison-Wesley, 2004 • Transparency • Interoperability • Extensibility • Storage economy • He argues that the best general-purpose file format is text • He also argues that the only good justification for binary data is very large data sets
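The trade-off between transparency and storage economy can be made concrete with a small comparison. This is a sketch under stated assumptions: the sample values, the one-value-per-line text layout, and the little-endian float64 binary layout are all choices made for this example and do not come from the presentation.

```python
import struct

# Hypothetical sample data: 1000 double-precision values.
values = [i / 7 for i in range(1, 1001)]

# Text encoding: one value per line, readable in any editor.
text_bytes = "".join(f"{v!r}\n" for v in values).encode("ascii")

# Binary encoding: fixed 8 bytes per value, little-endian IEEE 754 float64.
binary_bytes = struct.pack(f"<{len(values)}d", *values)

# Both encodings preserve the values exactly.
from_text = [float(line) for line in text_bytes.decode("ascii").splitlines()]
from_binary = list(struct.unpack(f"<{len(values)}d", binary_bytes))

print("text size:  ", len(text_bytes), "bytes")
print("binary size:", len(binary_bytes), "bytes")
```

Both forms round-trip without loss here; the binary form is more compact, but a reader has to know the byte order and element type in advance, which is exactly the representational information an archive has to keep.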
Disaster-Proofing Your Data • If you can’t keep a data set as text, then at least keep the representational information in a human-readable format (preferably right with the data itself)
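One way to keep representational information in human-readable form right with the data is to put a plain-text header in front of the binary payload. The sketch below assumes a hypothetical layout (a JSON header, an `END_HEADER` marker line, then little-endian float32 values); it is not the HDF or HDF-EOS format, and the function names are invented for this example.

```python
import json
import struct

HEADER_END = b"\nEND_HEADER\n"  # hypothetical marker between header and payload


def write_self_describing(path, values, description):
    """Write a readable JSON header followed by packed float32 values."""
    header = {
        "description": description,
        "encoding": "IEEE 754 float32, little-endian",
        "count": len(values),
    }
    with open(path, "wb") as out:
        # json.dumps escapes non-ASCII by default, so ASCII encoding is safe.
        out.write(json.dumps(header, indent=2).encode("ascii"))
        out.write(HEADER_END)
        out.write(struct.pack(f"<{len(values)}f", *values))


def read_self_describing(path):
    """Return (header, values) using only what the header itself states."""
    with open(path, "rb") as src:
        raw = src.read()
    # Split on the first marker only; the payload may contain any bytes.
    header_text, payload = raw.split(HEADER_END, 1)
    header = json.loads(header_text.decode("ascii"))
    values = struct.unpack(f"<{header['count']}f", payload)
    return header, list(values)
```

Even if the reading software is lost, opening such a file in a text editor shows how many values follow and how they are encoded, which is enough to write a new reader.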