NSSDC Role and OAIS Implementation Brief Overview Don Sawyer

Presentation Transcript


  1. NSSDC Role and OAIS Implementation Brief Overview Don Sawyer

  2. NSSDC Roles • NSSDC is the NASA Office of Space Science (OSS) permanent archive • Astronomy, Solar & Space Plasma Physics, Planetary & Lunar data • Digital and film data spanning 1958-2002 from >1300 instruments flown on >375 spacecraft • Distinguished from OSS Active Archives (AA) • Interacts in a timely manner with all distributed OSS active archives in the space physics, solar physics, astrophysics, and planetary science disciplines to acquire the OSS data and supporting metadata needed for long-term preservation and understanding • Interacts directly with projects when mediated by an active archive • Interacts with PIs and related individuals when they have data needing long-term preservation

  3. OSS Archive Relationships (diagram labels) • Various OSS S/C Projects • Planetary AAs • Astrophysics AAs • Solar AAs • SEC AAs • NSSDC Permanent Archive • Anonymous FTP • DLTs, Tapes, CD/DVDs, Film, Paper • PDS and SEC data on media • OSS Researchers, Non-OSS Researchers • Education Community, General Public

  4. NSSDC Roles (concl’d) • NASA's lead for Consultative Committee for Space Data Systems (CCSDS) Archiving and Data Packaging/Registry Working Groups (on-ground data management) • Led development of CCSDS/ISO Open Archival Information System reference model standard • Comprehensive information base about all launched spacecraft (~6000) • Host of World Data System for Satellite Information • Part of worldwide World Data Center infrastructure established ~1958

  5. NSSDC’s Permanent Archive Environment - Legacy View • ~20 TB in ~2,300 digital data sets on ~40,000 offline media • Most on tape • Most newly arriving media are CDs or DVDs • A "data set" is all data from a given source (e.g., instrument on a spacecraft) at a given "processing level" • Wide range of data characteristics (e.g., documented binaries specific to now-obsolete computers) • Also, ~2,000 data sets on a large number of film media of various form factors • Gradually being digitized into TIFF via scanning

  6. Initial Drivers for OAIS Re-engineering • Needed to solve a migration problem • Remove dependencies of VAX VMS files on the operating system • Include record-defining attributes in a standard form to accompany the data file content • Result was a package of data/metadata • Had software, based on the CCSDS/ISO packaging standard, that could be augmented • OAIS reference model provided an architectural view

  7. Created Archival Information Package • Single file (binary/ASCII content) • Uses CCSDS/ISO packaging (SFDU) to hold multiple data objects • NSSDC-defined attribute object expressed in CCSDS/ISO Parameter Value Language (PVL) • NSSDC data file content in one of four canonical forms • Two flavors each of binary and ASCII • 20-byte SFDU ASCII labels to separate data objects
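
The single-file layout described on this slide, where fixed-size ASCII labels separate the data objects, can be sketched roughly as follows. This is a simplified illustration and not NSSDC's software: the label layout used here (a 12-character identifier followed by an 8-digit decimal byte count) and the identifiers in the test data are assumptions chosen only to fill 20 bytes, and they do not reproduce the field definitions of the CCSDS/ISO SFDU standard. The outer packaging label shown in the AIP structure diagram is also omitted.

```python
LABEL_LEN = 20  # the slide states that labels are 20 bytes of ASCII


def make_label(object_id: str, length: int) -> bytes:
    """Build a 20-byte ASCII label (hypothetical layout: 12-char id + 8-digit length)."""
    if len(object_id) != 12:
        raise ValueError("identifier must be exactly 12 characters")
    if not 0 <= length <= 99_999_999:
        raise ValueError("length does not fit in 8 decimal digits")
    return f"{object_id}{length:08d}".encode("ascii")


def pack(objects):
    """Concatenate (identifier, payload) pairs, each preceded by its label."""
    out = bytearray()
    for object_id, payload in objects:
        out += make_label(object_id, len(payload))
        out += payload
    return bytes(out)


def unpack(package: bytes):
    """Walk the package label by label and return (identifier, payload) pairs."""
    objects = []
    pos = 0
    while pos < len(package):
        label = package[pos:pos + LABEL_LEN].decode("ascii")
        object_id, length = label[:12], int(label[12:])
        start = pos + LABEL_LEN
        objects.append((object_id, package[start:start + length]))
        pos = start + length
    return objects
```

Because each label carries the length of the object that follows, a reader can skip from object to object without understanding the payload format, which is what lets an attribute object and a binary data object share one file.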

  8. NSSDC Attribute Object • Object identification and version • Archival Storage Id (unique) • Collection Id • Checksum over rest of attribute object • Attributes for original data stream • Date/time created, operating system, size in bytes, record format, binary/ASCII flag, file name, checksum, etc. • Attributes for canonical form of data stream • Date/time created, operating system, size in bytes, record format, binary/ASCII flag, file name, checksum, processing report, format identifier (ADID), etc. • Order of applied encodings (e.g., tar, gzip) • Start date/time of data observations
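
As a rough illustration of the attribute object listed above, the sketch below writes a few attributes as `keyword = value;` statements in the general style of PVL and prepends a checksum computed over the rest of the object. Everything specific here is an assumption made for the example: the keyword names (`AO_CHECKSUM`, `SIZE_IN_BYTES`, `DATA_CHECKSUM`), the use of CRC-32 as the checksum, and the helper function names. The slides do not say which checksum algorithm or keyword names NSSDC actually uses.

```python
import zlib


def to_pvl(attributes: dict) -> str:
    """Render a dict as simple 'KEY = value;' statements, quoting strings."""
    lines = []
    for key, value in attributes.items():
        if isinstance(value, str):
            value = f'"{value}"'
        lines.append(f"{key} = {value};")
    return "\n".join(lines) + "\n"


def build_attribute_object(attributes: dict, data: bytes) -> str:
    """Describe `data` and prefix a checksum computed over the rest of the object."""
    body_attrs = dict(attributes)
    body_attrs["SIZE_IN_BYTES"] = len(data)                    # hypothetical keyword
    body_attrs["DATA_CHECKSUM"] = f"{zlib.crc32(data):08X}"    # CRC-32 is an assumption
    body = to_pvl(body_attrs)
    header = to_pvl({"AO_CHECKSUM": f"{zlib.crc32(body.encode('ascii')):08X}"})
    return header + body


def verify_attribute_object(text: str) -> bool:
    """Recompute the checksum over everything after the first statement."""
    header, body = text.split("\n", 1)
    recorded = header.split("=")[1].strip().strip(";").strip('"')
    return recorded == f"{zlib.crc32(body.encode('ascii')):08X}"
```

Keeping the checksum of the attribute object separate from the checksum of the data stream means corruption in the metadata can be detected independently of corruption in the data file it describes.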

  9. NSSDC Permanent Archive - New Direction • Bundle data files (objects) with data_file-descriptive attribute file (object) and pointers to further documentation into OAIS "Archival Information Package (AIP)" • Write to Digital Linear Tape (DLT)-based jukebox in unix environment • Write data files and attribute files to RAID disk for ftp-based access by external customer • AIP Structure (diagram labels): CCSDS/ISO Label for Packaging; CCSDS/ISO Label for Attribute Object; CCSDS/ISO Label for Sensor Data Object; Attribute Object (AO); Label; Label; Label; Sensor Data Object (SDO); Globally Unique Registry Identifiers; Expressed using CCSDS/ISO language; Globally Unique Registry Identifier

  10. “New Direction”

  11. Migrating Data into AIPs • Have created AIPs for data previously on NSSDC's newly retired 12" WORM data dissemination jukebox • VMS-based, so some attributes placed in attribute objects compensate for loss of VMS/Files-11 support • Modified data files in cases of variable-length records, and introduced "CR/LF" for appropriate ASCII data • Now creating multi-data-file AIP and upgrading software to accommodate data migrating from legacy offline tapes • Will start ingest from tape imminently
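
The CR/LF step mentioned on this slide can be pictured with the minimal sketch below. It assumes the variable-length ASCII records have already been read out of the VMS file as separate byte strings; the function names are hypothetical, and the actual NSSDC canonical forms are not specified in these slides.

```python
def records_to_crlf_stream(records):
    """Join ASCII records into one byte stream, ending each record with CR/LF."""
    out = bytearray()
    for record in records:
        text = record.rstrip(b"\r\n")   # drop any terminator already present
        text.decode("ascii")            # raises UnicodeDecodeError if not ASCII
        out += text + b"\r\n"
    return bytes(out)


def crlf_stream_to_records(stream: bytes):
    """Recover the records from a CR/LF-terminated stream."""
    return stream.split(b"\r\n")[:-1] if stream else []
```

Once record boundaries are carried by the CR/LF terminators in the byte stream itself, the file no longer depends on VMS/Files-11 record attributes to be read correctly. A record that itself contains a CR/LF pair would not survive this round trip, so the approach only suits data where that cannot occur.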

  12. Facilitating Archiving via Data Supplier Support • NSSDC has provided software to the IMAGE spacecraft project • Generates attribute objects and bundles these with data files into Archival Information Packages (AIPs) • IMAGE script transmits these to NSSDC • Looking for other opportunities to support NASA spacecraft projects equivalently • Cost-effective data ingest • Diagram labels: IMAGE Science Operations Centre; National Space Science Data Center; Data files; AIPs; ftp; NSSDC Package Generator; Configuration information; IMAGE Script

  13. NSSDC Architecture Summary • System architecture: • Compliant with the OAIS functional model: separates different functions (ingest, archival storage, data management, access) • Compliant with the OAIS information model: defines an Archival Information Package (AIP) for preservation in Archival Storage • Data are being migrated into Archival Information Packages for long-term storage on DLTs • New data arrive as AIPs (e.g., from the IMAGE project) or are put into AIPs during the Ingest process

  14. Current Activities • Developing a better integration of our metadata databases • Many have grown up over the years • Taking advantage of Java and web capabilities • Developing an Archival Information Package type that allows multiple ‘canonical data files’ in a single package file. • Needed for the migration of legacy data on magnetic tape • Needed to put small files together for ease of management • Planning a better overall integration of our architecture • E.g., tighter coupling between AIPs and other information bases

  15. Backups

  16. NSSDC AIP Schematic

  17. NSSDC Archive - Logical Architecture

  18. Archive Challenges • Making the most cost-benefit-favorable judgements on modernization of low-access-potential older data sets • Convert vendor-specific binaries to IEEE binary? Via EAST? Convert to ASCII? • Implement an efficient production process for migrating data from ~10,000 tapes through AIP-creation software to the nearline DLT-based permanent archive • Define post-DLT permanent archive environment • Ensuring existence of all material needed to make data correctly and independently usable • Couple such material to the data being supported

  19. NSSDC Metadata Environment • Information base (JEDS) about • All launched spacecraft, • Instruments on space science spacecraft, • NSSDC-held data sets therefrom. • Underlies "NSSDC Master Catalog" interface. • Information base (DIOnAS) about data files • Written to new nearline permanent archive • Written to anonymous nssdcftp/spacecraft_data/ • Attribute objects with technical information about data files • Information base (JIN) about data media

  20. NSSDC Metadata Environment (concl’d) • Information base (CAOIS) of CCSDS-registered data set-descriptive information (e.g., formats) • Assigns globally unique registry identifiers • Relevant to a growing fraction of NSSDC data plus other data • Array of "data set catalogs" with detailed information on NSSDC-held legacy data sets • Presently on CDs as TIFF and PDF images • Other special-purpose information bases and metadata collections • NSSDC data set IDs are the primary mechanism currently linking these "metadata modules"

  21. NSSDC’s Metadata Challenges • To ensure flow to NSSDC of material needed for the correct and independent use of data along with the flow of data to NSSDC • To optimally integrate metadata modules to support: • Users' finding, retrieval and use of data, • NSSDC staffers' archive management activities • To ensure that all relevant supporting material is visible to and readily retrievable by NSSDC's data-accessing customers.

  22. Software • NSSDC has a growing amount of low-processing-level (lpl) data • Started archiving such data only in the past decade • NSSDC has very little data set-specific READ/PROCESS software • This greatly limits usability of lpl data • Lpl data handled by systems/formats like SDDAS/IDFS and IMAGE_Archive/UDF • Major need for software standards/approaches to accompany lpl data into archives • Ensure long-term usability of such data • Archiving of relevant software source code a minimal requirement