Hub and Spoke (H&S)

Hub and Spoke (H&S) is a repository interoperability architecture with a forward-looking emphasis on preservation metadata and activities. The problem it addresses: a plethora of repositories, not just across institutions but even within a single institution, and an overabundance of data sources.


Presentation Transcript


  1. Hub and Spoke (H&S) Repository Interoperability Architecture with a forward-looking emphasis on preservation metadata and activities

  2. The problem • Plethora of repositories • Not just across institutions, but even within a single institution • Overabundance of data sources • Web crawlers like Heritrix or OCLC's WAW, digitization and scanning services, individual authors, batch ingest from legacy systems • Current integration solutions are local and ad hoc

  3. The solution • A common METS-based profile • A standard programming API • A series of scripts that use the API and METS profile for creating SIPs and DIPs which can be used across different repositories
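As a rough illustration of the "standard programming API" idea, the sketch below models each repository as a spoke that can export an item to a common hub format and ingest an item from it. The slides describe the real implementation as Java; Python is used here only for brevity. Every name (`HubPackage`, `Spoke`, `to_hub`, `from_hub`, `InMemorySpoke`, `transfer`) is hypothetical and not taken from the project, and a plain dictionary stands in for a real repository.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class HubPackage:
    """Hypothetical stand-in for a package that follows the hub METS profile."""
    identifier: str
    descriptive: dict                                 # e.g. {"title": "..."}
    files: dict = field(default_factory=dict)         # file name -> bytes
    provenance: list = field(default_factory=list)    # human-readable events


class Spoke(ABC):
    """Adapter for one repository; both method names are assumptions."""

    @abstractmethod
    def to_hub(self, item_id: str) -> HubPackage:
        """Read a DIP from the repository and express it as a hub package."""

    @abstractmethod
    def from_hub(self, package: HubPackage) -> str:
        """Build a SIP from a hub package, ingest it, and return the new item id."""


class InMemorySpoke(Spoke):
    """Toy repository backed by a dict, standing in for DSpace, Eprints, or Fedora."""

    def __init__(self, name: str):
        self.name = name
        self.items = {}

    def to_hub(self, item_id):
        stored = self.items[item_id]
        return HubPackage(
            identifier=item_id,
            descriptive=dict(stored["metadata"]),
            files=dict(stored["files"]),
            provenance=stored["provenance"] + [f"exported from {self.name}"],
        )

    def from_hub(self, package):
        item_id = f"{self.name}:{len(self.items) + 1}"
        self.items[item_id] = {
            "metadata": dict(package.descriptive),
            "files": dict(package.files),
            "provenance": package.provenance + [f"ingested into {self.name}"],
        }
        return item_id


def transfer(source: Spoke, target: Spoke, item_id: str) -> str:
    """Move one item between repositories by way of the hub format."""
    return target.from_hub(source.to_hub(item_id))
```

With this shape, adding a repository means writing one spoke rather than one converter per repository pair, which is the usual argument for a hub-and-spoke design.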

  4. METS profile • DRAFT: http://dli.grainger.uiuc.edu/echodep/METS/DRAFTS_2006-06-29/METSProfile.xml • Foci • Repository interoperability • minimally at the file and descriptive metadata level, probably not at the structural level • Digital preservation • Web captures • Administrative metadata: technical and provenance • Integrating the PREMIS data model into METS • Priority in preserving the ‘representation’: descriptive metadata, content, and structure
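To make the idea of embedding PREMIS inside METS more concrete, here is a minimal sketch that builds a METS skeleton with a descriptive section, a PREMIS event wrapped in `digiprovMD`, a file section, and a structural map. It does not follow the draft profile linked above, whose requirements are not reproduced in these slides. The PREMIS namespace shown is an assumption (a later PREMIS version than a 2006 profile would have used), and the event is simplified and not schema-valid, since a real PREMIS event also carries an identifier. The function name and the IDs are illustrative.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DC_NS = "http://purl.org/dc/elements/1.1/"
# Assumption: PREMIS 3 namespace; the 2006 draft profile predates it.
PREMIS_NS = "http://www.loc.gov/premis/v3"

for prefix, uri in [("mets", METS_NS), ("xlink", XLINK_NS),
                    ("dc", DC_NS), ("premis", PREMIS_NS)]:
    ET.register_namespace(prefix, uri)


def q(ns, tag):
    """Return a namespace-qualified tag in ElementTree's {uri}tag notation."""
    return "{%s}%s" % (ns, tag)


def build_hub_mets(title, file_url, mime_type, event_type, event_time):
    """Build an illustrative METS tree; IDs and layout are hypothetical."""
    mets = ET.Element(q(METS_NS, "mets"))

    # Descriptive metadata: a single Dublin Core title.
    dmd = ET.SubElement(mets, q(METS_NS, "dmdSec"), {"ID": "dmd1"})
    dmd_wrap = ET.SubElement(dmd, q(METS_NS, "mdWrap"), {"MDTYPE": "DC"})
    dmd_data = ET.SubElement(dmd_wrap, q(METS_NS, "xmlData"))
    ET.SubElement(dmd_data, q(DC_NS, "title")).text = title

    # Administrative metadata: one simplified PREMIS event as digital provenance.
    amd = ET.SubElement(mets, q(METS_NS, "amdSec"), {"ID": "amd1"})
    prov = ET.SubElement(amd, q(METS_NS, "digiprovMD"), {"ID": "prov1"})
    prov_wrap = ET.SubElement(prov, q(METS_NS, "mdWrap"), {"MDTYPE": "PREMIS"})
    prov_data = ET.SubElement(prov_wrap, q(METS_NS, "xmlData"))
    event = ET.SubElement(prov_data, q(PREMIS_NS, "event"))
    ET.SubElement(event, q(PREMIS_NS, "eventType")).text = event_type
    ET.SubElement(event, q(PREMIS_NS, "eventDateTime")).text = event_time

    # File section: a link to the content file rather than the bytes themselves.
    file_sec = ET.SubElement(mets, q(METS_NS, "fileSec"))
    group = ET.SubElement(file_sec, q(METS_NS, "fileGrp"))
    file_el = ET.SubElement(group, q(METS_NS, "file"),
                            {"ID": "file1", "MIMETYPE": mime_type, "ADMID": "prov1"})
    ET.SubElement(file_el, q(METS_NS, "FLocat"),
                  {"LOCTYPE": "URL", q(XLINK_NS, "href"): file_url})

    # Structural map: one division pointing at the file and its description.
    struct_map = ET.SubElement(mets, q(METS_NS, "structMap"))
    div = ET.SubElement(struct_map, q(METS_NS, "div"), {"DMDID": "dmd1"})
    ET.SubElement(div, q(METS_NS, "fptr"), {"FILEID": "file1"})
    return mets
```

Calling `ET.tostring(build_hub_mets(...), encoding="unicode")` serializes the tree for inspection.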

  5. Phased implementation • Phase 1: Interoperability • Phase 2: Persistent Storage Layer • METS Profile as an AIP • JSR-170 content repository standard • End-user Access (search/browse/render) is low priority

  6. Details of current implementation • Based on ingest scripts that were developed to support the repository evaluation • Java except at the outermost layers where native API calls are utilized • We consider this a proof-of-concept implementation; the goal being to demonstrate round-trip interoperability between three repositories: DSpace, Eprints, and Fedora • Currently working with minimal administrative metadata

  7. To-Hub Spoke (diagram) • Input: Data Store / DIPs (image.jpg, metadata.xml) • Spoke: Extract format-specific technical metadata • Generate/collect digital provenance metadata • Embed links to digital items • Model structure of the item • Embed native metadata • Transform/enrich native metadata • Hub: Generate/collect provenance metadata
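The To-Hub spoke steps named on this slide could be sketched as one function that turns a disseminated item into a hub record. This is an illustration under assumptions, not the project's code: the record is a plain dictionary rather than a METS document, technical metadata is reduced to MIME type and size, and the link scheme and field crosswalk are invented for the example.

```python
import mimetypes
from datetime import datetime, timezone

# Hypothetical crosswalk from native field names to Dublin Core style names.
CROSSWALK = {"title": "dc:title", "author": "dc:creator"}


def extract_technical_metadata(name, content):
    """Format-specific technical metadata; here only MIME type and byte size."""
    mime_type, _ = mimetypes.guess_type(name)
    return {
        "name": name,
        "mime_type": mime_type or "application/octet-stream",
        "size": len(content),
    }


def to_hub(item_id, files, native_metadata, repository):
    """Build a hub record from a DIP; `files` maps file names to bytes."""
    names = sorted(files)
    return {
        "id": item_id,
        # Extract format-specific technical metadata.
        "technical": [extract_technical_metadata(n, files[n]) for n in names],
        # Generate/collect digital provenance metadata.
        "provenance": [{
            "event": "export",
            "agent": repository,
            "when": datetime.now(timezone.utc).isoformat(),
        }],
        # Embed links to digital items (hypothetical link scheme).
        "links": [f"{repository}/{item_id}/{n}" for n in names],
        # Model structure of the item (a flat list of files in this sketch).
        "structure": {"label": item_id, "children": names},
        # Embed native metadata unchanged.
        "native": dict(native_metadata),
        # Transform/enrich native metadata through the crosswalk.
        "enriched": {CROSSWALK[k]: v for k, v in native_metadata.items()
                     if k in CROSSWALK},
    }
```

Keeping the native metadata alongside the transformed copy means nothing is lost if the crosswalk has no mapping for a field.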

  8. From-Hub Spoke (diagram) • Hub: Generate provenance metadata • Spoke: Transform hub metadata to repository-compatible metadata • Assemble into packages for repository ingest • Add the METS file as an item in the submission package • Output: SIPs (metadata.xml, hubMets.xml)
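The From-Hub direction can be sketched in the same spirit: transform hub metadata into a repository-specific form, assemble a package, and carry the hub METS file along as an ordinary item so it survives the trip. The zip layout, the `<metadata>`/`<field>` XML shape, and the function names below are assumptions for illustration and do not match the actual ingest formats of DSpace, Eprints, or Fedora. Only the file names `metadata.xml` and `hubMets.xml` come from the slide.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def to_repository_metadata(hub_metadata, crosswalk):
    """Transform hub fields into a hypothetical repository metadata document."""
    root = ET.Element("metadata")
    for hub_field, value in hub_metadata.items():
        target = crosswalk.get(hub_field)
        if target is not None:  # fields without a mapping are skipped
            ET.SubElement(root, "field", {"name": target}).text = value
    return ET.tostring(root, encoding="utf-8")


def build_sip(hub_mets_xml, hub_metadata, crosswalk, files):
    """Assemble a SIP as an in-memory zip and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as sip:
        # Repository-compatible metadata derived from the hub record.
        sip.writestr("metadata.xml", to_repository_metadata(hub_metadata, crosswalk))
        # The hub METS file travels as a regular item in the package.
        sip.writestr("hubMets.xml", hub_mets_xml)
        for name, content in files.items():
            sip.writestr(name, content)
    return buffer.getvalue()
```

Because the hub METS file is stored as an item, a later To-Hub export from the target repository can recover the full hub record even where the repository's own metadata model is lossy.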

  9. Future Work • True repository for Archival Information Packages (AIPs) • Add a persistence layer possibly based on JSR-170 • Assign global, persistent identifiers • Support basic functionality required for interoperability such as the Pathways’ obtain, harvest, and put services • Other preservation functions…
