The OAIS experience at the British Library. Deborah Woodyard, Digital Preservation Coordinator. ERPANET OAIS Training Seminar, 28-29 Nov 2002
OVERVIEW
• Introduction to the British Library
• Why the BL chose to use the OAIS model
• OAIS theory versus implementation
• Terminology
• Metadata
• Issues not covered by OAIS
• Summary of lessons learned about using the OAIS
THE BRITISH LIBRARY
• Deposit library
• Aiming to get deposit legislation for digital materials
• Receiving digital material by voluntary deposit, purchase and digitisation
• Wide variety of types of digital material received
• Require method/system for long-term storage, preservation and access
• Seriously embarked on developing such a system in 2000
• Initial work developed detailed functional specification of a system aligned with OAIS model concepts
WHY OAIS?
• Very little current experience of a system such as this exists
• No ‘off-the-shelf’ systems available
• No other standards
• OAIS model well developed
• Considered to be the guidance for best practice
• Provided excellent high-level framework and convincing back-up argument for political justification for development of such a system
• Provided standard terminology for communication
• A good match for almost the entire system we were planning to build
OAIS THEORY vs SYSTEM IMPLEMENTATION
• High-level standard implies no rules for actual design or implementation
• OAIS sounds like one system but is not necessarily, or even likely to be, one single entity
• No formal method of implementation used
• Analysed business processes and matched to OAIS functions
OAIS TERMINOLOGY
• Useful as a common vocabulary for communicating internally and externally
• Difficult to explain without reading a lot of the document, therefore opaque to those not heavily involved (e.g. OAIS vs OAI)
• Still needed to create another glossary
• Especially useful: SIP, AIP, DIP; Ingest; Content Information = Content Data Object + Representation Information
• Difficulties with: defining an object; naming preservation users
OAIS METADATA TO BL METADATA
• Packaging Information (i.e. how and where the bits are stored)
• Content Information, including Representation Information (i.e. how to interpret the bits into data)
• Preservation Description Information (i.e. how to interpret the data into information), including:
  • Reference Information
  • Context Information
  • Provenance Information
  • Fixity Information
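The breakdown above can be sketched as a small data structure. This is only an illustrative sketch, not the BL's design: the class names, field names and example values are all assumptions introduced here, chosen to mirror the OAIS categories named on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class ContentInformation:
    """Content Information = Content Data Object + Representation Information."""
    content_data_object: bytes          # the bits being preserved
    representation_information: dict    # how to interpret the bits into data


@dataclass
class PreservationDescriptionInformation:
    """The four PDI categories listed on the slide (field names are hypothetical)."""
    reference: dict = field(default_factory=dict)
    context: dict = field(default_factory=dict)
    provenance: list = field(default_factory=list)
    fixity: dict = field(default_factory=dict)


@dataclass
class InformationPackage:
    packaging_information: dict         # how and where the bits are stored
    content: ContentInformation
    pdi: PreservationDescriptionInformation


# Hypothetical example values, for illustration only.
package = InformationPackage(
    packaging_information={"container": "tar", "location": "volume-001"},
    content=ContentInformation(
        content_data_object=b"%PDF-1.4 ...",
        representation_information={"format": "PDF", "version": "1.4"},
    ),
    pdi=PreservationDescriptionInformation(
        reference={"identifier": "example-0001"},
        context={"is_part_of": "example-collection"},
        provenance=[{"event": "ingest", "date": "2002-11-28"}],
        fixity={"algorithm": "sha256"},
    ),
)
```

Keeping the three top-level parts separate means the stored bits, the information needed to interpret them, and the preservation description can each be updated or migrated without touching the others.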
CONTENT INFORMATION
• Representation Information (Content data object description)
  • Technical details of files and resource structure
  • How the resource appears, is installed and runs
  • Documentation
  • Significant properties
• Representation Information (Environment description)
  • Requirements for hardware, peripherals, operating system, application software, input and output, memory requirements and other parameters
  • Documentation on installation, use and location of environment components
PRESERVATION DESCRIPTION INFORMATION
• Reference Information
  • Identifiers & descriptive information
• Context Information
  • Reason for creation, relationships with other resources
• Provenance Information
  • Origin of the resource & changes made due to its life in the archive
• Fixity Information
  • Authentication details
BL METADATA (1/8)
• Agent Group: Agent Identifier; Agent Role
• Personal Agent Group: Personal Agent Name Affix; Personal Agent Family Name; Personal Agent Given Name; Personal Agent Affiliation; Personal Agent Vital Date
• Corporate Agent Group: Corp Agent Name; Corp Agent Place
• Event Agent Group: Event Agent Name; Event Agent Number; Event Agent Location; Event Agent Date
• Other Agent Group: Other Agent Name; Other Agent Description
BL METADATA (2/8)
• Descriptive Items Group: Language; Page Range; Frequency Of Serial; Issue Data; Audience
• Title Group: Primary Title; Title Status; Alternative Title; Sub Title; Series Title; Series Title Number; Article Title; Uniform Title
BL METADATA (3/8)
• Subject Group: LCSH; DDC; Name As Subject; Free Text; Other Subject Vocabularies; BL Collection; BL Classification
• Description Group: Abstract; Table of Contents; Map Scale; Free Text
BL METADATA (4/8)
• Date Group: Date Issued; Date Available; Date Created; Date Archived; Licence Check Date; Date Modified; Date Coverage; Date Valid; Vital Date; Event Date; Other Descriptive Dates; System Dates
BL METADATA (5/8)
• Coverage Group: Temporal Coverage; Spatial Coverage
• Terms Group: Price; Terms Of Availability Statement; Terms Of Availability Reference
• Type and Identifier Group: Resource Type; Object Type; Object Preservation Category; Resource Identifier; System IDs; Descriptive IDs
• Format Group
BL METADATA (6/8)
• Relation Group: Relation Is Version Of; Relation Is Format Of; Relation Is Part Of; Relation Is Component Of; Relation Is Replaced By; Relation Replaces; Relation Requires; Relation External Object; Relation Continues
• History Group: Custody History; Digitisation History; Ingest History; Preservation History; Process Name; Process Description; Process Reason; Process Selection; Process Specification; Critical Hardware; Critical Software; Process Result; Process Agent; Process Date
BL METADATA (7/8)
• Object Part Group: Digital Signature; Digital Signature Name; Operating Environment; Object Part Preservation Status; Viewing Software; Object Part Identifier; Start File; Underlying Abstract Form; Essence of Being
• External Object Group: Source; Relation External Object; Related Information Object; Other
• Original Environment Group: Operating System; Processor Type; Processor Speed; Hard Disc Capacity; RAM; Video Card; Sound Card; CD Speed
BL METADATA (8/8)
• Rights Information Group
• Rights Group: Rights URL; Rights XRML; Rights Statement; Rights Holder
• Licence Group: Licence Type; Licence Fee; Licence Description; Location; Number Of Licences
• System Parameter Group: Licence Key; Extraordinary Requirements; Original Carrier; Copy Counter
ISSUES NOT COVERED BY THE OAIS (1/3)
• Boundary of the system under development:
  • Which materials will be stored in this system?
  • Should descriptive information be stored internally?
  • Should object relationships be stored internally?
  • Should a retrieval manager component be included?
  • Should an exit strategy (high-volume data transfer) be built from day one?
• Changes to metadata:
  • Should changes be allowed without delivery and re-ingest as a new item?
ISSUES NOT COVERED BY THE OAIS (2/3)
• Object deletion:
  • Not included and may be difficult to implement
  • Remove content or only access to content?
• Object identification in a volume:
  • In the case of corruption or requested refreshment, is it necessary to be able to identify the individual object on a volume?
• Independent use of archive volumes:
  • Disaster recovery without the exact same system
ISSUES NOT COVERED BY THE OAIS (3/3)
• Unique identifier:
  • Where should it be generated?
  • What structure should it have?
• How to store licence information:
  • Scan hard copy or data entry?
  • Where should it be stored?
• Data integrity:
  • How often should the data be checked?
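The data integrity question can be made concrete with a minimal fixity-check sketch. Everything here is an assumption for illustration: the use of SHA-256, the function names, and the 90-day interval are not taken from the BL system or from OAIS, which leaves the checking frequency to the implementer.

```python
import hashlib
from datetime import datetime, timedelta


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the stored bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def fixity_ok(data: bytes, recorded_digest: str) -> bool:
    """Compare a freshly computed digest with the one recorded at ingest."""
    return sha256_of(data) == recorded_digest


def check_due(last_checked: datetime, now: datetime, interval: timedelta) -> bool:
    """Decide whether an object is due for another integrity check."""
    return now - last_checked >= interval


# Hypothetical object and policy values, for illustration only.
stored = b"example object bytes"
digest_at_ingest = sha256_of(stored)
check_interval = timedelta(days=90)  # assumed policy, not a recommendation

if check_due(datetime(2002, 1, 1), datetime(2002, 7, 1), check_interval):
    print("intact" if fixity_ok(stored, digest_at_ingest) else "corrupted")
```

The interval is the open policy decision raised on the slide: a shorter interval detects corruption sooner but costs more reads across the archive volumes.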
SUMMARY OF MAIN LESSONS LEARNED
• It’s heavy
• It’s complex
• It doesn’t define your scope
• It’s worth understanding the terminology and concepts
• It is a very valuable tool and the basis of progressing the long-term preservation of digital information