This presentation by Richard Pearce-Moses, Deputy Director for Technology and Information Resources at the Arizona State Library, explores the evolution of archival practices from traditional, paper-based methods to innovative digital solutions. It covers essential aspects of the Open Archival Information System (OAIS), focusing on acquisition, arrangement, metadata management, and preservation strategies. Additionally, it discusses the integration of automated business rules using Microsoft BizTalk to streamline data flow and enhance access to archival information, ensuring the long-term preservation of records.
PeDALS: Persistent Digital Archives & Library System • Richard Pearce-Moses • Deputy Director for Technology & Information Resources • Arizona State Library, Archives and Public Records
Curatorial Rationale • Transformation of traditional, paper-based practices into the digital arena • Traditional functions: Acquisition • Arrangement & description • Housing & storage • Reference and access • Preservation • Open Archival Information System (OAIS) functions: Ingest • Storage • Data management • Preservation • Access
Middleware: Microsoft BizTalk • Automated business rules • Transforming SIPs to AIPs and DIPs • Mapping, generating metadata • Connecting multiple databases (“glue”) • Many OOOs • One repository • Allows communication between systems • Validation
1. OOO Recordkeeping System • For each series of records, the OOO (office of origin) and the repository: • Negotiate metadata you will receive • Negotiate format of the records (TIFF, PDF, XML) • Negotiate format of the submission information package (SIP) • Negotiate frequency and manner of transfer • OOO develops procedures to create SIPs • Metadata, Record • Shipping manifest with hash and file names
Submission Information Packages • OOO Metadata • "Well number","Owner","Title","File name" • "56-000001","CITY OF TUCSON","2003 annual report","56 files\56-000001_0000.pdf" • "56-000001","CITY OF TUCSON","2004 annual report","56 files\56-000001_0000_E52B0.pdf" • "56-000001","CITY OF TUCSON","2005 annual report","56 files\56-000001_0000_E8578.pdf" • "56-000001","CITY OF TUCSON","2006 annual report","56 files\56-000001_0000_EC3F8.pdf" • Records • XML • PDF • Other formats
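The OOO metadata on this slide is quoted, comma-separated text, so it can be read with a standard CSV parser. The following is a minimal sketch, assuming the first row is a header with no stray spaces around the commas; the function name `read_sip_metadata` is hypothetical and is not part of PeDALS, which performs this step in BizTalk.

```python
import csv
import io

# Two rows copied from the slide; the raw string keeps the backslash
# in the Windows-style file path intact.
SAMPLE = r'''"Well number","Owner","Title","File name"
"56-000001","CITY OF TUCSON","2003 annual report","56 files\56-000001_0000.pdf"
"56-000001","CITY OF TUCSON","2004 annual report","56 files\56-000001_0000_E52B0.pdf"
'''


def read_sip_metadata(text):
    """Return one dict per record, keyed by the header row (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


records = read_sip_metadata(SAMPLE)
```

Each resulting dict pairs a record's metadata with the file name of the record it describes, which is the link a repository would need when assembling later packages.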
2. Ingest: Transfer to Drop Box • Transfer to a drop box in DMZ • FTP • Tape • Disk • Isolated for virus scanning • Validation • Were all records received without corruption? • Were any false records received?
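The two validation questions on this slide can be sketched as a comparison between the shipping manifest and the files found in the drop box. This is an illustration only: the slides do not name a hash algorithm, so SHA-256 is an assumption, and `validate_transfer` and the manifest shape (file name mapped to hex digest) are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large records need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_transfer(drop_box, manifest):
    """Compare received files with the manifest (hypothetical helper).

    manifest maps a relative file name to its expected SHA-256 hex digest.
    """
    root = Path(drop_box)
    received = {
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    }
    expected = set(manifest)
    return {
        # Listed in the manifest but never arrived.
        "missing": sorted(expected - received),
        # Arrived, but the content does not match the manifest hash.
        "corrupted": sorted(
            name for name in expected & received
            if sha256_of(root / name) != manifest[name]
        ),
        # Arrived without being listed: possible false records.
        "unexpected": sorted(received - expected),
    }
```

An empty result in all three lists would answer both questions favorably: everything arrived intact and nothing extra was received.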
3. Data Management: Metadata • Generate core metadata • Administrative (6 elements) • Descriptive (28 elements) • Preservation (12 elements) • Stored in “Accessions Register” • MS SQL Server
Administrative Metadata • Information created by repository to track records in the system • Accession Number • Transfer Authority • Acquisition Ingest Identifier • Acquisition Date • Unique Item Identifier • Item Location
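The six administrative elements could be held in a single register table. The sketch below uses Python's built-in `sqlite3` purely as a stand-in for the MS SQL Server accessions register; the table name, column names, and types are assumptions derived from the element names on the slide, not the PeDALS schema.

```python
import sqlite3

# Column names are assumed spellings of the six elements on the slide.
COLUMNS = (
    "accession_number",
    "transfer_authority",
    "acquisition_ingest_identifier",
    "acquisition_date",
    "unique_item_identifier",
    "item_location",
)


def create_register(conn):
    """Create the administrative metadata table if it does not exist."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS administrative_metadata (
               accession_number TEXT NOT NULL,
               transfer_authority TEXT NOT NULL,
               acquisition_ingest_identifier TEXT NOT NULL,
               acquisition_date TEXT NOT NULL,
               unique_item_identifier TEXT PRIMARY KEY,
               item_location TEXT NOT NULL
           )"""
    )


def register_item(conn, item):
    """Insert one item; item is a dict with a value for every column."""
    placeholders = ", ".join(":" + name for name in COLUMNS)
    conn.execute(
        "INSERT INTO administrative_metadata (%s) VALUES (%s)"
        % (", ".join(COLUMNS), placeholders),
        item,
    )
    conn.commit()
```

Making the unique item identifier the primary key means a second attempt to register the same item fails loudly rather than silently duplicating it.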
Discovery Metadata • Information created by OOO or Repository to help retrieve records for a variety of purposes • Office of Origin, Variant name • Source • Series Title, ID • Series Dates • Series Extent • Series Description • Arrangement • Restrictions • Series Subjects, Keywords • Activity • Item Title • Originator ID • Item Extent • Item Date • Item Description • First1024 • Party and Role, Subjects, Location • Item Keywords, Form/Genre • Related Item • Language • Open Date
Preservation Metadata • Information created by Repository to protect integrity and support readability over time • Access Facilitators • Operating System • Access Inhibitors • Hardware • Exceptions • Signature Information • File Description • Fixity • Functionality • Software • Structural Type • Technical Infrastructure
4. Storage • Create AIP • <AIP> <Hash> </Hash> <CoreMetadata> </CoreMetadata> <Metadata> </Metadata> <Record> </Record></AIP> • Deposit in Digital Stacks (LOCKSS) • Generate manifest list to expose to LOCKSS • LOCKSS harvests from manifest server
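The AIP skeleton on this slide can be assembled with a standard XML library. The sketch below follows the four child elements shown (`Hash`, `CoreMetadata`, `Metadata`, `Record`); everything else is an assumption: the slides do not say what the hash covers or which algorithm is used, so it hashes the record text with SHA-256, and `build_aip` is a hypothetical name. Binary records such as PDFs would need an encoding step that is not shown here.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_aip(record_text, core_metadata, ooo_metadata):
    """Assemble an <AIP> element (hypothetical helper).

    core_metadata and ooo_metadata map XML-safe element names to text values.
    """
    aip = ET.Element("AIP")
    # Assumption: the hash is a SHA-256 digest of the record content.
    digest = hashlib.sha256(record_text.encode("utf-8")).hexdigest()
    ET.SubElement(aip, "Hash").text = digest
    core = ET.SubElement(aip, "CoreMetadata")
    for name, value in core_metadata.items():
        ET.SubElement(core, name).text = value
    meta = ET.SubElement(aip, "Metadata")
    for name, value in ooo_metadata.items():
        ET.SubElement(meta, name).text = value
    ET.SubElement(aip, "Record").text = record_text
    return aip


def aip_to_string(aip):
    """Serialize the AIP element to a Unicode XML string."""
    return ET.tostring(aip, encoding="unicode")
```

Keeping the repository's core metadata separate from the metadata supplied by the OOO mirrors the two metadata elements in the slide's skeleton.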
Why LOCKSS? • Benefits • Automatic integrity checking • Automatic error correction • Geographically dispersed copies • Bitstream preservation • Committed community of support • Hardened operating system • Concerns • Maximum number of objects in a Unix file system • Community of support is small
4. Access • DIPs for public access • No administrative, preservation metadata • Formats supported by common browsers • Website • Records not confidential (by law) • SQL query engine with discovery metadata • Limited access website • In repository, selected locations • Record series with personally identifying information
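The rules on this slide can be sketched as two small functions: one that derives a DIP by dropping administrative and preservation metadata, and one that picks the access channel. The dictionary layout, the `has_pii` flag, and both function names are hypothetical; the slides do not describe how PeDALS represents these decisions.

```python
def make_dip(aip):
    """Keep only discovery metadata and the record (hypothetical helper).

    aip is assumed to be a dict with 'administrative', 'discovery',
    'preservation', and 'record' keys.
    """
    return {
        "discovery": dict(aip["discovery"]),
        "record": aip["record"],
    }


def access_channel(series):
    """Route series with personally identifying information to limited access."""
    return "limited" if series.get("has_pii") else "public"
```

Copying the discovery metadata rather than sharing it keeps later edits to the public copy from reaching the archival package.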
5. Preservation • Bitstream preservation • Developing audit procedures • Periodic validation of dark archives against accession register • For future development • Capturing minimum preservation metadata • On-the-fly rendering tools • Long-term format migration
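Periodic validation of the dark archive against the accession register could look like the sketch below. The slides describe these audit procedures as still in development, so this is only one possible shape: the register is assumed to map an item identifier to a recorded SHA-256 digest, the archive is modeled as a mapping from identifier to stored bytes, and `audit_archive` is a hypothetical name.

```python
import hashlib


def audit_archive(register, archive):
    """Report items whose stored copy disagrees with the register.

    register: item id -> recorded SHA-256 hex digest (assumed layout)
    archive:  item id -> stored bytes (stand-in for the dark archive)
    """
    findings = {}
    for item_id, recorded in register.items():
        if item_id not in archive:
            findings[item_id] = "missing"
        elif hashlib.sha256(archive[item_id]).hexdigest() != recorded:
            findings[item_id] = "fixity mismatch"
    # Items held in the archive that the register does not know about.
    for item_id in archive:
        if item_id not in register:
            findings[item_id] = "not in register"
    return findings
```

An empty result means every registered item is present with matching fixity and the archive holds nothing unregistered.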
Community of Shared Practice • Personal Relationships • Challenge of building relationships over the Internet • Lack of rich, immediate feedback in communication • Lack of spontaneity, serendipity, play • Inter-Agency Relationships • Different practices • Laws and regulations • Money
For more information • http://rpm.lib.az.us/PeDALS/ • Principal Investigator • Richard Pearce-Moses • Project Coordinator • Sara Muth • State Partner Leads • Florida: Mark Flynn • New York: Bonnie Weddle • South Carolina: Bill Henry • Wisconsin: Helmut Knies