
“What I Learned This Summer”:  A Week at SAA’s First Electronic Records Summer Camp

Daniel Linke, University Archivist and Curator of Public Policy Papers, December 14, 2007. Geisel Library at UCSD (Photo by Sara Muth). University of California, San Diego, August 6-10, 2007.


Presentation Transcript


  1. “What I Learned This Summer”: A Week at SAA’s First Electronic Records Summer Camp • Daniel Linke, University Archivist and Curator of Public Policy Papers • December 14, 2007

  2. University of California, San Diego • August 6-10, 2007 • Geisel Library at UCSD (Photo by Sara Muth)

  3. Yes, that Geisel (Photo by Sara Muth)

  4. Eleanor Roosevelt Campus (Photo by Sara Muth)

  5. Our accommodations in the Asante dormitory (Photo by Sara Muth)

  6. My suitemates: Peter Johnson, Eric Paquette, and Dylan McDonald (Photo courtesy of Eric Paquette)

  7. 27 attendees from a variety of institutions (government, educational, and private repositories): • UCSD, UC-Irvine, Harvard B. School, U. New Mexico, UT:Arlington, Occidental College, UWI:Madison • AZ, CA, NC, and WA State Archives • CIGNA, National Fire Protection Association, Ford, History Associates • Sacramento Archives and Museum • Marist Brothers of Canada

  8. Terrace of the college commons where we took our meals (Photos by Sara Muth)

  9. Fellow “campers”: Police Explorers Club (Photo by Sara Muth)

  10. Our classroom was within the SDSC (Photo by Eric Paquette)

  11. Our classroom (Photo by Chien-Yi Hou)

  12. Some instructors standing at the back (Photo by Chien-Yi Hou)

  13. SAA Summer School Instructors • Mark Conrad (NARA): Preservation principles • Mike Smorul (U Md): Preservation services • Reagan Moore (SDSC): Data grids • Arcot (Raja) Rajasekar (SDSC): Advanced data grids • Richard Marciano (SDSC): Preservation applications • Chien-Yi Hou (SDSC): Preservation applications

  14. What the week consisted of (in format) (Photo by Chien-Yi Hou)

  15. What the week consisted of in subjects covered • Monday: Electronic Records 101 (Conrad); Components of an Electronic Records Program (Conrad); Infrastructure Independence (Moore); mySRB Tutorial (Moore) • Tuesday: Appraisal and Disposition (Conrad, Marciano, Chien-Yi); Accessioning (Smorul, Marciano, Conrad) • Wednesday: Arrangement (Marciano, Conrad, Moore); Description (Marciano, Rajasekar, Chien-Yi, Moore) • Thursday: Preservation (Moore, Smorul, Chien-Yi); Access (Moore, Marciano) • Friday: Scalability (Moore, Marciano); Getting started (Conrad, Moore)

  16. What are Electronic Records? • Easy to Define • Any Record that Can Only be Accessed With a Computer • Hard to Define • Many Records Don’t Have an Analog Equivalent • Often Difficult to Say Where the “Boundaries” of a Record Are

  17. Where Do They Come From? • Types of applications that can create electronic records • Word processing • Databases • Spreadsheets • Geographic Information Systems • E-mail • Any Computer Application Could Potentially be used to Create Electronic Records

  18. Unique Qualities: Faster than Rabbits • They Multiply! • PERMANENT Federal Electronic Records • 1 to 5% of the Total Produced • Next 15 Years – 350 Petabytes Produced (Peta = 1000 TB) • Beyond the Current State of the Art • Archivists can Identify the Wheat and Chaff • Resource Allocators are Taking Notice

  19. Unique Qualities: Handle With Care • They are Fragile! • Easily Deleted • Keeping the Contextual Information Linked to the Data is Difficult • Without this it is difficult to assert you have authentic records

  20. Unique Qualities: Manipulation • The Good: Organized or Used in Multiple Ways • Records can be more easily used. • Records that would be difficult to use in paper form can be used quite easily in electronic form. • The Not So Good: • Records can be easily changed.

  21. Unique Qualities: Native Habitat vs. Zoo • Original Applications • Run Out of Room • Go Belly Up • Moving the Records Out of Their Native Habitat can be Challenging • Where is the Boundary Between the Records and the Application? • How do You Maintain Essential Characteristics in a Zoo (aka Preservation Environment)? • The Formats Become Obsolete, Too!

  22. COMPONENTS OF AN ELECTRONIC RECORDS PROGRAM Policies and Mandates Technical Infrastructure Social Infrastructure

  23. Technical Infrastructure • Challenge: there are NO proven methods for the long-term retention of electronic records (E/R) in many formats • Ongoing Empirical Research: but theory does not Make it So!

  24. Storage Resource Broker (SRB)

  25. Infrastructure Independence [Diagram labels: Records, Preservation Environment, External World, Evolving Technology] • Preservation environment middleware insulates records from changes in the external world

  26. Infrastructure Independence • Use data grids to preserve records independently of the choice of technology • Management of archives properties • Map technology components to preservation principles • Capabilities that support preservation requirements • Construct preservation environment from components • Archival engineering perspective • Use infrastructure independence to enable use of new technology • View that new technology is an opportunity instead of a challenge

  27. Preservation Standards • Architectural Model • OAIS, Reference Model for an Open Archival Information System • Representation information for each record • Submission / Archival / Dissemination Information Package (SIP / AIP / DIP) • Data grid - Storage Resource Broker (SRB), integrated Rule-Oriented Data System (iRODS) • Digital Library - DSpace, Fedora • Metadata • Dublin Core • LCDRG, NARA Life Cycle Data Requirements Guide • PREMIS, Preservation Metadata Implementation Strategies • Metadata organization • MPEG-21, ISO/IEC TR 21000-1: MPEG-21 Multimedia Framework • METS, Metadata Encoding and Transmission Standard • OAIS, Reference Model for an Open Archival Information System • Submission / Harvesting • Producer Archive Interface (NASA) • OAI-PMH, Open Archives Initiative - Protocol for Metadata Harvesting • Data format • PDF, XML (330 formats retrievable on web crawls) • Assessment criteria • RLG/NARA TRAC - Trustworthy Repositories Audit & Certification: Criteria and Checklist. http://wiki.digitalrepositoryauditandcertification.org/pub/Main/ReferenceInputDocuments/trac.pdf

  28. Using a Data Grid – in Abstract [Diagram labels: Ask for data, Data Grid, Data delivered] • User asks for data from the data grid • The data is found and returned • Where & how details are hidden

  29. Using a Data Grid - Details [Diagram labels: DB, Storage Resource Broker Server, Metadata Catalog, Storage Resource Broker Server, Oracle, ux-brk14, ux-brk12] • User asks for data • Data request goes to SRB Server • Server looks up information in catalog • Catalog tells which SRB server has data • 1st server asks 2nd for data • The data is found and returned
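
The request flow on this slide can be sketched as a toy model. This is only an illustration of the lookup-and-forward idea, not the SRB API: the `Catalog` and `Server` classes, their methods, and the file name are all hypothetical, and the server names are borrowed from the slide's diagram labels.

```python
class Catalog:
    """Hypothetical stand-in for the metadata catalog: logical name -> holding server."""

    def __init__(self):
        self.locations = {}

    def register(self, logical_name, server_name):
        self.locations[logical_name] = server_name

    def locate(self, logical_name):
        return self.locations[logical_name]


class Server:
    """Hypothetical stand-in for a broker server with its own local storage."""

    def __init__(self, name, catalog, peers):
        self.name = name
        self.catalog = catalog
        self.peers = peers          # shared dict: server name -> Server
        self.storage = {}
        peers[name] = self

    def put(self, logical_name, data):
        # Store locally and record the location in the catalog.
        self.storage[logical_name] = data
        self.catalog.register(logical_name, self.name)

    def get(self, logical_name):
        # Ask the catalog which server holds the data.
        holder = self.catalog.locate(logical_name)
        if holder == self.name:
            return self.storage[logical_name]
        # First server asks the second; the user never sees where the data lives.
        return self.peers[holder].get(logical_name)


catalog, peers = Catalog(), {}
first = Server("ux-brk14", catalog, peers)
second = Server("ux-brk12", catalog, peers)
second.put("report.txt", b"hello")
data = first.get("report.txt")   # request goes to the first server only
```

The user talks to one server and uses one logical name; which machine actually holds the bytes is resolved through the catalog.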

  30. For more details, see: Moore, Reagan, “Building Preservation Environments with Data Grid Technology”, American Archivist, vol. 69, no. 1, pp. 139-158, July 2006

  31. Appraisal of ER: Get There Early • Records Need to be Appraised: • Early in Their Lifecycle • Fragile • Ephemeral • In Their Native Habitat • Functionality

  32. Technical Appraisal • For Permanent Records Have to Conduct Technical Appraisal • Feasibility of Preserving the Records • Identify all of the Digital Objects • Essential Characteristics • At Scale!

  33. Bootcamp continued… • Appraise this !@#$ • Disposition In Action… • Arrangement In Action…

  34. Tapping into Archival Knowledge Electronic Records "Summer Camp"

  35. The Website

  36. The Website, cont’d

  37. Formulating Appraisal Rules • Retrieve root webpage ‘http://water.usgs.gov/lookup/getgislist’ • For each entry: • Create a “matching entry” collection on the SRB • Add ‘entry description’ metadata to that collection • Create “Description” subcollection • Load web page • Load all “.gif” | “.jpg” | “.jpeg” files • Load all “.doc” • Load metadata file • Create “ArcINFO” subcollection • Load all “.e00” | “.clr” | “.asc” | “.nit” | “.dlg” | “.txt” files • Create “Shape” subcollection • Load all “.shp” files • Create “SDTS” subcollection • Load all “.sdts” files • Create “Others” subcollection • Load “.tfw” | “.rdb” | “.clr” | “.asc” | “.prj” files • DECOMPRESS & LOAD “.zip” | “.gz” | “.tgz” | “.tar” | “.tar.gz” files
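
The extension-based part of these rules can be sketched as a small routing function. This is a minimal illustration, not the code used in the course: the `classify` function is hypothetical, the web page and metadata file loads are left out because they are not extension-based, and two behaviors are assumptions because the slide does not specify them: the first matching rule wins (".clr" and ".asc" appear under both "ArcINFO" and "Others"), and unmatched files return `None`.

```python
# Subcollection rules transcribed from the slide, in slide order.
RULES = [
    ("Description", (".gif", ".jpg", ".jpeg", ".doc")),
    ("ArcINFO", (".e00", ".clr", ".asc", ".nit", ".dlg", ".txt")),
    ("Shape", (".shp",)),
    ("SDTS", (".sdts",)),
    ("Others", (".tfw", ".rdb", ".clr", ".asc", ".prj")),
]

# Compressed files are unpacked first; their contents would then be classified.
ARCHIVES = (".zip", ".gz", ".tgz", ".tar", ".tar.gz")


def classify(filename):
    """Return the target subcollection, "DECOMPRESS", or None if no rule matches."""
    name = filename.lower()
    if name.endswith(ARCHIVES):
        return "DECOMPRESS"
    for subcollection, extensions in RULES:
        # Assumption: first matching rule wins, so ".clr"/".asc" go to "ArcINFO".
        if name.endswith(extensions):
            return subcollection
    return None
```

For example, `classify("roads.shp")` routes the file to "Shape", while `classify("basin.tar.gz")` flags it for decompression before any further sorting.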

  38. E-FOIA Document Collections: Dept. of State

  39. National Archives and Records Administration Transcontinental Persistent Archive Prototype • Federation of Five Independent Data Grids: NARA I, NARA II, Georgia Tech, U Md, SDSC (each with its own MCAT) • Extensible Environment, can federate with additional research and education sites. • Each data grid uses different vendor products.

  40. ACE – Basic Methodology: Three-tiered Cryptographic Information • IntegrityToken (IT): 1 IT/object, ~1KB • CryptographicSummaryInformation (CSI): 1 CSI/time window, 1 CSI / (n) objects, ~100MB/year • Witness: 1 Witness/week, ~2-3KB/year • [Diagram ratios: k:1, l:1] • Each tier is periodically audited separately according to policies set by managers.
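
The three tiers can be illustrated with plain hashing. This is a simplified sketch, not ACE's actual token format or aggregation scheme: it assumes SHA-256, treats an integrity token as a bare object digest, and builds each higher tier by hashing the concatenated digests of the tier below. The function names and sample objects are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 hex digest; stands in for one integrity token per object (assumption)."""
    return hashlib.sha256(data).hexdigest()


def summarize(digests):
    """Fold many lower-tier digests into one higher-tier value (simplified aggregation)."""
    return digest("".join(digests).encode("ascii"))


def audit(data: bytes, recorded: str) -> bool:
    """Re-hash the object and compare against the recorded token."""
    return digest(data) == recorded


# Tier 1: one token per object.
objects = [b"record one", b"record two", b"record three"]
tokens = [digest(obj) for obj in objects]

# Tier 2: one summary per time window, covering n objects.
csi = summarize(tokens)

# Tier 3: one witness per week, covering that week's summaries.
witness = summarize([csi])
```

Each tier is small relative to the one below it, which is why the slide's storage estimates shrink from per-object tokens to a few kilobytes of witness values per year, and each tier can be re-checked on its own schedule.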

  41. End of the day (Photos by Sara Muth)

  42. Club Asante Photos by Sara Muth (top) and Eric Paquette (right)

  43. Commemorative Corkscrew (Photo by Gary Spurr)

  44. Acknowledgments • Slides with text are from the course instructors’ PowerPoint presentations: Conrad et al. • Photos as credited. (Photo by Eric Paquette)
