Taming the Wild LISTSERV; or, How to Preserve Specialized E-Mail Lists

Taming the Wild LISTSERV; or, How to Preserve Specialized E-Mail Lists. Lisa M. Schmidt lisa.schmidt@matrix.msu.edu http://www.h-net.org/archive/ MATRIX: The Center for Humane Arts, Letters & Social Sciences Online Michigan State University May 23, 2007.

Presentation Transcript


  1. Taming the Wild LISTSERV; or, How to Preserve Specialized E-Mail Lists. Lisa M. Schmidt (lisa.schmidt@matrix.msu.edu), http://www.h-net.org/archive/ MATRIX: The Center for Humane Arts, Letters & Social Sciences Online, Michigan State University. May 23, 2007.

  2. H-Net: Humanities and Social Sciences Online. International consortium of scholars and teachers. Oldest collection of born-digital and content-moderated arts, humanities, and social science material on the Internet. Valuable scholarly resource. More than 180 networks, or e-mail lists. More than 230 “private” lists. More than 1 million e-mail messages.

  3. MATRIX. Digital humanities research center. Devoted to the application of new technologies in humanities and social science teaching and research. Uses Internet technologies to improve education and increase the flow of information.

  4. NHPRC Grant. Conduct assessment of existing H-Net preservation policies and practices. Develop an improved long-term preservation plan. Apply NARA/OCLC TRAC checklist. Useful to those managing large collections of electronic records. Research semantic clustering search techniques.

  5. Preserving E-Mail Lists as Scholarly Resources. How H-Net Works. Current Preservation Practices. Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC). Other E-Mail Preservation Projects. Preservation Improvement Plan.

  6. How H-Net Works: Network Configuration

  7. How H-Net Works: Backup & Security. Daily incremental backups, weekly full backups. Tapes cycle through the system every 6 weeks. Swapped tapes kept in a locked cabinet in a secured room. Tapes replaced as needed. Monthly full, permanent tape backups. Tapes kept in a secured room. Plans to keep a log and move to offsite storage. Server rack kept in a climate-controlled, physically secured room.
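
The schedule on this slide can be sketched as a small function. This is only an illustration: the slide does not say which day the weekly full backup runs or how the tapes are grouped, so `FULL_BACKUP_WEEKDAY`, `rotation_start`, and the idea of a numbered tape set per week are all assumptions made for the example.

```python
from datetime import date
from typing import Tuple

FULL_BACKUP_WEEKDAY = 6  # ASSUMPTION: full backup on Sunday (Monday == 0)
ROTATION_WEEKS = 6       # from the slide: tapes cycle every 6 weeks


def backup_plan(day: date, rotation_start: date) -> Tuple[str, int]:
    """Return (backup kind, tape-set index) for a given day.

    rotation_start is a hypothetical date on which tape set 0 was
    first used; the slide does not describe such a value.
    """
    kind = "full" if day.weekday() == FULL_BACKUP_WEEKDAY else "incremental"
    weeks_elapsed = (day - rotation_start).days // 7
    return kind, weeks_elapsed % ROTATION_WEEKS
```

With a rotation starting on Sunday, May 20, 2007, the Wednesday of that week maps to an incremental backup on tape set 0, and the same tape set comes back into use six weeks later.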

  8. How H-Net Works: Posting Messages. H-Net runs on LISTSERV software. Users must be list subscribers to post. Messages written in plain text. No attachments allowed on public lists.

  9. How H-Net Works: Posting Messages. Message Posting Process

  10. How H-Net Works: Archiving of Lists. Messages kept in flat text files called “notebooks”. Messages post from a few seconds up to several days after approval. Each notebook includes the messages posted during one weekly time period.

  11. How H-Net Works: Archiving of Lists. Example notebook: “h-africa.log0802a”
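
A notebook name like the one on this slide can be split into its parts with a short parser. The sketch below assumes the name follows the usual LISTSERV weekly-notebook pattern of list name, `.log`, two-digit year, two-digit month, and a week letter; the slide itself does not spell out the field order, and `NotebookName` and `parse_notebook_name` are names invented for this example.

```python
import re
from typing import NamedTuple


class NotebookName(NamedTuple):
    list_name: str
    year: int   # two-digit year exactly as written in the file name
    month: int  # 1 to 12
    week: str   # week letter within the month


# ASSUMPTION: <list>.log<YY><MM><week letter>
_NOTEBOOK_RE = re.compile(
    r"^(?P<list>[A-Za-z0-9-]+)\.log(?P<yy>\d{2})(?P<mm>\d{2})(?P<week>[a-z])$"
)


def parse_notebook_name(name: str) -> NotebookName:
    """Split a notebook file name into list, year, month, and week."""
    match = _NOTEBOOK_RE.match(name)
    if match is None:
        raise ValueError(f"not a weekly notebook name: {name!r}")
    month = int(match["mm"])
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {name!r}")
    return NotebookName(match["list"], int(match["yy"]), month, match["week"])
```

Under that assumption, `parse_notebook_name("h-africa.log0802a")` yields the list `h-africa`, year `8`, month `2`, and week `a`.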

  12. How H-Net Works: Archiving of Lists. BRS Database: newest notebook messages parsed and copied every 24 hours; MD5 hashes created for each message; available for full-text search. MySQL Database Cache: log browse cache extracts key metadata, creates MD5 hashes; cache builder script writes metadata to MySQL database cache.
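
The per-message step described here (extract key metadata, create an MD5 hash) might look roughly like the following. It is a minimal sketch, not H-Net's script: the slide does not list which fields count as key metadata, so the choice of `From`, `Date`, and `Subject`, the hashing of the whole raw message, and the `message_record` name are assumptions.

```python
import hashlib
from email import message_from_string
from typing import Dict, Optional


def message_record(raw: str) -> Dict[str, Optional[str]]:
    """Build one metadata row for a plain-text message.

    ASSUMPTION: the digest covers the full raw text (headers and body),
    and From/Date/Subject stand in for the "key metadata".
    """
    parsed = message_from_string(raw)
    return {
        "md5": hashlib.md5(raw.encode("utf-8")).hexdigest(),
        "from": parsed["From"],        # None when the header is absent
        "date": parsed["Date"],
        "subject": parsed["Subject"],
    }
```

A record like this could then be written to a database table keyed on the digest, which is the role the slide gives to the cache builder script.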

  13. How H-Net Works: Archiving of Lists. Message Metadata Stored in MySQL Database

  14. How H-Net Works: Message Retrieval

  15. How H-Net Works: Message Retrieval

  16. How H-Net Works: Message Retrieval

  17. How H-Net Works: Message Retrieval

  18. How H-Net Works: Message Retrieval. Constructed persistent URL: http://h-net.msu.edu/cgi-bin/logbrowse.pl?trx=vx&list=H-Albion&month=0805&week=c&msg=jeSTCR0QAxq28hhgJPZ%2beQ&user=&pw=
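
The URL on this slide can be rebuilt from its query parameters, as sketched below. The parameter names and base address come from the slide. Everything else is an assumption: `build_message_url` and `guess_message_id` are illustrative names, and the idea that the `msg` value is an unpadded base64 MD5 digest is only a guess based on its 22-character length and the deck's later mention of using MD5 to calculate the name.

```python
import base64
import hashlib
from urllib.parse import urlencode

BASE_URL = "http://h-net.msu.edu/cgi-bin/logbrowse.pl"  # from the slide


def guess_message_id(raw: bytes) -> str:
    """ASSUMPTION: message ID = base64 of the MD5 digest, padding removed."""
    digest = hashlib.md5(raw).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def build_message_url(list_name: str, month: str, week: str, msg_id: str) -> str:
    """Assemble the query string in the same parameter order as the slide."""
    params = [
        ("trx", "vx"),
        ("list", list_name),
        ("month", month),
        ("week", week),
        ("msg", msg_id),   # urlencode percent-encodes "+" in the ID
        ("user", ""),
        ("pw", ""),
    ]
    return BASE_URL + "?" + urlencode(params)
```

Calling `build_message_url("H-Albion", "0805", "c", "jeSTCR0QAxq28hhgJPZ+eQ")` reproduces the slide's URL, except that Python writes the escaped plus sign as `%2B` rather than `%2b`; the two spellings are equivalent percent-encodings.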

  19. Current Preservation Practices: Message Ingest, Storage, and Retrieval Processes

  20. Current Preservation Practices. Backup and storage. Significant properties: message content, stored in plain text formats. Authenticity: informal check by author and/or editor on posting; broken URL on message retrieval attempt. Cached metadata fulfills the OAIS Preservation Description Information (PDI) requirement.

  21. Current Preservation Practices: Cached Metadata

  22. TRAC. Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC), published by NARA and OCLC, February 2007. For certification by third party or self-assessment. Three sections: Organizational Infrastructure; Digital Object Management; Technologies, Technical Infrastructure, & Security.

  23. TRAC

  24. Other E-Mail Preservation Projects. Preservation of Electronic Mail Collaboration Initiative (North Carolina State Archives, Kentucky Department for Libraries and Archives, Pennsylvania State Archives; http://www.ah.dcr.state.nc.us/records/EmailPreservation/default.htm). Collaborative Electronic Records Project (Smithsonian Institution/Rockefeller Archive Center; http://siarchives.si.edu/cerp/index.htm). Collection-Based Long-Term Preservation (San Diego Supercomputer Center; http://stinet.dtic.mil/cgi-bin/GetTRDoc?AD=ADA365661&Location=U2&doc=GetTRDoc.pdf). All used XML encoding.

  25. Preservation Improvement Plan: Backup & Storage. Media refreshment schedule for all tapes. Systematic sampling, remounting, reading, and retensioning of permanent tapes. More than one set of backup tapes, or a server mirror. Secure storage systems. Backup log. Participation in a distributed storage system, such as LOCKSS or iRODS.

  26. Preservation Improvement Plan: Authenticity. Shorten and standardize ingest time window to seconds rather than weeks. Define and document access permissions. Maintain an audit log that tracks all activities associated with records. Perform regular authenticity checks using message digests. Consider using SHA-2 for integrity checks.
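
A regular authenticity check with message digests amounts to recomputing each digest and comparing it with the value recorded at ingest. The sketch below uses SHA-256, one member of the SHA-2 family; the in-memory `messages` and `manifest` dictionaries and the function names are assumptions standing in for whatever storage the archive actually uses.

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a message's bytes."""
    return hashlib.sha256(data).hexdigest()


def find_altered(messages: Dict[str, bytes], manifest: Dict[str, str]) -> List[str]:
    """Return the IDs whose content is missing or no longer matches.

    messages maps a message ID to its current bytes; manifest maps the
    same IDs to the digest recorded at ingest (both hypothetical stores).
    """
    altered = []
    for msg_id, recorded in manifest.items():
        current = messages.get(msg_id)
        if current is None or sha256_hex(current) != recorded:
            altered.append(msg_id)
    return sorted(altered)
```

Run on a schedule, a check like this would give the audit log something concrete to record: which messages were verified, and which failed.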

  27. Preservation Improvement Plan. Continue to use MD5 to calculate name. Generate shorter persistent URL for use as citation. Awkward metadata handling: editor data should be added to what’s there, not replace it.

  28. Preservation Improvement Plan: Migration. Messages and notebooks: no migration strategy needed; plain text ASCII and UTF-8 are stable, open formats. Attachments: make private lists browsable by providing constructed URL; display attachment link in browse window; detach attachments from notebook files, store separately, link to original message; provide conversion on demand to current formats.
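
Detaching attachments and linking them back to the original message could be approached as follows. This sketch assumes each message is available as standard MIME bytes, which the slides do not confirm for notebook files, and it uses an MD5 of the raw message as the link back to the original; `detach_attachments` and `sample_message` are names made up for the example.

```python
import hashlib
from email import message_from_bytes
from email.message import EmailMessage
from typing import Dict, List


def detach_attachments(raw: bytes) -> List[Dict[str, object]]:
    """Return one record per attachment, keyed back to the source message."""
    message_md5 = hashlib.md5(raw).hexdigest()  # ASSUMPTION: link key
    detached = []
    for part in message_from_bytes(raw).walk():
        filename = part.get_filename()
        if filename is None:
            continue  # body text or multipart container, not an attachment
        detached.append({
            "message_md5": message_md5,
            "filename": filename,
            "content_type": part.get_content_type(),
            "data": part.get_payload(decode=True) or b"",
        })
    return detached


def sample_message() -> bytes:
    """Build a small message with one attachment, for demonstration only."""
    msg = EmailMessage()
    msg["Subject"] = "Sample"
    msg.set_content("See attached.")
    msg.add_attachment(b"hello", maintype="application",
                       subtype="octet-stream", filename="note.bin")
    return msg.as_bytes()
```

Each returned record carries enough to store the attachment separately and to display a link from the message in a browse window; conversion on demand would operate on the stored `data` and `content_type`.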

  29. Preservation Improvement Plan: From TRAC Checklist. Succession plan. Periodic review or trigger event definition. Technology watch. Document, document, document! Technology history. Change management system. Staff roles, responsibilities, and authorizations. Written recovery plan.

  30. References. H-Net Archives, Documentation, http://www.hnet.org/archive/doc.php ; H-Net: Humanities and Social Sciences Online, http://www.h-net.org ; InterPARES, http://www.interpares.org ; MATRIX: The Center for Humane Arts, Letters, and Social Sciences Online, http://www.matrix.msu.edu ; OAIS Reference Model, http://public.ccsds.org/publications/archive/650x0b1.pdf ; Trustworthy Repositories Audit & Certification: Criteria and Checklist, http://www.crl.edu/PDF/trac.pdf
