
Archiving the Digital Records of the SAA-UT Student Chapter

Megan Dirickson, Kristin Law, Nora Winslow. INF 392K, Spring 2013.



Presentation Transcript


  1. Archiving the Digital Records of the SAA-UT Student Chapter

    Megan Dirickson, Kristin Law, Nora Winslow INF 392K, Spring 2013
  2. Overview: Previous Work; Determining Scope; Gathering & Assessing Records; Appraisal & Arrangement; Creating the DSpace Collections; Privacy; Processing; Descriptive Metadata Spreadsheet; Creation of the SIPs; Batch Ingest; Shell Scripting; Batch Metadata Editing; Twitter; Future Work; Self-Archiving Guidelines
  3. Previous Work: In 2011, Wendy Hagenmaier and Rachel Appel digitized SAA paper records, 221 objects in all, for the Survey of Digitization class. They set up a basic schema in DSpace, which we used as a jumping-off point.
  4. Existing Schema: Community: School of Information Student Organizations. Sub-community: Society of American Archivists UT Chapter. Collections: Administrative Records, Archives Week, Correspondence, Events, Financial Records, Marketing, Meeting Minutes, Website.
  5. Our Goal: Archive all the existing born-digital records, especially those from the past year. More importantly, set up a self-archiving workflow that lets future SAA members easily archive their own records in DSpace.
  6. First Things First: We wanted to gain intellectual control over the materials. We asked: “What exists and where is it? What should be included for the future?” We drew on Megan and Kristin’s expertise as previous officers and on Rachel and Wendy’s previous documentation.
  7. Actually Getting the Records: We asked previous SAA board members to send us anything they had. We also gleaned materials from the SAA’s two websites: the general website and the Archives Week website.
  8. Type of Records: Images, documents, recordings, presentations, and spreadsheets. Files that made up the websites, mostly HTML and CSS. Twitter and Facebook accounts. Listserv emails.
  9. Type of Records
  10. Type of Records
  11. SAA Websites
  12. Narrowing the Scope: Over 600 discrete files. We experimented with archiving Twitter and Facebook, with mixed results, and looked into previous attempts to archive listserv emails. Facebook and the emails proved too complicated and time-consuming for the scope of this project.
  13. Appraisal: Appraisal mostly consisted of weeding out duplicates, of which there were many. Kristin managed the files sent to us by previous members. Megan gleaned the general SAA website. Nora worked with the Archives Week website. Over 900 files.
  14. Appraisal SAA Website Structure
  15. Restructuring DSpace: A large number of files, spanning more than 10 years. We wanted to maintain the arrangement, but the existing structure was too restrictive. We moved everything up a level so that we could create a collection for each year.
  16. New Structure: Community: School of Information Student Organizations. Sub-community: Society of American Archivists: UT Student Chapter. Sub-sub-communities: Administrative Records, Archives Week, Correspondence, Events, Financial Records, Marketing, Meeting Minutes, Website and Social Media. Collections: one per calendar year.
  17. New Structure
  18. Privacy: All Financial Records collections have been set to be private. These collections contain budgets, potential account information, and information about donations to Archives Week that the donors may wish to keep private. All financial documents from Archives Week planning have intentionally been included with Financial Records in order to keep the Archives Week collections open to the public. The most current years (2010-2013) of Administrative Records are currently closed. Sensitive documents in these collections include membership rosters (with emails) and mentorship program information. EIDs have been redacted from the 2010-2011 membership rosters.
  19. Tips to ensure privacy: EIDs are not to be kept in the digital archive, and documents should be reviewed to be sure that they are not included. Other sensitive information may be included in the archive, but kept in a private collection. All sensitive documents have been placed only in Financial Records and Administrative Records, allowing the remaining collections to be open. Titles of private items will be viewable to the public, but the contents of the items will not be. It is up to the discretion of the future board to determine when the closed collections may be made publicly available. The Treasurer is responsible for reviewing current and previously deposited records for privacy issues, as the Treasurer will be most cognizant of sensitive information contained in financial and membership records.
  20. Processing (metadata gathering): We kept an archival copy of the records safe on a flash drive. We made other ‘processing’ copies for determining content and gathering metadata. We created a spreadsheet for entering descriptive metadata. This is also when we determined the intellectual arrangement of the records and spotted duplicates.
  21. Creation of SIPs: (1) Create an extracted-metadata XML file using the National Library of New Zealand’s Metadata Extractor. (2) Run a Perl script to create dublin_core-formatted XML from the extracted XML and to create a new directory for each item. (3) Manually add the original bitstream to each directory. (4) Run a Perl script to create the ‘contents’ text file. (5) Run a Perl script to change the directory names to item_001, item_002, etc. This had to be done separately for each collection (about 30 collections).
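
The per-item package described on this slide (a directory holding a dublin_core XML file, a ‘contents’ text file, and the original bitstream) can be sketched in a few lines of shell. This is only an illustration of the layout, not the Perl scripts the authors used: the function name `make_sip`, its arguments, and the single title field are all assumptions made for the example.

```shell
#!/bin/sh
# make_sip SIP_ROOT ITEM_NUMBER SOURCE_FILE TITLE
# Hypothetical helper: builds one item directory (item_001, item_002, ...)
# containing the bitstream, a dublin_core.xml, and a 'contents' file.
# ITEM_NUMBER must be a plain decimal integer such as 1 or 12.
make_sip() (
    sip_root=$1
    number=$2
    source_file=$3
    title=$4

    item_dir=$(printf '%s/item_%03d' "$sip_root" "$number")
    mkdir -p "$item_dir" || exit 1

    # Copy the original bitstream into the item directory.
    cp "$source_file" "$item_dir/" || exit 1

    # Escape the characters that would break the XML.
    safe_title=$(printf '%s' "$title" |
        sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g')

    # Minimal Dublin Core record with only a title (an assumption;
    # real records would carry more fields).
    cat > "$item_dir/dublin_core.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core>
  <dcvalue element="title" qualifier="none">$safe_title</dcvalue>
</dublin_core>
EOF

    # The 'contents' file lists one bitstream filename per line.
    basename "$source_file" > "$item_dir/contents"
)
```

Calling `make_sip ./sips 1 ./minutes.txt "Meeting minutes"` would produce `./sips/item_001/` with the three files in it; the paths and title here are made up for the example.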
  22. Batch Ingest: We staged the SIPs on Vauxhall in a structure mirroring the DSpace structure and wrote the batch ingest command lines before meeting with Sam. Change in command line: /opt/dspace/bin/dspace import org.dspace.itemimport.ItemImport --add --eperson=msdirickson@gmail.com --collection=2081/29160 -- Problems with the dublin_core files: junk!
  23. Shell Scripting: Since we had so many collections, we bundled the command lines into shell scripts. The idea was to save time... but the script didn’t leave time to check for errors before moving on to the next collection. Added: echo sleep 5
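
One way to address the problem noted on this slide is to have the script check each import's exit status and stop at the first failure, in addition to pausing between collections. The sketch below assumes a list of "collection handle, SIP directory" pairs on standard input. `import_one` is a hypothetical placeholder for the actual ingest command line, which is not reproduced here, and `PAUSE_SECONDS` and `run_batch` are names invented for the example.

```shell
#!/bin/sh
# Seconds to wait between collections (5 matches the pause on the slide).
PAUSE_SECONDS=${PAUSE_SECONDS:-5}

# import_one HANDLE SIP_DIR
# Hypothetical placeholder: replace the body with the real ingest command.
import_one() {
    echo "would import $2 into collection $1"
}

# run_batch: reads "HANDLE SIP_DIR" pairs from stdin, one per line.
# Stops at the first failed import so errors are not scrolled past.
run_batch() {
    while read -r handle sip_dir; do
        # Skip blank lines in the list.
        [ -n "$handle" ] || continue
        echo "=== $handle <- $sip_dir ==="
        # </dev/null keeps the import from consuming the rest of the list.
        if ! import_one "$handle" "$sip_dir" </dev/null; then
            echo "import failed for $handle; stopping" >&2
            return 1
        fi
        # Leave time to read the output before the next collection.
        sleep "$PAUSE_SECONDS"
    done
}
```

With the pairs saved in a text file, `run_batch < collections.txt` (a hypothetical filename) would process them in order and return a non-zero status as soon as one import fails.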
  24. Batch Metadata Editing: We exported metadata from each sub-community, with the fields id, collection, dc.contributor.author, dc.date.created, dc.date.issued, dc.identifier.uri, dc.language.iso, dc.publisher, dc.subject, and dc.title. We merged each export with our descriptive metadata files by matching on id numbers, adding or changing Dublin Core fields and data: id; collection; dc.contributor.author (SAA-UT); dc.date.created (changed from the ingest date to the date of creation or use of the document); dc.date.issued; dc.identifier.uri; dc.language.iso; dc.publisher; dc.subject; dc.title.alternative (moved filename here); dc.contributor (if an individual author was known); dc.title (changed from filename to descriptive title); dc.coverage.spatial; dc.description.
  25. Batch Metadata Editing: Once the spreadsheets were completely edited, we saved them as CSVs and met with Sam again to import the metadata. Each sub-community had to be imported individually (much faster than each collection!). Weird things happened with the ingest date… Command line: /opt/dspace/bin/dspace metadata-import -f /opt/batch_ingests/2081-29125.csv
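
Because the merge depends on matching id numbers, a quick check before merging can flag descriptive rows whose id does not appear in the exported file. The sketch below is a hedged illustration, not the authors' procedure. It assumes both files are CSVs with a header row, the id in the first column, and no line breaks inside quoted fields. The function names `ids_of` and `unmatched_ids` are invented for the example.

```shell
#!/bin/sh
# ids_of CSV_FILE
# Prints the sorted, unique values of the first column, skipping the header.
# Assumes the id column comes first and no field contains a line break.
ids_of() {
    tail -n +2 "$1" | cut -d, -f1 | tr -d '"\r' | sed '/^$/d' | sort -u
}

# unmatched_ids EXPORT_CSV DESCRIPTIVE_CSV
# Prints ids present in the descriptive file but missing from the export.
unmatched_ids() (
    export_ids=$(mktemp) || exit 1
    descriptive_ids=$(mktemp) || exit 1
    ids_of "$1" > "$export_ids"
    ids_of "$2" > "$descriptive_ids"
    # comm -13 keeps only the lines unique to the second file.
    comm -13 "$export_ids" "$descriptive_ids"
    rm -f "$export_ids" "$descriptive_ids"
)
```

An empty result from `unmatched_ids export.csv descriptive.csv` (hypothetical filenames) suggests every descriptive row has a counterpart in the export; any ids printed are worth checking by hand before the merge.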
  26. Batch Metadata Editing Yay, Metadata!!!
  27. Social Media: Twitter provides a simple means for downloading tweets. We felt that the tweets, especially from 2012, were valuable records: the Archives Week lectures were live-tweeted, providing rich documentation of the events. The DSpace bundle includes a zip file containing a CSV of tweets (with time/date stamps) and a screenshot for added visual context.
  28. Future Work: Follow the workflow and continue archiving records! Website (too complicated for a simple ingest). Listserv emails. Facebook. Continued digitization.
  29. Self-Archiving Guidelines/Workflow: Naming conventions and standards. Roles and responsibilities. Basic workflow for importing items individually to DSpace, including adding descriptive metadata. Security/access and privacy issues. Community and collection structure; arrangement guidelines for consistency. Appraisal/selection policies and record priorities.