The Alliance for Data Archive Technologies: Looking towards a Common Future
The Alliance for Data Archive Technologies: Looking towards a Common Future • Myron Gutmann, ICPSR • Ben Evans, ASSDA • Deborah Mitchell, ASSDA • Kevin Schürer, UK Data Archive
Overview • Why? • What? • Why Now? • Early Steps • Understanding Process • Understanding Needs • Next Steps
Why? • Data curation has been an ad hoc process, with local practices & expertise • Since the 1990s: • Enormous investment in technology • Significant successes in social science (SDA, Nesstar, DVN, IPUMS, even ICPSR) • Major new ways to find & use content (Google) & architectures to deliver content (web services)
More Why • Proprietary systems unsustainable • Market too small for commercial systems • Partnerships will help avoid unnecessary duplication of effort & assure efficiency • Need to be truly global
What? • New organization to support technologies for curation, preservation, & delivery that are: • Open • Community-developed • Standards-based • Built on existing networks of social science data archives & technology centers, and … • Open to all who want to contribute
Why Now? Three Standards • DDI – Metadata Standard • OAIS – Preservation Reference Model • Repository Architecture Standards: Fedora, DSpace & DuraSpace • Organizational models like the DDI Alliance, CESSDA, Data-PASS (even the new HathiTrust)
Why Now? Community Tech • Community-developed software has become widely used • Examples: Drupal/Plone • Examples: Fedora • Examples: Solr/Lucene • But we shouldn’t ignore all the challenges that this software has faced
Why Now? Workflows • Improved workflow technologies are operating in many of our institutions • Some are shared in CESSDA & Data-PASS • And in other communities: Virtual Observatory • Another challenge: sharing workflow technology is not the same as sharing business practices in complex organizations
Why Now? Progress So Far • SDA • Nesstar • DVN • All used in more than one archive • Not all open-source • Potential shared technologies that we can leverage in the future
1st Steps: October 2008 Meeting • ICPSR • ASSDA • UKDA • Roper Center – UConn • Odum Inst. – N. Carolina • Harvard – IQSS • Minnesota Pop. Center • Berkeley – SDA • DANS – Netherlands • DDA Denmark • GESIS – ZA • South Africa • DDI Alliance • IASSIST • Library of Congress • U.S. NSF • U.S. NIH • Canadian SSHRC • Thanks to the Library of Congress for hosting
1st Steps: After October 2008 • Solicit needs in the form of wish lists • Authorize creation of an organization at an appropriate time • Work on raising money and finding common ground for future work
ICPSR: Standards Compliance
OAIS Workflow • Ingest tools • AIP Creation-Validation • SIP Creation-Validation • DIP Creation-Validation • Audit tools
DDI Workflow • Tools for full variable-level metadata creation not dependent on proprietary software (such as SPSS) • DDI Editor • DDI Converter • DDI 2 to 3 translator
Needs: Wish Lists from … • ICPSR • UKDA • ASSDA • Harvard • Roper Center • Odum Institute • DANS (Netherlands) • DDA (Denmark) • GESIS (Germany) • NSD (Norway) • Minnesota Pop. Center
Needs: A Catalog
Administration • Identity management • OAIS workflow & audit (SIP/AIP/DIP)
Access • Data format conversion • Setup file creation • International data sharing • Community data/User comments/Web 2.0 • Search • Confidentiality • Persistent identifiers • Visualization • Data citation • Semantic data access • Security
Production Data Management • Data producer tools • Open metadata curation • Data format curation • Data management & analysis • Qualitative data management • Data integration • Metadata registries • Survey question management • Data citation
Ingest • Open metadata curation • Confidentiality • Software/algorithm archiving
Archival Storage • Storage fabric/architecture (FEDORA or ?) • Replication (LOCKSS) • Persistent identifiers • Content model development
Next Steps: Canberra Meeting • Prime Goal: Strategic Planning • What’s the business model? • What are the links to… • Standards? • Security? • Archiving practice & workflows? • Training & Research? • How do we measure success?
Three Major Outcomes • Goal 1: A few critical decisions • Standards, repository framework, software approaches • Goal 2: Initial Common Interests. Examples: • Fedora data/content models • Open source metadata tools (DDI 3?) • Goal 3: How do we collaborate?
Thank you! gutmann@umich.edu deborah.mitchell@anu.edu.au schurer@essex.ac.uk ben.evans@anu.edu.au