Why is digital preservation so difficult?

Explore the difficulties of preserving digital information, from unraveling complex levels of data to understanding the meaning of bits. Discover the importance of representation information and the need for a designated community to ensure the long-term accessibility of digital objects.

jgraybill

Presentation Transcript


  1. Why is digital preservation so difficult? September 15/16 2009 Rome

  2. What’s so special about things digital? • 1’s and 0’s difficult to see

  3. Pits on CD Closeup (actual photomicrograph) of stamped CD-ROM, data pits clearly visible Picture from: http://www.flickr.com/photos/eaj836/2559025266/

  4. CD • Need to unravel the various levels of: • Bit stuffing • Error correction codes • Logical addressing • Fragmented files • File systems

  5. Alternatives • Carve 1’s and 0’s in stone • Write very small characters into titanium sheets But • What do the bits mean?

  6. Digital objects

  7. Great Wall of China

  8. Components of an Interactive Multimedia Performance (IMP) • People: directors, performers, technicians, programmers, etc. • Documents: e.g. performance plan and procedure, music scores, documentation about performance context, programmes • Musical instruments: traditional (e.g. violin, cello, piano), augmented, … • Particular focus on 3D motion of the performer • Mapping and content generation application (e.g. Max/MSP patches) • Output multimedia contents (e.g. video, graphics, sound) • Supporting applications (e.g. multimedia applications for processing and rendering) • Operating system • Hardware used in the performance: computer, camera, mixer, speakers, … [Diagram labels: Input 3D motion data; Analysis & Processing; Mapping Parameters; Mapping GUI; Multimedia Generation; Multimedia output]

  9. SEAWIFS IMAGE

  10. Just Format? • representation information rules • sfqsftfoubujpo jogpsnbujpo svmft • You have a file: JHOVE tells you it is WORD version 7 • Format Registries – useful but not enough: formats can be used for multiple purposes, e.g. audio files used to store configuration parameters
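The scrambled string on slide 10 appears to be the phrase "representation information rules" with every letter moved one place forward in the alphabet. That reading is an inference from the text, not something the slide states. A minimal sketch of the point, assuming that encoding: the characters are perfectly preserved, yet they stay meaningless until someone supplies the rule for interpreting them. The function name `shift_letters` is made up for this illustration.

```python
def shift_letters(text: str, offset: int) -> str:
    """Shift each ASCII letter by `offset` places, wrapping within the alphabet."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("a") if ch.islower() else ord("A")
            out.append(chr((ord(ch) - base + offset) % 26 + base))
        else:
            # Spaces and punctuation are left untouched.
            out.append(ch)
    return "".join(out)


scrambled = "sfqsftfoubujpo jogpsnbujpo svmft"

# The "representation information" here is the rule "shift back by one".
decoded = shift_letters(scrambled, -1)
print(decoded)  # prints "representation information rules"
```

Knowing only that the file is "text" (its format) would not be enough to read it; the shift rule is the extra piece of information that has to be kept alongside the bits.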

  11. Data… Level 2 GOME Satellite instrument data

  12. Example: Identification of an Attribution Right [Diagram: ontology graph. Classes: LF1. Written_Norm; CR. Activity_Type; CR51. Attribution_Right; CR20. Perform; E7. Activity; E39. Actor; F28. Expression_Creation; E30. Right; E72. Legal Object; CR.Ownership Right; F22. Self_contained_Expression. Instances: Art. X of Law Y; To claim authorship; Kia claiming authorship; Kia Ng; Activity of Improvisation on the Violin; Kia’s right to claim authorship; Expression of the Improvisation on the Violin. Properties: is_documented_in; generates; allows; has_type; performed_by; carried_out; has_right_type; created; is_on; became_owner_of. Annotations: Legislation; Singleton; Work’s Provenance; Derived Property Rights; 100% precision; 100% recall, <100% precision] Thanks to MetaWare • FRBRoo • Rights Ontology • CIDOC-CRM

  13. Threats to digital objects

  14. What is needed? MONEY

  15. Disincentives for curation: cost • Future generations do NOT: • - Vote • - Pay taxes • [Chart: Money against Time, labelled “Budget available” and “If cost of preserving old information increases…”] • Need to show that costs will be contained

  16. Digital Preservation… • Easy to do… • …as long as you can provide money forever • Easy to test claims about repositories… • …as long as you live a long time • Reference Model for Open Archival Information System (OAIS) provides an approach • ISO 14721 and also free from http://public.ccsds.org/publications/archive/650x0b1.pdf

  17. Data – OAIS view A reinterpretable representation of information in a formalized manner suitable for communication, interpretation, or processing • In 2006, the amount of digital information created, captured, and replicated worldwide was 161 exabytes (161 billion gigabytes) - roughly 3 million times the information in all the books ever written! • Between 2006 and 2010, the information added annually to the digital universe will increase more than sixfold, from 161 exabytes to 988 exabytes IDC (2007) The Expanding Digital Universe
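The two figures quoted from the IDC report on slide 17 can be checked with a few lines of arithmetic. This is only a sanity check of the numbers on the slide, assuming decimal units (1 exabyte = 10^18 bytes, 1 gigabyte = 10^9 bytes); the variable names are chosen here for illustration.

```python
# Decimal (SI) units are assumed: 1 EB = 10**18 bytes, 1 GB = 10**9 bytes.
GB_PER_EB = 10**18 // 10**9

created_2006_eb = 161      # exabytes created in 2006, as quoted on the slide
projected_2010_eb = 988    # exabytes projected for 2010, as quoted on the slide

gigabytes_2006 = created_2006_eb * GB_PER_EB
growth_factor = projected_2010_eb / created_2006_eb

print(f"{gigabytes_2006:,} GB")      # 161 billion gigabytes
print(f"{growth_factor:.2f}x growth")  # a little more than sixfold
```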

  18. Key OAIS Concepts • Claiming “This is being preserved” is untestable • Essentially meaningless • Except “BIT PRESERVATION” • How can we make it testable? • Claim to be able to continue to “do something” with it • Understand/use • Need Representation Information • Still meaningless… • Things are too interrelated • Representation Information potentially unlimited • Designated Community • Many other concepts identified • Finer grained taxonomy than simply saying “metadata” • Allows one to ask if one has all the required types
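Slide 18 singles out bit preservation as the one preservation claim that is testable. One common way to test it, offered here as an illustrative sketch and not as something the talk prescribes, is a fixity check: record a cryptographic digest when the object is ingested and recompute it later. The helper name `fixity` and the sample bytes are invented for the example.

```python
import hashlib


def fixity(data: bytes) -> str:
    """Return the SHA-256 digest of a bit sequence as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical archived content; in practice this would be read from storage.
original = b"\x01\x00\x01\x01 some archived bits"
recorded = fixity(original)  # stored at ingest time

# Later audit: unchanged bits reproduce the recorded digest.
intact = fixity(original) == recorded

# Flip a single bit in the first byte to simulate silent corruption.
damaged = bytes([original[0] ^ 0x01]) + original[1:]
corruption_detected = fixity(damaged) != recorded

print(intact, corruption_detected)
```

A passing check says only that the bits are unchanged. It says nothing about whether anyone can still understand or use them, which is the gap the remaining bullets on the slide address.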

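  19. The Information Model is key [Diagram: an Information Object consists of a Data Object interpreted using one or more items of Representation Information; a Data Object is either a Physical Object or a Digital Object; a Digital Object consists of one or more Bit Sequences; Representation Information is itself interpreted using further Representation Information] Recursion ends at KNOWLEDGEBASE of the DESIGNATED COMMUNITY (this knowledge will change over time and region)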
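The recursion on slide 19 can be sketched as a small graph walk: each piece of Representation Information may itself need further Representation Information, and the walk stops wherever the Designated Community's knowledge base already covers an item. Everything below is a hypothetical illustration. The class `InfoObject`, the function `required_rep_info`, and the example dependency chain are assumptions made for this sketch and are not taken from OAIS or from the talk.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class InfoObject:
    """A named object plus the Representation Information needed to interpret it."""
    name: str
    rep_info: List["InfoObject"] = field(default_factory=list)


def required_rep_info(obj: InfoObject, knowledge_base: Set[str],
                      seen: Optional[Set[str]] = None) -> Set[str]:
    """Collect the Representation Information an archive must hold for `obj`.

    Items already in the community's knowledge base end the recursion.
    """
    seen = set() if seen is None else seen
    for item in obj.rep_info:
        if item.name in knowledge_base or item.name in seen:
            continue
        seen.add(item.name)
        required_rep_info(item, knowledge_base, seen)
    return seen


# Hypothetical dependency chain, invented for illustration only.
unicode_spec = InfoObject("Unicode specification")
xml_spec = InfoObject("XML specification", [unicode_spec])
dictionary = InfoObject("data dictionary", [xml_spec])
pdf_spec = InfoObject("PDF specification")
format_standard = InfoObject("format standard", [pdf_spec])
data_file = InfoObject("data file", [format_standard, dictionary])

# A community that already understands PDF and XML needs less from the archive.
today = required_rep_info(data_file, {"PDF specification", "XML specification"})
print(sorted(today))

# If that knowledge is lost over time, the required set grows.
later = required_rep_info(data_file, set())
print(sorted(later))
```

Running the walk again with a smaller knowledge base mirrors the slide's caveat that the community's knowledge changes over time and region, so the set of Representation Information an archive must hold is not fixed.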

  20. [Diagram: Representation Information network for a FITS file. Nodes: FITS FILE; FITS DICTIONARY; FITS STANDARD; DICTIONARY SPECIFICATION; PDF STANDARD; FITS JAVA s/w; XML SPECIFICATION; PDF s/w; UNICODE SPECIFICATION; JAVA VM]

  21. Rep • Info • Virtualisation /DISCIPLINE

  22. Things change/disappear How can we ensure that the information trapped in the “bits” remains understandable despite all these changes? • Software • Hardware • Environment • E.g. Network links to related information • People • What is “common knowledge” How can a digital curator even be aware of these changes?

  23. Digital Preservation • Need to preserve information & knowledge – not just “the bits” • Documents, videos are rendered – simple? • Data – must be processed – in new ways - harder • Need to manage knowledge to keep archives alive through time • Preservation is a process, not a one-time event • Preservation is expensive – costs need to be shared • The alternative is money – endless supplies of money • Open Archival Information Systems Reference Model (ISO 14721) provides a general conceptual framework and terminology • (http://public.ccsds.org/publications/archive/650x0b1.pdf) • OPEN process – not just “Open Archives” Need more than just formats

  24. Survey and preliminary results • PARSE.Insight plus Alliance for Permanent Access • Plus customised surveys with CASPAR, DCC etc • Targets • Researchers • Plus case studies in HEP, (Earth Observation and Social Sciences) • Publishers • Funders • Data managers • Almost 3000 responses so far 1) Creation and use of digital research data 2) Data Re-use 3) Data Preservation 4) Publishing Your Work 5) Final questions http://www.parse-insight.eu/

  25. Threats to preservation/re-use (early incomplete results)
