LDS3. David Tarrant @davetaz davetaz@ecs.soton.ac.uk Open Planets Foundation / University of Southampton. Applying Preservation Principals to Linked Data Systems. iPres2012 Toronto, October 2012. Present Day. Presenting the REF: The Results Evaluation Framework.
LDS3 David Tarrant @davetaz davetaz@ecs.soton.ac.uk Open Planets Foundation / University of Southampton Applying Preservation Principals to Linked Data Systems iPres2012 Toronto, October 2012 This work was partially supported by the SCAPE Project. The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137).
Presenting the REF: The Results Evaluation Framework • 5 Tools (Droid, Fits, file, fido, Tika) • 65 Versions (from 2008 to now) • 1 Govdocs Corpus • 1 Question…
How accurate are file format identification tools historically?
Why is Data Important? • Data and Metadata are knowledge. • Knowledge is power. • Knowledge enables decision. • Knowledge enables process. • Knowledge empowers action. • Knowledge enables us to say because…
Processes • A Classic Flow Chart [diagram: Process and Decision steps, each fed by DATA] • Data is key to making decisions
Policy • A Preservation Flow Chart [diagram: Process and Policy steps, each fed by DATA] • Data is key to informing policy
Policy • Data - Generated • When? • Who? • What it affects? • What action is taken? • Why?
Why? • Because something said so? • DATA: When? • Who? • What it affects? • What action is taken? • Why?
Case Study Example (Opinion) • Due to format obsolescence, all Flash video files are to be migrated to H264/AAC. • Input data: Study on the proliferation of Flash and evidence of lacking support from the rights holder, Adobe. • File B was created from File A a year ago, as File A was identified as being a Flash video file. • Today, File A is identified as being an Ogg video file. • What has changed? Why? Does it affect me? Who generated the wrong information? Did they generate any other wrong information?
A Fact? File#1 hasIdentification application/zip
Provenance • Tarrant, David and Carr, Leslie (2012) LDS3: Applying Digital Preservation Principals to Linked Data Systems. In: Ninth International Conference on Digital Preservation (iPres2012), Toronto, Canada • Tim Berners-Lee Provides 5-Star Linked Data Guide
Data!!! • One fact. • One document the fact comes from. • One citation about the document's place of publication. • Who, What, When and Where. • Who they worked for and with.
In Linked Data a document is called a named-graph. • But these also get used for two purposes! [diagram: a Named-Graph containing File#1 hasIdentification application/zip]
The two uses of the named-graph: No. 1 – Data Publication [diagram: a Named-Graph containing File#1 hasIdentification application/zip, with DATA sources]
The two uses of the named-graph: No. 2 – Data Discovery/Query [diagram: one Named-Graph containing File#1 hasIdentification application/zip and File#1 hasIdentification application/msword, with DATA sources]
The two uses of the named-graph: No. 2 – Data Discovery/Query [diagram: two Named-Graphs; statements shown: File#1 hasIdentification application/zip, File#1 hasIdentification application/msword; edges labelled "Works For"]
Quads • Query Graph • Source Graph 1: File#1 hasIdentification application/zip • Source Graph 2: File#1 hasIdentification application/msword • After all, RDF is a graph model (RDF the spec, not the RDF/XML serialization)
Quads • Query Graph • Source Graph 1 (usesTool File 5.04): File#1 hasIdentification application/zip • Source Graph 2 (usesTool File 5.07): File#1 hasIdentification application/msword
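The quad idea on this slide can be sketched in a few lines of plain Python: each statement carries a fourth element naming the source graph it came from, so a query can return conflicting answers together with the tool behind each one. This is only an illustration under assumptions; the graph labels `SourceGraph1` and `SourceGraph2` and the function `query_with_provenance` are hypothetical names, not part of LDS3 or any RDF library.

```python
from typing import List, NamedTuple, Optional, Tuple


class Quad(NamedTuple):
    """A triple plus the named graph (source document) that asserts it."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Two source graphs making conflicting claims about the same file.
# Graph labels are hypothetical; the statements mirror the slide.
QUADS = [
    Quad("File#1", "hasIdentification", "application/zip", "SourceGraph1"),
    Quad("SourceGraph1", "usesTool", "File 5.04", "SourceGraph1"),
    Quad("File#1", "hasIdentification", "application/msword", "SourceGraph2"),
    Quad("SourceGraph2", "usesTool", "File 5.07", "SourceGraph2"),
]


def query_with_provenance(
    quads: List[Quad], subject: str, predicate: str
) -> List[Tuple[str, str, Optional[str]]]:
    """Return (value, source graph, tool) for every matching statement."""
    # Map each source graph to the tool it says it used, if any.
    tools = {q.subject: q.obj for q in quads if q.predicate == "usesTool"}
    return [
        (q.obj, q.graph, tools.get(q.graph))
        for q in quads
        if q.subject == subject and q.predicate == predicate
    ]


for value, graph, tool in query_with_provenance(QUADS, "File#1", "hasIdentification"):
    print(f"{value} (asserted in {graph}, using {tool})")
```

Keeping the source graph on every statement is what lets both answers coexist in one query graph without losing track of who said what.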
Still with me… • OK, so what about versioning? • File1/Identification/tool/file/version/5.03: File#1 hasIdentification University of Southampton • File1/Identification/tool/file/version/5.07: File#1 hasIdentification application/msword
Latest: /File1/Identification/tool/file/ • File1/Identification/tool/file/version/5.03: File#1 hasIdentification University of Southampton • File1/Identification/tool/file/version/5.07: File#1 hasIdentification application/msword • Versions are linked by "previous version"
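One way to picture the "Latest" pointer and the "previous version" links is a small in-memory chain, sketched below. It is not the LDS3 implementation: the class names, the `publish` and `history` methods, and the identification value stored for version 5.03 are assumptions made for illustration; only the URI pattern is taken from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class GraphVersion:
    """One immutable version of a named graph."""
    uri: str
    statements: List[Triple]
    previous: Optional[str] = None  # URI of the version this one replaced


class VersionedGraph:
    """Hypothetical container: base URI resolves to the latest version."""

    def __init__(self, base_uri: str) -> None:
        self.base_uri = base_uri
        self.versions: Dict[str, GraphVersion] = {}
        self.latest: Optional[str] = None

    def publish(self, label: str, statements: List[Triple]) -> str:
        """Store a new version and link it back to the one it supersedes."""
        uri = f"{self.base_uri}version/{label}"
        self.versions[uri] = GraphVersion(uri, list(statements), self.latest)
        self.latest = uri
        return uri

    def history(self) -> List[str]:
        """Walk the previous-version links from latest to oldest."""
        chain: List[str] = []
        uri = self.latest
        while uri is not None:
            chain.append(uri)
            uri = self.versions[uri].previous
        return chain


graph = VersionedGraph("File1/Identification/tool/file/")
# The 5.03 value is an illustrative assumption, not taken from the slide.
graph.publish("5.03", [("File#1", "hasIdentification", "application/zip")])
graph.publish("5.07", [("File#1", "hasIdentification", "application/msword")])
print(graph.history())
```

Because old versions are never overwritten, the question "what has changed, and why?" from the case study becomes a walk along the chain.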
www.LDS3.org • A technical solution to all the complexity, automatic: • Versioning • Linking • Annotation • Named-Graph Management • Query Management
www.LDS3.org • CRUD • SWORDv2 (Based Upon) • OAuth Authentication
In the paper • Links between P2-Registry, Pronom and LDS3 • Description of the LDS3 specification • Overview of software in the LDS3 stack (hardly any of it is new) • How LDS3 relates to Amazon S3 • More on named-graph versioning • More on information and non-information resources.
DEMO • http://dev.lds3.org/admin/timemachine.php?uri=http://dev.lds3.org/doc/B1/E3/7F01/8ACE-43BA-9AA9-B708B7A20263
Presenting the REF: The Results Evaluation Framework • 5 Tools (Droid, Fits, file, fido, Tika) • 65 Versions (from 2008 to now) • 1 Govdocs Corpus • 1 Question…
How accurate are file format identification tools historically?
PDF 1.4 http://data.openplanetsfoundation.org/ref/pdf/pdf_1.4/
DOCX http://data.openplanetsfoundation.org/ref/docx/
The Future • Get me the identification for a file as it would have been on 3rd October 2010.
Request:
GET /ref/?query="SELECT ?identification where file = X" HTTP/1.1
Accept-Datetime: Sun, 3 Oct 2010 12:00:00 GMT
Accept: text/plain
Response:
application/zip
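The date-based request above can be sketched with the Python standard library alone. The snippet builds the URL and headers without sending anything, then shows one plausible way a server might pick the answer that was current at the requested time. The helper names `build_request` and `as_of`, the timestamps, and the idea that the `/ref/` endpoint accepts such a query are all assumptions; the slide presents this as future work.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlencode


def build_request(base_url: str, query: str, when: datetime) -> Tuple[str, Dict[str, str]]:
    """Return the URL and headers for a time-travel query (nothing is sent)."""
    url = f"{base_url}?{urlencode({'query': query})}"
    headers = {
        # HTTP-date in GMT; `when` must be a UTC-aware datetime.
        "Accept-Datetime": format_datetime(when, usegmt=True),
        "Accept": "text/plain",
    }
    return url, headers


def as_of(versions: List[Tuple[datetime, str]], when: datetime) -> Optional[str]:
    """Pick the value of the newest version dated at or before `when`."""
    candidates = [(ts, value) for ts, value in versions if ts <= when]
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]


when = datetime(2010, 10, 3, 12, 0, tzinfo=timezone.utc)
url, headers = build_request(
    "http://data.openplanetsfoundation.org/ref/",
    "SELECT ?identification where file = X",
    when,
)

# Hypothetical version history for file X; dates are made up for illustration.
versions = [
    (datetime(2009, 6, 1, tzinfo=timezone.utc), "application/zip"),
    (datetime(2011, 2, 1, tzinfo=timezone.utc), "application/msword"),
]
print(url)
print(headers["Accept-Datetime"])
print(as_of(versions, when))
```

With the made-up history above, the October 2010 lookup falls between the two versions and so resolves to the older identification, matching the `application/zip` response on the slide.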