The ENSURE Linked Data Technical Registry Workshop, presented by Robert Sharpe from Tessella, delves into the role of archives and libraries in managing representation information. It reviews the evolution of technical registries like PRONOM and discusses their challenges and successes. Key topics include the significance of descriptive cataloging, detailed contexts for archival materials, and strategies for a more user-friendly linked data registry. Insights on synchronization, usability, and the importance of provenance in data management are shared, fostering engagement and feedback among participants.
ENSURE Linked Data Registry PRELIDA Workshop 2013 Robert Sharpe, Tessella
Agenda • Archives, libraries and representation information • Previous “technical registries”: • Potted History • Issues • ENSURE linked data technical registry: • What’s different? • Why do we hope it will succeed? • Conclusions and feedback…
Archives, libraries & representation information • Have held descriptive / cataloguing information for centuries: • Helps determine context and makes things unambiguous: • E.g., census records • Frequency, type of information • Professions • Parish boundaries • Includes references to other sources / archives • A “representation information network” of “linked data” • With the advent of digital material: • Need information on formats, rendering software etc. • Look to add “Technical Registry”
Technical Registries: Potted History 1/2 • PRONOM: • Started in 2001 • On-line from 2005 • “File format registry” • In fact, holds more… • Planets Core Registry (2008) • Holds even more entities • Both: • Database-based • Web-based GUI • Issues: • Partially populated • Hard to add new entities • Hard to synchronise
Technical Registries: Potted History 2/2 • Move to linked data: • Linked Data PRONOM • UDFR • … • Issues: • Partially populated • Hard to add new entities • Partial projects: enough to be used? • Hard for people to query: SPARQL but not via a simple GUI • Complex provenance
What’s different? • ENSURE Linked Data Technical Registry: • Fewer entities: more population: • Expand later • Start with the synchronisation issue • Good querying and user interface: • Human Search / Browse • Human View / Edit • Simple view of provenance • Long term commitment: • Will integrate with SDB/Preservica • 20+ organisations will use it
Data Model • Keep it simple: • Things actually used • Things actually populated • Add more if and when needed • Format: • ID, Name, Version, Description • Release Date, Withdrawn Date • Internal Signature, External Signature • Relationships • Not: • Assessments, Risk scores • Documents, Reference files, Agents • Intellectual Property • Technical Environments • XCDL, XCEL • Types, Faceting • Complex provenance
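A minimal sketch of how the Format entity listed above might look as a record. The slide gives only the field list, so the class name, the Python types, the list-of-pairs shape for relationships and the example identifier are all assumptions made for illustration, not the registry's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class FormatRecord:
    """Hypothetical record holding only the fields named on the slide."""
    format_id: str                       # ID
    name: str                            # Name
    version: str = ""                    # Version
    description: str = ""                # Description
    release_date: Optional[date] = None  # Release Date
    withdrawn_date: Optional[date] = None  # Withdrawn Date
    # Internal signatures: byte patterns found inside a file (assumed as text here).
    internal_signatures: List[str] = field(default_factory=list)
    # External signatures: outside hints such as file extensions (assumed).
    external_signatures: List[str] = field(default_factory=list)
    # Relationships as (relationship type, target format ID) pairs (assumed shape).
    relationships: List[Tuple[str, str]] = field(default_factory=list)

    def is_withdrawn(self, on: date) -> bool:
        """True if a withdrawn date is set and falls on or before the given day."""
        return self.withdrawn_date is not None and self.withdrawn_date <= on


# Illustrative values only; the identifiers are made up for this example.
pdf_14 = FormatRecord(
    format_id="example/pdf-1.4",
    name="Portable Document Format",
    version="1.4",
    internal_signatures=["%PDF-1.4"],
    external_signatures=["pdf"],
    relationships=[("has-later-version", "example/pdf-1.5")],
)
```

Keeping the record this small reflects the slide's "add more if and when needed" stance: new attributes can be added as optional fields with defaults without touching existing records.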
Allow view / edit • Needs to be simple and user friendly • Not clear whether it can then expand with the model without effort
Provenance • Blocks of information: • Format, Software, Property, Pathway • Who made change to format, when and based on what info? • Need provenance of block not each item • Store every change: • Rollback • Diff • In fact makes synchronise easy: • Receive update and detect change
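The block-level provenance described above can be sketched as an append-only history per block. This is an illustration under stated assumptions, not the ENSURE implementation: `Revision`, `BlockHistory` and their method names are hypothetical, a block is modelled as a flat dictionary of strings, and timestamps are passed in as plain strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Revision:
    """One stored state of a whole block, with who / when / based on what."""
    data: Dict[str, str]
    who: str
    when: str
    basis: str


class BlockHistory:
    """Append-only history for one block (e.g. one Format). Hypothetical sketch."""

    def __init__(self) -> None:
        self.revisions: List[Revision] = []

    def current(self) -> Dict[str, str]:
        """Return a copy of the latest block content, or {} if nothing is stored."""
        return dict(self.revisions[-1].data) if self.revisions else {}

    def commit(self, data: Dict[str, str], who: str, when: str, basis: str) -> bool:
        """Store a new revision; return False if the block is unchanged.

        The same check serves synchronisation: an incoming update that equals
        the current state is detected as "no change" and is not stored.
        """
        if self.revisions and data == self.revisions[-1].data:
            return False
        self.revisions.append(Revision(dict(data), who, when, basis))
        return True

    def diff(self, old: int, new: int) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
        """Map each field that differs to its (old value, new value) pair."""
        a, b = self.revisions[old].data, self.revisions[new].data
        return {k: (a.get(k), b.get(k))
                for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}

    def rollback(self, index: int, who: str, when: str) -> bool:
        """Restore an earlier state by committing it again, so no history is lost."""
        return self.commit(self.revisions[index].data, who, when,
                           basis=f"rollback to revision {index}")


history = BlockHistory()
history.commit({"name": "PDF", "version": "1.4"},
               "alice", "2013-09-01", "format specification")
history.commit({"name": "PDF", "version": "1.4", "description": "Portable Document Format"},
               "bob", "2013-09-02", "registry update")
changes = history.diff(0, 1)
unchanged_update_stored = history.commit(history.current(), "sync", "2013-09-03", "incoming update")
history.rollback(0, "alice", "2013-09-04")
```

Recording provenance once per revision, rather than per field, keeps the model simple while still answering who changed a format, when, and on what basis. Rollback is implemented as a new revision so that every change, including the rollback itself, remains in the history.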
Conclusions • Simple, Usable • Synchronised (as needed) • Provenance held (simply) • Expandable (with limited but not zero effort) • Being built now • Should be complete by December • Will be integrated into a working repository and thus used • Will need to iterate from there… • Comments and ideas welcome