
A DATACITE CASE STUDY FROM THE UK DATA ARCHIVE

Tom Ensom, UK Data Service, UK Data Archive, University of Essex. C4D Workshop, July 2013, London.


Presentation Transcript


  1. A DATACITE CASE STUDY FROM THE UK DATA ARCHIVE • TOM ENSOM • UK DATA SERVICE, UK DATA ARCHIVE, UNIVERSITY OF ESSEX • C4D WORKSHOP, JULY 2013, LONDON

  2. WHO WE ARE • Established in 1968: 46 years of selecting, curating, preserving and providing access to social science data • 6,000 datasets in the collection • Over 25,000 registered users • Data and data support services for higher and further education for research, teaching and learning • Have been registered to ISO 27001 (information security standard) since June 2010

  3. OUR SERVICES • UK Data Archive is itself a department of the University of Essex • Distributed service established 1 January 2003 called the Economic and Social Data Service (ESDS) • New five-year UK Data Service from 2012

  4. WHAT WE DO • Research & development, innovation • Promoting best practice in data curation • Raise standards in data security and awareness of ethical/legal issues • Raise standards in data management • Data management hub • We provide guidance to ESRC researchers and anyone else who asks

  5. WE SUPPORT RESEARCHERS • Popular training materials • Managing and Sharing Guide • Training Resources • Website: http://data-archive.ac.uk/create-manage • Bespoke training events • Large and small scale workshops

  6. ENGAGEMENT WITH RDM COMMUNITY • Recently completed JISC Managing Research Data project with University of Essex • Cross support service, departmental engagement • Piloted an RDM infrastructure • http://www.data-archive.ac.uk/create-manage/projects/rd-essex • Outputs of value to RDM community: • Metadata profile for institutional data repositories • Research data plugin for EPrints

  7. WHY CITE DATA? It’s a vital part of a rigorous research process: • Acknowledges researchers’ sources • Gives data creators, authors and data curators proper credit when their work is reused • Facilitates data resource discovery and access • Helps track the use and impact of data collections

  8. OUR APPROACH TO CITATION • Required by our user agreement (End User Licence) for many years:

  9. OUR APPROACH TO CITATION • Should include enough information to ensure the exact version can be located “University of Essex. Institute for Social and Economic Research and National Centre for Social Research, Understanding Society: Wave 1, 2009-2010 [computer file]. 2nd Edition. Colchester, Essex: UK Data Archive [distributor], November 2011. SN: 6614.” • No widely agreed standard citation format yet! • Version information crucial

  10. PERSISTENT IDENTIFIERS • Persistent Identifiers (PIDs) • A string identifying a clearly defined digital object • Persistence must mean enduring • Identifiers must be unique • PIDs have been attached to scientific publications for some time • Next logical step: data • Also being applied to other entities, e.g. people via the ORCID system

  11. CHANGES TO DATA • Our ‘data collections’ are not discrete digital objects • Approx. 15% of UKDA data collections are altered within the first year after publication • Versioning: we need to distinguish between major and minor changes to a data collection • Integrate processes with: • Digital preservation activities • Current ingest infrastructure / workflows

  12. MINOR CHANGES – LOW IMPACT • Publication reference added • Correction of spelling in variable labels • Small changes in variable labels • Removal of (erroneously supplied) admin variables • Correction of spelling in metadata • Minor changes in documentation • New index (keyword) terms • Additional documentation added (non-fundamental) • Change in access conditions

  13. MAJOR CHANGES – HIGH IMPACT • Adding new ‘waves’ in a data series • New variable added • New labels/value codes added • Weighting variables reconstructed • Wrong data supplied (e.g., March not April) • Mis-coded data (e.g., Don’t know/Refused mix-up) • Change in format (file migration) • Significant changes in documentation • Change in access conditions
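The low/high impact split on slides 12 and 13 could be encoded as a simple lookup. The sketch below is illustrative only: the `Impact` enum, the `classify` function and the short label sets are hypothetical names, and only a few of the slide labels are copied in. Note that "Change in access conditions" appears on both slides, so this sketch assumes such a change is routed to manual review rather than guessing.

```python
from enum import Enum


class Impact(Enum):
    LOW = "low"        # internal minor version only
    HIGH = "high"      # new DOI version
    REVIEW = "review"  # listed under both headings on the slides


# A few labels taken from slides 12 and 13, lower-cased for matching.
LOW_IMPACT = {
    "publication reference added",
    "correction of spelling in metadata",
    "new index (keyword) terms",
    "change in access conditions",
}
HIGH_IMPACT = {
    "new variable added",
    "weighting variables reconstructed",
    "change in format (file migration)",
    "change in access conditions",
}


def classify(change: str) -> Impact:
    """Map a change description to its impact level."""
    key = change.strip().lower()
    in_low, in_high = key in LOW_IMPACT, key in HIGH_IMPACT
    if in_low and in_high:
        return Impact.REVIEW  # ambiguous on the slides: needs a human decision
    if in_high:
        return Impact.HIGH
    if in_low:
        return Impact.LOW
    raise ValueError(f"unknown change type: {change!r}")
```

Raising on unknown labels, rather than defaulting to low impact, keeps an unlisted change from silently skipping a new DOI.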

  14. DATACITE DOIs • 2011: we started working with the British Library and DataCite to develop a permanent, reliable method of citing our data collections • DataCite • Founded by organisations from six countries • Established a citation format for research data, including a DOI • Works with data publishers, e.g. established data centres and institutional repositories

  15. WHY DATACITE? Not the only choice, but right for us: • DOI framework an international and persistent standard for identifying digital objects • Familiar within the research data domain • Centralised resolution service • Metadata registry (and thus de facto standard) • Discovery link up • API – allowing for automation of minting process (but also manual if you prefer!)

  16. DOI FORMAT 10.5255/UKDA-SN-1-1 • 10.5255: unique archive identifier • UKDA: readable archive identifier • SN: resource identifier type • 1 (first number): resource identifier • 1 (second number): resource version
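As a rough illustration of the DOI structure on slide 16, the sketch below splits a DOI of this shape into its parts and reassembles it. The regular expression, the `ArchiveDoi` tuple and the function names are assumptions made for this example, not the Archive's own tooling.

```python
import re
from typing import NamedTuple


class ArchiveDoi(NamedTuple):
    prefix: str        # unique archive identifier, e.g. "10.5255"
    archive: str       # readable archive identifier, e.g. "UKDA"
    id_type: str       # resource identifier type, e.g. "SN"
    resource_id: int   # resource identifier
    version: int       # resource version


# Assumed shape: <prefix>/<archive>-<type>-<id>-<version>
_DOI_PATTERN = re.compile(r"^(10\.\d+)/([A-Z]+)-([A-Z]+)-(\d+)-(\d+)$")


def parse_doi(doi: str) -> ArchiveDoi:
    """Split a structured DOI into its five components."""
    match = _DOI_PATTERN.match(doi.strip())
    if match is None:
        raise ValueError(f"not a structured archive DOI: {doi!r}")
    prefix, archive, id_type, resource_id, version = match.groups()
    return ArchiveDoi(prefix, archive, id_type, int(resource_id), int(version))


def format_doi(parts: ArchiveDoi) -> str:
    """Rebuild the DOI string from its components."""
    return (
        f"{parts.prefix}/{parts.archive}-{parts.id_type}"
        f"-{parts.resource_id}-{parts.version}"
    )
```

For example, `parse_doi("10.5255/UKDA-SN-1-1")` yields a resource identifier of 1 and a version of 1, and `format_doi` turns the parts back into the same string.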

  17. DOI VERSIONING • Low impact change: 10.5255/UKDA-SN-1-1 stays 10.5255/UKDA-SN-1-1; increments minor version (internal) • High impact change: 10.5255/UKDA-SN-1-1 becomes 10.5255/UKDA-SN-1-2; increments major version (new DOI)
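The versioning rule on slide 17 can be sketched as a small immutable record. This is a minimal sketch under stated assumptions: the `VersionedCollection` class and `apply_change` function are hypothetical, and starting the internal minor counter at 0 and resetting it on a major bump are assumptions the slides do not confirm.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VersionedCollection:
    study_number: int
    major: int = 1  # visible as the last component of the DOI
    minor: int = 0  # internal only; never appears in the DOI (assumed start value)

    @property
    def doi(self) -> str:
        return f"10.5255/UKDA-SN-{self.study_number}-{self.major}"


def apply_change(collection: VersionedCollection, high_impact: bool) -> VersionedCollection:
    """Return the collection's state after a change."""
    if high_impact:
        # High impact: bump the major version, which yields a new DOI.
        # Resetting minor to 0 is an assumption of this sketch.
        return replace(collection, major=collection.major + 1, minor=0)
    # Low impact: bump only the internal minor version; the DOI is unchanged.
    return replace(collection, minor=collection.minor + 1)
```

Applied to study number 1, a low impact change leaves the DOI at `10.5255/UKDA-SN-1-1`, while a high impact change produces `10.5255/UKDA-SN-1-2`.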

  18. CREATING A NEW DOI • Minimal DataCite metadata inc. requested DOI pushed to DataCite metadata store via API • DataCite API sends back an approval • Flagged behind the scenes • New data collection ‘ingested’ • Structured DOI ‘created’ • New change log • New citation file

  19. UPDATING A DOI – HIGH IMPACT • Minimal DataCite metadata inc. requested DOI pushed to DataCite metadata store via API • DataCite API sends back an approval • Flagged behind the scenes • High impact change to data collection • Incremental DOI version ‘created’ • Update change log • New citation file

  20. UPDATING A DOI – LOW IMPACT • Minimal DataCite metadata pushed to DataCite metadata store via API • DataCite API sends back an approval • Flagged behind the scenes • Low impact change to data collection • Update change log
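The three flows on slides 18 to 20 share one shape: push metadata, wait for approval, then update local records. The sketch below models that shape without any network access. Everything named here is hypothetical: `push` is a stand-in callable for the DataCite API call, the metadata dictionary is a placeholder rather than the DataCite schema, and the citation string is a stub for the real citation file.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Metadata = Dict[str, str]
# Hypothetical stand-in for the API call: returns True when approval comes back.
PushFn = Callable[[Metadata], bool]


@dataclass
class CollectionRecord:
    title: str
    doi: Optional[str] = None  # None until the first DOI is created
    change_log: List[str] = field(default_factory=list)
    citations: List[str] = field(default_factory=list)


def process_event(
    record: CollectionRecord,
    push: PushFn,
    note: str,
    requested_doi: Optional[str] = None,
) -> bool:
    """Handle an ingest or change event.

    Pass requested_doi for a new collection or a high impact change;
    omit it for a low impact change, which keeps the current DOI.
    """
    target = requested_doi if requested_doi is not None else record.doi
    if target is None:
        raise ValueError("a new collection needs a requested DOI")
    if not push({"doi": target, "title": record.title}):
        return False  # no approval: leave local records untouched
    if requested_doi is not None:
        record.doi = requested_doi
        # New or incremented DOI: write a new citation (stubbed as a string).
        record.citations.append(f"{record.title}. DOI: {requested_doi}")
    record.change_log.append(f"{target}: {note}")
    return True
```

Passing the push step in as a callable keeps the sketch testable offline; a real deployment would replace it with an authenticated client for the registration service.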

  21. THE END RESULT… Jump page (= change log) linking each instance: • DOI: SN-####-3, SN#### Survey Waves 1-15, instance-specific data and metadata (current) • DOI: SN-####-2, SN#### Survey Waves 1-14, instance-specific data and metadata • DOI: SN-####-1, SN#### Survey Waves 1-13, instance-specific data and metadata

  22. OUR DOI METADATA

  23. CHALLENGES FOR THE FUTURE • Citing parts (fragments) of data collections • single files • subsets of quantitative data files • extracts of textual data • Still uncertainty over where exactly research data should go: IR, Subject Specific Repository, Data Journal? • Who should be minting DOIs? • Avoid assigning multiple identifiers to an object

  24. ESRC’s CITATION AWARENESS GUIDE

  25. ACKNOWLEDGEMENTS Thanks to the following UKDA/UKDS staff for their assistance in putting this together: • Matthew Woollard • Louise Corti • John Payne • Matthew Brumpton • Sharon Bolton

  26. CONTACT • TOM ENSOM • UK DATA ARCHIVE, UNIVERSITY OF ESSEX, WIVENHOE PARK, COLCHESTER, ESSEX CO4 3SQ • T: +44 (0)1206 872974 • E: tensom@essex.ac.uk • www.data-archive.ac.uk
