Research Data Management
Philip Tarrant
Global Institute of Sustainability
New research data management world
Federal funding agencies now expect researchers to include data management plans in proposals.
Article 26. Sharing of Findings, Data, and Other Research Products
a. NSF expects significant findings from research and education activities it supports to be promptly submitted for publication, with authorship that accurately reflects the contributions of those involved. It expects investigators to share with other researchers, at no more than incremental cost and within a reasonable time, the data, samples, physical collections and other supporting materials created or gathered in the course of the work. It also encourages grantees to share software and inventions or otherwise act to make the innovations they embody widely useful and usable.
b. Adjustments and, where essential, exceptions may be allowed to safeguard the rights of individuals and subjects, the validity of results, or the integrity of collections or to accommodate legitimate interests of investigators.
New research data management world
Easy! Get this right.
Context: The scientist’s view
• Design experiments that will collect the data needed to answer the research question(s)
• Process those data in a way that will produce the results needed to draw sensible conclusions
• Publish those conclusions so that they can be shared with the wider scientific community and ultimately the public
Investigators often feel that their responsibility ends here.
Context: The data manager’s view
• Consolidate the data collected by investigators and produce datasets in formats that encourage re-use by other investigators
• Make those data accessible to the wider scientific community and potential citizen scientists
• Publish the metadata necessary to enable the data to be interpreted by third parties
The investigators’ responsibility actually ends HERE!
The “reusable” data challenge
How do we:
• Consolidate interdisciplinary data from disparate sources in ways that provide practical (and achievable) opportunities for powerful, complex, synthetic analysis?
• Encourage commonality in scientific measurements and data standards so that we can perform a meaningful comparison between “apples” and “apples”?
• Increase data re-use to extract the maximum possible value from precious research dollars?
• Share these data with collaborators in a way that supports answering the big questions?
Someone’s responsibility ends HERE!
Current data management process
• Researcher completes project
• Researcher plans next project
• Researcher sends data to DM
• Researcher publishes research
• Researcher starts next project
• DM pledges eternal friendship and reminds about metadata
• DM asks for data and metadata
• DM asks for data and metadata again
• DM rescinds friendship pledge while pleading for metadata
• DM dies waiting for metadata
Ideally, where do we want to be?
I need information…
Realistically, where do we want to be?
• A single, consolidated repository (“virtual” notebook) where we can store information about projects, organization, methods and protocols, and datasets (metadata)
• The means to enter data wherever we may be
• A data catalog that helps find useful research data (both published and unpublished)
• Visualization tools to help assess the value of data
A Single Datasource
[Diagram labels: People, Media/refs, Projects, GIOS DB, Methods, Metadata, Data Archives]
A catalog to help find and evaluate data
[Diagram labels: Organization data, GIOS DB, Public datasets, Internal datasets, Metadata (shown twice); callouts 1, 2, 3]
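A minimal sketch of the catalog idea, assuming a simple in-memory structure. `DatasetRecord`, its fields, and `search_catalog` are hypothetical names for illustration, not part of the GIOS system, and the sample records are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; the field names are assumptions."""
    title: str
    project: str
    keywords: List[str] = field(default_factory=list)
    published: bool = False  # True = public dataset, False = internal dataset


def search_catalog(records, keyword, include_unpublished=True):
    """Return records tagged with `keyword` (case-insensitive match)."""
    wanted = keyword.lower()
    return [
        r for r in records
        if (include_unpublished or r.published)
        and wanted in (k.lower() for k in r.keywords)
    ]


# Made-up sample records, for illustration only.
catalog = [
    DatasetRecord("Stream temperature 2010", "Project A",
                  ["water", "temperature"], published=True),
    DatasetRecord("Soil cores (draft)", "Project B", ["soil", "Water"]),
]

public_hits = search_catalog(catalog, "water", include_unpublished=False)
all_hits = search_catalog(catalog, "water")
```

Keeping the published flag on the record itself lets one catalog serve both public and internal searches, which matches the slide's point about finding published and unpublished data in one place.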
Data management workflow
[Diagram labels]
Project stages: Project initiation, Data collection, Data analysis, Project completion, Research publication
Data management tasks: Finalize datasets, Create datasets, Describe data attributes, Create metadata record(s), Describe field methods, Submit data and metadata, Describe lab methods, Create project record
Data Management System: Publish data and metadata
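The tasks named on this slide could be tracked as a simple checklist while a project is under way. This is a sketch only: the task order below is an assumption (the layout of the slide's diagram is not recoverable here), and `TASKS` and `outstanding_tasks` are hypothetical names.

```python
# Data management tasks named on the slide; the ordering is an assumption.
TASKS = [
    "Create project record",
    "Describe field methods",
    "Describe lab methods",
    "Create datasets",
    "Describe data attributes",
    "Create metadata record(s)",
    "Finalize datasets",
    "Submit data and metadata",
    "Publish data and metadata",
]


def outstanding_tasks(completed):
    """Return the tasks not yet completed, preserving the list order."""
    unknown = set(completed) - set(TASKS)
    if unknown:
        raise ValueError(f"Unknown task(s): {sorted(unknown)}")
    return [t for t in TASKS if t not in completed]


# Example: two tasks recorded early in the project.
done = {"Create project record", "Describe field methods"}
remaining = outstanding_tasks(done)
```

Checking the list as the project proceeds, rather than at the end, is one way to read the idea of recording information as you go.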
What does this mean to investigators?
• Effort required to input project and dataset information
• “Pay As You Go” model reduces the back end effort when you really want to be planning for the future
• A single resilient place to store project information – accessible by all team members
• Latest “version” always available
• Content available as input to manuscripts
• The next data management plan is… Easy!
Project Timeline
• Design phase: December – February
• Development:
  • Organization/project module: Jan – April
  • Metadata module: March – July
• Assistance with testing: June – July
• Training: July – August