
Active Data Management Plans: From Concerns to Opportunities

Explore the experience with (A)DMPs in WLCG, HNSciCloud, and more. Discover the challenges, opportunities, and next steps for implementing active DMPs. Join the discussion on data stewardship and open data releases.

kress


Presentation Transcript


  1. Experience with (A)DMPs in WLCG, HNSciCloud & More
     See also the ADMP “CERN workshop”: https://indico.cern.ch/event/520120/ & the RDA Active Data Management Plans IG

  2. Background
     • Requirements (suggestions) for Data Management Plans (DMPs) have been coming for several years now
     • Tools for preparing DMPs exist and there is active work to make these converge
     • (Lots of) training available but hard to get examples of “good DMPs”
     • Projects have been generating “DMPs” for much longer, sometimes in “ad hoc” formats
     • A lot of work to “improve the situation…”

  3. Workshop on Active Data Management Plans
     • Agenda, talks, videos, conclusions
     • Includes more detailed talks about HEPdata preservation & Open Data releases, as well as a full talk on ESRF DMP (later)

  4. Example Concerns
     • DMPs are often produced at the start of a project – the “minimum to get funding” – but never revisited
     • Move to “Active” DMPs
     • Many projects have multiple Funding Agencies
     • Having different (even if largely overlapping) requirements can be an issue
     • H2020 guidelines are still evolving:
     • Which version do we use? The one in effect at the start of the project or the latest?
     • In either case this compounds the issue above
     • Funding for “Open Data” – in particular after the project is over (cf. Sustainable Business Models for Data Repositories)

  5. From Concerns to Opportunities
     • Much interest in making DMPs “actionable”, i.e. by machines
     • Numerous on-going discussions on “Active” DMPs
     • Too much (too soon?) emphasis on F.A.I.R.
     • Do we really understand how to implement this?
     • Are the necessary services there?
     • Data Stewardship costs: 5% is often quoted, but 5% of what? (e.g. LHC? WLCG? WLCG T0?)
     • Open discussions on what a Data Steward is (see iPRES 2016)
     • A 5% “tax” is (perhaps) way in excess of WLCG reality
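     A minimal sketch of what a machine-“actionable” DMP check might look like, assuming a DMP is held as a structured record rather than a free-text document. The record fields, the one-year review interval and the project names are all hypothetical illustrations and are not taken from the talk or from any existing DMP tool. The check targets the concern from slide 4: plans written at the start of a project and never revisited.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class DmpRecord:
    """Hypothetical structured DMP entry (field names are assumptions)."""
    project: str
    funders: List[str]        # a project may answer to several Funding Agencies
    guideline_version: str    # which version of the funder guidelines was followed
    last_reviewed: date       # when the plan was last revisited


def is_stale(dmp: DmpRecord, today: date, max_age_days: int = 365) -> bool:
    """Return True if the plan has not been reviewed within max_age_days."""
    return (today - dmp.last_reviewed) > timedelta(days=max_age_days)


def stale_dmps(dmps: List[DmpRecord], today: date, max_age_days: int = 365) -> List[str]:
    """List the projects whose DMP is overdue for review, in input order."""
    return [d.project for d in dmps if is_stale(d, today, max_age_days)]


# Illustrative data only: project names, funders and dates are made up.
example = [
    DmpRecord("experiment-a", ["funder-1"], "v1", date(2015, 1, 1)),
    DmpRecord("experiment-b", ["funder-1", "funder-2"], "v2", date(2016, 6, 1)),
]
print(stale_dmps(example, today=date(2016, 7, 1)))
```

     A check of this kind could run as part of a regular review cycle; the review interval is a policy choice, not something the sketch can decide.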

  6. Experience

  7. HNSciCloud User Communities

  8. The “push” from Funding Agencies has helped this come about

  9. DMPs for the LHC experiments
     • The first LHC experiment to produce a “DMP” was CMS in 2012
     • This called for Open Data Releases of significant fractions of the (cooked) data after an embargo period (see the ADMP workshop)
     • Now all 4 main experiments have DMPs
     • Open Data Releases are now “routine”!
     • Compare this situation to a few years ago: huge progress has been made! See this talk @ ICHEP for more details

  10. CERN Plans (LHC & Beyond)
     • We see DMPs applying at the project level – e.g. per experiment – hopefully from a simple template
     • Reviewed as part of the (regular) process for all CERN experiments
     • This will be coupled to ISO 16363 Certification of CERN as a Digital Repository for scientific data as well as its “digital memory”
     • You can’t share or re-use data, nor reproduce results, if you haven’t first preserved everything needed (data, software, documentation, knowledge)
     • Open Data Releases – in addition to Certification – provide a powerful way of measuring whether we are achieving our goals!

  11. DMP Actions & Next Steps
     • We have seen how DMPs can be of significant value to projects, as well as to Funding Agencies
     • We will continue to work on DMPs with HNSciCloud partners, as well as ESFRI and ESFRI-like projects
     • ESFRIs are characterized by significant investment + annual operations budgets
     • Use DMPs to find synergies
     • Save time and money!
     • Continue to push for close dialog between Funders, Service Providers and User Communities

  12. “ESFRI” DMP Workshop(s)
     • Take the basic questions from DMP guidelines
     • Extend to include also data acquisition, processing and distribution (+caching?)
     • Ask a range of projects to present their requirements, plans and concerns along these lines
     • Look for – and find – synergies!
     • Repeat (extend) workshop every 12-18 months, as long as still useful

  13. Open (Access to) Data
     • It took a long time to get where we are with Open Access to publications
     • HEP has made the “data behind publications” available for decades (HEPdata)
     • “The” data is much more complex: may well require significant amounts of documentation, software, storage and computational / network resources + SUPPORT!
     • Perhaps this deserves its own workshop series?

  14. Question(s)
     • Do we work for DMPs or do DMPs work for us?
