Metadata Development in the Earth System Curator
Spanning the Gap Between Models and Datasets
Rocky Dunlap, Georgia Tech
5th GO-ESSP Community Meeting

Motivation
• Two primary products of the climate community: datasets and the models used to produce them
[Figure: Models, Datasets]
5th GO-ESSP Community Workshop

Motivation
• Many efforts are in place to provide uniform access to datasets
• Additionally, groups are working to develop frameworks for component exchange and interoperability
• But models and datasets are currently treated as distinct and separate entities
• Earth System Curator’s claim:
  • This gap is actually an artificial barrier that inhibits access to resources and results

What is the Earth System Curator?
• The goal of the Earth System Curator project is to provide a unified interface to both climate models and their output data
• This means a single portal will give you access to both
[Figure: Models, Datasets]

Convergence of Models and Data
• ESC begins with a crucial insight: the descriptors used for comprehensively specifying a model configuration are also needed for a scientifically useful description of the model output data
• This leads to the convergence of models and data
• There is a need for a common metadata formalism to unify the treatment of models and data

Metadata
• Metadata is data about data
• Curator needs metadata that describes not only model outputs but also the actual model configuration used to produce the data
• Provenance metadata is a big part of this
[Figure: Model Metadata, Model Run, Output]

Use Case Scenario
• Jane Scientist has developed a malaria model where mosquito breeding rates are modeled as a function of rainfall and temperature
• Because mosquitoes provide no feedback to the climate model, she can run her model “offline” from an existing climate dataset

Use Case Scenario
• Using Curator she discovers the needed dataset, but finds out that the model used to produce it systematically underestimates rainfall
• Using the configuration description in Curator, she is able to re-run the model with new parameters to correct for the rainfall bias

Current Efforts
• Numerical Model Metadata (NMM)
• Earth System Modeling Framework (ESMF)
• Earth System Grid (ESG)
• GFDL Curator database

ESMF Metadata Structures
• Component
  • Logical entity that models a particular physical process or computational function
• State
  • Import/Export: transport data to and from a component
• Field
  • Physical quantity
• Grid

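As a rough illustration of how the four structures on this slide relate, here is a minimal Python sketch. It is not the ESMF API: the class layout, attribute names (`units`, `shape`, `kind`), and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Grid:
    """Discretization that a field is defined on (hypothetical attributes)."""
    name: str
    shape: Tuple[int, int]  # e.g. (n_lat, n_lon)


@dataclass
class Field:
    """A physical quantity defined on a grid."""
    name: str
    units: str
    grid: Grid


@dataclass
class State:
    """Container that transports fields to ("import") or from ("export") a component."""
    kind: str
    fields: List[Field] = field(default_factory=list)


@dataclass
class Component:
    """Logical entity modeling a physical process or computational function."""
    name: str
    import_state: State
    export_state: State


# Hypothetical example values, not taken from any real model.
grid = Grid("latlon_2deg", (90, 180))
atm = Component(
    name="atmosphere",
    import_state=State("import", [Field("sea_surface_temperature", "K", grid)]),
    export_state=State("export", [Field("precipitation", "kg m-2 s-1", grid)]),
)
exported_names = [f.name for f in atm.export_state.fields]
```

The point of the sketch is only the containment relationship: a component exposes import and export states, each state carries fields, and each field refers to a grid.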
Metadata Needs
• What kinds of metadata do we need for Curator to be a success?
  • Metadata to describe the complex component hierarchies of current climate models, especially components that are made up of multiple sub-components
  • Metadata to describe applications that exist across multiple repositories (e.g., labs)
  • Metadata to describe couplers as first-class citizens

Component Hierarchy
[Figure: NASA GEOS-5 ESMF application, showing parent/child relationships and “swappable” components]

Component Hierarchy
• Parent components need to identify their children
  • Children are created/invoked/destroyed by the parent component
  • Parent acts as a driver
• Allow for additional types of science components beyond the typical atmosphere, ocean, sea ice, etc.

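The parent/child requirement above can be sketched as a simple tree in which each parent holds references to its children. This is an illustrative assumption about how such metadata might be represented, not Curator's schema; `ComponentNode`, the free-form `kind` string, and the component names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ComponentNode:
    name: str
    # Free-form type label, so science components beyond the usual
    # atmosphere/ocean/sea ice categories can be described.
    kind: str = "generic"
    children: List["ComponentNode"] = field(default_factory=list)

    def add_child(self, child: "ComponentNode") -> "ComponentNode":
        """Register a child so the parent can identify (and drive) it."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "ComponentNode"]]:
        """Yield (depth, node) pairs in depth-first, parent-before-child order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Hypothetical hierarchy for illustration only.
root = ComponentNode("driver", kind="driver")
atm = root.add_child(ComponentNode("atmosphere", kind="atmosphere"))
atm.add_child(ComponentNode("dynamics", kind="dynamics"))
atm.add_child(ComponentNode("physics", kind="physics"))
root.add_child(ComponentNode("ocean", kind="ocean"))

outline = [(depth, node.name) for depth, node in root.walk()]
```

Walking the tree from the root gives the full hierarchy; swapping a component then amounts to replacing one subtree.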
Multi-Repository Applications
• Components are highly decoupled and have well-defined interfaces
• This allows us to combine components from different labs into one model
• Components should be treated as standalone entities
• Components should be related to a particular framework (PRISM, ESMF, etc.)
• Configuration description stored separately from general component description

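One way to picture the last point is to keep an immutable, general description of a component apart from any number of run-specific configurations that refer to it. The sketch below assumes hypothetical field names (`framework`, `repository`, `parameters`) and a made-up repository host and parameter; none of these come from Curator itself.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class ComponentDescription:
    """General, configuration-independent description of a standalone component."""
    name: str
    framework: str   # e.g. "ESMF" or "PRISM"
    repository: str  # where the component lives (hypothetical identifier)


@dataclass
class Configuration:
    """Run-specific settings, stored separately and pointing at a description."""
    component: ComponentDescription
    parameters: Dict[str, float] = field(default_factory=dict)


# Hypothetical values for illustration only.
ocean = ComponentDescription("ocean", "ESMF", "lab-a.example.org")
control = Configuration(ocean, {"mixing_coefficient": 1.0})
tuned = Configuration(ocean, {"mixing_coefficient": 1.5})
```

Both configurations share the same description object, so a component from one lab can be described once and configured differently in each application that uses it.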
Couplers
• Couplers are components that act as “translators” between two components
  • e.g., regridding, averaging
• However, couplers may also have significant science code inside, and therefore should be treated as components
• We still need to distinguish couplers from other components

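A minimal sketch of this idea: a coupler is modeled as an ordinary component that also carries a marker distinguishing it from other components. The `is_coupler` flag, the class names, and the one-dimensional averaging used as a stand-in for regridding are assumptions for illustration, not ESMF or Curator code.

```python
from typing import List


class Component:
    """Base class: every model part, including couplers, is a component."""
    is_coupler = False

    def __init__(self, name: str) -> None:
        self.name = name


class AveragingCoupler(Component):
    """Coupler that coarsens data by averaging consecutive groups of values."""
    is_coupler = True  # marker that distinguishes couplers from other components

    def __init__(self, name: str, factor: int) -> None:
        super().__init__(name)
        self.factor = factor

    def run(self, values: List[float]) -> List[float]:
        """Average each consecutive block of `factor` values (1-D coarsening)."""
        if len(values) % self.factor != 0:
            raise ValueError("input length must be a multiple of the factor")
        return [
            sum(values[i:i + self.factor]) / self.factor
            for i in range(0, len(values), self.factor)
        ]


# Hypothetical usage: pass a fine-resolution field to a coarser component.
coupler = AveragingCoupler("atm_to_ocn", factor=2)
coarse = coupler.run([1.0, 3.0, 5.0, 7.0])  # -> [2.0, 6.0]
```

Because the coupler inherits from `Component`, it can be archived and queried like any other component, while the flag keeps it identifiable as a coupler.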
Deliverables
• Allow researchers to archive and query Earth system models, experiments, model components, and model output data
• Perform technical compatibility checking
  • How can we determine if two components will run together?
  • What about scientific compatibility?
• Prototype auto-assembly of components to facilitate model runs
  • Involves automatic code generation of simple couplers

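In its simplest form, the technical compatibility question above could be approached by comparing what one component exports with what another imports. The sketch below assumes each side is described as a mapping from field name to units; the function name, the field names, and the units are hypothetical, and a real check would need far more metadata (grids, time steps, framework versions).

```python
from typing import Dict, List


def check_compatibility(exports: Dict[str, str], imports: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the interfaces match.

    Both arguments map field name -> units. Only names and units are compared.
    """
    problems = []
    for name, units in imports.items():
        if name not in exports:
            problems.append(f"missing field: {name}")
        elif exports[name] != units:
            problems.append(
                f"unit mismatch for {name}: {exports[name]} vs {units}"
            )
    return problems


# Hypothetical interfaces, loosely following the malaria use case.
climate_exports = {"precipitation": "kg m-2 s-1", "air_temperature": "K"}
malaria_imports = {"precipitation": "mm day-1", "air_temperature": "K"}

problems = check_compatibility(climate_exports, malaria_imports)
```

A unit mismatch like the one flagged here is also the kind of gap that an automatically generated simple coupler might fill. Scientific compatibility is a separate question that a check of names and units does not answer.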
Broader Impacts
• Improve climate prediction for policy makers
• Facilitate Model Intercomparison Projects (MIPs) by allowing fast setup and execution of experiments using different model components
• Encourage Curator-like activity in other domains

ESC Collaborators
• NSF funded
• National Center for Atmospheric Research
• NOAA Geophysical Fluid Dynamics Laboratory
• MIT
• Georgia Tech

Thanks!
• Website: http://www.cc.gatech.edu/projects/curator/
• Questions?