
The Research Data Archive at NCAR: A Metadata System that Enables Discovery Across a Diverse Archive

Robert Dattore and Steven Worley, National Center for Atmospheric Research, Boulder, CO, USA


Presentation Transcript


  1. The Research Data Archive at NCAR: A Metadata System that Enables Discovery Across a Diverse Archive. Robert Dattore and Steven Worley, National Center for Atmospheric Research, Boulder, CO, USA. AMS 2011

  2. Outline
     • Introduction
     • RDA - Then
     • RDA - Now
     • Data Discovery

  3. Introduction
     • Purpose: support climate and weather research at NCAR; services are extended worldwide as resources permit
       • Observations and derived products; focus on historical atmosphere/ocean data
     • Metrics
       • Established in the 1960s
       • 600+ datasets, 4M files, 600 TB
       • 7,000 users annually

  4. Introduction
     • Changing data landscape
       • Then: small datasets, single country/experiment, specialized formats
       • Now: global coverage, high spatial/temporal resolutions, standard formats
     • Result and challenge
       • Lots of diversity
       • How can we provide uniform discovery?

  5. Then

  6. Then
     • Bottom line
       • Increasing data diversity, evolving technology; difficult to develop good systematic discovery
       • README files, directory names
       • Primarily via personal communications
     • Major limiting factor: insufficient metadata
       • No metadata standard or dictionaries
       • Collection not uniform across all datasets
       • Rigidly structured flat ASCII files
       • Archiving separate from metadata collection
     • Unscalable system!

  7. Now

  8. Now
     • Developed local standard for discovery based on DIF [1] and THREDDS [2]; applied across all datasets
     • Adopted GCMD [3] controlled vocabularies
       • Local enhancements, e.g. data formats
     • Harvest two types of file metadata
       • File attribute: name, size, compression, …
       • File content: variables, levels, date range, ...
     • Storage using XML

     [1] Directory Interchange Format, NASA/GCMD [3]
     [2] Thematic Realtime Environmental Distributed Data Services
     [3] Global Change Master Directory
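As an illustration of the "file attribute" harvest described on this slide, the following is a minimal sketch, not the RDA's actual tooling. It records a file's name, size, and compression as an XML element. The element and attribute names (`file`, `dataset`, `name`, `size`, `compression`) and the dataset ID `dsX` are assumptions made for this example; the slides do not show the real schema.

```python
import gzip
import os
import tempfile
import xml.etree.ElementTree as ET


def detect_compression(path):
    """Guess the compression type from the file's leading magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(3)
    if magic[:2] == b"\x1f\x8b":
        return "gzip"
    if magic == b"BZh":
        return "bzip2"
    return "none"


def file_attribute_xml(path, dataset_id):
    """Build an XML element for one archived file (hypothetical schema)."""
    elem = ET.Element("file", attrib={"dataset": dataset_id})
    ET.SubElement(elem, "name").text = os.path.basename(path)
    ET.SubElement(elem, "size").text = str(os.path.getsize(path))
    ET.SubElement(elem, "compression").text = detect_compression(path)
    return elem


if __name__ == "__main__":
    # Demo on a throwaway gzip file; "dsX" is a made-up dataset ID.
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.dat.gz")
        with gzip.open(sample, "wb") as f:
            f.write(b"example payload")
        print(ET.tostring(file_attribute_xml(sample, "dsX"), encoding="unicode"))
```

Reading magic bytes rather than trusting the file extension is one possible design choice; it keeps the harvested attribute correct even when files are misnamed.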

  9. Metadata Collection

  10. Metadata Collection
      • Tools that automatically capture file metadata
        • Integrated with archiving activities
      • Web-based GUI: guided entry of dataset discovery metadata
        • Required fields, constrained entries
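The "required fields, constrained entries" idea can be sketched as a small validation routine. This is an assumption-laden illustration only: the field names in `REQUIRED_FIELDS` and the format list in `ALLOWED_FORMATS` are hypothetical stand-ins, not the RDA's actual field set or controlled vocabulary.

```python
# Hypothetical required fields for a dataset discovery record.
REQUIRED_FIELDS = ("title", "summary", "data_format", "keywords")

# Hypothetical controlled list; a real system would load its vocabulary.
ALLOWED_FORMATS = {"NetCDF", "GRIB", "ASCII"}


def validate_record(record):
    """Return a list of problems; an empty list means the record is accepted."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"missing required field: {field}")
    fmt = record.get("data_format")
    if fmt and fmt not in ALLOWED_FORMATS:
        errors.append(f"data_format not in controlled list: {fmt}")
    return errors


if __name__ == "__main__":
    draft = {"title": "Example dataset", "data_format": "grib1"}
    for problem in validate_record(draft):
        print(problem)
```

Returning every problem at once, rather than stopping at the first, matches how a guided entry form would typically flag all incomplete fields together.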

  11. Relational Databases

  12. Relational Databases
      • All together, support accurate data discovery
        • Fast access
      • Dataset discovery metadata
        • Single database (~0.3M rows)
      • File attribute metadata
        • Single database (~45M rows)
        • Maintains dataset/data file relationships
      • File content metadata
        • Four databases structured to handle diversity of data (~920M rows)
        • Maintains detailed parameter relationships
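To make the "maintains dataset/data file relationships" point concrete, here is a minimal sketch using an in-memory SQLite database. It is a simplification under stated assumptions: the slides describe separate databases, while this example uses two tables in one database, and all table names, column names, and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per dataset, one row per archived file.
SCHEMA = """
CREATE TABLE datasets (
    id    TEXT PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE files (
    id         INTEGER PRIMARY KEY,
    dataset_id TEXT NOT NULL REFERENCES datasets(id),
    name       TEXT NOT NULL,
    size_bytes INTEGER NOT NULL
);
CREATE INDEX idx_files_dataset ON files(dataset_id);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO datasets VALUES (?, ?)",
        [("dsA", "Example reanalysis"), ("dsB", "Example surface obs")],
    )
    conn.executemany(
        "INSERT INTO files (dataset_id, name, size_bytes) VALUES (?, ?, ?)",
        [("dsA", "a1.grb", 1000), ("dsA", "a2.grb", 2500), ("dsB", "b1.txt", 400)],
    )
    conn.commit()
    return conn


def dataset_summary(conn):
    """Return (dataset id, file count, total bytes) for every dataset."""
    return conn.execute(
        """
        SELECT d.id, COUNT(f.id), COALESCE(SUM(f.size_bytes), 0)
        FROM datasets AS d
        LEFT JOIN files AS f ON f.dataset_id = d.id
        GROUP BY d.id
        ORDER BY d.id
        """
    ).fetchall()


if __name__ == "__main__":
    for row in dataset_summary(build_demo_db()):
        print(row)
```

The index on `files(dataset_id)` is the kind of structure that keeps per-dataset lookups fast as the file table grows, which is the property the slide emphasizes.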

  13. Data Discovery

  14. Data Discovery
      • Dataset discovery
        • Google-like dataset search
        • “Look For Data” interface: user-defined dataset catalogs
        • Auto-generated dataset pages: always up to date
        • Collections: all reanalyses, upper air obs, surface obs

  15. Data Discovery
      • Data file discovery
        • “Create Your Own List” for data file lists
        • Show specific files from terabyte-sized collections
      • Other
        • “Station Viewer”
          • Google Maps; see stations, metadata
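One way to picture how file content metadata could drive a tool like "Create Your Own List" is a filter over per-file records by variable and date range. This is a hedged sketch, not the RDA implementation: the `FileContent` record, the `select_files` function, and the sample catalog are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileContent:
    """Hypothetical content-metadata record for one data file."""
    name: str
    variables: frozenset
    start: date
    end: date


def select_files(catalog, variable, start, end):
    """Return files that hold `variable` and overlap the range [start, end]."""
    return [
        f for f in catalog
        if variable in f.variables and f.start <= end and f.end >= start
    ]


# Made-up catalog entries for demonstration only.
CATALOG = [
    FileContent("f1990.grb", frozenset({"temperature", "pressure"}),
                date(1990, 1, 1), date(1990, 12, 31)),
    FileContent("f1991.grb", frozenset({"temperature"}),
                date(1991, 1, 1), date(1991, 12, 31)),
    FileContent("f1992.grb", frozenset({"pressure"}),
                date(1992, 1, 1), date(1992, 12, 31)),
]

if __name__ == "__main__":
    picked = select_files(CATALOG, "temperature", date(1990, 6, 1), date(1991, 3, 1))
    print([f.name for f in picked])
```

The overlap test (`f.start <= end and f.end >= start`) selects any file whose coverage intersects the requested period, so a user can narrow a terabyte-sized collection to just the files that matter.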

  16. Metadata Sharing
      • OAI-PMH
        • UCAR Community Data Portal (THREDDS)
        • Global Change Master Directory (DIF)
        • also Dublin Core, native
        • easy to add others as necessary
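For readers unfamiliar with OAI-PMH, a harvester asks a repository for records with a plain HTTP request. The sketch below only builds such a request URL and makes no network call. The `ListRecords` verb and the `metadataPrefix`, `from`, `until`, and `set` arguments come from the OAI-PMH 2.0 protocol, and `oai_dc` is the protocol's standard Dublin Core prefix; the base URL `https://example.org/oai` is a placeholder, not the RDA's actual endpoint.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix,
                         from_date=None, until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no request is sent)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder endpoint; substitute a real repository's base URL.
    print(oai_list_records_url("https://example.org/oai", "oai_dc",
                               from_date="2011-01-01"))
```

Because the metadata format is just a request parameter, a repository that already stores its records in a neutral form can expose additional formats without changing the harvesting interface, which fits the slide's "easy to add others as necessary".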

  17. Thank You!
      • Web: http://dss.ucar.edu
      • Email: dssweb@ucar.edu
      • Questions/comments?
