ICOADS Archive Practices at NCAR

JCOMM ETMC-III, 9-12 February 2010. Steven Worley.


Presentation Transcript


  1. ICOADS Archive Practices at NCAR • JCOMM ETMC-III, 9-12 February 2010 • Steven Worley

  2. Topics • Environment setting • Data management tools and principles • ICOADS NCAR Release 2.5 contributions • Background Collections • Future Challenges

  3. Environment Setting • ICOADS is part of a larger collection called the Research Data Archive (RDA) • RDA – briefly • 600+ datasets (atmosphere, ocean, geosciences) • 4.3M files, 462 TB (primary data) • 6000+ unique users annually, including ICOADS • Staff: 7 scientific programmers (M.S. degrees), me, and an administrative assistant

  4. Data management principles • Always archive 2 copies of observational data • 3rd copy at a partner center (disaster recovery) • Free and open data access world-wide • Internet • Past – other media: CD-ROMs, tapes, etc. • Share what we have to build archives • E.g., digitization of Maury data in China in exchange for global land surface data

  5. Data Management Tools • New System: Common RDA tools that homogenize data management • Efficient • Scalable • Old System: Specialized software to manage each data input • Inefficient • Difficult to scale • [Diagram labels: NWP Server; University Server; Unidata Server; Specialized Software Packages 1, 2, 3; RDA Data Management Common Tool Set; RDA Metadata Database; GCMD Metadata Server; RDA Data Server; Online Disk; Tape Storage]

  6. Data Management tools – a few details • Common scripting structure to do routine dataset updates (dsupdt) • Very tunable • Frequency, multiple server priority list, validation • Fully integrated with RDADB • User's view is automatically updated and therefore always current • Common single archiving function (dsarch) • Location and copy control (MSS/HPSS storage, and online disk) • Fills all DB entries (e.g. file and dataset relationships)
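The "multiple server priority list" and "validation" settings on the slide above can be illustrated with a minimal sketch. This is not the dsupdt code, whose interface is not shown in the slides; the function name `fetch_with_fallback` and its `fetch` and `validate` callbacks are hypothetical, chosen only to show the idea of trying sources in priority order and accepting the first result that passes validation.

```python
def fetch_with_fallback(servers, fetch, validate):
    """Try each server in priority order and return (server, data)
    for the first download that passes validation.

    `fetch` and `validate` are caller-supplied callables (hypothetical
    names, not part of any RDA tool)."""
    errors = []
    for server in servers:
        try:
            data = fetch(server)
        except OSError as exc:
            # Network or I/O failure: record it and move to the next server.
            errors.append((server, str(exc)))
            continue
        if validate(data):
            return server, data
        errors.append((server, "validation failed"))
    raise RuntimeError(f"no server returned valid data: {errors}")


def demo_fetch(server):
    # Stand-in for a real download; the first server is unreachable.
    if server == "primary":
        raise OSError("connection refused")
    return {"mirror-a": b"", "mirror-b": b"payload"}[server]


# The empty file from mirror-a fails validation, so mirror-b is used.
chosen, payload = fetch_with_fallback(
    ["primary", "mirror-a", "mirror-b"], demo_fetch, lambda d: len(d) > 0
)
```

Keeping the download and validation steps as separate callables is one way to make such a routine "tunable" per dataset, in the sense the slide describes.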

  7. Data management tools • Harvest file level metadata (gatherxml) • Handle various formats (GRIB1, GRIB2, netCDF, BUFR, IMMA, ON29, etc.) • Save as <xml> and populate DB • Benefits • Problem detection • Versioning, replacement, extension • Inventory information • Drive better data service for users
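As a rough illustration of harvesting file-level metadata and saving it as XML, the sketch below summarizes a list of already-decoded records. It is not gatherxml: the function name `summarize_file`, the element names (`file`, `count`, `period`, `bbox`) and the record layout are all assumptions made for this example, and decoding of the real formats (IMMA, GRIB, netCDF, etc.) is left out.

```python
import xml.etree.ElementTree as ET


def summarize_file(filename, fmt, records):
    """Build a small XML summary for one data file.

    `records` is an iterable of (iso_date, lat, lon) tuples that a
    format-specific reader would supply (assumed layout)."""
    records = list(records)
    if not records:
        raise ValueError("no records to summarize")
    dates = [r[0] for r in records]  # ISO dates sort correctly as strings
    lats = [r[1] for r in records]
    lons = [r[2] for r in records]

    root = ET.Element("file", name=filename, format=fmt)
    ET.SubElement(root, "count").text = str(len(records))
    ET.SubElement(root, "period", start=min(dates), end=max(dates))
    ET.SubElement(
        root, "bbox",
        south=str(min(lats)), north=str(max(lats)),
        west=str(min(lons)), east=str(max(lons)),
    )
    return ET.tostring(root, encoding="unicode")


xml_text = summarize_file(
    "example.imma", "IMMA",
    [("2009-01-01", 10.0, -30.0), ("2009-01-31", -5.5, 12.0)],
)
```

A summary of this kind (record count, time period, spatial extent) is the sort of inventory information that supports the problem detection and versioning benefits listed on the slide.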

  8. Data management tools • Provide access to data in tape storage archive (dsrqst) • Relatively new, not universally available across RDA - yet • Delayed mode – with DB control (many details) • Why – RDA holds 462 TB • 40 TB online – most popular small scale products • Access to more products for greater community

  9. ICOADS Release 2.5 contributions @ NCAR • Data Preparation – format evaluations, translate native formats to IMMA format • Moored research buoy delayed mode archives • TAO, PIRATA (PMEL, JAMSTEC) • World Ocean Database 2005 • Multiple ocean profile types (NODC) • Receive/archive ICOADS data processing results • NOAA/ESRL does processing - source merging, duplicate elimination, preconditioning deletion and fixes, etc.

  10. ICOADS Release 2.5 contributions @ NCAR • Create and maintain user data access interfaces • File access • IMMA and binary (observations, monthly summary statistics) • Sub-selection (time, space, parameter) • Example coming. • Output is ASCII tabular format • Runs automatically – nearly all requests completed in 10 minutes • Keep user metrics
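The sub-selection by time, space, and parameter with ASCII tabular output could look roughly like the sketch below. It is an illustration only, not the NCAR interface: the `Obs` record type, the `subset` and `to_ascii_table` functions, and the `NA` missing-value marker are hypothetical, and the longitude test assumes a box that does not cross the date line.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Obs:
    day: date
    lat: float
    lon: float
    values: dict  # parameter name -> observed value


def subset(observations, start, end, lat_range, lon_range, params):
    """Keep observations inside the time window and lat/lon box,
    returning only the requested parameters as rows of strings."""
    rows = []
    for ob in observations:
        if not (start <= ob.day <= end):
            continue
        if not (lat_range[0] <= ob.lat <= lat_range[1]):
            continue
        if not (lon_range[0] <= ob.lon <= lon_range[1]):
            continue
        row = [ob.day.isoformat(), f"{ob.lat:.2f}", f"{ob.lon:.2f}"]
        # Missing parameters are written as "NA" so columns stay aligned.
        row += [str(ob.values.get(p, "NA")) for p in params]
        rows.append(row)
    return rows


def to_ascii_table(rows, params):
    """Render rows as whitespace-separated text with a header line."""
    header = ["date", "lat", "lon"] + list(params)
    return "\n".join(" ".join(r) for r in [header] + rows)


obs = [
    Obs(date(2009, 1, 5), 10.0, -30.0, {"SST": 26.1, "SLP": 1012.3}),
    Obs(date(2009, 2, 1), 10.0, -30.0, {"SST": 25.8}),
    Obs(date(2009, 1, 9), 60.0, -30.0, {"SST": 4.2}),
]
rows = subset(obs, date(2009, 1, 1), date(2009, 1, 31),
              (0.0, 20.0), (-40.0, -20.0), ["SST", "SLP"])
table = to_ascii_table(rows, ["SST", "SLP"])
```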

  11. ICOADS Release 2.5 contributions @ NCAR • Near-term preliminary extensions to R2.5 • Beginning with 2008 data and continuing forward • Based on NCEP GTS compilation/merge • Runs on day 2 of each month and processes the previous month • Create IMMA observations and binary monthly summary statistics • Harvest file level metadata • Do all archiving of original and processed files • Automatically update user interfaces
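The monthly schedule (run on day 2, process the previous month) involves one small piece of date arithmetic that is easy to get wrong at year boundaries. A minimal sketch, with hypothetical function names `should_run` and `previous_month` that are not taken from the NCAR scripts:

```python
from datetime import date, timedelta


def should_run(today, run_day=2):
    """True on the scheduled day of the month (day 2 on the slide)."""
    return today.day == run_day


def previous_month(run_date):
    """Return (year, month) of the calendar month before run_date."""
    # Step back one day from the first of the current month.
    last_day_of_previous = run_date.replace(day=1) - timedelta(days=1)
    return last_day_of_previous.year, last_day_of_previous.month


# A run on 2 January 2010 processes December 2009.
target = previous_month(date(2010, 1, 2))
```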

  12. Brief drive through ICOADS @ NCAR

  13. World-wide User Access

  14. File Level Metadata – ICOADS IMMA Example

  15. File Level Metadata – ICOADS IMMA Example

  16. 8 pages of information like this

  17. A look at 2009

  18. What is happening in 2009?

  19. World-wide User Access

  20. Similar service for the monthly summary statistics

  21. Who uses the sub-setting interfaces? 2005-2009: 58 countries

  22. Background Collections • Historical • Most complete set of ALL source data used to create ALL ICOADS Releases • Beginning in mid-1980s • Copies of ALL ICOADS Releases • We do not delete any files

  23. Background Collections • Ongoing / Routine data receipts • Format conversions are done at NCDC

  24. Future Challenges • Eliminate user interface dependency on Java applets; deploy JavaScript instead • Support “advanced” ICOADS initiative • Bias adjusted / corrected observations • Serve as a central DB / handle data ingest • Build a user interface • Continue as a full U.S. partner
