
ATMOSPHERIC SCIENCE DATA CENTER

National Aeronautics and Space Administration

Key Components of a Successful Earth Science Subsetter Architecture

Walter E. Baskin (1), Peter Piatko (1), John Kusterer (2)
(1) Science Systems and Applications, Inc., Hampton, VA, USA
(2) NASA Langley Research Center, Hampton, VA, USA





Presentation Transcript


Search and Subset Application Interface

Introduction

At the 2010 A-Train Symposium, the Atmospheric Science Data Center (ASDC) and the CALIPSO science team unveiled a new CALIPSO Search and Subset Application, which received a very enthusiastic response from atmospheric scientists. The template of this subsetter application architecture has since been applied to the distribution of Level 2 satellite data granules from the Clouds and the Earth's Radiant Energy System (CERES) SSF swath datasets and the Tropospheric Emission Spectrometer (TES) datasets. Science data users utilize these new tools to rapidly locate, subset, and order specific dataset parameters tailored to their requirements.

JAVA HDF Subsetter

• ASDC developed dedicated subsetters for the CALIPSO, CERES, and TES missions, leveraging the HDF Group's Java JNI libraries used in the open source HDFView application.
• These subsetters are deployed on Univa Grid Engine processing nodes and are managed by the Subsetter Workflow Framework.
• The subsetters can return subsetted files in NetCDF format.
• Types of granules subsetted from each data provider:
    • CALIPSO: HDF4
    • CERES: HDF4
    • TES: HDF-EOS (HDF5 out)

[Figure: Inspection of a CERES ES8 subset result file in the HDFView application]

Key Components

• Interactive user interface that is tightly integrated with a PostgreSQL-PostGIS metadata database specifically tailored for the science product data granules to be subsetted.
• Scalable workflow framework for scheduling potentially thousands of subset processes across a configurable number of cluster processing nodes.
• Efficient subset application with high speed access to archived data granules.
• Robust metadata mining capability focused on obtaining high resolution spatial and temporal metadata.

The CALIPSO Search Subsetter User Interface automatically updates and displays the number of granules meeting the spatial and temporal constraints as the user changes them. This dynamic feedback provides a very positive user experience. New subset interfaces under development for CERES and TES datasets leverage this functionality.

Details of the resulting data granules are displayed on the 'Confirm Request' page. Users can download a list of granules that meet their search criteria, browse profile plots for each resulting granule, or submit an order to subset the granules based on their spatial-temporal inputs.

The CALIPSO Science Team provides browse images for their LIDAR data products. These profiles are easily accessed through links under each granule result on the 'Confirm Request' page.

The ASDC subsetters leverage the Common Object Package and use specific methods in the Java HDF and HDF5 JNI interfaces to directly access lower level functions in the C libraries. (source of diagram: http://www.hdfgroup.org/hdf-java-html/hdf-object/)

Subsetter Workflow Framework

[Diagram: Web User Interface, Metadata Database, SciFlo-Univa Grid Engine, Web Server, FTP Site, and processing nodes running the JAVA HDF Subsetter (Node1, Node2, Node …)]

The Subsetter Framework is a generic framework for subset processing. It uses SciFlo as its workflow engine to drive the processing, and Univa Grid Engine as its resource scheduler, so that the subsetting can be scaled across a set of computational nodes.
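The idea of fanning an order out over a configurable number of nodes can be illustrated with a minimal sketch. It rests on loud assumptions: Python's standard-library thread pool stands in for the resource scheduler (this is not SciFlo or Univa Grid Engine code), and `subset_granule`, `run_subset_order`, `max_nodes`, and the granule names are hypothetical names invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def subset_granule(granule_name, bbox):
    """Hypothetical placeholder for one subset process.

    A real subsetter would open the archived granule and write a
    subset file; here we only return the name such a file might get.
    """
    return f"{granule_name}.subset"


def run_subset_order(granule_names, bbox, max_nodes=4):
    """Run one subset job per granule on at most `max_nodes` workers.

    The thread pool is an assumed stand-in for a cluster scheduler:
    jobs queue up and are handed to whichever worker is free.
    """
    with ThreadPoolExecutor(max_workers=max_nodes) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(lambda name: subset_granule(name, bbox), granule_names))


# Illustrative order: three made-up granule names and a bounding box
# given as (min_lon, min_lat, max_lon, max_lat).
order = ["granule_a.hdf", "granule_b.hdf", "granule_c.hdf"]
results = run_subset_order(order, bbox=(-80.0, 30.0, -70.0, 40.0), max_nodes=2)
```

Changing `max_nodes` changes how many jobs run at once without touching the job definition, which is the property the framework description emphasizes.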
Conclusion

New HDF subset and file access capabilities recently developed through ASDC's collaboration with data providers give science data users the ability to quickly subset and mine data from large archived files, and have set the stage for streaming desired data directly from archived files to a client's visualization or analysis applications.

Future Work for Improving ASDC's Subset and Science Data Access

• Machine-to-machine subset interfaces
• Very high granularity in spatial/temporal metadata
• Geospatial plots of subsetted dataset query results
• Real-time browse images of dataset query results

High Resolution Spatial Metadata Mined Directly From Archived HDF Data Granules

(Figure legend: Green = metadata used by the new Search and Subset Applications; Red = original metadata used in legacy data access applications)

In the ECS archive system, Level 2 Tropospheric Emission Spectrometer (TES) ozone metadata assumes global coverage for each daily granule. ASDC is currently working with the TES Science Team on a prototype search and subset application. The metadata database used in this prototype stores the observation location for every data entry in the granule as an array of points. Bounding box queries for observations over the entire mission consistently return results in under five seconds. This ability to obtain any observation over the life of the mission within a few seconds is unprecedented.

Metadata currently provided for one-hour CERES Level 2 SSF granules assumes full coverage of the Earth within 20 degrees of the poles and steps along the granule footprint boundaries at ten-degree longitude intervals. A newer metadata mining technique directly detects the field-of-view positions of the observations along the edges of the granule footprint and applies a Douglas-Peucker simplification to the resulting polygon. The updated hourly footprint polygon contains the same number of points as the original metadata polygon, and is more accurate.
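For readers unfamiliar with Douglas-Peucker simplification, the following is a minimal sketch of the textbook algorithm, not ASDC's implementation. It assumes points are (longitude, latitude) pairs treated as planar coordinates, which ignores dateline crossings and distortion near the poles; the function names, the sample edge, and the tolerance are all illustrative.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from `point` to the line through `start` and `end`.

    Falls back to the distance to `start` when the two endpoints
    coincide (for example, on a closed polygon ring).
    """
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(points, tolerance):
    """Return a simplified copy of `points`, always keeping both endpoints."""
    if len(points) < 3:
        return list(points)

    # Find the interior point farthest from the chord joining the endpoints.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], points[0], points[-1])
        if dist > max_dist:
            max_dist, index = dist, i

    # If every interior point is within tolerance, the chord is enough.
    if max_dist <= tolerance:
        return [points[0], points[-1]]

    # Otherwise keep the farthest point and simplify both halves.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point


# Illustrative footprint edge: small wiggles around one sharp corner.
edge = [(0, 0), (1, 0.1), (2, 0), (3, 3), (4, 0), (5, -0.1), (6, 0)]
simplified = douglas_peucker(edge, tolerance=0.5)
```

A larger tolerance discards more vertices, so the tolerance is the knob that controls how many points remain in the simplified footprint.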
The original CALIPSO Level 1 LIDAR spatial metadata is defined by a LineString consisting of ten points. The Search and Subset Application uses LineString metadata constructed from approximately 50 points, greatly increasing the accuracy of two-dimensional bounding box queries near the poles.

www.nasa.gov
IN41B-1405
