Unlocking the Scientific Value of NEXRAD Weather Radar Data Ramon Lawrence, Witek Krajewski, Anton Kruger, and Allen Bradley IIHR, University of Iowa ramon-lawrence@uiowa.edu http://www.cs.uiowa.edu/~rlawrenc/ http://www.iihr.uiowa.edu/~hml/projects/nexrad-itr
Overview • This presentation will briefly describe: • The data collected by the NEXRAD system and its scientific value. • The current state of NEXRAD data archiving and its use in scientific discovery. • Some of the data management challenges in archiving the data. • Our architecture for archiving and querying NEXRAD data and its future directions. Our goal is to provide the science community with ready access to the vast archives and real-time information collected by the national network of NEXRAD radars. [This requires hiding the numerous data management issues.]
NEXRAD System and Generated Data • There are over 150 NEXt generation RADars (NEXRAD) that collect real-time precipitation data across the United States. • The system has been operational for about 10 years, and the amount of collected data is continually expanding. • A radar emits a coherent train of microwave pulses and processes the reflected pulses. • Each processed pulse corresponds to a bin. There are multiple bins in a ray (beam). Rotating the radar 360° is a sweep. After a sweep, the radar elevation angle is increased and another sweep is performed. All sweeps together form a volume.
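The bin / ray / sweep / volume hierarchy described above can be sketched as nested records. This is a minimal illustration only: the class names, field names, and the tiny sizes used below are assumptions made for the example and do not reflect the actual NEXRAD Level II file layout or real scan dimensions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Ray:
    """One beam at a fixed azimuth; holds one processed value per bin."""
    azimuth_deg: float
    bins: List[float]


@dataclass
class Sweep:
    """One full 360-degree rotation at a fixed elevation angle."""
    elevation_deg: float
    rays: List[Ray] = field(default_factory=list)


@dataclass
class Volume:
    """All sweeps of one scan, ordered by increasing elevation."""
    sweeps: List[Sweep] = field(default_factory=list)

    def bin_count(self) -> int:
        # Total number of bins across every ray of every sweep.
        return sum(len(ray.bins) for sweep in self.sweeps for ray in sweep.rays)


def make_toy_volume(elevations, rays_per_sweep, bins_per_ray):
    """Build a volume filled with zeros (hypothetical sizes, not real ones)."""
    volume = Volume()
    for elevation in elevations:
        sweep = Sweep(elevation_deg=elevation)
        for i in range(rays_per_sweep):
            azimuth = 360.0 * i / rays_per_sweep
            sweep.rays.append(Ray(azimuth_deg=azimuth, bins=[0.0] * bins_per_ray))
        volume.sweeps.append(sweep)
    return volume


# Two sweeps, four rays per sweep, three bins per ray.
toy = make_toy_volume(elevations=[0.5, 1.5], rays_per_sweep=4, bins_per_ray=3)
print(toy.bin_count())
```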
Usefulness of NEXRAD Data • Although the NEXRAD system was designed for severe weather forecasting, the data collected has been used in many areas, including: • flood prediction • bird and insect migration • rainfall estimation • The value of this data has been noted by an NRC report, which labeled it a “critical resource.” • Enhancing Access to NEXRAD Data—A Critical National Resource. National Academy Press, Washington D.C. ISBN 0-309-06636-0, 1999
Archiving NEXRAD Data • Despite its value, the archival system for NEXRAD data is unsatisfactory. The National Climatic Data Center (NCDC) maintains a tape archive of the RAW data, but provides few tools for finding relevant data and processing it for research. • Some real-time data is distributed by the University Corporation for Atmospheric Research (UCAR) using their Unidata Internet Data Distribution (IDD) system. However, this still requires that users be able to: • extract and process a RAW data stream in real-time • archive it appropriately • generate metadata and indexes for retrieving it when required • filter the data set to reduce the amount of space required • develop custom tools for analysis and processing
User/Client’s View • Example query: “Find all the 2002 storms over the Ralston Creek watershed with mean areal precipitation greater than X mm, and with a spatial extent of more than Z km², with a duration of less than N hours. I want the data in GeoTIFF.” • [Diagram labels: User/Client; Query Metadata; Metadata Archive; Get URIs; Program Library; Get data (HTTP); Distributed Data Archive (NCDC, Iowa, etc.)]
NEXRAD Data Management Challenges • Storing NEXRAD Level II data results in many interesting database challenges: • Data size - A historical archive of NEXRAD data consumes many terabytes of space. • Flexibility/Variability - Unlike in commercial warehouses, the types of data and metadata that should be stored in the warehouse are not well understood and evolve over time. • Real-Time response - The data should be loaded and queryable in real-time as it is received from the radars. • Scientific Workflow - It is desirable to capture and share sequences of calculations on the raw data (scientific workflows) and develop tools that seamlessly interact with the archive.
Data Size Challenges • Individual NEXRAD Level II scans are not large (300-1000 KB). However, archiving 150 radars that each produce 10 scans per hour results in an archive rate of 36,000 scans/day, or about 17 GB/day. • Although the cost of storage has decreased dramatically (1 TB for under $10,000), this still requires a hardware investment. • A major challenge: how do you find the data files of interest? • Answer: Queryable metadata that allows you to ask for files with certain properties without browsing the entire collection. • One problem: The metadata can be huge as well, making it inefficient to search. Even worse, scientific metadata tends to change as research evolves. How does the system handle this?
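The archive-rate figures above can be checked with a few lines of arithmetic. This sketch uses only the numbers stated on the slide. The decimal unit convention (1 GB = 1,000,000 KB) and the derived "implied average scan size" are assumptions of this example, not figures given by the authors.

```python
RADARS = 150
SCANS_PER_HOUR = 10
HOURS_PER_DAY = 24

# 150 radars x 10 scans/hour x 24 hours
scans_per_day = RADARS * SCANS_PER_HOUR * HOURS_PER_DAY


def gb_per_day(scan_kb):
    """Daily archive volume in GB for a given scan size in KB (decimal units)."""
    return scans_per_day * scan_kb / 1_000_000


# Bounds implied by the stated 300-1000 KB scan size range.
low_gb = gb_per_day(300)
high_gb = gb_per_day(1000)

# Average scan size (KB) that would yield the quoted 17 GB/day.
implied_avg_kb = 17 * 1_000_000 / scans_per_day

print(scans_per_day, low_gb, high_gb, round(implied_avg_kb))
```

Under these assumptions the quoted 17 GB/day sits inside the range given by the smallest and largest scan sizes, and corresponds to an average scan somewhat under 500 KB.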
Flexibility Challenges • Ideally, the system should allow arbitrary metadata to be associated with NEXRAD files, and that metadata should be easy to add, update, and query. • Unfortunately, relational databases do not handle variable information well. Although there are some known schema designs that can handle variability, they are inefficient for large data sets. • Good news: This is not unique to hydrology. Researchers in other domains are building grids to share data/metadata and face the same challenges (e.g. GriPhyN - physics grid). • Bad news: Representing and querying variable data (especially within a relational database) is an active research problem.
Flexibility Example • One way to represent variable metadata on a data file in a relational database is to have a single table: • metadata(dataFileId, attributeName, attributeValue) • Example: • Data file 1 has three attributes: ArealCoverage, MaximumReflectivity, MinimumReflectivity. Data file 2 has two attributes, and file 3 has only one. • Note that this schema allows any (variable) number of attributes per file. • A challenge: How would you return all files that have ArealCoverage > 5 and MaximumReflectivity > 20? Answer: Join two copies of the metadata table together on dataFileId.
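The self-join answer above can be tried out with an in-memory SQLite database. This is a minimal sketch: the table and attribute names come from the slide, but the attribute values, the choice of which attributes files 2 and 3 carry, and the use of SQLite are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata ("
    "dataFileId INTEGER, attributeName TEXT, attributeValue REAL)"
)

# Hypothetical values: file 1 has three attributes, file 2 two, file 3 one.
rows = [
    (1, "ArealCoverage", 7.0),
    (1, "MaximumReflectivity", 35.0),
    (1, "MinimumReflectivity", 2.0),
    (2, "ArealCoverage", 9.0),
    (2, "MaximumReflectivity", 15.0),
    (3, "ArealCoverage", 3.0),
]
conn.executemany("INSERT INTO metadata VALUES (?, ?, ?)", rows)

# Join two copies of the table: copy "a" tests ArealCoverage,
# copy "b" tests MaximumReflectivity, matched on the same file id.
query = """
SELECT a.dataFileId
FROM metadata AS a
JOIN metadata AS b ON a.dataFileId = b.dataFileId
WHERE a.attributeName = 'ArealCoverage' AND a.attributeValue > 5
  AND b.attributeName = 'MaximumReflectivity' AND b.attributeValue > 20
ORDER BY a.dataFileId
"""
matching_files = [row[0] for row in conn.execute(query)]
print(matching_files)
```

Each additional attribute condition needs another copy of the table in the join, which is one way to see why this design becomes inefficient for large data sets.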
Scientific Workflow • A workflow is a sequence of steps that is performed on data. • Workflows have received considerable attention in settings where documents must be routed between individuals. • Think of a funding proposal being internally routed through your university. • A scientific workflow is a sequence of steps performed on scientific data. Each step uses as input the output of the previous step. An example workflow in hydrology: • retrieve the raw data files of interest • remove ground clutter and Anomalous Propagation (AP) • calculate estimated rainfall • map calculations to a basin • Our goal is to support such workflows. • How to represent and store intermediate products? • How to make the tools/algorithms interoperable?
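The example hydrology workflow above can be sketched as a chain of functions in which each step consumes the previous step's output and every intermediate product is kept. All function names, the threshold, the divide-by-ten "rainfall" rule, and the fake input values are hypothetical placeholders. They stand in for real clutter removal, rainfall estimation, and basin mapping, which the slides do not specify.

```python
def retrieve_raw(file_ids):
    # Placeholder: pretend each file holds a few reflectivity-like values.
    return {fid: [10.0, 55.0, 30.0, 80.0] for fid in file_ids}


def remove_clutter(scans, threshold=70.0):
    # Toy stand-in for ground clutter / AP removal: drop values above a threshold.
    return {fid: [v for v in vals if v <= threshold] for fid, vals in scans.items()}


def estimate_rainfall(scans):
    # Toy stand-in for a rainfall estimate (not a real reflectivity-rain relation).
    return {fid: [v / 10.0 for v in vals] for fid, vals in scans.items()}


def map_to_basin(rain):
    # Toy stand-in for basin mapping: one mean value per file.
    return {fid: (sum(vals) / len(vals) if vals else 0.0) for fid, vals in rain.items()}


def run_workflow(steps, initial):
    """Run steps in order, keeping every intermediate product by step name."""
    products = [("input", initial)]
    current = initial
    for step in steps:
        current = step(current)
        products.append((step.__name__, current))
    return products


products = run_workflow(
    [retrieve_raw, remove_clutter, estimate_rainfall, map_to_basin],
    initial=[1, 2],
)
for name, product in products:
    print(name, product)
```

Recording each named intermediate product is one simple way to approach the question of how to represent and store them. Interoperability here rests only on every step accepting and returning the same dictionary shape.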
Current Status and Future Work • We have implemented a prototype version of the architecture that is currently archiving 30 radars in real-time. Some basic statistics are being generated and can be used to retrieve data files of interest. Accessible at: • http://nexrad.cs.uiowa.edu • Immediate plans: • Generate standardized metadata for use by hydrologists. • Link NEXRAD data to basin information so that rainfall estimation and flood prediction can be performed. • This research is supported by NSF ITR Grant ATM 0427422: “A Comprehensive Framework for Use of NEXRAD Data in Hydrometeorology and Hydrology”.
Project Participants • The University of Iowa (Lead) • W.F. Krajewski (PI) • A.A. Bradley, A. Kruger, R.E. Lawrence • Princeton University • J.A. Smith (PI) • M. Steiner, M.L. Baeck • National Climatic Data Center • S.A. Delgreco (PI) • S. Ansari • UCAR/Unidata Program Center • M.K. Ramamurthy (PI) • W.J. Weber
Thank You!
A Watershed or Basin A watershed is an area of land that drains water, sediment and dissolved materials to a common receiving body or outlet.
NRC Quote on NEXRAD Data Archiving “[t]he limited use of ground-based radar rainfall data outside of the operational environment is partially attributed to the lack of research-quality data products and partially to poor archiving practices.” NRC Report, 2002
Metadata • Basic: “Find all the 2002 storms over the Ralston Creek watershed with mean areal precipitation greater than X mm, and with a spatial extent of more than Z km², with a duration of less than N hours. I want the data in GeoTIFF.” • Derived/Complex: “Find all the 2002 storms over the Ralston Creek watershed with mean areal precipitation greater than X mm, and with a spatial extent of more than Z km², with a duration of less than N hours. I want the data in GeoTIFF.”
CUAHSI Consortium of Universities for the Advancement of Hydrologic Sciences (CUAHSI)