NOAA’s National Ocean Service • Office of Response and Restoration. Serving unstructured grids using OPeNDAP: Using server-side operations to subset and subsample data. Christopher Barker, NOAA Office of Response & Restoration, Emergency Response Division. James Gallagher, OPeNDAP, Inc.
NOAA Emergency Response Division • National Contingency Plan specifies NOAA’s role in supporting the Coast Guard: “Provide scientific expertise to support an incident response for Oil and Chemical Spills”
Key Role: Trajectory Modeling • Where is the oil (or chemical) going?
Primary Tool: GNOME (General NOAA Operational Modeling Environment) • Lagrangian element (particle) model • Forcing from external sources: • Winds • Currents • Currents: • In house model • External operational models
Example: Deepwater Horizon • Ocean models utilized: • NOAA CSDL: NGOM • Navy models: NCOM, HYCOM, IASNFS • USF: West Florida Shelf ROMS • TGLO/TAMU: TX shelf ROMS • NC State: SABGOM • All structured grid models
Unstructured Grid Models? • Unstructured Grids: • Allow resolution to vary spatially • Conform to boundaries • Nice for oil spills and particle tracking • Many more UGRID models coming online • Many papers at this conference
Some Models of Interest • FVCOM: • nGOMOFS (NOAA CSDL) • Gulf of Maine/Mass Bay (UMASS) • Salish Sea (PNNL) • SELFE: • Columbia River (OHSU) • Texas Estuaries models (UT) • ADCIRC: • Gulf of Mexico / Southern LA and Texas grid: 9,108,128 nodes, 18,061,765 elements
nGOMOFS (NOAA CSDL) V6 90,310 Nodes 174,550 Elements
What if I just need Mobile Bay? Mobile Bay, AL detail grid. About 300 m grid resolution along a 13 m deep navigation channel
FVCOM-GoM/GB for Mass Bay and Nantucket Sounds/Shoals Boston Inner Harbor
ADCIRC: Gulf of Mexico / Southern LA and Texas grid (SL18TX) • Gulf of Mexico / Southern LA and Texas grid: 9,108,128 nodes, 18,061,765 elements • Just surface currents: • 275 MB per time step (plus the grid specs)
Obstacles to using UGRID models: • No standard for data/results on UGRIDs: • Informal working group for (quite!) a few years • Recent draft standard (netCDF3) • Work on the netCDF-Java library to support it (SURA modeling test bed project) • Big Grids: • Need server-side subsetting
How to get it done? • NOAA/ORR post-DWH funding: • Better able to respond to large spills • We started talking to folks about server-side subsetting options • But we’re clients: • We’re not going to run a server • We needed something that would become an accepted standard/tool.
How to get it done? • NOAA/NESDIS noted assorted issues: • netCDF/OPeNDAP development funding limited • Multiple diverging implementations: “Unfunded Mandate” • NESDIS coordinated funding from: • Technology, Planning and Integration for Observations (TPIO) Program • OR&R • National Climatic Data Center (NCDC)
OPeNDAP-Unidata Linked Servers (OPULS) • NOAA/BAA grant supports this important collaboration between Unidata & OPeNDAP • First goal: conformance between OPeNDAP & Unidata servers, through which access is gained to growing amounts of NOAA & related data. Other short-term goals include: • Asynchronous modes, such as are needed for (delayed) access to near-line data, e.g. data stored on tape • Improved access (with server-side subsetting) to data organized on non-rectangular meshes, such as in coastal modeling • Work began in Boulder during October & will be influenced by an advisory committee (yet to be appointed)
OPeNDAP: the Data Access Protocol • DAP2 combines a simple data model with a general set of operators. • Data Model: Atomic types (e.g., ‘Integer’); Arrays; Structures; Grids; and Sequences. • Operators: These provide ways to subset all but the atomic types. • Domain neutral: By keeping the semantics of the model clean, we ensure that it can be applied to many different types of data.
But how is it used? • DAP is generally used as a ‘web service’ • DAP requests are made using a URL • DAP responses are ‘documents’: • Text that contains metadata • Combination of text/metadata and binary data. • Applications read these responses and use them in whatever ways they see fit: • the netCDF client library makes legacy applications believe they are reading from a local file
About Array and Grid Selection • In addition to requesting a Grid or Array, the Selection can be used to subset in indicial space.
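To make "subsetting in indicial space" concrete, here is a minimal Python sketch that translates a DAP2 index constraint into Python slices. It assumes the DAP2 forms [i], [start:stop] and [start:stride:stop], where the stop index is inclusive. The function names hyperslab_to_slices and subset_shape are made up for this illustration and are not part of any OPeNDAP library.

```python
def hyperslab_to_slices(constraint):
    """Convert a DAP2 index constraint such as '[0][3:6][5:8]' to Python slices.

    DAP2 stop indices are inclusive, so 1 is added to obtain the
    exclusive stop that Python slicing expects.
    """
    slices = []
    for part in constraint.strip("[]").split("]["):
        fields = [int(f) for f in part.split(":")]
        if len(fields) == 1:        # [i]: a single index
            start, stride, stop = fields[0], 1, fields[0]
        elif len(fields) == 2:      # [start:stop]
            start, stride, stop = fields[0], 1, fields[1]
        elif len(fields) == 3:      # [start:stride:stop]
            start, stride, stop = fields
        else:
            raise ValueError(f"unrecognised index expression: [{part}]")
        slices.append(slice(start, stop + 1, stride))
    return tuple(slices)


def subset_shape(constraint):
    """Number of elements selected along each dimension."""
    return tuple(len(range(s.start, s.stop, s.step))
                 for s in hyperslab_to_slices(constraint))


if __name__ == "__main__":
    print(hyperslab_to_slices("[0][3:6][5:8]"))
    print(subset_shape("[0][3:6][5:8]"))
```

Because the stop index is inclusive, a request such as [3:6] selects four elements, not three.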
About Functions • Constraint Expression can contain functions • These functions can perform any operation that can be programmed. • Thus they provide a good way to extend a data server to perform new operations • These include operations that are not domain neutral • In Hyrax they are written in C++
Example URLs • The base URL: “http://test.opendap.org/opendap/data/nc/fnoc1.nc” • To get metadata: • Dataset variables: http://test.opendap.org/opendap/data/nc/fnoc1.nc.dds • … attributes: http://test.opendap.org/opendap/data/nc/fnoc1.nc.das • Or less readable in XML: http://test.opendap.org/opendap/data/nc/fnoc1.nc.ddx • To get data: • Just the variables u and v: http://test.opendap.org/opendap/data/nc/fnoc1.nc.dods?u,v • … in ASCII so it’s easy to read: http://…/opendap/data/nc/fnoc1.nc.asc?u,v • With subsetting: • http://test.opendap.org/opendap/data/nc/fnoc1.nc.asc?u[0][3:6][5:8] • Here’s a function: • http://…/nc/coads_climatology.nc.ascii?geogrid(SST,45,-80,20,-60,"1000<TIME<3000") • This is an example of how functions can enable domain-specific behavior; this function will return an error if the Grid is not ‘geospatial’
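The URL patterns on this slide can be assembled programmatically. The sketch below only builds strings and makes no network request. The helper name dap2_url is made up for this illustration, and leaving brackets, colons, commas and parentheses unescaped is an assumption chosen so the output matches the URLs as written on the slide.

```python
from urllib.parse import quote

BASE = "http://test.opendap.org/opendap/data/nc/fnoc1.nc"


def dap2_url(base, response, constraint=""):
    """Build a DAP2 request URL.

    response is the suffix naming the response type (dds, das, ddx,
    dods, asc); constraint is an optional constraint expression.
    """
    url = f"{base}.{response}"
    if constraint:
        # Assumption: keep the punctuation used by index constraints and
        # function calls literal; everything else is percent-encoded.
        url += "?" + quote(constraint, safe="[]:,()")
    return url


if __name__ == "__main__":
    print(dap2_url(BASE, "dds"))                     # dataset variables
    print(dap2_url(BASE, "das"))                     # attributes
    print(dap2_url(BASE, "dods", "u,v"))             # binary data for u and v
    print(dap2_url(BASE, "asc", "u[0][3:6][5:8]"))   # ASCII, index subset
```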
Challenges • Unstructured Grids are not a specific type in DAP • We must choose a way, or set of ways, to represent these data • Datasets are often too large to download – subsetting must be done server-side. • Because the subsetting operations are complex, we will need to use server-side functions to implement them
Requirements • Must enable subsetting by polygonal regions • The result must be an unstructured grid itself • A subset must preserve the topological and geometric relationships present in the whole: • we can’t just regrid everything to a more convenient form.
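These requirements can be illustrated with a small pure-Python sketch: pick the triangles that fall inside a polygon, then renumber the surviving nodes so the result is again a self-contained triangular mesh. This is not the Gridfields implementation. The function names, the flat (x, y) node list, and the rule "keep a triangle only if all three of its nodes are inside" are assumptions made for the example; a server could equally keep any triangle that touches the polygon.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices, implicitly closed."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Toggle for each edge crossed by a horizontal ray from (x, y) towards +x.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def subset_mesh(nodes, triangles, polygon):
    """Return (sub_nodes, sub_triangles, kept) for triangles fully inside polygon.

    nodes: list of (x, y); triangles: list of 3-tuples of node indices.
    kept holds the original index of every retained node, so node data
    (e.g. currents) can be subset with the same mapping.
    """
    node_inside = [point_in_polygon(x, y, polygon) for x, y in nodes]
    kept_triangles = [t for t in triangles if all(node_inside[i] for i in t)]
    # Renumber only the nodes that are still referenced, preserving connectivity.
    kept = sorted({i for t in kept_triangles for i in t})
    new_index = {old: new for new, old in enumerate(kept)}
    sub_nodes = [nodes[i] for i in kept]
    sub_triangles = [tuple(new_index[i] for i in t) for t in kept_triangles]
    return sub_nodes, sub_triangles, kept


if __name__ == "__main__":
    nodes = [(0, 0), (1, 0), (1, 1), (0, 1), (3, 0), (3, 1)]
    triangles = [(0, 1, 2), (0, 2, 3), (1, 4, 2), (4, 5, 2)]
    box = [(0.5, -0.5), (3.5, -0.5), (3.5, 1.5), (0.5, 1.5)]
    print(subset_mesh(nodes, triangles, box))
```

The connectivity table of the result refers only to the new, compact node numbering, which is what lets a client treat the subset as an unstructured grid in its own right without regridding.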
Proposed Solution • Server-side function to add subsetting • Adopt the proposed unstructured grid encoding using netCDF3 • Result of the function will be a DAP2 response • Input is netCDF3 with some additional ‘conventions’: it can be represented in DAP2 • There are existing clients that can read DAP2 • If they understand netcdf in the new convention, they will understand the results
The server-side function • ugrid(Mesh,<polygon>) • <polygon> is a comma-separated list of latitude and longitude points • However, there is an arbitrary limit to the number of characters in a URL, so • We will also support POST when OPULS makes the transition to DAP4 • It will likely take more than a year for all of DAP4 to be realized, but POST for constraint expressions will be set in the first year.
Example ugrid() calls • http://…/model.nc?ugrid(SST,45,-80,20,-60) • When ugrid() is called with two points, it will assume the polygon is a box. • http://…/model.nc?ugrid(SST,45,-80, 45,-60, 20,-60, 20,-80) • Here the polygon is the same box as above. • There’s an understood edge connecting the first and last points • Point order is important – self-intersecting polygons will raise an error.
http://…/model.nc?ugrid(SST, -71.03, 42.38, -71.06, 42.37, -71.06, 42.36, -71.06, 42.35, -71.04, 42.33, -71.01, 42.34, -71.01, 42.35, -71.03, 42.38)
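A client might assemble ugrid() calls like the ones above with a small helper. The sketch below is client-side string building only; the names polygon_from_points, is_self_intersecting and ugrid_call are made up for this illustration, and how the proposed server function actually parses its arguments is not specified here. The two-point case expands to the same four corners, in the same order, as the box example, and the self-intersection test is an optional pre-check a client could run before sending a request.

```python
def polygon_from_points(points):
    """Expand two corner points into a four-point box; otherwise return the points."""
    if len(points) == 2:
        (a1, b1), (a2, b2) = points
        return [(a1, b1), (a1, b2), (a2, b2), (a2, b1)]
    if len(points) < 3:
        raise ValueError("need two corner points or at least three vertices")
    return list(points)


def _orient(p, q, r):
    """Sign of the turn p -> q -> r (positive for counter-clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _properly_cross(p1, p2, p3, p4):
    """True if segment p1-p2 crosses segment p3-p4 at an interior point."""
    return (_orient(p3, p4, p1) * _orient(p3, p4, p2) < 0
            and _orient(p1, p2, p3) * _orient(p1, p2, p4) < 0)


def is_self_intersecting(polygon):
    """Check every pair of non-adjacent edges, including the implied closing edge."""
    n = len(polygon)
    edges = [(polygon[i], polygon[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if j == i + 1 or (i == 0 and j == n - 1):
                continue  # adjacent edges share a vertex
            if _properly_cross(*edges[i], *edges[j]):
                return True
    return False


def ugrid_call(variable, points):
    """Build a ugrid() constraint expression from a variable name and point pairs."""
    polygon = polygon_from_points(points)
    if is_self_intersecting(polygon):
        raise ValueError("polygon is self-intersecting; check the point order")
    coords = ",".join(str(v) for point in polygon for v in point)
    return f"ugrid({variable},{coords})"


if __name__ == "__main__":
    print(ugrid_call("SST", [(45, -80), (20, -60)]))
```

The helper takes coordinate pairs in whatever order the caller supplies and does not decide between latitude-first and longitude-first ordering.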
Implementation • We will use the Gridfields library [Howe 05] • The library will be extended to work with the new netCDF3 file format: “Deltares CF proposal for Unstructured Grid data model” • And to work with DAP • [Howe 05] Bill Howe, David Maier, “Algebraic Manipulation of Scientific Datasets,” VLDB Journal, 14(4), 2005
Progress so far • Gridfields has already been used to build a simpler server-side demonstration function • The Gridfields code has adopted GNU’s autotools to streamline its build. • We will factor out the C++ code into its own project, separate from the Python layer • This will simplify moving gridfields into the Linux community builds
Summary • Ugrid models are seeing wide deployment • Subsetting UGrids on the server is critical to the wide use of model results • UGrids will be encoded in netCDF3 • We will use a widely available open-source library to perform the actual operations • The results will be valid UGrids, in DAP • The work has begun
Use for Curvilinear grids, too? • Capture arbitrary polygon subset. • A rectangle in geo-coordinates is not a rectangle in grid coordinates – we generally oversample. • But that’s not always a good solution for highly deformed grids. • What would the result look like? • A new structured grid? • An unstructured grid?
Further Discussion, etc. • Meet here at ECM: • Lunch Wed? • Discussion on UGRID Google group: https://groups.google.com/group/ugrid-interoperability • OPeNDAP Wiki: http://docs.opendap.org/index.php/Projects