
Ilya Zaslavsky San Diego Supercomputer Center, UCSD

WEB SERVICES FOR UNIFIED ACCESS TO NATIONAL HYDROLOGIC DATA REPOSITORIES AND REAL TIME OBSERVATION DATA: CUAHSI HIS EXPERIENCE. Ilya Zaslavsky San Diego Supercomputer Center, UCSD




Presentation Transcript


  1. WEB SERVICES FOR UNIFIED ACCESS TO NATIONAL HYDROLOGIC DATA REPOSITORIES AND REAL TIME OBSERVATION DATA: CUAHSI HIS EXPERIENCE
     Ilya Zaslavsky, San Diego Supercomputer Center, UCSD
     CUAHSI = Consortium of Universities for the Advancement of Hydrologic Sciences, Inc.; HIS = Hydrologic Information System
     Collaborative project: UT Austin + SDSC + Drexel + Duke + Utah State
     www.cuahsi.org/his/

  2. SDSC Spatial Information Systems Lab: research and system development
     • Services-based spatial information integration infrastructure
     • Mediation services for spatial data, query processing, map assembly services
     • Long-term spatial data preservation
     • Spatial data standards and technologies for online mapping (SVG, WMS/WFS)
     • Support of spatial data projects at SDSC and beyond: in geosciences (GEON, CUAHSI, CBEO, …), in neurosciences (BIRN, CCDB), in regional development (NIEHS SBRP, Katrina)

  3. The Grid is becoming the backbone for collaborative science and data sharing. Cyberinfrastructure (CI) is about RE-USING data and research resources!

  4. Cyberinfrastructure for hydrology
     • Hydrologic observations:
       • Reliance on federally-organized data collection (NWIS, STORET, NCDC, etc.) with huge and complex nomenclatures
         → simplifying access to federal repositories
         → relatively lower emphasis on data ownership
       • Handling time in both UTC and local
       • Various spatial offsets
       • Multiple data types: time series, fields, spatial data
     • Integrative discipline:
       • Interoperation with atmospheric, ocean, soils, geomorphology, social datasets and services…
     • Community:
       • Organized by “natural boundaries”
         → networks of relatively autonomous self-managed data nodes
       • Partnership with public sector water management
       • 96% use Windows for research; Excel, ArcGIS and Matlab are the most popular tools

  5. The CUAHSI Community, HIS and WATERS
     [Diagram labels: CUAHSI HIS, WATERS Network Information System, HIS Team, WATERS Testbed]
     • Government: USGS, EPA, NCDC, USDA
     • Industry: ESRI, Kisters, OpenMI
     • Domain sciences: Unidata, NCAR, LTER, GEON
     • Supercomputer centers: NCSA, TACC
     • HIS Team: Texas, SDSC, Utah, Drexel, Duke
     • CUAHSI: 116 universities (Nov. 2006)

  6. Hydrologic Information System Service Oriented Architecture
     [Diagram labels]
     • Web portal interface (HDAS): information input, display, query and output services. Preliminary data exploration and discovery: see what is available and perform exploratory analyses (HTML-XML)
     • Web services interface: WaterOneFlow Web Services (WSDL-SOAP)
     • Servers: 3rd-party servers (e.g. USGS, NCDC), observatory servers, workgroup HIS, SDSC HIS servers
     • Clients: GIS, Matlab, IDL, Splus, R, D2K, I2K, programming (Fortran, C, VB)
     • Downloads: data access through web services
     • Uploads: data storage through web services

  7. Main Components
     • Web services for accessing hydrologic repositories
     • Hydrologic Observations Data Model
     • Hydrologic Data Access System + Time Series Viewer + desktop clients
     • Collection of CUAHSI nodes
     [Diagram labels: data sources NASA, STORET, Ameriflux, Unidata, NCDC, NCAR, NWIS; CUAHSI Web Services; client applications Excel, Visual Basic, ArcGIS, C/C++, Matlab, Fortran, Access, SAS]

  8. Point Observations Information Model
     Example hierarchy (from the slide diagram):
     • Data Source: USGS
     • Network: Streamflow gages
     • Sites: Neuse River near Clayton, NC
     • ObservationSeries: Discharge, stage, start, end (daily or instantaneous)
     • Values: 206 cfs, 13 August 2006 {Value, Time, Qualifier}
     Definitions:
     • A data source operates an observation network
     • A network is a set of observation sites
     • A site is a point location where one or more variables are measured
     • A variable is a property describing the flow or quality of water
     • An observation series is an array of observations at a given site, for a given variable, with start time and end time
     • A value is an observation of a variable at a particular time
     • A qualifier is a symbol that provides additional information about the value
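
The hierarchy defined on this slide can be sketched as a set of nested record types. This is only an illustration under stated assumptions: the class and field names below are hypothetical and are not taken from the Observations Data Model schema; the example values (USGS, site 02087500, 206 cfs on 13 August 2006) come from the slides.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Value:
    """One observation of a variable at a particular time."""
    value: float
    time: datetime
    qualifier: Optional[str] = None  # symbol with extra information about the value


@dataclass
class ObservationSeries:
    """Observations at one site for one variable."""
    variable: str
    units: str
    values: List[Value] = field(default_factory=list)

    @property
    def start(self) -> Optional[datetime]:
        return min((v.time for v in self.values), default=None)

    @property
    def end(self) -> Optional[datetime]:
        return max((v.time for v in self.values), default=None)


@dataclass
class Site:
    """Point location where one or more variables are measured."""
    code: str
    name: str
    series: List[ObservationSeries] = field(default_factory=list)


@dataclass
class Network:
    """A set of observation sites."""
    name: str
    sites: List[Site] = field(default_factory=list)


@dataclass
class DataSource:
    """An organization that operates observation networks."""
    name: str
    networks: List[Network] = field(default_factory=list)


# Build the example from the slide: USGS -> Streamflow gages -> Neuse River -> Discharge.
discharge = ObservationSeries(
    variable="Discharge",
    units="cfs",
    values=[Value(206.0, datetime(2006, 8, 13))],
)
neuse = Site(code="02087500", name="Neuse River near Clayton, NC", series=[discharge])
usgs = DataSource(name="USGS", networks=[Network(name="Streamflow gages", sites=[neuse])])
```

Deriving `start` and `end` from the stored values is a design choice of this sketch; a catalog could equally store the period of record explicitly.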

  9. Observations Data Model Schema (version 4.0). From Ernest To, David Maidment, CRWR
     [Schema diagram labels: Data Source and Network, Sites, Variables, Values, Metadata, Controlled Vocabulary Tables]
     Example entries shown: e.g. mg/kg, cfs; e.g. depth; Streamflow; Depth of snow pack; Landuse, Vegetation; e.g. Non-detect, Estimated; Windspeed, Precipitation
     • A data source operates an observation network
     • A network is a set of observation sites
     • A site is a point location where one or more variables are measured
     • A variable is a property describing the flow or quality of water
     • A value is an observation of a variable at a particular time
     • Metadata provide information about the context of the observation

  10. Water Data Web Sites

  11. NWISWeb site output: time series of streamflow at a gaging station
      # agency_cd   Agency Code
      # site_no     USGS station number
      # dv_dt       date of daily mean streamflow
      # dv_va       daily mean streamflow value, in cubic-feet per-second
      # dv_cd       daily mean streamflow value qualification code
      #
      # Sites in this file include:
      #  USGS 02087500 NEUSE RIVER NEAR CLAYTON, NC
      #
      agency_cd  site_no   dv_dt       dv_va  dv_cd
      USGS       02087500  2003-09-01  1190
      USGS       02087500  2003-09-02  649
      USGS       02087500  2003-09-03  525
      USGS       02087500  2003-09-04  486
      USGS       02087500  2003-09-05  733
      USGS       02087500  2003-09-06  585
      USGS       02087500  2003-09-07  485
      USGS       02087500  2003-09-08  463
      USGS       02087500  2003-09-09  673
      USGS       02087500  2003-09-10  517
      USGS       02087500  2003-09-11  454
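
Output like this is what a client had to parse before uniform web services were available. The following is a minimal sketch of such a parser, assuming tab-delimited text with `#` comment lines, a single header row, and then data rows, as the slide shows. The function name `parse_rdb` and the `SAMPLE` string are illustrative; real NWIS files may contain additional rows that this sketch does not handle.

```python
from typing import Dict, List

# Three rows copied from the slide; the tab delimiter is an assumption.
SAMPLE = "\n".join([
    "# agency_cd\tAgency Code",
    "# site_no\tUSGS station number",
    "#",
    "agency_cd\tsite_no\tdv_dt\tdv_va\tdv_cd",
    "USGS\t02087500\t2003-09-01\t1190\t",
    "USGS\t02087500\t2003-09-02\t649\t",
    "USGS\t02087500\t2003-09-03\t525\t",
])


def parse_rdb(text: str) -> List[Dict[str, str]]:
    """Return one dict per data row, keyed by the column names in the header row."""
    header = None
    rows: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if header is None:
            header = fields  # first non-comment line names the columns
            continue
        # Pad short rows so a missing trailing qualifier becomes an empty string.
        fields += [""] * (len(header) - len(fields))
        rows.append(dict(zip(header, fields)))
    return rows


rows = parse_rdb(SAMPLE)
daily_flow_cfs = [float(r["dv_va"]) for r in rows]
```

Every repository needs its own variant of this code, which is the motivation for a common response format.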

  12. Challenges… (1/2)
      • Sites
        • STORET has stations, and measurement points, at various offsets…
        • Site metadata lacking and inconsistent (e.g. 2/3 no HUC info, 1/3 no state/county info); agency site files need to be upgraded to ODM…
        • A groundwater site is different from a stream gauge…
      • Censored values
        • Values have qualifiers, such as “less than”, “censored”, etc., per value. Sometimes mixed data types.
      • Units
        • There are multiple renditions of the same units, even within one repository
        • There may be several units for the same parameter code (STORET)
        • If no value recorded, there are no units??
      • Unit multipliers
        • E.g. NCDC ASOS keeps measurements as integers, and provides a multiplier for each variable
      • Sources
        • STORET requires organization IDs (which collected data for STORET) in addition to site IDs
      • Time stamps: ISO 8601
        • Data types problem (conversion to PST???)
        • A service to determine UTC offsets given lat/lon and date??
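
Two of the challenges above, unit multipliers and time stamps, can be illustrated with a short sketch. It assumes the UTC offset of the site is already known as a fixed number of hours (the slide leaves open how to determine it, and a fixed offset ignores daylight saving time). The function names and the multiplier value `0.1` are hypothetical and are not taken from NCDC documentation.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal


def to_utc(local_iso: str, utc_offset_hours: float) -> str:
    """Convert a naive local ISO 8601 time stamp to UTC, given a fixed offset."""
    local = datetime.fromisoformat(local_iso)
    aware = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return aware.astimezone(timezone.utc).isoformat()


def apply_multiplier(raw: int, multiplier: str) -> Decimal:
    """Scale an integer-encoded measurement by its per-variable multiplier.

    Decimal avoids binary floating-point artifacts when scaling by 0.1, 0.01, etc.
    """
    return Decimal(raw) * Decimal(multiplier)


print(to_utc("2005-08-01T00:00:00", -8))  # 2005-08-01T08:00:00+00:00
print(apply_multiplier(123, "0.1"))       # 12.3
```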

  13. Challenges… (2/2)
      • Values retrieval
        • USGS: by site, variable, time range
        • EPA: by organization-site, variable, medium, units, time range
        • NCDC: fewer variables, period of record applies to site, not to seriesCatalog
      • Variable semantics
        • Variable names and measurement methods don’t match
          • E.g. NWIS parameter # 625 is labeled ‘ammonia + organic nitrogen’; the Kjeldahl method is used for determination but is not mentioned in the parameter description. In STORET this parameter is referred to as Kjeldahl Nitrogen.
        • One-to-one mapping not always possible
          • E.g. NWIS: ‘bed sediment’ and ‘suspended sediment’ medium types vs. STORET’s ‘sediment’.
      → Ontology tagging, semantic mediation
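
Ontology tagging can be pictured as attaching a shared concept label to each repository-specific code and then comparing labels instead of names. The sketch below is illustrative only: the concept labels (`nitrogen.kjeldahl`, `sediment.bed`, and so on) and the dotted-hierarchy convention are invented for this example, and the zero-padded code `00625` is assumed to be the five-digit form of the parameter # 625 mentioned on the slide.

```python
from typing import Dict, List, Tuple

Code = Tuple[str, str]  # (vocabulary, code or name)

# Hypothetical concept tags; only the repository-side names come from the slide.
CONCEPT_TAGS: Dict[Code, str] = {
    ("NWIS", "00625"): "nitrogen.kjeldahl",
    ("STORET", "Kjeldahl Nitrogen"): "nitrogen.kjeldahl",
    ("NWIS", "bed sediment"): "sediment.bed",
    ("NWIS", "suspended sediment"): "sediment.suspended",
    ("STORET", "sediment"): "sediment",
}


def equivalents(vocabulary: str, code: str) -> List[Code]:
    """Codes in any vocabulary tagged with exactly the same concept."""
    concept = CONCEPT_TAGS.get((vocabulary, code))
    if concept is None:
        return []
    return sorted(
        key for key, tagged in CONCEPT_TAGS.items()
        if tagged == concept and key != (vocabulary, code)
    )


def is_narrower_or_equal(concept: str, broader: str) -> bool:
    """True if `concept` equals `broader` or sits below it in the dotted hierarchy."""
    return concept == broader or concept.startswith(broader + ".")
```

With these tags, a search for STORET ‘sediment’ can also return both NWIS sediment media, while the reverse mapping stays ambiguous, which is the one-to-many case described above.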

  14. From different database structures, data collection procedures, quality control, access mechanisms → to uniform signatures … Water Markup Language
      - Tested in different environments
      - Standards-based
      - Can support advanced interfaces via harvested catalogs
      - Accessible to community
      - Templates for development of new services
      - Optimized, error handling, memory management, versioning, run from fast servers
      And: working with agencies on setting up services!

  15. WaterOneFlow API
      • GetValues: returns a TimeSeries
      • GetSiteInfo: station information, including a period of record
      • GetVariableInfo: returns variable/parameter information
      -- developed to have a low barrier to entry
      -- terminology same as Observations Database
      -- reuse of common elements

  16. GetVariableInfo
      • Input: Vocabulary:VariableCode
      • Output: VariableResponse

      <variablesResponse xmlns:gml="http://www.opengis.net/gml"
                         xmlns:xlink="http://www.w3.org/1999/xlink"
                         xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                         xmlns:wtr="http://www.cuahsi.org/waterML/"
                         xmlns="http://www.cuahsi.org/waterML/1.0/">
        <variables>
          <variable>
            <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
            <variableName>Discharge, cubic feet per second</variableName>
            <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
          </variable>
        </variables>
      </variablesResponse>
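
A client could read a response of this shape with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a trimmed copy of the response from the slide (unused namespace declarations omitted); the function name and the keys of the returned dictionaries are illustrative choices, not part of WaterML.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Default namespace declared by the response on the slide.
WATERML_NS = {"wml": "http://www.cuahsi.org/waterML/1.0/"}

RESPONSE = """<variablesResponse xmlns="http://www.cuahsi.org/waterML/1.0/">
  <variables>
    <variable>
      <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
      <variableName>Discharge, cubic feet per second</variableName>
      <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
    </variable>
  </variables>
</variablesResponse>"""


def parse_variables(xml_text: str) -> List[Dict[str, Optional[str]]]:
    """Extract code, name, and units for each <variable> element."""
    root = ET.fromstring(xml_text)
    result = []
    for var in root.iterfind(".//wml:variable", WATERML_NS):
        code = var.find("wml:variableCode", WATERML_NS)
        units = var.find("wml:units", WATERML_NS)
        result.append({
            "vocabulary": code.get("vocabulary") if code is not None else None,
            "code": code.text if code is not None else None,
            "name": var.findtext("wml:variableName", namespaces=WATERML_NS),
            "units": units.get("unitsAbbreviation") if units is not None else None,
        })
    return result


variables = parse_variables(RESPONSE)
```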

  17. GetSiteInfo

      <sitesResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns="http://www.cuahsi.org/waterML/1.0/">
        <site>
          <siteInfo>
            <siteName>BIG ROCK C NR VALYERMO CA</siteName>
            <siteCode network="NWIS" siteID="4622637">10263500</siteCode>
            <geoLocation>
              <geogLocation xsi:type="LatLonPointType" srs="EPSG:4269">
                <latitude>34.42083115</latitude>
                <longitude>-117.8395072</longitude>
              </geogLocation>
            </geoLocation>
          </siteInfo>
          <seriesCatalog menuGroupName="USGS Daily Values" serviceWsdl="http://localhost/WaterOneFlowDev/DailyValues.asmx">
            <note type="sourceUrl">http://waterdata.usgs.gov/nwis/dv?referred_module=sw&format=rdb&date_format=YYYY-MM-DD&begin_date=2006-11-17&site_no=10263500</note>
            <series>
              <!-- [snip] -->
            </series>
          </seriesCatalog>
          <seriesCatalog menuGroupName="USGS Unit Values" serviceWsdl="http://localhost/WaterOneFlowDev/UnitValues.asmx">
            <!-- [snip] -->
          </seriesCatalog>
        </site>
      </sitesResponse>

      Detail of a <series> element:

      <series>
        <variable>
          <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
          <variableName>Discharge, cubic feet per second</variableName>
          <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
        </variable>
        <valueCount countIsEstimated="true">30563</valueCount>
        <variableTimeInterval xsi:type="TimeIntervalType">
          <beginDateTime>1923-02-01T00:00:00</beginDateTime>
          <endDateTime>2006-10-07T00:00:00</endDateTime>
        </variableTimeInterval>
      </series>

  18. GetValues
      • NWIS, STORET, etc.
        • Location: NWIS:10263500
        • Variable: NWIS:00060
        • Time Range: 2005-08-01 to 2005-08-03
      • MODIS, etc.
        • Location: GEOM:BOX(-180 -90,180 90)
        • Variable: MODIS:11/plotarea=landocean
        • Time Range: 2000-10-01 to 2001-03-01
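
The location and variable parameters above share a `PREFIX:code` pattern. A small helper for splitting them might look like the sketch below. The name `split_code` and the decision to upper-case the prefix are assumptions made for this example; how the actual services treat prefix case is not stated in the slides.

```python
from typing import NamedTuple


class QualifiedCode(NamedTuple):
    vocabulary: str  # e.g. "NWIS", "MODIS", "GEOM"
    code: str        # everything after the first colon, unchanged


def split_code(param: str) -> QualifiedCode:
    """Split 'PREFIX:code' on the first colon only, so codes may contain ':' or '='."""
    prefix, sep, rest = param.partition(":")
    if not sep or not prefix or not rest:
        raise ValueError(f"expected 'PREFIX:code', got {param!r}")
    # Upper-casing the prefix is an assumption of this sketch.
    return QualifiedCode(prefix.upper(), rest)


site = split_code("NWIS:10263500")
variable = split_code("MODIS:11/plotarea=landocean")
box = split_code("GEOM:BOX(-180 -90,180 90)")
```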

  19. GetValues response

      <timeSeriesResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                          xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                          xmlns="http://www.cuahsi.org/waterML/1.0/">
        <queryInfo>
          <queryURL>http://waterdata.usgs.gov/nwis/dv?&site_no=10263500&</queryURL>
          <criteria>
            <locationParam>NWIS:10263500</locationParam>
            <variableParam>nwis:00060</variableParam>
            <timeParam>
              <beginDateTime>2005-08-01</beginDateTime>
              <endDateTime>2005-08-03</endDateTime>
            </timeParam>
          </criteria>
        </queryInfo>
        <timeSeries>
          <sourceInfo xsi:type="SiteInfoType">
            <siteName>BIG ROCK C NR VALYERMO CA</siteName>
            <siteCode siteID="4622637">10263500</siteCode>
            <geoLocation>
              <geogLocation xsi:type="LatLonPointType" srs="EPSG:4269">
                <latitude>34.42083115</latitude>
                <longitude>-117.8395072</longitude>
              </geogLocation>
            </geoLocation>
          </sourceInfo>
          <variable>
            <variableCode vocabulary="NWIS" default="true" variableID="12578">00060</variableCode>
            <variableName>Discharge, cubic feet per second</variableName>
            <units unitsAbbreviation="cfs" unitsCode="35">cubic feet per second</units>
          </variable>
          <values count="3">
            <value qualifiers="Ae" dateTime="2005-08-01T00:00:00">25</value>
            <value qualifiers="Ae" dateTime="2005-08-02T00:00:00">26</value>
            <value qualifiers="Ae" dateTime="2005-08-03T00:00:00">24</value>
          </values>
        </timeSeries>
      </timeSeriesResponse>
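
Extracting the {Value, Time, Qualifier} triples from such a response can be sketched as follows. The example uses a trimmed copy of the `<values>` block from the slide; the function name and the tuple layout are illustrative choices, and the sketch assumes every value is numeric.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import List, Optional, Tuple

WATERML_NS = {"wml": "http://www.cuahsi.org/waterML/1.0/"}

RESPONSE = """<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.0/">
  <timeSeries>
    <values count="3">
      <value qualifiers="Ae" dateTime="2005-08-01T00:00:00">25</value>
      <value qualifiers="Ae" dateTime="2005-08-02T00:00:00">26</value>
      <value qualifiers="Ae" dateTime="2005-08-03T00:00:00">24</value>
    </values>
  </timeSeries>
</timeSeriesResponse>"""

Observation = Tuple[datetime, float, Optional[str]]


def parse_values(xml_text: str) -> List[Observation]:
    """Return (time, value, qualifiers) for each <value> element, in document order."""
    root = ET.fromstring(xml_text)
    observations = []
    for node in root.iterfind(".//wml:values/wml:value", WATERML_NS):
        observations.append((
            datetime.fromisoformat(node.get("dateTime")),
            float(node.text),            # assumes numeric content
            node.get("qualifiers"),      # None when the attribute is absent
        ))
    return observations


observations = parse_values(RESPONSE)
```

The same function works regardless of which repository produced the data, since the response format is uniform.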

  20. Hydrologic Data Access System

  21. Hydrologic Data Access System

  22. HIS nodes: cross-platform design
      [Diagram labels]
      • Central CUAHSI HIS Node (Windows) and Remote CUAHSI HIS Nodes (Windows): HODM, HDAS, web services, web service proxies; IIS Web Server, ASP.NET, SQL Server, ArcGIS technologies; data
      • GEON Data Node (Linux): GEON software stack, web service proxy, Apache Tomcat; application services, handling of spatial data types, etc.; security management, distributed data management, integration with other CI projects; data

  23. Resource registration • Shapefiles • TIFF images, GMT rasters • Web Services, WMS services • Relational databases, ASCII • PDFs, URLs • “CUAHSI data” • NetCDF • Coming: Geodatabases and ODM

  24. Possible Connections • Review of ODM • Dealing with observations/measurements rather than with sensor data? • Review of WaterOneFlow services schema • Aligning WaterOneFlow output schemas with GML/SensorML • Carrying WaterOneFlow requests/responses over WFS • Long term preservation of observation data • Water Data Interoperability Testbed?

  25. Survey of Observing Systems
      • NEON: http://www.neoninc.org
      • ORION: http://www.orionprogram.org/
      • WATERS
      • CUAHSI: http://www.cuahsi.org, http://river.sdsc.edu/hdas
      • CLEANER: http://cleaner.ncsa.uiuc.edu/home/
      • GLEON: http://www.gleon.org/, http://lakemetabolism.org/
      • CREON: http://www.coralreefeon.org/
      • MoveBank: http://www.princeton.edu/~wikelski/research/index.htm
      • Civil Infrastructure: http://healthmonitoring.ucsd.edu/index.jsp
      • IRIS/USArray: http://www.iris.edu/USArray/
