
HYDROSEEK and HYDROTAGGER: A Search Engine for Hydrologists (GIS in Water Resources Lecture)

M. Piasecki, November 2007.

Presentation Transcript


  1. HYDROSEEK and HYDROTAGGER: A Search Engine for Hydrologists. GIS in Water Resources Lecture. M. Piasecki, November 2007. Department of Civil, Architectural & Environmental Engineering

  2. Lecture
     • Demo of HydroSeek
     • What are the search criteria?
     • Functionality of the Engine Interface
     • Data Sources
     • Common Sources
     • Common Problems (Completeness, Syntax, Semantics)
     • Ontologies
     • Ontology details
     • Concept-to-data variable tagging
     • Architecture
     • Flow Chart
     • Technologies used
     • Demo of HydroTagger
     • Why the Tagging?
     • Technologies

  3. www.HydroSeek.org

  4. HIS Goals
     • Hydrologic Data Access System: better access to a large volume of high-quality hydrologic data
     • Support for Observatories: synthesizing hydrologic data for a region
     • Advancement of Hydrologic Science: data modeling and advanced analysis
     • Hydrologic Education: better data in the classroom, basin-focused teaching

  5. Objective
     • Search multiple heterogeneous data sources simultaneously regardless of semantic or structural differences between them
     What we are doing now: [Diagram: separate request/return exchanges with each data source: NAWQA, NAM-12, NWIS, NARR]

  6. What we would like to do: [Diagram: a generic request, a Semantic Mediator, and GetValues calls to NAWQA, NWIS, NARR, and HODM]

  7. Data sources: USGS, EPA, CIMS, TCEQ, NADP

  8. Spatial Coverage
     • STORET has 758 sites in Texas, TCEQ has 8,407.
     • STORET has 47,602 sites in Florida, NWIS has 27,906.
     • NWIS has 121,545 sites in Minnesota, STORET has 22,260.

  9. Data Availability

  10. Temporal Coverage [Figure: nitrogen data for the periods 1957-1977, 1977-2003, and 2003-2007]

  11. Interface Problem
     • NWIS: ~175 form elements on a single page
     • STORET + NWIS + TCEQ + CIMS = ???
     • A drop-down menu? ∞
     • String search across the parameter list? How about synonyms? ‘Elevation, water surface’ vs. ‘stage height’

  12. Completeness Problem: Metadata Catalog
     • Better query performance
     • Freedom
     • Fewer errors
     [Figure: Availability of geographic identifiers for stations in EPA STORET]

  13. Heterogeneity Problem
     • Syntax, e.g. date and time formats, Gregorian versus Julian
     • Data format/structure, e.g. XML, HTML, tab/tilde/comma-separated text, gzipped tarballs…
     • Semantics (more …)
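
The syntax problem can be illustrated with a minimal Python sketch that normalizes a few date notations to ISO 8601. The format list and the function name `normalize_date` are assumptions made for illustration, not the formats HydroSeek actually handles, and "Julian" is read here as the year plus day-of-year notation often found in data files.

```python
from datetime import datetime

# Hypothetical set of source formats; real repositories may use others.
KNOWN_FORMATS = [
    "%Y-%m-%d",  # ISO calendar date, e.g. 2006-08-13
    "%m/%d/%Y",  # US-style calendar date, e.g. 08/13/2006
    "%Y-%j",     # year plus day-of-year, e.g. 2006-225
]


def normalize_date(text: str) -> str:
    """Return the date as an ISO 8601 string, trying each known format in turn."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"unrecognized date format: {text!r}")
```

Trying formats in a fixed order keeps the sketch short, but it only works while the notations cannot be confused with one another; ambiguous inputs such as day-first versus month-first dates would need per-source configuration.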

  14. Issues with Semantics
     • Hyponymy: parameters “Groundwater level”, “Stream stage”, “Reservoir level” versus “Water level”
     • Pseudo-hyponymy due to lack of metadata: parameter “Manganese, 6N hydrochloric acid extracted, recoverable, dry weight, milligrams per kilogram” versus “Manganese, milligrams per kilogram”
     • Synonymy: ‘Total Kjeldahl Nitrogen’ vs. ‘Ammonia+Organic Nitrogen’

  15. Search Strategy: Search → Fine-tune → Retrieve, rather than Search → Retrieve, to avoid the ‘high precision, low recall’ and ‘low precision, high recall’ problems.
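
The two problems named on this slide can be made concrete with the usual definitions of precision and recall. The sketch below assumes search results and relevant records are represented as sets of record identifiers; the function name and the example sets are hypothetical.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[int], relevant: Set[int]) -> Tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical record identifiers for illustration only.
relevant = {1, 2, 3, 4}
narrow_query = {1}                        # exact string match: precise but misses most
broad_query = {1, 2, 3, 4, 5, 6, 7, 8}    # loose match: finds everything plus noise
```

With these sets, the narrow query has full precision but recovers only a quarter of the relevant records, while the broad query recovers all of them at half the precision. A fine-tuning step between search and retrieval is one way to move a broad result set toward the narrow one without losing relevant records.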

  16. Layered Ontology Model

  17. Core, Navigation, Compound

  18. Knowledge Base
     • Supports classification of search results
     • Entities in the ontology are associated with measured variables in a relational database
     • Helps solve semantic heterogeneity issues between data repositories
     Examples:
       ‘Escherichia coli’ = ‘E. coli’
       ‘E. coli’ is-a ‘Indicator Organism’
       ‘Copper’ is-a ‘Micronutrient’
       ‘Copper’ isMeasuredIn ‘Medium’
       ‘Medium’ = {Water, Soil…}
       ‘Micronutrient’ is-a ‘Nutrient’
     [Figure label: OWL Ontologies]
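
The synonym and is-a relations listed on this slide can be sketched with two plain Python dictionaries. This is only an illustration of the idea: the dictionaries stand in for the OWL ontology, the function names are hypothetical, and the is-a graph is assumed to be acyclic.

```python
from typing import Dict, List

# Synonyms map an alternative label to one canonical label.
SYNONYMS: Dict[str, str] = {"E. coli": "Escherichia coli"}

# Each concept points to its direct parent concept (is-a).
IS_A: Dict[str, str] = {
    "Escherichia coli": "Indicator Organism",
    "Copper": "Micronutrient",
    "Micronutrient": "Nutrient",
}


def canonical(term: str) -> str:
    """Resolve a synonym to its canonical label."""
    return SYNONYMS.get(term, term)


def ancestors(term: str) -> List[str]:
    """Follow is-a links upward and return all broader concepts, nearest first."""
    chain = []
    current = canonical(term)
    while current in IS_A:
        current = IS_A[current]
        chain.append(current)
    return chain


def is_a(term: str, concept: str) -> bool:
    """True if `term` is a kind of `concept`, directly or transitively."""
    return canonical(concept) in ancestors(term)
```

A search for ‘Nutrient’ can then match variables tagged as ‘Copper’, because `ancestors("Copper")` reaches ‘Nutrient’ through ‘Micronutrient’, and a query for ‘E. coli’ resolves to the same concept as ‘Escherichia coli’.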

  19.

  20. Point Observations Information Model (http://www.cuahsi.org/his/webservices.html)
     Levels, with the example shown on the slide:
       Data Source: USGS
       Network: Streamflow gages
       Sites: Neuse River near Clayton, NC
       Variables: Discharge, stage (daily or instantaneous)
       Values: 206 cfs, 13 August 2006; {Value, Time, Qualifier, Offset}
     Web service methods: GetSites, GetSiteInfo, GetVariables, GetVariableInfo, GetValues
     • A data source operates an observation network
     • A network is a set of observation sites
     • A site is a point location where one or more variables are measured
     • A variable is a property describing the flow or quality of water
     • A value is an observation of a variable at a particular time
     • A qualifier is a symbol that provides additional information about the value
     • An offset allows specification of measurements at various depths in water
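
The hierarchy described on this slide maps naturally onto nested record types. The following Python sketch uses dataclasses; the class layout and field names (including `units`) are assumptions chosen to mirror the slide's definitions, not the actual WaterOneFlow schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Value:
    """An observation of a variable at a particular time."""
    value: float
    time: str                         # date or timestamp as text
    qualifier: Optional[str] = None   # symbol with additional information
    offset: Optional[float] = None    # e.g. depth in water


@dataclass
class Variable:
    """A property describing the flow or quality of water."""
    name: str
    units: str                        # assumed field, not named on the slide
    values: List[Value] = field(default_factory=list)


@dataclass
class Site:
    """A point location where one or more variables are measured."""
    name: str
    variables: List[Variable] = field(default_factory=list)


@dataclass
class Network:
    """A set of observation sites."""
    name: str
    sites: List[Site] = field(default_factory=list)


@dataclass
class DataSource:
    """An organization that operates an observation network."""
    name: str
    network: Network


# The example from the slide, expressed with these types.
discharge = Variable("Discharge", "cfs", [Value(206.0, "2006-08-13")])
site = Site("Neuse River near Clayton, NC", [discharge])
usgs = DataSource("USGS", Network("Streamflow gages", [site]))
```

Walking from `usgs` down to the single `Value` follows the same path as the service methods listed on the slide, from sites through variables to values.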

  21. HydroSeek Web Services
     [Diagram: HydroSeek connected to EPA STORET, USGS Daily, USGS Realtime, CIMS, and TCEQ through WaterOneFlow and native services; servers shown: Microsoft server (Virtual Earth map), San Diego Supercomputer Center server, Drexel server]
     • Most HydroSeek functions are available as web services (SOAP)
     • Support for queries using Global Change Master Directory (GCMD) keywords
     • Supports output in Geography Markup Language (GML) as well as WaterML

  22. GetStations: Request, Response, BoundingBox

  23. GetStationsByHU: Request, Response, HUC_Code

  24. GetStationCatalogueFiltered: Request, Response

  25. GetStationCatalogue: Request, Response

  26. Architecture Outline
     • Allows searching multiple heterogeneous data sources simultaneously regardless of semantic or structural differences between them
     • Modular & extensible
     [Figure: Inside the CUAHSI HOD Module]

  27. The Database-Ontology Link: www.HydroTagger.org

  28. HydroSeek ODM needed an upgrade, i.e. additional tables: 1) MappingsApproved_Table 2) FrequentUpDates_Table

  29. How does the Tagging work? Step 1: Users need to register on the website before they can use the HydroTagger. When registering, select the testbed site you are affiliated with. Each testbed site needs ONE administrator, who can then admit additional users for that specific testbed site. Please send an email to identify the designated tagger site administrator so we can promote that person to the role.

  30. How does the Tagging work? Step 2: The “Sniffer” jumps into action and trawls through the testbed sites to find and identify new variable names (once a week, currently every Sunday night). It does so by using the regular web services published through the WSDL (no “hacking”!). It returns i) data update information and ii) the variable names used, and compares these to those used by HydroSeek. [Figure label: WATERS Network Information System]

  31. How does the Tagging work? Step 3: The Tagger now updates the HydroSeek catalogue (an amalgamation of all 10 testbed catalogues) with the newly found data entries. If it finds a new variable name (introduced during the data-loading process using the Data-Loader), it puts it into a table and offers it up to the HydroTagger GUI for semantic tagging.
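
The comparison at the heart of Steps 2 and 3 can be sketched as a difference between harvested and known variable names. In this hedged Python example, plain lists stand in for the web service responses and the HydroSeek catalogue, the function name is hypothetical, and matching case-insensitively after trimming whitespace is an assumption rather than documented Sniffer behavior.

```python
from typing import Iterable, List


def find_new_variable_names(harvested: Iterable[str], known: Iterable[str]) -> List[str]:
    """Return harvested names that are not yet known, in first-seen order.

    Names are compared case-insensitively after trimming whitespace
    (an assumption made for this sketch).
    """
    known_keys = {name.strip().lower() for name in known}
    seen = set()
    new_names = []
    for name in harvested:
        key = name.strip().lower()
        if key not in known_keys and key not in seen:
            seen.add(key)
            new_names.append(name.strip())
    return new_names


# Hypothetical inputs for illustration only.
harvested = ["Discharge", "stage height ", "Total Kjeldahl Nitrogen", "Stage Height"]
known = ["discharge", "Total Kjeldahl Nitrogen"]
pending_for_tagging = find_new_variable_names(harvested, known)
```

The resulting list plays the role of the table that is offered to the tagging GUI: each entry is a variable name that still needs to be associated with an ontology concept.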

  32. Thank you… Questions?
