
Geoscience Knowledge Representation Using the SWEET Ontologies

Geoscience Knowledge Representation Using the SWEET Ontologies. Rob Raskin Jet Propulsion Laboratory. Transforming Data into Knowledge. Data Information Knowledge. Basic Elements Bytes Numbers Models Facts


Presentation Transcript


  1. Geoscience Knowledge Representation Using the SWEET Ontologies Rob Raskin Jet Propulsion Laboratory

  2. Transforming Data into Knowledge • Data, Information, Knowledge • Basic Elements: Bytes, Numbers, Models, Facts • Services: Ingest, Archive, Visualize, Infer, Understand, Predict • Storage: File, Database, HDF-EOS, GIS, MIS, Ontology, Mind • Interoperability: Syntactic, OPeNDAP, WMS/WCS, Semantic • Volume/Density: High/Low, Low/High • Statistics: Checksum, Moments, Descriptive, Inferential • Analysis: Fourier, Wavelet, EOF, SSA • Methodology: Exploratory analysis, Model-based mining • Syntax, Semantics

  3. What is Knowledge? • Facts, relations, meanings, contexts • Organized information • Core ingredient in “common sense” • Common understanding • In a form to apply reasoning/inference • Dynamic • Expandable

  4. Semantic Understanding is Difficult! • Sea surface temperature: measured 3 m above surface • Sea surface temperature: measured at surface • Data quality = 5 • Variable t: temperature • Variable t: time • Let’s eat, Grandma. • Let’s eat Grandma. • Time flies like an arrow. Fruit flies like a pie. • LA Times headline: “Mission accomplished. Major combat operations in Iraq have ended”

  5. Database vs Knowledge Base • Database • Entities and Relations • Closed world • All facts included • Knowledge base • Classes and Properties • Collection of facts • Captures corporate memory • Open world • Facts not stated may be either true or untrue
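
The closed-world versus open-world distinction on this slide can be made concrete with a small sketch. Everything here is illustrative: the facts, the `NEGATED` set, and both function names are assumptions made for the example, not part of any database or ontology tool.

```python
from typing import Optional

Fact = tuple[str, str, str]

# Facts explicitly stated (examples borrowed from the triples slide).
FACTS: set[Fact] = {
    ("AIRS", "measures", "Temperature"),
    ("Ocean", "hasSubstance", "Water"),
}

# Facts explicitly stated to be false (hypothetical, for this sketch only).
NEGATED: set[Fact] = {
    ("HDF", "subClassOf", "Instrument"),
}


def closed_world(fact: Fact) -> bool:
    """Database view: anything that is not stored is treated as false."""
    return fact in FACTS


def open_world(fact: Fact) -> Optional[bool]:
    """Knowledge-base view: True, False, or None when nothing is stated."""
    if fact in FACTS:
        return True
    if fact in NEGATED:
        return False
    return None  # unknown: may be either true or untrue


if __name__ == "__main__":
    query = ("AIRS", "measures", "Humidity")
    print(closed_world(query))  # the database answers "no"
    print(open_world(query))    # the knowledge base answers "unknown"
```

The only difference between the two functions is the answer given for an unstated fact, which is the point the slide makes.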

  6. PO.DAAC Knowledge Bases Public access Documents People Roles/Tasks Data Processing (Docushare) Data Products Metadata Tools/ Services Web Pages Science Concepts Missions Instruments Organizations Applications Announcements Inquiries Computers

  7. Relations • People have roles • Instruments measure science parameters • Inquiries relate to data products • etc.

  8. Example of Knowledge-Assisted Service • Yellow Page Lookup: • cars vs automobiles • Hotels vs motels vs resorts

  9. Semantic-based Service Example: Google • Type into Google: “gymnasiums in Seattle” • Generates map of Seattle with dots locating gyms • Google understands that • Seattle is a place • Gymnasiums are a place-based service • Google understands semantics so that the search results could also include • Locations near Seattle • Similar services (e.g., health clubs)

  10. Assertion of Facts as Triples • Subject-Verb-Object representation • Flood subClassOf WeatherPhenomena • HDF subClassOf FileFormat • Pressure subClassOf PhysicalProperty • Ocean hasSubstance Water • AIRS measures Temperature
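
The triples on this slide fit in a very small data structure. The sketch below is illustrative only: it stores the five assertions as Python tuples and adds a hypothetical `match` helper for pattern queries. It is not how SWEET or any OWL tool stores triples.

```python
from typing import Optional

Triple = tuple[str, str, str]

# The five assertions from the slide, as (subject, verb, object) tuples.
TRIPLES: set[Triple] = {
    ("Flood", "subClassOf", "WeatherPhenomena"),
    ("HDF", "subClassOf", "FileFormat"),
    ("Pressure", "subClassOf", "PhysicalProperty"),
    ("Ocean", "hasSubstance", "Water"),
    ("AIRS", "measures", "Temperature"),
}


def match(subject: Optional[str] = None,
          verb: Optional[str] = None,
          obj: Optional[str] = None) -> list[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (subject is None or t[0] == subject)
        and (verb is None or t[1] == verb)
        and (obj is None or t[2] == obj)
    )


if __name__ == "__main__":
    print(match(verb="measures"))   # which instrument measures what
    print(match(subject="Ocean"))   # everything asserted about Ocean
```

Leaving any of the three positions as `None` turns it into a wildcard, so one helper covers "what does AIRS measure?" and "what is a subclass of anything?" alike.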

  11. Applications • Software tools can find “meaning” in resources for • Discovery • Fusion • Lineage • … • Requirements • Data products associated with objects in “science concept space” • Richer descriptions than DIFs • Data services associated with objects in “service concept space” • Richer descriptions than SERFs • Search/fusion tools that exploit ontologies

  12. Semantic Web Vision • Web page creators place XML tags around technical terms on web pages • XML tags point to knowledge base where term is “defined” • Search tools use this information to provide value-added services • Common search engines (Google) use these capabilities only minimally, at present

  13. Ontologies • Current preferred method to store “facts” • General definition: “all that is known” • Computer science definition: Machine-readable definition of terms and how they relate to one another • As with a dictionary, terms are defined in terms of other terms • Provide shared understanding of concepts • Support knowledge reuse • Support machine-to-machine communications with deeper semantics than controlled vocabulary

  14. XML-based Ontology Languages • XML satisfies desired properties for language syntax • Readable by both humans and machines • However, there are too many possible ways that XML tags can be named and used • No standardization of XML tag meanings as in HTML (<b> </b> pair => renders in bold) • Additional standardized semantics needed to exploit shared understanding of concepts

  15. RDF and OWL • W3C has adopted languages that specialize XML • Resource Description Framework (RDF) • Web Ontology Language (OWL) • Languages predefine specific tags • RDF: Class, subclass, property, subproperty, … • RDF and OWL form a nested collection of languages, each roughly a specialization of the preceding language with further shared understanding • XML • RDF • RDFS • OWL Lite • OWL DL • OWL Full
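
To show what "predefined tags" look like in practice, the sketch below embeds a tiny hand-written RDF/XML fragment and reads the subclass assertion back out with Python's standard XML parser. The three namespace URIs are the standard W3C ones. The class names come from the triples slide. The `subclass_pairs` helper is an assumption of this example, and a plain XML parse like this handles only this one serialization pattern, not RDF in general.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# A tiny hand-written document using the predefined RDF/RDFS/OWL tags.
DOC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="WeatherPhenomena"/>
  <owl:Class rdf:ID="Flood">
    <rdfs:subClassOf rdf:resource="#WeatherPhenomena"/>
  </owl:Class>
</rdf:RDF>
"""


def subclass_pairs(rdf_xml: str) -> list[tuple[str, str]]:
    """Extract (subclass, superclass) pairs from owl:Class elements."""
    root = ET.fromstring(rdf_xml.strip())
    pairs = []
    for cls in root.iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        for sup in cls.findall(f"{{{RDFS}}}subClassOf"):
            parent = sup.get(f"{{{RDF}}}resource", "").lstrip("#")
            if name and parent:
                pairs.append((name, parent))
    return pairs


if __name__ == "__main__":
    print(subclass_pairs(DOC))
```

Because `owl:Class` and `rdfs:subClassOf` have agreed meanings, any tool reading this fragment can recover the same "Flood subClassOf WeatherPhenomena" assertion, which arbitrary XML tag names would not allow.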

  16. Semantic Web for Earth and Environmental Terminology (SWEET) • SWEET is a concept space • Enables scalable classification of Earth system science concepts • Currently being expanded to Space science • Anybody can import, expand, and specialize the work of others • No need to regenerate a physics, chemistry, or math ontology • Concept space is translatable into other languages/cultures using “sameAs” notions
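
The “sameAs” idea on this slide amounts to grouping equivalent terms so that any one of them resolves to the same concept. The sketch below does this with a small union-find structure. The term pairs and the `build_canonical` helper are hypothetical, chosen only to illustrate the grouping.

```python
def build_canonical(same_as: list[tuple[str, str]]) -> dict[str, str]:
    """Map every term to one representative of its sameAs group."""
    parent: dict[str, str] = {}

    def find(term: str) -> str:
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for a, b in same_as:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the alphabetically smaller root so results are deterministic.
            low, high = sorted((root_a, root_b))
            parent[high] = low
    return {term: find(term) for term in parent}


# Hypothetical equivalences across languages and vocabularies.
SAME_AS = [
    ("Ocean", "Oceano"),
    ("Oceano", "Meer"),
    ("Car", "Automobile"),
]

if __name__ == "__main__":
    canonical = build_canonical(SAME_AS)
    print(canonical["Ocean"] == canonical["Meer"])  # same concept
    print(canonical["Car"] == canonical["Ocean"])   # different concepts
```

Equivalence is transitive here: "Ocean" and "Meer" end up in one group even though no pair links them directly.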

  17. SWEET Ontologies and Their Interrelationships Faceted Ontologies Living Substances Non-Living Substances Integrative Ontologies Natural Phenomena Physical Processes Human Activities Earth Realm Data Physical Properties Space Time Units Numerics

  18. SWEET as an Upper Level Earth Science Ontology Math Physics Chemistry Space import Property EarthRealm Process, Phenomena Substance Data SWEET Time import Stratospheric Chemistry Biogeochemistry Specialized domains

  19. Why an Upper-Level Ontology for Earth System Science? • Many common concepts used across Earth Science disciplines (such as properties of the Earth) • Provides common definitions for terms used in multiple disciplines or communities • Provides common language in support of community and multidisciplinary activities • Provides common “properties” (relations) for tool developers • Reduced burden (and barrier to entry) on creators of specialized domain ontologies • Only need to create ontologies for incremental knowledge

  20. How SWEET was Initially Populated • Initial sources • GCMD • Over 10,000 datasets • Over 1000 keywords • Data providers submit far more than the 1000 terms for “free-text” search • CF • Over 500 keywords • Very long term names • surface_downwelling_photon_spherical_irradiance_in_sea_water • Decomposed into facets
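
Decomposing a long CF name into facets can be sketched as a tokenize-and-look-up step. The facet names and the term-to-facet table below are invented for illustration; the slide does not say which SWEET facet each term was assigned to, and the `decompose` helper is likewise hypothetical.

```python
# Hypothetical term-to-facet table, invented for this sketch; the actual
# SWEET facet assignments are not given on the slide.
FACETS = {
    "surface": "Space",
    "downwelling": "Direction",
    "photon": "Substance",
    "spherical": "Geometry",
    "irradiance": "PhysicalProperty",
    "sea_water": "Substance",
}


def decompose(cf_name: str) -> list[tuple[str, str]]:
    """Split a CF-style name into (term, facet) pairs.

    The text after "_in_" is treated as the medium; the remaining terms
    are split on underscores. Unknown terms get the facet "Unclassified".
    """
    head, _, medium = cf_name.partition("_in_")
    terms = head.split("_") + ([medium] if medium else [])
    return [(term, FACETS.get(term, "Unclassified")) for term in terms]


if __name__ == "__main__":
    name = "surface_downwelling_photon_spherical_irradiance_in_sea_water"
    for term, facet in decompose(name):
        print(f"{term:12s} -> {facet}")
```

Once each term lives in its own facet, a long compound name no longer needs its own keyword entry; it is a combination of concepts that already exist.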

  21. Spatial Ontology • Concepts of 0-D, 1-D, 2-D, and 3-D objects • Default coordinate system: lat/lon/up • Polygons used to store spatial extents • Spatial attributes added (population, area, etc.) • Scientific applications include geology, where it is used to represent 3-D structure

  22. Numerical Ontologies • Numerics • Extents: interval, point, 0, positiveIntegers, … • Relations: lessThan, greaterThan, … • SpatialEntities • Extents: country, Antarctica, equator, inlet, … • Relations: above, northOf, … • TemporalEntities • Extents: duration, century, season, … • Relations: after, before, …

  23. Numerical Ontologies (cont.) • Numeric concepts defined in OWL only through standard XML XSD spec • Intervals defined as restrictions on real line • Numerical relations defined in SWEET • lessThan, max, … • Cartesian product (multidimensional spaces) added in SWEET • Numeric ontologies used to define spatial and temporal concepts
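
The ideas on these two slides (intervals as restrictions on the real line, a lessThan relation, and Cartesian products for multidimensional spaces) can be sketched in a few lines. The class and function names are assumptions of this example and do not mirror SWEET's OWL definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval [low, high] on the real line."""
    low: float
    high: float

    def contains(self, x: float) -> bool:
        return self.low <= x <= self.high

    def less_than(self, other: "Interval") -> bool:
        """True when every point here lies below every point of `other`."""
        return self.high < other.low


def box_contains(box: tuple[Interval, ...], point: tuple[float, ...]) -> bool:
    """Membership test for a Cartesian product of intervals."""
    return len(box) == len(point) and all(
        interval.contains(x) for interval, x in zip(box, point)
    )


# A hypothetical spatial extent built from numeric intervals (degrees).
LATITUDE = Interval(-90.0, 90.0)
LONGITUDE = Interval(-180.0, 180.0)
GLOBE = (LATITUDE, LONGITUDE)

if __name__ == "__main__":
    print(box_contains(GLOBE, (47.6, -122.3)))            # a valid lat/lon pair
    print(Interval(0.0, 1.0).less_than(Interval(2.0, 3.0)))
```

The spatial extent is nothing more than a product of numeric intervals, which is the sense in which the numeric ontologies are used to define spatial and temporal concepts.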

  24. Conceptual Ontologies • Phenomena • ElNino, Volcano, Thunderstorm, Deforestation • Each has associated spatial/temporal extent, EarthRealms, PhysicalProperties, etc. • Specific instances included • e.g., 1997-98 ElNino • Human Activities • Fisheries, IndustrialProcessing, Economics, Public Good • State • History or state of planet or component

  25. SWEET Users • ESML - Earth Science Markup Language • ESIP - Earth Science Information Partner Federation • GEON - Geosciences Network • GENESIS - Global Environmental & Earth Science Information System • IRI - International Research Institute (Columbia) • LEAD - Linked Environments for Atmospheric Discovery • MMI - Marine Metadata Initiative • NOESIS • PEaCE - Pacific Ecoinformatics and Computational Ecology • SESDI - Semantically Enabled Science Data Integration • VSTO - Virtual Solar-Terrestrial Observatory

  26. Collaboration Web Site • Discussion tools • Blog, wiki, moderated discussion board • Version Control/ Configuration Management • Trace dependencies on external ontologies • Tools to search for existing concepts in registered ontologies • Ontology Validation Procedure • W3C note is formal submission method • Registry/discovery of ontologies • Support workflows/services for ontology development

  27. Community Issues • Content • Maintain alignment given expansion of classes and properties • Standards and Conventions • Agreement on standards for use of OWL • Fuzzy representation conventions • Review Board • Who will oversee and maintain in perpetuity (or at least through the next funding cycle)? • ESIP Federation? ESSI? • Global Support • Provide tools to visualize and appreciate the big picture

  28. Update/Matching Issues • No removal of terms except for spelling or factual errors • Subscription service to notify affected ontologies when changes made • Must avoid contradictions • Additions can create redundancy if sameAs not used • Humans must oversee “matching” • CF has established moderator to carry out analogous additions • OWL “import” imports entire file • Associate community with ontology terms • Community tagging

  29. Best Practices • Keep ontologies small, modular • Be careful: “owl:imports” imports everything • Use higher level ontologies where possible • Identify hierarchy of concept spaces • Model schemas • Try to keep dependencies unidirectional
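
Two of these practices, the warning that an import pulls in everything and the advice to keep dependencies unidirectional, can be checked mechanically. The sketch below works on a made-up import graph; the ontology names and both helper functions are hypothetical.

```python
# Hypothetical import graph: each ontology maps to the files it imports.
IMPORTS = {
    "stratospheric_chemistry": ["sweet_substance", "sweet_process"],
    "sweet_process": ["sweet_property"],
    "sweet_substance": ["sweet_property"],
    "sweet_property": ["sweet_units"],
    "sweet_units": [],
}


def import_closure(name: str, graph: dict[str, list[str]]) -> set[str]:
    """Everything pulled in, directly or indirectly, by importing `name`."""
    seen: set[str] = set()
    stack = list(graph.get(name, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, []))
    return seen


def has_cycle(graph: dict[str, list[str]]) -> bool:
    """True if any ontology ends up importing itself (not unidirectional)."""
    return any(name in import_closure(name, graph) for name in graph)


if __name__ == "__main__":
    print(sorted(import_closure("stratospheric_chemistry", IMPORTS)))
    print(has_cycle(IMPORTS))
```

In this graph the specialized ontology names two imports but ends up loading four files, which is why small, modular ontologies keep the cost of an import manageable.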
