1 / 44

Use of RDF/OWL in Ingrid

Use of RDF/OWL in Ingrid. M.Benno Blumenthal and John del Corral International Research Institute for Climate and Society http://iridl.ldeo.columbia.edu/ontologies/. Why RDF?. Make implicit semantics explicit Web-based system for interoperating semantics

rad
Télécharger la présentation

Use of RDF/OWL in Ingrid

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Use of RDF/OWL in Ingrid M.Benno Blumenthal and John del Corral International Research Institute for Climate and Society http://iridl.ldeo.columbia.edu/ontologies/

  2. Why RDF? Make implicit semantics explicit Web-based system for interoperating semantics RDF/OWL is an emerging technology, so tools are being built that help solve the semantic problems in handling data

  3. Standard Metadata Standard Metadata Schema/Data Services Datasets Tools Users

  4. StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema Datasets Datasets Datasets Datasets Datasets Tools Tools Tools Tools Tools Users Users Users Users Users Many Data Communities

  5. StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema Datasets Datasets Datasets Datasets Datasets Tools Tools Tools Tools Tools Users Users Users Users Users Super Schema Standard metadata schema

  6. StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema Datasets Datasets Datasets Datasets Datasets Tools Tools Tools Tools Tools Users Users Users Users Users Super Schema: direct Standard metadata schema/data service

  7. Flaws • A lot of work • Super Schema/Service is the Lowest-Common-Denominator • Science keeps evolving, so that standards either fall behind or constantly change

  8. StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema Datasets Datasets Datasets Datasets Datasets Tools Tools Tools Tools Tools Users Users Users Users Users RDF Standard Data Model Exchange Standard metadata schema RDF RDF RDF RDF RDF RDF

  9. RDF RDF RDF RDF RDF StandardMetadataSchema StandardMetadataSchema StandardMetadataSchema StandardMetadataSchem StandardMetadataSchema RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF Datasets Datasets Datasets Datasets Datasets Tools Tools Tools Tools Tools Users Users Users Users Users RDF Data Model Exchange Standard metadata schema RDF

  10. Why is this better? • Maps the original dataset metadata into a standard format that can be transported and manipulated • Still the same impedance mismatch when mapped to the least-common-denominator standard metadata, but • When a better standard comes along, the original complete-but-nonstandard metadata is already there to be remapped, and “late semantic binding” means everyone can use the new semantic mapping • Can use enhanced mappings between models that have common concepts beyond the least-common-denominator • EASIER – tools to enhance the mapping process, mappings build on other mappings

  11. queries queries queries RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF Architecture Virtual (derived) RDF

  12. Example: Search Interface Additional Semantics Dataset Ontology Search Ontology Datasets Search Interface Users

  13. Sample Tool: Faceted Search http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...

  14. Distinctive Features of the search • Search terms are interrelated • terms that describe the set of returns are displayed (spanning and not) • Returned items also have structure (sub-items and superseded items are not shown)

  15. Architectural Features of the search • Multiple search structures possible • Multiple languages possible • Search structure is kept in the database, not in the code http://iridl.ldeo.columbia.edu/ontologies/query2.pl

  16. RDF: framework for writing connections Triplets of • Subject • Property (or Predicate) • Object URI’s identify things, i.e. most of the above Namespaces are used as a convenient shorthand for the URI’s

  17. Datatype Properties {WOA} dc:title “NOAA NODC WOA01” {WOA} dc:description “NOAA NODC WOA01: World Ocean Atlas 2001, an atlas of objectively analyzed fields of major ocean parameters at monthly, seasonal, and annual time scales. Resolution: 1x1; Longitude: global; Latitude: global; Depth: [0 m,5500 m]; Time: [Jan,Dec]; monthly”

  18. Object Properties {WOA} iridl:isContainerOf {Grid-1x1}, {Grid-1x1} iridl:isContainerOf {Monthly}

  19. WOA01 diagram

  20. Standard Properties {WOA} dcterm:hasPart {Grid-1x1}, {Grid-1x1} dcterm:hasPart {MONTHLY} Alternatively {WOA} iridl:isContainerOf {Grid-1x1}, {iridl:isContainerOf} rdfs:subPropertyOf {dcterm:hasPart}

  21. Data Structures in RDF Object properties provide a framework for explicitly writing down relationships between data objects/components, e.g. vague meaning of nesting is made explicit Properties also can be related, since they are objects too {SST} rdf:type {cfatt:non_coordinate_variable}, {SST} cfobj:standard_name {cf:sea_surface_temperature}, {SST} netcdf:hasDimension {longitude}

  22. Virtual Triples Use Conventions to connect concepts to established sets of concepts Generate additional “virtual” triples from the original set and semantics RDFS – some property/class semantics OWL – additional property/class semantics: more sophisticated (ontological) relationships SWRL – rules for constructing virtual triples

  23. OWL Language for expressing ontologies, i.e. the semantics are very important. However, even without a reasoner to generate the implied RDF statements, OWL classes and properties represent a sophistication of the RDF Schema However, there are many world views in how to express concepts: concepts as classes vs concepts as individuals vs concept as predicate

  24. Define terms • Attribute Ontology • Object Ontology • Term Ontology

  25. Attribute Ontology • Subjects are the only type-object • Predicates are “attributes” • Objects are datatype • Isomorphic to simple data tables • Isomorphic to netcdf attributes of datasets • Some faceted browsers: predicate = facet

  26. Object Ontology • Objects are object-type • Isomorphic to “belongs to” • Isomorphic to multiple data tables connected by keys • Express the concept behind netcdf attributes which name variables • Concepts as objects can be cross-walked • Concepts as object can be interrelated

  27. Example: controlled vocabulary {variable} cfatt:standard_name {“string”} Where string has to belong to a list of possibilities. {variable} cfobj:standard_name {stdnam} Where stdnam is an individual of the class cfobj:StandardName

  28. Example: controlled vocabulary Bi-direction crosswalk between the two is somewhat trivial, which means all my objects will have both cfatt:standard_name and cfobj:standard_name

  29. Example: controlled vocabulary If I am writing software to read/write netcdf files, I use the cfatt ontology and in particular cfatt:standard_name If I am making connections/cross-walks to other variable naming standards, I use cfobj:standard_name

  30. Term Ontology Concepts as individuals Simple Knowledge Organization System (SKOS) is a prime example The ontology used here is slightly different: facets are classes of terms rather than being top_concepts

  31. Nuanced tagging Concepts as objects can be interrelated: specific terms imply broader terms Object ends up being tagging with terms ranging from general to specific. Search can then be nuanced tagging can proceed in absence of perfect information

  32. Mapping to Object Oriented Programming • ActiveRDF • Elmo

  33. Faceted Search Explicated

  34. Search Interface • Items (datasets/maps) • Terms • Facets • Taxa

  35. Search Interface Semantic API {item} dc:title dc:description rss:link iridl:icon dcterm:isPartOf {item2} dcterm:isReplacedBy {item2} {item} trm:isDescribedBy {term} {term} a {facet} of {taxa} of {trm:Term}, {facet} a {trm:Facet}, {taxa} a {trm:Taxa}, {term} trm:directlyImplies {term2}

  36. Faceted Search w/Queries http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...

  37. queries queries queries RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF Architecture Virtual (derived) RDF

  38. IRI RDF Architecture Data Servers MMI Ontologies JPL bibliography Start Point Standards Organizations RDF Crawler Location Canonicalizer RDFS Semantics Owl Semantics SWRL Rules SeRQL CONSTRUCT Time Canonicalizer Sesame Search Queries Search Interface

  39. Cast of Characters NC – netcdf data file format CF – Climate and Forecast metadata convention for netcdf SWEET - Semantic Web for Earth and Environmental Terminology (OWL Ontology) IRIDL – IRI Data Library

  40. CF attributes NC basic attributes IRIDL attributes/objects CF data objects CF Standard Names (RDF object) SWEET Ontologies (OWL) Location CF Standard Names As Terms IRIDL Terms SWEET as Terms Search Terms Gazetteer Terms

  41. Thoughts • Pure RDF framework seems currently viable for a moderate collection of data • Potential for making a lot of implicit data conventions explicit • Explicit conventions can improve interoperability • Simple RDF concepts can greatly impact searches

  42. Future Work Possibilities More Usable Search Interface Tagging Interface that uses tag interrelationships to simplify choices Data Format translation using semantics “Related Object Browsing” given a dataset, find related data, papers, images Document/execute/create analysis trees Stovepipe conventions/bash-to-fit Less Monolithic IRI Data Library

  43. Implications for Curator/Metafor • Reproducibility implies complete metadata • Non-standard complete metadata just needs to be mapped to more standard schemes • A multiple-scheme system like RDF retains reproducibility even with partial mapping to standards • Should be able to measure the misfit – find the space of the “unexplained” – guidance for developing standards.

  44. Stovepipe Conventions • Fixed Schema • Agreed upon metadata domain • Agreed upon data domain • Designed to be a partial solution General server software needs to decide whether data legitimately fits the standard User contemplates bash-to-fit

More Related