Introduction to Databases: From Data to Knowledge Bases. Instructors: Bertram Ludaescher, Kai Lin
Overview • 08:30–09:30 Introduction to KR (1h) • 09:30–09:45 BREAK (15’) • 09:45–11:20 Intro to KR (1h45’) • 11:20–11:50 Demos (30’) • 11:50–13:15 LUNCH (1h25’) • Demonstrations/Hands-on (~30’) • Ontology-enabled data integration • Concept map creation tool • Ontology creation tool
The Problem: Scientific Data Integration, or: … from Questions to Queries …
Ontology Cheat Sheet (1/2) • What is an ontology? An ontology usually … • specifies a theory (a set of logic models) by … • defining and relating … • concepts representing features of a domain of interest • Also overloaded (sloppy) for: • Controlled vocabularies • Database schema (relational, XML Schema/DTD, …) • Conceptual schema (ER, UML, …) • Thesauri (synonyms, broader term/narrower term) • Taxonomies (classifications) • Informal/semi-formal knowledge representations • “Concept spaces”, “concept maps” • Labeled graphs / semantic networks (RDF) • Formal ontologies, e.g., in [Description] Logic (OWL) • “formalization of a specification” constrains possible interpretation of terms
Ontology Cheat Sheet (2/2) • What are ontologies used for? • Conceptual models of a domain or application, (communication means, system design, …) • Classification of … • concepts (taxonomy) and • data/object instances through classes • Analysis of ontologies e.g. • Graph queries (reachability, path queries, …) • Reasoning (concept subsumption, consistency checking, …) • Targets for semantic data registration • Conceptual indexes and views for • searching, • browsing, • querying, and • integration of registered data
Ontologies as Metadata++ • Ontologies = Smarter Metadata™
Smarter (Meta)data I: Logical Data Views Adoption of a standard (meta)data model => wrap data sets into unified virtual views Source: NADAM Team (Boyan Brodaric et al.)
Smarter Metadata II: Multihierarchical Rock Classification for “Thematic Queries” (GSC), or: Taxonomies are not only for biologists ... • “Smart discovery & querying” via multiple, independent concept hierarchies (controlled vocabularies) • Data at different description levels can be found and processed • Hierarchies: Genesis, Fabric, Composition, Texture
Smarter Metadata III: Source Contextualization & Ontology Refinement (Biomedical Informatics Research Network, http://nbirn.net) • The next frontier: Capturing Knowledge about Dynamic Processes (“Process Ontologies”)
Ontology-Enabled Application Example: Geologic Map Integration • Domain knowledge, knowledge representation: AGE ONTOLOGY • Show formations where AGE = ‘Paleozoic’ (without age ontology) • Show formations where AGE = ‘Paleozoic’ (with age ontology) • Map labels: Nevada; +/- a few hundred million years
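The difference between the two queries on this slide can be sketched in a few lines of Python. Without the age ontology, the query matches only formations labelled exactly with the query term; with it, sub-periods of that term match as well. The small hierarchy and the formation records below are illustrative assumptions, not data from the slides.

```python
# Hypothetical fragment of a geologic age ontology: child -> parent (subclass relation).
AGE_PARENT = {
    "Cambrian": "Paleozoic",
    "Ordovician": "Paleozoic",
    "Devonian": "Paleozoic",
    "Permian": "Paleozoic",
    "Triassic": "Mesozoic",
    "Paleozoic": "Phanerozoic",
    "Mesozoic": "Phanerozoic",
}

# Made-up formation records, as they might appear in a survey's attribute table.
FORMATIONS = [
    {"name": "F1", "age": "Paleozoic"},
    {"name": "F2", "age": "Cambrian"},
    {"name": "F3", "age": "Devonian"},
    {"name": "F4", "age": "Triassic"},
]


def is_a(term, ancestor):
    """True if term equals ancestor or is a (transitive) subclass of it."""
    while term is not None:
        if term == ancestor:
            return True
        term = AGE_PARENT.get(term)
    return False


def query_without_ontology(age):
    """Plain string match on the AGE column."""
    return [f["name"] for f in FORMATIONS if f["age"] == age]


def query_with_ontology(age):
    """Match the query term and everything the ontology places below it."""
    return [f["name"] for f in FORMATIONS if is_a(f["age"], age)]


print(query_without_ontology("Paleozoic"))  # ['F1']
print(query_with_ontology("Paleozoic"))     # ['F1', 'F2', 'F3']
```

The same `is_a` walk is a simple form of the reachability and subsumption queries listed on the cheat sheet; a production system would more likely precompute the transitive closure.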
Integrated querying of multiple datasets via different “ontologies” (conceptual views)
Querying by Chemical Composition: Results • Note the fine differences in shades of gray: • DO know: It’s NOT there! • DON’T know! (not registered) • OK, we've got to work on the color coding ;-)
Querying w/ British Rock Classification • Uses a GSC-BRC inter-ontology articulation mapping
British Rock Classification Query: Results • Uses a GSC-BRC inter-ontology articulation mapping
The Query: Show sedimentary rocks • The Puzzle: Find the 17 differences in the results… • (but first: what states are we looking at?)
Differing Conceptual Views: Why? • We are looking at the same datasets, so why do they look different? • Different rock classifications (GSC, BGS) are used as “targets” for data registration • Not every rock name/rock type found in the raw data is found in both classifications • The mapping (“articulation”) between the classifications is an approximation only • Yet: having “conceptual views” (even if different) on the data really seems like a good idea…
Geologic Map Integration • Given: • Geologic maps from different state geological surveys (shapefiles w/ different data schemas) • Different ontologies: • Geologic age ontology • Rock classification ontologies: • Multiple hierarchies (chemical, fabric, texture, genesis) from Geological Survey of Canada (GSC) • Single hierarchy from British Geological Survey (BGS) • Problem: • Support uniform queries using different ontologies • Support registration w/ ontology A, querying w/ ontology B
A Multi-Hierarchical Rock Classification “Ontology” (really: Taxonomy) • Hierarchies: Genesis, Fabric, Composition, Texture
Demonstration of Ontology-enabled Map Integration (OMI) v2 [Diagram, three layers: Data sets, Ontologies, Applications. Data sets are attached to ontologies A, B, and C via “Semantic Registration”; the applications on top are the Ontology-enabled Map Integrator {A,B}, Application (B), and Application (C)]
Ontology Mapping: Overview • Align ontologies • Integrate data sets which are registered to different ontologies • Query data sets through different ontologies [Diagram labels: Data set 1 “register” Ontology 1; Data set 2 “register” Ontology 2; “Ontology mappings” between Ontology 1 and Ontology 2; “queries”]
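The idea of registering data with one ontology and querying it through another can be sketched as follows. This is a minimal illustration under stated assumptions: the class names, the `A:`/`B:` prefixes, the mapping table, and the records are all hypothetical, and the workbench's actual mapping mechanism is not described in the slides.

```python
# Subclass relation inside the (hypothetical) query ontology B: child -> parent.
SUBCLASS = {
    "B:Sandstone": "B:Sedimentary",
    "B:Limestone": "B:Sedimentary",
}

# Articulation mapping: class in ontology A -> corresponding class in ontology B.
# Such mappings are approximate; classes without a counterpart have no entry.
A_TO_B = {
    "A:Arenite": "B:Sandstone",
    "A:Carbonate": "B:Limestone",
}

# Records from two data sets, each registered to a different ontology (made up).
REGISTERED = [
    {"id": "d1-1", "cls": "A:Arenite"},
    {"id": "d1-2", "cls": "A:Carbonate"},
    {"id": "d1-3", "cls": "A:Evaporite"},  # no mapping into B
    {"id": "d2-1", "cls": "B:Limestone"},
    {"id": "d2-2", "cls": "B:Granite"},
]


def ancestors(cls):
    """Yield cls followed by all of its superclasses in ontology B."""
    while cls is not None:
        yield cls
        cls = SUBCLASS.get(cls)


def in_terms_of_b(cls):
    """Translate a class into ontology B, or return None if no mapping applies."""
    if cls.startswith("B:"):
        return cls
    return A_TO_B.get(cls)


def query_b(target):
    """Return ids of all records whose class, seen through B, falls under target."""
    hits = []
    for rec in REGISTERED:
        b_cls = in_terms_of_b(rec["cls"])
        if b_cls is not None and target in ancestors(b_cls):
            hits.append(rec["id"])
    return hits


print(query_b("B:Sedimentary"))  # ['d1-1', 'd1-2', 'd2-1']
```

Record `d1-3` is silently dropped because its class has no counterpart in B. This is one way the result sets can differ depending on which ontology is used for querying.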
Geology Workbench: Initial State (An Ontology-based Mediator) • Click on Ontologies • Click on Datasets • Click on Applications
Geology Workbench: Uploading Ontologies • Click on Ontology Submission • Choose an OWL file to upload • Name Space: can be used to import this ontology into others • Click to check its details
Geology Workbench: Data (to Ontology!) Registration, Step 1: Choose Classes • Choose an ontology • Click on Submission • Select a shapefile • Data set name
Geology Workbench: Data Registration, Step 2: Choose Columns for Selected Classes • Selected column: it contains information about geologic age
Geology Workbench: Data Registration, Step 3: Resolve Mismatches • Two terms do not match any ontology term • Manually map “algonkian” into the ontology
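The mismatch-resolution step can be sketched as a two-pass match: first match column values against ontology terms automatically, then apply manual mappings for whatever is left. The term list, the column values, the case-insensitive matching rule, and the target chosen for “algonkian” are all assumptions made for illustration; the workbench's own matching rules are not given in the slides.

```python
# Hypothetical set of terms defined by the age ontology.
ONTOLOGY_TERMS = {"Paleozoic", "Mesozoic", "Cenozoic", "Precambrian"}


def match_column(values, ontology_terms, manual=None):
    """Match raw column values to ontology terms.

    Returns (matched, unmatched): a dict value -> ontology term, and a list
    of values that neither the automatic nor the manual mapping covers.
    """
    manual = manual or {}
    by_lower = {t.lower(): t for t in ontology_terms}
    matched, unmatched = {}, []
    for v in values:
        key = v.strip().lower()
        if v in manual:            # manual mappings take precedence
            matched[v] = manual[v]
        elif key in by_lower:      # assumed rule: case-insensitive exact match
            matched[v] = by_lower[key]
        else:
            unmatched.append(v)
    return matched, unmatched


column = ["paleozoic", "Mesozoic", "algonkian", "unknown?"]

# First pass: two values are left over.
matched, unmatched = match_column(column, ONTOLOGY_TERMS)
print(unmatched)  # ['algonkian', 'unknown?']

# Second pass: the user maps one of them by hand (target term is an assumption).
matched, unmatched = match_column(
    column, ONTOLOGY_TERMS, manual={"algonkian": "Precambrian"}
)
print(unmatched)  # ['unknown?']
```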
Geology Workbench: Ontology-enabled Map Integrator • Click on the name • Choose interesting classes • Result: all areas with the age Paleozoic
Geology Workbench: Change Ontology • Submit a mapping: ontology mapping between British Rock Classification and Canadian Rock Classification • Switch from Canadian Rock Classification to British Rock Classification • New query interface • Run it
Ontology Repository • Accept user-defined ontologies in OWL • Any ontology saved in the system or accessible by URL can be imported into another user-defined ontology (inter-ontology references) • Provide a tool to browse the ontologies in the repository
composition.owl (excerpt):
……………..
<owl:Ontology>
  <owl:imports rdf:resource="http://compute5.sdsc.geongrid.org:8080/workbench/jsp/ontologies/genesis.owl"/>
</owl:Ontology>
…………….
<owl:Class rdf:ID="Ultramafite">
  <rdfs:subClassOf rdf:resource="#Ultramafic"/>
  <rdfs:subClassOf rdf:resource="http://compute5.sdsc.geongrid.org:8080/workbench/jsp/ontologies/genesis.owl#Igneous"/>
</owl:Class>
……………..
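To see how such inter-ontology references might be read by a tool, here is a small sketch using Python's standard `xml.etree.ElementTree`. The slide shows only an excerpt, so the enclosing `rdf:RDF` element and the namespace declarations (the standard RDF, RDFS, and OWL namespace URIs) are added here as assumptions to make the fragment parseable. A real system would more likely use a dedicated RDF/OWL library.

```python
import xml.etree.ElementTree as ET

# Standard W3C namespace URIs.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# URL of the imported ontology, as given in the excerpt.
GENESIS = "http://compute5.sdsc.geongrid.org:8080/workbench/jsp/ontologies/genesis.owl"

# The excerpt, wrapped in an assumed rdf:RDF root with namespace declarations.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Ontology>
    <owl:imports rdf:resource="{GENESIS}"/>
  </owl:Ontology>
  <owl:Class rdf:ID="Ultramafite">
    <rdfs:subClassOf rdf:resource="#Ultramafic"/>
    <rdfs:subClassOf rdf:resource="{GENESIS}#Igneous"/>
  </owl:Class>
</rdf:RDF>"""

root = ET.fromstring(DOC)

# ElementTree spells namespaced names as "{namespace-uri}localname".
RESOURCE = f"{{{RDF}}}resource"

# Ontologies imported by this one.
imports = [e.get(RESOURCE) for e in root.iter(f"{{{OWL}}}imports")]

# Direct superclasses of each named class; a reference may be local ("#...")
# or point into an imported ontology (full URL with fragment).
subclass_of = {}
for cls in root.iter(f"{{{OWL}}}Class"):
    name = cls.get(f"{{{RDF}}}ID")
    if name is not None:
        subclass_of[name] = [p.get(RESOURCE) for p in cls.findall(f"{{{RDFS}}}subClassOf")]

print(imports)
print(subclass_of)
```

Here `Ultramafite` ends up with one local superclass (`#Ultramafic`) and one superclass defined in the imported `genesis.owl`, which is the inter-ontology reference the slide describes.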
Ontology-Enabled Map Integration: Where do we stand? • The simple case (done): • ontologies contain only the subclass relation • More complicated cases (coming soon): • ontologies contain classes with attributes • ontologies with constraints in Description Logic • Implementation: • v1, v2 prototypes: detail-level registration to ontology • v3 (portal): item-level registration to ontology
Current Ontology Registration (Item-level) v3 [Screenshot labels: Domain Knowledge, Ontologies, Arizona]
System Overview [Diagram labels: User Access (via Portal) • Client Access (via web services): other distributed apps (Kepler, DLESE, …) • Search condition(s): spatial, temporal, concept • User actions: add, delete, manipulate • Resources: myOntology.owl, myDataset.foo, metadata • Components: Resource Registration, GEONsearch, GEONworkbench, GEON Workspace (user), GEON Catalog, SRB, Log, GEONmiddleware • External services: Gazetteer, DLESE, …; Geologic Age, Chronos, …]
Complex Multiple-Worlds Mediation and XML • XML is Syntax • DTDs talk about element nesting • XML Schema schemas give you data types • need anything else? => write comments! • Domain Semantics is complex: • implicit assumptions, hidden semantics • sources seem unrelated to the non-expert • Need Structure and Semantics beyond XML trees! • employ richer OO models • make domain semantics and “glue knowledge” explicit • use ontologies to fix terminology and conceptualization • avoid ambiguities by using formal semantics
XML-Based vs. Model-Based Mediation • XML-based: Integrated-DTD := XQuery(Src1-DTD, ...) • Raw data, XML elements, XML models • Structural constraints (DTDs), e.g., A = (B*|C),D; Parent, Child, Sibling, ... • No domain constraints • Model-based: Integrated-CM := CM-QL(Src1-CM, ...) • Raw data, (XML) objects, conceptual models • Classes, relations, is-a, has-a, ... • Logical domain constraints (IF ... THEN ...); ontologies, DMs, PMs • CM ~ {Descr. Logic, ER, UML, RDF/XML(-Schema), …} • CM-QL ~ {F-Logic, OWL, …}
Knowledge Representation: Relating Theory to the World via Formal Models • Source: John F. Sowa, Knowledge Representation: Logical, Philosophical, and Computational Foundations • “All models are wrong, but some models are useful!”
Glossary (wordreference.com) • ontology, noun: 1 (Philosophy) the branch of metaphysics that deals with the nature of being; 2 (Logic) the set of entities presupposed by a theory • taxonomy, noun: 1a the branch of biology concerned with the classification of organisms into groups based on similarities of structure, origin, etc.; 1b the practice of arranging organisms in this way; 2 the science or practice of classification [ETYMOLOGY: 19th Century: from French taxonomie, from Greek taxis order + -nomy] • thesaurus, noun (plural: -ruses, -ri [-raı]): 1 a book containing systematized lists of synonyms and related words; 2 a dictionary of selected words or topics; 3 (rare) a treasury [ETYMOLOGY: 18th Century: from Latin, Greek: treasure]
Glossary (wordreference.com) • concept, noun: 1 an idea, esp. an abstract idea (example: the concepts of biology); 2 (Philosophy) a general idea or notion that corresponds to some class of entities and that consists of the characteristic or essential features of the class; 3 (Philosophy) a the conjunction of all the characteristic features of something, b a theoretical construct within some theory, c a directly intuited object of thought, d the meaning of a predicate; 4 [modifier] (of a product, esp. a car) created as an exercise to demonstrate the technical skills and imagination of the designers, and not intended for mass production or sale [ETYMOLOGY: 16th Century: from Latin conceptum something received or conceived, from concipere to take in, conceive] • contingent, adjective: 1 [when postpositive, often foll. by on or upon] dependent on events, conditions, etc., not yet known; conditional; 2 (Logic) (of a proposition) true under certain conditions, false under others; not necessary; 3 (in systemic grammar) denoting contingency (sense 4); 4 (Metaphysics) (of some being) existing only as a matter of fact; not necessarily existing; 5 happening by chance or without known cause; accidental; 6 that may or may not happen; uncertain • glossary, noun (plural: -ries): an alphabetical list of terms peculiar to a field of knowledge with definitions or explanations. Sometimes called: gloss [ETYMOLOGY: 14th Century: from Late Latin glossarium; see gloss2]
1st Attempt: Ontologies in CS • An ontology is ... • an explicit specification of a conceptualization [Gruber93] • a shared understanding of some domain of interest [Uschold, Gruninger96] • Different aspects: • a formal specification (reasoning and “execution”) • ... of a conceptualisation of a domain (community) • ... of some part of the world of interest (application, science domain) • Provides: • A common vocabulary of terms • Some specification of the meaning of the terms (semantics) • A shared “understanding” for people and machines
Ontology as a philosophical discipline • Ontology as a philosophical discipline, which deals with the nature and the organization of reality: • Ontology as such is usually contrasted with Epistemology, which deals with the nature and sources of our knowledge [a.k.a. Theory of Knowledge]. Aristotle defined Ontology as the science of being as such: unlike the special sciences, each of which investigates a class of beings and their determinations, Ontology regards "all the species of being qua being and the attributes which belong to it qua being" (Aristotle, Metaphysics, IV, 1). • In this sense Ontology tries to answer the question: What is being? What exists? • the nature of being, not an enumeration of “stuff” around us…
Some different uses of the word “Ontology” [Guarino’95] 1. Ontology as a philosophical discipline 2. Ontology as an informal conceptual system 3. Ontology as a formal semantic account 4. Ontology as a specification of a “conceptualization” 5. Ontology as a representation of a conceptual system via a logical theory 5.1 characterized by specific formal properties 5.2 characterized only by its specific purposes 6. Ontology as the vocabulary used by a logical theory 7. Ontology as a (meta-level) specification of a logical theory http://ontology.ip.rm.cnr.it/Papers/KBKS95.pdf
Ontologies vs Conceptualizations • Given a logical language L ... • ... a conceptualization is a set of models of L which describes the admittable (intended) interpretations of its non-logical symbols (the vocabulary) • ... an ontology is a (possibly incomplete) axiomatization of a conceptualization. [Diagram labels: set of all models M(L); conceptualization C(L); ontology; logic theories (consistent sets of sentences; closed under logical consequence)] [Guarino96] http://www-ksl.stanford.edu/KR96/Guarino-What/P003.html
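A toy propositional example may help make the distinction concrete. Here the “language” has three atoms, a model is a truth assignment, the conceptualization is the set of intended models, and the ontology is a set of axioms whose models include all intended ones but, being incomplete, admit an extra one. The atoms, axioms, and intended-model condition are invented for illustration, and the setting is far simpler than the logical languages Guarino has in mind.

```python
from itertools import product

# Vocabulary (non-logical symbols) of a tiny propositional language L.
ATOMS = ("rock", "igneous", "sedimentary")


def all_models():
    """M(L): every truth assignment to the atoms."""
    return [dict(zip(ATOMS, values))
            for values in product([False, True], repeat=len(ATOMS))]


# The ontology: an (incomplete) axiomatization, each axiom a test on a model.
AXIOMS = [
    lambda m: (not m["igneous"]) or m["rock"],       # igneous -> rock
    lambda m: (not m["sedimentary"]) or m["rock"],   # sedimentary -> rock
]


def satisfies_ontology(m):
    return all(axiom(m) for axiom in AXIOMS)


def intended(m):
    """C(L): the intended models also keep igneous and sedimentary disjoint,
    a constraint the ontology above does not (yet) state."""
    return satisfies_ontology(m) and not (m["igneous"] and m["sedimentary"])


models_of_ontology = [m for m in all_models() if satisfies_ontology(m)]
conceptualization = [m for m in all_models() if intended(m)]

print(len(all_models()), len(models_of_ontology), len(conceptualization))  # 8 5 4
```

Adding the disjointness axiom to `AXIOMS` would shrink the ontology's models to exactly the conceptualization; leaving it out is what “possibly incomplete” means in this sketch.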