Ontology Engineering: Tools and Methodologies

Ian Horrocks <horrocks@cs.man.ac.uk>, Information Management Group, School of Computer Science, University of Manchester. Tutorial resources: http://www.cs.man.ac.uk/~horrocks/nsd07/

Presentation Transcript


  1. Ontology Engineering: Tools and Methodologies Ian Horrocks <horrocks@cs.man.ac.uk> Information Management Group School of Computer Science University of Manchester

  2. Tutorial Resources http://www.cs.man.ac.uk/~horrocks/nsd07/

  3. Ontologies

  4. Ontology: Origins and History • In Philosophy, fundamental branch of metaphysics • Studies “being” or “existence” and their basic categories • Aims to find out what entities and types of entities exist

  5. Ontology in Information Science • An ontology is an engineering artefact consisting of: • A vocabulary used to describe (a particular view of) some domain • An explicit specification of the intended meaning of the vocabulary. • Often includes classification based information • Constraints capturing background knowledge about the domain • Ideally, an ontology should: • Capture a shared understanding of a domain of interest • Provide a formal and machine manipulable model

  6. Example Ontology (Protégé)

  7. The Web Ontology Language OWL

  8. OWL History • Semantic Web led to requirement for a “web ontology language” • W3C set up Web-Ontology (WebOnt) Working Group • WebOnt developed OWL language • OWL based on earlier languages RDF, OIL and DAML+OIL • OWL now a W3C recommendation (i.e., a standard) • OWL is a family of 3 languages: OWL Lite, OWL DL and OWL Full • OIL, DAML+OIL and OWL (DL & Lite) based on Description Logics • Many OWL DL/Lite tools & ontologies • Relatively few OWL Full tools or ontologies

  9. What Are Description Logics? • A family of logic based Knowledge Representation formalisms • Descendants of semantic networks and KL-ONE • Describe domain in terms of concepts (classes), roles (properties, relationships) and individuals • Operators allow for composition of complex concepts • Names can be given to complex concepts, e.g.: HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)
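
The reading of the HappyParent definition can be made concrete with a small evaluator over a finite interpretation. This is only a sketch: the individuals, the facts about them, and the tuple encoding of concepts (`"and"`, `"or"`, `"all"`, `"some"`) are assumptions made for illustration and do not come from the slides.

```python
# Finite interpretation: a domain, concept extensions, and role extensions.
# All individuals and facts below are invented for illustration.
domain = {"john", "mary", "sue", "bob", "tim"}
concepts = {
    "Parent": {"john", "bob"},
    "Intelligent": {"mary"},
    "Athletic": {"sue"},
}
roles = {
    "hasChild": {("john", "mary"), ("john", "sue"), ("bob", "tim")},
}


def ext(c):
    """Return the set of domain elements that satisfy concept expression c."""
    if isinstance(c, str):  # atomic concept name
        return set(concepts.get(c, set()))
    op = c[0]
    if op == "and":  # C ⊓ D
        return ext(c[1]) & ext(c[2])
    if op == "or":  # C ⊔ D
        return ext(c[1]) | ext(c[2])
    if op in ("all", "some"):  # ∀R.C or ∃R.C
        _, role, filler = c
        allowed = ext(filler)
        pairs = roles.get(role, set())
        test = all if op == "all" else any
        return {x for x in domain
                if test(y in allowed for (s, y) in pairs if s == x)}
    raise ValueError(f"unknown constructor: {op}")


# HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)
happy_parent = ("and", "Parent",
                ("all", "hasChild", ("or", "Intelligent", "Athletic")))
happy = ext(happy_parent)
print(sorted(happy))
```

In this toy interpretation bob is a Parent but not a HappyParent, because his only child is neither Intelligent nor Athletic. Individuals without children satisfy the ∀ restriction vacuously, which is why the conjunction with Parent matters.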

  10. Why (Description) Logic? • OWL exploits results of 15+ years of DL research • Well defined (model theoretic) semantics [Figure: semantic network with nodes Animal, Cat, Felix, Mat and Black, and edges IS-A, has-color and sits-on; Quillian, 1967]

  11. Why (Description) Logic? • OWL exploits results of 15+ years of DL research • Well defined (model theoretic) semantics • Formal properties well understood (complexity, decidability) [Cartoon caption: “I can’t find an efficient algorithm, but neither can all these famous people.” Garey & Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, 1979]

  12. Why (Description) Logic? • OWL exploits results of 15+ years of DL research • Well defined (model theoretic) semantics • Formal properties well understood (complexity, decidability) • Known reasoning algorithms

  13. Why (Description) Logic? • OWL exploits results of 15+ years of DL research • Well defined (model theoretic) semantics • Formal properties well understood (complexity, decidability) • Known reasoning algorithms • Implemented systems (highly optimised) [Logos: Pellet, CEL, KAON2]

  14. Why the Strange Names? • Description Logics are a family of KR formalisms • Mainly distinguished by available operators • Available operators indicated by letters in name, e.g., • S: basic DL (ALC) plus transitive roles (e.g., ancestor R+) • H: role hierarchy (e.g., hasDaughter ⊑ hasChild) • O: nominals/singleton classes (e.g., {Italy}) • I: inverse roles (e.g., isChildOf ≡ hasChild⁻) • N: number restrictions (e.g., ≥2 hasChild, ≤3 hasChild) • Basic DL + role hierarchy + nominals + inverse + NR = SHOIN • The basis for OWL-DL • SHOIN is very expressive, but still decidable (just) • Decidable ⇒ we can build reliable tools and reasoners
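
The naming scheme is simple enough to express as a lookup table. The sketch below only decodes the five letters listed on this slide; the function name `describe` and the fallback text for other letters are assumptions made for illustration.

```python
# Letter-to-feature table copied from the slide. Other letters exist in the
# DL literature but are deliberately left out here.
FEATURES = {
    "S": "basic DL (ALC) plus transitive roles",
    "H": "role hierarchy",
    "O": "nominals/singleton classes",
    "I": "inverse roles",
    "N": "number restrictions",
}


def describe(name):
    """Expand a DL name such as 'SHOIN' into the features its letters denote."""
    return [f"{letter}: {FEATURES.get(letter, '(not listed on the slide)')}"
            for letter in name]


shoin = describe("SHOIN")
for line in shoin:
    print(line)
```

Letters that the slide does not explain are reported as such, so no meaning is guessed for them.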

  15. Why (Description) Logic? • Foundational research was crucial to design of OWL • Informed Working Group decisions at every stage, e.g.: • “Why not extend the language with feature x, which is clearly harmless?” • “Adding x would lead to undecidability - see proof in […]”

  16. Class/Concept Constructors • C is a concept (class); P is a role (property); x is an individual name • XML Schema (XMLS) datatypes as well as classes in ∀P.C and ∃P.C • Restricted form of DL concrete domains

  17. Knowledge Base / Ontology Axioms

  18. Knowledge Base / Ontology • A TBox is a set of “schema” axioms (sentences), e.g.: {Parent ⊑ Person ⊓ ≥1 hasChild, HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)} • An ABox is a set of “data” axioms (ground facts), e.g.: {John:HappyParent, John hasChild Mary} • An OWL ontology is just a SHOIN KB
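
To see how the TBox and ABox interact, the example KB can be encoded as plain data and expanded with a few deterministic rules: unfold a concept name into the right-hand side of its axiom, split conjunctions, and push ∀ restrictions along role assertions. This is a hedged sketch, not a complete reasoner: the tuple encoding and the function `expand` are assumptions, and disjunction and number restrictions are recorded but not reasoned about.

```python
# TBox: left-hand concept name -> right-hand concept expression.
# Both "⊑" and "≡" axioms license unfolding a name into its right-hand side.
tbox = {
    "Parent": ("and", "Person", ("atleast", 1, "hasChild")),
    "HappyParent": ("and", "Parent",
                    ("all", "hasChild", ("or", "Intelligent", "Athletic"))),
}
# ABox: concept assertions (individual, concept) and role assertions.
abox_concepts = {("John", "HappyParent")}
abox_roles = {("John", "hasChild", "Mary")}


def expand(concept_assertions, role_assertions):
    """Apply unfolding, the ⊓-rule and the ∀-rule until nothing new appears."""
    facts = set(concept_assertions)
    changed = True
    while changed:
        changed = False
        for ind, c in list(facts):
            new = set()
            if isinstance(c, str) and c in tbox:
                new.add((ind, tbox[c]))          # unfold a named concept
            elif isinstance(c, tuple) and c[0] == "and":
                new.add((ind, c[1]))             # both conjuncts hold
                new.add((ind, c[2]))
            elif isinstance(c, tuple) and c[0] == "all":
                for s, r, o in role_assertions:  # propagate to role successors
                    if s == ind and r == c[1]:
                        new.add((o, c[2]))
            if not new <= facts:
                facts |= new
                changed = True
    return facts


facts = expand(abox_concepts, abox_roles)
print(("John", "Person") in facts)
```

From the two ABox facts the expansion derives that John is a Person and that Mary is Intelligent ⊔ Athletic, but it cannot say which of the two disjuncts holds for Mary.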

  19. OWL RDF/XML Exchange Syntax • E.g., Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic):
      <owl:Class>
        <owl:intersectionOf rdf:parseType="Collection">
          <owl:Class rdf:about="#Parent"/>
          <owl:Restriction>
            <owl:onProperty rdf:resource="#hasChild"/>
            <owl:allValuesFrom>
              <owl:unionOf rdf:parseType="Collection">
                <owl:Class rdf:about="#Intelligent"/>
                <owl:Class rdf:about="#Athletic"/>
              </owl:unionOf>
            </owl:allValuesFrom>
          </owl:Restriction>
        </owl:intersectionOf>
      </owl:Class>
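
The correspondence between the exchange syntax and the DL notation can be checked mechanically. The sketch below walks the slide's fragment with Python's standard `xml.etree.ElementTree` and prints the DL expression. It makes several assumptions: the `rdf:RDF` wrapper with namespace declarations is added so that the fragment parses, the fragment is kept in the abbreviated form shown on the slide, and `render` handles only the constructs that appear in this one example. It is not a general OWL parser.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The slide's fragment, wrapped in rdf:RDF so the prefixes are declared.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
<owl:Class>
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#Parent"/>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasChild"/>
      <owl:allValuesFrom>
        <owl:unionOf rdf:parseType="Collection">
          <owl:Class rdf:about="#Intelligent"/>
          <owl:Class rdf:about="#Athletic"/>
        </owl:unionOf>
      </owl:allValuesFrom>
    </owl:Restriction>
  </owl:intersectionOf>
</owl:Class>
</rdf:RDF>"""


def q(ns, local):
    """Build the '{namespace}local' tag form that ElementTree uses."""
    return f"{{{ns}}}{local}"


def render(node):
    """Translate one element of the fragment into DL notation."""
    if node.tag == q(OWL, "Class"):
        about = node.get(q(RDF, "about"))
        if about is not None:
            return about.lstrip("#")      # named class
        return render(node[0])            # anonymous class: its constructor
    if node.tag == q(OWL, "intersectionOf"):
        return " ⊓ ".join(render(child) for child in node)
    if node.tag == q(OWL, "unionOf"):
        return "(" + " ⊔ ".join(render(child) for child in node) + ")"
    if node.tag == q(OWL, "Restriction"):
        prop = node.find(q(OWL, "onProperty")).get(q(RDF, "resource"))
        filler = node.find(q(OWL, "allValuesFrom"))
        return f"∀{prop.lstrip('#')}.{render(filler[0])}"
    raise ValueError(f"unsupported element: {node.tag}")


result = render(ET.fromstring(DOC)[0])
print(result)
```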

  20. Ontology Reasoning

  21. Why Ontology Reasoning? • Given key role of ontologies in many applications, it is essential to provide tools and services to help users: • Design and maintain high quality ontologies, e.g.: • Meaningful: all named classes can have instances

  22. Why Ontology Reasoning? • Given key role of ontologies in many applications, it is essential to provide tools and services to help users: • Design and maintain high quality ontologies, e.g.: • Meaningful: all named classes can have instances • Correct: captures intuitions of domain experts

  23. Why Ontology Reasoning? • Given key role of ontologies in many applications, it is essential to provide tools and services to help users: • Design and maintain high quality ontologies, e.g.: • Meaningful: all named classes can have instances • Correct: captures intuitions of domain experts • Minimally redundant: no unintended synonyms (e.g., Banana split and Banana sundae)

  24. Why Ontology Reasoning? • Given key role of ontologies in many applications, it is essential to provide tools and services to help users: • Design and maintain high quality ontologies, e.g.: • Meaningful: all named classes can have instances • Correct: captures intuitions of domain experts • Minimally redundant: no unintended synonyms • Answer queries, e.g.: • Find more general/specific classes • Retrieve individuals/tuples matching a given query
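
Two of these services, finding more general classes and spotting unintended synonyms, can be illustrated on a toy hierarchy of named classes. The sketch below assumes the ontology contains only atomic subsumptions, so that reasoning reduces to a transitive closure. The class names and axioms are invented, and a real DL reasoner does considerably more than this.

```python
from itertools import product

# "Told" subsumptions (subclass, superclass); invented for illustration.
told = {
    ("BananaSplit", "BananaSundae"),
    ("BananaSundae", "BananaSplit"),
    ("BananaSundae", "Dessert"),
    ("Dessert", "Food"),
}


def closure(pairs):
    """Reflexive-transitive closure of a set of subsumption pairs."""
    names = {n for pair in pairs for n in pair}
    sub = set(pairs) | {(n, n) for n in names}
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(sub), repeat=2):
            if b == c and (a, d) not in sub:
                sub.add((a, d))
                changed = True
    return sub


inferred = closure(told)


def superclasses(name):
    """All strictly more general classes of a named class."""
    return {b for (a, b) in inferred if a == name and b != name}


# Two distinct names that subsume each other are synonyms.
synonyms = {(a, b) for (a, b) in inferred if a < b and (b, a) in inferred}
print(sorted(superclasses("BananaSplit")), sorted(synonyms))
```

Whether a detected synonym is intended is a question for the ontology author; the reasoner can only report it.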

  25. Ontology Applications

  26. e-Science • E.g., Open Biomedical Ontologies Consortium (GO, MGED) • Used, e.g., for “in silico” investigations relating theory and data • E.g., relating data on phosphatases to (model of) biological knowledge

  27. Medicine • Building/maintaining terminologies such as SNOMED, NCI, GALEN and FMA • Used, e.g., for semi-automated annotation of MRI images [Figure: brain image labelled Central Sulcus, Frontal Lobe, Parietal Lobe, Occipital Lobe, Temporal Lobe and Lateral Sulcus]

  28. Organising Complex Information • E.g., UN-FAO, NASA, Ordnance Survey, General Motors, Lockheed Martin, …

  30. OWL Experiences and Directions • Workshop at ESWC’07 (Innsbruck, Austria) • Brings together users, implementors and researchers • Submissions include: • Enterprise Integration (Mitre) • Product development (Lockheed Martin) • Role based access control (NASA) • Healthcare (SNOMED) • Agriculture and fisheries (UN Food & Agriculture Organization) • Oral Medicine (Chalmers) • …

  31. Ontology Engineering

  32. Ontology Engineering Tasks • Typical tasks in Ontology Engineering: • author concept descriptions • refine the ontology • manage errors • integrate different ontologies • (partially) reuse ontologies • These tasks are highly challenging; need for: • tool & infrastructure support • design methodologies

  33. Tools and Infrastructure • Editors/environments • Protégé, Swoop, TopBraid Composer, Construct, Ontotrack, …

  34. Tools and Infrastructure • Editors/environments • OilEd, Protégé, Swoop, Construct, Ontotrack, … • Reasoning systems • Cerebra, FaCT++, KAON2, Pellet, Racer, … [Logos: CEL, Pellet, KAON2]

  35. Tools and Infrastructure • Editors/environments • OilEd, Protégé, Swoop, Construct, Ontotrack, … • Reasoning systems • Cerebra, FaCT++, KAON2, Pellet, Racer, … • Design methodologies • Modularity, foundational ontologies, etc. [Figure: foundational ontology taxonomy with nodes Entity, Endurant, Perdurant, Quality, Substantial, Event, Stative, Achievement and Accomplishment]

  36. Development & Maintenance

  37. Development Environments • The most widely used free-to-download tools are • Protégé (Stanford / Manchester), http://protege.stanford.edu/ (be sure to get v4.x) • Swoop (UMD / Clark & Parsia), http://code.google.com/p/swoop/ • Commercial tools include • TopBraid, RacerPro, … • Facilities typically include • Range of display modes and editing features • Visualisation • Consistency and subsumption checking • Useful extras may include • Debugging and explanation • Repair • Integration and/or partitioning

  38. Demo Ontologies • GALEN • http://www.cs.man.ac.uk/~horrocks/OWL/Ontologies/galen.owl • NCI • http://www.mindswap.org/2003/CancerOntology • Tambis • http://www.cs.man.ac.uk/~horrocks/OWL/Ontologies/tambis.owl

  39. GALEN • Ontology about medical terms and surgical procedures. • Work started in the 90s within the OpenGALEN project. • Main applications: • Integration of clinical records, and • decision support. • GALEN: • is very large (~35,000 concepts), • is fairly expressive (SHIF description logic), • has not been classified yet by any DL reasoner • We will look at a smaller version, which: • is still large (~3,000 concepts), • is similarly expressive as full GALEN, • was first classified by the FaCT system.

  40. GALEN: The Ontology at a Glance • Size: • ~ 3,000 classes • ~ 500 object properties • no individuals or datatypes • Expressivity • ~350 General Concept Inclusion Axioms (GCIs). • Concept constructors: • Conjunction (intersectionOf) • Existential restrictions (someValuesFrom) • 150 functional properties • 26 transitive properties

  41. GALEN: The (Unclassified) Hierarchies • The class hierarchy: • Number of subsumption relations: 1,978 • Maximum depth of the tree: 13 • No multiple inheritance • The property hierarchy: • 4 properties with multiple inheritance

  42. GALEN: Concept definitions and GCIs • Concept definition: • Axiom of the form A ≡ C with: • A a concept name • C a (possibly complex) concept • A definition assigns a name A to a complex concept C • Some examples: • LungPathology ≡ Pathology ⊓ ∃locativeAttribute.Lung • RenalTransplant ≡ Transplanting ⊓ ∃actsOn.Kidney

  43. GALEN: Concept definitions and GCIs • Inclusion axioms: • Axioms of the form A ⊑ C: • A is a concept name • C is a possibly complex concept • Represent an incomplete (“partial”) definition • Examples: • XRayMachine ⊑ ImagingDevice • Candida ⊑ Fungus ⊓ ∃hasFunction.AerobicMetabolicProcess • In GALEN, some of these can be very complex: • check out the definitions of Knee Joint and Kidney!

  44. GALEN: Concept definitions and GCIs • General Concept Inclusion Axioms (GCIs): • Axioms of the form C ⊑ D • C, D can be complex • May describe general (background) knowledge about the ontology • Examples: • Secretion ⊓ ∃actsSpecificallyOn.Leucocidin ⊑ ∃isFunctionOf.StaphylococcusAureus • Transport ⊓ ∃actsOn.Glucose ⊓ ∃carriesFrom.Blood ⊑ ∃carriesTo.Cell
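
A GCI acts like a rule whose left-hand side may itself be a complex concept. The sketch below treats a concept as a set of conjuncts and fires a GCI whenever all of its left-hand conjuncts are present. This is a deliberately simplified forward-chaining reading: it is sound for this encoding but incomplete, because it does not reason about the fillers of existential restrictions. The set encoding and the function `saturate` are assumptions made for illustration.

```python
# A conjunct is either a concept name or ("some", role, filler) for ∃role.filler.
# Transport ⊓ ∃actsOn.Glucose ⊓ ∃carriesFrom.Blood ⊑ ∃carriesTo.Cell
gcis = [
    (frozenset({"Transport",
                ("some", "actsOn", "Glucose"),
                ("some", "carriesFrom", "Blood")}),
     frozenset({("some", "carriesTo", "Cell")})),
]


def saturate(conjuncts, axioms):
    """Add the right-hand side of every GCI whose left-hand side is present."""
    known = set(conjuncts)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in axioms:
            if lhs <= known and not rhs <= known:
                known |= rhs
                changed = True
    return known


# A hypothetical concept described by the three left-hand conjuncts.
glucose_transport = {"Transport",
                     ("some", "actsOn", "Glucose"),
                     ("some", "carriesFrom", "Blood")}
derived = saturate(glucose_transport, gcis)
print(("some", "carriesTo", "Cell") in derived)
```

A plain Transport, with no information about what it acts on, does not trigger the axiom, so nothing is derived for it.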

  45. Classifying GALEN • Ontology statistics (revisited): • Number of class subsumption relations: 6729 • 1978 of which are “told” and the rest inferred • Maximum depth of the class tree: 15 • As opposed to 13 in the case of the unclassified tree • Classes with multiple inheritance: 408 • All multiple inheritance relations have been inferred! • This was intended in the design of GALEN • Maximum depth of the property tree: 9 • No change with respect to the “told” tree • Properties with multiple inheritance: 4 • Again, no change with respect to the “told” tree • Reasoning is mostly performed on classes and not on properties
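
Statistics of this kind are straightforward to compute once a hierarchy is available as direct superclass links. The sketch below compares a toy "told" hierarchy with a toy classified one. The class names and links are invented and are not taken from GALEN, and depth is assumed to be counted in edges from the root, which may differ from the convention used for the figures above.

```python
# Direct superclass links, class -> set of parents (toy data, not GALEN).
told = {
    "Top": set(),
    "Process": {"Top"},
    "Pathology": {"Top"},
    "BodyProcess": {"Process"},
    "PathologicalBodyProcess": {"BodyProcess"},
}
# After classification one extra parent has been inferred.
classified = dict(told)
classified["PathologicalBodyProcess"] = {"BodyProcess", "Pathology"}


def stats(hierarchy):
    """Return (direct subsumption links, classes with >1 parent, max depth)."""
    def depth(cls):
        parents = hierarchy[cls]
        return 0 if not parents else 1 + max(depth(p) for p in parents)

    links = sum(len(parents) for parents in hierarchy.values())
    multiple = sum(1 for parents in hierarchy.values() if len(parents) > 1)
    return links, multiple, max(depth(cls) for cls in hierarchy)


print("told:", stats(told))
print("classified:", stats(classified))
```

As in the GALEN figures, the only multiple inheritance in the toy example is the one introduced by classification.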

  46. Modeling Choices • The “upper” part: • Composed of the domain-independent concepts and roles. • Examples: • TopCategory, DomainCategory, GeneralisedStructure… • Shallowly defined (mostly a taxonomy) • The “domain specific” part: • Examples: • Plant, LungPathology, … • Richly defined • Much more than just a taxonomy!

  47. Inferred Knowledge • A trivial subsumption: • Why is PathologicalCondition a subclass of DomainCategory? • Simply look at the definition of PathologicalCondition! • Another example: • Why is PathologicalBehavior a subclass of PathologicalCondition? • Look at the definitions of both classes • Notice that Behavior is a subclass of DomainCategory • A non-trivial subsumption: • Why is AchalasiaProcesses a PathologicalBodyProcesses?

  48. Classifying GALEN • Simple and multiple inheritance • Focus, for example, on PathologicalBodyProcess • Navigate to its super-classes • Visualisation can be useful • In Swoop we can “Fly the mother ship”!

  49. The NCI Ontology • Huge bio-medical ontology describing the Cancer domain • Maintained by dozens of domain experts • Contains information about: • genes, • diseases, • drugs, • research institutions, … All with a cancer-centric focus

  50. NCI: The Ontology at a Glance • Size: • ~30,000 classes • ~70 object properties • no individuals or datatypes • Expressivity • Concept constructors: • Conjunction (intersectionOf) • Existential restrictions (someValuesFrom) • Axioms: • Definitions (no GCIs) • Domain and range of properties
