Ontologies for Information Fusion Deborah L. McGuinness Associate Director and Senior Research Scientist Knowledge Systems Laboratory Stanford University Stanford, CA 94305 USA 650-723-9770 dlm@ksl.stanford.edu
What is an Ontology?
[Figure: spectrum of ontology kinds, from simplest to most expressive: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Term hierarchy (e.g. Yahoo!); Formal taxonomy; Formal instance; Frames (properties); Value restrictions; Description Logics; General Logic]
Based on the AAAI ’99 Ontologies panel (Gruninger, Lehmann, McGuinness, Uschold, Welty). Updated by McGuinness, with additional input from Gruninger, Uschold, and Rockmore.
Some uses of (simple) Ontologies
Simple ontologies (taxonomies) provide:
• Controlled shared vocabulary (search engines, authors, users, databases, programs/agents all speak the same language)
• Site organization, navigation support, expectation setting
• “Umbrella” upper-level structures (for extension, e.g. UNSPSC)
• Browsing support (tagged structures such as Yahoo!)
• Search support (query expansion approaches such as FindUR, e-Cyc)
• Sense disambiguation (e.g. TAP)
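The search-support bullet mentions query expansion over a taxonomy. A minimal sketch of that idea follows, assuming a toy "narrower term" table; the terms and the `expand_query` helper are hypothetical illustrations and are not taken from FindUR or e-Cyc.

```python
from collections import deque

# Hypothetical toy taxonomy: each term maps to its narrower terms.
NARROWER = {
    "wine": ["red wine", "white wine"],
    "white wine": ["chardonnay", "riesling"],
    "red wine": ["merlot"],
}


def expand_query(term, narrower=NARROWER):
    """Return the term followed by all transitively narrower terms (breadth-first)."""
    seen = [term]
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in narrower.get(current, []):
            if child not in seen:  # guard against repeated terms or cycles
                seen.append(child)
                queue.append(child)
    return seen


if __name__ == "__main__":
    # A search for "white wine" would also match documents about its subtypes.
    print(expand_query("white wine"))
```

A search front end could OR the expanded terms together, so a query for a general category also retrieves pages that mention only the specific varieties.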
Uses of Ontologies II
• Interoperability support
• Consistency checking
• Completion
• Support for validation and verification testing (e.g. http://ksl.stanford.edu/projects/DAML/chimaera-jtp-cardinality-test1.daml )
• Configuration support
• Structured, “surgical” comparative customized search
• Generalization / specialization
• Query and answer analysis and refinement
See pedagogical wine agent example at: http://www.ksl.stanford.edu/people/dlm/webont/wineAgent/
KSL Wine Agent: Semantic Web Integration
• Agent receives an analysis and retrieval task description and uses emerging web standards to provide an answer description and return specific answers. (Given a meal description, describe the class(es) of matching wines and retrieve some from the web.)
• DAML+OIL / OWL for representing a domain ontology of foods, wines, their properties, and relationships between them
• JTP theorem prover for deriving appropriate pairings
• DQL for querying a knowledge base consisting of the above information
• Inference Web for explaining and validating answers (descriptions or instances)
• [Web Services for interfacing with vendors]
• Connections to online web agents/information services
• Utilities for conducting and caching the above transactions
• Info: http://www.ksl.stanford.edu/people/dlm/webont/wineAgent/
Implications and Needs for Ontology-enhanced applications
• Ontology language syntax and semantics (DAML+OIL, OWL)
• Upper level/core ontologies for reuse/extension (Cyc, SUMO, CNS coalition, DAML-S…)
• Environments for creation of ontologies (Protégé, Sandpiper, Construct, OilEd, …)
• Environments for maintenance of ontologies: evolution, diagnostics, merging, partitioning, views, versions (Chimaera, OntoBuilder, Prompt, …)
• Reasoning environments (Cerebra, Fact, JTP, Snark, …)
• Distributed explanation support facilitating trust (Inference Web)
• Surrounding tools: semantic search (TAP, FindUR, …), agent platforms
• Training (conceptual modeling, reasoning usage, tutorials: OWL Guide, Ontologies 101, OWL Tutorial, …)
Inference Web
Infrastructure for trust. Supports explanation of reasoning and retrieval tasks by storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments.
• DAML+OIL/OWL specification of proofs is the interlingua for proof interchange
• Proof browser for displaying IW proofs and their explanations (possibly from multiple inference engines)
• Registration for inference engines/rules/languages; pedigree
• Proof explainer for abstracting proofs into more understandable formats
• Proof generation service to facilitate the creation of IW proofs by inference engines
• Hosted service available, integrated with Stanford’s JTP reasoner and SRI’s SNARK reasoner. Integrated in DQL Client/Server, Wine Agent, …
• Discussions with Boeing, Cycorp, Fetch, ISI, Northwestern, SRI, UT, UW, W3C, …
DAML/OWL Language
• Extends vocabulary of XML and RDF/S
• Rich ontology representation language
• Language features chosen for efficient implementations
[Figure labels: Web Languages (RDF/S, XML); DAML-ONT, OIL, DAML+OIL, OWL; Formal Foundations (Description Logics, Frame Systems); FACT, CLASSIC, DLP, …]
Discussion/Conclusion
• Ontologies are exploding; core of many applications as seen at IF2003
• Business/govt. “pull” is driving ontology tools and languages
• New-generation applications need more expressive ontologies and more back-end reasoning
• User base is broader, so tools are providing support aimed at an audience larger than KR&R-trained people
• Distributed ontologies are motivating more supporting tools: merging, analysis, explanation support, incompleteness techniques, versioning, etc.
• Scale and distribution of the web force a mind shift (no longer monolithic single ontologies)
• Everyone is in the game: Government (DARPA, NSF, NIST, ARDA…), DSTO, EU, W3C, consortiums, business, …
• Consulting and product companies are in the space (not just academics)
Good time to bring ontologies into Info. Fusion in a larger way
A few US Govt. Programs
• DARPA:
  - DAML – DARPA Agent Markup Language
  - RKF – Rapid Knowledge Formation
  - HPKB – High Performance Knowledge Base
  - PBA – Predictive Battle Space Awareness
  - EPCA – Enduring Personalized Cognitive Assistant/PAL/CALO, KnowledgePad
• ARDA:
  - AQUAINT – Question Answering
  - NIMD – Novel Intelligence for Massive Data
Pointers
• Selected Papers:
  - McGuinness. Ontologies Come of Age, 2003.
  - Das, Wei, McGuinness. Industrial Strength Ontology Evolution Environments, 2002.
  - Kendall, Dutra, McGuinness. Towards a Commercial Strength Ontology Development Environment, 2002.
  - McGuinness. Description Logics Emerge from Ivory Towers, 2001.
  - McGuinness. Ontologies and Online Commerce, 2001.
  - McGuinness. Conceptual Modeling for Distributed Ontology Environments, 2000.
  - McGuinness, Fikes, Rice, Wilder. An Environment for Merging and Testing Large Ontologies, 2000.
  - Brachman, Borgida, McGuinness, Patel-Schneider. Knowledge Representation meets Reality, 1999.
  - McGuinness. Ontological Issues for Knowledge-Enhanced Search, 1998.
  - McGuinness and Wright. Conceptual Modeling for Configuration, 1998.
• Selected Tutorials:
  - Smith, Welty, McGuinness. OWL Web Ontology Language Guide, 2003.
  - Noy, McGuinness. Ontology Development 101: A Guide to Creating your First Ontology, 2001.
  - Brachman, McGuinness, Resnick, Borgida. How and When to Use a KL-ONE-like System, 1991.
• Languages, Environments, Software:
  - OWL: http://www.w3.org/TR/owl-features/ , http://www.w3.org/TR/owl-guide/
  - DAML+OIL: http://www.daml.org/
  - Inference Web: http://www.ksl.stanford.edu/software/iw/
  - Chimaera: http://www.ksl.stanford.edu/software/chimaera/
  - FindUR: http://www.research.att.com/people/~dlm/findur/
  - TAP: http://tap.stanford.edu/
  - DQL: http://www.ksl.stanford.edu/projects/dql/
Extras
General Nature of Descriptions
class: a WINE
superclass: a LIQUID, a POTABLE-THING
roles/properties:
  grape: chardonnay, ... [>= 1]
  sugar-content: dry, sweet, off-dry
  color: red, white, rose
  price: a PRICE
  winery: a WINERY
interconnections between parts:
  grape dictates color (modulo skin)
  harvest time and sugar are related
[Figure callouts: general categories; number/card restrictions; structured components; roles/properties; value restrictions; interconnections between parts]
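To make the callouts concrete, here is a rough sketch of checking one instance against the number and value restrictions in the WINE description. The dictionary encoding and the `violations` helper are assumptions made for illustration, not the notation of CLASSIC or any other system named in this deck. The check is also closed-world for simplicity: a description logic reasoner would normally treat a missing grape as unknown rather than as an error.

```python
# Hypothetical encoding of the WINE description from the slide.
WINE = {
    "superclasses": ["LIQUID", "POTABLE-THING"],      # general categories
    "min_cardinality": {"grape": 1},                  # number restriction [>= 1]
    "allowed_values": {                               # value restrictions
        "sugar-content": {"dry", "sweet", "off-dry"},
        "color": {"red", "white", "rose"},
    },
}


def violations(instance, description):
    """List the restrictions of `description` that `instance` fails (closed-world)."""
    problems = []
    for role, minimum in description["min_cardinality"].items():
        if len(instance.get(role, [])) < minimum:
            problems.append(f"{role}: needs at least {minimum} filler(s)")
    for role, allowed in description["allowed_values"].items():
        for value in instance.get(role, []):
            if value not in allowed:
                problems.append(f"{role}: {value!r} is not an allowed value")
    return problems


if __name__ == "__main__":
    good = {"grape": ["chardonnay"], "color": ["white"], "sugar-content": ["dry"]}
    bad = {"color": ["green"]}  # no grape, and a color outside the allowed set
    print(violations(good, WINE))
    print(violations(bad, WINE))
```

The interconnections between parts (for example, grape dictating color) would need rules relating two roles, which this sketch leaves out.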
A Few Observations about Ontologies
• Ontologies can be built by non-experts using COTS and academic tools
  - Verity’s Topic Editor, Constructor, Collaborative Topic Builder, GFP, Chimaera, Protégé, OIL-ED, etc.
• Ontologies can be semi-automatically generated
  - from crawls of sites such as Yahoo!, Amazon, Excite, etc.
  - Semi-structured sites can provide starting points
• Ontologies are exploding (business pull instead of technology push)
  - e-commerce: MySimon, Amazon, Yahoo! Shopping, VerticalNet, …
  - Controlled vocabularies (for the web) abound: SIC codes, UMLS, UNSPSC, Open Directory (DMOZ), Rosetta Net, SUMO
  - Business interest expanding: ontology directors, business ontologies are becoming more complicated (roles, value restrictions, …), VC firms interested (Vulcan’s HALO project)
  - Markup languages growing: XML, RDF, DAML, OWL, RuleML, xxML
• “Real” ontologies are becoming more central to applications
  - Search companies moving towards them: Yahoo, recently Google
Processing
• Given a description of a meal:
  - Use DQL to state a premise (the meal) and query the knowledge base for a suggestion for a wine description or set of instances
  - Use the JTP theorem prover to deduce answers (and proofs)
  - Use Inference Web to explain results (descriptions, instances, provenance, reasoning engines, etc.)
  - Access relevant web sites (wine.com, wine commune, …) to obtain current information
  - Use DAML-S for markup and protocol*
http://www.ksl.stanford.edu/projects/wine/explanation.html
Querying multiple online sources
FindUR Architecture
[Figure: architecture diagram; recoverable labels follow]
• Content to search: Research Site; Technical Memorandum; Calendars (Summit 2005, Research); Yellow Pages (Directory Westfield); Newspapers (Leader); Internal Sites (Rapid Prototyping); AT&T Solutions; Worldnet Customer Care; Medical Information
• Other boxes: CLASSIC Knowledge Representation System; Content (web pages or databases); Content Classification; Domain Knowledge
• Search technology: Search Engine; Verity (and topic sets); GUI supporting browsing and selection; Collaborative Topic Set Tool
• User interface: Verity SearchScript, Javascript, HTML, CGI, CLASSIC
• Results (standard format); Results (domain specific)
<rdfs:Class rdf:ID="BLAND-FISH-COURSE">
  <daml:intersectionOf rdf:parseType="daml:collection">
    <rdfs:Class rdf:about="#MEAL-COURSE"/>
    <daml:Restriction>
      <daml:onProperty rdf:resource="#FOOD"/>
      <daml:toClass rdf:resource="#BLAND-FISH"/>
    </daml:Restriction>
  </daml:intersectionOf>
  <rdfs:subClassOf rdf:resource="#DRINK-HAS-DELICATE-FLAVOR-RESTRICTION"/>
</rdfs:Class>
<rdfs:Class rdf:ID="BLAND-FISH">
  <rdfs:subClassOf rdf:resource="#FISH"/>
  <daml:disjointWith rdf:resource="#NON-BLAND-FISH"/>
</rdfs:Class>
<rdf:Description rdf:ID="FLOUNDER">
  <rdf:type rdf:resource="#BLAND-FISH"/>
</rdf:Description>
<rdfs:Class rdf:ID="CHARDONNAY">
  <rdfs:subClassOf rdf:resource="#WHITE-COLOR-RESTRICTION"/>
  <rdfs:subClassOf rdf:resource="#MEDIUM-OR-FULL-BODY-RESTRICTION"/>
  <rdfs:subClassOf rdf:resource="#MODERATE-OR-STRONG-FLAVOR-RESTRICTION"/>
  […]
</rdfs:Class>
<rdf:Description rdf:ID="BANCROFT-CHARDONNAY">
  <rdf:type rdf:resource="#CHARDONNAY"/>
  <REGION rdf:resource="#NAPA"/>
  <MAKER rdf:resource="#BANCROFT"/>
  <SUGAR rdf:resource="#DRY"/>
  […]
</rdf:Description>
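The simplest inference this fragment supports is type inheritance: an individual belongs to its asserted class and to every superclass reachable through `rdfs:subClassOf`. The sketch below hand-copies only the subclass and type statements shown above into plain Python dictionaries. The `all_types` helper is a hypothetical illustration and does not reflect how JTP or any other reasoner mentioned in this deck works internally; it also ignores the `daml:intersectionOf` definition and the elided `[…]` content.

```python
# Subclass and type assertions hand-copied from the DAML fragment above.
SUBCLASS_OF = {
    "BLAND-FISH-COURSE": ["DRINK-HAS-DELICATE-FLAVOR-RESTRICTION"],
    "BLAND-FISH": ["FISH"],
    "CHARDONNAY": [
        "WHITE-COLOR-RESTRICTION",
        "MEDIUM-OR-FULL-BODY-RESTRICTION",
        "MODERATE-OR-STRONG-FLAVOR-RESTRICTION",
    ],
}
TYPE_OF = {
    "FLOUNDER": ["BLAND-FISH"],
    "BANCROFT-CHARDONNAY": ["CHARDONNAY"],
}


def all_types(individual):
    """Return the asserted classes of `individual` plus all inherited superclasses."""
    result = set()
    stack = list(TYPE_OF.get(individual, []))
    while stack:
        cls = stack.pop()
        if cls not in result:  # each class is visited once, so cycles terminate
            result.add(cls)
            stack.extend(SUBCLASS_OF.get(cls, []))
    return result


if __name__ == "__main__":
    print(sorted(all_types("FLOUNDER")))
    print(sorted(all_types("BANCROFT-CHARDONNAY")))
```

Recognizing that a meal course whose food is FLOUNDER is a BLAND-FISH-COURSE requires reasoning over the intersection and the `daml:toClass` restriction, which is the part a full reasoner adds.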
Issues
• Collaboration among distributed teams
• Interconnectivity with many systems/standards
• Analysis and diagnosis
• Scale
• Versioning
• Security
• Ease of use
• Diverse training levels / user support
• Presentation style
• Lifecycle
• Extensibility
Services Ontologies
DAML-S: http://www.daml.org/services/
• publication references
• ontology specifications
• examples
A few interesting projects using DAML-S:
• MyGrid (http://mygrid.man.ac.uk)
• AgentCities (http://www.agentcities.org)
• Services composer (http://www.mindswap.org/~evren/composer/)
General Nature of Descriptions
a WINE
a LIQUID
a POTABLE
  grape: chardonnay, ... [>= 1]
  sugar-content: dry, sweet, off-dry
  color: red, white, rose
  price: a PRICE
  winery: a WINERY
grape dictates color (modulo skin)
harvest time and sugar are related
[Figure callouts: general categories; structured components; interconnections between parts]
SUMO
• Available in KIF (first order logic), DAML, LOOM and XML
• May be used without fee for any purpose (including for profit)
• Mapped by hand to 100,000 synsets of the WordNet lexicon
• Validated with formal theorem proving
• 52 publicly released versions created over two years (approximately 1,000 concepts, 4,000 assertions, and 750 rules so far)
• Specialized with dozens of free domain ontologies
• In use by companies, universities and government around the world
  - Academia Sinica (Taiwan), U Arizona, lookwayup.com, NIST, etc.
• Available at http://ontology.teknowledge.com
Chimaera – An Ontology Environment Tool
• An interactive web-based tool aimed at supporting:
  - Ontology analysis (correctness, completeness, style, …)
  - Merging of ontological terms from varied sources
  - Maintaining ontologies over time
  - Validation of input
• Features: multiple I/O languages, loading and merging into multiple namespaces, collaborative distributed environment support, integrated browsing/editing environment, extensible diagnostic rule language
• Used in commercial and academic environments; used in HORUS to support counter-terrorism ontology generation
• Available as a hosted service from www-ksl-svc.stanford.edu
• Information: www.ksl.stanford.edu/software/chimaera
Layer Cake Foundation
Some Pointers
• Ontologies Come of Age paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
• Ontologies and Online Commerce paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-and-online-commerce-abstract.html
• DAML+OIL: http://www.daml.org/
• WEBONT: http://www.w3.org/2001/sw/WebOnt/
• OWL: http://www.w3.org/TR/owl-features/
E-Commerce Search (starting point Forrester Research, modified by McGuinness)
• Ask Queries
  - multiple search interfaces (surgical shoppers, advice seekers, window shoppers)
  - set user expectations (interactive query refinement)
  - anticipate anomalies
• Get Answers
  - basic information (multiple sorts, filtering, structuring)
  - modify results (user-defined parameters for refining, user profile info, narrow query, broaden query, disambiguate query)
  - suggest alternatives (suggest other comparable products, even from competitors’ sites)
• Make Decisions
  - manipulate results (enable side-by-side comparison)
  - dive deeper (provide additional info, multimedia, other views)
  - take action (buy)
The Need For KB Analysis
• Large-scale knowledge repositories will necessarily contain KBs produced by multiple authors in multiple settings
• KBs for applications will typically be built by assembling and extending multiple modular KBs from repositories that may not be consistent
• KBs developed by multiple authors will frequently
  - Express overlapping knowledge in different, possibly contradictory ways
  - Use differing assumptions and styles
• For such KBs to be used as building blocks, they must be reviewed for appropriateness and “correctness”
  - That is, they must be analyzed
Our KB Analysis Task
• Review KBs that:
  - Were developed using differing standards
  - May be syntactically but not semantically validated
  - May use differing modeling representations
• Produce KB logs (in interactive environments)
• Identify provable problems
• Suggest possible problems in style and/or modeling
• Be extensible by being user programmable
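One kind of provable problem is an individual asserted to belong to two classes that are declared disjoint. The following sketch shows such a check under stated assumptions: the data layout, the `disjointness_problems` helper, and the SWORDFISH individual are all hypothetical, and this is not Chimaera's diagnostic rule language.

```python
def disjointness_problems(types, disjoint_pairs):
    """Return (individual, class_a, class_b) for every disjointness violation."""
    problems = []
    for individual, classes in sorted(types.items()):
        for class_a, class_b in disjoint_pairs:
            if class_a in classes and class_b in classes:
                problems.append((individual, class_a, class_b))
    return problems


# Toy merged KB: SWORDFISH is a made-up individual that two hypothetical
# authors classified in contradictory ways.
TYPES = {
    "FLOUNDER": {"BLAND-FISH"},
    "SWORDFISH": {"BLAND-FISH", "NON-BLAND-FISH"},
}
DISJOINT = [("BLAND-FISH", "NON-BLAND-FISH")]

if __name__ == "__main__":
    for individual, class_a, class_b in disjointness_problems(TYPES, DISJOINT):
        print(f"{individual} is in both {class_a} and {class_b}, which are disjoint")
```

Style and modeling suggestions, as opposed to provable problems, would be softer heuristics layered on top of checks like this one.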