Explanation: The Next Phase in Question Answering

Presentation Transcript


  1. Explanation: The Next Phase in Question Answering Deborah L. McGuinness Knowledge Systems Laboratory Stanford University http://www.ksl.stanford.edu dlm@ksl.stanford.edu

  2. Outline • Motivation – Question Answering Systems Need to Provide Justifiable Answers • Explanation is a Necessary Component for Trust • Explanation requirements as gathered from DARPA, ARDA, academic, and commercial needs • Inference Web introduction: An Explanation Infrastructure for the Semantic Web (work with Pinheiro da Silva) • Registry • Portable Proofs for Interoperability • Explainer • Browser • Conclusion McGuinness 2003

  3. Motivation - TRUST If users (humans and agents) are to use and integrate system answers, they must trust them. System transparency supports understanding and trust. Even simple “lookup” systems should be able to provide information about their sources. As Question Answering systems become more complex, they may incorporate multiple hybrid information sources, multiple information manipulation techniques, integration of reasoners, conflict resolution strategies, prioritization, assumptions, etc., all of which may need explanation. Thus, systems should be able to explain their actions, sources, and beliefs. McGuinness 2003

  4. Requirements – Knowledge Provenance • Source name (CIA World Fact Book) • Author of original information • Date of original information and any updates • Authoritativeness of Source (is this considered reliable or certified reliable by some third party) • Degree of belief • Degree of completeness (can the closed world assumption be made for inference?) • Term or phrase meaning (in natural language or formal language) • Term inter-relationships (ontological relations including subclass, superclass, part-of, etc.) McGuinness 2003

  5. Requirements – Reasoning Information • Reasoner used, authors, version #, etc. • Reasoning method (tableaux, model elimination, etc.) • Inference rules supported by reasoner • Reasoner soundness and completeness • Reasoner assumptions (closed world, open, unique names, etc.) • Detailed trace of inference rules applied (with appropriate variable bindings to provide conclusion) • Term coherence • Assumptions used in derivation • Source consistency (is there support for A and not A) • Support for alternative reasoning paths to a single conclusion McGuinness 2003
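  To make the requirements on slides 4 and 5 concrete, here is a minimal sketch in Python of the provenance and reasoning metadata a question-answering system might attach to a single answer. The field names and values are illustrative assumptions, not taken from any Inference Web specification.

    # Illustrative only: field names and values are hypothetical.
    answer = {
        "conclusion": "(capitalOf WashingtonDC USA)",
        "provenance": {
            "source_name": "CIA World Fact Book",
            "original_author": "CIA",
            "original_date": "2002-01-15",
            "last_update": "2003-03-01",
            "authoritative": True,        # certified reliable by a third party?
            "degree_of_belief": 0.95,
            "closed_world": False,        # can completeness be assumed for inference?
        },
        "reasoning": {
            "reasoner": {"name": "JTP", "version": "unknown"},
            "method": "model elimination",
            "assumptions": ["open world", "unique names"],
            "inference_rules_applied": ["Modus Ponens"],
        },
    }

    # A consumer (human or agent) can inspect this before deciding to trust the answer.
    trusted = (answer["provenance"]["authoritative"]
               and answer["provenance"]["degree_of_belief"] > 0.9)
    print("accept answer?", trusted)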

  6. Reqs - Presentation • Presentation needs to be manageable (thus stand-alone fragments are required) • Fragments need to be stand-alone • Proofs need to be pruned • Support for proof and explanation navigation • Web browser compatibility • Follow-up question support • Alternative justifications should be available McGuinness 2003

  7. Reqs – Distribution and Interoperability • Explanations must work in heterogeneous environments • Must interoperate on the web • Representations must be portable, shareable, and combinable • Proof interlingua required • Proof/explanation presentation - presentation should have manageable (small) portions that are meaningful alone (without the context of an entire proof); users should be supported in asking for explanations and follow-up questions; users should get automatic and customized proof pruning, a web browsing option, multiple formats, customizability, etc. McGuinness 2003

  8. Requirements – Explanation Generation • Provide abbreviated description of information manipulation path • Provide machine- and user-understandable descriptions (may require use of a formal language such as DAML+OIL, OWL, RDF) • Machine-understandable representation of information manipulations (axioms such as the FOL semantics for DAML+OIL (Fikes & McGuinness)) • Description of rewrite rules for abstraction McGuinness 2003

  9. Inference Web Framework for explaining reasoning tasks by storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by reasoners. • DAML+OIL/OWL specification of proofs is an interlingua for proof interchange • Proof browser for displaying IW proofs and their explanations (possibly from multiple inference engines) • Registration for inference engines/rules/languages • Proof explainer for abstracting proofs into more understandable formats • Proof generation service to facilitate the creation of IW proofs by inference engines • Prototype implementation with Stanford’s JTP reasoner and SRI’s SNARK reasoner • Integrated with DQL and JTP in a few web agents for demonstrations • Discussions with Boeing, Cycorp, Fetch, ISI, Northwestern, SRI, UT, UW, W3C, … Collaborative work with Pinheiro da Silva McGuinness 2003

  10. IW Registry and Registrar • IW Registry has meta-data useful for disclosing data provenance and reasoning information, such as descriptions of: • inference engines along with their supported inference rules • information sources such as organizations, publications, and ontologies • languages along with their axioms • The Registry is managed by the IW Registrar McGuinness 2003

  11. Inference Engine Registration (1) • Engine registration involves the creation of an engine entry and its association with entries of inference rules • Rule entries can be either reused or added to the registry • An entry for SRI’s SNARK engine • An entry for SNARK’s Binary Resolution inference rule McGuinness 2003
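  As a rough illustration of what such entries might contain, the Python sketch below models an engine entry for SNARK and a rule entry for Binary Resolution as small metadata records that reference each other. The field names are hypothetical assumptions; the actual registry schema is defined by the IW specification, not here.

    # Hypothetical sketch of registry entries; field names are illustrative,
    # not the actual Inference Web registry schema.
    binary_resolution = {
        "id": "SNARK_BinaryResolution",
        "kind": "InferenceRule",
        "name": "Binary Resolution",
        "description": "Resolves two clauses on a complementary pair of literals.",
    }

    snark_entry = {
        "id": "SNARK",
        "kind": "InferenceEngine",
        "name": "SNARK",
        "organization": "SRI",
        "supported_rules": [binary_resolution["id"]],   # association with rule entries
    }

    # The Registrar would manage a collection of such entries.
    registry = {entry["id"]: entry for entry in (binary_resolution, snark_entry)}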

  12. Inference Engine Registration (2) • Otter’s binary resolution, hyper-resolution, and paramodulation rules were reused for the registration of SNARK • Assumption and negated-conclusion rules were added for SNARK McGuinness 2003

  13. Inference Engine Registration (3) Summarizing the Inference Engine Registration process: • Use the registry to include meta-information about the engine and its rules • Add an entry for the new inference engine • Identify the core inference rules supported by the engine • Add unregistered core inference rules, if any • Associate the core rules with the inference engine • Prepare the engine to dump proofs in the IW format • Implement a routine for calling the proof generator service • Example routines in Java and Lisp can be provided • Publish successful results of the proof generator service in portable proof format (OWL/DAML/RDF/XML compliant files) • Browse your proofs in the IW Browser McGuinness 2003

  14. Inference Web Architecture [Architecture diagram: web agents, reasoner agents, inference engines, IW Browsers, Registrars, registry entries, web documents, non-IW documents, and proof fragments on the World Wide Web; the legend distinguishes document-maintenance, agent-dependency, and URL-reference links] McGuinness 2003

  15. Generation of IW proofs [Diagram: a reasoner interacting with the proof generator service, the Registry, and the Registrar over the WWW] (1) The reasoner sends node information: reasoner ID, labeling sentence in KIF, rule ID, antecedent URIs, bindings, and source ID (2) The proof generator service verifies the information against the Registry (the Registrar can collect statistics, provide feedback, …) (3) The service returns proof fragments (4) The reasoner publishes the proof fragments McGuinness 2003
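  A minimal sketch of step (1) in Python, assuming a hypothetical HTTP endpoint and JSON payload for the proof generator service; only the field names of the node information come from the slide, and the example routines actually provided with IW were in Java and Lisp. This is an illustration of the idea, not the service's real interface.

    import json
    import urllib.request

    # Hypothetical endpoint; the real proof generator service URL and wire
    # format are not given on the slide.
    PROOF_GENERATOR_URL = "http://example.org/iw/proof-generator"

    def send_node_information(reasoner_id, kif_sentence, rule_id,
                              antecedent_uris, bindings, source_id):
        """Step (1): send one proof node's information to the proof generator
        service, which verifies it against the Registry (steps 2-3) and returns
        a proof fragment that the reasoner can then publish (step 4)."""
        node = {
            "reasoner": reasoner_id,
            "sentence": kif_sentence,          # conclusion, labeled in KIF
            "rule": rule_id,
            "antecedents": antecedent_uris,    # URIs of supporting proof nodes
            "bindings": bindings,
            "source": source_id,
        }
        request = urllib.request.Request(
            PROOF_GENERATOR_URL,
            data=json.dumps(node).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return response.read().decode("utf-8")   # returned proof fragment

    # Example call (illustrative values only):
    # fragment = send_node_information(
    #     "JTP", "(type TonysSoftShell SHELLFISH)", "GeneralizedModusPonens",
    #     ["http://example.org/proofs/node17"], {"?x": "TonysSoftShell"}, "WinesKB")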

  16. Portable Proofs • Proof Interlingua • Written in DAML+OIL (soon to be OWL) • Question Answering systems dump proofs in this format • http://www.ksl.stanford.edu/software/IW/spec/ McGuinness 2003
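  The actual vocabulary is defined by the specification at the URL above; the sketch below uses invented placeholder class and property names to illustrate the general idea that a dumped proof node is ordinary RDF (conclusion sentence, rule used, antecedent links, source engine), so fragments from different reasoners can be published and combined on the web.

    # Schematic only: the namespace and class/property names are invented
    # placeholders, not the actual IW proof vocabulary (see the spec URL above).
    from rdflib import Graph, Literal, Namespace, RDF, URIRef

    EX = Namespace("http://example.org/iw-sketch#")

    g = Graph()
    node = URIRef("http://example.org/proofs/node42")
    g.add((node, RDF.type, EX.ProofNode))
    g.add((node, EX.conclusion, Literal("(type TonysSoftShell SHELLFISH)")))
    g.add((node, EX.inferenceRule, EX.GeneralizedModusPonens))
    g.add((node, EX.antecedent, URIRef("http://example.org/proofs/node17")))
    g.add((node, EX.inferenceEngine, EX.JTP))

    # Because the proof fragment is plain RDF, it can be published on the web
    # and merged with fragments produced by other reasoners.
    print(g.serialize(format="xml"))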

  17. McGuinness 2003

  18. Proofs and Explanations • Proofs can be displayed using the browser • Rewriting rules may be used to abstract proofs into more manageable explanations • Rewriting rules may leverage information about language axioms such as the DAML+OIL axiom set McGuinness 2003
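  As a rough sketch of how such a rewriting rule might work, the Python below collapses a chain of low-level subclass-transitivity steps into one readable explanation step. The rule and step names are our own illustration; the actual IW rewrites are driven by registered language axioms such as the DAML+OIL axiom set.

    # Illustrative sketch: collapse a chain of subclass-transitivity steps
    # into a single, more readable explanation step. Names are hypothetical.
    def abstract_subclass_chain(proof_steps):
        """Rewrite consecutive 'subclass-transitivity' steps into one step."""
        abstracted, chain = [], []
        for step in proof_steps:
            if step["rule"] == "subclass-transitivity":
                chain.append(step)
            else:
                if chain:
                    abstracted.append(collapse(chain))
                    chain = []
                abstracted.append(step)
        if chain:
            abstracted.append(collapse(chain))
        return abstracted

    def collapse(chain):
        first, last = chain[0], chain[-1]
        return {
            "rule": "subclass-chain",
            "sub": first["sub"], "sup": last["sup"],
            "explanation": f"{first['sub']} is a kind of {last['sup']} "
                           f"(through {len(chain)} subclass links)",
        }

    steps = [
        {"rule": "subclass-transitivity", "sub": "CRAB", "sup": "SHELLFISH"},
        {"rule": "subclass-transitivity", "sub": "SHELLFISH", "sup": "SEAFOOD"},
    ]
    print(abstract_subclass_chain(steps)[0]["explanation"])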

  19. Wine Agent Example McGuinness 2003

  20. McGuinness 2003

  21. McGuinness 2003

  22. McGuinness 2003

  23. Conclusion • Proof specification ready for feedback/use http://www.ksl.stanford.edu/software/iw/ • Proof browser prototype operational and expanding • Recent: ground axiom collection, source doc/ontology collection, aggregation view • Current: multiple formats, simplification, pruning, … • Registration service expansion - integration with XML database, use in PAL, registration of services (with Fetch) • Inference engine integration work: JTP functional, SNARK mostly done, KM under investigation • Integration with web services – current: KSL Wine Agent, KSL DQL client (NIMD implementation); beginning with registration of web services (Fetch) • Documentation – more examples, etc. • More comments solicited (thanks to date for comments from Berners-Lee, Chalupsky, Chaudhri, Clark, Connolly, Forbus, Hawke, Hayes, Lenat, Murray, Porter, Reed, Waldinger, …) McGuinness 2003

  24. McGuinness 2003

  25. Technical Infrastructure Reqs • Provenance information - explain where source information came from: source name, date and author of last update, author(s) of original information, trustworthiness rating, etc. • Reasoning information - explain where derived information came from: the reasoner used, reasoning method, inference rules, assumptions, etc. • Explanation generation – provide abbreviated descriptions of the proof – may include reliance on a description of the representation language (e.g., DAML+OIL, OWL, RDF, …), axioms capturing the semantics, rewriting rules based on axioms, other abstraction techniques, etc. • Distributed web-based deployment of proofs - build proofs that are portable, sharable, and combinable and that may be published on multiple clients; the registry is web-available and potentially distributed, … • Proof/explanation presentation - presentation should have manageable (small) portions that are meaningful alone (without the context of an entire proof); users should be supported in asking for explanations and follow-up questions; users should get automatic and customized proof pruning, a web browsing option, multiple formats, customizability, etc. McGuinness 2003

  26. Architecture McGuinness 2003

  27. Integration with SNARK • Done by non-SNARK author to test strategies for integration • Tests alternative reasoning strategy – proof by contradiction • No special modifications made, as a test of leverage • Learned some new requirements (CNF processing, reasoning modes may be useful, …) • Initial integration fairly easy • More complete integration in progress McGuinness 2003

  28. SNARK Example: nuclear threats “Weapons-grade nuclear material may be derived from uranium ore if refining technology is available, or it may be acquired from a black market source. Foobarstan is known to have either uranium ore or a black market source, but not both. Foobarstan will build a nuclear warhead if and only if it can obtain nuclear material, a detonator, and the bomb casing. A warhead and a missile, or a warhead and a truck, constitute a nuclear threat. Foobarstan has either a missile or a truck.” QUESTION: Is Foobarstan a nuclear threat? (1) ore ∧ refiner → material (2) black-mkt → material (3) black-mkt ∨ ore (4) ¬(black-mkt ∧ ore) (5) material ∧ detonator ∧ casing → warhead (6) warhead → material (7) warhead → detonator (8) warhead → casing (9) warhead ∧ missile → nuke (10) warhead ∧ truck → nuke (11) missile ∨ truck McGuinness 2003
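  To make the proof-by-contradiction strategy on the next slide concrete, here is a sketch of how the first few binary resolution steps could go after the negated conclusion ¬nuke is added, using the clause forms of (9)–(11). This is our illustration, not the actual SNARK trace:

    ¬nuke resolved with (9) ¬warhead ∨ ¬missile ∨ nuke   gives  ¬warhead ∨ ¬missile
    ¬nuke resolved with (10) ¬warhead ∨ ¬truck ∨ nuke    gives  ¬warhead ∨ ¬truck
    ¬warhead ∨ ¬missile resolved with (11) missile ∨ truck  gives  ¬warhead ∨ truck
    ¬warhead ∨ truck resolved with ¬warhead ∨ ¬truck     gives  ¬warhead

  The refutation then continues through clauses (1)–(8), as shown in the proof-by-contradiction and proof-tree slides that follow.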

  29. Example: proof by contradiction McGuinness 2003

  30. Example: a proof tree McGuinness 2003

  31. An example in FOL McGuinness 2003

  32. Registering SNARK: next steps • Add support for ‘source’ and ‘author’ fields • Match with IW-registered ontologies where possible • Standardize treatment of SNARK rewrites • When do rewrites correspond to resolution, hyperresolution, paramodulation? • Utilize SNARK rewrites for IW abstraction strategies • Consider tableaux approaches for explanation • Implement correct handling of SNARK procedural attachments • SNARK includes procedural attachments for math, lists • User can define new procedural attachments on the fly • This constitutes an inference rule with an open-ended definition • Track variable bindings through course of proof • Integrate IW interface into SNARK standard release McGuinness 2003

  33. Extra McGuinness 2003

  34. Proof browsing: an example (1) • Tools can be used for browsing IW proofs. The following example demonstrates the use of the IW Browser to visualize, navigate, and ask follow-up questions. • Let's assume a Wines ontology: • Determining the type of a concept or instance is a typical problem on the Semantic Web. A reasoner may either ask about the type of an object or ask whether an object is of a particular type. Example DAML KB:
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <rdf:Description rdf:ID="TonysSoftShell">
      <rdf:type rdf:resource="#CRAB"/>
    </rdf:Description>
    <rdfs:Class rdf:ID="CRAB">
      <rdfs:subClassOf rdf:resource="#SHELLFISH"/>
    </rdfs:Class>
    <rdfs:Class rdf:ID="SHELLFISH">
      <rdfs:subClassOf rdf:resource="#SEAFOOD"/>
    </rdfs:Class>
  </rdf:RDF>
  Example Query: (rdf:type TonysSoftShell ?X) McGuinness 2003
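  A small Python sketch, using the rdflib library, of the derivation a browser would display for this query: TonysSoftShell is a CRAB by assertion, and a SHELLFISH and a SEAFOOD by following rdfs:subClassOf. The base URI and helper code are our own assumptions, not part of the IW Browser.

    # Sketch: derive answers to (rdf:type TonysSoftShell ?X) from the KB above
    # by following rdfs:subClassOf links.
    from rdflib import Graph, RDF, RDFS, URIRef

    KB = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
      <rdf:Description rdf:ID="TonysSoftShell">
        <rdf:type rdf:resource="#CRAB"/>
      </rdf:Description>
      <rdfs:Class rdf:ID="CRAB">
        <rdfs:subClassOf rdf:resource="#SHELLFISH"/>
      </rdfs:Class>
      <rdfs:Class rdf:ID="SHELLFISH">
        <rdfs:subClassOf rdf:resource="#SEAFOOD"/>
      </rdfs:Class>
    </rdf:RDF>"""

    g = Graph()
    g.parse(data=KB, format="xml", publicID="http://example.org/wines")

    tony = URIRef("http://example.org/wines#TonysSoftShell")
    types = set(g.objects(tony, RDF.type))          # asserted: CRAB
    frontier = list(types)
    while frontier:                                  # rdfs:subClassOf closure
        cls = frontier.pop()
        for sup in g.objects(cls, RDFS.subClassOf):
            if sup not in types:
                types.add(sup)
                frontier.append(sup)

    for t in sorted(types):
        print(t)   # ...#CRAB, ...#SHELLFISH, ...#SEAFOOD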

  35. Proof browsing: An example (2) • Browsers can display portions of proofs. • By selecting premises, users can navigate throughout proof trees. McGuinness 2003

  36. Trust Disclosure • IW proofs can be used: • to provide provenance for “lookup” information • to display (distributed) deduction justifications • to display inference rule static information McGuinness 2003

  37. Technical Requirements • annotate information with meta information such as source, date, author, … at appropriate granularity level (per KB, per term, …) • explain where source information is from • explain where derived information came from • prune information and explanations for presentation (utilizing user context and information context for presentation) • provide a query language capable of expressing user requests along with filtering restrictions • provide a ubiquitous source annotation language • provide a ubiquitous proof language for interchange • Compare answers • propagate meta information appropriately (if I got something from a source I consider trusted and you consider me a trusted source, you may want to consider my source trusted as well) • Identify multiple (or unknown) truth values McGuinness 2003
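  The meta-information propagation point can be made concrete with a tiny sketch, illustrative only: if you trust me and I trust a source, you may tentatively extend trust to that source, perhaps discounting the degree of belief at each hop. Agent names, degrees, and the discounting scheme below are all hypothetical.

    # Illustrative sketch of propagating trust along "trusts" links, discounting
    # the degree of belief at each hop. Names and numbers are hypothetical.
    trusts = {              # direct trust degrees, 0..1
        "you": {"me": 0.9},
        "me": {"CIA World Fact Book": 0.8},
    }

    def propagated_trust(agent, source, discount=1.0, seen=()):
        """Best trust degree `agent` can derive for `source` via trusted agents."""
        direct = trusts.get(agent, {}).get(source)
        best = direct * discount if direct is not None else 0.0
        for intermediary, degree in trusts.get(agent, {}).items():
            if intermediary not in seen and intermediary != source:
                best = max(best, propagated_trust(intermediary, source,
                                                  discount * degree,
                                                  seen + (agent,)))
        return best

    # You trust me (0.9) and I trust the Fact Book (0.8), so you might
    # consider the Fact Book trusted to degree about 0.72.
    print(round(propagated_trust("you", "CIA World Fact Book"), 2))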
