Increasing Trust in Answers from Intelligence Applications: the Inference Web Approach

Deborah McGuinness, Co-Director and Senior Research Scientist, Knowledge Systems Laboratory, Stanford University. dlm@ksl.stanford.edu http://www.ksl.stanford.edu/people/dlm Inference Web is joint work with Pinheiro da Silva, Fikes, Chang, Deshwal, Narayanan, Glass, Makarios, Jenkins, Millar, Ding, …

Semantic Web Layers Ontology Level • Languages (CLASSIC, DAML-ONT, DAML+OIL, OWL, …) • Environments (FindUR, Chimaera, OntoBuilder/Server, Sandpiper Tools, …) • Standards (NAPLPS, …, W3C’s WebOnt, W3C’s Semantic Web Best Practices, EU/US Joint Committee, OMG ODM, …) Rules • SWRL (previously CLASSIC Rules, explanation environment, extensibility issues, contracts, …) Logic • Description Logics Proof • PML, Inference Web Services and Infrastructure Trust • IWTrust, NSF with W3C/MIT http://www.w3.org/2004/Talks/0412-RDF-functions/slide4-0.html

Motivation – Trust and Understanding If users (humans and agents) are to use, reuse, and integrate system answers, they must trust them. System transparency supports understanding and trust. Even simple “lookup” systems benefit from providing information about their sources. Systems that manipulate information (with sound deduction or potentially unsound heuristics) benefit from providing information about their manipulations. Goal: Provide interoperable infrastructure that supports explanations of sources, assumptions, and answers as an enabler for trust.

Requirements gathered from… • DARPA Agent Markup Language (DAML): Enable the next generation of the web • DARPA Personal Assistant that Learns (PAL): Enable computer systems that can reason, learn, be told what to do, explain what they are doing, reflect on their experience, & respond robustly to surprise • DARPA Rapid Knowledge Formation (RKF): Allow distributed teams of subject matter experts to quickly and easily build, maintain, and use knowledge bases without need for specialized training • DARPA High Performance Knowledge Base (HPKB): Advance the technology of how computers acquire, represent & manipulate knowledge • ARDA Novel Intelligence for Massive Data (NIMD): Avoid strategic surprise by helping analysts be more effective (focus attention on critical information and help analyze/prune/refine/explain/reuse/…) • ARDA Advanced Question & Answering for Intelligence (AQUAINT): Advance QA against structured and unstructured info • Consulting, including search, ecommerce, configuration, …

Requirements Information Manipulation Traces • hybrid, distributed, portable, shareable, combinable encoding of proof fragments supporting multiple justifications Presentation • multiple display formats supporting browsing, visualization, summaries, … Abstraction • understandable summaries Interaction • multi-modal mixed initiative options including natural-language and GUI dialogues, adaptive, context-sensitive interaction Trust • source and reasoning provenance, automated trust inference [McGuinness & Pinheiro da Silva, ISWC 2003, Journal of Web Semantics 2004]

Selected History • Historical explanation research motivated by explaining theorem provers in practice • Web version originally aimed at explaining hybrid (FOL / special purpose) reasoners in a distributed environment like the web. • User demand drove focus on provenance extensions • Current web environment and programs, such as NIMD, drove connections with extraction engines • Current view: Any question answering system can be viewed as some kind of information manipulator that may benefit from and/or require explanation

Inference Web * Framework for explaining question answering tasks by abstracting, storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by question answerers • IW’s Proof Markup Language (PML) is an interlingua for proof interchange. It is written in W3C’s recommended Ontology web language (OWL) • IWBase is a distributed repository of meta-information related to proofs and their explanations • IW Registration services provide support for proof generation and checking • IW Browser provides display capabilities for PML documents containing proofs and explanations (possibly from multiple inference engines) • IW Abstractor provides rewriting capabilities enabling more understandable presentations • IW Explainer provides multi-modal dialogue options including alternative strategies for presenting explanations, visualizations, and summaries *Work with Pinheiro da Silva
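As a rough illustration of what "combining proof fragments" with "multiple justifications" could look like, the sketch below uses a simplified in-memory model. It is not PML's actual OWL vocabulary: the class names `NodeSet` and `InferenceStep`, their fields, and the `combine` helper are assumptions made for this example, and the engine names are borrowed from elsewhere in this deck purely as labels.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class InferenceStep:
    """One justification for a conclusion (hypothetical, simplified shape)."""
    rule: str                          # name of the inference rule applied
    engine: str                        # which question answerer produced it
    antecedents: tuple[str, ...] = ()  # conclusions this step depends on


@dataclass
class NodeSet:
    """A conclusion together with every known justification for it."""
    conclusion: str
    steps: list[InferenceStep] = field(default_factory=list)


def combine(fragments: list[NodeSet]) -> dict[str, NodeSet]:
    """Merge fragments: node sets with the same conclusion pool their steps."""
    merged: dict[str, NodeSet] = {}
    for node in fragments:
        target = merged.setdefault(node.conclusion, NodeSet(node.conclusion))
        for step in node.steps:
            if step not in target.steps:  # dataclass equality drops duplicates
                target.steps.append(step)
    return merged


# Two engines independently justify the same conclusion.
fragment_1 = NodeSet("A^B", [InferenceStep("^I", "JTP", ("A", "B"))])
fragment_2 = NodeSet("A^B", [InferenceStep("MP", "SNARK", ("A", "A->(A^B)"))])

combined = combine([fragment_1, fragment_2])
print(len(combined["A^B"].steps))  # 2
```

Keying on the conclusion is one possible design choice; it lets a browser or explainer offer alternative justifications for the same answer side by side.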

Explainable System Structure [Figure: layered architecture diagram with components Explanation, Trust, Interaction, Presentation, Abstraction, Proof Interlingua (PML), InferenceML, Information Manipulation Data, Source Provenance Data, Inference Rule Specs]

Registry Information IWBase has core and domain-specific repositories of meta-data useful for disclosing knowledge provenance and reasoning information such as descriptions of • Question answering systems (Inference Engines, Extractors, …) along with their supported inference rules • Information sources such as organizations, publications and ontologies • Representation languages along with their axioms

Browsing Proofs • Enable the visualization of proofs (and abstracted proofs) • Proofs can be “extracted” and browsed from both local and remote PML node sets and can be combined • Links provide access to proof-related meta-information

Browsing Proofs

Explainer Present • Query • Answer • Abstraction of justification (PML information) • Limited meta information • Suggests drill down options (also provides feedback options)

UIMA Explanation

Follow-up : Metadata

Follow-up: Assumptions

Explaining Extracted Entities (Techies) • Sentences in English • Sentences in annotated English • Sentences in logical format, i.e., KIF

Further Observations on Explaining Extracted Entities • Source: fbi_01.txt • Source Usage: span from 01 to 78 • Same conclusion from multiple extractors • Conflicting conclusion from one extractor • This extractor decided that Person_fbi-01.txt_46 is a Person and not Occupation

Search / Configuration

KSL Wine Agent: Semantic Web Integration Example • Uses emerging web standards to enable smart web applications • Given a meal description • Deborah’s Specialty • Describe matching wines • White, Dry, Full bodied… • Retrieve some specific options from web • Forman Chardonnay from DLM’s cellar, ThreeSteps from wine.com, …. • Info: http://www.ksl.stanford.edu/people/dlm/webont/wineAgent/

KSL Wine Agent Semantic Web Integration Technology • OWL • for representing a domain ontology of foods, wines, their properties, and relationships between them • JTP theorem prover • for deriving appropriate pairings • DQL/OWL QL • for querying a knowledge base • Inference Web • for explaining and validating answers • (descriptions or instances) • Web Services • for interfacing with vendors • Connections to online web agents/information services • Utilities for conducting and caching the above transactions

Knowledge Provenance Elicitation [Figure: derivations of A^B labeled with direct assertions (DA), modus ponens (MP), and conjunction introduction (^I) over A, B, and A->(A^B); sources BBC, NYT, and CNN are linked by “has opinion” edges, and conclusions are annotated with the source sets (CNN,BBC), (BBC,NYT), and (CNN)] XYZ says ‘A^B’ is the answer for my question. Why should I believe this? Provenance information may be essential for users to trust answers. Data provenance (aka data lineage) is defined and studied in the database literature. [Buneman et al., ICDT 2001] [Cui and Widom, VLDB 2001] Knowledge provenance extends data provenance by adding data derivation provenance information [Pinheiro da Silva, McGuinness & McCool, Data Eng. Bulletin, 2003]
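To make the idea of attaching sources to derived conclusions concrete, here is a minimal sketch, assuming a proof is a tree whose leaves are direct assertions tagged with the sources that hold them. The `ProofNode` class, the `provenance` function, and the particular source assignments are assumptions for illustration, not taken from the Inference Web implementation or from the slide's diagram.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ProofNode:
    """A conclusion, the rule that produced it, and the nodes it was built from."""
    conclusion: str
    rule: str                                # e.g. "DA", "MP", "^I"
    antecedents: list[ProofNode] = field(default_factory=list)
    sources: frozenset[str] = frozenset()    # set only on direct assertions


def provenance(node: ProofNode) -> frozenset[str]:
    """Return every source the conclusion of `node` ultimately depends on."""
    collected = set(node.sources)
    for antecedent in node.antecedents:
        collected |= provenance(antecedent)
    return frozenset(collected)


# Direct assertions; the source assigned to each one is hypothetical.
a = ProofNode("A", "DA", sources=frozenset({"CNN"}))
b = ProofNode("B", "DA", sources=frozenset({"BBC"}))

# Conjunction introduction combines the two asserted statements.
a_and_b = ProofNode("A^B", "^I", antecedents=[a, b])

print(sorted(provenance(a_and_b)))  # ['BBC', 'CNN']
```

The derived conclusion carries the union of its antecedents' sources, which is the kind of annotation a user would need in order to ask "why should I believe this?"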

IWTrust: Trust in Action [Figure: derivations of A^B labeled with DA, MP, and ^I over A, B, and A->(A^B); sources FSP, NYT, and CNN, with source sets (CNN,FSP), (FSP,NYT), and (CNN); nodes carry trust annotations ++, +, 0, and ?] Google-2.0 says ‘A^B’ is the answer for my question. Why should I trust the answer? Trust can be inferred from a Web of Trust. IWTrust provides infrastructure for building webs of trust. The infrastructure includes a trust component responsible for computing trust values for answers. IWTrust is described in [Zaihrayeu, Pinheiro da Silva & McGuinness, iTrust 2005]

Explanation Application Areas • Theorem proving • First-Order Theorem Provers – Stanford (JTP (KIF/OWL/…)); SRI (SNARK); University of Texas, Austin (KM) • SATisfiability Solvers – University of Trento (JSAT) • Information extraction – IBM (UIMA), Stanford (TAP) • Information integration/aggregation – USC ISI (Prometheus, Mediator -> Fetch); Rutgers, Stanford (TAP) • Task processing – SRI International (SPARK) • Service composition – Stanford, U. of Toronto, UCSF (SDS) • Semantic matching – University of Trento (S-MATCH) • Debugging ontologies – U of Maryland, College Park (SWOOP/Pellet)* • Problem solving – University of Fortaleza • Trust Networks – U. of Trento (IWTrust), UMd*

Inference Web Contributions 1. Language for encoding hybrid, distributed proof fragments based on web technologies. Support for both formal and informal proofs (information manipulation traces). 2. Support (registry, language, services) for knowledge provenance. 3. Declarative inference rule representation for checking hybrid, distributed proofs. 4. Multiple strategies for proof abstraction, presentation and interaction. 5. End-to-end trust value computation for answers. 6. Comprehensive solution for explainable systems. [Figure: architecture diagram with numbered callouts over the components Explanation, Trust, Interaction, Presentation, Abstraction, Proof Markup Language, Inference Meta-Language, Information Manipulation Data, Provenance Meta-data, Inference Rule Specs]

Status • Inference Web infrastructure (PML, browser, explainer, registry, toolkit) is being used in government programs such as PAL and NIMD, in commercial research labs (IBM, Boeing, SRI), and in universities (USC, U MD, …) • Integration and registration process underway with extraction community • Useful now for helping decide whether information is trustworthy, comes from authoritative sources, and is consistent and reliable • Benefits from more meta data and more information population, but is already useful incrementally

Technical Status Some focus areas: • Follow-up question support • Trust • Contradiction support • Abstraction techniques • Extraction extensions • Task-oriented reasoning support • Query manager explanation support • Toolkit for embedding Open issues for explanation: • Granularity of explanations • Meta information filtering • Abstraction techniques Requests / Suggestions?

More Info: Inference Web: http://iw.stanford.edu OWL: http://www.w3.org/TR/owl-features/ and http://www.w3.org/TR/owl-guide/ DAML+OIL: http://www.daml.org/ WineAgent: www.ksl.stanford.edu/people/dlm/webont/wineAgent/ Chimaera: http://www.ksl.stanford.edu/software/chimaera/ OWL-QL/DQL: http://www.ksl.stanford.edu/projects/dql/ UIMA: http://www.research.ibm.com/UIMA/ dlm@ksl.stanford.edu

Extras

Background AT&T Bell Labs AI Principles Dept • Description Logics, CLASSIC, explanation, ontology environments • Semantic Search, FindUR, Collaborative Ontology Building Env • Apps: Configurators, PROSE/Questar, Data Mining, … Stanford Knowledge Systems, Artificial Intelligence Lab • Ontology Evolution Environments (Diagnostics and Merging) Chimaera • Explanation and Trust, Inference Web • Semantic Web Representation and Reasoning Languages, DAML-ONT, DAML+OIL, OWL • Rules and Services: SWRL, OWL-S, Explainable SDS, KSL Wine Agent McGuinness Associates • Ontology Environments: Sandpiper, VerticalNet, Cisco… • Knowledge Acquisition and Ontology Building – VSTO, GeON, ImEp, … • Applications: GM: Search, etc.; CISCO: meta data org, etc. • Boards: Network Inference, Sandpiper, Buildfolio, Tumri, Katalytik

KSL Wine Agent: Semantic Web Integration (Toy) Example • Uses emerging web standards to enable “smart” web application • Given a meal description • Deborah’s Specialty, a crab dish, … • Describe matching wines • White, Dry, Full bodied… • Retrieve some specific options from web • Forman Chardonnay from DLM’s cellar, ThreeSteps from wine.com, …. • Explain description or specific suggestion • Info: http://www.ksl.stanford.edu/people/dlm/webont/wineAgent/

KSL Wine Agent: Semantic Web Integration Technology • OWL: for representing a domain ontology of foods, wines, their properties, and relationships between them • JTP theorem prover: for deriving appropriate pairings • Chimaera: ontology diagnostics and ontology merging • DQL/OWL QL: for querying a knowledge base • Inference Web: for explaining and validating answers (descriptions or instances) • Web Services: for interfacing with vendors • Connections to online web agents/information services • Utilities for conducting and caching the above transactions

Inference Web in KANI Context

Summary • Tools are emerging that support understanding information • Understanding/Evaluating information can help focus a user’s attention and enable trust, reuse, and filtering • Semantic Web infrastructure (OWL, Structured query languages, Semantic Search, Extractors, Reasoners, Explanation Infrastructure, ….) is ready for use and a growing trend

Knowledge Provenance Multiple Sources Answer Source Source

Extra

Inferences drawn by Information Extraction • Document: CIA Report 117 • Document: FBI Report 282 • AER: (Person “Mr. Ramazi”) • AER: (Org “Select Gourmet Foods”) • AER: (Person “Abdul Ramazi”) • BER: (Org “Select Gourmet Foods”) • BER: (Person “Abdul Ramazi”) • BER: (Org “SGF”) • BRR: (hasOwner “SGF”, “Abdul Ramazi”) • ARR: (hasOwner “Select Gourmet Foods”, “Mr. Ramazi”) • MCR: (equals “Abdul Ramazi”, “Mr. Ramazi”, AbdulRamazi) • MCR: (equals “Select Gourmet Foods”, “SGF”, SelectGourmetFoods) • MCR: (hasOwner SelectGourmetFoods, AbdulRamazi)
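The merge step in the records above (two mention-level hasOwner relations collapsing into one fact over canonical entities) can be sketched as follows. This is an illustrative reconstruction, assuming the "equals" records amount to a mapping from surface mentions to canonical identifiers; the function `merge_relations` and the data layout are hypothetical and are not the extractors' actual interfaces.

```python
# Mention-level relations, as in the ARR and BRR records above.
mention_relations = [
    ("hasOwner", "Select Gourmet Foods", "Mr. Ramazi"),
    ("hasOwner", "SGF", "Abdul Ramazi"),
]

# Coreference results, as in the "equals" records: mention -> canonical entity.
canonical = {
    "Abdul Ramazi": "AbdulRamazi",
    "Mr. Ramazi": "AbdulRamazi",
    "Select Gourmet Foods": "SelectGourmetFoods",
    "SGF": "SelectGourmetFoods",
}


def merge_relations(relations, canonical):
    """Rewrite relations over canonical entities, keeping the supporting mentions."""
    merged = {}
    for predicate, subject, obj in relations:
        key = (predicate, canonical[subject], canonical[obj])
        merged.setdefault(key, []).append((subject, obj))
    return merged


merged = merge_relations(mention_relations, canonical)
for fact, support in merged.items():
    print(fact, "supported by", support)
```

Keeping the supporting mention pairs alongside each merged fact is what lets an explanation point back to the individual extractions behind (hasOwner SelectGourmetFoods, AbdulRamazi).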

Infrastructure: Core IWBase Statistics for relevant domain independent meta-data: • Inference Engines: 29 • Axioms: 56 • Declarative Rules: 38 • Method Rules: 10 • Derived Rules: 6 • Languages: 12

Explaining Answers: GUI Explainer • Users can exit the explainer, providing feedback about their satisfaction with the explanation(s) • Users can ask for alternative explanations or summaries

Follow-up: PML Abstraction (Techies only)

Browsing Present • Query • Answer • Alternate formats (KIF, English, Raw, …) • Graph Structure (with lens view) • Annotations

IWTrust: Improving User Trust in Answers from the Web. Ilya Zaihrayeu, ITC-IRST; Paulo Pinheiro da Silva and Deborah L. McGuinness, Stanford University

Trusting Answers • It may be challenging for users to establish their degrees of trust, untrust, mistrust and distrust in a web application answer if the answer is provided without any kind of justification • Knowledge Provenance (KP) is a description of both the origins of knowledge and the reasoning process to produce an answer • Users may need KP to establish a degree of trust in the answer • Which sources were used? • Who are the authors of such sources? • Which engines were used? • What are the assumptions of the engines? Are the engines’ rules sound? • KP itself may not be enough for trusting the answer • I may not know anything about one or more sources in the KP • I may have no information about the reliability of one or more of the engines in the KP

Trusting Answers from the Web • The overall process of establishing a degree of trust in answers from web applications is particularly complex since applications may rely on: • Hybrid and distributed processing, e.g., web services, the Grid • Large number of heterogeneous, distributed information sources, e.g., the Web • Information sources with more variation in their reliability, e.g., information extraction • Sophisticated information integration methods, e.g., SIMS, TSIMMIS • The definition of trust is a significant part of the process • The task of keeping, encoding, sharing and gathering KP for partial answers towards the generation of the KP for answers is another part of the process • The use of KP to derive trust values for answers is yet another part of the process

The Inference Web • The Inference Web is an infrastructure supporting explanations for answers from the web • The Proof Markup Language (PML) is used to encode answer justification, i.e., information manipulation traces, proofs • IWBase is used to annotate PML documents with proof-related data, i.e., trust values for sources and engines • User U1 asks question Q • A question answering system returns the set of answers {A1,A2,…,An} [Figure: question Q(U1) flows to inference engines IE1 and IE2, which draw on sources S1, S2, S3; the answers (A1, t11, t12,...), (A2, t21, t22,...), ..., (An, tn1, tn2,...) appear alongside PML Documents and IWBase]

Inference Web and KP Inference Web is an infrastructure supporting KP for answers derived by multiple methods • Information extraction –IBM (UIMA), Stanford (TAP) • Information integration –USC ISI (Prometheus/Mediator); Rutgers University (Prolog/Datalog) • Task processing –SRI International (SPARK) • Theorem proving • First-Order Theorem Provers –SRI International (SNARK); Stanford (JTP); University of Texas, Austin (KM) • SATisfiability Solvers –University of Trento (J-SAT) • Expert Systems –University of Fortaleza (JEOPS) • Service composition – Stanford, University of Toronto, UCSF (SDS) • Semantic matching –University of Trento (S-Match) • Debugging ontologies – University of Maryland, College Park (SWOOP/Pellet) • Problem solving –University of Fortaleza (ExpertCop)

The Inference Web Trust (IWTrust) • IWTrust extends the Inference Web to support trust computation • IW TrustNet is a social network of source recommenders • A trust component implementing an algorithm to compute trust values for answers • Trust values are used to rank answers and answer justifications • User U1 trusts U3 to a degree t1-3 [Figure: IW TrustNet with users u1, u3, u4, u5, u6, u7 connected by trust edges t1-3, t1-5, t3-4, t5-6, t6-3, t6-7, and to sources and engines by t4-S3, t4-S4, t7-S1, t7-IE1, t1-IE2; below it, the IW Trust Framework with PML Documents, IWBase, sources S1, S2, S3, engines IE1, IE2, question Q(U1), and answers (A1, t11, t12,...), (A2, t21, t22,...), ..., (An, tn1, tn2,...)]
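The slide does not spell out how trust values are computed, so the following is only a sketch of one plausible scheme, not the IWTrust algorithm. It assumes trust in the range 0 to 1, multiplies values along a recommendation path, keeps the best path, and scores an answer by its least-trusted source. The graph loosely mirrors the edge labels in the figure, but every numeric value, the `path_trust` and `rank_answers` functions, and the answer-to-source assignment are hypothetical.

```python
def path_trust(trust, start, target, seen=frozenset()):
    """Best product of edge trust values over any acyclic path start -> target."""
    if start == target:
        return 1.0
    seen = seen | {start}
    best = 0.0  # no path found means no inferred trust
    for neighbor, value in trust.get(start, {}).items():
        if neighbor not in seen:
            best = max(best, value * path_trust(trust, neighbor, target, seen))
    return best


def rank_answers(trust, user, answers):
    """Score each answer by its least-trusted source, then sort best first."""
    scored = []
    for answer, sources in answers.items():
        score = min(path_trust(trust, user, source) for source in sources)
        scored.append((answer, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical TrustNet: trust[x][y] is how much x trusts y (made-up values).
trust = {
    "u1": {"u3": 0.9, "u5": 0.6},
    "u3": {"u4": 0.8},
    "u4": {"S3": 0.5, "S4": 0.9},
    "u5": {"u6": 0.5},
    "u6": {"u7": 0.9, "u3": 0.7},
    "u7": {"S1": 0.8},
}

# Hypothetical provenance: which sources each answer depends on.
answers = {"A1": ["S1"], "A2": ["S3", "S4"]}

ranked = rank_answers(trust, "u1", answers)
print([answer for answer, _ in ranked])  # ['A2', 'A1']
```

Here u1 has no direct opinion about any source, so all of its trust is inferred through recommenders, which is the situation a web of trust is meant to handle. Other aggregation rules (averaging over paths, or weighting by path length) would be equally reasonable under different assumptions.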
