Justification/Explanation Evaluation Breakout Session

Stefano Bertolo, Richard Fikes. AQUAINT PI Meeting, Monterey, California, June 11-13, 2002. 6/12/02.

Presentation Transcript


  1. Justification/Explanation Evaluation Breakout Session
     Stefano Bertolo, Richard Fikes
     AQUAINT PI Meeting, Monterey, California, June 11-13, 2002
     6/12/02

  2. Straw Man Proposal
     • General Evaluation Principles
     • Required Characteristics
     • Desirable Characteristics

  3. General Evaluation Principles
     • Scope of the evaluation
        • Not evaluating precision and recall
        • Are evaluating the quality of the justification(s) the system provides in support of the answer(s) it has returned for a given question
     • Independence of correctness and justification
        • Justification(s) will be evaluated whether or not the answer they justify is correct
        • Reward reasonable justifications for an incorrect answer
        • Penalize unreasonable or unhelpful justifications for a correct answer

  4. Straw Man Proposal
     • General Evaluation Principles
        • Scope of the evaluation
        • Independence of correctness and justification
     • Required Characteristics
     • Desirable Characteristics

  5. Required Characteristics
     • Accountability
        • Justifications must be able to identify the sources on which they depend
        • If a justification has multiple "steps" (where the meaning of "step" is system-dependent), the justification will need to identify the source(s) on which each step depends
        • A system will be penalized for each justification step that does not identify the source on which it depends
     • Understandability of justifications
        • The justification(s) that the system ranks as the most intelligible must be pronounced understandable by a panel of human scorers
        • The modality of the presentation is left undetermined and need not be fluent English
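The per-step penalty described under Accountability could be sketched roughly as follows. This is only an illustration: the `Step` class, the `sources` field, the placeholder step texts, and the one-point-per-step penalty are all assumptions, since the slides do not specify a data format or a penalty size.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One step of a justification (what counts as a step is system-dependent)."""
    text: str
    # Hypothetical field: identifiers of the sources this step depends on.
    sources: List[str] = field(default_factory=list)


def accountability_penalty(steps, penalty_per_step=1.0):
    """Charge one penalty unit for every step that identifies no source."""
    unsourced = [step for step in steps if not step.sources]
    return penalty_per_step * len(unsourced)


# Placeholder justification: the second step names no source.
justification = [
    Step("Step A", sources=["source-1"]),
    Step("Step B"),
]
penalty = accountability_penalty(justification)
```

Here only the unsourced step contributes to the penalty; a scorer could weight steps differently, which the proposal leaves open.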

  6. Required Characteristics
     • Meaningful ranking of justifications
        • Present justifications in an order that the user would find appropriate with respect to a prespecified criterion
        • If J1 is presented before J2, user should agree that:
           • Confidence: J1 encodes evidence that is at least as reliable as that encoded by J2
           • Conciseness: J1 is at least as concise as J2
           • Intelligibility: J1 is at least as easy to follow as J2
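One way to check the ranking requirement is to compare neighbouring justifications on the chosen criterion. The sketch below assumes that human scorers have assigned each justification a numeric score for that criterion, with higher meaning better; the slides do not say how the judgments are collected, so the numbers here are placeholders. Because "at least as good as" is transitive, checking adjacent pairs is enough.

```python
def ranking_violations(ranked_scores):
    """Return index pairs (i, i + 1) where an earlier justification
    scores strictly lower than the one presented right after it."""
    return [
        (i, i + 1)
        for i in range(len(ranked_scores) - 1)
        if ranked_scores[i] < ranked_scores[i + 1]
    ]


# Placeholder confidence scores, listed in the order the system presented them.
confidence = [0.9, 0.7, 0.8, 0.8]
violations = ranking_violations(confidence)  # [(1, 2)]
```

Ties are allowed, matching the "at least as" wording: the last two justifications share a score and do not count as a violation.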

  7. Straw Man Proposal
     • General Evaluation Principles
        • Scope of the evaluation
        • Independence of correctness and justification
     • Required Characteristics
        • Accountability
        • Meaningful ranking of justifications
        • Understandability of justifications
     • Desirable Characteristics

  8. Desirable Characteristics
     • If a justification has a satisfactory score on required characteristics, it will be rewarded for various desirable characteristics
     • Natural language presentation: reward presenting justifications in a natural language
     • Justification clustering: reward partitioning justifications into clusters with well-defined semantics that are clearly explained to the user
     • Justification persistence: reward justifications that can be saved and inspected off-line with no loss of information
     • Agent-accessible API: reward justifications being accessible to software agents via an API
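The gating rule on this slide (desirable characteristics are rewarded only once the required characteristics are satisfied) might be sketched like this. The point scale, the threshold, the one-point bonus, and the flag names are all hypothetical; the proposal does not fix any of them.

```python
# Hypothetical flag names for the four desirable characteristics.
DESIRABLE = ("natural_language", "clustering", "persistence", "agent_api")


def total_score(required_score, desirable_flags, threshold=5, bonus=1):
    """Add a bonus per desirable characteristic, but only when the score
    on the required characteristics reaches the (assumed) threshold."""
    if required_score < threshold:
        return required_score
    earned = sum(1 for name in DESIRABLE if desirable_flags.get(name, False))
    return required_score + bonus * earned


# Below the threshold, desirable characteristics earn nothing.
low = total_score(4, {"persistence": True})
# At or above the threshold, each one present adds a bonus point.
high = total_score(8, {"persistence": True, "agent_api": True})
```

An additive bonus is just one possible choice; the slide says only that satisfactory justifications "will be rewarded".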

  9. Straw Man Proposal
     • General Evaluation Principles
        • Scope of the evaluation
        • Independence of correctness and justification
     • Required Characteristics
        • Accountability
        • Meaningful ranking of justifications
        • Understandability of justifications
     • Desirable Characteristics
        • Natural language presentation
        • Justification clustering
        • Justification persistence
        • Agent-accessible API
