
Textual Entailment: A Perspective on Applied Text Understanding

Textual Entailment: A Perspective on Applied Text Understanding. Ido Dagan, Bar-Ilan University, Israel. Joint work with: Oren Glickman, Idan Szpektor, Roy Bar-Haim (Bar-Ilan University, Israel); Maayan Geffet (Hebrew University, Israel)


Presentation Transcript


  1. Textual Entailment: A Perspective on Applied Text Understanding • Ido Dagan, Bar-Ilan University, Israel • Joint work with: Oren Glickman, Idan Szpektor, Roy Bar-Haim (Bar-Ilan University, Israel); Maayan Geffet (Hebrew University, Israel); Hristo Tanev, Bernardo Magnini, Alberto Lavelli, Lorenza Romano (ITC-irst, Italy); Bonaventura Coppola and Milen Kouylekov (University of Trento and ITC-irst, Italy)

  2. Talk Focus: A Framework for “Applied Semantics” • The textual entailment task – what and why? • Empirical evaluation – PASCAL RTE Challenge • Problem scope, decomposition and analysis • Different perspective on semantic inference • Probabilistic framework • Cf. syntax, MT – clear task, methodology and community

  3. Natural Language and Meaning [diagram relating Meaning and Language; labels: Variability, Ambiguity]

  4. Variability of Semantic Expression • All major stock markets surged • Dow gains 255 points • Dow ends up • Stock market hits a record high • Dow climbs 255 • The Dow Jones Industrial Average closed up 255

  5. Variability Recognition – Major Inference in Applications • Question Answering (QA) • Information Extraction (IE) • Information Retrieval (IR) • Multi-Document Summarization (MDS)

  6. Typical Application Inference • Question: Who bought Overture? >> Expected answer form: X bought Overture • Text: Overture’s acquisition by Yahoo • Hypothesized answer: Yahoo bought Overture • Similar for IE: X buy Y • Similar for “semantic” IR: t: Overture was bought … • Summarization (multi-document) – identify redundant info • MT evaluation (and recent proposals for MT?)

  7. KRAQ'05 Workshop - KNOWLEDGE and REASONING for ANSWERING QUESTIONS (IJCAI-05) CFP: • Reasoning aspects: * information fusion, * search criteria expansion models, * summarization and intensional answers, * reasoning under uncertainty or with incomplete knowledge, • Knowledge representation and integration: * levels of knowledge involved (e.g. ontologies, domain knowledge), * knowledge extraction models and techniques to optimize response accuracy, * coherence and integration.

  8. Inference for Textual Question Answering Workshop (AAAI-05) CFP: • abductions, default reasoning, inference with epistemic logic or description logic • inference methods for QA need to be robust, cover all ambiguities of language • available knowledge sources that can be used for inference … but similar needs for other applications – can we address a uniform empirical task?

  9. Applied Textual Entailment: Abstract Semantic Variability Inference • QA: “Where was John Wayne born?” • Answer: Iowa • Text (t): The birthplace of John Wayne is in Iowa • Hypothesis (h), to be inferred from t: John Wayne was born in Iowa

  10. The Generic Entailment Task • Given the text t, can we infer that h is (most likely) true? • Text (t): The birthplace of John Wayne is in Iowa • Hypothesis (h), to be inferred from t: John Wayne was born in Iowa

  11. Classical Entailment Definition • Chierchia & McConnell-Ginet (2001): A text t entails a hypothesis h if h is true in every circumstance (possible world) in which t is true • Strict entailment - doesn't account for some uncertainty allowed in applications

  12. “Almost certain” Entailments • t: The technological triumph known as GPS … was incubated in the mind of Ivan Getting. h: Ivan Getting invented the GPS. • t: According to the Encyclopedia Britannica, Indonesia is the largest archipelagic nation in the world, consisting of 13,670 islands. h: 13,670 islands make up Indonesia.

  13. Textual Entailment ≈ Human Reading Comprehension • From a children’s English learning book (Sela and Greenberg): • Reference Text: “…The Bermuda Triangle lies in the Atlantic Ocean, off the coast of Florida. …” • Hypothesis (True/False?): The Bermuda Triangle is near the United States ???

  14. Reading Comprehension QA By Canadian Broadcasting Corporation T: The school has turned its one-time metal shop – lost to budget cuts almost two years ago - into a money-making professional fitness club. Q: When did the metal shop close? A: Almost two years ago

  15. Recognizing Textual Entailment (RTE) Challenge • PASCAL NoE Challenge 2004-5 • Ido Dagan, Oren Glickman (Bar-Ilan University, Israel); Bernardo Magnini (ITC-irst, Trento, Italy)

  16. Generic Dataset by Application Use • QA • IE • Similar for “semantic” IR: Overture was acquired by Yahoo • Comparable documents (summarization) • MT evaluation • Reading comprehension • Paraphrase acquisition

  17. Some Examples • 567 development examples, 800 test examples

  18. Dataset Characteristics • Examples selected and annotated manually • Using automatic systems where available • Balanced True/False split • True – certain or highly probable entailment • Filtering controversial examples • Example distribution? • Mode – explorative rather than competitive

  19. Arthur Bernstein Competition “… Competition, even a piano competition, is legitimate … as long as it is just an anecdotal side effect of the musical culture scene, and doesn’t threat to overtake the center stage” Haaretz News Paper Culture Section, April 1st, 2005

  20. Submissions • 17 participating groups • 26 system submissions • Microsoft Research: manual analysis of dataset at lexical-syntactic matching level

  21. Broad Range of System Types • Knowledge sources and inferences • Direct t-h matching: • Word overlap / Syntactic tree matching • Lexical relations: • WordNet & statistical (corpus based) • Theorem Provers / Logical inference • Adding a fuzzy scoring mechanism • Supervised / unsupervised learning methods
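
As a rough illustration of the simplest system type listed on this slide, direct t-h matching by word overlap, the sketch below scores a pair by the fraction of hypothesis tokens that also appear in the text. The tokenizer, the function names and the 0.75 threshold are all assumptions made for this example; none of them come from an RTE submission.

```python
from __future__ import annotations

import re


def tokens(text: str) -> set[str]:
    """Lowercased word tokens; a deliberately crude tokenizer (assumption)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def overlap_score(t: str, h: str) -> float:
    """Fraction of hypothesis tokens that also occur in the text."""
    h_tokens = tokens(h)
    if not h_tokens:
        return 0.0
    return len(h_tokens & tokens(t)) / len(h_tokens)


def predict_entailment(t: str, h: str, threshold: float = 0.75) -> bool:
    """Predict 'True' when the overlap reaches a hypothetical threshold."""
    return overlap_score(t, h) >= threshold


# The John Wayne pair from the earlier slides.
t = "The birthplace of John Wayne is in Iowa"
h = "John Wayne was born in Iowa"
score = overlap_score(t, h)          # 4 of the 6 hypothesis tokens are covered
decision = predict_entailment(t, h)  # falls below the 0.75 threshold
```

On this pair the baseline misses "was" and "born", which is the kind of gap that lexical relations (birthplace / born) and syntactic matching are meant to close.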

  22. Accuracy

  23. Where are we?

  24. What’s next – RTE-2 • Organizers: • Bar Ilan, CELCT (Trento), MITRE, MS-Research • Main dataset: utilizing real systems' outputs • QA, IE, IR, summarization • Human performance dataset • Reading comprehension, human QA (planned) • Schedule (RTE website): • October – development set • February – results submission (test set January) • April 10 – PASCAL workshop in Venice! • right after EACL

  25. Other Evaluation Modes • Entailment subtasks evaluations • Lexical, lexical-syntactic, alignment… • “Seek” mode: • Input: h and corpus • Output: All entailing t’s in corpus • Captures nicely information seeking needs, but requires post-run annotation (like TREC) • Contribution to specific applications

  26. Decomposition of Entailment Levels • Empirical Modeling of Meaning Equivalence and Entailment, ACL-05 Workshop • Roy Bar-Haim, Idan Szpektor, Oren Glickman (Bar-Ilan University)

  27. Why? • Entailment Modeling is Complex!! • Was apparent at RTE1 • How can we decompose it, for • Better analysis and sub-task modeling • Piecewise evaluation • Avoid “this is the performance of my complex system…” methodology

  28. Combination of Inference Types • T → H

  29. Combination of Inference Types • From T to H: Co-reference, Syntactic trans., paraphrasing, Lexical, world knowledge • Diverse inference types, different levels of representation

  30. Defining Intermediate Models • Lexical • Lexical-syntactic

  31. Lexical Model • T and H are represented as bags of terms • T ⇒_L H if • for each term u ∈ H there exists a term v ∈ T such that v ⇒_L u • v ⇒_L u if • they share the same lemma and POS OR • they are connected by a chain of lexical transformations
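
The definition on this slide can be sketched directly in code. Terms are (lemma, POS) pairs, lexical transformations are directed rules from one term to another, and a "chain" is any path through those rules. The rule table and the example terms below are hypothetical and only illustrate the mechanics; the slides do not list the actual transformation resources here.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Term = Tuple[str, str]          # (lemma, part of speech)
Rules = Dict[Term, List[Term]]  # v -> terms reachable in one transformation


def term_entails(v: Term, u: Term, rules: Rules) -> bool:
    """True if v and u share lemma and POS, or a chain of rules leads from v to u."""
    if v == u:
        return True
    seen: Set[Term] = {v}
    queue = deque([v])
    while queue:  # breadth-first search over transformation chains
        current = queue.popleft()
        for nxt in rules.get(current, ()):
            if nxt == u:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def lexically_entails(T: Iterable[Term], H: Iterable[Term], rules: Rules) -> bool:
    """Every term u in H must be covered by some term v in T."""
    T = list(T)
    return all(any(term_entails(v, u, rules) for v in T) for u in H)


# Hypothetical rules: a two-step chain acquire -> purchase -> buy.
rules: Rules = {
    ("acquire", "V"): [("purchase", "V")],
    ("purchase", "V"): [("buy", "V")],
}
T = [("Yahoo", "N"), ("acquire", "V"), ("Overture", "N")]
H = [("Yahoo", "N"), ("buy", "V"), ("Overture", "N")]
covered = lexically_entails(T, H, rules)
not_covered = lexically_entails(T, [("Yahoo", "N"), ("sell", "V")], rules)
```

Because T and H are bags of terms, this check ignores who did what to whom: swapping "Yahoo" and "Overture" in H would give the same answer.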

  32. Lexical Transformations • We assume perfect word sense disambiguation

  33. Lexical Entailment - Examples • #1952 from RTE1 (TH) ? TLH

  34. Lexical Entailment - Examples • #1361 from RTE1 (TH)

  35. Lexical Entailment - Examples • #1361 from RTE1 (TH) Synonym

  36. Lexical Entailment - Examples • #1952 from RTE1 (TH) Synonym TLH 

  37. Lexical Entailment - Examples • #2127 from RTE1 (TH) ? TLH

  38. Lexical Entailment - Examples • #2127 from RTE1 (TH) TLH 

  39. Lexical-Syntactic Model • T and H are represented by syntactic dependency relations • T ⇒_LS H if the relations within H can be matched by the relations in T • The coverage can be obtained through a sequence of lexical-syntactic transformations
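
A minimal sketch of this model, under simplifying assumptions: dependency relations are (head, label, dependent) triples, each transformation maps a set of triples to a rewritten set, and H is covered if all of its triples appear among the triples derivable from T. The relation labels ("subj", "obj", "by-subj") and both transformations are hypothetical stand-ins, and pooling all derived triples into one set is a shortcut that a real system would replace with tree-level rewriting.

```python
from typing import Callable, Iterable, List, Set, Tuple

Relation = Tuple[str, str, str]  # (head lemma, dependency label, dependent lemma)
Transformation = Callable[[Set[Relation]], Set[Relation]]


def passive_to_active(rels: Set[Relation]) -> Set[Relation]:
    """Hypothetical syntactic rule: the passive 'by' agent becomes the subject."""
    return {(h, "subj" if label == "by-subj" else label, d) for h, label, d in rels}


def substitution(src: str, dst: str) -> Transformation:
    """Hypothetical lexical rule: replace lemma src with dst wherever it occurs."""
    def apply(rels: Set[Relation]) -> Set[Relation]:
        return {(dst if h == src else h, label, dst if d == src else d)
                for h, label, d in rels}
    return apply


def derivable(rels: Iterable[Relation], transformations: List[Transformation],
              max_rounds: int = 10) -> Set[Relation]:
    """Apply transformations repeatedly until no new relation appears."""
    known = set(rels)
    for _ in range(max_rounds):
        expanded = set(known)
        for transform in transformations:
            expanded |= transform(known)
        if expanded == known:
            break
        known = expanded
    return known


def ls_entails(T: Iterable[Relation], H: Iterable[Relation],
               transformations: List[Transformation]) -> bool:
    """Every relation in H must be matched by a relation derivable from T."""
    return set(H) <= derivable(T, transformations)


transformations = [passive_to_active, substitution("buy", "acquire")]
# T: "Overture was bought by Yahoo"
T = [("buy", "by-subj", "Yahoo"), ("buy", "obj", "Overture")]
H_good = [("acquire", "subj", "Yahoo"), ("acquire", "obj", "Overture")]
H_swapped = [("acquire", "subj", "Overture"), ("acquire", "obj", "Yahoo")]
good = ls_entails(T, H_good, transformations)
swapped = ls_entails(T, H_swapped, transformations)
```

The swapped hypothesis uses exactly the same words, so a bag-of-terms check would accept it, while the relation-level check rejects it.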

  40. Lexical-Syntactic Transformations • We assume perfect disambiguation and reference resolution

  41. Lexical-Syntactic Entailment - Examples • #1361 from RTE1 (TH) subj subj TLSH 

  42. Lexical-Syntactic Entailment - Examples • #2127 from RTE1 (TH) subj subj TLSH 

  43. Beyond Lexical-Syntactic Models • Future work…

  44. Empirical Analysis

  45. Annotation • 240 T-H pairs of RTE1 dataset • Annotated for T ⇒_L H and T ⇒_LS H • High annotator agreement (authors) • Kappa: “substantial agreement”
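
The slide reports agreement only qualitatively. As a reminder of how such a figure is computed, here is a sketch of Cohen's kappa for two annotators; that the authors used Cohen's variant is an assumption, and the label sequences below are invented for illustration, not taken from the 240 annotated pairs. On the commonly used Landis and Koch scale, values from 0.61 to 0.80 are read as "substantial agreement".

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Chance-corrected agreement between two annotators over the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("annotations must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Agreement expected by chance, from each annotator's label distribution.
    expected = sum(counts_a[label] * counts_b[label]
                   for label in set(counts_a) | set(counts_b)) / (n * n)
    if expected == 1.0:  # both annotators always use the same single label
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented judgments on ten pairs (True = the entailment level holds).
annotator_1 = [True, True, True, True, False, False, False, False, True, False]
annotator_2 = [True, True, True, False, False, False, False, False, True, False]
kappa = cohens_kappa(annotator_1, annotator_2)  # observed 0.9, expected 0.5
```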

  46. Model evaluation results • Low precision for Lexical model • Lexical match fails to predict entailment • High precision for Lexical-Syntactic model • Checking syntactic relations is crucial • Medium recall for both levels • Higher levels of inference are missing

  47. Contribution of individual components (RTE1 positive examples) • Lexical • Lex-Syn
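
Precision and recall here treat each intermediate model as a predictor of the gold entailment label: precision asks how often the model's "entails" verdict is correct, and recall asks how many true entailments it covers. The sketch below shows the computation; the gold and predicted labels are invented for illustration and are not the figures behind this slide.

```python
from typing import Sequence, Tuple


def precision_recall(gold: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Precision and recall of a model's 'entails' verdicts against gold labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = sum(g and p for g, p in zip(gold, predicted))        # correct 'entails'
    fp = sum((not g) and p for g, p in zip(gold, predicted))  # spurious 'entails'
    fn = sum(g and (not p) for g, p in zip(gold, predicted))  # missed entailments
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented example: six T-H pairs with gold labels and two models' verdicts.
gold = [True, True, True, False, False, True]
lexical_pred = [True, True, False, True, True, False]
lex_syn_pred = [True, True, False, False, False, False]
lexical_p, lexical_r = precision_recall(gold, lexical_pred)
lex_syn_p, lex_syn_r = precision_recall(gold, lex_syn_pred)
```

In this toy setup the lexical verdicts fire on non-entailing pairs (lower precision), while the stricter lexical-syntactic verdicts are always right when they fire but leave some true entailments uncovered.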

  48. Summary (1) • Annotating and analysing entailment components • Guides research on entailment • Opens new research problems and redirects old ones

  49. Summary (2) • Allows better evaluation of systems • Performance of individual components • Future work – expand analysis to additional levels of representation and inferences • Identify the exciting semantic phenomena …
