Web Based Probabilistic Textual Entailment

Oren Glickman, Ido Dagan and Moshe Koppel, Bar Ilan University

Classical Entailment Definition
• A text t entails a hypothesis h if h is true in every circumstance (possible world) in which t is true
• i.e., the truth of t implies the truth of h

Probabilistic Entailment
• Example 312:
  • (t) Gandhi can be defeated in the next elections in India if between now and 2009, BJP can make Rural India Shine.
  • (h) Next elections in India will take place in 2009.
• t does not entail h (in the classical sense)
• Then why is it annotated as True?!

Rationale
• Example 312:
  • (t) Gandhi can be defeated in the next elections in India if between now and 2009, BJP can make Rural India Shine.
  • (h) Next elections in India will take place in 2009.
• t does add substantial information about the correctness of h
• Given that t was stated, we'd expect that h is most likely true

A Probabilistic Space
• T: the set of all texts
• H: the set of all hypotheses
  • propositional statements which can be assigned a truth value
• w: a possible world
  • a truth assignment (to {0 = False, 1 = True}) for all hypotheses
• W: the set of all possible worlds (2^H)

A Generative Model
We assume a probabilistic generative model:
• At each generation event a text is produced along with a (hidden) possible world
  • based on a probability distribution over T × W

Probabilities
• For a given text t and hypothesis h, we consider the following probabilities:
  • P(Tr_h = 1) = P(h is assigned a truth value of 1)
  • P(Tr_h = 1 | t) = P(h is assigned a truth value of 1 given that the generated text is t)

Textual entailment relationship
Definition:
• t probabilistically entails h if:
  • P(Tr_h = 1 | t) > P(Tr_h = 1) (≡ positive PMI)
  • i.e., t increases the likelihood of h being true
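The definition can be checked by hand on a toy joint distribution over T × W. The sketch below is illustrative only: the two texts, the single hypothesis and every probability are invented numbers, and the names `joint`, `prior_true`, `posterior_true` and `probabilistically_entails` are hypothetical, not taken from the slides.

```python
# Toy joint distribution over (text, possible world) pairs.
# A possible world is a tuple of truth values, one per hypothesis.
# All numbers are invented for illustration.
HYPOTHESES = ["elections_in_2009"]

joint = {
    ("t1", (1,)): 0.30,
    ("t1", (0,)): 0.10,
    ("t2", (1,)): 0.20,
    ("t2", (0,)): 0.40,
}


def prior_true(h):
    """P(Tr_h = 1): total mass of worlds in which h is true."""
    i = HYPOTHESES.index(h)
    return sum(p for (_, world), p in joint.items() if world[i] == 1)


def posterior_true(h, t):
    """P(Tr_h = 1 | t): mass of worlds where h is true, given text t."""
    i = HYPOTHESES.index(h)
    p_t = sum(p for (text, _), p in joint.items() if text == t)
    p_both = sum(
        p for (text, world), p in joint.items() if text == t and world[i] == 1
    )
    return p_both / p_t


def probabilistically_entails(t, h):
    """t probabilistically entails h if it raises the probability of h."""
    return posterior_true(h, t) > prior_true(h)


print(probabilistically_entails("t1", "elections_in_2009"))
print(probabilistically_entails("t2", "elections_in_2009"))
```

In this toy setting t1 raises the probability of the hypothesis above its prior and t2 lowers it, so only t1 probabilistically entails h.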

Lexical Entailment
• Are the individual terms in h entailed by t?
  • not necessarily holding the right relations
• Example #2070:
  • (t) The Queen of Holland is now owned by Robert Mouawad.
  • (h) Robert Mouawad is the Queen of Holland.

A Probabilistic Lexical Model
• Goal: capture lexical co-occurrence statistics
• Assumption 1: independent lexical truth assignments
• Assumption 2: alignment
• I_v: the event that a generated text contains v
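The formulas for this slide did not survive in the transcript. One plausible reading of the two assumptions, offered here only as a sketch, is that independence turns P(Tr_h = 1 | t) into a product over the terms u of h, and that alignment scores each u against its single best-supporting term v in t. The function name `entailment_probability`, the `lexical_prob` callback and the toy probability table are all hypothetical.

```python
def entailment_probability(hypothesis_terms, text_terms, lexical_prob):
    """Sketch of P(Tr_h = 1 | t) under the two assumptions.

    lexical_prob(u, v) is assumed to return P(Tr_u = 1 | I_v).
    text_terms must be non-empty.
    """
    p = 1.0
    for u in hypothesis_terms:
        # Assumption 2 (alignment), as read here: align u to the text
        # term that supports it best.
        best = max(lexical_prob(u, v) for v in text_terms)
        # Assumption 1 (independence): multiply the per-term probabilities.
        p *= best
    return p


# Invented lexical probabilities, for illustration only.
TOY_TABLE = {("japanese", "japan"): 0.5, ("vote", "voter"): 0.4}


def toy_lexical_prob(u, v):
    if u == v:
        return 1.0
    return TOY_TABLE.get((u, v), 0.0)


p = entailment_probability(
    ["japanese", "vote"], ["japan", "voter", "turnout"], toy_lexical_prob
)
print(p)
```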

Estimating Lexical Entailment Probabilities from the Web
• Web documents: a sample generated by the source
• Problem: truth assignments are not observed
• Assumption 3: a term is true iff it appears in the document
  • P(Tr_u = 1 | I_v) = P(I_u | I_v)
  • estimated from search-engine co-occurrence counts
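Under Assumption 3, P(I_u | I_v) can be estimated as the fraction of documents containing v that also contain u. The sketch below uses a tiny in-memory collection as a stand-in for search-engine hit counts; the documents and the function name `cooccurrence_prob` are made up for illustration.

```python
def cooccurrence_prob(u, v, documents):
    """Estimate P(I_u | I_v) as count(u and v) / count(v).

    documents is a list of sets of terms. Returns 0.0 when v never occurs.
    """
    docs_with_v = [doc for doc in documents if v in doc]
    if not docs_with_v:
        return 0.0
    docs_with_both = sum(1 for doc in docs_with_v if u in doc)
    return docs_with_both / len(docs_with_v)


# Invented toy collection standing in for the web.
DOCS = [
    {"japan", "japanese", "vote"},
    {"japan", "tokyo"},
    {"japanese", "food"},
    {"japan", "japanese"},
]

print(cooccurrence_prob("japanese", "japan", DOCS))
```

With real search-engine counts, the numerator would be the hit count for a query containing both terms and the denominator the hit count for v alone.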

Challenge Submission
• Tokenize text and remove stop words
• Collect counts from AltaVista
• Classification:
  • p = P(Tr_h = 1 | t)
  • t → h if p > λ; conf = p
  • conf = 1 - p for negative examples
  • λ tuned on dev set
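The preprocessing and decision steps can be sketched as follows. This is a minimal illustration, not the submitted system: the stop-word list, the tokenization rule and the threshold value used in the example are assumptions, and the probability p is taken as given.

```python
import re

# Tiny illustrative stop-word list (an assumption, not the one used
# in the submission).
STOP_WORDS = {"a", "an", "the", "of", "is", "in", "by", "to", "and", "be"}


def tokenize(sentence):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]


def classify(p, threshold):
    """Return (entails, confidence) for an entailment probability p.

    Positive if p > threshold, with confidence p; otherwise negative,
    with confidence 1 - p.
    """
    entails = p > threshold
    confidence = p if entails else 1.0 - p
    return entails, confidence


print(tokenize("The Queen of Holland is now owned by Robert Mouawad."))
print(classify(0.75, 0.5))
print(classify(0.25, 0.5))
```

The threshold of 0.5 above is a placeholder; the slides say only that λ was tuned on the development set.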

Results

Resulting Alignments
• Some good: Japan → Japanese, voter → vote
• Some dubious: turnout → half, percent → less

Precision-Recall
• High confidence → low precision!!

Did the probs help?
Baseline: P(w1 | w2) = 1 if w1 = w2, 0 otherwise
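The baseline replaces the web-estimated probabilities with exact word matching. Assuming the same product-of-best-alignment combination as the lexical model (an assumption, since the slide formulas are not preserved), the score becomes 1 when every hypothesis term appears in the text and 0 otherwise. The function names below are hypothetical.

```python
def baseline_prob(w1, w2):
    """Baseline lexical probability: 1 for identical words, else 0."""
    return 1.0 if w1 == w2 else 0.0


def baseline_score(hypothesis_terms, text_terms):
    """Product over hypothesis terms of the best match in the text.

    With the 0/1 baseline this is 1.0 iff every hypothesis term
    occurs in the text. text_terms must be non-empty.
    """
    score = 1.0
    for u in hypothesis_terms:
        score *= max(baseline_prob(u, v) for v in text_terms)
    return score


# Example #2070 from the slides, with stop words already removed.
text = ["queen", "holland", "now", "owned", "robert", "mouawad"]
hypothesis = ["robert", "mouawad", "queen", "holland"]
print(baseline_score(hypothesis, text))
```

On Example #2070 the baseline gives full lexical overlap even though the hypothesis is false, which illustrates the limitation of purely lexical entailment noted earlier.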

Conclusions
• Defined a probabilistic setting, as needed for modeling probabilistic entailment
  • Proposed: t probabilistically entails h if it increases the likelihood that h is true
• A concrete probabilistic model
  • incorporating word co-occurrence statistics
  • based on the proposed setting
• The simple model performs as well as more complex systems!
