Empirical Development of an Exponential Probabilistic Model

Empirical Development of anExponential Probabilistic Model Using Textual Analysis to Build a Better Model Jaime Teevan & David R. Karger CSAIL (LCS+AI), MIT

Goal: Better Generative Model • Generative v. discriminative model • Applies to many applications • Information retrieval (IR) • Relevance feedback • Using unlabeled data • Classification • Assumptions explicit

Using a Model for IR Hyper-learn • Define model • Learn parameters from query • Rank documents • Better model improves applications • Trickle down to improve retrieval • Classification, relevance feedback, … • Corpus specific models

Overview • Related work • Probabilistic models • Example: Poisson Model • Compare model to text • Hyper-learning the model • Exponential framework • Investigate retrieval performance • Conclusion and future work

Related Work • Using text for retrieval algorithm • [Jones, 1972], [Greiff, 1998] • Using text to model text • [Church & Gale, 1995], [Katz, 1996] • Learning model parameters • [Zhai & Lafferty, 2002] Hyper-learn the model from text!

Probabilistic Models • Rank documents by RV =Pr(rel|d) • Naïve Bayesian models RV =Pr(rel|d)

Probabilistic Models • Rank documents by RV =Pr(rel|d) • Naïve Bayesian models # occs in doc = Pr(dt|rel) features t RV =Pr(rel|d) Pr(d|rel) 8 words • Open assumptions • Feature definition • Feature distribution family Defines the model!

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents Pr(dt|rel) =

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents • Poisson Model • θ: specifies term distribution dt -θ θ e Pr(dt|rel) = dt!

Example Poisson Distribution + θ=0.0006 Pr(dt|rel) Pr(dt|rel)≈1E-15 Term occurs exactlydt times

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents • Learn a θ for each term • Maximum likelihood θ • Term’s average number of occurrence • Incorporate prior expectations

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents • For each document, find RV • Sort documents by RV = Pr(dt|rel). words t RV

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents Which step goes wrong? • For each document, find RV • Sort documents by RV = Pr(dt|rel). words t RV

Using a Naïve Bayesian Model • Define model • Learn parameters from query • Rank documents dt -θ θ e Pr(dt|rel) = dt!

How Good is the Model? + θ=0.0006 Pr(dt|rel) 15 times Term occurs exactlydt times

How Good is the Model? + θ=0.0006 Pr(dt|rel) Misfit! 15 times Term occurs exactlydt times

Hyper-learning a Better FitThrough Textual Analysis Using an Exponential Framework

Hyper-Learning Framework • Need framework for hyper-learning Mixtures Poisson Bernoulli Normal

Hyper-Learning Framework • Need framework for hyper-learning • Goal: Same benefits as Poisson Model • One parameter • Easy to work with (e.g., prior) Mixtures Poisson Bernoulli Normal One parameter exponential families

Exponential Framework • Well understood, learning easy • [Bernardo & Smith, 1994], [Gous, 1998] Pr(dt|rel) = f(dt)g(θ)e • Functions f(dt) and h(dt) specify family • E.g., Poisson: f(dt) = (dt!)-1,h(dt) = dt • Parameter θ term’s specific distribution θh(dt)

Using a Hyper-learned Model • Define model • Learn parameters from query • Rank documents

Using a Hyper-learned Model • Hyper-learn model • Learn parameters from query • Rank documents

Using a Hyper-learned Model • Hyper-learn model • Learn parameters from query • Rank documents • Want “best” f(dt) and h(dt) • Iterative hill climbing • Local maximum • Poisson starting point

Using a Hyper-learned Model • Hyper-learn model • Learn parameters from query • Rank documents • Data: TREC query result sets • Past queries to learn about future queries • Hyper-learn and test with different sets

Recall the Poisson Distribution + Pr(dt|rel) 15 times Term occurs exactlydt times

Poisson Starting Point - h(dt) + h(dt) Pr(dt|rel) =f(dt)g(θ)e θh(dt) dt

Hyper-learned Model - h(dt) Hyper-learned Model - h(dt) + h(dt) Pr(dt|rel) =f(dt)g(θ)e θh(dt) dt

Poisson Distribution + Pr(dt|rel) 15 times Term occurs exactlydt times

Hyper-learned Distribution Hyper-learned Distribution + Pr(dt|rel) 15 times Term occurs exactlydt times

Performing Retrieval • Hyper-learn model • Learn parameters from query • Rank documents

Performing Retrieval Labeled docs • Hyper-learn model • Learn parameters from query • Rank documents θh(dt) Pr(dt|rel) = f(dt)g(θ)e • Learn θ for each term

Learning θ • Sufficient statistics • Summarize all observed data • τ1: # of observations • τ2: Σobservations d h(dt) • Incorporating prior easy • Map τ1 and τ2θ 20 labeled documents

Performing Retrieval • Hyper-learn model • Learn parameters from query • Rank documents

Results: Labeled Documents Results: Labeled Documents Precision Recall

Performing Retrieval • Hyper-learn model • Learn parameters from query • Rank documents Short query

Retrieval: Query Retrieval: Query • Query = single labeled document • Vector space-like equation RV = Σa(t,d) + Σb(q,d) • Problem: Document dominates • Solution: Use only query portion • Another solution: Normalize t in doc q in query

Retrieval: Query Precision Recall

Conclusion • Probabilistic models • Example: Poisson Model • Hyper-learning the model • Exponential framework • Learned a better model • Investigate retrieval performance - Bad text model - Easy to work with - Heavy tailed! - Better …

Future Work • Use model better • Use for other applications • Other IR applications • Classification • Correct for document length • Hyper-learn on different corpora • Test if learned model generalizes • Different for genre? Language? People? • Hyper-learn model better

Questions? Contact us with questions: Jaime Teevan teevan@ai.mit.edu David Karger karger@theory.lcs.mit.edu

Empirical Development of an Exponential Probabilistic Model

Empirical Development of an Exponential Probabilistic Model

Presentation Transcript

DBMS with probabilistic model

The Colonial Origins of Comparative Development: An Empirical Investigation

THE EXPONENTIAL GARCH MODEL

Probabilistic Model Checking

Probabilistic Model of Range

Probabilistic Model

Observational Test of Halo Model: an empirical approach

Empirical Model of CSF Flow

An Empirical Model of Decadal ENSO Variability

An evaluation of the probabilistic information in multi-model ensembles

The Probabilistic Model

Model Oriented Programming: An Empirical Study of Comprehension

Probabilistic Voting Model

Empirical Development of an Exponential Probabilistic Model

Empirical Model

The Colonial Origins of Comparative Development: An Empirical Investigation

An evaluation of the probabilistic information in multi-model ensembles

Exponential Model

Model Oriented Programming: An Empirical Study of Comprehension