This study investigates the efficacy and nuances of the Poisson query generation model in Information Retrieval (IR). It covers the background of query generation, the formulation of the Poisson language model, and the smoothing techniques associated with it. A comparative analysis of Poisson and multinomial models is presented, alongside empirical experiments showing their performance differences. The work discusses the benefits of Poisson models for query generation, including their direct modeling of term frequency and variance.
A Study of Poisson Query Generation Model for Information Retrieval
Qiaozhu Mei, Hui Fang, and ChengXiang Zhai
University of Illinois at Urbana-Champaign
Outline
• Background of query generation in IR
• Query generation with the Poisson language model
• Smoothing in the Poisson query generation model
• Poisson vs. multinomial in query generation IR
  • Analytical comparison
  • Empirical experiments
• Summary
Query Generation IR Model [Ponte & Croft 98]
[Figure: documents d1 … dN, each with its own document language model, generating query q with some query likelihood]
• Scoring documents by query likelihood p(q|d)
• Known as the language modeling (LM) approach to IR
• Different from document generation models
Interpretation of LM
• θd: a model for the queries posed by users who like document d [Lafferty and Zhai 01]
• Estimate θd from document d; use θd to approximate the queries of users who like d
• Existing methods differ mainly in the choice of θd and in how θd is estimated (smoothing)
  • Multi-Bernoulli: e.g., [Ponte and Croft 98, Metzler et al. 04]
  • Multinomial (most popular): e.g., [Hiemstra et al. 99, Miller et al. 99, Zhai and Lafferty 01]
Multi-Bernoulli vs. Multinomial
• Multi-Bernoulli: flip a coin for each word in the vocabulary; heads means the word occurs in the query
• Multinomial: toss a die once per query position to choose each word
[Figure: document d = "text mining model clustering text model text" generating query q = "text mining" under each model]
Problems of Multinomial
• Does not model term absence
• Sum-to-one constraint over all terms
• Reality is harder than expected:
  • Empirical estimates: mean(tf) < variance(tf) (Church & Gale 95)
  • Estimates on AP88-89: all terms μ = 0.0013, σ² = 0.0044; query terms μ = 0.1289, σ² = 0.3918
• Multinomial/Bernoulli imply mean > variance
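The mean-variance mismatch is easy to reproduce. A minimal sketch with made-up toy counts (not the AP88-89 statistics quoted above), using only the standard library:

```python
import statistics

# Hypothetical per-document counts of a bursty term across 10 documents
# (toy numbers for illustration, not real collection statistics)
tf_counts = [0, 0, 0, 0, 7, 0, 1, 0, 0, 12]

mean_tf = statistics.mean(tf_counts)
var_tf = statistics.pvariance(tf_counts)
print(mean_tf, var_tf)
# Bursty terms make the variance exceed the mean; a multinomial
# (mean > variance) or a single Poisson (mean == variance) cannot capture this
```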
Poisson?
• Poisson models frequency directly (including zero frequency)
• No sum-to-one constraint across different words w
• Mean = variance
• Poisson has been explored in document generation models, but not in query generation models
Related Work
• Poisson has been explored in document generation models, e.g.:
  • 2-Poisson, Okapi/BM25 (Robertson and Walker 94)
  • Parallel derivation of probabilistic models (Roelleke and Wang 06)
• Our work adds to this line of work on Poisson:
  • within a query generation framework
  • exploring the specific features Poisson brings to LM
Research Questions • How can we model query generation with Poisson language model? • How can we smooth such a Poisson query generation model? • How is a Poisson model different from a multinomial model in the context of query generation retrieval?
Query Generation with Poisson
• Each term is an independent emitter with an arrival rate λw; the query is the receiver
• Example: document d = "text mining model clustering text mining text" (|d| = 7) gives rates λtext = 3/7, λmining = 2/7, λmodel = 1/7, λclustering = 1/7
• Query q = "mining text mining systems": observed counts c(text, q) = 1, c(mining, q) = 2, c(model, q) = 0, c(clustering, q) = 0
MLE Query Generation with Poisson (II)
• Represent the query as a count vector q = ⟨c(w1, q), c(w2, q), …, c(wN, q)⟩ over the vocabulary
• Each word wi is an emitter with rate λi; the maximum likelihood estimate from document d is λi = c(wi, d)/|d|
[Figure: emitters w1 … wN with rates λ1 … λN generating the counts c(w1, q) … c(wN, q)]
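The unsmoothed scoring above can be sketched in a few lines. The function name and the toy document are assumptions for illustration, not from the paper:

```python
import math
from collections import Counter

def poisson_query_loglik(query_tokens, doc_tokens):
    """Unsmoothed Poisson query log-likelihood (illustrative sketch).

    Each document term w emits with MLE rate lambda_w = c(w,d)/|d|,
    scaled by the query length |q|. Zero counts contribute too, so term
    absence is modeled. A query term absent from the document has rate 0,
    which drives the likelihood to zero (hence the need for smoothing).
    """
    q, d = Counter(query_tokens), Counter(doc_tokens)
    dlen, qlen = len(doc_tokens), len(query_tokens)
    if any(w not in d for w in q):
        return float("-inf")          # unseen query term: zero likelihood
    loglik = 0.0
    for w, cwd in d.items():
        lam = (cwd / dlen) * qlen     # Poisson mean for w over the query
        c = q.get(w, 0)
        loglik += c * math.log(lam) - lam - math.lgamma(c + 1)
    return loglik

doc = "text mining model clustering text mining text".split()
print(poisson_query_loglik("mining text mining".split(), doc))
print(poisson_query_loglik("mining text mining systems".split(), doc))  # -inf
```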
Smoothing Poisson LM
• The document model alone assigns zero rate to unseen terms, so interpolate with a background collection model
• Query: "text mining systems"
• Document rates: text 0.02, mining 0.01, model 0.02, …, system 0
• Background collection rates: text 0.0001, mining 0.0002, model 0.0001, …, system 0.0001
• e.g., text: α · 0.02 + (1 − α) · 0.0001; system: α · 0 + (1 − α) · 0.0001
• Different smoothing methods lead to different retrieval formulae
Smoothing Poisson LM
• Interpolation (JM): λw = (1 − α) · c(w, d)/|d| + α · λw,C
• Bayesian smoothing with a Gamma prior: λw = (c(w, d) + μ · λw,C)/(|d| + μ)
• Two-stage smoothing: Gamma-prior smoothing of the document model first, then interpolation with a query background model
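The two rate updates are one-liners. A hedged sketch with assumed defaults (alpha = 0.5 and mu = 2000 mirror common LM settings, not values from the paper):

```python
def jm_smooth(c_wd, dlen, bg_rate, alpha=0.5):
    """Jelinek-Mercer: interpolate the document MLE rate with the
    collection background rate (alpha = 0.5 is an assumed default)."""
    return (1 - alpha) * (c_wd / dlen) + alpha * bg_rate

def gamma_smooth(c_wd, dlen, bg_rate, mu=2000.0):
    """Gamma-prior (Bayesian) smoothing: the posterior-mean rate,
    the Poisson analogue of Dirichlet smoothing for the multinomial."""
    return (c_wd + mu * bg_rate) / (dlen + mu)

# Slide's example numbers: document rate 0.02 for "text", background 0.0001
print(jm_smooth(c_wd=2, dlen=100, bg_rate=0.0001))   # seen term
print(jm_smooth(c_wd=0, dlen=100, bg_rate=0.0001))   # unseen term gets a nonzero rate
```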
Smoothing Poisson LM (II)
• Two-stage smoothing: λw = (1 − λ) · (c(w, d) + μ · λw,C)/(|d| + μ) + λ · λw,U
  • Stage 1: a smoothed version of the document model (from d and the collection, via the Gamma prior μ)
  • Stage 2: a background model λw,U of user query preference; use the collection model when no user prior is known
• Similar to multinomial two-stage smoothing (Zhai and Lafferty 02)
• Verbose queries need to be smoothed more
Analytical: Equivalence of Basic Models
• With the basic model and MLE:
  • Poisson + Gamma smoothing = multinomial + Dirichlet smoothing (same ranking)
  • Basic model + JM smoothing behaves similarly, up to a variant component of document length normalization
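The ranking equivalence can be checked numerically on a toy collection. In the sketch below the documents, the query, and mu = 100 are all made up; the point is that the Gamma-smoothed Poisson score differs from the Dirichlet-smoothed multinomial score only by document-independent constants, so the orderings agree:

```python
import math
from collections import Counter

mu = 100.0
docs = {
    "d1": "text mining model clustering text mining text".split(),
    "d2": "model checking software model verification".split(),
    "d3": "text classification text categorization".split(),
}
query = "text mining".split()

# Collection background: MLE over the concatenation of all documents
coll = Counter()
for toks in docs.values():
    coll.update(toks)
clen = sum(coll.values())
bg = {w: c / clen for w, c in coll.items()}   # sums to 1 over the vocabulary

def multinomial_dirichlet(doc):
    d, dlen = Counter(doc), len(doc)
    return sum(math.log((d[w] + mu * bg[w]) / (dlen + mu)) for w in query)

def poisson_gamma(doc):
    d, dlen, qlen = Counter(doc), len(doc), len(query)
    q = Counter(query)
    s = 0.0
    for w in bg:                               # full vocabulary, zero counts included
        lam = (d[w] + mu * bg[w]) / (dlen + mu) * qlen
        s += q[w] * math.log(lam) - lam - math.lgamma(q[w] + 1)
    return s

rank = lambda score: sorted(docs, key=lambda k: score(docs[k]), reverse=True)
print(rank(multinomial_dirichlet))
print(rank(poisson_gamma))
```

Because the smoothed rates sum to 1 over the vocabulary, the Poisson-specific terms (the total rate penalty and the factorials) are the same for every document, which is exactly why the two models rank identically.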
Benefits: Per-term Smoothing
• Poisson does not require sum-to-one across different terms (each term has its own event space)
• Thus the smoothing coefficient αw in JM and two-stage smoothing can be made term-dependent (per-term)
• Multinomial cannot achieve per-term smoothing
• Can use an EM algorithm to estimate the αw's
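One way such an EM estimate might look is sketched below. This is an illustrative mixture-responsibility update, not the paper's exact estimation procedure; the function name and input format are assumptions:

```python
def estimate_alpha_w(occurrences, iters=30, alpha=0.5):
    """Illustrative EM for a per-term smoothing weight alpha_w.

    occurrences: one (doc_rate, bg_rate) pair per observed occurrence of
    term w in training queries. Assumes each occurrence was generated by
    the background with probability alpha_w, else by the document model.
    (A sketch only; not the paper's derivation.)
    """
    for _ in range(iters):
        # E-step: responsibility that each occurrence came from the background
        post = [alpha * bg / (alpha * bg + (1 - alpha) * dr)
                for dr, bg in occurrences]
        # M-step: re-estimate alpha_w as the mean responsibility
        alpha = sum(post) / len(post)
    return alpha

# Occurrences well explained by the document rates -> alpha_w stays small
print(estimate_alpha_w([(0.3, 0.001), (0.25, 0.001), (0.2, 0.001)]))
# Occurrences better explained by the background -> alpha_w grows toward 1
print(estimate_alpha_w([(0.0005, 0.001), (0.0004, 0.001)]))
```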
Benefits: Modeling Background
• Traditional: background as a single model; does not match reality
• Background as a mixture model: increases variance
  • Multinomial mixtures (e.g., clusters, PLSA, LDA): inefficient (no closed form, iterative estimation)
  • Poisson mixtures (e.g., Katz's K-Mixture, 2-Poisson, Negative Binomial) (Church & Gale 95): closed forms, efficient computation
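Katz's K-Mixture illustrates the closed-form efficiency: its parameters come straight from collection statistics, with no iterative fitting. A sketch assuming the standard Church & Gale parameterization (cf = collection frequency, df = document frequency; toy numbers):

```python
def k_mixture(cf, df, n_docs):
    """Katz's K-Mixture pmf over within-document term frequency c,
    fit by the method of moments (Church & Gale 95). Assumes cf > df > 0."""
    lam = cf / n_docs            # mean tf per document
    beta = (cf - df) / df        # extra occurrences per document containing the term
    alpha = lam / beta           # mass of the geometric component
    def pmf(c):
        p = (alpha / (beta + 1)) * (beta / (beta + 1)) ** c
        if c == 0:
            p += 1 - alpha       # extra point mass at zero: models term absence
        return p
    return pmf

# A hypothetical bursty term: 300 occurrences concentrated in 100 of 1000 docs
pmf = k_mixture(cf=300, df=100, n_docs=1000)
mean = sum(c * pmf(c) for c in range(200))
var = sum((c - mean) ** 2 * pmf(c) for c in range(200))
print(mean, var)   # overdispersed: the variance exceeds the mean
```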
Hypotheses
• H1: With basic query generation retrieval models (JM smoothing and Gamma smoothing), Poisson behaves similarly to multinomial
• H2: Per-term smoothing with Poisson may outperform term-independent smoothing, especially on verbose queries
• H3: A background efficiently modeled as a Poisson mixture may perform better than a single Poisson
Experiment Setup
• Data: TREC collections and topics: AP88-89, TREC7, TREC8, WT2g
• Query types: short keyword (topic title); short verbose (one sentence); long verbose (multiple sentences)
• Measurement: mean average precision (MAP)
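The evaluation measure is standard; a minimal sketch of MAP (function names and toy document ids are made up):

```python
def average_precision(ranked, relevant):
    """AP for one topic: sum of precision@k at each relevant rank,
    divided by the total number of relevant documents."""
    hits, total = 0, 0.0
    for k, docid in enumerate(ranked, start=1):
        if docid in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP: AP averaged over topics; runs is a list of (ranked, relevant)."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy example: relevant docs at ranks 1 and 3
print(average_precision(["d3", "d1", "d7"], {"d3", "d7"}))  # (1/1 + 2/3) / 2
```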
H1: Basic models behave similarly (MAP)
• JM + Poisson ≈ JM + Multinomial
• Gamma/Dirichlet > JM (for both Poisson and multinomial)
H2: Per-term outperforms term-independent smoothing Per-term > Non-per-term
Improvement Comes from Per-term Smoothing
• JM + per-term > JM
• 2-stage + per-term > 2-stage
• Significant improvement on verbose queries
H3: Poisson Mixture Background Improves Performance
• Katz's K-Mixture > single Poisson
Poisson Opens Other Potential Flexibilities
• Document length penalization: JM smoothing introduces a variant component of document length normalization, but requires more expensive computation
• Pseudo-feedback: the background model in two-stage smoothing can be estimated from feedback documents, yielding term-dependent smoothing coefficients
• Both lead to future research directions
Summary • Poisson: Another family of retrieval models based on query generation • Basic models behave similarly to multinomial • Benefits: per-term smoothing and efficient mixture background model • Many other potential flexibilities • Future work: • explore document length normalization and pseudo-feedback • better estimation of per-term smoothing coefficients