Conditional Topic Random Fields



Presentation Transcript


  1. Conditional Topic Random Fields Jun Zhu and Eric P. Xing (@cs.cmu.edu) ICML 2010 Presentation and Discussion by Eric Wang January 12, 2011

  2. Overview • Introduction – nontrivial input features for text. • Conditional Random Fields • CdTM and CTRF • Model Inference • Experimental Results

  3. Introduction • Topic models such as LDA are not "feature-based": they cannot efficiently incorporate nontrivial input features (contextual or summary features). • Further, they assume a bag-of-words construction, discarding word-order information that may be important. • The authors propose a model that addresses both the feature and independence limitations by using a conditional random field (CRF) rather than a fully generative model.

  4. Conditional Random Fields • A conditional random field (CRF) is a way to label and segment structured data that removes the independence assumptions imposed by HMMs. • The underlying idea of CRFs is that a sequence of random variables Y is globally conditioned on a sequence of observations X. Image source: Hanna M. Wallach. Conditional Random Fields: An Introduction. Technical Report, Department of Computer and Information Science, University of Pennsylvania, 2004.
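
To make the conditioning concrete, here is a minimal Python sketch of a linear-chain CRF (a generic illustration, not the paper's exact model): a label sequence Y is scored against observations X via singleton and transition potentials, and normalized by brute-force enumeration. All potential values are made-up toy numbers.

    import itertools, math

    LABELS = [0, 1]  # hypothetical binary label set

    def score(y, x, s, t):
        """Unnormalized log-score of label sequence y given observations x."""
        total = sum(s[yi][xi] for yi, xi in zip(y, x))     # singleton potentials
        total += sum(t[a][b] for a, b in zip(y, y[1:]))    # transition potentials
        return total

    def crf_prob(y, x, s, t):
        """p(y | x): normalize over ALL label sequences (feasible only for tiny chains)."""
        z = sum(math.exp(score(list(yy), x, s, t))
                for yy in itertools.product(LABELS, repeat=len(x)))
        return math.exp(score(y, x, s, t)) / z

    # toy potentials: s[label][observation], t[label][label]
    s = [[1.0, 0.2], [0.1, 1.5]]
    t = [[0.5, 0.0], [0.0, 0.8]]
    print(crf_prob([0, 1, 1], [0, 1, 1], s, t))

The key point the slide makes is visible in crf_prob: the normalizer Z depends on the whole observation sequence x, so Y is conditioned globally rather than through local independence assumptions.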

  5. Conditional Topic Model • Assume a set of features a denoting arbitrary local and global features. • The topic weight vector is defined as $\theta_k(a) \propto \exp\{\lambda^\top f(k, a)\}$, where $f$ is a vector of feature functions defined on the features $a$ and the topic index $k$, and $\lambda$ is the corresponding weight vector.
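
As a sketch of this construction, assuming the log-linear form reconstructed above, the topic weights are a softmax over per-topic feature scores. The feature matrix and weights below are illustrative placeholders, not values from the paper.

    import numpy as np

    def topic_weights(lam, feats):
        """feats: K x D matrix whose row k is f(k, a); returns softmax topic weights."""
        scores = feats @ lam          # lambda^T f(k, a) for each topic k
        scores -= scores.max()        # stabilize the exponentials
        expd = np.exp(scores)
        return expd / expd.sum()

    K, D = 4, 6                       # hypothetical numbers of topics and features
    rng = np.random.default_rng(0)
    theta = topic_weights(rng.normal(size=D), rng.normal(size=(K, D)))
    print(theta, theta.sum())         # a valid distribution over topics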

  6. Conditional Topic Model • The inclusion of Y follows sLDA, where the topic model regresses a continuous or discrete response. • $\beta$ denotes the standard topic distributions over words. • This model does not impose word-order dependence.
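
For the continuous case, sLDA (Blei & McAuliffe) draws the response from a Gaussian whose mean is the regression of the document's empirical topic frequencies. A minimal sketch with made-up regression weights and topic assignments:

    import numpy as np

    rng = np.random.default_rng(1)
    K = 4
    z = rng.integers(K, size=50)                  # topic assignment of each word
    z_bar = np.bincount(z, minlength=K) / len(z)  # empirical topic frequencies
    eta = np.array([-2.0, -1.0, 1.0, 2.0])        # hypothetical regression weights
    sigma2 = 0.5                                  # hypothetical response variance
    y = rng.normal(eta @ z_bar, np.sqrt(sigma2))  # response regressed on topics
    print(z_bar, y)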

  7. Feature Functions • Consider, for example, the set of word features "positive adjective", "negative adjective", "positive adjective with an inverting word", "negative adjective with an inverting word", so M = 4. • The word "good" yields the feature function vector [1 0 0 0]', while "not bad" yields [0 0 0 1]'. • The features are then concatenated depending on the topic assignment of the word. Suppose z_n = h; then the feature f for "good" is a length-MK vector that is zero in every block except block k = h: [0 0 0 0 | 0 0 0 0 | … | 1 0 0 0 | … | 0 0 0 0 | 0 0 0 0]' (blocks k = 1, 2, …, h, …, K−1, K).
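
A small sketch of this topic-indexed concatenation, using the slide's M = 4 feature order; the topic count and assignments are illustrative.

    import numpy as np

    FEATURES = ["pos-adj", "neg-adj", "pos-adj-inverted", "neg-adj-inverted"]
    M = len(FEATURES)

    def concat_features(word_feats, z, K):
        """word_feats: length-M 0/1 vector; z: topic assignment; returns MK vector
        that is zero everywhere except in block z, which holds word_feats."""
        f = np.zeros(M * K)
        f[z * M : (z + 1) * M] = word_feats
        return f

    K = 5
    good = np.array([1, 0, 0, 0])      # "good": positive adjective
    not_bad = np.array([0, 0, 0, 1])   # "not bad": negative adjective + inverter
    print(concat_features(good, z=2, K=K))   # nonzero only in block k = 2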

  8. Conditional Topic Random Fields • The generative process of CTRF for a single document: draw the topic assignments z of all words jointly from the conditional topic random field p(z | a); then draw each word w_n from the topic-specific word distribution $\beta_{z_n}$; in the supervised variant, draw the response Y given the empirical topic frequencies as in sLDA.
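
A sketch of that process by ancestral sampling. For brevity it draws each z_n independently from conditional topic weights (the CdTM special case); the full CTRF would instead sample z jointly from the linear-chain CRF. All parameter values are toy placeholders.

    import numpy as np

    rng = np.random.default_rng(2)
    K, V, N = 3, 10, 8                        # topics, vocab size, document length
    beta = rng.dirichlet(np.ones(V), size=K)  # topic-word distributions beta
    theta = np.array([0.6, 0.3, 0.1])         # conditional topic weights theta(a)

    z = rng.choice(K, size=N, p=theta)        # per-word topic assignments
    w = np.array([rng.choice(V, p=beta[k]) for k in z])  # words given topics
    print(z, w)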

  9. Conditional Topic Random Fields • The term p(z | a) is a conditional topic random field over the topic assignments of all the words in one sentence and has the form $p(\mathbf{z} \mid a) = \frac{1}{Z(a)} \exp\big\{\sum_n \lambda^\top f(z_n, a) + \sum_n \mu^\top g(z_n, z_{n+1}, a)\big\}$. • In the linear-chain CTRF, the authors consider both singleton feature functions $f(z_n, a)$ and pairwise feature functions $g(z_n, z_{n+1}, a)$. • The cumulative feature function value on a sentence is the sum over its words, $f(\mathbf{z}, a) = \sum_n f(z_n, a)$. • The pairwise feature function is assumed to be zero if the two adjacent topic assignments differ.
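
A sketch of the sentence-level potential just described: the unnormalized log-potential sums singleton scores and pairwise scores over adjacent words. The score tables below are random placeholders, with an identity pairwise table as a toy choice that rewards adjacent words sharing a topic.

    import numpy as np

    def log_potential(z, singleton, pairwise):
        """z: topic sequence; singleton[n, k]: lambda^T f(k, a) for word n;
        pairwise[j, k]: mu^T g(j, k, a), shared across adjacent positions."""
        s = sum(singleton[n, k] for n, k in enumerate(z))
        p = sum(pairwise[a, b] for a, b in zip(z, z[1:]))
        return s + p

    K, N = 3, 4
    rng = np.random.default_rng(3)
    singleton = rng.normal(size=(N, K))
    pairwise = np.eye(K)    # toy choice: nonzero only when adjacent topics agree
    print(log_potential([0, 0, 1, 1], singleton, pairwise))

The normalizer Z(a) would sum exp(log_potential) over all K^N assignments of the sentence, which is what makes inference nontrivial.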

  10. Model Inference • Inference is performed in a variational fashion similar to that of correlated topic models (CTM). • The authors introduce a relaxation of the lower bound due to the introduction of the CRF, although for the univariate CdTM the variational posterior can be computed exactly. • A closed-form solution is not available for the feature weights $\lambda$, so an efficient gradient-descent approach is used instead.
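
A generic sketch of the kind of gradient update this implies (not the paper's exact derivation): grad_L stands in for the gradient of the variational lower bound, here replaced by a toy concave objective so the snippet runs on its own.

    import numpy as np

    def gradient_ascent(lam, grad_L, lr=0.1, steps=100, tol=1e-6):
        """Climb the objective; stop when the update norm falls below tol."""
        for _ in range(steps):
            step = lr * grad_L(lam)
            lam = lam + step
            if np.linalg.norm(step) < tol:
                break
        return lam

    # toy concave objective L(lam) = -||lam - 1||^2, gradient 2(1 - lam)
    lam = gradient_ascent(np.zeros(3), lambda l: 2 * (1.0 - l))
    print(lam)   # converges toward the optimum at [1, 1, 1]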

  11. Empirical Results • The authors use a hotel-review dataset built by crawling TripAdvisor. • The dataset consists of 5000 reviews with lengths between 1500 and 6000 words, with an integer (1-5) rating for each review; each rating level is represented by 1000 documents. • POS tags were employed to find adjectives. • Noun-phrase chunking was used to associate words with good or bad connotations. The authors also extracted whether an inverting word appears within 4 words of each adjective. • After rare words and stop words were removed, the lexicon size was 12,000.
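
A sketch of this word-feature extraction under stated assumptions: toy positive/negative adjective lexicons and an inverting-word list stand in for the POS-tagging and chunking pipeline the authors actually used.

    # Hypothetical lexicons; the paper derives these from POS tags and chunking.
    POS_ADJ = {"good", "great", "clean"}
    NEG_ADJ = {"bad", "dirty", "noisy"}
    INVERTERS = {"not", "never", "hardly"}

    def word_features(tokens, i, window=4):
        """Return [pos, neg, pos-inverted, neg-inverted] for the token at i,
        checking for an inverting word within `window` preceding tokens."""
        w = tokens[i]
        inverted = any(t in INVERTERS for t in tokens[max(0, i - window):i])
        return [int(w in POS_ADJ and not inverted),
                int(w in NEG_ADJ and not inverted),
                int(w in POS_ADJ and inverted),
                int(w in NEG_ADJ and inverted)]

    tokens = "the room was not bad at all".split()
    print(word_features(tokens, tokens.index("bad")))   # -> [0, 0, 0, 1]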

  12. Comparison of Rating Prediction Accuracy Equation source: Blei, D. & McAuliffe, J. Supervised topic models. NIPS, 2007.
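
Assuming the cited equation is the predictive R^2 from the sLDA paper, pR2 = 1 - sum((y - yhat)^2) / sum((y - ybar)^2), a minimal sketch:

    import numpy as np

    def predictive_r2(y_true, y_pred):
        """Fraction of response variance explained by the predictions."""
        y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
        ss_res = np.sum((y_true - y_pred) ** 2)
        ss_tot = np.sum((y_true - y_true.mean()) ** 2)
        return 1.0 - ss_res / ss_tot

    print(predictive_r2([1, 2, 3, 4, 5], [1.2, 1.9, 3.3, 3.8, 5.1]))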

  13. Topics

  14. Ratings and Topics • Here, the authors show that supervised CTRF (sCTRF) achieves better separation of rating scores among the topics (top row) than MedLDA (bottom row).

  15. Feature Weights • Five features were considered: Default (equal to one for any word); Pos-JJ (positive adjective); Neg-JJ (negative adjective); Re-Pos-JJ (positive adjective preceded by an inverting word); and Re-Neg-JJ (negative adjective preceded by an inverting word). • The default feature dominates when the model is truncated to 5 topics, but becomes less important at higher truncation levels.
