Statistical Topic Models for Integrating and Analyzing Opinions in Blog Articles
Yue Lu, Qiaozhu Mei, ChengXiang Zhai
Why Opinion Integration?
• 190,451 posts; 4,773,658 results
• What has been said about Barack Obama? The health care reform? Hurricane Katrina? Al-Qaeda?
• How to digest it all?
Opinions Come in Two Kinds
• 190,451 posts; 4,773,658 results
• Q1: How to integrate and benefit from both?
Opinions Come with Context
• Context: source, author, time, location
• Q2: How to benefit from context?
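Statistical Topic Models: PLSA [Hofmann 99], [Zhai et al. 04]
• Topic model = unigram language model = multinomial distribution
• Generating a word w in a document d: with probability λ_B, draw w from the collection background model θ_B; otherwise (probability 1 - λ_B), pick topic θ_j with probability π_dj and draw w from θ_j
• Example topics: θ_1 (government 0.3, response 0.2, ...), θ_2 (oil 0.1, price 0.05, ...), θ_k (pray 0.2, bless 0.15, ...)
• Collection background θ_B: is 0.05, the 0.04, a 0.03, ...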
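The generative story on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word probabilities echo the slide's examples, the leftover probability mass is lumped into a placeholder token `<other>`, and the background weight `LAMBDA_B = 0.9` and all function names are hypothetical, not taken from the slides.

```python
import random

# Toy parameters. Word probabilities follow the slide's examples; "<other>"
# is a placeholder token that absorbs the remaining probability mass.
TOPICS = {
    "theta_1": {"government": 0.3, "response": 0.2, "<other>": 0.5},
    "theta_2": {"oil": 0.1, "price": 0.05, "<other>": 0.85},
    "theta_k": {"pray": 0.2, "bless": 0.15, "<other>": 0.65},
}
BACKGROUND = {"is": 0.05, "the": 0.04, "a": 0.03, "<other>": 0.88}
LAMBDA_B = 0.9  # assumed background weight, not given on the slides


def draw(dist, rng):
    """Sample one key from a {key: probability} multinomial."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[key] for key in keys], k=1)[0]


def generate_word(coverage, rng):
    """Generate one word for a document with topic coverage pi_d."""
    if rng.random() < LAMBDA_B:
        return draw(BACKGROUND, rng)   # background word
    topic = draw(coverage, rng)        # pick topic j with probability pi_dj
    return draw(TOPICS[topic], rng)    # draw the word from theta_j


def word_probability(word, coverage):
    """p(w | d): background part plus coverage-weighted topic part."""
    topical = sum(pi * TOPICS[t].get(word, 0.0) for t, pi in coverage.items())
    return LAMBDA_B * BACKGROUND.get(word, 0.0) + (1 - LAMBDA_B) * topical
```

For a document with coverage `{"theta_1": 0.5, "theta_2": 0.3, "theta_k": 0.2}`, `word_probability("government", coverage)` combines the background share (zero here, since "government" is not a background word in this toy setup) with the topic share weighted by 1 - λ_B.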
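PLSA Estimation
• Same generative model, but the topic word distributions θ_1, ..., θ_k and the document topic coverages π_d1, ..., π_dk are unknown (shown as "?" in the diagram)
• The collection background θ_B is given: is 0.05, the 0.04, a 0.03, ...
• Objective: log-likelihood of the collection
• Estimated with the maximum likelihood estimator (MLE) through an EM algorithm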
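A minimal EM sketch for this estimation problem follows. The slide's own formula did not survive, so the code assumes the usual background-mixture form of the log-likelihood: the sum over documents d and words w of c(w, d) · log[λ_B p(w|θ_B) + (1 - λ_B) Σ_j π_dj p(w|θ_j)]. The function name, the fixed `lambda_b`, the random initialization, and estimating the background from collection frequencies are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def plsa_em(docs, k, lambda_b=0.9, iters=50, seed=0):
    """Fit k topics by EM. docs is a list of non-empty token lists.

    Returns (theta, pi, history): theta[j][w] = p(w | topic j),
    pi[d][j] = coverage of topic j in document d, and history[i] = the
    log-likelihood of the parameters at the start of iteration i.
    """
    rng = random.Random(seed)
    counts = [Counter(d) for d in docs]
    vocab = sorted({w for c in counts for w in c})
    total = sum(sum(c.values()) for c in counts)
    # Background model: relative word frequency in the whole collection.
    background = {w: sum(c[w] for c in counts) / total for w in vocab}

    def normalize(weights):
        s = sum(weights.values())
        return {key: v / s for key, v in weights.items()}

    theta = [normalize({w: rng.random() + 0.1 for w in vocab}) for _ in range(k)]
    pi = [[1.0 / k] * k for _ in docs]
    history = []

    for _ in range(iters):
        new_theta = [dict.fromkeys(vocab, 0.0) for _ in range(k)]
        new_pi = [[0.0] * k for _ in docs]
        loglik = 0.0
        for d, c in enumerate(counts):
            for w, n in c.items():
                topical = [pi[d][j] * theta[j][w] for j in range(k)]
                p_w = lambda_b * background[w] + (1 - lambda_b) * sum(topical)
                loglik += n * math.log(p_w)
                for j in range(k):
                    # E-step: expected count of w in d generated by topic j.
                    expected = n * (1 - lambda_b) * topical[j] / p_w
                    new_theta[j][w] += expected
                    new_pi[d][j] += expected
        # M-step: renormalize the expected counts.
        theta = [normalize(t) for t in new_theta]
        pi = [[x / sum(row) for x in row] for row in new_pi]
        history.append(loglik)
    return theta, pi, history
```

Calling `plsa_em(docs, k=2)` on a handful of tokenized posts returns the topic word distributions, the per-document coverage, and a log-likelihood trace that should not decrease from one iteration to the next, which is a quick sanity check on the updates.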
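Exploiting Expert Opinions in PLSA [Lu & Zhai WWW08]
• Q1: How to integrate and benefit from both?
• Expert opinion segments r_1, r_2 (e.g. "Government response", "Oil price") are added as Dirichlet priors on the topics θ_1, θ_2
• Blog opinions are the documents being modeled, with topic coverage π_d1, ..., π_dk
• Words come from the topics with probability 1 - λ_B and from the collection background θ_B (is 0.05, the 0.04, a 0.03, ...) with probability λ_B
• Estimation changes from MLE to MAP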
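One common way to realize "add as Dirichlet priors" is to treat each expert segment as pseudo-counts in the M-step, which turns the MLE update into a MAP update. The sketch below assumes that conjugate-prior form; the prior strength `mu` and both function names are hypothetical and are not stated on the slide.

```python
def prior_from_segment(text):
    """Unigram language model of an expert segment (relative frequencies)."""
    tokens = text.lower().split()
    return {w: tokens.count(w) / len(tokens) for w in set(tokens)}


def map_topic_update(expected_counts, prior_model, mu):
    """MAP re-estimate of one topic's word distribution.

    expected_counts: E-step counts for this topic, {word: count}.
    prior_model:     p(w | r_j) from the expert segment, summing to 1.
    mu:              assumed prior strength; mu = 0 gives the plain MLE update.
    """
    vocab = set(expected_counts) | set(prior_model)
    total = sum(expected_counts.values()) + mu
    return {
        w: (expected_counts.get(w, 0.0) + mu * prior_model.get(w, 0.0)) / total
        for w in vocab
    }
```

With `mu = 0` the prior has no effect; as `mu` grows, the topic is pulled toward the expert segment's vocabulary, so the blog text is organized around the aspects the expert text already names.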
Exploiting Opinion Context in PLSA [Mei et al. WWW06]
• Q2: How to benefit from context?
• Topic coverage is conditioned on context, e.g. c_1: Year=06, c_2: Year=08
• Spatiotemporal context of blog opinions: time and location
• Quantities of interest: P(θ_i | time, location), P(θ_i, location | time), P(time | θ_i, location)
• Words come from the topics with probability 1 - λ_B and from the collection background θ_B (is 0.05, the 0.04, a 0.03, ...) with probability λ_B
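To make the three context-conditioned quantities concrete, here is a deliberately simplified sketch that aggregates already-estimated per-document topic coverage by time and location. This post-hoc aggregation is an assumption for illustration and is not the estimator of the cited work, which conditions coverage on context inside the model; the function names and the tuple layout are hypothetical.

```python
from collections import defaultdict


def aggregate_coverage(docs):
    """Sum topic coverage per (topic, time, location).

    docs: iterable of (time, location, coverage), coverage = {topic: pi_dj}.
    """
    mass = defaultdict(float)
    for time, location, coverage in docs:
        for topic, p in coverage.items():
            mass[(topic, time, location)] += p
    return dict(mass)


def _normalize(cells):
    z = sum(cells.values())
    return {key: m / z for key, m in cells.items()} if z > 0 else {}


def coverage_in_context(mass, time, location):
    """P(topic | time, location)."""
    return _normalize({t: m for (t, tm, loc), m in mass.items()
                       if tm == time and loc == location})


def snapshot(mass, time):
    """P(topic, location | time): one time slice across locations."""
    return _normalize({(t, loc): m for (t, tm, loc), m in mass.items()
                       if tm == time})


def life_cycle(mass, topic, location):
    """P(time | topic, location): how a topic's strength moves over time."""
    return _normalize({tm: m for (t, tm, loc), m in mass.items()
                       if t == topic and loc == location})
```

`snapshot` corresponds to a per-time map of where a topic is being discussed, and `life_cycle` to a per-location curve of when it is discussed.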
Spatiotemporal Analysis on Hurricane Katrina
• Snapshot of topic coverage: P(θ_i = Government Response, location | time)
Spatiotemporal Analysis on Hurricane Katrina
• Topic life cycle: P(time | θ_i, location = Texas)
Summary
• Problem: opinion integration and analysis
• Approaches:
  • Unsupervised statistical topic models
  • Domain independent, general and robust
• Many potential applications:
  • Intelligence analysis
  • Public opinion tracking
  • …
• Future work:
  • System/toolkit building
  • More interactive support
  • More NLP: co-reference