This paper explores a novel Topic-Perspective Model for social tagging systems that integrates user, document, word, and tag data within a unified framework. By simulating the annotation generation process, the model addresses both visible and hidden variables, improving tag recommendation, prediction, clustering, and classification tasks. Key methodologies such as Variational Expectation Maximization and Gibbs Sampling are employed to estimate parameters. Experiments conducted on datasets like del.icio.us demonstrate the model's efficacy in enhancing personalized search capabilities and generating relevant tags for new documents.
INTRODUCTION • Social data: social annotations (tags) are a new type of information source • Applications: tag recommendation, prediction, clustering, classification, and information retrieval (IR)
RELATED WORK 1 • Topic Analysis using Generative Models (text mining): 1. Naïve Bayesian model, 2. Probabilistic Latent Semantic Indexing (PLSI) model, 3. Latent Dirichlet Allocation (LDA) model • LDA variants: correlated LDA, switchLDA, Link-LDA, Topic-Link LDA
RELATED WORK 2 • Generative Models for Social Tagging: 1. Conditionally-independent LDA (CI-LDA) model, 2. Community-based categorical annotation (CCA) model, 3. Correlated or correspondence LDA (CorrLDA) model
D×K: document-topic distribution • K×W: topic-word distribution • K×T: topic-tag distribution
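Topic-Perspective Model • Realistically simulates the annotation generation process, with user, document, word, and tag unified in a single model • Motivation: represent and connect the visible and hidden variables • Output: user perspectives, which can be used for personalized search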
U×L: user-perspective distribution • D×K: document-topic distribution • K×W: topic-word distribution • K×T: topic-tag distribution • L×T: perspective-tag distribution • A vector indicating the probability that each tag is generated from topics
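The five matrices listed above can be made concrete with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes symmetric Dirichlet priors, assumes that a topic-generated tag draws its topic from the document-topic distribution, and simplifies the per-tag switch vector to a single hypothetical scalar `lam` (the probability that a tag comes from a topic rather than from a user perspective). All function and parameter names are illustrative.

```python
import random

def dirichlet_rows(n_rows, n_cols, alpha, rng):
    """Return n_rows probability vectors of length n_cols (symmetric Dirichlet)."""
    rows = []
    for _ in range(n_rows):
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_cols)]
        total = sum(draws)
        rows.append([d / total for d in draws])
    return rows

def make_params(U, D, K, L, W, T, alpha, rng):
    """Build the five distributions named on the slide, keyed by shape."""
    return {
        "user_persp": dirichlet_rows(U, L, alpha, rng),  # U x L
        "doc_topic": dirichlet_rows(D, K, alpha, rng),   # D x K
        "topic_word": dirichlet_rows(K, W, alpha, rng),  # K x W
        "topic_tag": dirichlet_rows(K, T, alpha, rng),   # K x T
        "persp_tag": dirichlet_rows(L, T, alpha, rng),   # L x T
    }

def draw(probs, rng):
    """Sample one index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_annotation(d, u, n_words, n_tags, params, lam, rng):
    """Simulate the words of document d and the tags user u assigns to it."""
    words = []
    for _ in range(n_words):
        k = draw(params["doc_topic"][d], rng)       # topic for this word
        words.append(draw(params["topic_word"][k], rng))
    tags = []
    for _ in range(n_tags):
        if rng.random() < lam:                      # assumed scalar switch
            k = draw(params["doc_topic"][d], rng)   # tag explained by a topic
            tags.append(draw(params["topic_tag"][k], rng))
        else:
            p = draw(params["user_persp"][u], rng)  # tag explained by a perspective
            tags.append(draw(params["persp_tag"][p], rng))
    return words, tags

rng = random.Random(0)
params = make_params(U=3, D=4, K=2, L=2, W=10, T=6, alpha=1.0, rng=rng)
words, tags = generate_annotation(d=0, u=1, n_words=5, n_tags=3,
                                  params=params, lam=0.7, rng=rng)
```

The returned `words` and `tags` are lists of integer ids into a vocabulary of W words and T tags. Inference runs in the opposite direction: the matrices are estimated from observed words and tags.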
Parameter Estimation • Variational expectation maximization • Expectation propagation • Gibbs sampling
Experiments and results • Dataset: del.icio.us, 1-2 2009, 41190 documents, 4414 users, 28740 tags, 129908 words; 10% test, 90% train • Evaluation criterion: perplexity, which measures the model's ability to generalize to the tags of new documents
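The slide does not give the perplexity formula, so the sketch below assumes the standard definition: the exponential of the negative mean log-likelihood of the held-out tags, exp(-(1/N) Σ log p(t_n)). Lower values mean the model predicts the tags of unseen documents better. The function name and its input, a flat list of model probabilities for each held-out tag, are illustrative assumptions.

```python
import math

def perplexity(tag_probs):
    """Perplexity of held-out tags, given the model probability of each tag."""
    if not tag_probs:
        raise ValueError("need at least one held-out tag probability")
    if any(p <= 0.0 for p in tag_probs):
        raise ValueError("probabilities must be strictly positive")
    log_likelihood = sum(math.log(p) for p in tag_probs)
    return math.exp(-log_likelihood / len(tag_probs))

# A model that spreads its mass uniformly over 4 tags has perplexity close to 4.
uniform_score = perplexity([0.25] * 8)

# A model that assigns higher probability to the observed tags scores lower.
sharper_score = perplexity([0.5, 0.4, 0.6, 0.5])
```

As a sanity check, a uniform model over T tags has perplexity T, so any useful model on this dataset should score well below its tag vocabulary size.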
Experiment Setup • Choosing the number of topics K and the number of perspectives L