
A Non-parametric Bayesian Approach [WSDM’14]

User Modeling in Search Engine Logs. Hongning Wang, Advisor: ChengXiang Zhai, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA, {wang296,czhai}@Illinois.edu.



Presentation Transcript


  1. User Modeling in Search Engine Logs
     Hongning Wang, Advisor: ChengXiang Zhai
     Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA
     {wang296,czhai}@Illinois.edu

     A Non-parametric Bayesian Approach [WSDM’14]
     In this work, we study the problem of user modeling in search log data and propose a generative model, dpRank, within a non-parametric Bayesian framework. By postulating generative assumptions about a user's search behaviors, dpRank jointly identifies each individual user's latent search interests and his/her distinct result preferences. Experimental results on a large-scale news search log data set validate the effectiveness of the proposed approach, which not only provides in-depth understanding of a user's search intents but also benefits a variety of personalized applications.

     A Ranking Model Adaptation Approach [SIGIR’13]
     In this work, we propose a general ranking model adaptation framework for personalized search. The proposed framework quickly learns to apply a series of linear transformations, e.g., scaling and shifting, over the parameters of the given global ranking model such that the adapted model can better fit each individual user's search preferences. Extensive experimentation based on a large set of search logs from a major commercial Web search engine confirms the effectiveness of the proposed method compared to several state-of-the-art ranking model adaptation methods.

     Poster callouts:
     • Loss function from any linear learning-to-rank algorithm, e.g., RankNet, LambdaRank, RankSVM
     • Complexity of adaptation
     • Use cross-training to determine feature grouping
     • Margin rescaling
     • Non-linear kernels
     • Per-user basis adaptation baseline

     [Figure labels: p(Q); queries q1, q2, q3; features f1, f2; Group 1, ..., Group k, ..., Group c]
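The scaling-and-shifting idea in the adaptation abstract can be illustrated with a minimal sketch. It assumes a linear global ranking model and that each feature has already been assigned to a group sharing one scaling and one shifting value. The function names, the grouping, and all numbers are hypothetical, and how the transformation is learned from a user's clicks is not shown.

```python
from typing import List, Sequence


def adapt_weights(global_w: Sequence[float], groups: Sequence[int],
                  scale: Sequence[float], shift: Sequence[float]) -> List[float]:
    """Per-user weights: w_user[i] = scale[g] * global_w[i] + shift[g],
    where g = groups[i] is the (assumed) group of feature i."""
    return [scale[g] * w + shift[g] for w, g in zip(global_w, groups)]


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear ranking score: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def rank(weights: Sequence[float], docs: Sequence[Sequence[float]]) -> List[int]:
    """Document indices sorted by descending score."""
    return sorted(range(len(docs)), key=lambda i: score(weights, docs[i]),
                  reverse=True)


# Hypothetical global model over three features.
global_w = [1.0, 0.5, 0.2]
# Features 0 and 1 share group 0; feature 2 is alone in group 1.
groups = [0, 0, 1]
# This user's transformation: leave group 0 unchanged, boost group 1.
scale = [1.0, 3.0]
shift = [0.0, 0.1]

user_w = adapt_weights(global_w, groups, scale, shift)

# Two candidate documents described by their feature vectors.
docs = [[1.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
print("global order:", rank(global_w, docs))
print("adapted order:", rank(user_w, docs))
```

Sharing one scale and shift per feature group keeps the number of per-user parameters small, which matters when a user has only a few logged queries.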
     Methods: A Non-parametric Bayesian Approach (dpRank)
     • Dirichlet Process Prior
     • Aggregated level: information shared by all the users (latent user groups)
     • Individual level: characterize user's own interest
     • Modeling of search interest
     • Modeling of result preferences
     • A fully generative model for exploring users' search behaviors:
       1. Draw latent user groups from DP
       2. Draw group membership for each user from DP
       3. To generate a query in user u:
          3.1 Draw a latent user group c
          3.2 Draw query qi for user u accordingly
          3.3 Draw click preferences for qi accordingly
     • Gibbs sampling for posterior inference

     Methods: A Ranking Model Adaptation Approach
     • Adjust the generic ranking model's parameters with respect to each individual user's ranking preferences
     • Linear regression based model adaptation
     • Induced optimization problem in the same complexity as the original problem
     • Instantiation of RankSVM

     [Figure labels: x, y; Global model; Group1, Group2, ..., Groupk; Pairwise ranking model]

     Experimental Results: A Non-parametric Bayesian Approach (dpRank)
     • Yahoo! News search logs
     • May to July, 2011
     • 65 ranking features for each query-document pair
     • Query distribution in latent user groups
     • Click preferences in latent user groups
     • Document ranking

     Experimental Results: A Ranking Model Adaptation Approach
     • Bing query log: May 27, 2012 – May 31, 2012
     • 1830 ranking features
     • Query-level improvement against global model
     • User-level improvement against global model
     • Adaptation efficiency

     [Figure labels: site authority; proximity in title; query match in title; document age; [10, ∞) queries; [5, 10) queries; (0, 5) queries. * indicates p-value < 0.01]
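Drawing group memberships from a Dirichlet process, as in the generative steps listed under Methods, can be sketched with the Chinese restaurant process, one standard way to sample partitions from a DP. This is only an illustration and not necessarily the construction used in dpRank. The function name and the concentration parameter `alpha` are assumptions, and the query and click-preference draws are omitted because their distributions are not given in the transcript.

```python
import random
from typing import List


def crp_assignments(n_items: int, alpha: float, rng: random.Random) -> List[int]:
    """Sample group labels for n_items via the Chinese restaurant process.

    Item n (0-based) joins existing group k with probability
    counts[k] / (n + alpha) and opens a new group with probability
    alpha / (n + alpha), so the number of groups is not fixed in advance.
    """
    counts: List[int] = []        # current size of each group
    assignments: List[int] = []
    for n in range(n_items):
        r = rng.random() * (n + alpha)
        acc = 0.0
        chosen = len(counts)      # default: open a new group
        for k, c in enumerate(counts):
            acc += c
            if r < acc:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(0)
        counts[chosen] += 1
        assignments.append(chosen)
    return assignments


# Hypothetical usage: assign 20 queries to latent groups.
rng = random.Random(0)
labels = crp_assignments(20, alpha=1.0, rng=rng)
print("group labels:", labels)
print("number of groups:", max(labels) + 1)
```

Larger values of `alpha` tend to produce more groups. In a full model, a Gibbs sampler would resample each label conditioned on the observed queries and clicks rather than drawing from this prior alone.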
