
Learning User Interaction Models for Predicting Web Search Result Preferences



  1. Learning User Interaction Models for Predicting Web Search Result Preferences Eugene Agichtein Eric Brill Susan Dumais Robert Ragno Microsoft Research

  2. User Interactions • Goal: Harness rich user interactions with search results to improve quality of search • Millions of users submit queries daily and interact with the search results • Clicks, query refinement, dwell time • User interactions with search engines are plentiful, but require careful interpretation • We will predict user preferences for results

  3. Related Work • Linking implicit interactions and explicit judgments • Fox et al. [TOIS 2005] • Predict explicit satisfaction rating • Joachims [SIGIR 2005] • Predict preference (gaze studies, interpretation strategies) • A broader overview of analyzing implicit interactions: Kelly & Teevan [SIGIR Forum 2003]

  4. Outline • Distributional model of user interactions • User Behavior = Relevance + “Noise” • Rich set of user interaction features • Learning framework to predict user preferences • Large-scale evaluation

  5. Interpreting User Interactions • Clickthrough and subsequent browsing behavior of individual users is influenced by many factors • Relevance of a result to a query • Visual appearance and layout • Result presentation order • Context, history, etc. • General idea: • Aggregate interactions across all users and queries • Compute “expected” behavior for any query/page • Recover the relevance signal for a given query

  6. Case Study: Clickthrough • [Figure: clickthrough frequency by result position, aggregated over all queries in the sample] • Clickthrough(query q, document d, result position p) = expected(p) + relevance(q, d)

  7. Clickthrough for Queries with Known Position of Top Relevant Result • [Figure: relative clickthrough for queries whose top relevant result is known to be at position 1]

  8. Clickthrough for Queries with Known Position of Top Relevant Result • Higher clickthrough at the top non-relevant document than at the top relevant document • [Figure: relative clickthrough for queries with known relevant results at positions 1 and 3, respectively]

  9. Deviation from Expected • Relevance component: the deviation from “expected” behavior: Relevance(q, d) = observed(q, d, p) - expected(p)
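A minimal sketch of this decomposition, assuming click logs are available as (query, doc, position, clicked) tuples and that each result stays at one position for a query; all names here are illustrative, not the paper's implementation:

```python
from collections import defaultdict

def expected_ctr_by_position(observations):
    """Aggregate clicks over all queries to estimate expected(p)."""
    clicks, views = defaultdict(int), defaultdict(int)
    for query, doc, position, clicked in observations:
        views[position] += 1
        clicks[position] += int(clicked)
    return {p: clicks[p] / views[p] for p in views}

def relevance_deviation(observations):
    """Relevance(q, d) = observed(q, d) - expected(p), per the slides."""
    expected = expected_ctr_by_position(observations)
    clicks, views, pos = defaultdict(int), defaultdict(int), {}
    for query, doc, position, clicked in observations:
        key = (query, doc)
        views[key] += 1
        clicks[key] += int(clicked)
        pos[key] = position  # assumes one position per (query, doc)
    return {key: clicks[key] / views[key] - expected[pos[key]]
            for key in views}
```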

  10. Beyond Clickthrough: Rich User Interaction Space • Observed and Distributional features • Observed features: aggregated values over all user interactions for each query and result pair • Distributional features: deviations from the “expected” behavior for the query • Represent user interactions as vectors in “Behavior Space” • Presentation: what a user sees before click • Clickthrough: frequency and timing of clicks • Browsing: what users do after the click
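As a rough illustration of a point in “Behavior Space”, the sketch below mixes observed aggregates with distributional deviations for one query/page pair; the specific feature names are hypothetical stand-ins for the presentation, clickthrough, and browsing groups listed on the backup slides:

```python
def behavior_vector(observed, query_background):
    """observed: aggregates for this query/page pair;
    query_background: "expected" values for the query.
    Returns one flat feature vector."""
    return [
        observed["title_overlap"],                          # presentation
        observed["position"],                               # presentation
        observed["ctr"],                                    # clickthrough (observed)
        observed["ctr"] - query_background["ctr"],          # clickthrough (distributional)
        observed["dwell_time"],                             # browsing (observed)
        observed["dwell_time"] - query_background["dwell_time"],  # browsing (distributional)
    ]
```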

  11. Some User Interaction Features

  12. Outline • Distributional model of user interactions • Rich set of user interaction features • Models for predicting user preferences • Experimental results

  13. Predicting Result Preferences • Task: predict pairwise preferences • A user will prefer Result A > Result B • Models for preference prediction • Current search engine ranking • Clickthrough • Full user behavior model

  14. Clickthrough Model • SA+N: “Skip Above” and “Skip Next” • Adapted from Joachims et al. [SIGIR 2005] • Motivated by gaze tracking • Example (results ranked 1–8) • Click on results 2, 4 • Skip Above: 4 > (1, 3), 2 > 1 • Skip Next: 4 > 5, 2 > 3
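A small sketch of the SA+N heuristic, assuming the 1-based positions of clicked results for one query impression are known; `skip_above_next` is an illustrative name, not code from the paper:

```python
def skip_above_next(clicked, num_results=10):
    """Derive preference pairs (preferred, non-preferred) from clicks."""
    clicked = sorted(set(clicked))
    prefs = set()
    for c in clicked:
        # Skip Above: clicked result preferred to every unclicked result above it
        for above in range(1, c):
            if above not in clicked:
                prefs.add((c, above))
        # Skip Next: clicked result preferred to the unclicked result just below it
        if c + 1 <= num_results and c + 1 not in clicked:
            prefs.add((c, c + 1))
    return prefs

# Example from the slide: clicks on 2 and 4 yield
# Skip Above: 4>1, 4>3, 2>1 and Skip Next: 4>5, 2>3.
print(skip_above_next([2, 4]))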

  15. Distributional Model • CD: distributional model, extends SA+N • A clickthrough counts as evidence only if its frequency exceeds the expected frequency by more than ε • Click on result 2 likely “by chance” • 4 > (1, 2, 3, 5), but not 2 > (1, 3)
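A sketch of the CD filter under the same assumptions, with `ctr`, `expected`, and `epsilon` as hypothetical inputs (observed and background click frequencies per position, and the significance margin):

```python
def cd_preferences(ctr, expected, epsilon, num_results=10):
    """ctr: position -> observed click frequency for this query;
    expected: position -> background click frequency for that position."""
    # Keep only clicks unlikely to have happened "by chance"
    significant = {p for p, f in ctr.items()
                   if f - expected.get(p, 0.0) > epsilon}
    prefs = set()
    for c in significant:
        # As in SA+N, but filtered clicks (e.g. result 2 on the slide)
        # no longer generate preferences and can be skipped over.
        for other in list(range(1, c)) + [c + 1]:
            if other not in significant and other <= num_results:
                prefs.add((c, other))
    return prefs
```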

  16. User Behavior Model • Full set of interaction features • Presentation, clickthrough, browsing • Train the model with explicit judgments • Input: behavior feature vectors for each query-page pair in the rated results • Use RankNet (Burges et al. [ICML 2005]) to discover model weights • Output: a neural net that can assign a “relevance” score to a behavior feature vector

  17. RankNet for User Behavior • RankNet: general, scalable, robust neural net training algorithm and implementation • Optimized for ranking: predicting an ordering of items, not a score for each • Trains on pairs (where the first point is to be ranked higher than or equal to the second) • Extremely efficient • Uses a cross-entropy cost (probabilistic model) • Uses gradient descent to set weights • Restarts to escape local minima
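The pairwise cross-entropy objective can be sketched as follows; a linear scorer stands in here for RankNet's neural net, and the plain training loop is an assumption, not the paper's optimized implementation:

```python
import numpy as np

def ranknet_loss(w, x_preferred, x_other):
    """Cross-entropy cost for one pair: P(preferred > other) is a
    logistic function of the score difference."""
    diff = np.dot(w, x_preferred) - np.dot(w, x_other)
    return np.log1p(np.exp(-diff))  # -log sigmoid(diff)

def train(pairs, dim, lr=0.1, epochs=100):
    """Gradient descent over preference pairs of feature vectors
    (x_preferred, x_other), each a numpy array of length dim."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for x_pref, x_other in pairs:
            diff = np.dot(w, x_pref) - np.dot(w, x_other)
            # d/dw of log(1 + exp(-diff))
            grad = -(1.0 / (1.0 + np.exp(diff))) * (x_pref - x_other)
            w -= lr * grad
    return w
```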

  18. Outline • Distributional model of user interactions • Rich set of user interaction features • Models for predicting user preferences • Experimental evaluation

  19. Evaluation Metrics • Task: predict user preferences • Pairwise agreement: • For comparison with previous work • Useful for ranking and other applications • Precision for a query: • Fraction of predicted pairs that agree with preferences derived from human ratings • Recall for a query: • Fraction of human-rated preferences predicted correctly • Average precision and recall across all queries
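These metrics reduce to simple set operations over preference pairs; a minimal sketch, assuming predicted and human-derived pairs are available per query:

```python
def precision_recall(predicted, human):
    """predicted, human: sets of (preferred_doc, other_doc) pairs
    for one query. Returns (precision, recall)."""
    if not predicted or not human:
        return 0.0, 0.0
    agree = len(predicted & human)
    return agree / len(predicted), agree / len(human)

def macro_average(per_query):
    """per_query: list of (precision, recall), one entry per query."""
    n = len(per_query)
    return (sum(p for p, _ in per_query) / n,
            sum(r for _, r in per_query) / n)
```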

  20. Datasets • Explicit judgments • 3,500 queries, top 10 results, relevance ratings converted to pairwise preferences for each query • User behavior data • Opt-in client-side instrumentation • Anonymized UserID, time, visited page • Detect queries submitted to the MSN Search engine • Subsequent visited pages • 120,000 instances of these 3,500 queries, each submitted at least 2 times over 21 days

  21. Methods Compared Preferences inferred by: • Current search engine ranking: Baseline • Result i > Result j iff i is ranked above j • Clickthrough model: SA+N • Clickthrough distributional model: CD • Full user behavior model: UserBehavior

  22. Results: Predicting User Preferences • Baseline < SA+N < CD << UserBehavior • Rich user behavior features result in dramatic improvement

  23. Contribution of Feature Types • Presentation features not helpful • Browsing features: higher precision, lower recall • Clickthrough features alone outperform CD: the gain comes from learning

  24. Amount of Interaction Data • Prediction accuracy for varying amounts of user interaction data per query • Slight increase in Recall, substantial increase in Precision

  25. Learning Curve • At a minimum precision of 0.7 • Recall increases substantially with more days of user interactions

  26. Experiments Summary • Clickthrough distributional model: more accurate than previously published work • Rich user behavior features: dramatic accuracy improvement • Accuracy increases for frequent queries and longer observation period

  27. Some Applications • Web search ranking (next talk): • Can use preference predictions to re-rank results • Can integrate features into ranking algorithms • Identifying and answering navigational queries • Can tune model to focus on top 1 result • Supports classification or ranking methods • Details in Agichtein & Zheng, [KDD 2006] • Automatic evaluation: augment explicit relevance judgments

  28. Conclusions • General framework for training rich user interaction models • Robust techniques for inferring user relevance preferences • High-accuracy preference prediction in a large scale evaluation

  29. Thank you Text Mining, Search, and Navigation group: http://research.microsoft.com/tmsn/ Adaptive Systems and Interaction group: http://research.microsoft.com/adapt/ Microsoft Research

  30. Presentation Features • Query terms in Title, Summary, URL • Position of result • Length of URL • Depth of URL • …

  31. Clickthrough Features • Fraction of clicks on URL • Deviation from “expected” given result position • Time to click • Time to first click in “session” • Deviation from average time for query

  32. Browsing Features • Time on URL • Cumulative time on URL (CuriousBrowser) • Deviation from average time on URL • Averaged over the “user” • Averaged over all results for the query • Number of subsequent non-result URLs

  33. An Intelligent Baseline
