
Learning User Interaction Models for Predicting Web Search Result Preferences



  1. Learning User Interaction Models for Predicting Web Search Result Preferences Eugene Agichtein Eric Brill Susan Dumais Robert Ragno Microsoft Research

  2. User Interactions • Goal: Harness rich user interactions with search results to improve quality of search • Millions of users submit queries daily and interact with the search results • Clicks, query refinement, dwell time • User interactions with search engines are plentiful, but require careful interpretation • We will predict user preferences for results

  3. Related Work • Linking implicit interactions and explicit judgments • Fox et al. [TOIS 2005] • Predict explicit satisfaction rating • Joachims [SIGIR 2005] • Predict preference (gaze studies, interpretation strategies) • A broader overview of analyzing implicit interactions: Kelly & Teevan [SIGIR Forum 2003]

  4. Outline • Distributional model of user interactions • User Behavior = Relevance + “Noise” • Rich set of user interaction features • Learning framework to predict user preferences • Large-scale evaluation

  5. Interpreting User Interactions • Clickthrough and subsequent browsing behavior of individual users is influenced by many factors • Relevance of a result to a query • Visual appearance and layout • Result presentation order • Context, history, etc. • General idea: • Aggregate interactions across all users and queries • Compute “expected” behavior for any query/page • Recover the relevance signal for a given query

  6. Case Study: Clickthrough • [Figure: clickthrough frequency by result position, aggregated over all queries in the sample] • Clickthrough(query q, document d, result position p) = expected(p) + relevance(q, d)

  7. Clickthrough for Queries with Known Position of Top Relevant Result • [Figure: relative clickthrough for queries whose top relevant result is known to be at position 1]

  8. Clickthrough for Queries with Known Position of Top Relevant Result • Higher clickthrough at the top non-relevant document than at the top relevant document • [Figure: relative clickthrough for queries with known relevant results at positions 1 and 3, respectively]

  9. Deviation from Expected • Relevance component: the deviation from “expected” behavior: Relevance(q, d) = observed(q, d, p) - expected(p)
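A minimal sketch of this decomposition, assuming click logs are available as (query, doc, position, clicked) tuples and that each result stays at one position for a query; all names here are illustrative, not the paper's implementation:

```python
from collections import defaultdict

def expected_ctr_by_position(observations):
    """Aggregate clicks over all queries to estimate expected(p)."""
    clicks, views = defaultdict(int), defaultdict(int)
    for query, doc, position, clicked in observations:
        views[position] += 1
        clicks[position] += int(clicked)
    return {p: clicks[p] / views[p] for p in views}

def relevance_deviation(observations):
    """Relevance(q, d) = observed(q, d) - expected(p), per the slides."""
    expected = expected_ctr_by_position(observations)
    clicks, views, pos = defaultdict(int), defaultdict(int), {}
    for query, doc, position, clicked in observations:
        key = (query, doc)
        views[key] += 1
        clicks[key] += int(clicked)
        pos[key] = position  # assumes one position per (query, doc)
    return {key: clicks[key] / views[key] - expected[pos[key]]
            for key in views}
```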

  10. Beyond Clickthrough: Rich User Interaction Space • Observed and Distributional features • Observed features: aggregated values over all user interactions for each query and result pair • Distributional features: deviations from the “expected” behavior for the query • Represent user interactions as vectors in “Behavior Space” • Presentation: what a user sees before click • Clickthrough: frequency and timing of clicks • Browsing: what users do after the click
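As a rough illustration of a point in “Behavior Space”, the sketch below mixes observed aggregates with distributional deviations for one query/page pair; the specific feature names are hypothetical stand-ins for the presentation, clickthrough, and browsing groups listed on the backup slides:

```python
def behavior_vector(observed, query_background):
    """observed: aggregates for this query/page pair;
    query_background: "expected" values for the query.
    Returns one flat feature vector."""
    return [
        observed["title_overlap"],                          # presentation
        observed["position"],                               # presentation
        observed["ctr"],                                    # clickthrough (observed)
        observed["ctr"] - query_background["ctr"],          # clickthrough (distributional)
        observed["dwell_time"],                             # browsing (observed)
        observed["dwell_time"] - query_background["dwell_time"],  # browsing (distributional)
    ]
```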

  11. Some User Interaction Features

  12. Outline • Distributional model of user interactions • Rich set of user interaction features • Models for predicting user preferences • Experimental results

  13. Predicting Result Preferences • Task: predict pairwise preferences • A user will prefer Result A > Result B • Models for preference prediction • Current search engine ranking • Clickthrough • Full user behavior model

  14. Clickthrough Model • SA+N: “Skip Above” and “Skip Next” • Adapted from Joachims et al. [SIGIR 2005] • Motivated by gaze tracking • Example (results ranked 1–8) • Click on results 2, 4 • Skip Above: 4 > (1, 3), 2 > 1 • Skip Next: 4 > 5, 2 > 3
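A small sketch of the SA+N heuristic, assuming the 1-based positions of clicked results for one query impression are known; `skip_above_next` is an illustrative name, not code from the paper:

```python
def skip_above_next(clicked, num_results=10):
    """Derive preference pairs (preferred, non-preferred) from clicks."""
    clicked = sorted(set(clicked))
    prefs = set()
    for c in clicked:
        # Skip Above: clicked result preferred to every unclicked result above it
        for above in range(1, c):
            if above not in clicked:
                prefs.add((c, above))
        # Skip Next: clicked result preferred to the unclicked result just below it
        if c + 1 <= num_results and c + 1 not in clicked:
            prefs.add((c, c + 1))
    return prefs

# Example from the slide: clicks on 2 and 4 yield
# Skip Above: 4>1, 4>3, 2>1 and Skip Next: 4>5, 2>3.
print(skip_above_next([2, 4]))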

  15. Distributional Model • CD: distributional model, extends SA+N • A clickthrough counts as evidence only if its frequency exceeds the expected frequency by more than ε • Click on result 2 likely “by chance” • 4 > (1, 2, 3, 5), but not 2 > (1, 3)
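A sketch of the CD filter under the same assumptions, with `ctr`, `expected`, and `epsilon` as hypothetical inputs (observed and background click frequencies per position, and the significance margin):

```python
def cd_preferences(ctr, expected, epsilon, num_results=10):
    """ctr: position -> observed click frequency for this query;
    expected: position -> background click frequency for that position."""
    # Keep only clicks unlikely to have happened "by chance"
    significant = {p for p, f in ctr.items()
                   if f - expected.get(p, 0.0) > epsilon}
    prefs = set()
    for c in significant:
        # As in SA+N, but filtered clicks (e.g. result 2 on the slide)
        # no longer generate preferences and can be skipped over.
        for other in list(range(1, c)) + [c + 1]:
            if other not in significant and other <= num_results:
                prefs.add((c, other))
    return prefs
```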

  16. User Behavior Model • Full set of interaction features • Presentation, clickthrough, browsing • Train the model with explicit judgments • Input: behavior feature vectors for each query-page pair in the rated results • Use RankNet (Burges et al. [ICML 2005]) to discover model weights • Output: a neural net that can assign a “relevance” score to a behavior feature vector

  17. RankNet for User Behavior • RankNet: general, scalable, robust neural net training algorithm and implementation • Optimized for ranking: predicting an ordering of items, not a score for each • Trains on pairs (where the first point is to be ranked higher than or equal to the second) • Extremely efficient • Uses a cross-entropy cost (probabilistic model) • Uses gradient descent to set weights • Restarts to escape local minima
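The pairwise cross-entropy objective can be sketched as follows; a linear scorer stands in here for RankNet's neural net, and the plain training loop is an assumption, not the paper's optimized implementation:

```python
import numpy as np

def ranknet_loss(w, x_preferred, x_other):
    """Cross-entropy cost for one pair: P(preferred > other) is a
    logistic function of the score difference."""
    diff = np.dot(w, x_preferred) - np.dot(w, x_other)
    return np.log1p(np.exp(-diff))  # -log sigmoid(diff)

def train(pairs, dim, lr=0.1, epochs=100):
    """Gradient descent over preference pairs of feature vectors
    (x_preferred, x_other), each a numpy array of length dim."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for x_pref, x_other in pairs:
            diff = np.dot(w, x_pref) - np.dot(w, x_other)
            # d/dw of log(1 + exp(-diff))
            grad = -(1.0 / (1.0 + np.exp(diff))) * (x_pref - x_other)
            w -= lr * grad
    return w
```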

  18. Outline • Distributional model of user interactions • Rich set of user interaction features • Models for predicting user preferences • Experimental evaluation

  19. Evaluation Metrics • Task: predict user preferences • Pairwise agreement: • For comparison with previous work • Useful for ranking and other applications • Precision for a query: • Fraction of predicted pairs that agree with preferences derived from human ratings • Recall for a query: • Fraction of human-rated preferences predicted correctly • Average precision and recall across all queries
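These metrics reduce to simple set operations over preference pairs; a minimal sketch, assuming predicted and human-derived pairs are available per query:

```python
def precision_recall(predicted, human):
    """predicted, human: sets of (preferred_doc, other_doc) pairs
    for one query. Returns (precision, recall)."""
    if not predicted or not human:
        return 0.0, 0.0
    agree = len(predicted & human)
    return agree / len(predicted), agree / len(human)

def macro_average(per_query):
    """per_query: list of (precision, recall), one entry per query."""
    n = len(per_query)
    return (sum(p for p, _ in per_query) / n,
            sum(r for _, r in per_query) / n)
```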

  20. Datasets • Explicit judgments • 3,500 queries, top 10 results, relevance ratings converted to pairwise preferences for each query • User behavior data • Opt-in client-side instrumentation • Anonymized UserID, time, visited page • Detect queries submitted to the MSN Search engine • Subsequent visited pages • 120,000 instances of these 3,500 queries, each submitted at least 2 times over 21 days

  21. Methods Compared Preferences inferred by: • Current search engine ranking: Baseline • Result i > Result j iff i is ranked above j • Clickthrough model: SA+N • Clickthrough distributional model: CD • Full user behavior model: UserBehavior

  22. Results: Predicting User Preferences • Baseline < SA+N < CD << UserBehavior • Rich user behavior features result in dramatic improvement

  23. Contribution of Feature Types • Presentation features not helpful • Browsing features: higher precision, lower recall • Clickthrough features alone outperform CD: the gain comes from learning

  24. Amount of Interaction Data • Prediction accuracy for varying amounts of user interaction data per query • Slight increase in Recall, substantial increase in Precision

  25. Learning Curve • At a minimum precision of 0.7 • Recall increases substantially with more days of user interactions

  26. Experiments Summary • Clickthrough distributional model: more accurate than previously published work • Rich user behavior features: dramatic accuracy improvement • Accuracy increases for frequent queries and longer observation period

  27. Some Applications • Web search ranking (next talk): • Can use preference predictions to re-rank results • Can integrate features into ranking algorithms • Identifying and answering navigational queries • Can tune model to focus on top 1 result • Supports classification or ranking methods • Details in Agichtein & Zheng, [KDD 2006] • Automatic evaluation: augment explicit relevance judgments

  28. Conclusions • General framework for training rich user interaction models • Robust techniques for inferring user relevance preferences • High-accuracy preference prediction in a large scale evaluation

  29. Thank you Text Mining, Search, and Navigation group: http://research.microsoft.com/tmsn/ Adaptive Systems and Interaction group: http://research.microsoft.com/adapt/ Microsoft Research

  30. Presentation Features • Query terms in Title, Summary, URL • Position of result • Length of URL • Depth of URL • …

  31. Clickthrough Features • Fraction of clicks on URL • Deviation from “expected” given result position • Time to click • Time to first click in “session” • Deviation from average time for query

  32. Browsing Features • Time on URL • Cumulative time on URL (CuriousBrowser) • Deviation from average time on URL • Averaged over the “user” • Averaged over all results for the query • Number of subsequent non-result URLs

  33. An Intelligent Baseline
