Query Chains: Learning to Rank from Implicit Feedback


Presentation Transcript


  1. Query Chains: Learning to Rank from Implicit Feedback Paper Authors: Filip Radlinski, Thorsten Joachims Presented By: Steven Carr

  2. The Problem • Web search results are often cluttered with documents the user considers irrelevant • Search engines don’t learn from the results you click on or from the revisions you make to your query

  3. Page Ranking Non-learning methods: • Link-based (Google PageRank) Learning methods: • Explicit user feedback • Ask users how relevant they found each result • Very accurate data, but very time-consuming • Implicit user feedback • Infer relevance from search engine logs • Unlimited data at low cost, but requires interpretation
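As a concrete illustration of the non-learning, link-based approach the slide contrasts against, here is a minimal PageRank sketch via power iteration. This is background material, not code from the paper; the damping factor and adjacency encoding are conventional choices.

```python
import numpy as np

def pagerank(adj, damping=0.85, iters=50):
    """Minimal PageRank via power iteration.

    adj[i][j] = 1 if page i links to page j.
    """
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    # Row-stochastic transitions; dangling pages link uniformly everywhere.
    trans = np.where(out > 0, adj / np.maximum(out, 1), 1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1 - damping) / n + damping * trans.T @ rank
    return rank

# Three pages: 0 -> 1, 1 -> 2, 2 -> 0 and 2 -> 1.
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [1, 1, 0]], dtype=float)
print(pagerank(adj))
```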

  4. The Solution • Automatically detect query chains • Use query chains to infer the relevance of results within each query and across all queries in the chain • Use a ranking Support Vector Machine (SVM) to learn a retrieval function from these relevance judgments • The Osmot search engine is built on this model

  5. Query Chains • People often reword their queries to get more useful results • Correcting a spelling mistake • Increasing or decreasing specificity • Posing a new but related query • A query chain is defined as such a sequence of reformulated queries

  6. Support Vector Machines • Learning method used for classification • Separates two classes of data points by finding the hyperplane that maximizes the margin: the distance between the hyperplane and the nearest points of each class • Assigns new data points to one of the two classes according to which side of the hyperplane they fall on (see the sketch below)
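A toy scikit-learn example (my illustration, not anything from the paper) showing the behaviour described above: fit a linear SVM on two separable classes, then classify new points by which side of the learned hyperplane they fall on.

```python
from sklearn import svm

# Two linearly separable classes in 2-D.
X = [[0, 0], [1, 1], [0, 1], [4, 4], [5, 5], [4, 5]]
y = [0, 0, 0, 1, 1, 1]

# A linear SVM finds the maximum-margin separating hyperplane.
clf = svm.SVC(kernel="linear")
clf.fit(X, y)

# New points are classified by which side of the hyperplane they land on.
print(clf.predict([[0.5, 0.5], [4.5, 4.5]]))  # -> [0 1]
print(clf.coef_, clf.intercept_)              # hyperplane parameters
```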

  7. Identifying Query Chains • Manually labeled query chains in five weeks of logs from the Cornell University library search engine • Used this data to train SVMs with various parameters, reaching an accuracy of 94.3% and a precision of 96.5% • A non-learning strategy, assuming all queries from the same IP address within a 30-minute period belong to the same chain, gave an accuracy and precision of 91.6% • The non-learning strategy was sufficiently accurate and cheaper, so the authors used it instead (a sketch follows below)
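A minimal sketch of that non-learning heuristic, assuming a log of (timestamp, ip, query) tuples sorted by time; the field layout is my assumption for illustration, not the paper's actual log schema.

```python
TIMEOUT = 30 * 60  # 30 minutes, in seconds

def segment_chains(log):
    """Group queries into chains: same IP address, gaps under 30 minutes.

    `log` is a list of (timestamp, ip, query) tuples sorted by timestamp;
    this field layout is an assumption, not the paper's log format.
    """
    chains = []
    last_seen = {}  # ip -> (last timestamp, index of its open chain)
    for ts, ip, query in log:
        if ip in last_seen and ts - last_seen[ip][0] <= TIMEOUT:
            idx = last_seen[ip][1]     # continue the existing chain
            chains[idx].append(query)
        else:
            chains.append([query])     # start a new chain
            idx = len(chains) - 1
        last_seen[ip] = (ts, idx)
    return chains
```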

  8. Inferring Relevance • Six strategies were developed for generating feedback from query chains • Click >q Skip Above: a clicked-on document is more relevant than any unclicked documents ranked above it • Click First >q No-Click Second: given the first two results, if only the first was clicked, it is the more relevant • Strategies 3 and 4 are the same as the first two, but judged with respect to the previous query in the chain • Click >q' Skip Earlier Query: a clicked-on document is more relevant than any documents that were skipped in an earlier query • Click >q' Top Two Earlier Query: if nothing was clicked in an earlier query, a document clicked later in the chain is more relevant than that query's top two results
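A sketch of the first strategy, "Click >q Skip Above", as I read it: each clicked document generates one preference pair over every unclicked document ranked above it. The data-structure choices here are mine, not the paper's.

```python
def click_skip_above(results, clicked):
    """Strategy 1, "Click >q Skip Above": a clicked document is
    preferred over every unclicked document ranked above it.

    `results` is the ranked list of doc ids for one query;
    `clicked` is the set of doc ids the user clicked.
    Returns (preferred, less_preferred) pairs.
    """
    prefs = []
    for rank, doc in enumerate(results):
        if doc in clicked:
            for above in results[:rank]:
                if above not in clicked:
                    prefs.append((doc, above))
    return prefs

# The user clicked the 3rd result, skipping the first two.
print(click_skip_above(["d1", "d2", "d3"], {"d3"}))
# -> [('d3', 'd1'), ('d3', 'd2')]
```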

  9. Example

  10. Learning Ranking Functions
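The formulas on this slide did not survive transcription. The ranking SVM the paper builds on (Joachims, 2002) learns a weight vector $w$ from the inferred preference pairs; in its standard form it solves:

```latex
\min_{w,\ \xi_{ij} \ge 0} \quad \frac{1}{2}\|w\|^{2} + C \sum_{i,j} \xi_{ij}
\qquad \text{s.t.} \quad
w \cdot \Phi(q, d_i) \;\ge\; w \cdot \Phi(q, d_j) + 1 - \xi_{ij}
\quad \text{for every inferred preference } d_i \succ_q d_j
```

Here Φ(q, d) maps a query-document pair to a feature vector, and documents are ranked by descending w · Φ(q, d). Treat this as the standard formulation rather than the slide's exact notation.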

  11. Experiment • The Osmot search engine was built as a wrapper implementing logging, analysis, and ranking • Users were presented with a combination of results from two different ranking functions • Which ranking was better was evaluated by which documents users clicked • The evaluation ran for two months, collecting around 2,400 queries
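A minimal sketch of how two rankings might be combined for this kind of paired click evaluation, in the spirit of balanced interleaving: each ranking alternately contributes its highest not-yet-shown document. The paper's exact combination scheme may differ.

```python
def interleave(ranking_a, ranking_b):
    """Alternate between two rankings, each contributing its highest
    not-yet-shown document (simplified balanced interleaving)."""
    combined, seen = [], set()
    iters, t = [iter(ranking_a), iter(ranking_b)], 0
    total = len(set(ranking_a) | set(ranking_b))
    while len(combined) < total:
        for doc in iters[t]:      # next unseen doc from ranking t
            if doc not in seen:
                combined.append(doc)
                seen.add(doc)
                break
        t = 1 - t                 # the other ranking goes next
    return combined

print(interleave(["d1", "d2", "d3"], ["d2", "d4", "d1"]))
# -> ['d1', 'd2', 'd3', 'd4']
```

Clicks on the combined list can then be attributed back to whichever ranking contributed the clicked document, giving the preference statistics reported on the next slide.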

  12. Experiment Results • Users preferred results from the query-chain ranking function 53% of the time • The model trained with query chains outperformed the model trained without them, with 99% statistical confidence

  13. Conclusion • Developed an algorithm to infer the relevance of a document from log entries • Developed a second algorithm that uses these preference judgments to learn an improved ranking function • The learned function can rank documents highly even if they did not appear in the original results for a query

  14. Critique • The learning method runs offline over log files rather than continually updating itself • The paper refers to other papers rather than explaining concepts needed to understand it • No comparison was offered between the effectiveness of their learning algorithm and that of other learning algorithms

  15. Questions?
