
Personalizing Web Search using Long Term Browsing History




Presentation Transcript


  1. Personalizing Web Search using Long Term Browsing History Nicolaas Matthijs, Cambridge Filip Radlinski, Microsoft In Proceedings of WSDM 2011

  2. Relevant result for the query “pia workshop” (results-page screenshot)

  3. Outline • Approaches to personalization • The proposed personalization strategy • Evaluation metrics • Results • Conclusions and future work

  4. Approaches to Personalization • Observed user interactions • Short-term interests: Sriram et al. [24] and [6] find that session data alone is too sparse to personalize with • Longer-term interests: [23, 16] model users by classifying previously visited Web pages; Joachims [11] uses click-through data to learn a search function; PClink [7] and Teevan et al. [28] • Other related approaches: [20, 25, 26] • Representing the user: Teevan et al. [28] use rich keyword-based representations, but make no use of Web page characteristics • Commercial personalization systems (e.g. Google, Yahoo!) promote URLs using a rich user profile

  5. Personalization Strategy: User Profile Generation Workflow • Input: browsing history (visited URLs with visit counts, previous searches and click-through data) • Data extraction: title unigrams, metadata description unigrams, metadata keywords, full-text unigrams, noun phrases • Filtering: WordNet dictionary filtering, Google N-Gram filtering, or no filtering • Term weighting: TF, TF×IDF, or BM25 • Output: user profile terms and weights
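The workflow above can be sketched in code. The following is a minimal illustrative sketch, not the authors' AlterEgo implementation; `build_profile`, its `(text, visit_count)` input format, and the optional dictionary whitelist are all hypothetical simplifications of the extraction and filtering steps:

```python
from collections import Counter
import re

def extract_unigrams(text):
    """Lowercase unigram extraction from one page field (e.g. title or metadata)."""
    return re.findall(r"[a-z]+", text.lower())

def build_profile(pages, dictionary=None):
    """Count term occurrences across visited pages, weighted by visit count.

    `pages` is a list of (field_text, visit_count) pairs; `dictionary` is an
    optional whitelist (e.g. a WordNet vocabulary) used as the filtering step.
    """
    counts = Counter()
    for text, visits in pages:
        for term in extract_unigrams(text):
            if dictionary is None or term in dictionary:
                counts[term] += visits
    return counts

profile = build_profile([("MIT CSAIL research robots", 4),
                         ("dog training tips", 1)])
print(profile["research"])  # 4: counted once per visit to the page
```

The resulting term-count dictionary is what the later weighting steps (TF, TF×IDF, BM25) operate on.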

  6. Personalized Search • The Firefox add-on AlterEgo records the user's browsing history • Example term-count profile: dog 1, cat 10, india 2, mit 4, search 93, amherst 12, vegas 1

  7. Personalized Search • Data extraction turns the browsing history into user profile terms • Example extracted term groups from visited pages: forest, hiking, walking, gorp; dog, cat, monkey, banana, food; baby, infant, child, boy, girl; csail, mit, artificial, research, robot; web, search, retrieval, ir, hunt

  8. Personalized Search • Term weighting assigns a numeric weight to each user profile term (the slide's diagram shows example weights such as 6.0, 1.6, 0.2, 2.7, 1.3)

  9. Term Weighting • TF (term frequency): w_TF(t_i) = TF(t_i) / N, the number of occurrences of t_i divided by the total number of terms in the profile, e.g. 2 / 100 = 0.02 • TF-IDF: w_TFIDF(t_i) = w_TF(t_i) · log(D / DF(t_i)), scaling the TF weight by the term's inverse document frequency
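These two weighting schemes can be illustrated directly. This is a hedged sketch: the function names and the exact normalization are my own simplification, not necessarily the paper's precise formulation:

```python
import math

def tf_weight(term_count, total_count):
    """Relative term frequency in the profile: w_TF = n_t / N."""
    return term_count / total_count

def tfidf_weight(term_count, total_count, n_docs, doc_freq):
    """TF weight scaled by inverse document frequency, log(D / DF(t))."""
    return tf_weight(term_count, total_count) * math.log(n_docs / doc_freq)

# A term seen 2 times in a 100-term profile, as on the slide:
print(tf_weight(2, 100))  # 0.02
```

Under TF-IDF, a term that is frequent in the user's profile but rare across documents receives a larger weight than an equally frequent but common term.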

  10. Term Weighting • Personalized BM25: w_pBM25(t_i) = log [ (r_ti + 0.5)(N − n_ti + 0.5) / ((n_ti + 0.5)(R − r_ti + 0.5)) ], where R is the number of documents in the user's browsing history, r_ti of which contain t_i, and N is the number of documents in the world corpus, n_ti of which contain t_i (the slide's diagram contrasts a term's distribution in the user's history with its distribution in the world)
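The personalized BM25 weight follows directly from the four counts in the formula. A small sketch (the function name `pbm25_weight` and the example counts are mine; the +0.5 smoothing terms follow the formula above):

```python
import math

def pbm25_weight(r_t, R, n_t, N):
    """Personalized BM25 relevance weight of a term t.

    R = documents in the user's browsing history, r_t of which contain t.
    N = documents in the world corpus, n_t of which contain t.
    """
    return math.log((r_t + 0.5) * (N - n_t + 0.5) /
                    ((n_t + 0.5) * (R - r_t + 0.5)))

# A term common in the user's history but rare in the world corpus
# receives a higher weight than one that is rare in both:
print(pbm25_weight(r_t=50, R=100, n_t=10, N=10_000) >
      pbm25_weight(r_t=5, R=100, n_t=10, N=10_000))  # True
```

Intuitively, the weight is large when a term is disproportionately frequent in the user's browsing history relative to the world, which is exactly the signal a long-term profile should capture.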

  11. Re-ranking • Use the user profile to re-rank the top results returned by a search engine • Candidate documents vs. snippets: snippets are more effective (Teevan et al. [28]) and allow a straightforward personalization implementation • Scoring methods: – Matching: for each term occurring in both the snippet and the user profile, add its weight to the snippet's score – Unique matching: counts each unique matching term once – Language model: build a language model for the user profile, using the term weights as frequency counts – PClink (Dou et al. [7])
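The "matching" and "unique matching" scoring methods are simple to state in code. A minimal sketch, assuming the snippet has already been tokenized and the profile is a term-to-weight mapping (both assumptions are mine, not the paper's interface):

```python
def matching_score(snippet_terms, profile):
    """'Matching': add a term's profile weight once per occurrence in the snippet."""
    return sum(profile.get(t, 0.0) for t in snippet_terms)

def unique_matching_score(snippet_terms, profile):
    """'Unique matching': count each distinct matching term only once."""
    return sum(profile.get(t, 0.0) for t in set(snippet_terms))

profile = {"search": 6.0, "retrieval": 1.6}
snippet = ["web", "search", "and", "search", "retrieval"]
print(matching_score(snippet, profile))         # 6.0 + 6.0 + 1.6 = 13.6
print(unique_matching_score(snippet, profile))  # 6.0 + 1.6 = 7.6
```

Re-ranking then simply sorts the top results by their snippet scores, optionally combined with the original engine rank.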

  12. Evaluation Metrics • Relevance judgements: NDCG@10 = (1/Z) · Σ_{i=1}^{10} (2^{rel_i} − 1) / log₂(1 + i) • Side-by-side: show two alternative rankings side by side and ask users to vote for the better one • Clickthrough-based: examine the query and click logs from a large search engine • Interleaved: a new metric for personalized search; combine the results of two search rankings (alternating between results, omitting duplicates)
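The NDCG@10 formula above can be implemented as follows. Here the normalizer Z is computed as the DCG of the ideal (relevance-sorted) ordering, which is the standard choice but an assumption on my part:

```python
import math

def dcg_at_k(rels, k=10):
    """Discounted cumulative gain over the top k relevance labels."""
    return sum((2 ** rel - 1) / math.log2(1 + i)
               for i, rel in enumerate(rels[:k], start=1))

def ndcg_at_k(rels, k=10):
    """NDCG@k: DCG of the ranking divided by the DCG of the ideal ordering (Z)."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

print(ndcg_at_k([2, 2, 1, 0]))        # already ideally ordered -> 1.0
print(ndcg_at_k([0, 1, 2, 2]) < 1.0)  # True: worse ordering scores lower
```

Relevance labels here follow a graded scale (e.g. 0 = Non-Relevant, 1 = Relevant, 2 = Very Relevant, matching the three judgement levels used in the offline study).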

  13. Offline Evaluation • 6 participants, each with 2 months of browsing history • Each participant judged the relevance of the top 50 pages returned by Google for 12 queries • 25 general queries (16 from the TREC 2009 Web search track): each participant judged 6 of them • From each participant's 40 most recent search queries: 5 were judged • Each participant took about 2.5 hours to complete the judging

  14. Offline Evaluation • Personalization strategies (Rel: relative weighting) • MaxNDCG: the strategy yielding the highest average NDCG • MaxQuer: the strategy improving the most queries • MaxNoRank: the highest-NDCG strategy that does not take the original Google ranking into account • MaxBestPar: obtained by greedily selecting each parameter value in sequence

  15. Offline Evaluation • Offline evaluation performance • MaxNDCG and MaxQuer are both significantly better than the default Google ranking • Interestingly, MaxNoRank is also significantly better than Google and than Teevan et al.'s method (possibly due to overfitting on the small offline data set) • PClink improves the fewest queries, but beats Teevan et al. on average NDCG

  16. Offline Evaluation • Distribution of relevance by rank for the Google and MaxNDCG rankings • 3600 relevance judgements collected: 9% Very Relevant, 32% Relevant, 58% Non-Relevant • Google already places many Very Relevant results in the top 5 • MaxNDCG adds more Very Relevant results to the top 5, and also succeeds in adding Very Relevant results at ranks 5 to 10

  17. Online Evaluation • Large-scale interleaved evaluation, with users performing their day-to-day real searches • The first 50 results were requested from Google; a personalization strategy was picked at random for each query • The Team-Draft interleaving algorithm [18] produced the combined ranking shown to the user • 41 users, 7997 queries, 6033 query impressions; 6534 queries and 5335 query impressions received a click
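Team-Draft interleaving [18] can be sketched as below. This is an illustrative reimplementation, not the authors' code: in each round a coin flip decides which ranking picks first, each ranking then contributes its highest-ranked result not already shown, and clicks are later credited to the "team" that contributed the clicked result:

```python
import random

def team_draft_interleave(ranking_a, ranking_b, rng=None):
    """Team-Draft interleaving of two rankings, omitting duplicates.

    Returns the combined ranking and, per team, the results it contributed
    (used afterwards to credit clicks to ranking A or ranking B).
    """
    rng = rng or random.Random()
    combined, teams = [], {"A": [], "B": []}
    ia = ib = 0
    while ia < len(ranking_a) or ib < len(ranking_b):
        # Coin flip: which team drafts first this round.
        order = ["A", "B"] if rng.random() < 0.5 else ["B", "A"]
        for team in order:
            ranking, idx = (ranking_a, ia) if team == "A" else (ranking_b, ib)
            # Skip results already placed by the other team.
            while idx < len(ranking) and ranking[idx] in combined:
                idx += 1
            if idx < len(ranking):
                combined.append(ranking[idx])
                teams[team].append(ranking[idx])
            if team == "A":
                ia = idx + 1
            else:
                ib = idx + 1
    return combined, teams

combined, teams = team_draft_interleave(["d1", "d2", "d3"],
                                        ["d2", "d4", "d1"],
                                        random.Random(0))
print(combined)  # all four documents, with the shared ones shown only once
```

A ranking "wins" a query impression when its team's results attract more clicks, which is the statistic reported in the interleaving results.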

  18. Online Evaluation • Results of the online interleaving test • Queries impacted by personalization (results shown in tables on the slide)

  19. Online Evaluation • Degree of personalization per rank; rank differences for deteriorated (light) and improved (dark) queries under MaxNDCG • For the large majority of deteriorated queries, the clicked result loses only 1 rank • The majority of clicked results on improved queries gain 1 rank • On average, the gains from personalization are more than double the losses • MaxNDCG is the most effective personalization method

  20. Conclusions • First large-scale online evaluation of personalized Web search • The proposed personalization techniques significantly outperform both default Google and the best previously published approaches • Key to modelling users: exploiting the characteristics and structure of Web pages • A long-term, rich user profile is beneficial

  21. Future Exploration • Parameter extension • Learning parameter weights • Using other fields (e.g. headings in HTML) and learning their weights • Incorporating temporal information • How much browsing history is needed? • Should the weights of older terms decay? • How can page-visit duration be used? • Making use of more personal data • Using the extracted profiles for other purposes

  22. Thank you!
