Socially Filtered Web Search: An approach using social bookmarking tags to personalize web search

Kay-Uwe Schmidt*, Tobias Sarnow*, Ljiljana Stojanovic** *SAP Research, Vincenz-Prießnitz-Straße 1, 76131 Karlsruhe, Germany **Forschungszentrum Informatik, Haid-und-Neu-Straße 10-14, 76131 Karlsruhe, Germany Symposium on Applied Computing (2009) 2009. 08. 13.

Presentation Transcript


  1. Kay-Uwe Schmidt*, Tobias Sarnow*, Ljiljana Stojanovic** *SAP Research, Vincenz-Prießnitz-Straße 1, 76131 Karlsruhe, Germany **Forschungszentrum Informatik, Haid-und-Neu-Straße 10-14, 76131 Karlsruhe, Germany Symposium on Applied Computing (2009) 2009. 08. 13. Summarized & presented by Babar Tareen, IDS Lab., Seoul National University Socially Filtered Web Search: An approach using social bookmarking tags to personalize web search

  2. Introduction • Search engines do not consider the user's current work context • Results are static: every user gets the same results • Server-side personalization has limited use • Client-side search engines rely on additional terms extracted from documents and therefore do not scale • Personalizing search results with social bookmarking addresses these issues

  3. Related Work • Google History • goZone.com • Mahalo.com • UCAIR

  4. Motivation • A developer is looking for guidelines for testing DB code • Visits • www.ibm.com/db2 • www.hsqldb.org • Googles • “Test” • Original Results • Web based certification • Personality test • Bandwidth test • Personalized Results • DB2 training • DB2 programming test

  5. Personalizing Search Results • Tracking browsing behavior • Create user model • URLs • Tags fetched from Delicious • Issue original query • Enhance search query by adding tags • Issue new query • Display both results • Tags given by a community of users provide a good summary of web page content
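
The "enhance search query by adding tags" step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `enhance_query`, the `max_tags` parameter, and the rule of skipping tags already present in the query are all assumptions made for the example.

```python
def enhance_query(query, tags, max_tags=1):
    """Append up to max_tags tags to the query, skipping terms it already contains.

    Hypothetical helper: the slides only say that tags are added to the query.
    """
    seen = {term.lower() for term in query.split()}
    extra = []
    for tag in tags:
        if len(extra) >= max_tags:
            break
        key = tag.lower()
        if key not in seen:  # avoid repeating a query term or an earlier tag
            seen.add(key)
            extra.append(tag)
    return " ".join([query, *extra])


# Original query from the Motivation slide, with illustrative tags.
original = "Test"
personalized = enhance_query(original, ["db2", "test", "database"], max_tags=2)
```

Both `original` and `personalized` would then be issued to the search engine, so that the two result lists can be displayed side by side.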

  6. Architecture [1] • Search Module • Carries out original query • Inserts space (<DIV>) for personalized results • Metric Module • Includes a metric that delivers a tag for personalized search • Search Enhancer Module • Combines search string with metric module tags • Metadata Module • Extracts metadata for a visited website from Delicious

  7. Architecture [2] • Built as add-on on top of • Firefox • Internet Explorer

  8. Metric [1] • Two datasets • Collection of visited websites • Tags for each website • Query last 20 distinct websites from user model • Format (url, count) • Sorted by weight ‘γ’
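
A small sketch of how the last 20 distinct websites could be read from a browsing history in the (url, count) format. The transcript does not define the weight γ, so this example sorts by visit count as a stand-in; the function name `recent_sites` and the choice to count visits over the whole history are assumptions.

```python
from collections import Counter


def recent_sites(history, limit=20):
    """Return (url, count) pairs for the last `limit` distinct URLs in `history`.

    `history` is ordered oldest first. Sorting by count is an assumed
    stand-in for the weight the slide calls gamma.
    """
    counts = Counter(history)  # visits per URL over the whole history
    recent, seen = [], set()
    for url in reversed(history):  # walk from the most recent visit backwards
        if len(recent) >= limit:
            break
        if url not in seen:
            seen.add(url)
            recent.append(url)
    pairs = [(url, counts[url]) for url in recent]
    # Stable sort: among equal counts, the more recently visited URL stays first.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


history = ["www.ibm.com/db2", "www.hsqldb.org", "www.ibm.com/db2"]
user_model = recent_sites(history)
```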

  9. Metric [2] • Tags assigned to website • Format (tag, number of users) • t → tags assigned to a website • T → tags for all websites
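
The transcript does not reproduce the metric itself, so the following is only a hypothetical scoring rule that combines the two datasets: each visited website contributes its visit count, split across its tags in proportion to the number of users who assigned each tag. The function `best_tag`, the scoring formula, and the tag counts in the example are all invented for illustration and should not be read as the authors' metric.

```python
from collections import defaultdict


def best_tag(visits, tags_by_url):
    """Pick the highest-scoring tag, or None if no tags are known.

    visits:      list of (url, count) pairs from the user model
    tags_by_url: dict mapping url -> list of (tag, number_of_users) pairs
    Assumed score: sum over URLs of count * (users for tag / users for all tags).
    """
    scores = defaultdict(float)
    for url, count in visits:
        tags = tags_by_url.get(url, [])
        total_users = sum(users for _, users in tags)
        if total_users == 0:
            continue  # no tag data for this URL
        for tag, users in tags:
            scores[tag] += count * users / total_users
    if not scores:
        return None
    return max(scores, key=scores.get)


# Illustrative data only; the user counts are made up.
visits = [("www.ibm.com/db2", 2), ("www.hsqldb.org", 1)]
tags_by_url = {
    "www.ibm.com/db2": [("db2", 30), ("database", 10)],
    "www.hsqldb.org": [("database", 5), ("java", 5)],
}
chosen = best_tag(visits, tags_by_url)
```

With this made-up data, "db2" scores 1.5, "database" 1.0 and "java" 0.5, so "db2" is the tag handed to the search enhancer.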

  10. Algorithm

  11. Result

  12. Evaluation • How effective can this be?

  13. Can Social Bookmarking Improve Web Search? • Paul Heymann, Georgia Koutrika, Hector Garcia-Molina • Dept. of Computer Science, Stanford University, USA • Web Search and Data Mining 2008

  14. Positive Factors [1] • URLs • Pages posted on Delicious are often recently modified • Delicious users post interesting pages that are actively updated or have been recently created • Approximately 25% of URLs posted by users are new, unindexed pages • Delicious can serve as a small data source for new web pages and to help crawl ordering • Roughly 9% of results for search queries are URLs present in Delicious • Delicious URLs are disproportionately common in search results compared to their coverage • While some users are more prolific than others, the top 10% of users only account for 56% of the posts • Delicious is not highly reliant on a relatively small group of users

  15. Positive Factors [2] • URLs • 30-40% of URLs and approximately one in eight domains posted were not previously in Delicious • Delicious has relatively little redundancy in page information • Tags • Popular query terms and tags overlap significantly • Delicious may be able to help with queries where tags overlap with query terms • In this study, most tags were deemed relevant and objective by users • Tags are on the whole accurate

  16. Negative Factors • URLs • Approximately 120,000 URLs are posted to Delicious each day • The number of posts per day is relatively small; for instance, it represents 1/10 of the number of blog posts per day • There are roughly 115 million public posts, corresponding to about 30-50 million unique URLs • The number of total posts is relatively small; for instance, this is a small portion of the web as a whole (perhaps 1/1000) • Tags • Tags are present in the page text of 50% of the pages they annotate • A substantial proportion of tags are obvious in context, and many tagged pages would be discovered by a search engine • Domains are often highly correlated with particular tags and vice versa • It may be more efficient to train librarians to label domains than to ask users to tag pages
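
A quick back-of-the-envelope check of what the URL figures on this slide imply. The inputs are the slide's numbers; the derived values are rough, and reading the "1/1000" estimate as applying to unique URLs is an assumption of this sketch.

```python
# Figures quoted on the slide.
posts_per_day = 120_000
total_posts = 115_000_000
unique_urls = (30_000_000, 50_000_000)  # low and high estimates

# Delicious volume is 1/10 of blog volume, so blogs produce about 10x as many posts.
blog_posts_per_day = posts_per_day * 10

# Average posts per unique URL, from the high to the low URL estimate.
posts_per_url = tuple(total_posts / urls for urls in reversed(unique_urls))

# Assumption: the 1/1000 estimate refers to unique URLs versus all web pages.
implied_web_pages = tuple(urls * 1000 for urls in unique_urls)
```

Under these readings, each bookmarked URL has been posted by only about two to four users on average, which limits how much tag data is available per page.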

  17. Discussion • Query expansion model based on social tagging • What is the probability of finding tags for a random URL in delicious.com? • Generalization vs. specialization
