Recommender Systems and Collaborative Filtering
Drawing heavily on online slide decks in this area, especially those of William W. Cohen (CMU)
You visit an online bookshop ... The shop has 100,000 books. On the webpage, they will display 5 book covers, chosen especially for you. Which ones will they display?
Why?
• The same applies to books, webpages, music, films, clothes, food, everything ... This matters a great deal for e-commerce: there is a big financial uplift if stores get recommendations ‘right’.
• What if the website is not selling you anything (e.g. research papers, search, an interest-group forum)? Why does such a site need to make good recommendations?
Basic approaches used for recommendation
• User-based: recommend things that were purchased or viewed by users who are similar to you
• Item-based: recommend things that are similar to the items that you have viewed/purchased before
Amazon: recommendations made with minimal info about me, via a cookie on this netbook
User Profiles
For user-based recommendation, sites need to have some kind of user profile. Similarity with other users is measured as a distance computed from the profiles. What do you think could be in a user profile?
Potential contents of user profiles
• Demographic data: age, gender, salary, profession, country of residence, country of origin, religion ...
• Site behaviour: purchase history at the site; viewing history, perhaps including time spent on certain pages/items; clickstream sequence
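K-Nearest Neighbour based Recommendation
[Scatter plot: users plotted on Age and Salary axes, with the point "You" marked]
(Think in terms of many dimensions, not just these two)
K-Nearest Neighbour based Recommendation
[Same scatter plot, with the points closest to "You" highlighted]
Your neighbours: recommend things that they have viewed/purchased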
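The nearest-neighbour idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the profiles, names and purchase histories are invented, the features are min-max scaled so that salary does not swamp age, and plain Euclidean distance stands in for whatever distance measure a real site would use.

```python
import math

# Hypothetical profiles: (age in years, salary in thousands). Invented data.
profiles = {
    "you":   (30, 40),
    "alice": (31, 42),
    "bob":   (28, 35),
    "carol": (60, 90),
    "dave":  (22, 20),
}

# Hypothetical purchase histories, also invented.
purchases = {
    "you":   {"book_a"},
    "alice": {"book_a", "book_b"},
    "bob":   {"book_c"},
    "carol": {"book_d"},
    "dave":  {"book_e"},
}


def scale(profiles):
    """Min-max scale every dimension to [0, 1] so no feature dominates."""
    dims = list(zip(*profiles.values()))
    lows = [min(d) for d in dims]
    highs = [max(d) for d in dims]
    return {
        user: tuple(
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(values, lows, highs)
        )
        for user, values in profiles.items()
    }


def k_nearest(target, profiles, k):
    """Return the k users closest to `target` by Euclidean distance."""
    scaled = scale(profiles)
    others = [u for u in scaled if u != target]
    others.sort(key=lambda u: math.dist(scaled[target], scaled[u]))
    return others[:k]


def recommend(target, profiles, purchases, k=2):
    """Recommend items bought by the k nearest neighbours but not by target."""
    seen = purchases[target]
    recs = []
    for neighbour in k_nearest(target, profiles, k):
        for item in sorted(purchases[neighbour] - seen):
            if item not in recs:
                recs.append(item)
    return recs


print(recommend("you", profiles, purchases))
```

The same code works unchanged with more than two profile dimensions, which is the point of the "many dimensions" remark: only the tuples get longer.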
Collaborative Filtering: The main idea
People who purchased A also purchased B.
This is different from nearest-neighbour: it can lead to recommendations based on the behaviour of users who are very dissimilar to you.
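One simple way to realise "people who purchased A also purchased B" is to count how often pairs of items appear in the same user's purchase history. The sketch below assumes invented baskets and uses raw co-occurrence counts; real systems typically normalise these counts, which is not shown here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories, one set of items per user. Invented data.
baskets = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "C"},
    {"B", "D"},
    {"A", "B", "D"},
]


def co_purchase_counts(baskets):
    """Count, for each ordered pair (x, y), how many users bought both."""
    counts = Counter()
    for basket in baskets:
        for x, y in combinations(sorted(basket), 2):
            counts[(x, y)] += 1
            counts[(y, x)] += 1
    return counts


def also_purchased(item, baskets, top_n=2):
    """Return the top_n items most often bought together with `item`."""
    counts = co_purchase_counts(baskets)
    related = [(other, n) for (first, other), n in counts.items() if first == item]
    # Highest count first; ties broken alphabetically for a stable order.
    related.sort(key=lambda pair: (-pair[1], pair[0]))
    return related[:top_n]


print(also_purchased("A", baskets))
```

Note that no user profile is consulted at all: the recommendation for someone who bought A comes from everyone who bought A, however dissimilar they are otherwise.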
Other forms/aspects of collaborative filtering
Why “collaborative”? Basically, other people (in fact, many of them) have gone to the effort of viewing/filtering things and choosing the best few. You get a recommendation of the best few, without having to spend the effort.
Widespread examples of CF: Twitter, PageRank, StumbleUpon, Digg, Facebook (Likes), etc ...
Another look at Google’s PageRank (this bit adapted from slides of William Cohen, CMU)
• Inlinks are “good” (recommendations)
• Inlinks from a “good” site are better than inlinks from a “bad” site
• but inlinks from sites with many outlinks are not as “good”...
• “Good” and “bad” are relative.
[Diagram: web sites "xxx", "yyyy", "pdq" and "a b c d e f g" linking to one another]
Google’s PageRank (Brin & Page, http://www-db.stanford.edu/~backrub/google.html)
• Imagine a “pagehopper” that always either
  • follows a random link, or
  • jumps to a random page
• PageRank ranks pages by the amount of time the pagehopper spends on a page
• or, if there were many pagehoppers, PageRank is the expected “crowd size”
[Diagram: web sites "xxx", "yyyy", "pdq" and "a b c d e f g" linking to one another]
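The pagehopper picture can be turned into a small iterative calculation. The sketch below is a minimal version under stated assumptions: the link structure between the sites is invented (the arrows in the slide's diagram are not recoverable), and the probability of following a link rather than jumping, here called `damping` and set to 0.85, is a commonly used value rather than something stated on the slide.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate the share of time a random pagehopper spends on each page.

    links: dict mapping each page to the list of pages it links to.
           Every linked-to page must also appear as a key.
    damping: assumed probability that the hopper follows a link;
             with probability 1 - damping it jumps to a random page.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with a uniform crowd
    for _ in range(iterations):
        # Share arriving via random jumps.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its vote evenly among its outlinks, so
                # inlinks from pages with many outlinks are worth less.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks: the hopper jumps anywhere.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph using the site names from the diagram.
links = {
    "xxx": ["yyyy", "pdq"],
    "yyyy": ["xxx"],
    "pdq": ["xxx"],
    "abc": ["xxx", "yyyy", "pdq"],
}

for page, score in sorted(pagerank(links).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this invented graph "xxx" collects the most inlinks and ends up with the largest share, while "abc", which nobody links to, is reached only by random jumps.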
Collaborative Filtering and User Ratings
Many systems ask users to rate items, e.g. on a scale of 1 to 10. These ratings then enable the system to give more precise/accurate recommendations, and use a variety of sophisticated learning/prediction algorithms.
E.g. here are user ratings for some items; “?” means unrated.

        A  B  C  D  E  F  G  H
You:    7  2  1  8  9  9  ?  ?
User1   1  8  8  2  ?  2  8  7
User2   6  3  3  7  6  5  3  1
User3   7  2  1  7  7  ?  3  1

How might a system predict your rating for items G and H?
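One possible answer, sketched below, is to weight each other user's opinion by how well their past ratings correlate with yours. This is an assumption about method, not something the slides prescribe: similarity is the Pearson correlation over co-rated items, and the prediction is your mean rating plus a similarity-weighted average of how far each user's rating for the item sits from their own mean.

```python
import math

# Ratings from the table above; None marks an unrated item ("?").
ITEMS = "ABCDEFGH"
ratings = {
    "You":   [7, 2, 1, 8, 9, 9, None, None],
    "User1": [1, 8, 8, 2, None, 2, 8, 7],
    "User2": [6, 3, 3, 7, 6, 5, 3, 1],
    "User3": [7, 2, 1, 7, 7, None, 3, 1],
}


def mean(values):
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation over the items that both users have rated."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def predict(target, item, ratings):
    """Predict target's rating for item from similarity-weighted deviations."""
    idx = ITEMS.index(item)
    target_mean = mean([r for r in ratings[target] if r is not None])
    num = den = 0.0
    for user, row in ratings.items():
        if user == target or row[idx] is None:
            continue
        sim = pearson(ratings[target], row)
        user_mean = mean([r for r in row if r is not None])
        # A negatively correlated user pushes the prediction the other way.
        num += sim * (row[idx] - user_mean)
        den += abs(sim)
    if den == 0:
        return target_mean
    return target_mean + num / den


for item in "GH":
    print(item, round(predict("You", item, ratings), 2))
```

On this table User2 and User3 agree with you and rate G and H low, while User1 disagrees with you and rates them high, so both predictions come out below your average rating of 6, with H lower than G.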
BellCore’s MovieRecommender (Bell Communications Research)
• Participants sent email to videos@bellcore.com
• System replied with a list of 500 movies to rate on a 1-10 scale (250 random, 250 popular)
• Only a subset needs to be rated
• New participant P sends in rated movies via email
• System compares ratings for P to ratings of (a random sample of) previous users
• Most similar users are used to predict scores for unrated movies
• System returns recommendations in an email message.
Start your own business? Bookmark-based recommendation