Recommender Systems

Recommender Systems Based on Rajaraman and Ullman: Mining of Massive Datasets & Francesco Ricci et al.: Recommender Systems Handbook.

Recommender System All of these thrive on User Generated Content (UGC)!

Recommender System • Central theme: • Predict ratings for unrated items • Recommend top-k items

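RS – Major Approaches • Basic question: given a ratings matrix $R$ (highly incomplete/sparse), and given a user $u$ and an item $i$ that $u$ has not rated, predict the rating $\hat{r}_{ui}$.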

RS – Approaches • Content-based: how similar is item $i$ to the items user $u$ has rated/liked in the past? • Uses metadata for measuring similarity. + Works even when no ratings are available on the affected items. • Requires metadata! • Collaborative Filtering (CF): identify items (users) with their rating vectors; no need for metadata, but cold start is a problem.

RS – Approaches • CF can be memory-based (as sketched above): an item's “characteristics” are captured by the ratings it has received (its rating vector). • Or it can be model-based: model a user's/item's behavior via latent factors (to be learned from data). • Dimensionality reduction • The original ratings matrix is usually (very) low rank, at least approximately. → Matrix completion: • using singular value decomposition (SVD); • using matrix factorization (MF) [and variants]. • MovieLens – an example of an RS using CF.

Collaborative Filtering

Key concepts/questions • How is user feedback expressed: explicit ratings or implicit signals? • How to measure similarity? • How many nearest neighbors to pick (if memory- or neighborhood-based)? • How to predict unknown ratings? • Distinguished (also called active) user $u$ and (target) item $i$.

A Naïve Algorithm (memory-based) • Find the top-$k$ most similar neighbors of the distinguished user $u$ (using the chosen similarity or proximity measure). • For every item $i$ not rated by $u$ but rated by sufficiently many of these neighbors, compute $\hat{r}_{ui}$ by aggregating the ratings given to $i$ by the chosen neighbors above. • Sort the items by predicted rating and recommend the top-$k$ items to $u$.
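The three steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the cosine similarity, the plain-mean aggregation, the parameter names (`k`, `min_raters`, `top_n`) and the toy ratings are all assumptions chosen for the example.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity of two sparse rating vectors (dicts: item -> rating)."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0

def recommend(ratings, u, k=2, min_raters=1, top_n=3):
    """Naive user-based CF: neighbors -> aggregate -> sort."""
    # Step 1: the k users most similar to the distinguished user u.
    others = [v for v in ratings if v != u]
    neighbors = sorted(others, key=lambda v: cosine(ratings[u], ratings[v]),
                       reverse=True)[:k]
    # Step 2: for each item u has not rated, aggregate the neighbors' ratings
    # (plain mean here; any aggregation rule could be plugged in).
    candidates = {i for v in neighbors for i in ratings[v]} - set(ratings[u])
    predicted = {}
    for i in candidates:
        votes = [ratings[v][i] for v in neighbors if i in ratings[v]]
        if len(votes) >= min_raters:
            predicted[i] = sum(votes) / len(votes)
    # Step 3: sort by predicted rating (ties broken by item name) and cut off.
    return sorted(predicted.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Toy ratings, chosen only for illustration.
toy = {
    "A": {"HP1": 4, "TW": 5, "SW1": 1},
    "B": {"HP1": 5, "HP2": 5, "HP3": 4},
    "C": {"TW": 2, "SW1": 4, "SW2": 5},
    "D": {"HP2": 3, "SW3": 3},
}
top = recommend(toy, "A")
```

With `min_raters=1` a single neighbor is enough to produce a prediction; raising it implements the "rated by sufficiently many" condition more strictly.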

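An Example (utility matrix and item labels as in Rajaraman and Ullman; rows = users, columns = items, "-" = unrated)

     HP1   HP2   HP3   TW    SW1   SW2   SW3
A    4     -     -     5     1     -     -
B    5     5     4     -     -     -     -
C    -     -     -     2     4     5     -
D    -     3     -     -     -     -     3

• Jaccard(A,B) = 1/5 < 2/4 = Jaccard(A,C)! • Cosine(A,B) ≈ 0.38 > 0.32 ≈ Cosine(A,C) – OK, but ignores internal “rating scales” → easy/hard graders. • See the Rajaraman et al. book for “rounded” Jaccard/Cosine. • A more principled approach: subtract from each rating the corresponding user's mean rating, then apply Jaccard/cosine.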
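The Jaccard and cosine comparisons on this example can be checked with a short sketch. It assumes the eleven ratings are laid out as in the utility matrix of Rajaraman and Ullman's book (users A to D; the item labels `HP1` … `SW3` are taken from the book, not from the slide text), and that unrated entries count as 0 for the cosine.

```python
from math import sqrt

# Assumed layout of the example ratings (user -> {item: rating}).
ratings = {
    "A": {"HP1": 4, "TW": 5, "SW1": 1},
    "B": {"HP1": 5, "HP2": 5, "HP3": 4},
    "C": {"TW": 2, "SW1": 4, "SW2": 5},
    "D": {"HP2": 3, "SW3": 3},
}

def jaccard(a, b):
    """Jaccard similarity of the sets of rated items (rating values ignored)."""
    return len(set(a) & set(b)) / len(set(a) | set(b))

def cosine(a, b):
    """Cosine similarity, treating unrated items as 0."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0

jac_ab = jaccard(ratings["A"], ratings["B"])  # 1 common item out of 5
jac_ac = jaccard(ratings["A"], ratings["C"])  # 2 common items out of 4
cos_ab = cosine(ratings["A"], ratings["B"])
cos_ac = cosine(ratings["A"], ratings["C"])
```

Jaccard ranks C closer to A than B even though A and C disagree on the items they share, because it looks only at which items were rated.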

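An Example (same matrix after subtracting each user's mean rating; item labels as in Rajaraman and Ullman, "-" = unrated)

     HP1   HP2   HP3   TW    SW1   SW2   SW3
A    2/3   -     -     5/3   -7/3  -     -
B    1/3   1/3   -2/3  -     -     -     -
C    -     -     -     -5/3  1/3   4/3   -
D    -     0     -     -     -     -     0

• See what just happened to the ratings! • Users' behavior and the items are now better separated. • Cosine can now be + or −: check (A,B) and (A,C).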
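The check suggested for (A,B) and (A,C) can be done with the sketch below. As before, the item labels and the layout of the ratings are assumed to follow the utility matrix in Rajaraman and Ullman's book; the helper names `center` and `cosine` are illustrative.

```python
from math import sqrt

ratings = {
    "A": {"HP1": 4, "TW": 5, "SW1": 1},
    "B": {"HP1": 5, "HP2": 5, "HP3": 4},
    "C": {"TW": 2, "SW1": 4, "SW2": 5},
    "D": {"HP2": 3, "SW3": 3},
}

def center(r):
    """Subtract the user's own mean from each of their ratings."""
    mean = sum(r.values()) / len(r)
    return {i: x - mean for i, x in r.items()}

def cosine(a, b):
    """Cosine similarity; unrated items contribute 0 after centering."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0  # an all-zero vector (user D) gets 0

centered = {u: center(r) for u, r in ratings.items()}
cos_ab = cosine(centered["A"], centered["B"])  # small and positive
cos_ac = cosine(centered["A"], centered["C"])  # clearly negative
```

After centering, A and B come out mildly similar while A and C come out as opposites, which matches the intuition that A and C disagree on the items they both rated.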

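Prediction using Memory/Neighborhood-based approaches • A popular approach uses the Pearson correlation coefficient: $\hat{r}_{ui} = \bar{r}_u + \kappa \sum_{v \in N(u)} \mathrm{sim}(u,v)\,(r_{vi} - \bar{r}_v)$ • where $\mathrm{sim}(u,v) = \frac{\sum_{j \in I_{uv}} (r_{uj} - \bar{r}_u)(r_{vj} - \bar{r}_v)}{\sqrt{\sum_{j \in I_{uv}} (r_{uj} - \bar{r}_u)^2}\,\sqrt{\sum_{j \in I_{uv}} (r_{vj} - \bar{r}_v)^2}}$, with $I_{uv}$ the items rated by both $u$ and $v$, $N(u)$ the chosen neighbors of $u$ who rated $i$, and $\bar{r}_u$ the mean rating of $u$; • i.e., cosine of the “vectors of deviations from the mean”. • $\kappa$ – normalization factor $= 1 / \sum_{v \in N(u)} |\mathrm{sim}(u,v)|$. • See the RecSys handbook and [Adomavicius and Tuzhilin, TKDE 2005] for alternatives.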
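A minimal sketch of Pearson-based prediction: the user's mean plus a similarity-weighted average of the neighbors' deviations from their own means, normalized by the sum of absolute similarities. It assumes the correlation is computed over co-rated items using each user's overall mean, and that the neighborhood is simply every other user who rated the target item; the data and function names are hypothetical.

```python
from math import sqrt

def mean(r):
    return sum(r.values()) / len(r)

def pearson(ru, rv):
    """Pearson correlation over the items rated by both users."""
    common = set(ru) & set(rv)
    mu, mv = mean(ru), mean(rv)
    num = sum((ru[j] - mu) * (rv[j] - mv) for j in common)
    den = (sqrt(sum((ru[j] - mu) ** 2 for j in common))
           * sqrt(sum((rv[j] - mv) ** 2 for j in common)))
    return num / den if den else 0.0

def predict(ratings, u, i):
    """Mean of u plus normalized, similarity-weighted neighbor offsets."""
    sims = {v: pearson(ratings[u], r)
            for v, r in ratings.items() if v != u and i in r}
    norm = sum(abs(s) for s in sims.values())
    if norm == 0:
        return mean(ratings[u])  # no usable neighbor: fall back to u's mean
    offset = sum(s * (ratings[v][i] - mean(ratings[v])) for v, s in sims.items())
    return mean(ratings[u]) + offset / norm

# Hypothetical data: v agrees with u, w rates the opposite way.
toy = {
    "u": {"a": 5, "b": 3, "c": 4},
    "v": {"a": 4, "b": 2, "c": 3, "d": 4, "e": 2},
    "w": {"a": 2, "b": 4, "c": 3, "d": 2, "e": 4},
}
pred = predict(toy, "u", "d")
```

Here `v` correlates positively with `u` and liked `d`, while `w` correlates negatively and disliked `d`; both facts push the prediction above `u`'s mean of 4.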

User-User vs Item-Item • User-User CF: what we just discussed! • Item-Item – dual in principle: find the items most similar to the distinguished item $i$; for every user $u$ who did not rate the distinguished item but rated sufficiently many items from the similarity group, compute $\hat{r}_{ui}$. • In practice, item-item has been found to be better than user-user.

Simpler Alternatives for Rating Estimation • Simple average of the ratings by the most similar neighbors. • Weighted average (weights = similarities). • User's mean plus an offset given by the weighted average of the offsets of the most similar neighbors (Pearson!). • Or you can take the popular vote of the most similar neighbors: e.g., $u$ has 5 most similar neighbors who have rated $i$. • rated 1; rated 3; rated 4; rated 5. • Simple majority: • Suppose 1.0. Then • Tie-breaking is arbitrary.
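The simpler estimators can be compared side by side in a small sketch. The five neighbors, their ratings and their similarity weights below are hypothetical and are not the numbers of the original example; the weighted variants assume positive similarities.

```python
from collections import Counter, defaultdict

# Hypothetical neighbors of u who rated item i: (name, rating, similarity to u).
neighbors = [("v1", 1, 1.0), ("v2", 3, 0.4), ("v3", 3, 0.4),
             ("v4", 4, 0.5), ("v5", 5, 0.5)]

def simple_average(nb):
    return sum(r for _, r, _ in nb) / len(nb)

def weighted_average(nb):
    """Ratings weighted by similarity (assumes the weights sum to > 0)."""
    return sum(s * r for _, r, s in nb) / sum(s for _, _, s in nb)

def majority_vote(nb):
    """Most frequent rating; ties are broken arbitrarily."""
    return Counter(r for _, r, _ in nb).most_common(1)[0][0]

def weighted_vote(nb):
    """Rating value with the largest total similarity behind it."""
    weight = defaultdict(float)
    for _, r, s in nb:
        weight[r] += s
    return max(weight, key=weight.get)

avg = simple_average(neighbors)
wavg = weighted_average(neighbors)
vote = majority_vote(neighbors)
wvote = weighted_vote(neighbors)
```

On this data the simple majority picks 3 (two votes), while the weighted vote picks 1 because the single neighbor who rated 1 carries more weight (1.0) than the two who rated 3 (0.8 in total).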

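Item-based CF • Dual to user-based CF, in principle. • “People who bought X also bought Y”. • Natural connection to association rules (each user = a transaction). • Predict the unknown rating of user $u$ on item $i$ as the aggregate of the ratings by $u$ on items similar to $i$. • E.g., using mean-centering and Pearson correlation for item-item similarity, $\hat{r}_{ui} = \bar{r}_i + \kappa \sum_{j \in N_u(i)} \mathrm{sim}(i,j)\,(r_{uj} - \bar{r}_j)$, where $\bar{r}_i$ = mean rating of $i$ by various users, $\mathrm{sim}(i,j)$ = similarity b/w $i$ and $j$, $N_u(i)$ = the items rated by $u$ that are most similar to $i$, and $\kappa = 1 / \sum_{j \in N_u(i)} |\mathrm{sim}(i,j)|$ – the usual normalization factor.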
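A sketch of the item-based counterpart: the item's mean plus a similarity-weighted average of the user's deviations on similar items. It assumes Pearson similarity over co-raters with per-item means, and takes every item the user has rated as the neighborhood; the data and names (`by_user`, `transpose`, `predict_item_based`) are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def transpose(by_user):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = defaultdict(dict)
    for u, r in by_user.items():
        for i, x in r.items():
            by_item[i][u] = x
    return dict(by_item)

def pearson(ri, rj, mi, mj):
    """Pearson similarity of two items over the users who rated both."""
    common = set(ri) & set(rj)
    num = sum((ri[u] - mi) * (rj[u] - mj) for u in common)
    den = (sqrt(sum((ri[u] - mi) ** 2 for u in common))
           * sqrt(sum((rj[u] - mj) ** 2 for u in common)))
    return num / den if den else 0.0

def predict_item_based(by_user, u, i):
    by_item = transpose(by_user)
    mean = {j: sum(r.values()) / len(r) for j, r in by_item.items()}
    # Similarity of the target item i to every item j that u has rated.
    sims = {j: pearson(by_item[i], by_item[j], mean[i], mean[j])
            for j in by_user[u] if j != i}
    norm = sum(abs(s) for s in sims.values())
    if norm == 0:
        return mean[i]  # no usable similar item: fall back to the item mean
    offset = sum(s * (by_user[u][j] - mean[j]) for j, s in sims.items())
    return mean[i] + offset / norm

# Hypothetical ratings: Y behaves like X, Z behaves the opposite way.
by_user = {
    "u1": {"X": 5, "Y": 4, "Z": 2},
    "u2": {"X": 3, "Y": 2, "Z": 4},
    "u3": {"X": 4, "Y": 3, "Z": 3},
    "u4": {"Y": 4, "Z": 2},
    "u5": {"Y": 2, "Z": 4},
}
pred = predict_item_based(by_user, "u4", "X")
```

The code mirrors the user-based computation with the roles of users and items swapped, which is the sense in which the two methods are dual.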

Item-based CF Computation Illustrated • Similarities: computing the similarity between all pairs of items is prohibitive! • But do we need to? • How efficiently can we compute the similarity of all pairs of items for which the similarity is positive? • [Figure: ratings matrix with rated entries marked X]
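One way to avoid touching all item pairs, sketched under assumptions: iterate over users and accumulate dot products only for pairs of items that some user rated together. Pairs with no common rater never appear, and with raw-rating cosine their similarity would be 0 anyway. The function name and the toy data are illustrative, not from the slides.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def corated_item_sims(by_user):
    """Cosine similarity for every item pair that has at least one co-rater."""
    dot = defaultdict(float)  # (item_i, item_j) -> sum of r_ui * r_uj
    sq = defaultdict(float)   # item -> sum of squared ratings
    for r in by_user.values():
        for i, x in r.items():
            sq[i] += x * x
        # Only pairs inside one user's profile are ever generated.
        for i, j in combinations(sorted(r), 2):
            dot[(i, j)] += r[i] * r[j]
    return {(i, j): d / (sqrt(sq[i]) * sqrt(sq[j])) for (i, j), d in dot.items()}

toy = {
    "A": {"HP1": 4, "TW": 5, "SW1": 1},
    "B": {"HP1": 5, "HP2": 5, "HP3": 4},
    "C": {"TW": 2, "SW1": 4, "SW2": 5},
    "D": {"HP2": 3, "SW3": 3},
}
sims = corated_item_sims(toy)
```

The work is proportional to the sum over users of the squared profile length rather than to the square of the number of items, which helps when user profiles are short. On the toy data only 9 of the 21 possible item pairs are ever visited.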

Item-based CF – Recommendation Generation • [Figure: ratings matrix with rated entries marked X; for each item the user has rated, look up its similar items] • How efficiently can we generate recommendations for a given user?
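A sketch of recommendation generation from a precomputed item-neighbor table: for each item the user has rated, look up its similar items and accumulate a similarity-weighted score for the candidates. The table `neighbors`, its contents and the scoring rule (weighted average over positive similarities) are assumptions made for the example.

```python
from collections import defaultdict

def recommend(user_ratings, neighbors, top_n=2):
    """Score unrated items via the similar-item lists of the rated items."""
    num = defaultdict(float)  # candidate -> sum of sim * rating
    den = defaultdict(float)  # candidate -> sum of sim
    for j, r in user_ratings.items():
        for i, s in neighbors.get(j, []):
            if i in user_ratings or s <= 0:
                continue  # skip items already rated and non-positive similarities
            num[i] += s * r
            den[i] += s
    scores = {i: num[i] / den[i] for i in num}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical precomputed table: item -> [(similar item, similarity), ...].
neighbors = {
    "X": [("P", 0.5), ("Q", 0.5)],
    "Y": [("P", 0.5), ("Z", 0.9)],
}
recs = recommend({"X": 4, "Y": 2}, neighbors)
```

The cost per user is roughly the number of items they rated times the length of each neighbor list, independent of the total number of items.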

Some empirical facts re. user-based vs. item-based CF • User profiles are typically thinner than item profiles; this depends on the application domain. • Certainly holds for movies (Netflix). • ⇒ As users provide more ratings, user-user similarities can change more dynamically than item-item similarities. • Can we precompute item-item similarities and speed up prediction computation? • What about refreshing the similarities against updates? Can we do it incrementally? How often should we do this? • Why not do this for user-user?

User & Item-based CF are both personalized • Non-personalized would estimate an unknown rating as a global average. • Every user gets the same recommendation list, modulo items s/he may have already rated. • Personalized clearly leads to better predictions.
