This presentation explores recommendation algorithms in user modeling and recommender systems, covering non-personalized, content-based, and collaborative approaches. It covers the added value of recommender systems, the difference between predictions and recommendations, and the trade-offs to weigh when selecting an algorithm. It discusses non-personalized algorithms based on external community data, including the simple mean and probabilistic methods, and content-based algorithms such as the vector space model. It also introduces collaborative recommendation algorithms, namely user-based and item-based nearest neighbor recommendation, highlighting their benefits and limitations.
 
                
User Modeling and Recommender Systems: recommendation algorithms Adolfo Ruiz Calleja 04/10/2014
Index • Introduction • Non-personalized recommender algorithms • Content-based recommender algorithms • Collaborative recommendation algorithms
Introduction: Added value of recommender systems • Provision of personalized recommendations • Allows persuading each customer with personalized information • Serendipitous discovery • Makes it possible to deal with the long tail
Introduction: Recommender system schema [Figure: schema with the boxes USER (set of user attributes), ITEM (set of item attributes), Algorithm, and rating]
Introduction: Predictions and recommendations • Outputs of recommender systems • Prediction ≈ how much a user would like an item • Numeric score related to the predicted opinion of the user about a specific item • Recommendations ≈ suggestions of things you may like • Typically a list of items • Internally, the system has to make some predictions
Index • Introduction • Non-personalized recommender algorithms • Simple mean • Probabilistic method • Content-based recommender algorithms • Collaborative recommendation algorithms
Non-personalized recommender algorithms [Figure: schema with the boxes USER (set of user attributes), ITEM (set of item attributes), Algorithm, and rating]
Non-personalized recommender algorithms • Based on external community data • Can use ephemeral information about the user • Examples: Tripadvisor or Booking
Non-personalized recommender algorithms • Very simple algorithms • They forget about the long tail • When there are a lot of raters, predictions tend toward the median score • Self-selection bias • Diversity of raters • Pretty bad accuracy
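The simple mean approach can be sketched in a few lines. This is a minimal illustration, not code from the slides: the function names, the `(user, item, rating)` triple format, and the toy hotel data are all assumptions made for the example.

```python
from collections import defaultdict

def mean_ratings(ratings):
    """Return {item: mean rating} from (user, item, rating) triples."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in ratings:
        totals[item] += rating
        counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}

def recommend_top_n(ratings, n=2):
    """Rank items by mean rating; every user receives the same list."""
    means = mean_ratings(ratings)
    return sorted(means, key=means.get, reverse=True)[:n]

# Hypothetical toy data: (user, item, rating on a 1-5 scale).
ratings = [
    ("ana", "hotel_a", 5), ("ben", "hotel_a", 4),
    ("ana", "hotel_b", 2), ("cai", "hotel_b", 3),
    ("ben", "hotel_c", 5),
]

print(mean_ratings(ratings))
print(recommend_top_n(ratings, n=2))
```

In this toy data `hotel_c` ranks first on the strength of a single rating, which hints at why a plain mean gives rough accuracy when the number and diversity of raters vary between items.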
Index • Introduction • Non-personalized recommender algorithms • Content-based recommender algorithms • Explicit decision model • The vector space model • Collaborative recommendation algorithms
Content-based recommendation [Figure: schema with the boxes USER (set of user attributes), ITEM (set of item attributes), Algorithm, and rating]
Content-based recommendation • User model is built by analyzing user preferences and item attributes • Hard to find widely used examples • Personalized news feeds
Explicit decision model • Very well-known method in many domains • The decision tree can be built automatically • No need to formalize domain knowledge • Can be used with a small number of features • But recommender systems typically need very many • They are almost never used
The vector space model • Which factors to consider in the item description? • A keyword vector can be used • It can be automatically extracted from text • But not only for textual items!! • We can aggregate keywords • But how? • How to normalize the vector space? • Hard if it is not done automatically • Term Frequency-Inverse Document Frequency (TF-IDF) • Do we trust it?
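One way to weight keyword vectors with TF-IDF and compare items is sketched below. The slides do not fix a formula, so this assumes one common variant (term frequency normalized by description length, `idf = log(N / df)`). The function names and the toy news items are hypothetical.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Map each item id to a {term: weight} TF-IDF vector.

    docs: {item_id: list of keywords}.
    Assumed variant: tf = count / len(keywords), idf = log(N / df).
    """
    n_docs = len(docs)
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))  # count each term once per item
    vectors = {}
    for item, terms in docs.items():
        tf = Counter(terms)
        vectors[item] = {
            t: (count / len(terms)) * math.log(n_docs / df[t])
            for t, count in tf.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical news items described by keywords.
items = {
    "n1": ["football", "league", "goal"],
    "n2": ["football", "transfer", "goal"],
    "n3": ["election", "vote", "poll"],
}
vectors = tf_idf_vectors(items)
print(cosine(vectors["n1"], vectors["n2"]))  # shares keywords, so > 0
print(cosine(vectors["n1"], vectors["n3"]))  # no shared keywords
```

Note that a keyword appearing in every item gets weight zero under this variant, which is one reason to ask how far the weighting should be trusted.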
The vector space model • How to build the user profile? • If I like it, it is important for me • Sometimes something I do not like may be relevant, or vice versa • Problem of how to update user profiles • Are new items more important than previous ones? • Short term vs. long term
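The profile-update questions above can be made concrete with a small sketch. It shows one possible scheme, in the spirit of relevance feedback, and is not taken from the slides: the `decay` factor, the choice to subtract disliked items, and the name `update_profile` are all assumptions.

```python
def update_profile(profile, item_vector, liked, decay=0.9):
    """Return a new user profile after one feedback event.

    profile, item_vector: sparse {term: weight} vectors.
    Old weights are multiplied by `decay`, so recent feedback counts more
    (short term vs. long term); the item vector is then added if the item
    was liked and subtracted otherwise.
    """
    sign = 1.0 if liked else -1.0
    updated = {t: w * decay for t, w in profile.items()}
    for t, w in item_vector.items():
        updated[t] = updated.get(t, 0.0) + sign * w
    return updated

# Hypothetical feedback sequence.
profile = update_profile({}, {"football": 1.0}, liked=True)
profile = update_profile(profile, {"politics": 1.0}, liked=False, decay=0.5)
print(profile)
```

A `decay` close to 1 favors long-term tastes, while a small value makes the profile follow recent interests. The right balance depends on the domain.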
The vector space model • We do not need a lot of users • Easy to compute and simple to implement • Flexible • Easy to integrate with other approaches • Quickly adapts to changes • Hard to find out the factors and their weights • Cannot deal with subjective aspects of the items • Competitor items are frequently retrieved • Too simplified a model • Results are not as accurate as with other approaches
Index • Introduction • Non-personalized recommender algorithms • Content-based recommender algorithms • Collaborative recommendation algorithms • User-based nearest neighbor recommendation • Item-based nearest neighbor recommendation
Collaborative recommendation algorithms [Figure: schema with the boxes USER (set of user attributes), ITEM (set of item attributes), Algorithm, and rating]
Collaborative recommendation algorithms • Item model is a set of ratings • User model is a set of ratings • Predominant paradigm
User-based nearest neighbor recommendation • Pearson correlation coefficient • There are other algorithms • But they commonly provide less accurate results • Cosine similarity is becoming fashionable • Pearson correlation has some deficiencies • What if two users have few items in common? • What if the ratings are unary data? • What if something is loved or hated by the whole community?
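The Pearson correlation between two users, and the deficiencies listed above, can be seen in a short sketch. It assumes each user is a `{item: rating}` dictionary and computes the means over co-rated items only, which is one common variant. The `min_overlap` guard and the fallback value of 0.0 are assumptions for the example.

```python
import math

def pearson(ratings_a, ratings_b, min_overlap=2):
    """Pearson correlation between two users over their co-rated items.

    ratings_a, ratings_b: {item: rating}.
    Returns 0.0 when the users share fewer than `min_overlap` items or
    when either user has no variance on the shared items (e.g. unary data).
    """
    common = ratings_a.keys() & ratings_b.keys()
    if len(common) < min_overlap:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    cov = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

# Hypothetical users.
alice = {"i1": 5, "i2": 3, "i3": 1}
bob = {"i1": 4, "i2": 3, "i3": 2}      # same taste, milder scale
carol = {"i1": 1, "i2": 3, "i3": 5}    # opposite taste
dave = {"i1": 1, "i2": 1, "i3": 1}     # unary-like data, no variance
erin = {"i1": 5}                       # only one item in common

print(pearson(alice, bob), pearson(alice, carol))
print(pearson(alice, dave), pearson(alice, erin))
```

With only two co-rated items the coefficient is always +1, -1, or undefined, so practical systems often require a larger overlap or shrink the similarity when the overlap is small.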
User-based nearest neighbor recommendation • Processing time = O(N^2*M) • But not in real life • Neighborhood selection • 20 to 50 neighbors (sometimes up to 100) • Define a number of neighbors or a threshold • Better processing time: O(N*M) • Less noise • Reduced coverage
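Neighborhood selection and its effect on coverage can be sketched as follows. The prediction rule used here, a similarity-weighted average of mean-centered neighbor ratings, is a common choice for user-based collaborative filtering but is not stated in the slides. The function name, the `(similarity, neighbor_mean, neighbor_rating)` tuple format, and the default values are assumptions.

```python
def predict_rating(user_mean, candidates, k=30, min_sim=0.0):
    """Predict a rating from the k most similar neighbors.

    candidates: list of (similarity, neighbor_mean, neighbor_rating) for
    the neighbors who rated the target item.
    Only neighbors with similarity > min_sim are kept, then the top k.
    Returns None when no neighbor qualifies (this is the reduced coverage).
    """
    kept = sorted(
        (c for c in candidates if c[0] > min_sim),
        key=lambda c: c[0],
        reverse=True,
    )[:k]
    if not kept:
        return None
    num = sum(sim * (rating - mean) for sim, mean, rating in kept)
    den = sum(abs(sim) for sim, _, _ in kept)
    if den == 0:
        return None
    return user_mean + num / den

# Hypothetical neighbors of a user whose mean rating is 3.0.
candidates = [(0.9, 4.0, 5.0), (0.5, 2.0, 3.0), (0.1, 3.0, 1.0)]
print(predict_rating(3.0, candidates, k=2))            # weak neighbor dropped
print(predict_rating(3.0, candidates, min_sim=0.95))   # nobody qualifies
```

Raising `min_sim` or lowering `k` filters out noisy neighbors, at the cost of more items for which no prediction can be made.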
User-based nearest neighbor recommendation • Precomputed neighborhood • Better response time • Needs to be updated frequently (it is not a good idea to define clusters)
User-based nearest neighbor recommendation • Very popular • Based on subjective information • Very many variants and possible configurations • What do we do with new items? • What do we do with new users? • Needs (similar) users • Data sparsity is a problem
Item-based nearest neighbor recommendation • Pearson correlation coefficient or cosine similarity • But now the neighborhood is formed by items!! • A model has to be built • Processing time = O(I^2) • It is always precomputed • No need to store the whole model • Memory used vs. accuracy and coverage • Items are much more stable than users • But they still need to be updated
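A precomputed item model of this kind can be sketched as below. The sketch assumes cosine similarity over the users who rated both items and keeps only the `k` most similar items per item, which illustrates the memory versus accuracy and coverage trade-off. The function name, the `{user: {item: rating}}` input format, and the toy data are assumptions.

```python
import math
from collections import defaultdict

def build_item_model(ratings, k=2):
    """Precompute, for each item, its k most similar items.

    ratings: {user: {item: rating}} with positive ratings.
    Returns {item: [(neighbor_item, cosine_similarity), ...]}.
    Comparing every pair of items is the O(I^2) step.
    """
    # Invert the data: one {user: rating} vector per item.
    item_vectors = defaultdict(dict)
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            item_vectors[item][user] = rating
    norms = {
        item: math.sqrt(sum(r * r for r in vec.values()))
        for item, vec in item_vectors.items()
    }
    model = {}
    for i, vec_i in item_vectors.items():
        sims = []
        for j, vec_j in item_vectors.items():
            if i == j:
                continue
            dot = sum(r * vec_j[u] for u, r in vec_i.items() if u in vec_j)
            if dot > 0:
                sims.append((j, dot / (norms[i] * norms[j])))
        sims.sort(key=lambda s: s[1], reverse=True)
        model[i] = sims[:k]  # truncate: less memory, lower coverage
    return model

# Hypothetical ratings.
ratings = {
    "u1": {"a": 5, "b": 5},
    "u2": {"a": 4, "b": 4, "c": 1},
    "u3": {"c": 5, "d": 5},
}
model = build_item_model(ratings, k=1)
print(model["a"], model["d"])
```

Because the model depends only on item-to-item similarities, it can be rebuilt offline on a schedule and queried quickly at recommendation time.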
Item-based nearest neighbor recommendation • Efficient algorithm • Scales very well • Data sparsity is not a big problem • Creates nice recommendation lists • We still need to deal with the cold start • Memory use