Yahoo! Music Recommendations. Modeling Music Ratings with Temporal Dynamics and Item Taxonomy. Yahoo! Research. Outline. Features of Yahoo! Music Dataset Basic Latent Factor Model Improved modelling with Taxonomy Temporal Dynamics Experiments and Result.
Yahoo! Music Recommendations: Modeling Music Ratings with Temporal Dynamics and Item Taxonomy • Yahoo! Research
Outline • Features of Yahoo! Music Dataset • Basic Latent Factor Model • Improved modelling with • Taxonomy • Temporal Dynamics • Experiments and Results
Features of Yahoo! Music Dataset • Track 1 of KDD Cup 2011 • Ratings over nearly 10 years • Items: 624,961 • Users: 1,000,990 • Ratings: 262,810,175 • Ratings are given to 4 different item types • Track (item to be recommended) • Artist, Album, and Genre
Basic Latent Factor Model • Regularized SVD • Each user and item is mapped into the same latent factor space • Rating predicted as the dot product of the user and item vectors (a similarity measure) • User and item vectors play the role of singular vectors • Stochastic Gradient Descent • Iterates over each observed rating • Good performance with a sparse rating matrix
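The regularized factorization trained by stochastic gradient descent can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`train_mf`, `predict`, `rmse`), the hyperparameters, and the toy 1 to 5 rating data are all assumptions made for the example.

```python
import random


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(a * b for a, b in zip(P[u], Q[i]))


def rmse(P, Q, ratings):
    """Root mean squared error over (user, item, rating) triples."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5


def init_factors(n_rows, k, rng):
    """Small random initial factors (an assumed initialization)."""
    return [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_rows)]


def train_mf(ratings, P, Q, lr=0.01, reg=0.05, epochs=200):
    """L2-regularized SGD: visit every observed rating in each epoch."""
    k = len(P[0])
    for _ in range(epochs):
        for u, i, r in ratings:
            pu, qi = P[u], Q[i]
            err = r - sum(a * b for a, b in zip(pu, qi))
            for f in range(k):
                puf, qif = pu[f], qi[f]
                pu[f] += lr * (err * qif - reg * puf)
                qi[f] += lr * (err * puf - reg * qif)
    return P, Q


# Hypothetical toy data: (user, item, rating)
toy = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
rng = random.Random(0)
P = init_factors(3, 8, rng)
Q = init_factors(3, 8, rng)
rmse_before = rmse(P, Q, toy)
train_mf(toy, P, Q)
rmse_after = rmse(P, Q, toy)
```

Only observed ratings are visited, which is why this approach copes well with a sparse rating matrix.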
Incorporate Dataset Features • Two dimensions • Bias • Personalization • Two effects • Taxonomy • Temporal effect • User session • Long-term dynamics
Bias Modelling • Why bias, and basic bias model • Taxonomy biases • Item based • User based • Temporal effect • User session • Long term
Why Bias Modelling? • Lack of personalization • Not saying they are of no importance! • Netflix Prize Data • 52.9% of observed variance is explained • 41.4% by user and item bias • 11.5% by personalization • Separate changing effects from those unchanged
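As a quick check of the Netflix Prize figures quoted above, the two components add up to the explained total, and bias accounts for roughly 78% of it. The helper name `bias_share` is hypothetical and exists only for this illustration.

```python
def bias_share(bias_pct, personalization_pct):
    """Fraction of the explained variance that is attributable to bias."""
    explained = bias_pct + personalization_pct
    return bias_pct / explained


# Figures from the slide: 41.4% bias, 11.5% personalization, 52.9% total
total = 41.4 + 11.5
share = bias_share(41.4, 11.5)
```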
Bias with Taxonomy • Observation • Item biases share components through the taxonomy! • Item bias • Album • Artist • Genre • User bias • A user's taste for a particular type affects all songs of that type
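One way to picture shared item-bias components is to sum a track's own bias with the biases of its album, artist, and genres. The sketch below assumes an additive form and averages over genres; both choices, along with every name and number in it, are illustrative assumptions rather than the formulation confirmed by the slides.

```python
def item_bias(track, b_track, b_album, b_artist, b_genre,
              album_of, artist_of, genres_of):
    """Assumed additive item bias: track + album + artist + mean genre bias."""
    bias = b_track.get(track, 0.0)
    album = album_of.get(track)
    if album is not None:
        bias += b_album.get(album, 0.0)
    artist = artist_of.get(track)
    if artist is not None:
        bias += b_artist.get(artist, 0.0)
    genres = genres_of.get(track, [])
    if genres:
        bias += sum(b_genre.get(g, 0.0) for g in genres) / len(genres)
    return bias


# Hypothetical example: a new track with no learned bias of its own
# still inherits bias from its album, artist, and genres.
new_track_bias = item_bias(
    "t1",
    b_track={},
    b_album={"al1": 2.0},
    b_artist={"ar1": -0.5},
    b_genre={"rock": 1.0, "indie": 3.0},
    album_of={"t1": "al1"},
    artist_of={"t1": "ar1"},
    genres_of={"t1": ["rock", "indie"]},
)
```

Sharing components this way lets rarely rated tracks borrow statistical strength from their better-observed parents in the taxonomy.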
Bias with Temporal Effects • User session • Drifting effect: context of ratings • Humans are better at comparing items than at rating on an absolute scale: more on that later • Biases of past sessions are discarded as noise • The current session's bias is retained for prediction
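The slides do not say how a session is delimited. A minimal sketch, assuming a session is a run of ratings separated by no more than a fixed idle gap (the 5-hour default and the function name `split_sessions` are assumptions), could look like this:

```python
def split_sessions(timestamps, max_gap_seconds=5 * 3600):
    """Group rating timestamps (in seconds) into sessions by idle gap."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= max_gap_seconds:
            sessions[-1].append(t)   # continue the current session
        else:
            sessions.append([t])     # gap too large: start a new session
    return sessions


# Hypothetical timestamps: three ratings close together, then two much later
sessions = split_sessions([0, 60, 120, 100000, 100030])
current_session = sessions[-1]
```

Under this scheme only the last session's bias would be carried into prediction, while the biases of earlier sessions serve to absorb noise during training.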
Bias with Temporal Effects • Long term dynamics of items • New songs’ ratings have different patterns compared to old songs • Steady after 360 weeks
Personalization • Apply the same techniques to personalization • Taxonomy for items • Model album/artist effect independently • Genre did not improve result • Session specific factors for users • Sudden change of user’s taste • History session factors are discarded as noise • Keep current session factor for prediction
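Applying the same ideas to the latent factors can be sketched by composing vectors additively: an item's effective factor vector combines track, album, and artist vectors, and a user's effective vector adds a session-specific offset. The additive composition and all names and values below are assumptions for illustration.

```python
def add_vectors(*vectors):
    """Element-wise sum of equal-length vectors."""
    return [sum(components) for components in zip(*vectors)]


def predict_personalized(p_user, p_session, q_track, q_album, q_artist):
    """Dot product of the composed user and item factor vectors."""
    user_vec = add_vectors(p_user, p_session)
    item_vec = add_vectors(q_track, q_album, q_artist)
    return sum(a * b for a, b in zip(user_vec, item_vec))


# Hypothetical 2-dimensional factors
score = predict_personalized(
    p_user=[1.0, 2.0],
    p_session=[0.0, -1.0],   # current-session shift in taste
    q_track=[1.0, 0.0],
    q_album=[0.5, 0.5],
    q_artist=[0.0, 1.0],
)
```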
Learning the Model • What parameters to learn? • Two Phase Learning • Bias, User, Item…Phase 1 • Session specific parameters…Phase 2 • Stochastic Gradient Descent • Go through each rating available • L2 regularized • Cyclic iteration: sweep forward and backward alternately to avoid discontinuity across iterations
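The cyclic sweep can be illustrated with a small generator that reverses the visiting order on every other epoch, so each epoch starts where the previous one ended. The name `sweep_orders` is hypothetical, and the sketch assumes the ratings are already stored in a fixed (for example chronological) order.

```python
def sweep_orders(n_ratings, n_epochs):
    """Yield rating indices per epoch, alternating forward and backward."""
    for epoch in range(n_epochs):
        forward = list(range(n_ratings))
        yield forward if epoch % 2 == 0 else forward[::-1]


orders = list(sweep_orders(3, 3))
```

With this ordering, the last rating visited in one epoch is the first visited in the next, which is the continuity the slide refers to.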
Result Analysis • Bias accounts for ~80% of explained variance • Taxonomy: reduces RMSE by 0.6 • User session: reduces RMSE by 0.9 • Long term: reduces RMSE by 0.1 • Latent factors did not improve much • Taxonomy: reduces RMSE by 0.2 • User session: reduces RMSE by 0.2
Result Analysis • Released prior to KDD Cup 2011 • Best score by NTU on Track 1: • Single model: 22.90 (this model scores lower, at 22.59) • Ensemble: 21.01 (blending/stacking is key)