
Online Learning for Collaborative Filtering




Presentation Transcript


  1. Online Learning for Collaborative Filtering Guang Ling, Haiqin Yang, Irwin King, Michael Lyu Presented by Guang LING

  2. Outline • Introduction • PMF and RMF • Online PMF and Online RMF • Experiments and Results • Conclusion and Future Work

  3. Introduction • We face an unprecedented amount of choice! • Search vs. Recommend

  4. Introduction • Recommender systems emerged • Content-based filtering • Analyzes item content • Collaborative filtering • Based on ratings

  5. Introduction • Collaborative filtering • Allows users to rate items • Infers users' tastes and items' features from the ratings • Matches users' preferences with items' features

  6. Introduction • Various methods have been developed • Memory based • User based • Item based • Model based • PMF, RMF • PLSA, PLPA • So, what is the problem?

  7. Introduction • Unrealistic assumptions • All ratings are available • There will be no new ratings • The data set is small enough to be handled in main memory • Reality • Ratings are collected over time • New ratings are received constantly • Huge data sets cannot be easily handled

  8. Introduction • We propose online CF algorithms that • Obviate the need to hold all data • Make incremental changes based solely on new ratings • Scale linearly with the number of ratings • Extra feature • Allow explicit control of the regularization effect

  9. PMF and RMF • Matrix factorization models • Factor the rating matrix R (no. users × no. items) into user factors U and item factors V • Minimize • Squared loss: PMF • Cross entropy: RMF (see the sketch below)
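
A minimal sketch of the factorization, with m users, n items, and K latent factors; the matrix notation is an assumption, not copied from the slides:

  \[ R \approx U^{\top} V, \qquad U \in \mathbb{R}^{K \times m}, \; V \in \mathbb{R}^{K \times n}, \qquad \hat{R}_{ui} = U_u^{\top} V_i \]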

  10. PMF • Conditional distribution over observed ratings • Spherical Gaussian priors on user and movie feature vectors • Maximize the posterior (reconstructed below)
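
The equations on this slide were images; a hedged reconstruction following the standard PMF formulation, where I_{ui} indicates whether user u rated item i:

  \[ p(R \mid U, V, \sigma^2) = \prod_{u=1}^{m} \prod_{i=1}^{n} \Big[ \mathcal{N}\big(R_{ui} \mid U_u^{\top} V_i, \sigma^2\big) \Big]^{I_{ui}} \]
  \[ p(U \mid \sigma_U^2) = \prod_{u=1}^{m} \mathcal{N}\big(U_u \mid 0, \sigma_U^2 \mathbf{I}\big), \qquad p(V \mid \sigma_V^2) = \prod_{i=1}^{n} \mathcal{N}\big(V_i \mid 0, \sigma_V^2 \mathbf{I}\big) \]
  \[ \max_{U,V} \; p(U, V \mid R) \;\propto\; p(R \mid U, V, \sigma^2)\, p(U \mid \sigma_U^2)\, p(V \mid \sigma_V^2) \]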

  11. PMF • Maximizing the posterior is equivalent to minimizing a loss with a squared-loss term and a regularization term • Use gradient descent to minimize this loss (see the sketch below)
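
A sketch of the implied objective and its gradient-descent update; the symbols λ_U, λ_V (regularization) and η (learning rate) are assumptions:

  \[ \mathcal{L}(U, V) = \frac{1}{2} \sum_{u,i} I_{ui} \big(R_{ui} - U_u^{\top} V_i\big)^2 + \frac{\lambda_U}{2} \sum_{u} \|U_u\|^2 + \frac{\lambda_V}{2} \sum_{i} \|V_i\|^2 \]
  \[ U_u \leftarrow U_u - \eta \, \frac{\partial \mathcal{L}}{\partial U_u}, \qquad V_i \leftarrow V_i - \eta \, \frac{\partial \mathcal{L}}{\partial V_i} \]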

  12. RMF • Top-one probability • The probability that an item i is ranked on top (see the reconstruction below) • Minimize cross entropy • Cross entropy measures the divergence between two distributions • Un-normalized KL-divergence
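
A reconstruction of the top-one probability as a softmax over the items I_u rated by user u; the monotone transformation g (e.g. a logistic function) is an assumption:

  \[ P_u(i) = \frac{\exp\!\big(g(R_{ui})\big)}{\sum_{k \in I_u} \exp\!\big(g(R_{uk})\big)} \]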

  13. RMF • The model loss consists of a cross-entropy term and a regularization term • Use gradient descent to minimize it (see the sketch below)
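
A sketch of the RMF loss: the cross entropy is taken between the top-one probabilities P_u computed from the observed ratings R_{ui} and \hat{P}_u computed from the predictions U_u^{\top} V_i; the Frobenius-norm regularizer is an assumption:

  \[ \mathcal{L}(U, V) = - \sum_{u} \sum_{i \in I_u} P_u(i) \, \log \hat{P}_u(i) + \frac{\lambda}{2} \big( \|U\|_F^2 + \|V\|_F^2 \big) \]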

  14. Online PMF • We propose two online algorithms for PMF • Stochastic gradient descent • Adjust model stochastically for each observation • Regularized dual averaging • Maintain an approximated average gradient • Solve an easy optimization problem at each iteration

  15. Stochastic Gradient Descent PMF • Recall the loss function for PMF • The squared loss can be dissected and associated with each observation triplet • Update the model using the gradient of this per-triplet loss (a code sketch follows)
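
A minimal sketch of the per-triplet update, assuming user/item factors stored as columns of U and V; the names sgd_pmf_update, eta (learning rate), and lam (regularization) are illustrative assumptions, not from the slides:

  import numpy as np

  def sgd_pmf_update(U, V, u, i, r, eta=0.01, lam=0.1):
      """One stochastic gradient step for a newly observed rating triplet (u, i, r).
      U: (K, m) user factors, V: (K, n) item factors; columns are feature vectors."""
      err = U[:, u] @ V[:, i] - r               # prediction error on this triplet
      grad_u = err * V[:, i] + lam * U[:, u]    # gradient of local squared loss + L2 term
      grad_v = err * U[:, u] + lam * V[:, i]
      U[:, u] -= eta * grad_u                   # adjust only the affected user vector
      V[:, i] -= eta * grad_v                   # and the affected item vector
      return U, V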

  16. Regularized Dual Averaging PMF • Maintain the approximated average gradient: a running average over the items rated by u, combining the previous gradient with the gradient due to the new observation (reconstructed below)
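
A reconstruction of the running average from the slide's annotation labels, where n_u is the number of items rated by u so far and g_u^{new} is the gradient due to the new observation; the exact form is an assumption:

  \[ \bar{g}_u \;\leftarrow\; \frac{n_u - 1}{n_u}\, \bar{g}_u \;+\; \frac{1}{n_u}\, g_u^{\mathrm{new}} \]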

  17. Regularized Dual Averaging PMF • Solve the following optimization problem (sketched below) to obtain • the new user feature vector • the new item feature vector
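
A plausible instantiation of the per-step problem, following standard regularized dual averaging with an L2 regularizer; this is an assumption and the paper's exact objective may include additional terms:

  \[ U_u \;=\; \arg\min_{w} \Big\{ \langle \bar{g}_u, w \rangle + \frac{\lambda}{2} \|w\|^2 \Big\} \;=\; -\frac{1}{\lambda}\, \bar{g}_u \]

The new item feature vector V_i is obtained from the analogous problem with its own average gradient.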

  18. Online RMF • Similar to online PMF, we propose two online algorithms for RMF • Stochastic Gradient Descent • Regularized Dual Averaging • However, the challenge is • Loss function cannot be easily dissected

  19. Online RMF • Recall the loss function for RMF • When a new observation is revealed • Loss due to new item • Decay of previous items

  20. Online RMF • We approximate the gradient by decaying the previous gradient and adding the gradient with respect to the new item (reconstructed below)
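
A reconstruction from the slide's annotation labels, with a decay factor γ applied to the previously accumulated gradient; the symbols are assumptions:

  \[ g_u \;\approx\; \gamma \, g_u^{\mathrm{prev}} \;+\; \frac{\partial \ell_{\mathrm{new}}}{\partial U_u} \]

where ℓ_new is the loss term contributed by the newly observed item; an analogous approximation applies to the item feature vectors.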

  21. Online RMF • Stochastic Gradient Descent RMF • Dual Averaging RMF

  22. Experiments and Results • Online Vs. Batch algorithms • Performance under different settings • Sensitivity analysis of parameters • Scalability to large dataset

  23. Evaluation Metric • Root Mean Square Error (RMSE) • The lower the better • Normalized Discounted Cumulative Gain (NDCG) • The higher the better (definitions below)
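
Standard definitions of the two metrics, assuming a test set T and graded relevance rel_p at rank p; the exact NDCG variant used in the paper may differ:

  \[ \mathrm{RMSE} = \sqrt{ \frac{1}{|T|} \sum_{(u,i) \in T} \big( R_{ui} - \hat{R}_{ui} \big)^2 } \]
  \[ \mathrm{NDCG}@k = \frac{\mathrm{DCG}@k}{\mathrm{IDCG}@k}, \qquad \mathrm{DCG}@k = \sum_{p=1}^{k} \frac{2^{\,rel_p} - 1}{\log_2(p + 1)} \]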

  24. Online Vs. Batch algorithms • We conduct experiments on a real-life data set • MovieLens: movie rating data set • 6,040 users • 3,900 movies • 1,000,209 ratings • 4.25% of the user-item rating matrix is known • Simulate three settings • T1: 10% training, 90% testing • T5: 50% training, 50% testing • T9: 90% training, 10% testing

  25. Online Vs. Batch algorithms • Shown below are the PMF results [plots for the T1, T5, and T9 settings]

  26. Online Vs. Batch algorithms • Shown below are the RMF results [plots for the T1, T5, and T9 settings]

  27. Impact of the regularization parameter in PMF • Observations • Less training data needs more regularization • Results are quite sensitive to regularization [plots: SGD-PMF, DA-PMF]

  28. Impact of the regularization parameter in RMF • Observation • Less training data needs more regularization [plots: SGD-RMF, RDA-RMF]

  29. Impact of the learning rate • The learning rate is used in the stochastic gradient descent algorithms only [plots: SGD-RMF, SGD-PMF]

  30. Scalability to large dataset • Yahoo! Music dataset • Largest CF dataset publicly available • 252,800,275 ratings • 1,000,990 users • 624,961 items • Rating value range [0, 100]

  31. Scalability to large dataset • Experiment environment • Linux workstation (Xeon Dual Core 2.4 GHz, 32 GB RAM) • Batch PMF: 8 hours for 120 iterations • Online PMF: 10 minutes [plots: T1, T5]

  32. Conclusion and Future Work • We proposed online CF algorithms that • Perform comparably to or even better than the corresponding batch algorithms • Scale linearly with the number of ratings • Adjust the model incrementally given new observations • Future Work • Theoretical bound for the convergence rate • Find a better approximation for the average gradient of RMF

  33. Thanks! • Questions?
