1 / 1

Kostadin Georgiev, VMware Bulgaria Preslav Nakov , Qatar Computing Research Institute

A non-IID Framework for Collaborative Filtering with Restricted Boltzmann Machines. Kostadin Georgiev, VMware Bulgaria Preslav Nakov , Qatar Computing Research Institute ICML, June 17, 2013, Atlanta. 5. Experiments & Evaluation. 1. Overview.

analu
Télécharger la présentation

Kostadin Georgiev, VMware Bulgaria Preslav Nakov , Qatar Computing Research Institute

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A non-IID Frameworkfor Collaborative Filtering with Restricted Boltzmann Machines Kostadin Georgiev, VMware Bulgaria PreslavNakov, Qatar Computing Research Institute ICML, June 17, 2013, Atlanta 5. Experiments & Evaluation 1. Overview • We propose a non-IID framework for collaborative filtering • based on Restricted Boltzmann Machines (RBM) • models both user-user and item-item correlations • uses real values in the visible layer (as opposed to multinomial) to model user-item ratings, thus taking advantage of the natural order between them • We further explore the potential of combining the original training data with data generated by the RBM-based model itself in a bootstrapping fashion • Evaluation: on two MovieLens datasets - with 100K and 1M user-item ratings, respectively • Results: Rival the state-of-the-art Data • Evaluation • Two MovieLens datasets: • 100k: • 1,682 movies assigned • 943 users • 100,000 ratings • sparseness: 93.7% • Each rating is between 1 (worst) and 5 (best) • Mean Absolute Error (MAE): • Cross-validation • 5-fold • 80%:20% training:testing data splits • 1M: • 3,952 movies • 6,040 users • 1 million ratings • sparseness: 95.8% - Item-based RBM model outperforms user-based, but not by much. - Hybrid item-based RBM + NB model is relatively insensitive to number of units. 2. User/Item-based RBM Model • The visible layer • represents either all ratings by a user or all rating for an item • units model ratings as real values (vs. multinomial) • noise-free reconstruction is better 3. Non-IID Hybrid RBM Model In the IID case, multinomial visible units are better than real-valued. In the non-IID case: real-valued visible units outperform multinomial. 6. Comparison to Previous Work MovieLens 1M • Remove the IID assumption for the training data • Topology: Unit vij is connected to two independent hidden layers: one user-based and another item-based. • Missing values (ratings): the generatedpredictions are used during testing, but are ignored during training • Training procedure: we average the predictions of the user-based and of the item-based RBM models MovieLens 100k 4. Neighborhood + I-RBM Use the I-RBM predictions from a neighborhood-based algorithm: 7. Conclusion & Future Work However, compute the averages from the original ratings: • Conclusion • proposed a non-IID RBM framework for collaborative filtering • rivals the best previous algorithms, which are more complex • Future work • add an additional layer to model higher-order correlations • add content-based features, e.g., demographic

More Related