
Advances in Response Prediction for Dyadic Data: Unsupervised Learning Techniques

DESCRIPTION

This presentation explores improved response prediction for dyadic data, where a "response value" is associated with a pair of objects. Applications include social networks, internet advertising, and recommendation systems, with collaborative filtering as the motivating example of unsupervised learning. The approach combines Bregman co-clustering with a neural network used as an alternative to a GLM, and is evaluated on MovieLens data using the area under the ROC curve. On this dataset, the extra linear features and the co-clusters carry similar information, so the gain from either depends on which is added first; the neural network may be more useful on other dyadic datasets.


Presentation Transcript


  1. Improving Response Prediction for Dyadic Data • Nik Tuzov • April 2008 • http://www.stat.purdue.edu/~ntuzov/

  2. Dyadic Data • Means that a certain “response value” is associated with a pair of objects Applications: • Social networks • Internet advertising • Recommendation systems

  3. Unsupervised learning • Example: Collaborative filtering (MovieLens project) • Movie 1 is “similar” to 5, hence Y is likely “B” • Users 1, 2, 3 are “similar” to each other, hence X is likely “C” or “D”

  4. Co-clustering with Bregman divergences • K*L rectangular co-clusters, each the direct product of a row cluster and a column cluster
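
The block structure on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's Matlab code: the names `cocluster_means`, `row_cluster`, and `col_cluster` and the toy ratings are assumptions, and the block mean is used only as the simplest possible summary (the slides do not state which Bregman divergence or summary statistics were used).

```python
from collections import defaultdict


def cocluster_means(ratings, row_cluster, col_cluster):
    """Average the observed responses inside each (row cluster, column cluster) block.

    ratings      -- dict mapping (row index, column index) -> observed response
    row_cluster  -- list giving the row-cluster label (0..K-1) of each row
    col_cluster  -- list giving the column-cluster label (0..L-1) of each column
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (i, j), y in ratings.items():
        # A cell's co-cluster is the pair of its row and column cluster labels,
        # so the K*L blocks are direct products of row and column clusters.
        block = (row_cluster[i], col_cluster[j])
        sums[block] += y
        counts[block] += 1
    # Only blocks that contain at least one observed response are returned.
    return {block: sums[block] / counts[block] for block in sums}


# Toy data (hypothetical): 4 users, 3 movies, K = 2 row clusters, L = 2 column clusters.
ratings = {(0, 0): 5.0, (0, 1): 4.0, (1, 0): 4.0, (2, 2): 1.0, (3, 2): 2.0}
row_cluster = [0, 0, 1, 1]
col_cluster = [0, 0, 1]
block_means = cocluster_means(ratings, row_cluster, col_cluster)
```

A missing response for a (user, movie) pair could then be predicted from the summary of the block that pair falls into, which is the intuition behind using co-clusters for response prediction.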

  5. Co-clustering with Bregman divergences (example from http://videolectures.net/kdd07_agarwal_pdlfm/)

  6. PDLF-GLM Model (Agarwal & Merugu ’07)

  7. Neural Network as alternative to GLM

  8. Algorithm

  9. Data: MovieLens • 20603 ratings, 346 users, 966 movies • From 1 to 198 ratings per movie, 32 to 105 ratings per user • 50 covariates for each (user, movie) pair • 5700 observations held out for validation • Performance is measured by the area under the Receiver Operating Characteristic (ROC) curve
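
As a reminder of what the performance measure computes, here is a minimal pure-Python sketch of the area under the ROC curve. It assumes the response has already been converted to binary labels (the slides do not say how the ratings were binarized), and the function name `roc_auc` and the toy labels and scores are illustrative only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example receives a
    higher score than a randomly chosen negative one (ties count as one half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy held-out labels and predicted scores (hypothetical values).
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of observations; for a validation set of a few thousand observations that is still workable, though a rank-based computation would be faster.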

  10. Neural Network Topology

  11. Number of nodes? • 40 nodes appear enough (produce similar overfitting)

  12. Results

  13. New Covariates? Sample movies from the cluster with delta = -0.57: • 756 ratings; 23 females and 55 males; no documentaries

  14. Contribution to ROC

  15. Is the Neural Network useful? • The gain in ROC area depends on the order: if the extra linear features (neural network) are added first, the gain from co-clustering is reduced • The opposite is also true • Hence, the information in the linear features is similar to that in the clusters • For this dataset, the neural network is not very helpful, but for other dyadic datasets it can be a lot more useful
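
The order dependence described on this slide can be made concrete with a small sketch. The AUC values below are made up purely for illustration and are NOT results from the presentation; `marginal_gain` and the feature-group names are likewise hypothetical.

```python
def marginal_gain(auc_by_features, base, added):
    """AUC gain from adding feature group `added` to the groups already in `base`."""
    before = frozenset(base)
    after = before | {added}
    return auc_by_features[after] - auc_by_features[before]


# HYPOTHETICAL validation AUCs for each combination of extra feature groups.
# These numbers are invented for illustration and do not come from the slides.
auc_by_features = {
    frozenset(): 0.70,                        # original covariates only
    frozenset({"linear"}): 0.76,              # plus extra linear features
    frozenset({"clusters"}): 0.77,            # plus co-clustering features
    frozenset({"linear", "clusters"}): 0.78,  # plus both groups
}

# Gain attributed to co-clustering when it is added first ...
gain_clusters_first = marginal_gain(auc_by_features, set(), "clusters")
# ... versus when the extra linear features are already in the model.
gain_clusters_after_linear = marginal_gain(auc_by_features, {"linear"}, "clusters")
```

When two feature groups carry overlapping information, whichever is added second shows the smaller marginal gain, which is the pattern the slide reports.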

  16. Related Work • What if we want to predict the response on a (Web page, Search query, Web user) triple? • B. Long, X. Wu, Z. Zhang, and P. S. Yu. Unsupervised learning on k-partite graphs. In KDD, 2006.

  17. Additional Info • To obtain a detailed report and Matlab code, please visit my website: http://www.stat.purdue.edu/~ntuzov/ • The project is posted in the “Software skills / Matlab” section • Questions? Contact me at ntuzov@purdue.edu
