Introduction to Recommender System


Presentation Transcript


  1. Introduction to Recommender System Guo, Guangming guogg.good@gmail.com

  2. Outline • Background & Definition • Some history worth noting • Various applications • Mainstream approaches • Evaluation • Some resources Lab of Semantic Computing and Data Mining

  3. Outline • Background & Definition • Related areas • Challenges • Paradigms • Some history worth noting • Various applications • Mainstream approaches • Evaluation • Some resources

  4. Getting clear on basic concepts • The first step in learning • Building blocks for new ideas • Define the rules of the game • A prerequisite for communication

  5. Definition of Recommender Systems • Also called recommendation systems • A subclass of information filtering systems that seeks to predict the 'rating' or 'preference' that a user would give to an item (such as music, books, or movies) or social element (e.g. people or groups) they had not yet considered, using a model built from the characteristics of an item (content-based approaches) or the user's social environment (collaborative filtering approaches). --http://en.wikipedia.org/wiki/Recommender

  6. More facts • An important vertical technique in data mining • One of the most successful solutions in industry • Became an independent research area in the 1990s • Many highly reputed academic conferences, such as SIGIR, KDD, ICML, WWW, and EMNLP, list it as a subtopic • The RecSys conference is fully devoted to this area • Two data mining/machine learning approaches • 1) specifying heuristics that define the utility function and empirically validating its performance • 2) estimating the utility function that optimizes a certain performance criterion, such as the mean squared error

  7. Challenges • Cold start • Long tail • Data sparsity • Scalability • Social & temporal • Context-aware • Personality-aware • Accuracy alone is not enough

  8. Related Research Areas • Cognitive science • Text mining • Natural language processing • Information retrieval • Machine learning • Association mining • Approximation theory • Management science • Consumer choice in marketing

  9. Paradigms of RecSys • Content-based recommendations: • recommend items similar to the ones the user preferred in the past; • Collaborative recommendations: • recommend items that people with similar tastes and preferences liked in the past; • Knowledge-based recommendations: • recommend items based on existing knowledge models that fit the needs of users • Hybrid approaches: • combine various input data and/or compose various mechanisms

  10. Background • A universal problem in the information age • Information overload • From search engines (SE) to RecSys • Pull vs. push • Web 1.0 vs. Web 2.0 • Leverage existing user-generated data • User profiles • Behavior history on the web, ratings • Click-through data, browsing data • Great benefits (win-win) • Help users find valuable information • Help businesses make more profit

  11. Outline • Background & Definition • Some history worth noting • Netflix Prize • Various applications • Mainstream approaches • Evaluation • Some resources

  12. A peak in the history • Research on collaborative filtering algorithms reached a peak during the Netflix movie recommendation competition • October 2, 2006 to September 21, 2009 • Evaluation metric: RMSE • Entries had to beat the RMSE of Netflix's baseline system (Cinematch) by 10%

  13. The Million Dollar Programming Prize • The Netflix Prize • Greatly energized research in RecSys • Ran from 2006 to 2009 • Winner: team BellKor's Pragmatic Chaos • A joint team • Andreas Töscher and Michael Jahrer (Commendo Research & Consulting GmbH), originally team BigChaos • Robert Bell and Chris Volinsky (AT&T) and Yehuda Koren (Yahoo), originally team BellKor • Martin Piotte and Martin Chabbert, originally team Pragmatic Theory • The Ensemble team • The most accurate algorithm in 2007 used an ensemble method of 107 different algorithmic approaches

  14. Outline • Background & Definition • Some history worth noting • Various applications • Mainstream approaches • Evaluation • Some resources

  15. Existing applications • News/article recommendation • Targeted advertising • Tag recommendation • Mobile recommendation • E-commerce • Books, movies, music…

  16. Benefits • An alternative to search engines • Boosts profits • Amazon and others • Better user experience

  17. Outline • Background & Definition • Some history worth noting • Various applications • Mainstream approaches • Content-based • Collaborative filtering • Evaluation • Some resources

  18. Content-based • Simply compute the similarity • Cosine similarity or Pearson correlation coefficient • TF-IDF • Use dimensionality reduction • LDA
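The similarity computation on this slide can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the item descriptions are hypothetical, the TF-IDF variant (term frequency times the log of inverse document frequency) is one common choice among several, and the helper names `tfidf_vectors` and `cosine` are invented for this example.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical item descriptions, already tokenized.
items = [
    ["space", "adventure", "robot"],
    ["space", "war", "robot"],
    ["romantic", "comedy", "paris"],
]
vecs = tfidf_vectors(items)
sim_01 = cosine(vecs[0], vecs[1])  # the two items share "space" and "robot"
sim_02 = cosine(vecs[0], vecs[2])  # the two items share no terms
```

A content-based recommender would then rank unseen items by their similarity to the items a user liked before; here item 1 ranks above item 2 for a user who liked item 0.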

  19. Collaborative filtering • Association mining • Memory-based • Nearest neighbors • Model-based • Latent factor model • Some comparisons • Space & time • Theoretical foundation and interpretability
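As an illustration of the memory-based, nearest-neighbor branch, here is a minimal user-based sketch. The ratings are hypothetical, and the specific choices (Pearson correlation over co-rated items, a mean-centered weighted average, neighbors picked by absolute similarity) are assumptions of this example rather than something the slide specifies.

```python
import math

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 5, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}

def pearson(r_u, r_v):
    """Pearson correlation computed over the items both users rated."""
    common = sorted(set(r_u) & set(r_v))
    if len(common) < 2:
        return 0.0
    mean_u = sum(r_u[i] for i in common) / len(common)
    mean_v = sum(r_v[i] for i in common) / len(common)
    num = sum((r_u[i] - mean_u) * (r_v[i] - mean_v) for i in common)
    den_u = math.sqrt(sum((r_u[i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((r_v[i] - mean_v) ** 2 for i in common))
    den = den_u * den_v
    return num / den if den else 0.0

def predict(user, item, k=2):
    """Predict a rating as the user's mean plus a weighted neighbor offset."""
    mean_user = sum(ratings[user].values()) / len(ratings[user])
    # (similarity, neighbor's mean rating, neighbor's rating of the item)
    candidates = [
        (pearson(ratings[user], r), sum(r.values()) / len(r), r[item])
        for other, r in ratings.items()
        if other != user and item in r
    ]
    # Keep the k neighbors with the largest absolute similarity.
    candidates.sort(key=lambda c: abs(c[0]), reverse=True)
    top = candidates[:k]
    norm = sum(abs(sim) for sim, _, _ in top)
    if norm == 0.0:
        return mean_user
    offset = sum(sim * (r_item - mean_v) for sim, mean_v, r_item in top) / norm
    return mean_user + offset

prediction = predict("alice", "D")
```

Memory-based methods like this keep the whole rating table around at prediction time, which is the space and time cost the slide contrasts with model-based approaches.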

  20. Latent factor model • LSI, pLSA, LDA, latent class models, topic models, etc. • A method based on matrix factorization/decomposition, where R is the rating matrix and P and Q are the factor matrices obtained after dimensionality reduction • A low-rank approximation of the original matrix
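The formula on this slide did not survive in the transcript. A common way to write the factorization it describes is given below; the dimensions m (users), n (items), k (latent factors) and the orientation of P and Q are assumptions of this sketch, not taken from the slide.

```latex
R \approx \hat{R} = P Q^{\top}, \qquad
P \in \mathbb{R}^{m \times k},\quad
Q \in \mathbb{R}^{n \times k},\quad
k \ll \min(m, n)

\hat{r}_{ui} = p_u^{\top} q_i = \sum_{f=1}^{k} p_{uf}\, q_{if}
```

Here p_u is the row of P for user u and q_i is the row of Q for item i, so each predicted rating is the inner product of a user factor vector and an item factor vector.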

  21. Computations • Traditional SVD • Needs a simple method to fill in the missing entries of the matrix first • The cost on the completed dense matrix is very high • The situation changed in 2006 with the Netflix Prize • Simon Funk • Defined a cost function on the training data only • Added a regularization term to avoid overfitting • Used gradient descent to optimize C(p,q)
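A minimal sketch of the Funk-style approach, under stated assumptions: the cost C(p,q) is taken to be the squared error over observed ratings plus an L2 penalty on the factor vectors, and it is minimized with stochastic gradient descent, one observed rating at a time. The function names, hyperparameters, and toy data are hypothetical.

```python
import random

def funk_mf(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
            epochs=200, seed=0):
    """Factorize observed (user, item, rating) triples into P and Q."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # Prediction error on this single observed rating.
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                p_uf, q_if = P[u][f], Q[i][f]
                # Gradient step on squared error plus L2 regularization
                # (the constant factor 2 is absorbed into the learning rate).
                P[u][f] += lr * (err * q_if - reg * p_uf)
                Q[i][f] += lr * (err * p_uf - reg * q_if)
    return P, Q

def squared_error(ratings, P, Q):
    """Sum of squared errors over the observed ratings only."""
    return sum(
        (r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))) ** 2
        for u, i, r in ratings
    )

# Hypothetical training data: (user index, item index, rating).
data = [(0, 0, 5), (0, 1, 3), (0, 2, 4),
        (1, 0, 4), (1, 2, 5), (1, 3, 4),
        (2, 0, 1), (2, 1, 5), (2, 3, 2)]

P0, Q0 = funk_mf(data, n_users=3, n_items=4, epochs=0)  # initial factors only
P, Q = funk_mf(data, n_users=3, n_items=4)
error_before = squared_error(data, P0, Q0)
error_after = squared_error(data, P, Q)
```

Because the loop touches only observed ratings, the matrix never has to be completed or stored densely, which is the cost that makes the traditional SVD route expensive.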

  22. Outline • Background & Definition • Some history worth noting • Various applications • Mainstream approaches • Evaluation • Some resources

  23. Evaluation Criteria • User satisfaction, measured by questionnaire • Precision • RMSE • Top-k • Coverage • Diversity • Novelty • Serendipity • Recommendations the user at first thought made no sense • …
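Two of the criteria above, RMSE and top-k precision, have simple closed forms. The sketch below shows one common definition of each; the function names and sample values are hypothetical.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and true ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that the user found relevant."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k

# Hypothetical predictions and ground truth.
rating_error = rmse([1.0, 1.0], [3.0, 3.0])
hit_rate = precision_at_k(["a", "b", "c", "d"], {"a", "c", "x"}, k=2)
```

RMSE scores rating prediction, while precision at k scores a ranked list, so a system can do well on one and poorly on the other.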

  24. Outline • Background & Definition • Some history worth noting • Various applications • Mainstream approaches • Evaluation • Some resources

  25. Hulu • Xiang Liang (项亮)

  26. Resources • www.recsyswiki.com • A compilation of resources on the major recommendation engines, by 大魁 • http://blog.csdn.net/lzt1983/article/details/7914536