Mining customer ratings for product recommendation using the support vector machine and the latent class model
William K. Cheung, James T. Kwok, Martin H. Law, Kwok-Ching Tsui
Intelligent Systems Research Group, BT Laboratories / Hong Kong Baptist University


Presentation Transcript


  1. Mining customer ratings for product recommendation using the support vector machine and the latent class model. William K. Cheung, James T. Kwok, Martin H. Law, Kwok-Ching Tsui. Intelligent Systems Research Group, BT Laboratories / Hong Kong Baptist University

  2. What is a Recommender System?
  [Diagram: records of other customers (possibly with ratings) flow into a recommender system, which produces recommendations.]

  3. Product Recommendation in E-commerce
  [Screenshot: products and recommendations on www.amazon.com]

  4. Product Recommendation in E-commerce
  [Screenshot: products and recommendations on www.cdnow.com]

  5. Overview
  [Diagram: a personal profile feeds the content-based recommender system, built on the Support Vector Machine (SVM); ratings from records of other customers (possibly with ratings) feed the collaborative recommender system, built on the Extended Latent Class Model (ELCM).]

  6. Presentation Outline
  • Content-based Recommendation
    • Existing Solutions and Their Limitations
    • Our Proposed Solution - the SVM
  • Collaborative Recommendation
    • Existing Solutions and Their Limitations
    • Our Proposed Solution - the Extended LCM
  • Experimental Evaluation
  • Conclusion and Future Work

  7. Content-based Recommendation
  [Diagram: a personal profile feeds the content-based recommender system.]
  • Matching between the personal profile and the features extracted from product descriptions.
  • Assumptions:
    • Customer personal profiles are available.
    • Detailed product descriptions are available, so that a set of representative features can be extracted.
    • Both the profiles and the product descriptions share the same representation.

  8. Some Existing Solutions
  • Keyword Matching
    • suffers from the problems of synonymy and polysemy.
  • Pattern Classification Approaches (see the sketch below)
    • f(y) = {f1(y), f2(y), …, fm(y)}: the set of features for product y
    • ax(f(y)): the classifier output for customer x's interest, obtained via training, such that ax(f(y)) predicts whether x is interested in product y.
    • Examples of classifiers: Naïve Bayes, k-NN, C4.5 (decision tree)
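
To make the pattern-classification formulation concrete, here is a minimal sketch in Python using one of the classifiers named on the slide (Naïve Bayes, via scikit-learn); the feature matrix and labels are hypothetical stand-ins for f(y) and customer x's past ratings, not the paper's data.

    # Minimal content-based recommendation as pattern classification.
    # X holds f(y) = {f1(y), ..., fm(y)} as binary features per product
    # (e.g. presence of a director, genre, keyword); values are made up.
    import numpy as np
    from sklearn.naive_bayes import BernoulliNB

    X_train = np.array([[1, 0, 1, 0],
                        [1, 1, 0, 0],
                        [0, 0, 1, 1],
                        [0, 1, 0, 1]])
    y_train = np.array([1, 1, 0, 0])   # 1 = customer x liked the product

    a_x = BernoulliNB().fit(X_train, y_train)   # the per-customer classifier a_x

    X_new = np.array([[1, 0, 0, 0],
                      [0, 1, 1, 1]])
    scores = a_x.predict_proba(X_new)[:, 1]     # P(interested | f(y))
    print(scores)   # rank unseen products by this score to recommend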

  9. Feature Selection Problem
  • The performance of content-based recommendation depends heavily on the discriminative power of the features selected to be extracted.
    • Too few features => hard to learn useful profiles (shallow analysis).
    • Too many features => hard to estimate the classifier's parameters with good generalisation performance.

  10. Our Proposed Solution - the use of SVM
  • The Support Vector Machine has been shown to achieve good generalisation performance when classifying high-dimensional data sets, and its training can be framed as solving a quadratic programming problem.
  • => one can simply use all extracted features as the input; there is no need for feature selection at all.

  11. Pattern Classification...

  12. Which line is the best? (Training and Generalization)

  13. Support Vector Machine (SVM)
  • Intuitively, maximize the margin between the classes.
  • Theoretically sound
    • related to minimizing the VC-dimension under the theory of structural risk minimization.
  [Figure: two classes separated by a line, with the margin marked.]

  14. Solving for the line
  • Computationally, this leads to a quadratic programming problem (formulated below):
    • maximize a quadratic objective function subject to some linear constraints
    • no local maximum (cf. neural networks)
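
For reference, the standard hard-margin formulation behind these slides (the slide's own equations are not preserved in the transcript) can be written as:

    % Primal: maximize the margin 2/||w|| by minimizing ||w||^2,
    % subject to every training point being classified correctly.
    \min_{\mathbf{w},\,b}\ \tfrac{1}{2}\|\mathbf{w}\|^2
    \quad\text{s.t.}\quad y_i(\mathbf{w}^\top\mathbf{x}_i + b) \ge 1,\quad i=1,\dots,n

    % Dual: a quadratic objective with linear constraints -- the
    % quadratic programming problem referred to on the slide.
    \max_{\boldsymbol{\alpha}}\ \sum_{i=1}^n \alpha_i
      - \tfrac{1}{2}\sum_{i=1}^n\sum_{j=1}^n
        \alpha_i\alpha_j\, y_i y_j\, \mathbf{x}_i^\top\mathbf{x}_j
    \quad\text{s.t.}\quad \alpha_i \ge 0,\ \ \sum_{i=1}^n \alpha_i y_i = 0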

  15. Support Vectors
  • The line depends only on a small number of training examples - the support vectors (the points with non-zero αi in the dual).

  16. Nonlinear Cases
  • Use another coordinate system such that the "curve" becomes a "line".

  17. Kernels
  • Only inner products, Φ(x)ᵀΦ(y), are involved in the calculation.
  • Under certain conditions, there exists a kernel K such that K(x, y) = Φ(x)ᵀΦ(y).
    • e.g. polynomial of degree d: K(x, y) = (xᵀy + 1)^d
  • Replace xᵀy by Φ(x)ᵀΦ(y), i.e. by K(x, y), in the dual (see the check below).
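
As a quick sanity check of the kernel identity (a sketch, not from the slides), the degree-2 polynomial kernel on 2-D inputs equals the inner product of an explicit 6-dimensional feature map:

    # Verify K(x, y) = (x^T y + 1)^2 = Phi(x)^T Phi(y) for 2-D inputs.
    import numpy as np

    def phi(x):
        # Explicit feature map whose inner product gives (x^T y + 1)^2.
        x1, x2 = x
        return np.array([1.0,
                         np.sqrt(2) * x1, np.sqrt(2) * x2,
                         x1 ** 2, x2 ** 2,
                         np.sqrt(2) * x1 * x2])

    x = np.array([0.5, -1.0])
    y = np.array([2.0, 0.25])

    lhs = (x @ y + 1.0) ** 2   # kernel: no feature map ever computed
    rhs = phi(x) @ phi(y)      # inner product in the feature space
    print(lhs, rhs)            # both print 3.0625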

  18. Overlapping Cases
  • Impossible to perfectly separate the two classes.
  • Include an error term.
  • Instead of maximizing the margin, minimize (error + λ / margin), where λ trades off the two terms.
  • Again, this involves only quadratic programming.
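
A minimal soft-margin SVM sketch (scikit-learn on hypothetical toy data); the parameter C plays the role of the error/margin trade-off above.

    # Two overlapping Gaussian classes: not perfectly separable.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)),
                   rng.normal(+1.0, 1.0, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)

    # Polynomial kernel of degree 2 (cf. slide 17); C trades training
    # error against margin width.
    clf = SVC(kernel="poly", degree=2, coef0=1.0, C=1.0).fit(X, y)

    print("training accuracy:", clf.score(X, y))
    print("support vectors per class:", clf.n_support_)  # only these matter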

  19. Collaborative Recommendation
  [Diagram: product ratings from records of other customers feed the collaborative recommender system.]
  • Matching between the customer's ratings and the ratings of others (the word-of-mouth approach).
  • Assumptions:
    • Ratings from a reasonably large group of customers are available.
    • Each product has been rated by some of the customers.
    • The product ratings overlap to certain degrees.

  20. Some Existing Solutions
  • Memory-based Approach
    • Pearson correlation coefficient and its variants (see the sketch below)
    • suffers from the sparsity and first-rater problems.
  • Model-based Approach
    • solves the sparsity problem by incorporating a priori models.
    • e.g., Naïve Bayes Classifier, Bayesian Network, Latent Class Model
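
A sketch of the memory-based approach (hypothetical ratings; NaN marks an unrated product, as in the table on the next slide). On data this sparse, most customer pairs share too few co-rated products for the correlation to be computed at all, which is exactly the sparsity problem:

    import numpy as np

    R = np.array([[5, np.nan, np.nan, 4],        # customer x1
                  [np.nan, 5, 4, np.nan],        # customer x2
                  [1, np.nan, 4, np.nan],        # customer x3
                  [5, np.nan, np.nan, np.nan]])  # a new customer xn

    def pearson(u, v):
        # Pearson correlation over the products both customers rated.
        both = ~np.isnan(u) & ~np.isnan(v)
        if both.sum() < 2:
            return 0.0                 # too little overlap: sparsity problem
        du = u[both] - u[both].mean()
        dv = v[both] - v[both].mean()
        denom = np.sqrt((du ** 2).sum() * (dv ** 2).sum())
        return float(du @ dv / denom) if denom > 0 else 0.0

    print([pearson(R[3], R[i]) for i in range(3)])   # all 0.0 here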

  21. Limitations
  • The sparsity problem (lacking sufficient ratings)
  • The first-rater problem (encountering new products)

  Ratings over eight products ("-" = unrated):
  Customer x1:         5  -  -  4  -  -  -  -
  Customer x2:         -  5  4  -  -  -  -  -
  Customer x3:         1  -  4  -  4  -  -  -
  A new customer xn:   5  -  -  -  -  -  -  -

  22. Grouping Preference Ratings - to solve the sparsity problem
  [Diagram: the ratings table from slide 21, with products grouped under Preference Pattern #1 and Preference Pattern #2; products in the pattern matching xn's ratings are marked "Recommended!"]

  23. Integrating Product Contents - to solve the first-rater problem
  [Diagram: the ratings table from slide 21; a product with no ratings yet is assigned to a preference pattern via its content and marked "Recommended!"]

  24. Our Proposed Solution - the use of LCM
  • The latent class model was proposed by Thomas Hofmann et al. (IJCAI'99) for clustering preference ratings, with promising results.
  • Limitation: it is only capable of recommending products to customers in the training set.
  • We extend the model so that:
    • a) existing products can be recommended to customers not in the training set;
    • b) new products can be recommended to existing customers (not described in the paper).

  25. Latent Class Model
  [Diagram: observed customer X and product Y are linked through a hidden preference pattern Z.]
  • Model training: learn P(z), P(x|z) and P(y|z) using the EM algorithm (a compact sketch follows).
  • The model is initialized by K-means clustering.
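
A compact EM sketch for the latent class model on hypothetical 0/1 "customer rated product" counts; the paper's K-means initialization and any rating-weighted details are omitted for brevity.

    import numpy as np

    rng = np.random.default_rng(0)
    N = (rng.random((6, 8)) < 0.4).astype(float)  # n(x, y): 6 customers, 8 products
    K = 2                                         # number of preference patterns z

    # Random start for P(z), P(x|z), P(y|z) (the paper uses K-means here).
    Pz = np.full(K, 1.0 / K)
    Px_z = rng.random((6, K)); Px_z /= Px_z.sum(axis=0)
    Py_z = rng.random((8, K)); Py_z /= Py_z.sum(axis=0)

    for _ in range(100):
        # E-step: responsibilities P(z | x, y) ∝ P(z) P(x|z) P(y|z).
        joint = Pz[None, None, :] * Px_z[:, None, :] * Py_z[None, :, :]
        resp = joint / joint.sum(axis=2, keepdims=True)
        # M-step: re-estimate the parameters from the weighted counts.
        w = N[:, :, None] * resp
        Pz = w.sum(axis=(0, 1)) / N.sum()
        Px_z = w.sum(axis=1) / w.sum(axis=(0, 1))
        Py_z = w.sum(axis=0) / w.sum(axis=(0, 1))

    print("P(z) =", Pz)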

  26. Existing Products to Existing Customers
  • Compute the probability that x is interested in y: P(y|x) = Σz P(y|z) P(z|x), as sketched below.
  • Products can then be sorted according to the values of P(y|x) for recommendation.
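
Continuing the EM sketch above (reusing Pz, Px_z, Py_z), ranking products for an existing customer takes a few lines:

    # P(z|x) ∝ P(z) P(x|z) by Bayes' rule; then P(y|x) = Σz P(y|z) P(z|x).
    Pz_x = Pz * Px_z[0]          # customer 0 as the target x
    Pz_x /= Pz_x.sum()
    Py_x = Py_z @ Pz_x           # P(y|x) for all 8 products
    print(np.argsort(-Py_x))     # products sorted for recommendation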

  27. Extension 1: Existing Products to New Customers
  • xn is not in the training set, so we do not have P(z|xn).
  • Estimate it from the inner product of the pdf of pattern z and the ratings of xn (a sketch follows).
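
A hedged sketch of Extension 1, again reusing the arrays above: P(z|xn) is approximated from the inner product the slide describes; the paper's exact normalization is not preserved in the transcript.

    r_xn = np.array([5, 0, 0, 0, 0, 0, 0, 0], dtype=float)  # xn's ratings, 0 = unrated
    Pz_xn = Py_z.T @ r_xn        # inner product of P(y|z) with xn's ratings
    Pz_xn /= Pz_xn.sum()         # normalize into a distribution over z
    Py_xn = Py_z @ Pz_xn         # rank existing products for the new customer
    print(np.argsort(-Py_xn))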

  28. Extension 2: New Products to Existing Customers
  • yn is not in the training set, so we do not have P(yn|z).
  • Estimate it from the distance between yn and z in the feature space (a sketch follows).
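
A hedged sketch of Extension 2, reusing the arrays above: the product features, the pattern centroids, and the Gaussian decay are all assumptions, since the paper's exact estimator is not preserved in the transcript.

    F = rng.random((8, 4))            # hypothetical content features of known products
    f_yn = rng.random(4)              # features of the new product yn
    centroids = (F.T @ Py_z).T        # one feature-space centroid per pattern z
    d = np.linalg.norm(centroids - f_yn, axis=1)
    Pyn_z = np.exp(-d ** 2)           # nearer to pattern z => larger P(yn|z)
    print(Pyn_z)                      # unnormalized scores, one per pattern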

  29. Performance Measures
  • Accuracy: the percentage of correct recommendations.
  • Recall: the percentage of interesting products that can be located in the output list.
  • Precision: the percentage of products in the output list which are really interesting to the customer.
  • Break-even point: the point where recall = precision.
  • Expected utility: high if the products rated high appear early in the output list (one common form is given below).
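
The slide's expected-utility formula is not preserved in the transcript; a widely used definition with exactly this behaviour is the half-life utility of Breese et al. (1998):

    % R_a: expected utility of customer a's ranked list. v_{a,j} is the
    % rating of the product at rank j, d a neutral "default" rating, and
    % alpha the rank with a 50% chance of being viewed. Highly rated
    % products early in the list contribute the most.
    R_a = \sum_{j} \frac{\max(v_{a,j} - d,\ 0)}{2^{(j-1)/(\alpha-1)}}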

  30. Experiment One: Setup (content-based by SVM)
  • Product ratings data set: EachMovie (from DEC)
  • Product description data set: Internet Movie Database (http://www.imdb.com)
  • Size of feature set = 6620, including release date, runtime, language, director, producer, original music, writing credit, ...
  • No. of products = 1628
  • 5-fold cross-validation: ~1200 for training and the remainder for testing
  • No. of customers = 100

  31. Experiment One: Results (content-based by SVM)

  32. Experiment Two: Setup (collaborative by ELCM)
  • Ratings data set: EachMovie (from DEC)
  • Training: no. of products = 500; no. of customers = 90
  • Testing: no. of customers = 10; no. of products = 250
  • Size of the product set whose ratings are considered for matching: L = {10, 63, 83, 125, 250}

  33. Experiment Two: Results (collaborative by ELCM)

  34. Conclusion and Future Work
  • SVM and ELCM are empirically shown to be promising for content-based recommendation and collaborative recommendation, respectively.
  • Future work
    • ELCM
      • model enhancement - BiELCM, hierarchical, ...
      • scalability of the EM algorithm for ELCM
      • modelling dynamic preference patterns
      • applications to cross-selling?
    • Integration of SVM and ELCM for improvement
