Using Hierarchical Clustering for Learning the Ontologies used in Recommendation Systems

Vincent Schickel-Zuber, Boi Faltings [SIGKDD’07] Reporter: Che-Wei, Liang Date: 2008/04/10

Outline • Introduction • Background • Collaborative Filtering • Ontology Filtering • Learning the Ontologies • Clustering Algorithms • Learning Hierarchical Ontologies • Experiments • Conclusion

Introduction • Recommender system • Helps people find the most relevant items based on their own preferences and those of others. • Item-based collaborative filtering (CF) • Recommends items based on the experience of the user as well as other similar users.

CF constructs the item-item similarity matrix S →

Ontology • What is an ontology? • A multi-inheritance graph structure • Edges represent features • An item is an instance of at least one concept

Ontology Filtering • Infer preference ratings of items based on the ratings of known items and the relative position in an ontology.

Outline • Introduction • Background • Collaborative Filtering • Ontology Filtering • Learning the ontologies • Clustering Algorithms • Learning Hierarchical ontologies • Experiments • Conclusion

Background • Users U = {u1,…,um} • Items I = {i1,…,in} • Ru,i = the rating assigned to item i by user u

Collaborative Filtering (1/4) • Collaborative Filtering • Find similar items • Combine similar items into a recommendation list • Assumption: similar users like similar items

Collaborative Filtering (2/4) • Top-N recommendation strategy 1. Compute pair-wise similarities in matrix R 2. Predict rating of an item i by using the k most similar items to i (i’s neighborhood) 3. Select best N items
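The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the slide does not name a similarity measure, so cosine similarity over co-rating users is an assumption, and the neighborhood is restricted to items the user has already rated (a common variant). The names `cosine_similarity`, `predict`, and `top_n` are hypothetical.

```python
from math import sqrt


def cosine_similarity(ratings, item_a, item_b):
    """Cosine similarity of two items over the users who rated both (assumed measure)."""
    common = [u for u in ratings if item_a in ratings[u] and item_b in ratings[u]]
    if not common:
        return 0.0
    dot = sum(ratings[u][item_a] * ratings[u][item_b] for u in common)
    norm_a = sqrt(sum(ratings[u][item_a] ** 2 for u in common))
    norm_b = sqrt(sum(ratings[u][item_b] ** 2 for u in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(ratings, user, item, k):
    """Similarity-weighted average of the user's ratings on the k items most similar to `item`."""
    rated = ratings[user]
    neighbours = sorted(
        ((cosine_similarity(ratings, item, j), j) for j in rated if j != item),
        reverse=True,
    )[:k]
    neighbours = [(s, j) for s, j in neighbours if s > 0]
    total = sum(s for s, _ in neighbours)
    if total == 0:
        return None  # no usable neighbour, so no prediction
    return sum(s * rated[j] for s, j in neighbours) / total


def top_n(ratings, user, items, k, n):
    """Predict a rating for every unrated item and keep the best n."""
    scored = []
    for item in items:
        if item in ratings[user]:
            continue
        score = predict(ratings, user, item, k)
        if score is not None:
            scored.append((score, item))
    scored.sort(reverse=True)
    return [item for _, item in scored[:n]]


# Toy data: ratings[user][item] plays the role of R_{u,i}.
ratings = {
    "u1": {"a": 5, "b": 4, "c": 1},
    "u2": {"a": 4, "b": 5, "c": 2},
    "u3": {"a": 5, "c": 1},
}
print(top_n(ratings, "u3", ["a", "b", "c"], k=2, n=1))
```

Note that step 1 is recomputed on demand here; a real system would precompute and cache the item-item similarity matrix.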

Collaborative Filtering (3/4) →

Collaborative Filtering (4/4) • Reduce the search space! • But • Search space remains huge and unconstrained • Requires users to rate many items to find highly correlated neighbors. • Greatly influenced by the size of the item’s neighborhood.

Ontology Filtering (1/3) • Two inputs: • Users’ historical data R • An ontology modeling the domain • How the ontology is defined is usually not made explicit • wine by color => white and red; by taste?

Ontology Filtering (2/3) 1. Compute the a-priori score APS(c), where nc is the number of descendants of concept c 2. Infer the rating by α(y,lca)β(x,lca) • OSS: find the closest concept x to any given y
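Two quantities on this slide, the number of descendants nc and the lowest common ancestor (lca) of two concepts, are easy to illustrate on a tree-shaped ontology. The sketch below assumes a single-parent tree stored as a child-to-parent dictionary; the slide does not show the APS formula or the α and β coefficients, so they are left out. The wine concepts below the color level and all function names are hypothetical.

```python
def ancestors(parent, concept):
    """Path from `concept` up to the root, starting with the concept itself."""
    path = [concept]
    while concept in parent:
        concept = parent[concept]
        path.append(concept)
    return path


def lowest_common_ancestor(parent, x, y):
    """Deepest concept that lies on both root paths (None if the trees are disjoint)."""
    on_x_path = set(ancestors(parent, x))
    for concept in ancestors(parent, y):
        if concept in on_x_path:
            return concept
    return None


def count_descendants(parent, concept):
    """n_c: number of concepts that have `concept` as a proper ancestor."""
    return sum(1 for c in parent if concept in ancestors(parent, c)[1:])


# Hypothetical toy ontology: wine split by color, then by (made-up) variety.
parent = {
    "white": "wine",
    "red": "wine",
    "merlot": "red",
    "syrah": "red",
    "riesling": "white",
}
print(lowest_common_ancestor(parent, "merlot", "syrah"))
print(count_descendants(parent, "wine"))
```

In a multi-inheritance graph a concept can have several parents, so the root path is no longer unique and this simple walk would need to be generalized.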

Ontology Filtering (3/3)

CF vs. OF

Outline • Introduction • Background • Collaborative Filtering • Ontology Filtering • Learning the ontologies • Clustering Algorithms • Learning Hierarchical ontologies • Experiments • Conclusion

Clustering algorithm • Clustering algorithm • Fuzzy clustering, nearest-neighbor clustering, hierarchical clustering, artificial neural networks for clustering, statistical clustering. • Hierarchical algorithm • Distance-based clustering • Concept-based clustering

Hierarchical algorithm dendrogram

Distance-based Clustering • Distance-based clustering • Agglomerative clustering • bottom-up • Compute all pair-wise similarities, O(n^2) • Partitional clustering • top-down • Low complexity

Concept-Based clustering • Concept-Based clustering • Items need to be represented by a set of attribute-value pairs. • Ex: mammal (body cover, heart chamber, body temperature) = (hair, four, regulated) • COBWEB • Classification tree is not height-balanced • Overall complexity is exponential in #attributes.

Learning Hierarchical Ontologies (1/5) • Users can be categorized in different communities. • One ontology for all users is not appropriate • Select a better ontology to use based on the user’s preferences.

Learning Hierarchical Ontologies (2/5) • Generate a whole set of ontologies Λ

Learning Hierarchical Ontologies (3/5)

Learning Hierarchical Ontologies (4/5) • Find-concept problem • Ins(y|x): what if the concepts representing the liked items are too distant from the disliked ones? • Algorithm 2 1. Select a subset of ontologies that perform best 2. Among the selected ontologies, select the one that minimizes the distance between liked and disliked concepts.

Learning Hierarchical Ontologies (5/5)

Learning Multi-Hierarchical Ontologies • Some problems • Implicit features • Limit concept representation • Limit OF’s inference process • Ignore other possible suboptimal candidates • Improve: slightly increase the search space

Classical agglomerative clustering with complete-link criterion function
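For reference, agglomerative clustering with the complete-link criterion can be sketched as follows. This is a naive illustration rather than the authors' code: the `distance` function is a placeholder (in the talk's setting it would be derived from user ratings), the function names are hypothetical, and the resulting dendrogram is returned as nested tuples.

```python
from itertools import combinations


def complete_link(cluster_a, cluster_b, distance):
    """Complete-link criterion: distance between the two farthest members."""
    return max(distance(a, b) for a in cluster_a for b in cluster_b)


def agglomerative(items, distance):
    """Bottom-up clustering; returns the dendrogram as nested 2-tuples."""
    # Each cluster is (flat tuple of members, subtree built so far).
    clusters = [((item,), item) for item in items]
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest complete-link distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: complete_link(clusters[p[0]][0], clusters[p[1]][0], distance),
        )
        members_i, tree_i = clusters[i]
        members_j, tree_j = clusters[j]
        merged = (members_i + members_j, (tree_i, tree_j))
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return clusters[0][1]


# Toy example: points on a line, absolute difference as the distance.
tree = agglomerative([1, 2, 9, 10, 25], lambda a, b: abs(a - b))
print(tree)
```

Each internal node of the nested tuple corresponds to one merge, which is how a dendrogram can be read as a candidate concept hierarchy. The pairwise search is recomputed at every step, so this sketch is far slower than optimized library implementations.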

Experiments • Two data sets: • MovieLens • Ratings from 943 real users, each on at least 20 movies. • Total 1682 movies, 19 themes. • Jester • Ratings of jokes collected over a period of 4 years. • Contains 24,983 users, 100 jokes.

Evaluating Recommendation Algorithm • RS: recommendation set • Nok: #(relevant items) • Nr: #(relevant items in the database N) • Use F1 metric
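The F1 metric can be computed from these counts as below. The slide's formulas did not survive extraction, so this sketch assumes the standard definitions: precision = Nok / |RS|, recall = Nok / Nr, and F1 as their harmonic mean, with Nok taken as the number of relevant items inside RS. The function name `f1_score` is hypothetical.

```python
def f1_score(recommended, relevant):
    """F1 of a recommendation set RS against the set of relevant items."""
    recommended, relevant = set(recommended), set(relevant)
    n_ok = len(recommended & relevant)  # Nok: relevant items that were recommended
    if n_ok == 0:
        return 0.0
    precision = n_ok / len(recommended)  # Nok / |RS|
    recall = n_ok / len(relevant)        # Nok / Nr
    return 2 * precision * recall / (precision + recall)


# 2 of the 4 recommended items are relevant, out of 3 relevant items overall.
print(f1_score(["a", "b", "c", "d"], ["a", "b", "e"]))
```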

Hierarchical Clustering Analysis

Hierarchical Clustering Analysis • Execution time in seconds required for the clustering algorithm to generate the ontology.

Hierarchical Clustering Analysis

Multi-Hierarchical Clustering Analysis • Tradeoff between prediction accuracy and ontology quality.

Multi-Hierarchical Clustering Analysis

Recommendation Accuracy

Conclusions • Introduced three algorithms • One learns a set of ontologies from historical data. • One selects which ontology to use based on the user’s preferences. • One builds a multi-hierarchical ontology based on a predefined window size. • Experimental results on two well-known data sets showed that the algorithms can produce good ontologies and increase prediction accuracy. • The learnt ontologies can even outperform traditional item-based collaborative filtering.
