Tag-based Contextual Collaborative Filtering

Reyn Nakamoto, Shinsuke Nakajima, Jun Miyazaki, Shunsuke Uemura
Nara Institute of Science and Technology
IAENG International Journal of Computer Science, 2008
Presented by Jae-won Lee

Introduction
• Traditionally, Collaborative Filtering (CF) based systems are widely used
• However, CF systems have a weakness
  • They do not take into consideration the context in which a resource was liked
  • For example, one user may like a resource because it is funny and interesting, while another user may like it because it is informative
• In this paper, social tags are assumed to represent the context of the resources to which they are attached
• Tag-based contextual CF combines traditional CF with tagging systems
Center for E-Business Technology

Related work
• Collaborative Filtering Systems
  • If users prefer the same items, then their preferences will be similar for other items
  • Usenet, movie, and product recommendation systems
• Social Tagging Systems
  • In previous studies, tags are generally used for tag searching, user profile matching, and subsequent recommendation
  • Tags provide clues as to why a user liked something

Tag-based Contextual Filtering
• Tag-based Contextual Collaborative Filtering (TCCF)
  • The combination of traditional CF and a social tagging system
  • Takes into consideration the context of the preference
  • TCCF uses tags as the indicator that a user likes something, while traditional CF uses numeric ratings
• Process
  • A user visits a resource such as a website
  • If the user likes the resource, he bookmarks it with tags
  • These tags explain what the resource meant to him
  • The system calculates user similarity to others based on their common bookmarks and tags (section Contextual CF User Similarity Model)
  • The system calculates the predicted scores for as yet unrated resources (section Contextual CF Score Prediction Model)
  • The system recommends new resources to the user (section Contextual CF recommendation)
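The first steps of the process above (bookmarking with tags, then finding common bookmarks) can be sketched with a small in-memory store. This is a minimal illustration only, not the authors' implementation: the `Bookmarks` layout, the `bookmark` and `common_resources` helpers, and the sample users, resources, and tags are all hypothetical.

```python
# Hypothetical layout (not from the paper): user -> resource -> set of tags.
Bookmarks = dict[str, dict[str, set[str]]]


def bookmark(store: Bookmarks, user: str, resource: str, tags: list[str]) -> None:
    """Record that `user` bookmarked `resource` and described it with `tags`."""
    store.setdefault(user, {}).setdefault(resource, set()).update(tags)


def common_resources(store: Bookmarks, a: str, b: str) -> set[str]:
    """Resources that both users have bookmarked (and therefore tagged)."""
    return set(store.get(a, {})) & set(store.get(b, {}))


# Made-up sample data for illustration.
store: Bookmarks = {}
bookmark(store, "B", "resource3", ["funny", "video"])
bookmark(store, "B", "resource2", ["funny", "clip"])
bookmark(store, "C", "resource3", ["funny"])

shared = common_resources(store, "B", "C")
```

Here `shared` holds the resources on which the two users' tag vectors can be compared; resources bookmarked by only one of them do not contribute to user similarity.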

Contextual CF User Similarity Model
• Traditional CF User Similarity
  • For simplicity, we use binary ratings, 1 and “-”
  • Traditional CF does not tell us why the user likes something
  • However, tags provide more insight into why the user may have liked it
  • One user tags a resource with “informative”, while another tags it with “funny”
  • Users may like the same thing, but for different reasons

Contextual CF User Similarity Model
• TCCF
  • User similarity between a user A and a user B is calculated as follows
  • Where n is the number of commonly tagged resources between users
  • TAk is the tag vector that user A used for commonly tagged resource k
  • TBk is the tag vector that user B used for commonly tagged resource k

Contextual CF User Similarity Model
• Example: compute the user similarity between User B and User C
  • User B and user C have only one commonly tagged link (resource 3), thus n = 1
  • The cosine similarity between the two tag vectors is 0.5
  • Therefore, simccf(B, C) = 0.91
• By using tags as a clue to context, B and C are the most similar users
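The similarity computation described above can be sketched as follows. The formula itself does not survive in this text, so the aggregation used here is an assumption: the cosine similarity of the two users' tag vectors on each commonly tagged resource is rescaled as (cosine + 1) / 2 and averaged over the n common resources. This form was chosen because the same rescaling reproduces the score prediction figures quoted in the later example. Under it, a cosine of 0.5 with n = 1 would give 0.75 rather than 0.91, so the original formula or the example's intermediate figure may differ from what is assumed here. The function names and tags are hypothetical.

```python
import math
from collections import Counter


def cosine(tags_a: list[str], tags_b: list[str]) -> float:
    """Cosine similarity between two tag lists treated as term-count vectors."""
    va, vb = Counter(tags_a), Counter(tags_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sim_ccf(tags_by_a: dict[str, list[str]], tags_by_b: dict[str, list[str]]) -> float:
    """ASSUMED form: mean of (cosine + 1) / 2 over commonly tagged resources."""
    common = tags_by_a.keys() & tags_by_b.keys()
    if not common:
        return 0.0  # no commonly tagged resource, no evidence of similarity
    total = sum((cosine(tags_by_a[k], tags_by_b[k]) + 1) / 2 for k in common)
    return total / len(common)


# Made-up tags: the users share one resource and one of two tags on it.
user_b = {"resource3": ["funny", "video"], "resource2": ["funny", "clip"]}
user_c = {"resource3": ["funny"]}
similarity = sim_ccf(user_b, user_c)
```

With this assumed form, two users who bookmarked the same resource with entirely different tags still receive a similarity of 0.5 for that resource, and identical tags give 1.0.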

Contextual CF User Similarity Model
• TCCF
  • Users that have bookmarked the same resource are considered similar
  • In particular, the similarity is higher if the tags used to describe the resource are similar
• Weaknesses of this model
  • The model assumes that users bookmark a resource if and only if they like it
  • However, the lack of a bookmark does not necessarily mean dislike
  • Tagging sites suffer from natural language issues
  • Issues like synonymy and polysemy may occur
  • Would different users use tags for the same purpose?

Contextual CF Score Prediction Model
• Traditional CF
  • The predicted score for some user A for an unevaluated resource x is calculated by the following
  • Sk is a user in the set of all users with a similarity score with user A
  • Score Prediction for the previous example
  • Traditional CF works well when recommending resources in the same domain
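The traditional prediction formula is not reproduced in this text. As a rough illustration, the sketch below uses a common similarity-weighted average of the neighbours' scores; this specific form is an assumption and may differ from the one on the original slide. The function name and the sample values are hypothetical.

```python
def predict_score(neighbours: list[tuple[float, float]]) -> float:
    """Similarity-weighted average of the neighbours' scores for one resource.

    Each neighbour is (similarity to the target user, neighbour's score for
    the resource). ASSUMED form, not taken from the slides.
    """
    total_sim = sum(sim for sim, _ in neighbours)
    if total_sim == 0:
        return 0.0  # no similar user has an opinion on this resource
    return sum(sim * score for sim, score in neighbours) / total_sim


# Made-up example: two equally similar neighbours, only one of whom liked x.
predicted = predict_score([(0.5, 1.0), (0.5, 0.0)])
```

Note that nothing in this calculation looks at why a neighbour liked the resource, which is the gap the contextual model addresses.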

Contextual CF Score Prediction Model
• While users may share a preference in one domain (context), that preference may not carry over to another
  • e.g. Although two users may both like reading Harry Potter books, this does not necessarily mean that if one likes soccer, the other will too
• Considering context when predicting a score for a user
  • In the prediction formula, m is the number of resources that user Sk has commonly tagged with user A
  • The formula selects the tag vector of the commonly tagged resource that has the highest similarity to the tag vector of the target resource, TSkx

Contextual CF Score Prediction Model
• Example
  • We want to predict user C’s score for resource 2, a resource he has not yet tagged
  • In the previous example, using the contextual CF user similarity scores, user B has a high similarity to user C
  • For user B, only resource 3 is commonly tagged with user C
  • Thus, the tag vectors that B attached to resources 2 and 3 are as follows

Contextual CF Score Prediction Model
• Example (cont’d)
  • Therefore, sim(TB3, TB2) = 0.41
  • If user B had more commonly tagged resources with user C, sim(TBk, TB2) would be calculated for each such tag vector TBk and the highest similarity used in the score calculation
  • Given that the user similarity of B to C is 0.91, the final predicted score is Score(C, 2) = 0.71
  • Similarly, we can compute user C’s score for resource 1, Score(C, 1) = 0.5
  • Therefore, resource 2 would be recommended over resource 1
  • Previously, resources 1 and 2 had the same score
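The arithmetic of this example can be checked with a short sketch. The prediction formula is not reproduced in this text, so the form below is an assumption: each similar user contributes (highest tag-vector similarity + 1) / 2, weighted by that user's contextual similarity to the target user. It was chosen because it reproduces the quoted figures: (0.41 + 1) / 2 = 0.705, which rounds to 0.71, and a highest tag similarity of 0 gives 0.5. The tag similarity of 0 for resource 1 is itself an assumption, since the slides do not state it. The function name is hypothetical.

```python
def score_ccf(neighbours: list[tuple[float, list[float]]]) -> float:
    """ASSUMED contextual score prediction for one target resource.

    Each neighbour is (contextual user similarity to the target user,
    tag-vector similarities between each commonly tagged resource and the
    target resource, as tagged by that neighbour).
    """
    total_sim = sum(sim for sim, _ in neighbours)
    if total_sim == 0:
        return 0.0
    weighted = sum(
        sim * (max(tag_sims, default=0.0) + 1)  # keep only the best-matching context
        for sim, tag_sims in neighbours
    )
    return weighted / (2 * total_sim)


# Figures from the example: simccf(B, C) = 0.91 and sim(TB3, TB2) = 0.41.
score_c_2 = score_ccf([(0.91, [0.41])])
# Assumed: no tag overlap between the common resource and resource 1.
score_c_1 = score_ccf([(0.91, [0.0])])
```

Because only the highest tag-vector similarity is kept per neighbour, a neighbour influences the prediction in proportion to how closely the target resource matches a context the two users are known to share.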

Conclusion
• This paper describes a new contextual collaborative filtering model based on tagging information
• Two areas of traditional CF have been changed
  • User similarity calculation
  • Score prediction calculation
• Future work: implement this model and research tag expansion through natural language processing methods

Opinion
• Pros.
  • The examples make the model easy to understand
  • Interesting paper because the authors argue that tags reflect a user’s context
• Cons.
  • No experiments
  • Is it reasonable to assume that tags can be regarded as contextual information?
  • No solutions are offered for the weaknesses of the proposed model
  • Ref. slide page 8
