Speaker: Nonhlanhla Shongwe 18 January 2009 User-induced Links in Collaborative Tagging Systems Ching-man Au Yeung, Nicholas Gibbins, Nigel Shadbolt CIKM’09
2 Preview • Introduction • Collaborative tagging • User-Induced hyperlinks • Similarity of Assigned Tags • Association Rule Mining • Analysis of User-induced links • Tag Prediction • Discussion • Conclusion
3 Introduction • Hyperlinks • Make navigation through the web possible • The author decides which documents to link to • The limited set of links that authors provide has led to user-contributed content on the web • In social bookmarking sites, e.g. Delicious • Users can maintain a collection of documents • URLs are identified by the tags that users choose
4 Collaborative tagging (1/2) • Popular tagging systems, e.g. Delicious and LibraryThing • Allow users to describe their favorite online resources using their own words • E.g. http://www.cnn.com with tags news, tv, sports, weather, travel • Advantages over traditional methods • Flexibility and freedom offered by these systems • Systems are quick to adapt to changes in the vocabulary among the users
5 Collaborative tagging (2/2) • The collaborative tagging activities of participating users result in a scheme called a folksonomy • A folksonomy consists of three types of elements • Users • Assign tags to Web documents • Tags • Keywords chosen by users to describe and categorize a Web document • Documents • Objects tagged by the users
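The three element types above can be pictured as a set of (user, tag, document) triples. The following is only a minimal sketch: the user names, the example.org URL, and the helper names `tags_by_document` and `documents_by_user` are illustrative assumptions, not part of the presentation.

```python
from collections import defaultdict

# Hypothetical tag assignments, one (user, tag, document) triple each.
assignments = [
    ("alice", "news", "http://www.cnn.com"),
    ("alice", "tv", "http://www.cnn.com"),
    ("bob", "news", "http://www.cnn.com"),
    ("bob", "books", "http://example.org/reading-list"),
]


def tags_by_document(triples):
    """Count how many times each tag was assigned to each document."""
    counts = defaultdict(lambda: defaultdict(int))
    for _user, tag, doc in triples:
        counts[doc][tag] += 1
    return {doc: dict(tags) for doc, tags in counts.items()}


def documents_by_user(triples):
    """Collect the set of documents that each user has tagged."""
    collections = defaultdict(set)
    for user, _tag, doc in triples:
        collections[user].add(doc)
    return dict(collections)
```

The per-document tag counts feed the tag-similarity approach, while the per-user document sets are the natural input for an approach based on collective user behavior.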
6 User-Induced hyperlinks • Two types of hyperlinks • For navigation • For recommendation • Direct users to other documents that contain related information • Two different approaches to discover implicit relations in a folksonomy • Calculating the similarity between the sets of tags assigned to the documents • Analyzing the collective behavior of the users who have tagged the documents • User-induced links are implicit links in a folksonomy that result from the collaborative tagging activities of users
7 Similarity of Assigned Tags (1/4) • First approach of discovering user-induced links • Calculate the pair-wise similarity between documents based on their tags • Jaccard Coefficient • In IR, Cosine Similarity
8 Similarity of Assigned Tags (2/4) • Cosine Similarity
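The formulas on these slides did not survive in the transcript. As a reminder of the standard definitions, the Jaccard coefficient of two tag sets is the size of their intersection divided by the size of their union, and the cosine similarity of two tag-frequency vectors is their dot product divided by the product of their lengths. The sketch below assumes tags are given as plain sets or as tag-to-count dictionaries; the function names are illustrative.

```python
import math


def jaccard(tags_a, tags_b):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B| over two tag sets."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty tag sets
    return len(a & b) / len(a | b)


def cosine(freq_a, freq_b):
    """Cosine similarity between two tag -> frequency dictionaries."""
    dot = sum(w * freq_b.get(tag, 0) for tag, w in freq_a.items())
    norm_a = math.sqrt(sum(w * w for w in freq_a.values()))
    norm_b = math.sqrt(sum(w * w for w in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Jaccard ignores how often a tag was used, whereas cosine similarity takes tag frequencies into account.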
9 Similarity of Assigned Tags (3/4) • Second similarity function • The normalized discounted cumulative gain (NDCG) • Normally used to evaluate a ranking of documents according to their relevance scores • Firstly, list the tags of the two documents • Secondly, calculate the DCG at position p
10 Similarity of Assigned Tags (4/4) • Thirdly, calculate the ideal DCG (iDCG) • Finally, calculate the NDCG • Use a function
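The steps (list the tags, compute DCG at position p, compute iDCG, then NDCG) can be sketched as follows. The exact formulas are not in the transcript, so this uses one common DCG variant, the sum of rel_i / log2(i + 1), and assumes that the relevance of a tag is its frequency in the other document. Both choices are assumptions, as are the function names.

```python
import math


def dcg(relevances, p):
    """Discounted cumulative gain over the first p relevance scores."""
    return sum(
        rel / math.log2(i + 1)
        for i, rel in enumerate(relevances[:p], start=1)
    )


def ndcg(ranked_tags, relevance, p):
    """NDCG of one document's ranked tags, scored by another's tag weights.

    ranked_tags: tags of document A, most frequent first.
    relevance:   tag -> weight for document B (assumed relevance scores).
    """
    gains = [relevance.get(tag, 0) for tag in ranked_tags]
    ideal = sorted(relevance.values(), reverse=True)  # best possible order
    ideal_dcg = dcg(ideal, p)
    if ideal_dcg == 0:
        return 0.0
    return dcg(gains, p) / ideal_dcg
```

With this reading, two documents whose tags appear in the same order of importance score 1, and the score drops as the rankings diverge.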
11 Association Rule Mining • Second approach of discovering user-induced links • Find pairs of Web documents that have both been tagged by the same group of users • Aims at identifying implicit patterns within a large database of transactions • Two major concepts • Support • Confidence
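To make support and confidence concrete, the sketch below treats each user's collection of documents as one transaction. Support is taken as the number of users who tagged both documents, and the confidence of a rule A → B as support(A, B) / support(A). The brute-force pair enumeration and all names are assumptions for illustration; a real miner would use an algorithm such as Apriori.

```python
from itertools import permutations


def support(user_collections, docs):
    """Number of users whose collection contains every document in docs."""
    docs = set(docs)
    return sum(1 for coll in user_collections if docs <= coll)


def confidence(user_collections, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support(user_collections, [antecedent])
    if base == 0:
        return 0.0
    return support(user_collections, [antecedent, consequent]) / base


def induced_links(user_collections, min_support, min_confidence):
    """Directed document pairs that meet both thresholds (brute force)."""
    docs = sorted(set().union(*user_collections))
    return [
        (a, b)
        for a, b in permutations(docs, 2)
        if support(user_collections, [a, b]) >= min_support
        and confidence(user_collections, a, b) >= min_confidence
    ]
```

For example, with the collections `[{"d1", "d2"}, {"d1", "d2"}, {"d1"}, {"d2", "d3"}]`, the pair (d1, d2) has support 2 and the rule d1 → d2 has confidence 2/3.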
12 Analysis of User-Induced Links (1/3) • Two methods described • Identify user-induced links in data collected from Delicious • Compare them with existing hyperlinks in terms of several different aspects • Several aspects to compare • Do they connect two documents from the same domain/website? • Similarity between the documents on the two ends of a link • Whether users are equally interested in the linked documents
13 Analysis of User-Induced Links (2/3) • Data collection • Data collected from Delicious • Documents cover a wide range of topics • Documents collected on a per-tag basis • First, 130 popular tags were collected at random • For each tag, crawl Delicious to obtain a set of documents and the users that have tagged them
14 Analysis of User-Induced Links (3/3) • Results • Identify user-induced links between the documents using the two methods • For tag similarity, vary the similarity threshold to 0.5 • For association rule mining, set the minimum support to 100 and vary the minimum confidence level • Findings • Very few user-induced links had a confidence of 0.5 or above
15 Results (1/8)
16 Results (2/8) on Same Domain • One important function of hyperlinks • Allow users to navigate from one hypertext document to another • More beneficial if the links point to documents external to the current website • Check whether the documents at the two ends of a link are from the same domain
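The same-domain check can be sketched with Python's standard `urllib.parse`. Comparing full hostnames is an assumption made here: the transcript does not say whether subdomains such as `www.cnn.com` and `edition.cnn.com` were counted as the same domain.

```python
from urllib.parse import urlparse


def same_domain(url_a, url_b):
    """True if both URLs share the same hostname (case-insensitive)."""
    host_a = urlparse(url_a).hostname  # hostname is returned lowercased
    host_b = urlparse(url_b).hostname
    return host_a is not None and host_a == host_b
```

A link whose two ends fail this check points outside the current website, which is the case the slide describes as more beneficial.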
17 Results (3/8) on Same Domain
18 Results (4/8) on Coincidence between existing hyperlinks and user-induced links • See whether such links already exist between the documents • If user-induced links coincide with existing hyperlinks • means that users are satisfied with the existing hyperlinks • If user-induced links are mostly new • means that there are user interests and perspectives that existing hyperlinks have not captured
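One simple way to quantify this coincidence is the fraction of user-induced links that already exist as hyperlinks. The sketch below assumes links are represented as directed (source, target) pairs; the function name and that representation are illustrative assumptions.

```python
def coincidence_ratio(induced_links, existing_links):
    """Fraction of user-induced links that already exist as hyperlinks."""
    induced = set(induced_links)
    if not induced:
        return 0.0
    return len(induced & set(existing_links)) / len(induced)
```

A ratio near 1 would suggest that existing hyperlinks already reflect user interests, while a ratio near 0 would suggest that most user-induced links are new.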
19 Results (5/8) on Coincidence between existing hyperlinks and user-induced links
20 Results (6/8) on similarity and user preferences • Look at documents that are connected by user-induced links • Between blog posts on highly related topics • News articles on the same topics • Websites offering applications with similar functionality • Q&A pages of some portal site • Two different approaches for generating user-induced links • Association rule mining: a link is generated if enough users are interested in two documents, regardless of the similarity between them • Similarity based: links are generated based on the tags assigned, regardless of whether many users are interested in the documents
21 Results (7/8) on similarity and user preferences
22 Results (8/8) on similarity and user preferences
23 Tag Prediction (1/3) • The analysis of user-induced links shows that links generated by association rule mining of user collections usually connect documents that are highly related to each other, as judged by the similarity between their tags • To predict the tags of a document • Identify the other documents that have a link to this document • The set of documents that have a link (dx)
24 Tag Prediction (2/3) • Firstly, consider a simple averaging method
25 Tag Prediction (3/3) • Secondly, consider an aggregation method
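The formulas for the two prediction methods are missing from the transcript, so the following is only one plausible reading, not the authors' exact method. It assumes each linked document contributes its normalized tag frequencies, that the simple method averages them equally, and that the aggregation method weights each linked document by a link strength such as rule confidence. All function and parameter names are hypothetical.

```python
from collections import defaultdict


def normalise(freq):
    """Scale a tag -> count dictionary so that its weights sum to 1."""
    total = sum(freq.values())
    return {tag: count / total for tag, count in freq.items()} if total else {}


def rank(scores):
    """Tags ordered by descending score, ties broken alphabetically."""
    return sorted(scores, key=lambda tag: (-scores[tag], tag))


def predict_average(linked_docs):
    """Equal-weight average of the linked documents' tag distributions."""
    scores = defaultdict(float)
    for freq in linked_docs:
        for tag, weight in normalise(freq).items():
            scores[tag] += weight / len(linked_docs)
    return rank(scores)


def predict_weighted(linked_docs, link_weights):
    """Average weighted by an assumed per-link strength (e.g. confidence)."""
    total = sum(link_weights)
    if total == 0:
        return []
    scores = defaultdict(float)
    for freq, link_weight in zip(linked_docs, link_weights):
        for tag, weight in normalise(freq).items():
            scores[tag] += link_weight * weight / total
    return rank(scores)
```

Both functions return a ranked tag list, so their output can be compared directly with the tags users actually assigned to the document.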
26 Experiments (1/2) • Measure the performance of the predictions using • NDCG • Precision at the nth term • NDCG was used • To investigate whether the predictions are accurate in terms of the ordering of the tags
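Precision at the nth term can be sketched as the fraction of the top n predicted tags that also appear among the tags actually assigned to the document. This is the usual definition of precision at n; whether the presentation used exactly this form is an assumption, as is the function name.

```python
def precision_at_n(predicted_tags, actual_tags, n):
    """Fraction of the top-n predicted tags that are among the actual tags."""
    if n <= 0:
        return 0.0
    actual = set(actual_tags)
    hits = sum(1 for tag in predicted_tags[:n] if tag in actual)
    return hits / n
```

Unlike NDCG, this measure ignores the order of the tags within the top n, which is why the two measures complement each other.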
27 Experiments (2/2)
28 Discussion • Implicit relations between Web documents can be discovered by • examining user preferences and • document similarity embedded in a folksonomy • User-induced links are different from existing hyperlinks • Collaborative tagging environment • shows the differences between the perspectives of Web authors and Web readers • Worthwhile considering an open hypermedia structure backed by a collaborative tagging system
29 Conclusion • User-induced links are a form of implicit relation between documents • We used • Tag similarity • to generate many user-induced links • Association rule mining • to generate user-induced links between highly related documents