
Towards Ontology Learning from Folksonomies

Jie Tang*, Ho-fung Leung#, Qiong Luo+, Dewei Chen*, and Jibin Gong*. *Dept. of Computer Science and Technology, Tsinghua University; #Dept. of Computer Science and Engineering, The Chinese U. of Hong Kong





Presentation Transcript


  1. Towards Ontology Learning from Folksonomies • Jie Tang*, Ho-fung Leung#, Qiong Luo+, Dewei Chen*, and Jibin Gong* • *Dept. of Computer Science and Technology, Tsinghua University • #Dept. of Computer Science and Engineering, The Chinese U. of Hong Kong • +Dept. of Computer Science, Hong Kong U. of Science and Technology • July 14, 2009

  2. Motivation • The Semantic Web aims to provide a Web environment in which each Web document is annotated with machine-readable metadata (e.g., concepts from an ontology). • Manual annotation tools, e.g., Protégé (Noy et al., IS'01) • Automatic annotation methods using ML, e.g., iASA (Tang et al., JoDS'05), TCRF (Tang et al., ISWC'06) • Folksonomy provides a way to annotate the Web, but a really free way. • It also poses a big challenge in reliability and consistency due to the lack of terminological control. • This work aims to learn an ontology from folksonomies.

  3. Motivating Example

  4. Motivating Example • Several key challenges: • How to define this problem in a principled way? • How to model the synonym/hypernym/homonym relations between tags? • How to construct the hierarchical ontology from the modeling results?

  5. Our Solution • Use topics to model tags and documents. • Define four divergence measures to estimate the difference between tags. • Present an algorithm to construct the hierarchical structure from the tags. • [Figure: tags and documents related through topics, with divergence measured between tags]

  6. Outline • Related Work • Our Approach: Modeling Folksonomy; Divergence Estimation; Hierarchical Structure Construction • Experiments • Conclusion & Future Work

  7. Previous Work • Ontology learning from text: WebOntEx (Han and Elmasri, 03); Protégé plug-in (Buitelaar et al., 99); (Maedche and Staab, 2001; Sleeman et al., 03); etc. • Folksonomy integration: learning synonyms/hypernyms between tags (Li et al., 07); clustering tags (Specia and Motta, 2007); learning hierarchical relations between tags (Zhou et al., 07); non-taxonomic relations (Mori et al., 06); etc. • Topic models: PLSI (Hofmann, 1999); LDA (Blei et al., 03); Author-topic model (Steyvers et al., 04); etc.

  8. Outline • Related Work • Our Approach: Modeling Folksonomy; Divergence Estimation; Hierarchical Structure Construction • Experiments • Conclusion & Future Work

  9. How to model tags and documents? • Input: Assume that a tag t_i is used to annotate multiple documents and that a document d contains a vector w_d of N_d words. A set of tags together with their annotated documents is then the input to the model (the formal representation is a formula on the slide that is missing from this transcript). • Modeling: How to represent each document and each tag? How to characterize the relationship between documents and tags? • Approach: Tag-Topic (TT) models, which relate words and tags through topics.

  10. Generative Story of Tagging • Generative process, illustrated on an example document. • Document: "Latent Dirichlet Co-clustering. We present a generative model for clustering documents and terms. Our model is a four hierarchical Bayesian model. We present efficient inference techniques based on Markov Chain Monte Carlo. We report results in document modeling, document and terms clustering …" • Tags: Data mining, clustering, probabilistic model, … • Topic labels shown in the figure: NLP, IR, DM, ML • P(w|z) for one topic: mining 0.23, clustering 0.19, classification 0.17, … • P(w|z) for another topic: model 0.23, learning 0.19, boost 0.17, …

  11. Tag-Topic (TT) Models • Generative process: [Figure: graphical model of the Tag-Topic (TT) models relating tags, topics, and words]
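
The generative process itself is not preserved in this transcript, so the following is only a minimal sketch of one plausible reading, modeled on the author-topic model cited under Previous Work: for each word, pick one of the document's tags, draw a topic from that tag's topic distribution, then draw a word from that topic's word distribution. The function name `sample_document`, the uniform choice of tag, the topic names `topic_a` and `topic_b`, and all values in `theta` are assumptions made for illustration; the word weights reuse the P(w|z) numbers shown on slide 10.

```python
import random


def sample_document(doc_tags, theta, phi, n_words, rng):
    """Generate (tag, topic, word) triples for one tagged document.

    theta[tag][topic] is P(topic | tag); phi[topic][word] is P(word | topic).
    Weights are treated as relative, so they need not sum to 1.
    """
    triples = []
    for _ in range(n_words):
        # Assumption: each word is attributed to one of the document's tags,
        # chosen uniformly (as in the author-topic model).
        tag = rng.choice(doc_tags)
        topics = list(theta[tag])
        topic = rng.choices(topics, weights=[theta[tag][z] for z in topics])[0]
        words = list(phi[topic])
        word = rng.choices(words, weights=[phi[topic][w] for w in words])[0]
        triples.append((tag, topic, word))
    return triples


# Hypothetical tag-topic distributions (invented for this sketch).
theta = {
    "data mining": {"topic_a": 0.9, "topic_b": 0.1},
    "probabilistic model": {"topic_a": 0.2, "topic_b": 0.8},
}
# Truncated word weights taken from the slide-10 example.
phi = {
    "topic_a": {"mining": 0.23, "clustering": 0.19, "classification": 0.17},
    "topic_b": {"model": 0.23, "learning": 0.19, "boost": 0.17},
}

sample = sample_document(
    ["data mining", "probabilistic model"], theta, phi, 8, random.Random(0)
)
for tag, topic, word in sample:
    print(f"{tag:20s} {topic:8s} {word}")
```

Inference runs in the opposite direction: given the observed words and tags, the topic distributions are estimated, and those estimates feed the divergence measures.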

  12. Topic Smoothing • The new objective function combines the log-likelihood of the tag-topic (TT) model with a smoothing term.

  13. Divergence Estimation • Four divergence measures: • Tag divergence • Hypernym-divergence • Merging-divergence • Keep-divergence • The measures are computed from the estimated topic distributions and from posterior probabilities derived from the topic modeling results.
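
The formulas for the four measures are not preserved in this transcript. As a hedged illustration of the general idea only, the sketch below compares two tags by the Kullback-Leibler divergence between their estimated topic distributions. Using KL divergence, the symmetrized variant, and the toy distributions are all assumptions; this is not the slide's definition of any of the four measures.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(
        pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0
    )


def symmetric_divergence(p, q):
    """Average of KL(p || q) and KL(q || p), so the order of tags does not matter."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


# Hypothetical topic distributions P(z | tag) over three topics.
tag_topics = {
    "clustering": [0.70, 0.20, 0.10],
    "data mining": [0.60, 0.30, 0.10],
    "comedy": [0.05, 0.05, 0.90],
}

near = symmetric_divergence(tag_topics["clustering"], tag_topics["data mining"])
far = symmetric_divergence(tag_topics["clustering"], tag_topics["comedy"])
print(f"clustering vs data mining: {near:.3f}")
print(f"clustering vs comedy:      {far:.3f}")
```

Under this sketch, tags with similar topic distributions get a small divergence and unrelated tags get a large one, which is the kind of signal a hierarchy-construction step can use.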

  14. Hierarchical Structure Construction • The objective has two parts: a term that corresponds to a divergence, and a penalty on the complexity of the generated hierarchy. • Step 1. • Step 2.

  15. Outline • Related Work • Our Approach: Modeling Folksonomy; Divergence Estimation; Hierarchical Structure Construction • Experiments • Conclusion & Future Work

  16. Data Sets and Evaluation Measures • Data sets • PAPER: 4,841 papers and their associated tags (8,071 unique tags and a total of 37,010 tags) from CITEULIKE • MOVIE: 4,009 movies and their tags (18,559 unique tags and a total of 142,498 tags) from IMDB • Evaluation Measures • Accuracy (against ODP or human judgement) • Case study • Baseline • Hierarchical clustering
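
The slide does not spell out how accuracy is computed. One simple reading, sketched below under that assumption, is the fraction of learned parent-child relations that also appear in a reference hierarchy (such as ODP) or that a human judge accepts. The function name `hierarchy_accuracy` and the example edges are hypothetical.

```python
def hierarchy_accuracy(learned_edges, reference_edges):
    """Fraction of learned (parent, child) edges found in the reference set."""
    learned = set(learned_edges)
    if not learned:
        return 0.0
    return len(learned & set(reference_edges)) / len(learned)


# Hypothetical learned hierarchy and reference judgements.
learned = [("film", "comedy"), ("film", "drama"), ("comedy", "drama")]
reference = [("film", "comedy"), ("film", "drama"), ("film", "thriller")]

accuracy = hierarchy_accuracy(learned, reference)
print(f"accuracy = {accuracy:.3f}")
```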

  17. Accuracy Performance

  18. Case Study: Movie • [Figure panels: by clustering; by TT; by TT with smoothing]

  19. Case Study: Paper • [Figure panels: by TT; by TT with smoothing]

  20. Outline • Related Work • Our Approach: Modeling Folksonomy; Divergence Estimation; Hierarchical Structure Construction • Experiments • Conclusion & Future Work

  21. Conclusion • Formalize a novel problem of ontology learning from folksonomies. • Exploit a probabilistic topic model to model the tags and their annotated documents and propose four divergence measures. • Present an algorithm to construct the hierarchical structure from tags. • Experimental results on two different types of real-world data sets show that our method can effectively learn the ontological hierarchy from social tags.

  22. Future Work • Discover non-taxonomic relationships between tags • Ontology learning from noisy tags • Incremental ontology learning from the dynamic tagging space • Applications: • Personalized tag recommendation • Social tagging: guiding the tagging process • …

  23. Thanks! Q&A • Homepage: http://keg.cs.tsinghua.edu.cn/persons/tj/ • Open resources will be available soon at: http://arnetminer.org/resources
