
Lecture: Prof. Hahn-Ming Lee Student: Ching-Hao Mao D9415004@mail.ntust.tw

Fuzzy Final Homework: System Implementation. Selected paper: Fuzzy integration of structure adaptive SOMs for web content mining, Fuzzy Sets and Systems 148 (2004) 43–60. Lecture: Prof. Hahn-Ming Lee. Student: Ching-Hao Mao, D9415004@mail.ntust.edu.tw.


Presentation Transcript


  1. Fuzzy Final Homework: System Implementation. Selected paper: Fuzzy integration of structure adaptive SOMs for web content mining, Fuzzy Sets and Systems 148 (2004) 43–60. Lecture: Prof. Hahn-Ming Lee. Student: Ching-Hao Mao, D9415004@mail.ntust.edu.tw

  2. Outline • Introduction • Proposed method in selected paper • Implementation • Conclusion • References

  3. Introduction • In this report, we implement the method from Kim and Cho’s paper, published in Fuzzy Sets and Systems in 2004 [1] • A user profile represents different aspects of a user’s characteristics • The authors propose an ensemble of classifiers that estimates a user’s preference from web content the user has labeled as “like” or “dislike”

  4. Introduction: Previous Studies [2]

  5. Feature Selection Method Properties • Feature selection methods such as Information Gain, TFIDF, and Odds Ratio have different properties • TFIDF does not consider the class labels of documents when calculating the relevance of features, while Information Gain does use them • Odds Ratio also uses the class labels of documents, but it finds features that are useful for classifying only one specific class
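The contrast among the three measures can be illustrated with a short sketch. The slides give no formulas, so the code below assumes common textbook definitions: TFIDF as term frequency times log inverse document frequency, Information Gain as the entropy reduction from a term's presence or absence, and Odds Ratio as a smoothed log odds for the positive class. The class name `FeatureScores`, the method names, and the smoothing constants are hypothetical choices for illustration.

```java
/** Sketch of three feature-scoring measures using textbook definitions (assumed, not taken from the slides). */
class FeatureScores {

    /** TFIDF: term frequency times log inverse document frequency; uses no class labels. */
    static double tfidf(int termFreq, int numDocs, int docsWithTerm) {
        return termFreq * Math.log((double) numDocs / docsWithTerm);
    }

    /** Entropy in bits of a two-class split with pos positive and neg negative documents. */
    static double entropy(double pos, double neg) {
        double total = pos + neg;
        if (total == 0) return 0.0;
        double h = 0.0;
        for (double c : new double[] {pos, neg}) {
            if (c > 0) {
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    /**
     * Information gain of a term's presence/absence with respect to a two-class label.
     * posWith/negWith count documents of each class that contain the term;
     * posWithout/negWithout count documents of each class that lack it.
     */
    static double informationGain(int posWith, int negWith, int posWithout, int negWithout) {
        double n = posWith + negWith + posWithout + negWithout;
        double before = entropy(posWith + posWithout, negWith + negWithout);
        double after = (posWith + negWith) / n * entropy(posWith, negWith)
                     + (posWithout + negWithout) / n * entropy(posWithout, negWithout);
        return before - after;
    }

    /** Log odds ratio of a term for the positive class; the 0.5/1.0 smoothing is an assumption to avoid division by zero. */
    static double oddsRatio(int posWith, int posTotal, int negWith, int negTotal) {
        double pPos = (posWith + 0.5) / (posTotal + 1.0);
        double pNeg = (negWith + 0.5) / (negTotal + 1.0);
        return Math.log(pPos * (1 - pNeg) / ((1 - pPos) * pNeg));
    }

    public static void main(String[] args) {
        // Hypothetical counts for one term in 10 documents (5 "like", 5 "dislike").
        System.out.println("TFIDF = " + tfidf(3, 10, 4));
        System.out.println("IG    = " + informationGain(4, 1, 1, 4));
        System.out.println("Odds  = " + oddsRatio(4, 5, 1, 5));
    }
}
```

Note that `tfidf` takes no class counts at all, while `oddsRatio` is signed: a term typical of the negative class gets a negative score, so ranking by it favors features for one class only.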

  6. Overview of the proposed method in [1] [Figure labels: TFIDF, Information Gain, Odds Ratio; Classification]

  7. Structure Adaptive SOM

  8. Training SASOMs using different feature sets [Figure labels: Fuzzy Integral; Hot or Cold]

  9. Data Set Description • UCI Syskill & Webert data (http://kdd.ics.uci.edu) • Contains the HTML source of web pages plus a single user’s ratings of these pages • The web pages cover four separate subjects • Bands: recording artists (implemented in this report) • Goats (implemented in this report) • Sheep • BioMedical

  10. Implementation • Java (J2SE 1.5) programs handle preprocessing, feature selection (TFIDF and Odds Ratio), and the fuzzy integral mechanism • Weka is used for feature selection (Information Gain) and classification • SASOM was not successfully implemented in this report…

  11. Implementation: Preprocessing [Figure labels: UCI Syskill & Webert data; ExtractHTMLContent.java; Pure Text without Anchor Text; After Stopword and Porter Stemmer; Bands.txt; Bands_Stopword.txt; Bands_Porter.txt]
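The source of ExtractHTMLContent.java is not shown in the slides, so the following is only a minimal sketch of what the first preprocessing steps might look like: dropping anchor elements together with their text, stripping the remaining tags, and removing stopwords. The class name `PreprocessSketch`, the regular expressions, and the tiny stopword list are assumptions for illustration; Porter stemming is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of HTML text extraction and stopword removal (not the author's code). */
class PreprocessSketch {

    // Tiny illustrative stopword list; a real run would load a full list from a file.
    static final Set<String> STOPWORDS =
            new HashSet<String>(Arrays.asList("a", "an", "and", "the", "of", "is", "to", "in"));

    /** Removes anchor elements (including their text), then all remaining tags. */
    static String extractText(String html) {
        String noAnchors = html.replaceAll("(?is)<a\\b[^>]*>.*?</a>", " ");
        String noTags = noAnchors.replaceAll("(?s)<[^>]+>", " ");
        return noTags.replaceAll("\\s+", " ").trim();
    }

    /** Lower-cases the text, splits on non-letters, and drops stopwords. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.toLowerCase().split("[^a-z]+")) {
            if (t.length() > 0 && !STOPWORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>The Band</h1>"
                + "<a href=\"x.html\">home</a><p>Songs of the road</p></body></html>";
        System.out.println(tokenize(extractText(html)));  // prints [band, songs, road]
    }
}
```

A regex-based tag stripper is fragile on malformed HTML; it is used here only to keep the sketch self-contained.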

  12. Implementation: Feature Selection • The Bands subject contains 61 pages • E.g., feature selection reduces the number of attributes from 5436 to 32
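The slides do not say how the reduced attribute set is chosen. One common approach, assumed here, is to score every term and keep the k highest-scoring ones. The sketch below shows that ranking step; the class name `TopKFeatures`, the method `selectTopK`, and the sample scores are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: keep the k best-scoring terms as the reduced attribute set. */
class TopKFeatures {

    /** Returns the k terms with the highest scores, best first (ties broken alphabetically). */
    static List<String> selectTopK(final Map<String, Double> scores, int k) {
        List<String> terms = new ArrayList<String>(scores.keySet());
        Collections.sort(terms, new Comparator<String>() {
            public int compare(String a, String b) {
                int byScore = Double.compare(scores.get(b), scores.get(a));
                return byScore != 0 ? byScore : a.compareTo(b);
            }
        });
        return new ArrayList<String>(terms.subList(0, Math.min(k, terms.size())));
    }

    public static void main(String[] args) {
        // Made-up scores; in practice these would come from TFIDF, Information Gain, or Odds Ratio.
        Map<String, Double> scores = new HashMap<String, Double>();
        scores.put("guitar", 0.9);
        scores.put("album", 0.7);
        scores.put("page", 0.1);
        System.out.println(selectTopK(scores, 2));  // prints [guitar, album]
    }
}
```

Calling `selectTopK(scores, 32)` on a 5436-term score map would yield a reduction of the size reported above, assuming that is how the 32 attributes were obtained.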

  13. Implementation: Fuzzy Integral • The fuzzy measures of the classifiers are determined subjectively [1] [Figure labels: Bayes Classifier; b1, b2, b3; 0.99; b1=0, b2=1, b3=0; FuzzyIntegral.java]
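FuzzyIntegral.java itself is not reproduced in the slides. As a rough illustration of how classifier outputs can be fused, the sketch below assumes the Sugeno fuzzy integral with a λ-fuzzy measure, which may differ from what the author implemented. Each classifier i has a subjectively chosen density g_i and an output h_i for a class; λ is the nonzero root greater than -1 of the equation 1 + λ = Π(1 + λ g_i), and the integral is the maximum over i of min(h_(i), g(A_i)) with outputs sorted in descending order. All names and the sample numbers are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch of a Sugeno fuzzy integral with a lambda-fuzzy measure (not the author's code). */
class FuzzyIntegralSketch {

    /** f(lambda) = prod(1 + lambda * g_i) - (1 + lambda); its nonzero root above -1 defines the measure. */
    static double f(double lambda, double[] g) {
        double prod = 1.0;
        for (double gi : g) prod *= 1.0 + lambda * gi;
        return prod - (1.0 + lambda);
    }

    /** Solves for lambda by bisection. Assumes at least two densities, each strictly between 0 and 1. */
    static double solveLambda(double[] g) {
        double sum = 0.0;
        for (double gi : g) sum += gi;
        if (Math.abs(sum - 1.0) < 1e-9) return 0.0;  // densities are already additive
        double lo, hi;
        if (sum > 1.0) {          // root lies in (-1, 0); f is positive to its left
            lo = -1.0;
            hi = 0.0;
        } else {                  // root lies in (0, inf); f is negative to its left
            lo = 0.0;
            hi = 1.0;
            while (f(hi, g) <= 0.0) hi *= 2.0;
        }
        for (int i = 0; i < 200; i++) {
            double mid = 0.5 * (lo + hi);
            boolean leftOfRoot = (sum > 1.0) ? f(mid, g) > 0.0 : f(mid, g) < 0.0;
            if (leftOfRoot) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    /** Sugeno integral of classifier outputs h with respect to the lambda-measure built from densities g. */
    static double sugenoIntegral(final double[] h, double[] g) {
        int n = h.length;
        Integer[] idx = new Integer[n];
        for (int i = 0; i < n; i++) idx[i] = i;
        // Visit classifiers in order of decreasing output.
        Arrays.sort(idx, new Comparator<Integer>() {
            public int compare(Integer a, Integer b) {
                return Double.compare(h[b], h[a]);
            }
        });
        double lambda = solveLambda(g);
        double measure = 0.0;  // g(A_0) = 0 for the empty set
        double best = 0.0;
        for (int i = 0; i < n; i++) {
            double gi = g[idx[i]];
            measure = gi + measure + lambda * gi * measure;  // g(A_i) from g(A_{i-1})
            best = Math.max(best, Math.min(h[idx[i]], measure));
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up outputs of three classifiers for one class, with made-up densities.
        double[] h = {0.8, 0.4, 0.6};
        double[] g = {0.3, 0.4, 0.2};
        System.out.println("lambda   = " + solveLambda(g));
        System.out.println("integral = " + sugenoIntegral(h, g));
    }
}
```

In a two-class setting such as hot versus cold, the integral would be computed once per class and the class with the larger value chosen.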

  14. Conclusion • The fuzzy integral provides a way to measure the importance of classifiers subjectively, especially in semi-supervised learning • The method based on the fuzzy integral can be effectively applied to web content mining to predict a user’s preference as a user profile • The fuzzy integral may also be applicable to my research area for integrating expert or user knowledge

  15. References • [1] Kyung-Joong Kim, Sung-Bae Cho, Fuzzy integration of structure adaptive SOMs for web content mining, Fuzzy Sets and Systems 148 (2004) 43–60 • [2] M. Pazzani, D. Billsus, Learning and revising user profiles: The identification of interesting web sites, Machine Learning 27 (1997) 313–331 • [3] http://kdd.ics.uci.edu/databases/SyskillWebert/SyskillWebert.data.html
