Outlines

High Relevance Keyword Extraction facility for Bayesian text classification on different domains of varying characteristic. Presenter: Min-Cong Wu. Authors: Lam Hong Lee, Dino Isa, Wou Onn Choo, Wen Yeen Chue. 2012.ESA.


Presentation Transcript


  1. High Relevance Keyword Extraction facility for Bayesian text classification on different domains of varying characteristic • Presenter: Min-Cong Wu • Authors: Lam Hong Lee, Dino Isa, Wou Onn Choo, Wen Yeen Chue • 2012.ESA

  2. Outlines • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

  3. Motivation • The advantage of Bayesian classification over other classification approaches is its simplicity and its ability to handle raw text data directly. • As a trade-off for this simplicity, Bayesian classification has been reported as one of the poorest-performing classification approaches.

  4. Objectives • Use the High Relevance Keyword Extraction (HRKE) facility to enhance the accuracy of the Bayesian classifier without sacrificing its low cost.

  5. Methodology – Block diagram

  6. Methodology – Bayesian Classifier

  7. Methodology – TF-IDF method • TF-IDF = TF × IDF • TF: Term Frequency; IDF: Inverse Document Frequency • N = the number of documents in the dataset that contain the word • Example:
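
The slide's formula and example did not survive in this transcript, so the following is only a minimal sketch of one common TF-IDF variant, not necessarily the one the authors used. It assumes TF is the term count divided by the document length and IDF is the natural log of (total documents / documents containing the term); the function name `tf_idf` and the toy documents are made up for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document.

    Assumed variant (not taken from the slides):
      TF  = count of term in document / document length
      IDF = ln(total documents / documents containing the term)
    """
    total_docs = len(docs)
    # doc_freq[t] = number of documents that contain term t
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(total_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus, for illustration only.
docs = [
    "the engine of the car".split(),
    "the proof of the theorem".split(),
    "the car has a new engine".split(),
]
weights = tf_idf(docs)
```

In this toy corpus "the" occurs in every document, so its IDF (and therefore its weight) is zero, while "theorem", which occurs in only one document, receives the largest weight in the second document.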

  8. Methodology – HRKE facility • The degree of relevance of keywords in the classification task can be adjusted by setting a threshold, m/n.
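
The slides describe HRKE only at a high level: keywords are selected by comparing how often they occur in documents of one category against each competing category, subject to a threshold m/n. The slides do not define m and n, so the sketch below is one possible reading and not the authors' algorithm. It assumes a single hypothetical parameter `threshold` and keeps a keyword for a category when the fraction of that category's documents containing it is at least `threshold` times the corresponding fraction in every competing category. The function name and data are illustrative.

```python
from collections import defaultdict

def select_keywords(docs_by_category, threshold):
    """docs_by_category: {category: [token list, ...]}.

    Assumption (not from the slides): a term is kept for a category when
    its document-occurrence rate there is >= threshold * its rate in
    each competing category.
    """
    # rate[c][t] = fraction of documents in category c that contain term t
    rate = {}
    for category, docs in docs_by_category.items():
        counts = defaultdict(int)
        for doc in docs:
            for term in set(doc):
                counts[term] += 1
        rate[category] = {t: n / len(docs) for t, n in counts.items()}

    selected = {}
    for category, rates in rate.items():
        rivals = [c for c in rate if c != category]
        selected[category] = {
            term for term, r in rates.items()
            if all(r >= threshold * rate[c].get(term, 0.0) for c in rivals)
        }
    return selected

# Hypothetical toy data, for illustration only.
docs_by_category = {
    "vehicles": [["car", "engine", "the"], ["car", "wheel", "the"]],
    "mathematics": [["proof", "theorem", "the"], ["theorem", "the", "car"]],
}
selected = select_keywords(docs_by_category, threshold=2.0)
```

With this toy data, "the" is dropped from both categories because it is equally common in each, and "car" is kept only for "vehicles". Raising `threshold` makes the selection stricter, which matches the slide's remark that the degree of keyword relevance can be adjusted.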

  9. Experiment - the basic flat ranking multivariate

  10. Experiment - Featured Articles dataset

  11. Experiment - Vehicles dataset

  12. Experiment - Mathematics dataset

  13. Experiment - 20-Newsgroups dataset

  14. Experiment - Summary

  15. Conclusions • The HRKE facility is achieved by applying a unique feature selection method that takes the occurrence of keywords in documents from a specified category and compares it with the occurrence of those keywords in each of the competing categories.

  16. Comments • Advantages: improves Bayesian classification performance while maintaining low cost. • Applications: feature selection
