Comments from Pre-submission Presentation

Presentation Transcript

  1. Comments from Pre-submission Presentation • Q: Check why kNN scores so much lower than SVM (about 10%) on the Reuters and 20 Newsgroups corpora. • A: Refer to the following four references: [Joachims 98] [Debole 03 STM] [Dumais 98 Inductive] [Yang 99 Re-examination]

  2. [Joachims 98] [Debole 03] [Dumais 98] Results on the Reuters Corpus

  3. [Yang 99 Re-examination] Significance Test • Micro-level analysis (s-test) • SVM > kNN >> {LLSF, NNet} >> NB • Macro-level analysis • {SVM, kNN, LLSF} >> {NB, NNet} • Error-rate based comparison • {SVM, kNN} > LLSF > NNet >> NB
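
A minimal sketch of a micro sign test in the spirit of the s-test above, assuming parallel per-decision correctness flags for the two classifiers are already available; the exact protocol is the one in [Yang 99 Re-examination], and the decision data below is hypothetical.

```python
from scipy.stats import binomtest

def s_test(correct_a, correct_b):
    """Micro sign test: compare two classifiers over the same pooled
    document/category binary decisions. Inputs are parallel boolean
    lists marking whether each decision was correct."""
    n_plus = sum(a and not b for a, b in zip(correct_a, correct_b))   # A right, B wrong
    n_minus = sum(b and not a for a, b in zip(correct_a, correct_b))  # B right, A wrong
    n = n_plus + n_minus
    if n == 0:
        return 1.0  # the systems never disagree
    # Under H0 (equal performance), disagreements split 50/50.
    return binomtest(n_plus, n, 0.5).pvalue

# Hypothetical per-decision outcomes for SVM vs kNN
svm_ok = [True, True, False, True, True, True, False, True]
knn_ok = [True, False, False, True, False, True, False, False]
print(s_test(svm_ok, knn_ok))
```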

  4. Comments from Pre-submission Presentation • 2. Explain why BEP & F1 are used in Chapter 7 • Add a reference

  5. Breakeven point (1) • BEP was first proposed by Lewis [1992]. He later pointed out himself that BEP is not a good effectiveness measure, because • 1. there may be no parameter setting that yields the breakeven; in this case the final BEP value, obtained by interpolation, is artificial; • 2. having P = R is not necessarily desirable, and it is not clear that a system that achieves a high BEP can be tuned to score high on other effectiveness measures.

  6. Breakeven point (2) • Yang [1999 Re-examination] also noted that when no parameter setting brings P and R close enough together, the interpolated breakeven may not be a reliable indicator of effectiveness.
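
A minimal sketch of the interpolation that slides 5 and 6 warn about, assuming precision/recall pairs have been measured at a series of parameter settings; averaging the closest P/R pair is one common convention, not necessarily the one used in the thesis.

```python
def interpolated_bep(pr_points):
    """Breakeven point: given (precision, recall) pairs measured at
    successive parameter settings, return the value where P = R.
    If no setting yields P == R exactly, fall back to interpolation
    (here: average P and R at the point where they are closest),
    which is exactly the artificial value Lewis warns about."""
    exact = [p for p, r in pr_points if p == r]
    if exact:
        return exact[0]
    # No exact breakeven: interpolate.
    p, r = min(pr_points, key=lambda pr: abs(pr[0] - pr[1]))
    return (p + r) / 2.0

# Hypothetical precision/recall values at decreasing thresholds
points = [(0.90, 0.40), (0.82, 0.55), (0.74, 0.70), (0.61, 0.85)]
print(interpolated_bep(points))  # 0.72 -- interpolated; P and R never actually meet
```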

  7. Comments from Pre-submission Presentation • 3. Adding more qualitative analysis would be better

  8. Analysis and Proposal: Empirical Observation • Comparison of the idf, rf and chi^2 values of four features in two categories of the Reuters Corpus
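
For reference, a minimal sketch of the three weights being compared, assuming the rf (relevance frequency) definition rf = log2(2 + a / max(1, c)) from the tf.rf scheme and the standard 2x2 contingency-table form of chi^2; all counts below are hypothetical.

```python
import math

def idf(N, df):
    """Inverse document frequency: N documents, df contain the term."""
    return math.log(N / df)

def rf(a, c):
    """Relevance frequency (assumed tf.rf form):
    a = positive-category documents containing the term,
    c = negative-category documents containing the term."""
    return math.log2(2.0 + a / max(1.0, c))

def chi2(a, b, c, d):
    """Chi-square from the 2x2 term/category contingency table:
    a = positive docs with term,  b = positive docs without term,
    c = negative docs with term,  d = negative docs without term."""
    n = a + b + c + d
    num = n * (a * d - b * c) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den

# Hypothetical counts for one feature in one Reuters category
a, b, c, d = 30, 20, 10, 940
print(idf(a + b + c + d, a + c), rf(a, c), chi2(a, b, c, d))
```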

  9. Comments from Pre-submission Presentation • 4. In Chapter 7, remove Joachims' results; citing them via quotation is fine

  10. Comments from Pre-submission Presentation • 5. Tone down “best” claims • → use “to our knowledge (experience, understanding)” • Pay attention to this usage when giving the presentation

  11. Introduction: Other Text Representation • Word senses (meanings) [Kehagias 2001] • the same word assumes different meanings in different contexts • Term clustering [Lewis 1992] • group words with a high degree of pairwise semantic relatedness • Semantic and syntactic representation [Scott & Matwin 1999] • relationships between words, i.e. phrases, synonyms and hypernyms

  12. Introduction: Other Text Representation • Latent Semantic Indexing [Deerwester 1990] • a feature reconstruction technique • Combination approach [Peng 2003] • combine two types of indexing terms, i.e. words and 3-grams • In general, high-level representations did not show good performance in most cases
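
A minimal numpy sketch of the LSI idea, reconstructing documents in a low-rank latent space via truncated SVD; this illustrates the technique rather than the exact construction in [Deerwester 1990], and the toy matrix is hypothetical.

```python
import numpy as np

def lsi(term_doc, k):
    """Latent Semantic Indexing sketch: factor the term-document
    matrix with SVD and keep the k largest singular values,
    re-expressing each document in a k-dimensional latent space."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Each column is one document's k-dimensional representation.
    return np.diag(s[:k]) @ Vt[:k, :]

# Toy 5-term x 4-document count matrix
X = np.array([[2, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 2, 0, 1],
              [0, 0, 3, 1],
              [0, 1, 1, 2]], dtype=float)
print(lsi(X, k=2))
```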

  13. Literature Review: Knowledge-based Representation • Theme Topic Mixture Model, a graphical model [Keller 2004] • Using keywords from summarization [Li 2003]

  14. Literature Review: 2. How to weight a term (feature) • [Salton 1988] elaborated three considerations: • 1. term occurrences closely represent the content of a document • 2. other factors with discriminating power pick out relevant documents from irrelevant ones • 3. the effect of document length should be considered

  15. Literature Review: 2. How to weight a term (feature) • 1. Term Frequency Factor • Binary representation (1 for present, 0 for absent) • Term frequency (tf): the number of times a term occurs in a document • log(tf): a log operation to scale down the effect of unfavorably high term frequencies • Inverse term frequency (ITF)
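
A minimal sketch of these term-frequency factors for one document; the 1 + log(tf) damping and the tf/(1 + tf) form of ITF are common conventions, and exact definitions vary by author.

```python
import math
from collections import Counter

def tf_variants(tokens, term):
    """The term-frequency factors listed above, for one document."""
    tf = Counter(tokens)[term]
    return {
        "binary": 1 if tf > 0 else 0,                  # present / absent
        "tf": tf,                                      # raw count
        "log_tf": 1 + math.log(tf) if tf > 0 else 0,   # damp high counts (common convention)
        "itf": tf / (1 + tf),                          # one common ITF form
    }

doc = "oil price rises as oil supply falls".split()
print(tf_variants(doc, "oil"))
```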

  16. Literature Review: 2. How to weight a term (feature) • 2. Collection Frequency Factor • idf: the most commonly used factor • Probabilistic idf: a.k.a. the term relevance weight • Feature selection metrics: chi^2, information gain, gain ratio, odds ratio, etc.
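
A minimal sketch of the two idf-style factors, assuming the usual forms idf = log(N/df) and probabilistic idf = log((N - df)/df); the collection counts are hypothetical. (See the chi^2 sketch after slide 8 for a feature selection metric.)

```python
import math

def idf(N, df):
    """Standard inverse document frequency."""
    return math.log(N / df)

def prob_idf(N, df):
    """Probabilistic idf (term relevance weight form)."""
    return math.log((N - df) / df)

N, df = 10000, 150   # hypothetical collection size and document frequency
print(idf(N, df), prob_idf(N, df))
```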

  17. Literature Review: 2. How to weight a term (feature) • 3. Normalization Factor • Combine the above two factors by multiplication • To eliminate the effect of document length, cosine normalization is used to limit term weights to the range (0, 1]
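
Putting the three factors together, a minimal sketch of tf x idf with cosine normalization; the document frequencies below are hypothetical.

```python
import math
from collections import Counter

def tfidf_cosine(doc_tokens, df, N):
    """tf x idf weights for one document, cosine-normalized so the
    weight vector has unit length and each weight falls in (0, 1]."""
    counts = Counter(doc_tokens)
    weights = {t: tf * math.log(N / df[t]) for t, tf in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()}

# Hypothetical document frequencies in a 1000-document collection
df = {"oil": 120, "price": 300, "cocoa": 15}
print(tfidf_cosine("oil price oil cocoa".split(), df, N=1000))
```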