
Study on Ensemble Learning


Presentation Transcript


  1. Study on Ensemble Learning By Feng Zhou

  2. Content • Introduction • A Statistical View of M3 Network • Future Works

  3. Introduction • Ensemble learning: combining a group of classifiers rather than designing a single new one. • The decisions of multiple hypotheses are combined to produce more accurate results. • Problems in traditional learning algorithms: • Statistical problem • Computational problem • Representation problem • Related work: • Resampling techniques: Bagging, Boosting • Approaches for extending to multi-class problems: One-vs-One, One-vs-All (a minimal One-vs-One sketch follows below)
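
As a concrete illustration of the One-vs-One strategy just mentioned, here is a minimal sketch (not from the original slides; scikit-learn, LogisticRegression, and the iris data are stand-ins chosen for illustration) that trains one binary classifier per pair of classes and combines their decisions by majority vote:

# One-vs-One ensemble sketch: one binary classifier per class pair,
# combined by majority vote -- the "combine a group of classifiers" idea.
from itertools import combinations
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
classes = np.unique(y)

pairwise = {}
for a, b in combinations(classes, 2):
    mask = np.isin(y, [a, b])          # keep only the two classes of this pair
    pairwise[(a, b)] = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])

def predict(x):
    votes = np.zeros(len(classes))
    for clf in pairwise.values():
        votes[clf.predict(x.reshape(1, -1))[0]] += 1   # each pair casts one vote
    return int(np.argmax(votes))

print(predict(X[0]))   # -> 0 for the first iris sample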

  4. Min-Max Modular (M3) Network (Lu, IEEE TNN 1999) • Steps: • Dividing the training sets (Chen, IJCNN 2006; Wen, ICONIP 2005) • Training pair-wise classifiers • Integrating the outcomes (Zhao, IJCNN 2005): • Min process • Max process (a sketch of the integration step follows below)
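
The Min and Max processes reduce to a few lines once each module's output is treated as a score for the positive class. The sketch below is a reconstruction from the step list above, not the authors' code; module_scores is an invented name:

# M3-style integration sketch (reconstruction): module_scores[i][j] is the
# output of the module trained on positive subset i vs. negative subset j,
# read as an estimate of P(w+ | x).
def min_max_combine(module_scores):
    # Min process: within each positive subset, take the minimum over
    # the modules pairing it with every negative subset.
    mins = [min(row) for row in module_scores]
    # Max process: take the maximum over the positive subsets.
    return max(mins)

scores = [[0.9, 0.4, 0.7],    # toy numbers
          [0.3, 0.8, 0.6]]
print(min_max_combine(scores))   # -> max(0.4, 0.3) = 0.4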

  5. A Statistical View • Assumption: each pair-wise classifier outputs a probabilistic value, obtained with the sigmoid of J. C. Platt (ALMC 1999): P(w+|x) = 1 / (1 + exp(A·f(x) + B)) • Bayesian decision theory
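
A minimal sketch of fitting that sigmoid (plain gradient descent on the negative log-likelihood, for brevity; Platt's paper uses a more careful Newton-style optimizer, and the toy scores below are invented):

import numpy as np

# Platt scaling sketch: fit P(y=1|f) = 1 / (1 + exp(A*f + B)) on held-out
# (score, label) pairs; simplified optimizer, not Platt's original one.
def fit_platt(scores, labels, lr=0.1, steps=2000):
    A, B = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(A * scores + B))
        g = labels - p                    # gradient of the NLL w.r.t. A*f + B
        A -= lr * np.mean(g * scores)
        B -= lr * np.mean(g)
    return A, B

scores = np.array([-2.0, -1.0, 1.0, 2.0])   # raw classifier outputs
labels = np.array([0.0, 0.0, 1.0, 1.0])
A, B = fit_platt(scores, labels)
print(1.0 / (1.0 + np.exp(A * 1.5 + B)))    # calibrated P(w+|x) for f(x)=1.5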

  6. A Simple Discrete Example

  7. A Simple Discrete Example (II) • Classifier 0 (w+ : w-), Classifier 1 (w+ : w1-), Classifier 2 (w+ : w2-) • Pc0(w+|x=x2) = 1/3, Pc1(w+|x=x2) = 1/2, Pc2(w+|x=x2) = 1/2 • Hence Pc0 < min(Pc1, Pc2)

  8. A More Complicated Example • Each time one more classifier is considered, the evidence that x belongs to w+ shrinks: Pglobal(w+) < min(Ppartial(w+)). • The classifier reporting the minimum value contains the most information about w- (minimization principle); as classifiers (w+ : w1-), (w+ : w2-), … are added, the information about w- keeps increasing. • If Ppartial(w+) = 1, no information about w- is contained. (A sketch of why the bound holds follows below.)
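
One way to make the shrinking-evidence claim precise (a sketch, assuming all pairwise classifiers share consistent class priors and class-conditional densities): writing w- as the union of the subsets wj-, Bayes' rule gives

  P(w+|x) = P(w+) p(x|w+) / ( P(w+) p(x|w+) + Σj P(wj-) p(x|wj-) )
          ≤ P(w+) p(x|w+) / ( P(w+) p(x|w+) + P(wj-) p(x|wj-) )   for every j,

and the right-hand side is exactly the pairwise posterior Pcj(w+|x), so P(w+|x) ≤ min_j Pcj(w+|x): each additional negative subclass only adds mass to the denominator.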

  9. Analysis • For each classifier cij • For each sub-positive class wi+ • For the positive class w+ (see the reconstruction below)
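
The formulas at each level presumably follow the Min and Max steps of slide 4 (a reconstruction, not verified against the original deck):

  For each classifier cij:            Pcij(w+|x), the pairwise estimate for positive subset i vs. negative subset j
  For each sub-positive class wi+:    P(wi+|x) = min_j Pcij(w+|x)          (Min process)
  For the positive class w+:          P(w+|x) = max_i min_j Pcij(w+|x)     (Max process)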

  10. Analysis (II) • Decomposition of a complex problem • Restoration to the original resolution

  11. Composition of Training Sets • Figure: composition of the training sets, distinguishing sets already used, sets not yet used, and trivial (useless) sets

  12. Another Way of Combination • Training and testing time:

  13. Experiments - Synthetic Data

  14. Experiments – Text Categorization (20 Newsgroups corpus) • Experiment setup • Preprocessing: stemming, stop-word removal, and removal of words occurring fewer than 30 times • Using Naïve Bayes as the elementary classifier • Estimating the probability with a sigmoid function (a minimal reconstruction of this setup follows below)
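
A minimal reconstruction of this setup (scikit-learn names are stand-ins, as the original experiment predates them; stemming is omitted since CountVectorizer does not provide it, and the slide's "< 30" threshold is mapped to min_df=30 as an assumption):

# 20 Newsgroups + Naive Bayes sketch mirroring the slide's setup:
# English stop words are removed and rare words are dropped.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

vec = CountVectorizer(stop_words="english", min_df=30)
Xtr = vec.fit_transform(train.data)
Xte = vec.transform(test.data)

clf = MultinomialNB().fit(Xtr, train.target)
print("accuracy:", clf.score(Xte, test.target))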

  15. Future Work • The situation where noise must be taken into account • The crux of the problem: to access the underlying distribution • Independent parameters of the model • Constraints we can obtain • Obtaining the best estimate: Kullback-Leibler distance (T. Hastie, Ann Statist 1998)
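
For reference, the Kullback-Leibler criterion from the cited pairwise-coupling paper [1]: given observed pairwise probabilities rij and model probabilities uij = pi / (pi + pj), one minimizes the weighted KL distance

  l(p) = Σ_{i<j} n_ij [ r_ij log(r_ij / u_ij) + (1 - r_ij) log((1 - r_ij) / (1 - u_ij)) ],

where n_ij is the number of training examples from classes i and j.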

  16. References
  [1] T. Hastie and R. Tibshirani, "Classification by pairwise coupling," Annals of Statistics, 1998.
  [2] J. C. Platt, "Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods," Advances in Large Margin Classifiers, 1999.
  [3] B. Lu and M. Ito, "Task decomposition and module combination based on class relations: a modular neural network for pattern classification," IEEE Transactions on Neural Networks, 1999.
  [4] Y. M. Wen and B. Lu, "Equal clustering makes min-max modular support vector machines more efficient," ICONIP 2005.
  [5] H. Zhao and B. Lu, "On efficient selection of binary classifiers for min-max modular classifier," IJCNN 2005.
  [6] K. Chen and B. Lu, "Efficient classification of multi-label and imbalanced data using min-max modular classifiers," IJCNN 2006.
