
Lecture 6 Ensemble Learning (1) Boosting




Presentation Transcript


  1. Lecture 6 Ensemble Learning (1) Boosting. Outline: AdaBoost; boosting is an additive model; a brief intro to the lasso; the relationship between the two.

  2. Boosting Combine multiple classifiers: construct a sequence of weak classifiers, then combine them into a strong classifier by a weighted majority vote. "Weak" means only slightly better than random coin-tossing. Some properties: flexible; able to select features; good generalization; but it can fit noise.

  3. Boosting AdaBoost (Freund & Schapire, 1995). The algorithm is reproduced below.
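
The algorithm box on this slide was an image and did not survive extraction; the standard statement of AdaBoost.M1 (in the notation used on the later derivation slides) is:

1. Initialize the observation weights w_i = 1/N, \quad i = 1, \dots, N.
2. For m = 1, \dots, M:
 (a) fit a classifier G_m(x) to the training data using weights w_i;
 (b) \mathrm{err}_m = \sum_i w_i I(y_i \neq G_m(x_i)) / \sum_i w_i;
 (c) \alpha_m = \log\left((1 - \mathrm{err}_m)/\mathrm{err}_m\right);
 (d) w_i \leftarrow w_i \exp\left[\alpha_m I(y_i \neq G_m(x_i))\right].
3. Output G(x) = \mathrm{sign}\left(\sum_{m=1}^{M} \alpha_m G_m(x)\right).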

  4.–7. Boosting Figures from "A Tutorial on Boosting", Yoav Freund and Rob Schapire.

  8. Boosting

  9. Boosting Two kinds of weights appear here. α_m is the weight of the current weak classifier in the final model. The w_i are weights on individual observations; notice they accumulate multiplicatively from step 1: if an observation is correctly classified at this step, its weight does not change; if incorrectly classified, its weight increases.
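
A minimal NumPy sketch of these two kinds of weights, assuming decision stumps as the weak learner (all function and variable names here are illustrative, not from the lecture):

import numpy as np

def fit_stump(X, y, w):
    """Exhaustive search for the stump minimizing the weighted error.
    Returns (feature, threshold, sign, weighted error)."""
    n, p = X.shape
    best = (0, 0.0, 1, np.inf)
    for j in range(p):
        for t in np.unique(X[:, j]):
            pred = np.where(X[:, j] <= t, 1, -1)
            for s in (1, -1):
                err = np.sum(w * (s * pred != y)) / np.sum(w)
                if err < best[3]:
                    best = (j, t, s, err)
    return best

def adaboost(X, y, M=50):
    """AdaBoost.M1 with stumps; y must be coded as +1/-1."""
    n = len(y)
    w = np.full(n, 1.0 / n)               # observation weights, step 1
    stumps, alphas = [], []
    for m in range(M):
        j, t, s, err = fit_stump(X, y, w)
        err = max(err, 1e-12)             # guard against a perfect stump
        alpha = np.log((1 - err) / err)   # weight of this weak classifier
        pred = s * np.where(X[:, j] <= t, 1, -1)
        w *= np.exp(alpha * (pred != y))  # misclassified points get heavier
        stumps.append((j, t, s))
        alphas.append(alpha)
    return stumps, alphas

def predict(X, stumps, alphas):
    """Weighted majority vote of the weak classifiers."""
    F = np.zeros(len(X))
    for (j, t, s), a in zip(stumps, alphas):
        F += a * s * np.where(X[:, j] <= t, 1, -1)
    return np.sign(F)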

  10. Boosting

  11. Boosting

  12. Boosting

  13. Boosting Example with 10 predictors. The weak classifier is a stump: a two-level tree (a single split).
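
This kind of experiment is easy to reproduce with scikit-learn, whose AdaBoostClassifier uses a depth-one tree (a stump) as its default weak learner. The ten-predictor data-generating process below is an assumption for illustration (it follows the well-known simulated example in Hastie et al., which this slide may or may not be using):

import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Ten standard-normal predictors; the class is +1 when the squared norm
# exceeds the chi-squared(10) median (about 9.34), and -1 otherwise.
X = rng.standard_normal((2000, 10))
y = np.where((X ** 2).sum(axis=1) > 9.34, 1, -1)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# The default weak learner is a depth-1 decision tree, i.e. a stump.
clf = AdaBoostClassifier(n_estimators=400, random_state=0).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))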

  14. Boosting Boosting can be seen as fitting an additive model of the general form below. The β_m are the expansion coefficients, and the basis functions b(x; γ) are simple functions of the feature vector x with parameters γ. Examples of γ: the weights of a sigmoidal function in a neural network; the split variable and split point in a tree model.
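
The general form on the slide (an image in the original) is, in standard notation:

f(x) = \sum_{m=1}^{M} \beta_m\, b(x; \gamma_m)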

  15. Boosting In general, such models are fit by minimizing a loss function over the whole training set, jointly over all M basis functions (below, left). This can be computationally intensive. An alternative is to proceed stepwise, fitting the sub-problem for a single basis function at a time (below, right).
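
Reconstructed in the same notation, the joint problem and the single-basis-function sub-problem are:

\min_{\{\beta_m, \gamma_m\}_1^M} \sum_{i=1}^{N} L\left(y_i,\ \sum_{m=1}^{M} \beta_m b(x_i; \gamma_m)\right)
\qquad \text{vs.} \qquad
\min_{\beta, \gamma} \sum_{i=1}^{N} L\left(y_i,\ \beta\, b(x_i; \gamma)\right)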

  16. Boosting Forward stagewise additive modeling: add new basis functions one at a time, without adjusting the parameters and coefficients of those already added (see the algorithm below). Example: with squared-error loss, each step amounts to fitting the new basis function to the residuals of the current model. Note that the squared loss function is not a good choice for classification.
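
The standard statement of the algorithm, with the squared-loss example written out (reconstructed; the slide's equations were images):

1. Initialize f_0(x) = 0.
2. For m = 1, \dots, M:
 (a) (\beta_m, \gamma_m) = \arg\min_{\beta, \gamma} \sum_{i=1}^{N} L\left(y_i,\ f_{m-1}(x_i) + \beta\, b(x_i; \gamma)\right);
 (b) f_m(x) = f_{m-1}(x) + \beta_m b(x; \gamma_m).

With squared loss, L = \left(r_{im} - \beta\, b(x_i; \gamma)\right)^2 where r_{im} = y_i - f_{m-1}(x_i), so each step fits the new basis function to the current residuals.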

  17. Boosting The version of AdaBoost we discussed uses the loss function below. The basis functions are the individual weak classifiers G_m(x) ∈ {−1, +1}.
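
That loss is the exponential loss:

L(y, f(x)) = \exp\left(-y\, f(x)\right)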

  18. Boosting Margin: y·f(x). It is positive when the classification is correct and negative when it is incorrect. The goal of classification is to produce positive margins as much as possible, and negative margins should be penalized more. The exponential loss penalizes negative margins heavily, growing exponentially as the margin becomes more negative.

  19. Boosting To be solved at each step: the minimization below. The weight w_i^(m) depends only on the model fit in previous steps, so it is independent of β and G.
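
Reconstructed, the step-m problem and the weights are:

(\beta_m, G_m) = \arg\min_{\beta, G} \sum_{i=1}^{N} \exp\left(-y_i \left(f_{m-1}(x_i) + \beta\, G(x_i)\right)\right)
= \arg\min_{\beta, G} \sum_{i=1}^{N} w_i^{(m)} \exp\left(-\beta\, y_i\, G(x_i)\right),
\qquad w_i^{(m)} = \exp\left(-y_i f_{m-1}(x_i)\right)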

  20. Boosting Each observation is either correctly or incorrectly classified, so the objective splits into two sums (below). For any β > 0, the minimizing G_m is precisely the classifier that minimizes the weighted error rate.
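
Splitting the sum over correct and incorrect classifications:

\sum_i w_i^{(m)} e^{-\beta\, y_i G(x_i)}
= e^{-\beta} \sum_{y_i = G(x_i)} w_i^{(m)} + e^{\beta} \sum_{y_i \neq G(x_i)} w_i^{(m)}
= \left(e^{\beta} - e^{-\beta}\right) \sum_i w_i^{(m)} I(y_i \neq G(x_i)) + e^{-\beta} \sum_i w_i^{(m)}

so G_m = \arg\min_G \sum_i w_i^{(m)} I(y_i \neq G(x_i)).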

  21. Boosting Solving for G_m gives a weighted error rate err_m. Plugging it back in and minimizing over β gives β_m. The overall classifier is then updated by plugging these in (below).
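
Explicitly:

\mathrm{err}_m = \frac{\sum_i w_i^{(m)} I(y_i \neq G_m(x_i))}{\sum_i w_i^{(m)}},
\qquad \beta_m = \frac{1}{2} \log \frac{1 - \mathrm{err}_m}{\mathrm{err}_m},
\qquad f_m(x) = f_{m-1}(x) + \beta_m G_m(x)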

  22. Boosting The weights for the next iteration become the expression below. The final factor is independent of i, so it rescales all weights equally and can be ignored.
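
Using -y_i G_m(x_i) = 2\, I(y_i \neq G_m(x_i)) - 1:

w_i^{(m+1)} = w_i^{(m)} e^{-\beta_m y_i G_m(x_i)}
= w_i^{(m)}\, e^{\alpha_m I(y_i \neq G_m(x_i))} \cdot e^{-\beta_m},
\qquad \alpha_m = 2\beta_m

which recovers the \alpha_m of the AdaBoost algorithm above.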

  23. Lasso The lasso in its equivalent Lagrangian form, shown below together with the corresponding penalties for ridge regression and the elastic net.
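
Reconstructed (the slide's formulas were images):

\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \left\{ \frac{1}{2} \sum_{i=1}^{N} \left(y_i - \beta_0 - \sum_j x_{ij} \beta_j\right)^2 + \lambda \sum_j |\beta_j| \right\}

Ridge regression replaces the penalty with \lambda \sum_j \beta_j^2; the elastic net uses the mixture \lambda \sum_j \left(\alpha \beta_j^2 + (1 - \alpha) |\beta_j|\right).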

  24. Lasso

  25. Lasso When the x's are orthogonal, the lasso and ridge estimators are simple transformations of the least squares estimates (below).
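
In the orthonormal case the estimators have closed forms in terms of the least squares estimates \hat{\beta}_j (this reconstruction follows the standard table in Hastie et al.):

\hat{\beta}_j^{\text{lasso}} = \mathrm{sign}(\hat{\beta}_j) \left(|\hat{\beta}_j| - \lambda\right)_+,
\qquad \hat{\beta}_j^{\text{ridge}} = \hat{\beta}_j / (1 + \lambda)

The lasso soft-thresholds: small coefficients are set exactly to zero, which is how it selects features.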

  26. Lasso Figure: error contours in parameter space, with the lasso and ridge constraint regions.

  27. Boosted linear regression {Tk}: a collection of basis functions.

  28. Boosted linear regression Here the T's are the x's themselves: in a linear regression setting, each basis function is simply one of the original predictors. A sketch of the resulting procedure follows below.
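
A minimal sketch of incremental forward stagewise linear regression, i.e. the boosted version of linear regression (function names and the step size are illustrative). With a small step size, its coefficient paths closely resemble the lasso path, which is the relationship this lecture points to:

import numpy as np

def forward_stagewise(X, y, eps=0.01, n_steps=5000):
    """Incremental forward stagewise linear regression.
    Assumes standardized X and centered y; each step nudges the
    coefficient most correlated with the current residual by +/- eps."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y.copy()                      # current residual
    path = []
    for _ in range(n_steps):
        corr = X.T @ r                # correlation of each predictor with residual
        j = np.argmax(np.abs(corr))   # most correlated predictor
        delta = eps * np.sign(corr[j])
        beta[j] += delta              # small step on that coefficient only
        r -= delta * X[:, j]          # update residual; earlier terms unchanged
        path.append(beta.copy())
    return beta, np.array(path)

# Illustrative usage with simulated data:
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
X = (X - X.mean(0)) / X.std(0)
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.standard_normal(200)
y = y - y.mean()
beta, path = forward_stagewise(X, y)
print(np.round(beta, 2))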
