
Lecture 7. Kernel Smoothing Methods



Presentation Transcript


  1. Lecture 7. Kernel Smoothing Methods Instructed by Jinzhu Jia

  2. Outline • Description of Kernels • One-dimensional Kernel Smoothing • Selecting the Width of the Kernel • Local Regression in R^p • Kernel Density Estimation and Classification

  3. Nearest Neighbor Smoothing • Average of the responses y_i with uniform weights over the k points nearest the target x0

  4. Nearest Neighbor Smoothing • Properties: • Approximates E(Y|X) • Not continuous • To overcome discontinuity: assign weights that die off smoothly with distance from the target point.
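A minimal numpy sketch of the k-nearest-neighbor running mean described on these slides; the function name `knn_smooth` and the toy data are illustrative, not part of the lecture:

```python
import numpy as np

def knn_smooth(x0, x, y, k=30):
    """Average y over the k points closest to x0 (uniform weights)."""
    idx = np.argsort(np.abs(x - x0))[:k]   # indices of the k nearest neighbors
    return y[idx].mean()                   # uniform weights -> fit is discontinuous in x0

# toy usage: smooth noisy observations of sin(4x)
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 100))
y = np.sin(4 * x) + rng.normal(0, 0.3, 100)
fit = np.array([knn_smooth(x0, x, y, k=30) for x0 in x])
```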

  5. Epanechnikov Kernel

  6. Kernel • K_λ(x0, x) assigns a weight to x based on its similarity (distance) to the target point x0 • A larger bandwidth λ implies lower variance (more points are averaged) but higher bias

  7. Examples • Epanechnikov kernel: K_λ(x0, x) = D(|x − x0| / λ), with D(t) = (3/4)(1 − t²) for |t| ≤ 1 and 0 otherwise • k-NN: the neighborhood size k replaces λ; the effective window radius is the distance from x0 to the k-th closest point • Tri-cube: D(t) = (1 − |t|³)³ for |t| ≤ 1 and 0 otherwise
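A short sketch of these kernels and the resulting Nadaraya-Watson kernel-weighted average; names such as `nw_smooth` and the default `lam=0.2` are assumptions for illustration:

```python
import numpy as np

def epanechnikov(t):
    """D(t) = 3/4 * (1 - t^2) for |t| <= 1, else 0."""
    return np.where(np.abs(t) <= 1, 0.75 * (1 - t**2), 0.0)

def tricube(t):
    """D(t) = (1 - |t|^3)^3 for |t| <= 1, else 0."""
    return np.where(np.abs(t) <= 1, (1 - np.abs(t)**3)**3, 0.0)

def nw_smooth(x0, x, y, lam=0.2, D=epanechnikov):
    """Nadaraya-Watson average with weights K_lam(x0, x_i) = D(|x_i - x0| / lam)."""
    w = D(np.abs(x - x0) / lam)
    return np.sum(w * y) / np.sum(w)   # assumes at least one x_i lies within lam of x0
```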

  8. Local Linear Regression

  9. Local linear regression

  10. Local Linear Regression • Makes a first-order bias correction to the kernel-weighted average, which is especially important near the boundaries
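A minimal sketch of local linear regression at a single target point, solving the kernel-weighted least-squares problem described above; `local_linear` and its defaults are illustrative names:

```python
import numpy as np

def local_linear(x0, x, y, lam=0.2):
    """Fit a weighted least-squares line around x0 and return its value at x0.

    Solves min_{a, b} sum_i K_lam(x0, x_i) * (y_i - a - b * x_i)^2 with
    Epanechnikov weights; this removes the first-order (boundary) bias of
    a plain kernel average.
    """
    w = np.maximum(0.0, 0.75 * (1 - ((x - x0) / lam) ** 2))  # Epanechnikov weights
    B = np.column_stack([np.ones_like(x), x])                # design matrix [1, x]
    sw = np.sqrt(w)                                          # weighted least squares
    a, b = np.linalg.lstsq(sw[:, None] * B, sw * y, rcond=None)[0]
    return a + b * x0
```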

  11. Local Polynomial Regression

  12. Local Polynomial Regression • The price of bias reduction? Variance!
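Extending the sketch above to a higher degree only means adding polynomial columns to the local design matrix; the `degree` parameter below is an illustrative name (degree 1 recovers local linear, degree 2 a local quadratic, which reduces bias further at the cost of extra variance):

```python
import numpy as np

def local_poly(x0, x, y, lam=0.2, degree=2):
    """Locally weighted polynomial fit of the given degree, evaluated at x0."""
    w = np.maximum(0.0, 0.75 * (1 - ((x - x0) / lam) ** 2))  # Epanechnikov weights
    B = np.vander(x, N=degree + 1, increasing=True)          # columns 1, x, x^2, ...
    sw = np.sqrt(w)
    theta = np.linalg.lstsq(sw[:, None] * B, sw * y, rcond=None)[0]
    return (np.vander([x0], N=degree + 1, increasing=True) @ theta)[0]
```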

  13. Selecting the Width of the Kernel • The bandwidth λ controls the width of the local region • Epanechnikov or tri-cube: λ is the radius of the support region • k-NN: λ is the number k of nearest neighbors • Gaussian: λ is the standard deviation

  14. Selecting the Width of the Kernel • There is a natural bias-variance tradeoff as we change the width of the averaging window (take the local average as an example) • If the window is narrow, the variance is larger and the bias is smaller • If the window is wide, the variance is smaller and the bias is larger, because some of the x_i being averaged are far from x0 • So CV, AIC, or BIC can all be used to select a good λ
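As one concrete option, λ can be chosen by leave-one-out cross-validation of the Nadaraya-Watson fit; this is a hedged sketch, and the candidate grid and the name `loocv_error` are assumptions:

```python
import numpy as np

def loocv_error(lam, x, y):
    """Leave-one-out CV error of a Nadaraya-Watson smoother with bandwidth lam."""
    err = 0.0
    for i in range(len(x)):
        w = np.maximum(0.0, 0.75 * (1 - ((x - x[i]) / lam) ** 2))  # Epanechnikov
        w[i] = 0.0                      # leave the i-th observation out
        if w.sum() == 0:                # window too narrow around x_i
            return np.inf
        err += (y[i] - np.sum(w * y) / np.sum(w)) ** 2
    return err / len(x)

# pick the candidate bandwidth with the smallest LOOCV error
# best_lam = min(np.linspace(0.05, 0.5, 10), key=lambda lam: loocv_error(lam, x, y))
```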

  15. Local Regression in R^p

  16. Local Regression in R^p • Boundary effects are more serious, since a larger fraction of the points lies near the boundary • Less useful in high dimensions: it is hard to keep neighborhoods both local and well populated • Difficult to visualize the fit beyond two or three dimensions

  17. Data Visualization

  18. Structured Local Regression Models in R^p • When p/n is large, local regression is not helpful unless we use some structural information.

  19. Structured Local Regression Models in R^p • Structured regression functions • One-dimensional local regression at each stage

  20. Varying Coefficient Models • Divide the p predictors in X into a set (X1, ..., Xq), with q < p, and collect the remaining predictors in the vector Z • We assume the conditionally linear model f(X) = α(Z) + β1(Z) X1 + ... + βq(Z) Xq, whose coefficients vary with Z
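A minimal sketch of how such a model can be fit pointwise: at each target value z0, weight the observations with a kernel in Z and run ordinary weighted least squares. The single-predictor setup (q = 1) and names like `varying_coef_fit` are assumptions for illustration:

```python
import numpy as np

def varying_coef_fit(z0, z, x1, y, lam=0.2):
    """Estimate (alpha(z0), beta1(z0)) in f(X) = alpha(Z) + beta1(Z) * X1
    by least squares with Epanechnikov weights in Z centred at z0."""
    w = np.maximum(0.0, 0.75 * (1 - ((z - z0) / lam) ** 2))  # kernel weights in Z
    B = np.column_stack([np.ones_like(x1), x1])              # local linear model in X1
    sw = np.sqrt(w)
    alpha, beta1 = np.linalg.lstsq(sw[:, None] * B, sw * y, rcond=None)[0]
    return alpha, beta1
```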

  21. Varying Coefficient Models: an example

  22. Local Likelihood and Other Models • From global to local: any parametric model fit by maximum likelihood can be made local by weighting the log-likelihood contributions, l(β(x0)) = Σ_i K_λ(x0, x_i) l(y_i, x_iᵀβ(x0)) • Example: the local logistic regression model
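One way to realize local logistic regression in practice is to pass the kernel weights as observation weights to an off-the-shelf logistic fit; this sketch uses scikit-learn's LogisticRegression with sample_weight, and the large C value (to approximate an unpenalized local MLE) and the function name are assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def local_logistic_prob(x0, X, y, lam=0.5):
    """P(Y = 1 | x0) from a logistic model fit with weights K_lam(x0, x_i)."""
    t = np.linalg.norm(X - x0, axis=1) / lam
    w = np.maximum(0.0, 0.75 * (1 - t**2))   # Epanechnikov weights
    clf = LogisticRegression(C=1e6)          # large C: nearly unpenalized MLE
    clf.fit(X, y, sample_weight=w)           # lam must be wide enough that both
    return clf.predict_proba(x0.reshape(1, -1))[0, 1]   # classes get positive weight
```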

  23. Kernel Density Estimation and Classification • Kernel density estimation • The raw local count of observations near x0 (a histogram-style estimate) is bumpy; the smooth Parzen version is f̂(x0) = (1 / (Nλ)) Σ_i K_λ(x0, x_i), often with a Gaussian kernel
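A direct numpy sketch of the Gaussian (Parzen) kernel density estimate at a single point; the function name and default bandwidth are illustrative:

```python
import numpy as np

def kde(x0, x, lam=0.2):
    """Parzen estimate f_hat(x0) = (1 / (N * lam)) * sum_i phi((x0 - x_i) / lam),
    where phi is the standard Gaussian density."""
    t = (x0 - x) / lam
    phi = np.exp(-0.5 * t**2) / np.sqrt(2 * np.pi)
    return phi.sum() / (len(x) * lam)
```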

  24. Kernel Density Estimation and Classification

  25. Naïve Bayes Classifier • Assumption: given the class G = j, the features are independent, so f_j(X) = ∏_k f_jk(X_k) • Each one-dimensional marginal f_jk can be estimated with a kernel density estimate
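A minimal sketch of this idea: combine class priors with per-feature one-dimensional Gaussian kernel density estimates; all names are illustrative, and no attempt is made to work on the log scale:

```python
import numpy as np

def naive_bayes_kde_predict(x0, X, g, lam=0.3):
    """Classify x0 assuming within-class feature independence:
    posterior(j) is proportional to prior(j) * prod_k f_hat_jk(x0[k])."""
    classes = np.unique(g)
    scores = []
    for j in classes:
        Xj = X[g == j]                              # rows of class j
        prior = len(Xj) / len(X)
        dens = 1.0
        for k in range(X.shape[1]):
            t = (x0[k] - Xj[:, k]) / lam            # 1-D Gaussian KDE for feature k
            phi = np.exp(-0.5 * t**2) / np.sqrt(2 * np.pi)
            dens *= phi.sum() / (len(Xj) * lam)
        scores.append(prior * dens)
    return classes[int(np.argmax(scores))]
```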

  26. Mixture Models for Density Estimation and Classification • Gaussian mixture model: f(x) = Σ_{m=1}^M α_m φ(x; μ_m, Σ_m), with mixing proportions α_m summing to one • The EM algorithm is used for parameter estimation
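A compact illustration of the EM iterations for the simplest case, a two-component mixture in one dimension; the initialization and names are assumptions, and in practice a library routine (e.g. scikit-learn's GaussianMixture) would be used:

```python
import numpy as np

def gmm_em_1d(x, n_iter=100):
    """EM for f(x) = pi * N(mu1, s1^2) + (1 - pi) * N(mu2, s2^2)."""
    normal = lambda v, mu, s: np.exp(-0.5 * ((v - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))
    pi, mu1, mu2 = 0.5, x.min(), x.max()     # crude starting values
    s1 = s2 = x.std()
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each observation
        p1 = pi * normal(x, mu1, s1)
        p2 = (1 - pi) * normal(x, mu2, s2)
        gamma = p1 / (p1 + p2)
        # M-step: weighted maximum-likelihood updates
        pi = gamma.mean()
        mu1 = np.sum(gamma * x) / gamma.sum()
        mu2 = np.sum((1 - gamma) * x) / (1 - gamma).sum()
        s1 = np.sqrt(np.sum(gamma * (x - mu1) ** 2) / gamma.sum())
        s2 = np.sqrt(np.sum((1 - gamma) * (x - mu2) ** 2) / (1 - gamma).sum())
    return pi, (mu1, s1), (mu2, s2)
```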

  27. Homework • Due May 9 • ESLII (5th printing), p. 216: Exercises 6.2 and 6.12
