This comprehensive overview focuses on the concepts of overfitting and regularization in machine learning, elaborating on their implications through practical examples. It discusses the bias-variance dilemma, explaining how overfitting can be identified and avoided using evaluation techniques on polynomial models. The role of weight decay in optimizing weights for complex models with limited data is illustrated, emphasizing the importance of validation sets. Additionally, a dataset generation exercise is proposed to fit a fourth-degree polynomial with varying regularization parameters.
Overfitting and Regularization (Chapters 11 and 12 on amlbook.com)
Over-fitting is easy to recognize in 1D: fitting a 4th-order hypothesis to 5 data points drawn from a parabolic target function gives Ein = 0.
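A minimal sketch of this slide's example (the noise level and random seed are my own choices, not from the lecture): with 5 points and 5 polynomial coefficients, the fit interpolates the data exactly, so the in-sample error vanishes even though the fit also captures the noise.

```python
# Sketch: a 4th-order polynomial fit to 5 points from a parabolic target gives Ein = 0.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 5)                 # 5 data points
y = x**2 + rng.normal(0, 0.1, 5)          # noisy parabolic target (assumed noise level)

coeffs = np.polyfit(x, y, deg=4)          # 5 coefficients interpolate 5 points exactly
E_in = np.mean((np.polyval(coeffs, x) - y) ** 2)
print(E_in)                               # ~0 up to round-off
```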
The origin of over-fitting can be analyzed in 1D: the bias/variance dilemma.
Over-fitting is easy to avoid in 1D: results from HW2. [Plot: sum of squared deviations (Ein and Eval) vs. degree of polynomial.]
Using Eval to avoid over-fitting works in all dimensions, but the computation grows rapidly for large d. [Plot: Ein, Ecv, and Eval as terms in F5(x) are added successively, d = 2.] The validation set needs to be large; does this compromise training? (See the sketch below.)
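An illustrative sketch of model selection with a validation set (this is not the HW2 code; the target, noise level, and split sizes are assumptions): train each candidate degree on the training subset, then compare Ein and Eval.

```python
# Sketch: pick the polynomial degree by validation error rather than training error.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 60)
y = x**2 + rng.normal(0, 0.2, x.size)      # assumed parabolic target with noise

x_tr, y_tr = x[:40], y[:40]                # training set
x_va, y_va = x[40:], y[40:]                # validation set (needs to be large enough)

for degree in range(1, 9):
    w = np.polyfit(x_tr, y_tr, degree)
    e_in  = np.mean((np.polyval(w, x_tr) - y_tr) ** 2)
    e_val = np.mean((np.polyval(w, x_va) - y_va) ** 2)
    print(degree, e_in, e_val)             # Ein keeps falling; Eval turns up when over-fitting starts
```

The trade-off the slide raises is visible here: points held out for validation are not available for training, which matters when data are scarce.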
What if we want to add higher-order terms to a linear model but don't have enough data for a validation set? Solution: augment the error function used to optimize the weights, for example by adding a penalty term λ wᵀw that penalizes choices with large |w|. This is called "weight decay".
The normal equations with weight decay are essentially unchanged: (ZᵀZ + λI) w_reg = Zᵀy.
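A sketch of the regularized normal equations, assuming Z is the matrix of polynomial features (a column of ones, then x, x², ...) built from the data; the helper name is my own:

```python
# Sketch: solve (Z^T Z + lambda*I) w_reg = Z^T y for polynomial features.
import numpy as np

def fit_weight_decay(x, y, degree, lam):
    """Regularized least squares (weight decay) for a 1D polynomial model."""
    Z = np.vander(x, degree + 1, increasing=True)   # columns 1, x, x^2, ..., x^degree
    A = Z.T @ Z + lam * np.eye(degree + 1)
    return np.linalg.solve(A, Z.T @ y)

# usage: w_reg = fit_weight_decay(x, y, degree=4, lam=1e-4)
```

Setting lam = 0 recovers the ordinary least-squares solution; increasing it shrinks the weights and suppresses the wild swings of an over-fitted polynomial.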
The best value of λ is subjective. In this case λ = 0.0001 is large enough to suppress the wild swings while the data remain the dominant factor in determining the optimum weights.
Assignment 8 (due 11-13-14): Generate an in silico dataset y(x) = 1 + 9x² + N(0,1) with 5 randomly selected values of x between -1 and +1. Fit a 4th-degree polynomial to the data with and without regularization by choosing λ = 0, 0.0001, 0.001, 0.01, 1.0, and 10. Display the results as in slide 8 of the lecture on regularization.
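One possible sketch of the assignment (the random seed and plotting details are my own, and the figure will not exactly reproduce slide 8): generate the in silico data, then fit the 4th-degree polynomial once per λ.

```python
# Sketch: Assignment 8 - fit a 4th-degree polynomial with weight decay for several lambdas.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
x = rng.uniform(-1, 1, 5)                        # 5 random x values in [-1, +1]
y = 1 + 9 * x**2 + rng.normal(0, 1, x.size)      # y(x) = 1 + 9x^2 + N(0,1)

def fit_weight_decay(x, y, degree, lam):
    Z = np.vander(x, degree + 1, increasing=True)
    return np.linalg.solve(Z.T @ Z + lam * np.eye(degree + 1), Z.T @ y)

x_plot = np.linspace(-1, 1, 200)
for lam in [0, 0.0001, 0.001, 0.01, 1.0, 10]:
    w = fit_weight_decay(x, y, degree=4, lam=lam)
    Z_plot = np.vander(x_plot, 5, increasing=True)
    plt.plot(x_plot, Z_plot @ w, label=f"lambda = {lam}")

plt.scatter(x, y, color="black", zorder=3)       # the 5 noisy data points
plt.plot(x_plot, 1 + 9 * x_plot**2, "k--", label="target")
plt.legend()
plt.show()
```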