
Linear Models for Classification


Presentation Transcript


  1. Linear Models for Classification (Chapter 4)

  2. Decision Theory • Inference step: use training data to learn a model for the posterior probabilities p(Ck|x). • Decision step: for a given x, use p(Ck|x) to make optimal class assignments.

  3. Decision Theory • Non-probabilistic method: discriminant functions • Generative models: model p(x|Ck) and p(Ck), then obtain p(Ck|x) via Bayes’ theorem • Discriminative models: model p(Ck|x) directly

  4. 4.1 Discriminant Function • A linear discriminant takes the form y(x) = wᵀx + w0. • w determines the orientation of the decision surface. • The bias parameter w0 determines the location of the decision surface.
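
A minimal sketch of these two roles, with illustrative (not learned) values for w and w0:

```python
import numpy as np

# Two-class linear discriminant y(x) = w^T x + w0.
w = np.array([2.0, 1.0])   # orientation: w is normal to the decision surface
w0 = -1.0                  # location: shifts the surface away from the origin

def discriminant(x, w, w0):
    """Return y(x) = w^T x + w0; the sign decides the class."""
    return w @ x + w0

x = np.array([0.5, 0.5])
y = discriminant(x, w, w0)
print("class C1" if y > 0 else "class C2")
# The signed distance of x from the decision surface is y(x) / ||w||.
print("signed distance:", y / np.linalg.norm(w))
```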

  5. Multiple classes • One-versus-the-rest classifier: if yk(x) > 0, assign x to class Ck; otherwise x is not in class Ck. • Combining several such two-class discriminants leaves regions of input space that are ambiguously classified.

  6. Multiple classes • A single K-class discriminant: assign a point x to class Ck if yk(x) > yj(x) for all j ≠ k (see the sketch below). • The decision regions of such a discriminant are always singly connected and convex.
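
A sketch of the single K-class discriminant with one linear function per class; W and w0 below are illustrative random values rather than fitted parameters:

```python
import numpy as np

# y_k(x) = w_k^T x + w_k0; assign x to the class with the largest y_k(x).
K, D = 3, 2
rng = np.random.default_rng(0)
W = rng.normal(size=(K, D))   # one weight vector per class
w0 = rng.normal(size=K)       # one bias per class

def classify(x, W, w0):
    y = W @ x + w0            # all K discriminant values at once
    return int(np.argmax(y))  # Ck with yk(x) > yj(x) for all j != k

print(classify(np.array([1.0, -0.5]), W, w0))
```

Each region is the intersection of the linear inequalities yk(x) > yj(x), which is why the decision regions come out connected and convex.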

  7. Least squares for classification • The sum-of-squares error function: E_D(W) = (1/2) Tr{(XW − T)ᵀ(XW − T)}, where the rows of X are the augmented inputs (1, xnᵀ) and the rows of T are the 1-of-K target vectors. • Minimizing gives the closed-form solution W = (XᵀX)⁻¹XᵀT.

  8. Least squares for classification • The least-squares solution lacks robustness to outliers, and its outputs are not constrained to (0, 1), so they cannot be interpreted as probabilities.
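
A short sketch of least-squares classification on synthetic data (the three cluster means are arbitrary illustrative choices):

```python
import numpy as np

# One-hot target matrix T, augmented inputs [1, x], closed-form
# W = (X^T X)^{-1} X^T T obtained via lstsq.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.5, size=(20, 2)) for m in ([0, 0], [2, 2], [4, 0])])
labels = np.repeat(np.arange(3), 20)
T = np.eye(3)[labels]                             # 1-of-K target coding

X_aug = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend bias column
W, *_ = np.linalg.lstsq(X_aug, T, rcond=None)     # minimizes sum-of-squares

pred = np.argmax(X_aug @ W, axis=1)               # decide by the largest output
print("training accuracy:", (pred == labels).mean())
```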

  9. Fisher’s linear discriminant • Adjust w to maximize the class separation of the projected data y = wᵀx. • Fisher criterion: J(w) = (wᵀS_B w) / (wᵀS_W w) • Between-class covariance matrix: S_B = (m2 − m1)(m2 − m1)ᵀ • Total within-class covariance matrix: S_W = Σ_{n∈C1} (xn − m1)(xn − m1)ᵀ + Σ_{n∈C2} (xn − m2)(xn − m2)ᵀ

  10. Fisher’s linear discriminant • Maximizing J(w) gives w ∝ S_W⁻¹(m2 − m1). • If the within-class covariance is isotropic, w is proportional to the difference of the class means.
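
A minimal two-class Fisher discriminant on synthetic data (class means and sample sizes are illustrative):

```python
import numpy as np

# Fisher's linear discriminant: w ∝ S_W^{-1} (m2 - m1),
# with S_W the total within-class covariance matrix.
rng = np.random.default_rng(2)
X1 = rng.normal([0, 0], 1.0, size=(50, 2))   # class C1 samples (rows)
X2 = rng.normal([3, 1], 1.0, size=(50, 2))   # class C2 samples

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)

w = np.linalg.solve(S_W, m2 - m1)   # maximizes the Fisher criterion J(w)
w /= np.linalg.norm(w)              # only the direction of w matters

# Projected class means are well separated along w:
print((X2 @ w).mean() - (X1 @ w).mean())
```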

  11. Relation to least squares • Least squares: make the model predictions as close as possible to a set of target values. • Fisher criterion: require maximum class separation in the projected output space. • For the two-class problem, the Fisher criterion can be obtained as a special case of least squares with a particular choice of target values.

  12. Fisher’s discriminant for multiple classes • We first note from (4.46) that SB is composed of the sum of K matrices, each of which is an outer product of two vectors and therefore of rank 1. • In addition, only (K −1) of these matrices are independent as a result of the constraint (4.44). Thus, SB has rank at most equal to (K −1) and so there are at most (K −1) nonzero eigenvalues. • This shows that the projection onto the (K − 1)-dimensional subspace spanned by the eigenvectors of SB does not alter the value of J(w), and so we are therefore unable to find more than (K − 1) linear ‘features’ by this means.
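
A quick numerical check of this rank argument; the class counts and means below are arbitrary illustrative values:

```python
import numpy as np

# S_B is a sum of K rank-1 outer products constrained by the global mean,
# so rank(S_B) <= K - 1.
rng = np.random.default_rng(3)
K, D = 4, 6
N_k = np.array([30, 50, 20, 40])                     # points per class
means = rng.normal(size=(K, D))                      # class means m_k
m = (N_k[:, None] * means).sum(axis=0) / N_k.sum()   # global mean

S_B = sum(N_k[k] * np.outer(means[k] - m, means[k] - m) for k in range(K))
print(np.linalg.matrix_rank(S_B))                    # at most K - 1 = 3
```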

  13. The perceptron algorithm • The perceptron function: y(x) = f(wᵀφ(x)) • Nonlinear activation function: f(a) = +1 for a ≥ 0, −1 for a < 0 • The perceptron criterion (error function): E_P(w) = −Σ_{n∈M} wᵀφn tn, where M is the set of misclassified patterns and tn ∈ {−1, +1}.

  14. The perceptron algorithm • Stochastic gradient descent on E_P gives the update w(τ+1) = w(τ) + η φn tn, applied whenever pattern n is misclassified. • If the training data are linearly separable, the algorithm is guaranteed to converge in a finite number of steps.
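
A sketch of this update rule on linearly separable synthetic data, assuming identity basis functions with an absorbed bias (φ(x) = (1, xᵀ)):

```python
import numpy as np

# Perceptron learning: whenever w^T phi_n has the wrong sign w.r.t.
# t_n in {-1, +1}, apply w <- w + eta * phi_n * t_n.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal([0, 0], 0.5, (30, 2)), rng.normal([2, 2], 0.5, (30, 2))])
t = np.array([-1] * 30 + [1] * 30)
Phi = np.hstack([np.ones((60, 1)), X])       # bias absorbed into phi(x)

w, eta = np.zeros(3), 1.0
for epoch in range(100):
    errors = 0
    for phi_n, t_n in zip(Phi, t):
        if np.sign(w @ phi_n) * t_n <= 0:    # misclassified (or on boundary)
            w += eta * phi_n * t_n           # perceptron update
            errors += 1
    if errors == 0:                          # converged: no misclassifications
        break
print(w, "epochs:", epoch + 1)
```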

  15. 4.2 Probabilistic Generative Models • Model the class-conditional densities p(x|Ck) and the class priors p(Ck). • Use Bayes’ theorem to compute the posterior probabilities p(Ck|x).

  16. Logistic Sigmoid Function • σ(a) = 1 / (1 + exp(−a)) • For two classes, p(C1|x) = σ(a) with a = ln [p(x|C1)p(C1) / (p(x|C2)p(C2))].

  17. 4.2 Probabilistic Generative Models • Bayes’ theorem for K > 2 classes takes the form of a normalized exponential (softmax): p(Ck|x) = exp(ak) / Σj exp(aj), with ak = ln [p(x|Ck)p(Ck)]. • If ak ≫ aj for all j ≠ k, then p(Ck|x) ≈ 1, and p(Cj|x) ≈ 0.
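
A small sketch of this softmax posterior; log_joint stands for the K values ak = ln p(x|Ck)p(Ck), and the numbers are illustrative:

```python
import numpy as np

def posterior(log_joint):
    a = log_joint - log_joint.max()   # subtract the max for numerical stability
    e = np.exp(a)
    return e / e.sum()                # p(Ck|x) = exp(a_k) / sum_j exp(a_j)

print(posterior(np.array([2.0, -1.0, 0.5])))
# If a_k >> a_j for all j != k, the posterior saturates: p(Ck|x) ~ 1.
print(posterior(np.array([50.0, -1.0, 0.5])))
```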

  18. Continuous inputs • Model the class-conditional densities as Gaussians: p(x|Ck) = N(x | μk, Σ). Here, all classes share the same covariance matrix Σ. • Two-class problem: the posterior becomes p(C1|x) = σ(wᵀx + w0).

  19. Two-class problem • p(C1|x) = σ(wᵀx + w0) with w = Σ⁻¹(μ1 − μ2) and w0 = −(1/2) μ1ᵀΣ⁻¹μ1 + (1/2) μ2ᵀΣ⁻¹μ2 + ln [p(C1)/p(C2)]. • The quadratic terms in x cancel because the covariance is shared, so the decision boundary is linear in x.
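
A sketch of this posterior computation; the means, shared covariance, and prior below are illustrative values, not fitted:

```python
import numpy as np

# Two-class Gaussian generative model with shared covariance Sigma:
# p(C1|x) = sigmoid(w^T x + w0).
mu1, mu2 = np.array([1.0, 1.0]), np.array([-1.0, 0.0])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
p1 = 0.6                                     # prior p(C1); p(C2) = 1 - p1

Si = np.linalg.inv(Sigma)
w = Si @ (mu1 - mu2)
w0 = -0.5 * mu1 @ Si @ mu1 + 0.5 * mu2 @ Si @ mu2 + np.log(p1 / (1 - p1))

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

x = np.array([0.2, 0.5])
print("p(C1|x) =", sigmoid(w @ x + w0))      # linear in x inside the sigmoid
```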

  20. K classes • ak(x) = wkᵀx + wk0 with wk = Σ⁻¹μk and wk0 = −(1/2) μkᵀΣ⁻¹μk + ln p(Ck). • With the same covariance matrix the decision boundaries are linear; otherwise they are quadratic.

  21. Maximum likelihood solution • The likelihood function (two classes, with tn = 1 denoting C1 and π = p(C1)): p(t | π, μ1, μ2, Σ) = Πn [π N(xn|μ1, Σ)]^tn [(1 − π) N(xn|μ2, Σ)]^(1−tn) • Note that the approach of fitting Gaussian distributions to the classes is not robust to outliers, because maximum likelihood estimation of a Gaussian is not robust.
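
A sketch of the closed-form ML estimates (π = N1/N, per-class means, shared covariance as the weighted average of the class-wise covariances); X and t are synthetic data for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
X = np.vstack([rng.normal([0, 0], 1.0, (40, 2)), rng.normal([2, 1], 1.0, (60, 2))])
t = np.array([1] * 40 + [0] * 60)            # t_n = 1 for class C1

N, N1 = len(t), t.sum()
pi = N1 / N                                  # ML estimate of p(C1)
mu1 = X[t == 1].mean(axis=0)                 # ML class means
mu2 = X[t == 0].mean(axis=0)

S1 = np.cov(X[t == 1].T, bias=True)          # per-class covariances (1/N norm)
S2 = np.cov(X[t == 0].T, bias=True)
Sigma = (N1 / N) * S1 + ((N - N1) / N) * S2  # shared covariance estimate
print(pi, mu1, mu2, Sigma, sep="\n")
```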

  22. Discrete features • Assume binary features xi ∈ {0, 1} that are conditionally independent given the class (naive Bayes). • The class-conditional distributions: p(x|Ck) = Π_{i=1}^D μki^xi (1 − μki)^(1−xi) • Substituting into ak = ln p(x|Ck)p(Ck) gives linear functions of the input values: ak(x) = Σ_i [xi ln μki + (1 − xi) ln(1 − μki)] + ln p(Ck)
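
A sketch that makes the linearity explicit by rewriting ak(x) as W x + b; the Bernoulli parameters mu and the priors are illustrative values:

```python
import numpy as np

mu = np.array([[0.8, 0.1, 0.6],              # class C1: p(x_i = 1 | C1)
               [0.2, 0.7, 0.5]])             # class C2: p(x_i = 1 | C2)
log_prior = np.log(np.array([0.5, 0.5]))

def a(x, mu, log_prior):
    # a_k(x) = sum_i [x_i ln(mu_ki) + (1 - x_i) ln(1 - mu_ki)] + ln p(Ck),
    # rearranged into the linear form a_k = W x + b.
    W = np.log(mu) - np.log(1 - mu)          # coefficients of x_i
    b = np.log(1 - mu).sum(axis=1) + log_prior
    return W @ x + b

x = np.array([1, 0, 1])
ak = a(x, mu, log_prior)
print("posterior:", np.exp(ak) / np.exp(ak).sum())
```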

  23. Exponential family • More generally, if the class-conditional densities are members of the exponential family, the activation ak(x) is again a linear function of x (see the derivation sketch below).
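
A short derivation sketch, assuming the PRML form of the exponential family with u(x) = x and the scale parameter omitted:

```latex
p(\mathbf{x} \mid \boldsymbol{\lambda}_k)
  = h(\mathbf{x})\, g(\boldsymbol{\lambda}_k)
    \exp\{\boldsymbol{\lambda}_k^{\mathsf{T}} \mathbf{x}\}
\quad\Longrightarrow\quad
a_k(\mathbf{x})
  = \boldsymbol{\lambda}_k^{\mathsf{T}} \mathbf{x}
    + \ln g(\boldsymbol{\lambda}_k) + \ln p(C_k) + \ln h(\mathbf{x})
```

The ln h(x) term is common to every class and cancels in the softmax normalization, so ak(x) is effectively linear in x.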
