
Understanding Support Vector Machines: Key Concepts and Techniques

Support Vector Machines (SVMs) are powerful classifiers that aim to separate training data into distinct classes by maximizing the margin between them. When the data is linearly separable, the SVM finds the optimal hyperplane to achieve this separation. For non-linear data, a kernel function maps the data into a higher-dimensional space where it can be separated. In scenarios where perfect separation isn't achievable, a "soft margin" approach allows some training errors, striking a balance between model accuracy and generalization. Key techniques include kernel selection and parameter tuning using cross-validation.
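
As a concrete illustration of that workflow, here is a minimal sketch using scikit-learn (the dataset, parameter values, and API calls are illustrative choices, not something prescribed by the presentation): fit an RBF-kernel SVM and estimate its accuracy with cross-validation.

```python
# Minimal sketch (assumes scikit-learn): RBF-kernel SVM with cross-validated accuracy.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Scale the features (the slides note that attribute vectors should be scaled),
# then fit a soft-margin SVM with an RBF kernel.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))

# 5-fold cross-validation estimates how well this kernel/cost choice generalizes.
scores = cross_val_score(model, X, y, cv=5)
print("mean cross-validated accuracy:", scores.mean())
```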


Presentation Transcript


  1. SVMs in a Nutshell

  2. What is an SVM? • Support Vector Machine • More accurately called support vector classifier • Separates training data into two classes so that they are maximally apart

  3. Simpler version • Suppose the data is linearly separable • Then we could draw a line between the two classes

  4. Simpler version • But what is the best line? In SVM, we’ll use the maximum margin hyperplane

  5. Maximum Margin Hyperplane
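
As a sketch of slides 3-5 (assuming scikit-learn, which is not mentioned in the presentation), the code below fits a linear SVM to a small linearly separable dataset and reads off the maximum-margin hyperplane it finds. Note that scikit-learn writes the hyperplane as w·x + b = 0, so the sign convention for b differs from the slides.

```python
# Sketch only: a linear SVM on linearly separable 2-D data.
import numpy as np
from sklearn.svm import SVC

X = np.array([[2.0, 2.0], [3.0, 1.0], [3.0, 3.0],         # class +1
              [-2.0, -2.0], [-3.0, -1.0], [-3.0, -3.0]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

# A very large C approximates the hard-margin case on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]        # normal vector, perpendicular to the hyperplane
b = clf.intercept_[0]   # scikit-learn's hyperplane is w.x + b = 0
print("w =", w, "b =", b)
print("support vectors:\n", clf.support_vectors_)
```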

  6. What if it’s non-linear?

  7. Higher dimensions • SVM uses a kernel function to map the data into a different space where it can be separated
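
To make the kernel idea concrete, here is a small sketch (assuming scikit-learn and its make_circles toy dataset, neither of which comes from the slides) comparing a linear SVM with an RBF-kernel SVM on data that no straight line can separate in its original two dimensions.

```python
# Sketch: concentric circles defeat a linear boundary but not an RBF kernel.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear").fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf", gamma=2.0).fit(X_train, y_train)

print("linear kernel accuracy:", linear_svm.score(X_test, y_test))  # near chance
print("RBF kernel accuracy:   ", rbf_svm.score(X_test, y_test))     # near 1.0
```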

  8. What if it’s not separable? • Use linear separation, but allow training errors • This is called using a “soft margin” • A higher cost for errors creates a model that fits the training data more closely, but it may not generalize • The choice of parameters (kernel and cost) determines the accuracy of the SVM model • To avoid over- or under-fitting, use cross-validation to choose the parameters
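
One way to follow the slide's advice in practice is a cross-validated grid search over the cost C and the kernel parameter; the sketch below assumes scikit-learn, and the dataset and parameter grid are arbitrary examples.

```python
# Sketch: choose C (error cost) and gamma (RBF kernel width) by cross-validation.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

pipe = Pipeline([("scale", StandardScaler()), ("svm", SVC(kernel="rbf"))])
grid = {"svm__C": [0.1, 1, 10, 100], "svm__gamma": [0.001, 0.01, 0.1, 1]}

# 5-fold cross-validation guards against judging parameters on a single split.
search = GridSearchCV(pipe, grid, cv=5).fit(X, y)
print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
```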

  9. Some math • Data: {(x1, c1), (x2, c2), …, (xn, cn)} • xi is a vector of attributes/features, scaled • ci is the class of the vector (−1 or +1) • Dividing hyperplane: w·x − b = 0 • Linearly separable means there exists a hyperplane such that w·xi − b > 0 for positive examples and w·xi − b < 0 for negative examples • w points perpendicular to the hyperplane

  10. More math • Separating hyperplane: w·x − b = 0 • Margin hyperplanes through the support vectors: w·x − b = 1 and w·x − b = −1 • Distance between the margin hyperplanes is 2/||w||, so maximizing the margin means minimizing ||w||
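
The quantities on slides 9 and 10 can be checked numerically. The sketch below is a NumPy illustration (the toy data and the hand-chosen w and b are not from the slides): points are classified by the sign of w·x − b, the margin width equals 2/||w||, and the support vectors lie on w·x − b = ±1.

```python
import numpy as np

# Toy separable data; classes are [+1, +1, -1, -1].
# For these points the maximum-margin hyperplane has w = (0.25, 0.25), b = 0,
# with support vectors (2, 2) and (-2, -2) (hand-worked for illustration).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
w = np.array([0.25, 0.25])
b = 0.0

# Decision rule from slide 9: the class is the sign of w.x - b.
print("predicted classes:", np.sign(X @ w - b))         # [ 1.  1. -1. -1.]

# Margin width from slide 10: distance between w.x - b = 1 and w.x - b = -1.
print("margin width 2/||w|| =", 2 / np.linalg.norm(w))  # 4*sqrt(2), about 5.657

# The support vectors sit exactly on the hyperplanes w.x - b = +1 and -1.
print("values at support vectors:", X[[0, 2]] @ w - b)  # [ 1. -1.]
```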

  11. More math • For all i, either w·xi − b ≥ 1 or w·xi − b ≤ −1 • Can be rewritten: ci(w·xi − b) ≥ 1 • Minimize (1/2)||w||² subject to ci(w·xi − b) ≥ 1 • This is a quadratic programming problem and can be solved in polynomial time
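
Slide 11's quadratic program can be handed to a general-purpose constrained optimizer to recover the same hyperplane on the toy data above. The sketch below uses SciPy's minimize (the solver choice is an assumption; the slides don't name one): minimize (1/2)||w||² subject to ci(w·xi − b) ≥ 1.

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data with labels c_i in {-1, +1}.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
c = np.array([1.0, 1.0, -1.0, -1.0])

# Decision variables packed as v = [w1, w2, b].
def objective(v):
    w = v[:2]
    return 0.5 * np.dot(w, w)          # (1/2)||w||^2

# One inequality per training point: c_i (w.x_i - b) - 1 >= 0.
constraints = [
    {"type": "ineq", "fun": lambda v, i=i: c[i] * (X[i] @ v[:2] - v[2]) - 1.0}
    for i in range(len(X))
]

res = minimize(objective, x0=np.zeros(3), constraints=constraints)
w, b = res.x[:2], res.x[2]
print("w =", w, "b =", b)                       # roughly [0.25, 0.25] and 0.0
print("margin width =", 2 / np.linalg.norm(w))  # roughly 5.657
```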

  12. A few more details • So far, we assumed the data is linearly separable • To get to higher dimensions, use a kernel function instead of the dot product; it may be a nonlinear transform • The Radial Basis Function is a commonly used kernel: k(x, x') = exp(−γ||x − x'||²) [need to choose γ] • So far, no errors allowed; the soft margin adds slack variables ξi: • Minimize (1/2)||w||² + C Σi ξi • Subject to ci(w·xi − b) ≥ 1 − ξi • C is the error penalty
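
As a quick sanity check on the RBF formula, the sketch below evaluates k(x, x') = exp(−γ||x − x'||²) directly in NumPy and compares it with scikit-learn's rbf_kernel; the points and the value of γ are arbitrary.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

gamma = 0.5
x = np.array([[1.0, 2.0]])
x_prime = np.array([[2.0, 0.0]])

# RBF kernel by hand: k(x, x') = exp(-gamma * ||x - x'||^2).
by_hand = np.exp(-gamma * np.sum((x - x_prime) ** 2))

# The same value via scikit-learn.
via_sklearn = rbf_kernel(x, x_prime, gamma=gamma)[0, 0]

print(by_hand, via_sklearn)   # both equal exp(-0.5 * 5), about 0.0821
```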
