
An introduction to support vector machine (SVM)


Presentation Transcript


  1. An introduction to support vector machine (SVM) Advisor: Dr. Hsu Graduate: Ching-Wen Hong

  2. Outline • 1. SVM: A brief overview • 2. Simple SVM: Linear classifier for separable data • 3. Simple SVM: Linear classifier for non-separable data • 4. Conclusion

  3. SVM: A brief overview • 1-1 What is an SVM? • A family of learning algorithms for classifying objects into two classes. • Input: a training set {(x1,y1),…,(xl,yl)} of objects xi ∈ ℝⁿ (an n-dimensional vector space) and their known classes yi ∈ {-1,+1}. • Output: a classifier f: ℝⁿ → {-1,+1}, which predicts the class f(x) for any (new) object x ∈ ℝⁿ.
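
A minimal sketch of the input/output interface described on this slide, using scikit-learn's LinearSVC as one illustrative implementation (the library, the toy data, and the variable names are assumptions made for the example, not taken from the slides).

```python
# Illustrative sketch: training set in, linear classifier f : R^n -> {-1, +1} out.
import numpy as np
from sklearn.svm import LinearSVC

# Toy training set {(x_i, y_i)}: x_i in R^2, y_i in {-1, +1}.
X = np.array([[1.0, 2.0], [2.0, 3.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([+1, +1, -1, -1])

clf = LinearSVC()                  # learns a linear classifier from the training set
clf.fit(X, y)
print(clf.predict([[1.5, 2.5]]))   # predicted class f(x) for a new object x
```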

  4. 1-2 Pattern recognition example

  5. 1-3 Examples of classification tasks • Optical character recognition: x is an image, y is a character. • Text classification: x is a text, y is a category. • Medical diagnosis: x is a set of features (age, sex, blood type, genome, …), y indicates the risk.

  6. 1-4 Are there other methods for classification? • Bayesian classifier (based on maximum a posteriori probability) • Fisher linear discriminant • Neural networks • Expert system (rule-based) • Decision tree • …

  7. 1-5 Why is it gaining popularity? • Good performance in real-world applications. • Computational efficiency. • Robust in high dimensions. • No strong hypothesis on the data-generation process (contrary to the Bayesian approach).

  8. 2. Simplest SVM: Linear SVM for separable training sets • A training set S = {(x1,y1),…,(xl,yl)}, xi ∈ ℝⁿ, yi ∈ {-1,+1}. • 2-1 Linearly separable training set

  9. 2-2 Linear classifier

  10. 2-3 Which one is the best ?

  11. 2-4 How to find the optimal hyperplane? • xi·w + b ≥ +1 for yi = +1 (1) • xi·w + b ≤ -1 for yi = -1 (2) • Together, (1) and (2) are equivalent to yi(xi·w + b) - 1 ≥ 0, i = 1,…,l • w is the normal vector of the hyperplanes H1: xi·w + b = +1 and H2: xi·w + b = -1 • Margin = 2/║w║; a training point lying on H1 or H2 (marked ◎ on the slide) is a support vector.
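
The 2/║w║ value quoted above follows from the distance between the two parallel hyperplanes H1 and H2; a short derivation (not reproduced on the slide):

```latex
% Distance between the parallel hyperplanes H1: x.w + b = +1 and H2: x.w + b = -1.
\text{margin} = d(H_1, H_2)
  = \frac{|(+1 - b) - (-1 - b)|}{\lVert w \rVert}
  = \frac{2}{\lVert w \rVert}
```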

  12. 2-5 Finding the optimal hyperplane • The optimal hyperplane is defined by the pair (w,b). • Solve the constrained optimization problem • Min ½║w║² • s.t. yi(xi·w + b) - 1 ≥ 0, i = 1,…,l • This is a classic quadratic (convex) programming problem.
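
As a rough illustration of the quadratic program on this slide, the sketch below hands the primal problem to a general-purpose solver (SciPy's SLSQP). The toy data and names are assumptions made for the example; practical SVM implementations solve the dual instead, as the following slides do.

```python
# Sketch: minimize (1/2)||w||^2 subject to y_i (x_i . w + b) - 1 >= 0.
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable training set.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([+1, +1, -1, -1])

def objective(v):            # v = [w_1, w_2, b]
    w = v[:2]
    return 0.5 * np.dot(w, w)

# One inequality constraint y_i (x_i . w + b) - 1 >= 0 per training point.
constraints = [{"type": "ineq",
                "fun": lambda v, i=i: y[i] * (np.dot(X[i], v[:2]) + v[2]) - 1.0}
               for i in range(len(y))]

res = minimize(objective, x0=np.zeros(3), constraints=constraints)
w, b = res.x[:2], res.x[2]
print("w =", w, "b =", b, "margin =", 2.0 / np.linalg.norm(w))
```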

  13. 2-6 Lagrange Method
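
The equations for this slide did not survive the transcript. For reference, the standard Lagrangian of the problem on slide 12, with one multiplier αi ≥ 0 per constraint, and the stationarity conditions obtained by setting its derivatives to zero:

```latex
% Standard hard-margin Lagrangian (added here; not copied from the slide).
L_P(w, b, \alpha) = \tfrac{1}{2}\lVert w \rVert^2
    - \sum_{i=1}^{l} \alpha_i \left[\, y_i (x_i \cdot w + b) - 1 \,\right]

% Setting the derivatives with respect to w and b to zero:
\frac{\partial L_P}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{l} \alpha_i y_i x_i,
\qquad
\frac{\partial L_P}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{l} \alpha_i y_i = 0
```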

  14. 2-7 Recovering the optimal hyperplane • Once the αi, i = 1,…,l are found, we recover the (w,b) corresponding to the optimal hyperplane: w is given by w = ∑ αi yi xi, and b follows from any support vector xi, since yi(w·xi + b) = 1 gives b = yi - w·xi. The decision function is f(x) = w·x + b.
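
A sketch of this recovery step in code, using scikit-learn's SVC as the dual solver; the choice of library, the very large C (to approximate the hard margin), and the toy data are assumptions made for the illustration.

```python
# Recover w = sum_i alpha_i y_i x_i and b from a fitted dual solver,
# then evaluate the decision function f(x) = w.x + b.
import numpy as np
from sklearn.svm import SVC

X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([+1, +1, -1, -1])

svc = SVC(kernel="linear", C=1e6).fit(X, y)   # large C approximates the hard margin
# dual_coef_ stores alpha_i * y_i for the support vectors.
w = svc.dual_coef_ @ svc.support_vectors_
b = svc.intercept_

x_new = np.array([1.0, 1.5])
f_x = (w @ x_new + b)[0]
print("f(x) =", f_x, "-> class", int(np.sign(f_x)))
```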

  15. 2-8 Solving the dual problem
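
The body of this slide is also missing from the transcript. Substituting w = ∑ αi yi xi and ∑ αi yi = 0 back into the Lagrangian gives the standard dual problem, a quadratic program in α alone:

```latex
% Standard hard-margin dual (added here; not copied from the slide).
\max_{\alpha} \;\; \sum_{i=1}^{l} \alpha_i
    - \tfrac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l}
      \alpha_i \alpha_j \, y_i y_j \, x_i \cdot x_j
\quad \text{s.t.} \quad \alpha_i \ge 0, \;\; \sum_{i=1}^{l} \alpha_i y_i = 0
```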

  16. 2-9 The Karush-Kuhn-Tucker conditions • The KKT conditions are necessary and sufficient for w, b, α to be a solution; thus solving the SVM problem is equivalent to finding a solution to the KKT conditions. • From the KKT conditions we can draw the following conclusion: if αi > 0 then yi(w·xi + b) = 1 and xi is a support vector. • If all other training points (those with αi = 0) were removed and training was repeated, the same separating hyperplane would be found.
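
For reference, the KKT conditions the slide refers to, written out for the hard-margin problem (standard form, not reproduced on the slide itself):

```latex
\begin{aligned}
&\partial L_P / \partial w = 0 \;\Rightarrow\; w = \textstyle\sum_i \alpha_i y_i x_i
 && \text{(stationarity in } w\text{)} \\
&\partial L_P / \partial b = 0 \;\Rightarrow\; \textstyle\sum_i \alpha_i y_i = 0
 && \text{(stationarity in } b\text{)} \\
&y_i (x_i \cdot w + b) - 1 \ge 0 && \text{(primal feasibility)} \\
&\alpha_i \ge 0 && \text{(dual feasibility)} \\
&\alpha_i \left[\, y_i (x_i \cdot w + b) - 1 \,\right] = 0 && \text{(complementary slackness)}
\end{aligned}
```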

  17. 2-10 Examples by Pictures

  18. 3. Simplest SVM: Linear classifier for non-separable data • 3-1 Finding the optimal hyperplane • Solve the optimization problem • Min ½║w║² + C(∑ εi), where C is a penalty parameter on the slack variables εi (the slide takes it to be a very large value) • s.t. yi(xi·w + b) - 1 + εi ≥ 0, εi ≥ 0, i = 1,…,l • In the dual, each multiplier then satisfies the box constraint 0 ≤ αi ≤ C.
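
A hedged illustration of the soft-margin trade-off: the penalty C controls how strongly margin violations εi are punished. The library (scikit-learn), the random overlapping data, and the grid of C values are assumptions made for the example, not taken from the slides.

```python
# Effect of the penalty C on a non-separable (overlapping) training set.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+1.0, 1.0, size=(50, 2)),    # class +1
               rng.normal(-1.0, 1.0, size=(50, 2))])   # class -1 (overlaps class +1)
y = np.hstack([np.ones(50), -np.ones(50)])

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C:>6}: support vectors={clf.n_support_.sum()}, "
          f"train accuracy={clf.score(X, y):.2f}")
```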

  19. 3-2 Lagrange Method
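
As on slide 13, the equations here are missing from the transcript. The standard soft-margin dual differs from the separable case only in the box constraint on each αi, which is where the 0 ≤ αi ≤ C quoted on slide 18 comes from:

```latex
% Standard soft-margin dual (added here; not copied from the slide).
\max_{\alpha} \;\; \sum_{i=1}^{l} \alpha_i
    - \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j \, y_i y_j \, x_i \cdot x_j
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \;\; \sum_{i=1}^{l} \alpha_i y_i = 0
```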

  20. Simplest SVM: Conclusion • Finds the optimal hyperplane, which corresponds to the largest margin • Can be solved easily using a dual formulation • The solution is sparse: the number of support vectors can be very small compared to the size of the training set • Only support vectors are important for predicting future points; all other points can be forgotten.
