1 / 23

Topics in Algorithms 2007

Topics in Algorithms 2007. Ramesh Hariharan. Support Vector Machines. Machine Learning. How do learn good separators for 2 classes of points? Seperator could be linear or non-linear Maximize margin of separation. Support Vector Machines. Hyperplane w. x. |w| = 1

carson-hart
Télécharger la présentation

Topics in Algorithms 2007

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Topics in Algorithms 2007 Ramesh Hariharan

  2. Support Vector Machines

  3. Machine Learning • How do learn good separators for 2 classes of points? • Seperator could be linear or non-linear • Maximize margin of separation

  4. Support Vector Machines • Hyperplane w x |w| = 1 For all x on the hyperplane w.x = |w||x| cos(ø)= |x|cos (ø) = constant = -b w.x+b=0 w ø -b

  5. Support Vector Machines • Margin of separation |w| = 1 x Є Blue: wx+b >= Δ x Є Red: wx+b <= -Δ maximize 2 Δ w,b,Δ w wx+b=Δ wx+b=0 wx+b=-Δ

  6. Support Vector Machines • EliminateΔ by dividing by Δ |w| = 1 x Є Blue: (w/Δ) x + (b/Δ) >= 1 x Є Red: (w/Δ) x + (b/Δ) <= -1 w’=w/Δ b’=b/Δ |w’|=|w|/Δ=1/Δ w wx+b=Δ wx+b=0 wx+b=-Δ

  7. Support Vector Machines • Perfect Separation Formulation x Є Blue: w’x+b’ >= 1 x Є Red: w’x+b’ <= -1 minimize |w’|/2 w’,b’ minimize (w’.w’)/2 w’,b’ w wx+b=Δ wx+b=0 wx+b=-Δ

  8. Support Vector Machines • Formulation allowing for misclassification xiЄ Blue: wxi + b >= 1-ξi xiЄ Red: -(wxi + b) >= 1-ξi ξi >= 0 minimize (w.w)/2 + C Σξi w,b,ξi x Є Blue: wx+b >= 1 x Є Red: -(wx+b) >= 1 minimize (w.w)/2 w,b

  9. Support Vector Machines • Duality yi (wxi + b) + ξi >= 1 ξi >= 0 yi=+/-1, class label minimize (w.w)/2 + C Σξi w,b,ξi Σ λi yi = 0 λi >= 0 -λi >= C max Σλi – ( ΣiΣj λiλjyiyj (xi.xj) )/2 λi Primal Dual

  10. Support Vector Machines • Duality (Primal  Lagrangian  Dual) • If Primal is feasible then Primal=Lagrangian Primal yi (wxi + b) + ξi >= 1 ξi >= 0 yi=+/-1, class label min (w.w)/2 + C Σξi w,b,ξi • min max • w,b,ξi λi, αi >=0 • (w.w)/2 + C Σξi • Σi λi (yi (wxi + b) + ξi - 1) • - Σi αi (ξi - 0) = Primal Lagrangian Primal

  11. Support Vector Machines • Lagrangian Primal  Lagrangian Dual • Langrangian Primal >= Lagrangian Dual min max w,b,ξi λi, αi >=0 (w.w)/2+ C Σξi -Σiλi(yi (wxi+b)+ξi-1) -Σiαi(ξi -0) max min λi, αi >=0 w,b,ξi (w.w)/2+ C Σξi -Σiλi(yi (wxi+b)+ξi-1)-Σiαi(ξi -0) >= Lagrangian Primal Lagrangian Dual

  12. Support Vector Machines • Lagrangian Primal >= Lagrangian Dual • Proof Consider a 2d matrix Find max in each row Find the smallest of these values Find min in each column Find the largest of these values LP LD

  13. Support Vector Machines • Can Lagrangian Primal = Lagrangian Dual ? • Proof Consider w* b* ξ* optimal for primal Find λi, αi>=0 such that minimizing over w,b,ξ gives w* b* ξ* Σiλi(yi (w*xi+b*)+ξi* -1)=0 Σiαi(ξi*-0)=0 max min λi, αi >=0 w,b,ξi (w.w)/2+ C Σξi -Σiλi(yi (wxi+b)+ξi-1)-Σiαi(ξi -0)

  14. Support Vector Machines • Can Lagrangian Primal = Lagrangian Dual ? • Proof Consider w* b* ξi* optimal for primal Find λi, αi >=0 such that Σiλi(yi (w*xi+b*)+ξi* -1)=0 Σiαi(ξi*-0)=0 ξi* > 0 implies αi=0 yi (w*xi+b*)+ξi* -1 !=0impliesλi=0 max min λi, αi >=0 w,b,ξi (w.w)/2+ C Σξi -Σiλi(yi (wxi+b)+ξi-1)-Σiαi(ξi -0)

  15. Support Vector Machines • Can Lagrangian Primal = Lagrangian Dual ? • Proof Consider w* b* ξi* optimal for primal Find λi, αi >=0 such that minimizing over w,b,ξi gives w*,b*, ξi* at w*,b*,ξi* δ/ δwj = 0, δ/ δξi = 0, δ/ δb = 0 and second derivatives should be non-neg at all places max min λi, αi >=0 w,b,ξi (w.w)/2+ C Σξi -Σiλi(yi (wxi+b)+ξi-1)-Σiαi(ξi -0)

  16. Support Vector Machines • Can Lagrangian Primal = Lagrangian Dual ? • Proof Consider w* b* ξi* optimal for primal Find λi, αi >=0 such that minimizing over w,b gives w*,b* w* - Σiλiyi xi = 0 -Σiλi yi = 0 -λi - αi +C = 0 second derivatives are always non-neg max min λi, αi >=0 w,b,ξi (w.w)/2+ C Σξi -Σiλi(yi (wxi+b)+ξi-1)-Σiαi(ξi -0)

  17. Support Vector Machines • Can Lagrangian Primal = Lagrangian Dual ? • Proof Consider w* b* ξi* optimal for primal Find λi, αi >=0 such that ξi* > 0 implies αi=0 yi (w*xi+b*)+ξi* -1 !=0impliesλi=0 w* - Σiλiyi xi = 0 -Σiλi yi = 0 - λi - αi + C = 0 Such a λi, αi >=0 always exists!!!!! max min λi, αi >=0 w,b,ξi (w.w)/2+ C Σξi -Σiλi(yi (wxi+b)+ξi-1)-Σiαi(ξi -0)

  18. Support Vector Machines • Proof that appropriate Lagrange Multipliers always exist? • Roll all primal variables into w lagrange multipliers into λ max min f(w) – λ (Xw-y) λ>=0 w min max f(w) – λ (Xw-y) w λ>=0 min f(w) w Xw >= y

  19. Support Vector Machines • Proof that appropriate Lagrange Multipliers always exist? λ 0 >=0 = Claim: This is satisfiable = Grad(f) at w* = >= > w* λ>=0 X= y X

  20. Support Vector Machines • Proof that appropriate Lagrange Multipliers always exist? Claim: This is satisfiable Grad(f) = Grad(f) Grad(f) λ>=0 X= Row vectors of X=

  21. Support Vector Machines • Proof that appropriate Lagrange Multipliers always exist? Grad(f) Claim: This is satisfiable Grad(f) = Row vectors of X= λ>=0 X= h X= h >=0, Grad(f) h < 0 w*+h is feasible and f(w*+h)<f(w*) for small enough h

  22. Support Vector Machines • Finally the Lagrange Dual max min λi, αi >=0 w,b,ξi (w.w)/2+ C Σξi -Σiλi(yi (wxi+b)+ξi-1)-Σiαi(ξi -0) w - Σiλiyi xi = 0 -Σiλi yi = 0 -λi - αi +C = 0 Rewrite in final dual form Σ λi yi = 0 λi >= 0 -λi >= -C max Σλi – ( ΣiΣj λiλjyiyj (xi.xj) )/2 λi

  23. Support Vector Machines • Karush-Kuhn-Tucker conditions Σiλi(yi (w*xi+b*)+ξi* -1)=0 Σiαi(ξi*-0)=0-λi - αi +C = 0 If ξi*>0 αi =0 λi =C If yi (w*xi+b*)+ξi* -1>0 λi = 0 ξi* = 0 If 0 < λi <C yi (w*xi+b*)=1 Rewrite in final dual form Σ λi yi = 0 λi >= 0 -λi >= C max Σλi – ( ΣiΣj λiλjyiyj (xi.xj) )/2 λi

More Related