
Advanced Perceptron Learning



Presentation Transcript


  1. Advanced Perceptron Learning David Kauchak CS 451 – Fall 2013

  2. Admin Assignment 2 contest due Sunday night!

  3. Linear models A linear model in n-dimensional space (i.e. n features) is defined by n+1 weights: In two dimensions, a line: w1*f1 + w2*f2 + b = 0 In three dimensions, a plane: w1*f1 + w2*f2 + w3*f3 + b = 0 In n dimensions, a hyperplane: w1*f1 + w2*f2 + … + wn*fn + b = 0 (equivalently w1*f1 + … + wn*fn = a, where b = -a)
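The classification rule built from these weights is just the sign of the weighted sum. Below is a minimal sketch in Python (the slides don't specify a language), with a separate bias term b; the function name predict is mine, not the course's.

```python
# Minimal linear classifier: predict +1 / -1 from the sign of w . f + b.
def predict(weights, bias, features):
    """Return +1 if w1*f1 + ... + wn*fn + b > 0, else -1."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score > 0 else -1

# Example matching the next few slides: w = (1, 0), bias assumed to be 0.
print(predict([1, 0], 0, [-1, 1]))  # prints -1: the model calls the point (-1, 1) negative
```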

  4. Learning a linear classifier f2 f1 What does this model currently say? w=(1,0)

  5. Learning a linear classifier f2 f1 POSITIVE NEGATIVE w=(1,0)

  6. Learning a linear classifier f2 (-1,1) f1 Is our current guess: right or wrong? w=(1,0)

  7. Learning a linear classifier f2 (-1,1) f1 predicts negative, wrong How should we update the model? w=(1,0)

  8. A closer look at why we got it wrong (-1, 1, positive) We’d like this value to be positive since it’s a positive example. w1: contributed in the wrong direction, so decrease it: 1 -> 0. w2: could have contributed (positive feature), but didn’t, so increase it: 0 -> 1.
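To make the arithmetic concrete (assuming the bias is 0, since the slide only shows the weights): with w = (1, 0), the score for (-1, 1) is 1*(-1) + 0*1 = -1, so the model predicts negative even though the example is positive. After decreasing w1 (1 -> 0) and increasing w2 (0 -> 1), the score becomes 0*(-1) + 1*1 = 1, which is positive, i.e. correct.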

  9. Learning a linear classifier f2 (-1,1) f1 Graphically, this also makes sense! w=(0,1)

  10. Learning a linear classifier f2 f1 Is our current guess: right or wrong? (1,-1) w=(0,1)

  11. Learning a linear classifier f2 f1 predicts negative, correct (1,-1) How should we update the model? w=(0,1)

  12. Learning a linear classifier f2 f1 Already correct… don’t change it! (1,-1) w=(0,1)

  13. Learning a linear classifier f2 f1 Is our current guess: right or wrong? (-1,-1) w=(0,1)

  14. Learning a linear classifier f2 f1 predicts negative, wrong (-1,-1) How should we update the model? w=(0,1)

  15. A closer look at why we got it wrong (-1, -1, positive) We’d like this value to be positive since it’s a positive example. w1: didn’t contribute, but could have, so decrease it: 0 -> -1. w2: contributed in the wrong direction, so decrease it: 1 -> 0.
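Again assuming the bias is 0: with w = (0, 1), the score for (-1, -1) is 0*(-1) + 1*(-1) = -1, so the model predicts negative for a positive example. After the two decreases (w1: 0 -> -1, w2: 1 -> 0), the score becomes (-1)*(-1) + 0*(-1) = 1, which is now positive.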

  16. Learning a linear classifier Training data (f1, f2, label): (-1, -1, positive); (-1, 1, positive); (1, 1, negative); (1, -1, negative). Current model: w = (-1, 0)

  17. Perceptron learning algorithm repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): check if it’s correct based on the current model if not correct, update all the weights: if label positive and feature positive: increase weight (increase weight = predict more positive) if label positive and feature negative: decrease weight (decrease weight = predict more positive) if label negative and feature positive: decrease weight (decrease weight = predict more negative) if label negative and feature negative: increase weight (increase weight = predict more negative)

  18. A trick… if label positive and feature positive: increase weight (increase weight = predict more positive) if label positive and feature negative: decrease weight (decrease weight = predict more positive) if label negative and feature positive: decrease weight (decrease weight = predict more negative) if label negative and feature negative: increase weight (increase weight = predict more negative) Let positive label = 1 and negative label = -1 label * fi 1*1=1 1*-1=-1 -1*1=-1 -1*-1=1

  19. A trick… if label positive and feature positive: increase weight (increase weight = predict more positive) if label positive and feature negative: decrease weight (decrease weight = predict more positive) if label negative and feature positive: decrease weight (decrease weight = predict more negative) if label negative and feature negative: increase weight (increase weight = predict more negative) Let positive label = 1 and negative label = -1 label * fi 1*1=1 1*-1=-1 -1*1=-1 -1*-1=1

  20. Perceptron learning algorithm repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): check if it’s correct based on the current model if not correct, update all the weights: for each wi: wi = wi + fi*label b = b + label How do we check if it’s correct?

  21. Perceptron learning algorithm repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label b = b + label

  22. Perceptron learning algorithm repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label b = b + label Would this work for non-binary features, i.e. real-valued?
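Putting slides 20–22 together, here is one possible sketch of the whole training loop in Python; the function and variable names are mine, not from the course.

```python
def predict_score(weights, bias, features):
    # w1*f1 + ... + wn*fn + b; nothing here assumes the features are binary
    return sum(w * f for w, f in zip(weights, features)) + bias

def perceptron_train(examples, num_features, max_iterations=100):
    """examples: list of (features, label) pairs with label in {+1, -1}."""
    weights = [0.0] * num_features
    bias = 0.0
    for _ in range(max_iterations):                    # "or for some # of iterations"
        mistakes = 0
        for features, label in examples:
            if predict_score(weights, bias, features) * label <= 0:   # they don't agree
                for i in range(num_features):
                    weights[i] += features[i] * label  # wi = wi + fi*label
                bias += label                          # b = b + label
                mistakes += 1
        if mistakes == 0:                              # converged: a full pass with no mistakes
            break
    return weights, bias
```

Since the update only adds fi*label to each weight, the same code runs unchanged on real-valued features, which is the point of the question on slide 22.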

  23. Your turn  repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label. The figure shows four numbered points: (-1,1), (1,1), (-1,-1), and (.5,-1). • Repeat until convergence • Keep track of w1, w2 as they change • Redraw the line after each step. Starting model: w = (1, 0)

  24. Your turn  repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label f2 (-1,1) (1,1) f1 (-1,-1) (.5,-1) w = (0, -1)

  25. Your turn  repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label f2 (-1,1) (1,1) f1 (-1,-1) (.5,-1) w = (-1, 0)

  26. Your turn  repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label f2 (-1,1) (1,1) f1 (-1,-1) (.5,-1) w = (-.5, -1)

  27. Your turn  repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label f2 (-1,1) (1,1) f1 (-1,-1) (.5,-1) w = (-1.5, 0)

  28. Your turn  repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label f2 (-1,1) (1,1) f1 (-1,-1) (.5,-1) w = (-1, -1)

  29. Your turn  repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label f2 (-1,1) (1,1) f1 (-1,-1) (.5,-1) w = (-2, 0)

  30. Your turn  repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label f2 (-1,1) (1,1) f1 (-1,-1) (.5,-1) w = (-1.5, -1)
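To check a hand trace like the one above, a small variant of the loop can print the weights after every update. The four points come from the slide, but their labels and the order they are visited in are not recoverable from this transcript, so the labels below are placeholders for illustration only.

```python
def perceptron_trace(examples, weights, bias=0.0, passes=3):
    """Run the exercise's update rule (no bias update, as on the slide) and print w after each mistake."""
    weights = list(weights)
    for _ in range(passes):
        for features, label in examples:
            score = sum(w * f for w, f in zip(weights, features)) + bias
            if score * label <= 0:  # mistake
                weights = [w + f * label for w, f in zip(weights, features)]
                print("after update on", features, "->", weights)
    return weights

# Points from the slide; these +1/-1 labels are made up, not the course's answer.
data = [((-1, 1), 1), ((-1, -1), 1), ((0.5, -1), 1), ((1, 1), -1)]
perceptron_trace(data, weights=[1, 0])
```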

  31. Which line will it find?

  32. Which line will it find? Only guaranteed to find some line that separates the data (assuming such a line exists)

  33. Convergence repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label b = b + label Why do we also have the “some # iterations” check?

  34. Handling non-separable data If we ran the algorithm on this it would never converge!

  35. Convergence repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label b = b + label Also helps avoid overfitting! (This is harder to see in 2-D examples, though)

  36. Ordering repeat until convergence (or for some # of iterations): for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label b = b + label What order should we traverse the examples? Does it matter?

  37. Order matters What would be a good/bad order?

  38. Order matters: a bad order

  39. Order matters: a bad order

  40. Order matters: a bad order

  41. Order matters: a bad order

  42. Order matters: a bad order

  43. Order matters: a bad order Solution?

  44. Ordering repeat until convergence (or for some # of iterations): randomize order of training examples for each training example (f1, f2, …, fn, label): if prediction * label ≤ 0: // they don’t agree for each wi: wi = wi + fi*label b = b + label
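A sketch of that one extra line, using Python's standard-library random.shuffle to reorder the examples in place at the start of every pass (the surrounding loop is my sketch from earlier, not the course's code):

```python
import random

def perceptron_train_shuffled(examples, num_features, max_iterations=100):
    weights, bias = [0.0] * num_features, 0.0
    examples = list(examples)                 # copy so the caller's list isn't reordered
    for _ in range(max_iterations):
        random.shuffle(examples)              # randomize order of training examples
        mistakes = 0
        for features, label in examples:
            score = sum(w * f for w, f in zip(weights, features)) + bias
            if score * label <= 0:            # they don't agree
                for i in range(num_features):
                    weights[i] += features[i] * label
                bias += label
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias
```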

  45. Improvements What will happen when we examine this example?

  46. Improvements Does this make sense? What if we had previously gone through ALL of the other examples correctly?

  47. Improvements Maybe just move it slightly in the direction of correction
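"Move it slightly" is usually implemented by scaling the update by a small step size. The slide doesn't name one, so the learning-rate parameter eta below is an assumption rather than something from the course:

```python
def scaled_update(weights, bias, features, label, eta=0.1):
    """One mistake-driven update, scaled by a small learning rate eta (assumed, not from the slides)."""
    new_weights = [w + eta * f * label for w, f in zip(weights, features)]
    new_bias = bias + eta * label
    return new_weights, new_bias
```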

  48. Voted perceptron learning Training • every time a mistake is made on an example: • store the weights (i.e. before changing them for the current example) • store the number of examples that set of weights got correct Classify • calculate the prediction from ALL saved weights • multiply each prediction by the number it got correct (i.e. a weighted vote) and take the sum over all predictions • said another way: pick whichever prediction has the most votes

  49. Voted perceptron learning Training • every time a mistake is made on an example: • store the weights • store the number of examples that set of weights got correct (The figure shows four saved weight vectors with vote counts 3, 1, 5, and 1.)

  50. Voted perceptron learning Classify (The figure shows the same four saved weight vectors, with vote counts 3, 1, 5, and 1, each casting a weighted vote.)
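A sketch of the voted perceptron described on slides 48–50, with names of my own choosing: every time a mistake is made, the current weights are saved along with the number of examples they got correct, and at classification time each saved model casts that many votes.

```python
def voted_perceptron_train(examples, num_features, max_iterations=10):
    weights, bias, correct = [0.0] * num_features, 0.0, 0
    saved = []                                      # (weights, bias, number correct)
    for _ in range(max_iterations):
        for features, label in examples:
            score = sum(w * f for w, f in zip(weights, features)) + bias
            if score * label <= 0:                  # mistake: store the old weights and their count
                saved.append((list(weights), bias, correct))
                for i in range(num_features):
                    weights[i] += features[i] * label
                bias += label
                correct = 0
            else:
                correct += 1
    saved.append((list(weights), bias, correct))    # keep the final weights too
    return saved

def voted_classify(saved, features):
    # Each saved model predicts +1/-1; its vote is weighted by how many examples it got correct.
    total = 0.0
    for weights, bias, votes in saved:
        score = sum(w * f for w, f in zip(weights, features)) + bias
        total += votes * (1 if score > 0 else -1)
    return 1 if total > 0 else -1
```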
