
CS 461: Machine Learning Lecture 2


Presentation Transcript


  1. CS 461: Machine Learning, Lecture 2 Dr. Kiri Wagstaff kiri.wagstaff@calstatela.edu CS 461, Winter 2008

  2. Introductions • Share with us: • Your name • Grad or undergrad? • What inspired you to sign up for this class? • Anything specifically that you want to learn from it? CS 461, Winter 2008

  3. Today’s Topics • Review • Homework 1 • Supervised Learning Issues • Decision Trees • Evaluation • Weka CS 461, Winter 2008

  4. Review • [Figure: the learning loop — Observations; Induction, generalization; Hypothesis; Actions, guesses; Feedback, more observations; Refinement] • Machine Learning • Computers learn from their past experience • Inductive Learning • Generalize to new data • Supervised Learning • Training data: <x, g(x)> pairs • Known label or output value for training data • Classification and regression • Instance-Based Learning • 1-Nearest Neighbor • k-Nearest Neighbors CS 461, Winter 2008
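As a quick refresher on the instance-based learners reviewed here, a minimal k-NN sketch in NumPy (the function name, distance choice, and toy data are mine, not from the slides):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=1):
    """Classify x by majority vote among its k nearest training points
    (Euclidean distance assumed; k=1 gives 1-Nearest Neighbor)."""
    dists = np.linalg.norm(X_train - x, axis=1)        # distance to every training point
    nearest = np.argsort(dists)[:k]                     # indices of the k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                    # majority label among the neighbors

# toy usage: two classes in 2-D
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.2, 0.1]), k=3))     # -> 0
```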

  5. Homework 1 • Solution/Discussion CS 461, Winter 2008

  6. Issues in Supervised Learning • Representation: which features to use? • Model Selection: complexity, noise, bias CS 461, Winter 2008

  7. When don’t you want zero error? CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]

  8. Model Selection and Complexity • Rule of thumb: prefer simplest model that has “good enough” performance “Make everything as simple as possible, but not simpler.” -- Albert Einstein CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]

  9. Another reason to prefer simple models… CS 461, Winter 2008 [Tom Dietterich]

  10. Decision Trees Chapter 9 CS 461, Winter 2008

  11. Decision Trees • Example: Diagnosis of Psychotic Disorders CS 461, Winter 2008

  12. (Hyper-)Rectangles == Decision Tree discriminant CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]

  13. Measuring Impurity • Calculate error using the majority label, before and after a split • More sensitive: use entropy • For node $m$: $N_m$ training instances reach $m$, of which $N_m^i$ belong to class $C_i$, so $\hat{p}_m^i = N_m^i / N_m$ • Node $m$ is pure if each $p_m^i$ is 0 or 1 • Entropy: $I_m = -\sum_i p_m^i \log_2 p_m^i$ • After a split into branches $j$: $I'_m = -\sum_j \frac{N_{mj}}{N_m} \sum_i p_{mj}^i \log_2 p_{mj}^i$ CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]
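A small worked sketch of these impurity formulas in NumPy (function names are mine):

```python
import numpy as np

def entropy(labels):
    """Node entropy I_m, computed from the class proportions of the labels that reach the node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))        # a pure node (p = 0 or 1) has entropy 0

def split_impurity(labels, branch_ids):
    """Weighted entropy I'_m after a split: each branch's entropy, weighted by N_mj / N_m."""
    total = len(labels)
    return sum((np.sum(branch_ids == b) / total) * entropy(labels[branch_ids == b])
               for b in np.unique(branch_ids))

labels = np.array(['yes'] * 9 + ['no'] * 5)   # e.g., 9 positive / 5 negative examples
print(entropy(labels))                        # ≈ 0.940
```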

  14. Should we play tennis? CS 461, Winter 2008 [Tom Mitchell]
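For a concrete feel of the split criterion, assuming the slide shows Mitchell's standard 14-example PlayTennis data (9 yes, 5 no): the root entropy is $I = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.940$. Splitting on Outlook gives branches Sunny (2 yes, 3 no), Overcast (4 yes, 0 no), and Rain (3 yes, 2 no), so $I' = \frac{5}{14}(0.971) + \frac{4}{14}(0) + \frac{5}{14}(0.971) \approx 0.694$, a gain of about $0.940 - 0.694 \approx 0.246$ — the largest of the four attributes, which is why Outlook ends up at the root of the tree.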

  15. How well does it generalize? CS 461, Winter 2008 [Tom Dietterich, Tom Mitchell]

  16. Decision Tree Construction Algorithm CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]
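The construction algorithm on this slide is a figure from Alpaydin; the following is only a rough Python sketch of the greedy idea it illustrates, reusing split_impurity from the entropy sketch above and assuming discrete-valued features (all names are mine):

```python
import numpy as np
from collections import Counter

def grow_tree(X, y, features):
    """Greedy construction sketch: if the node is pure (or no features remain),
    make a leaf with the majority label; otherwise split on the feature whose
    split yields the lowest weighted entropy, and recurse on each branch."""
    if len(set(y)) == 1 or not features:
        return Counter(y).most_common(1)[0][0]             # leaf: majority label
    best = min(features,
               key=lambda f: split_impurity(y, X[:, f]))   # lowest post-split impurity
    node = {'feature': best, 'branches': {}}
    for v in np.unique(X[:, best]):
        mask = X[:, best] == v
        node['branches'][v] = grow_tree(X[mask], y[mask],
                                        [f for f in features if f != best])
    return node
```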

  17. Decision Trees = Profit! • PredictionWorks • “Increasing customer loyalty through targeted marketing” CS 461, Winter 2008

  18. Decision Trees in Weka • Use Explorer, run J48 on height/weight data • Who does it misclassify? CS 461, Winter 2008

  19. Building a Regression Tree • Same algorithm… different criterion • Instead of impurity, use Mean Squared Error (in local region) • Predict mean output for node • Compute training error • (Same as computing the variance for the node) • Keep splitting until node error is acceptable; then it becomes a leaf • Acceptable: error < threshold CS 461, Winter 2008
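A minimal sketch of that node-error criterion (the function name and threshold convention are mine):

```python
import numpy as np

def node_error(y_values):
    """Regression-tree node error: predict the mean of the outputs reaching the node;
    the training MSE of that prediction is exactly the variance of the node's outputs."""
    y_values = np.asarray(y_values, dtype=float)
    prediction = y_values.mean()
    return np.mean((y_values - prediction) ** 2)    # equals np.var(y_values)

# keep splitting a node while node_error(y_at_node) >= threshold; otherwise make it a leaf
print(node_error([2.0, 4.0, 6.0]))                  # variance = 8/3 ≈ 2.67
```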

  20. Turning Trees into Rules CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]
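The figure on this slide is from Alpaydin; the idea is that each root-to-leaf path becomes one IF-THEN rule whose antecedent conjoins the tests along the path. A rough sketch, assuming the nested-dict tree produced by the construction sketch above:

```python
def tree_to_rules(node, conditions=()):
    """Turn a tree into rules: every root-to-leaf path yields one IF-THEN rule
    whose antecedent is the conjunction of the tests along that path."""
    if not isinstance(node, dict):                          # leaf: emit the accumulated rule
        antecedent = " AND ".join(conditions) or "TRUE"
        return [f"IF {antecedent} THEN class = {node}"]
    rules = []
    for value, child in node['branches'].items():
        test = f"feature[{node['feature']}] == {value!r}"
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules
```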

  21. Weka Machine Learning Library • Weka Explorer’s Guide CS 461, Winter 2008

  22. Summary: Key Points for Today • Supervised Learning • Representation: features available • Model Selection: which hypothesis to choose? • Decision Trees • Hierarchical, non-parametric, greedy • Nodes: test a feature value • Leaves: classify items (or predict values) • Minimize impurity (%error or entropy) • Turning trees into rules • Evaluation • (10-fold) Cross-Validation • Confusion Matrix CS 461, Winter 2008
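The course uses Weka for evaluation, but as an illustration of 10-fold cross-validation plus a confusion matrix, here is a sketch using scikit-learn (library, dataset, and classifier choices are mine, purely for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

X, y = load_iris(return_X_y=True)
# 10-fold cross-validation: each example is predicted by a tree trained on the other 9 folds
predictions = cross_val_predict(DecisionTreeClassifier(), X, y, cv=10)
# rows = true class, columns = predicted class; off-diagonal entries are the mistakes
print(confusion_matrix(y, predictions))
```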

  23. Next Time • Support Vector Machines (read Ch. 10.1-10.4, 10.6, 10.9) • Evaluation (read Ch. 14.4, 14.5, 14.7, 14.9) • Questions to answer from the reading: • Posted on the website (calendar) CS 461, Winter 2008
