
CS 461: Machine Learning Lecture 2


Presentation Transcript


  1. CS 461: Machine Learning, Lecture 2 Dr. Kiri Wagstaff kiri.wagstaff@calstatela.edu CS 461, Winter 2008

  2. Introductions • Share with us: • Your name • Grad or undergrad? • What inspired you to sign up for this class? • Anything specifically that you want to learn from it? CS 461, Winter 2008

  3. Today’s Topics • Review • Homework 1 • Supervised Learning Issues • Decision Trees • Evaluation • Weka CS 461, Winter 2008

  4. Review • [Figure: the learning loop — Observations; Induction, generalization; Hypothesis; Actions, guesses; Feedback, more observations; Refinement] • Machine Learning • Computers learn from their past experience • Inductive Learning • Generalize to new data • Supervised Learning • Training data: <x, g(x)> pairs • Known label or output value for training data • Classification and regression • Instance-Based Learning • 1-Nearest Neighbor • k-Nearest Neighbors CS 461, Winter 2008
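As a quick refresher on the instance-based learners reviewed here, a minimal k-NN sketch in NumPy (the function name, distance choice, and toy data are mine, not from the slides):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=1):
    """Classify x by majority vote among its k nearest training points
    (Euclidean distance assumed; k=1 gives 1-Nearest Neighbor)."""
    dists = np.linalg.norm(X_train - x, axis=1)        # distance to every training point
    nearest = np.argsort(dists)[:k]                     # indices of the k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                    # majority label among the neighbors

# toy usage: two classes in 2-D
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.2, 0.1]), k=3))     # -> 0
```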

  5. Homework 1 • Solution/Discussion CS 461, Winter 2008

  6. Issues in Supervised Learning • Representation: which features to use? • Model Selection: complexity, noise, bias CS 461, Winter 2008

  7. When don’t you want zero error? CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]

  8. Model Selection and Complexity • Rule of thumb: prefer simplest model that has “good enough” performance “Make everything as simple as possible, but not simpler.” -- Albert Einstein CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]

  9. Another reason to prefer simple models… CS 461, Winter 2008 [Tom Dietterich]

  10. Decision Trees Chapter 9 CS 461, Winter 2008

  11. Decision Trees • Example: Diagnosis of Psychotic Disorders CS 461, Winter 2008

  12. (Hyper-)Rectangles == Decision Tree discriminant CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]

  13. Measuring Impurity • Calculate error using the majority label, before and after a split • More sensitive: use entropy • For node $m$: $N_m$ training instances reach $m$, of which $N_m^i$ belong to class $C_i$, so $\hat{p}_m^i = N_m^i / N_m$ • Node $m$ is pure if each $p_m^i$ is 0 or 1 • Entropy: $I_m = -\sum_i p_m^i \log_2 p_m^i$ • After a split into branches $j$: $I'_m = -\sum_j \frac{N_{mj}}{N_m} \sum_i p_{mj}^i \log_2 p_{mj}^i$ CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]
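A small worked sketch of these impurity formulas in NumPy (function names are mine):

```python
import numpy as np

def entropy(labels):
    """Node entropy I_m, computed from the class proportions of the labels that reach the node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))        # a pure node (p = 0 or 1) has entropy 0

def split_impurity(labels, branch_ids):
    """Weighted entropy I'_m after a split: each branch's entropy, weighted by N_mj / N_m."""
    total = len(labels)
    return sum((np.sum(branch_ids == b) / total) * entropy(labels[branch_ids == b])
               for b in np.unique(branch_ids))

labels = np.array(['yes'] * 9 + ['no'] * 5)   # e.g., 9 positive / 5 negative examples
print(entropy(labels))                        # ≈ 0.940
```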

  14. Should we play tennis? CS 461, Winter 2008 [Tom Mitchell]
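For a concrete feel of the split criterion, assuming the slide shows Mitchell's standard 14-example PlayTennis data (9 yes, 5 no): the root entropy is $I = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.940$. Splitting on Outlook gives branches Sunny (2 yes, 3 no), Overcast (4 yes, 0 no), and Rain (3 yes, 2 no), so $I' = \frac{5}{14}(0.971) + \frac{4}{14}(0) + \frac{5}{14}(0.971) \approx 0.694$, a gain of about $0.940 - 0.694 \approx 0.246$ — the largest of the four attributes, which is why Outlook ends up at the root of the tree.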

  15. How well does it generalize? CS 461, Winter 2008 [Tom Dietterich, Tom Mitchell]

  16. Decision Tree Construction Algorithm CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]
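The construction algorithm on this slide is a figure from Alpaydin; the following is only a rough Python sketch of the greedy idea it illustrates, reusing split_impurity from the entropy sketch above and assuming discrete-valued features (all names are mine):

```python
import numpy as np
from collections import Counter

def grow_tree(X, y, features):
    """Greedy construction sketch: if the node is pure (or no features remain),
    make a leaf with the majority label; otherwise split on the feature whose
    split yields the lowest weighted entropy, and recurse on each branch."""
    if len(set(y)) == 1 or not features:
        return Counter(y).most_common(1)[0][0]             # leaf: majority label
    best = min(features,
               key=lambda f: split_impurity(y, X[:, f]))   # lowest post-split impurity
    node = {'feature': best, 'branches': {}}
    for v in np.unique(X[:, best]):
        mask = X[:, best] == v
        node['branches'][v] = grow_tree(X[mask], y[mask],
                                        [f for f in features if f != best])
    return node
```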

  17. Decision Trees = Profit! • PredictionWorks • “Increasing customer loyalty through targeted marketing” CS 461, Winter 2008

  18. Decision Trees in Weka • Use Explorer, run J48 on height/weight data • Who does it misclassify? CS 461, Winter 2008

  19. Building a Regression Tree • Same algorithm… different criterion • Instead of impurity, use Mean Squared Error (in local region) • Predict mean output for node • Compute training error • (Same as computing the variance for the node) • Keep splitting until node error is acceptable; then it becomes a leaf • Acceptable: error < threshold CS 461, Winter 2008
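A minimal sketch of that node-error criterion (the function name and threshold convention are mine):

```python
import numpy as np

def node_error(y_values):
    """Regression-tree node error: predict the mean of the outputs reaching the node;
    the training MSE of that prediction is exactly the variance of the node's outputs."""
    y_values = np.asarray(y_values, dtype=float)
    prediction = y_values.mean()
    return np.mean((y_values - prediction) ** 2)    # equals np.var(y_values)

# keep splitting a node while node_error(y_at_node) >= threshold; otherwise make it a leaf
print(node_error([2.0, 4.0, 6.0]))                  # variance = 8/3 ≈ 2.67
```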

  20. Turning Trees into Rules CS 461, Winter 2008 [Alpaydin 2004  The MIT Press]
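The figure on this slide is from Alpaydin; the idea is that each root-to-leaf path becomes one IF-THEN rule whose antecedent conjoins the tests along the path. A rough sketch, assuming the nested-dict tree produced by the construction sketch above:

```python
def tree_to_rules(node, conditions=()):
    """Turn a tree into rules: every root-to-leaf path yields one IF-THEN rule
    whose antecedent is the conjunction of the tests along that path."""
    if not isinstance(node, dict):                          # leaf: emit the accumulated rule
        antecedent = " AND ".join(conditions) or "TRUE"
        return [f"IF {antecedent} THEN class = {node}"]
    rules = []
    for value, child in node['branches'].items():
        test = f"feature[{node['feature']}] == {value!r}"
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules
```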

  21. Weka Machine Learning Library • Weka Explorer’s Guide CS 461, Winter 2008

  22. Summary: Key Points for Today • Supervised Learning • Representation: features available • Model Selection: which hypothesis to choose? • Decision Trees • Hierarchical, non-parametric, greedy • Nodes: test a feature value • Leaves: classify items (or predict values) • Minimize impurity (%error or entropy) • Turning trees into rules • Evaluation • (10-fold) Cross-Validation • Confusion Matrix CS 461, Winter 2008
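The course uses Weka for evaluation, but as an illustration of 10-fold cross-validation plus a confusion matrix, here is a sketch using scikit-learn (library, dataset, and classifier choices are mine, purely for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

X, y = load_iris(return_X_y=True)
# 10-fold cross-validation: each example is predicted by a tree trained on the other 9 folds
predictions = cross_val_predict(DecisionTreeClassifier(), X, y, cv=10)
# rows = true class, columns = predicted class; off-diagonal entries are the mistakes
print(confusion_matrix(y, predictions))
```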

  23. Next Time • Support Vector Machines (read Ch. 10.1-10.4, 10.6, 10.9) • Evaluation (read Ch. 14.4, 14.5, 14.7, 14.9) • Questions to answer from the reading: • Posted on the website (calendar) CS 461, Winter 2008
