160 likes | 296 Vues
This lecture by Dr. Guoping Qiu explores the K-Nearest Neighbor (K-NN) classifier, a fundamental machine learning algorithm. It details how to classify a test instance using the closest training examples based on a defined distance metric, such as Euclidean distance. With practical examples, it illustrates how to determine the best value of K using N-fold cross-validation for minimizing classification errors. The lecture also addresses storage requirements and potential reduction methods for training data, making it essential for anyone interested in machine learning fundamentals.
E N D
Machine Learning Lecture 6 K-Nearest Neighbor Classifier G53MLE Machine Learning Dr Guoping Qiu
x1 2 3 1 4 5 6 7 10 9 11 12 13 14 8 x2 15 16 Elliptical blobs (objects) Objects, Feature Vectors, Points X(15) X(1) X(7) X(16) X(3) X(8) X(25) X(12) X(13) X(6) X(9) X(10) X(4) X(11) X(14)
x1 x2 Nearest Neighbours X(j)=(x1(j), x2(j), …,xn(j)) X(i)=(x1(i), x2(i), …,xn(i)) G53MLE Machine Learning Dr Guoping Qiu
Nearest Neighbour Algorithm • Given training data (X(1),D(1)), (X(2),D(2)), …, (X(N),D(N)) • Define a distance metric between points in inputs space. Common measures are: Euclidean Distance G53MLE Machine Learning Dr Guoping Qiu
K-Nearest Neighbour Model Given test point X • Find the K nearest training inputs to X • Denote these points as (X(1),D(1)), (X(2), D(2)), …, (X(k), D(k)) x G53MLE Machine Learning Dr Guoping Qiu
K-Nearest Neighbour Model Classification • The class identification of X Y = most common class in set {D(1), D(2), …, D(k)} x x G53MLE Machine Learning Dr Guoping Qiu
K-Nearest Neighbour Model • Example : Classify whether a customer will respond to a survey question using a 3-Nearest Neighbor classifier G53MLE Machine Learning Dr Guoping Qiu
K-Nearest Neighbour Model • Example : 3-Nearest Neighbors 15.16 15 152.23 122 15.74 G53MLE Machine Learning Dr Guoping Qiu
K-Nearest Neighbour Model • Example : 3-Nearest Neighbors 15.16 15 152.23 122 15.74 Three nearest ones to David are: No, Yes, Yes G53MLE Machine Learning Dr Guoping Qiu
K-Nearest Neighbour Model • Example : 3-Nearest Neighbors 15.16 15 152.23 122 15.74 Yes Three nearest ones to David are: No, Yes, Yes G53MLE Machine Learning Dr Guoping Qiu
K-Nearest Neighbour Model • Picking K • Use N fold cross validation – Pick K to minimize the cross validation error • For each of N training example • Find its K nearest neighbours • Make a classification based on these K neighbours • Calculate classification error • Output average error over all examples • Use the K that gives lowest average error over the N training examples G53MLE Machine Learning Dr Guoping Qiu
K-Nearest Neighbour Model • Example: For the example we saw earlier, pick the best K from the set {1, 2, 3} to build a K-NN classifier G53MLE Machine Learning Dr Guoping Qiu
Further Readings • T. M. Mitchell, Machine Learning, McGraw-Hill International Edition, 1997 Chapter 8 G53MLE Machine Learning Dr Guoping Qiu
Tutorial/Exercise Questions • K nearest neighbor classifier has to store all training data creating high requirement on storage. Can you think of ways to reduce the storage requirement without affecting the performance? (hint: search the Internet, you will find many approximation methods). G53MLE Machine Learning Dr Guoping Qiu