This document provides an overview of the K-Nearest Neighbor (K-NN) classification method, focusing on how a training set is used to predict or classify an input (e.g., <sunny, normal>). It discusses the parameter K, which defines the number of neighbors to consider (e.g., k = 3), and emphasizes the importance of choosing the right value through experimentation. It also explores distance metrics for determining neighbors, approaches for handling ties in votes, and highlights the relevance of K-NN in fields such as document similarity, image understanding, and case-based reasoning.
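As a concrete illustration of the procedure summarized above, the sketch below classifies a new input by majority vote among its k = 3 nearest training points. It is a minimal sketch only: the toy weather data, the numeric feature encoding, and the function names are assumptions for illustration, not part of the original slides.

```python
# Minimal K-NN classification sketch (illustrative; the toy weather data
# and numeric feature encoding are assumptions, not the slides' dataset).
from collections import Counter
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    Ties in the vote are broken arbitrarily (here, by first-seen order).
    """
    neighbors = sorted(train, key=lambda pair: distance(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy training set: <outlook, humidity> encoded numerically
# (sunny=0, overcast=1, rainy=2; normal=0, high=1).
train = [
    ([0, 0], "play"),      # sunny, normal
    ([0, 1], "no_play"),   # sunny, high
    ([1, 0], "play"),      # overcast, normal
    ([2, 1], "no_play"),   # rainy, high
    ([1, 1], "play"),      # overcast, high
]

print(knn_predict(train, [0, 0], k=3))  # predict for <sunny, normal>
```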
The K-Nearest Neighbor Method
- Used for prediction/classification: given an input x (e.g., <sunny, normal, ...>), find its neighbors in the training data.
- Number of neighbors = K (e.g., k = 3); often a parameter to be determined, along with the form of the distance function.
- Find the K neighbors in the training data closest to the input x; break ties arbitrarily.
- All K neighbors vote and the majority wins; weighted K-NN is a variant in which closer neighbors count more (see the sketch after this list).
- "K" is a variable: we often experiment with different values (K = 1, 3, 5, ...) to find the optimal one.
- Why important? K-NN is often a baseline; a new method must beat it to claim innovation.
- Forms of K-NN:
  - Document similarity: cosine distance (see the sketch after this list).
  - Case-based reasoning: an edited case base, sometimes better than using 100% of the data.
  - Image understanding: manifold learning as the distance metric.
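The outline above mentions weighted voting and cosine-based document similarity as common forms of K-NN. The sketch below shows one way these two variants might be combined; the term-count vectors, labels, and function names are illustrative assumptions, not material from the original slides.

```python
# Sketch of two K-NN variants named above: distance-weighted voting and
# cosine distance over document vectors (illustrative names and data).
from collections import defaultdict
import math

def cosine_distance(a, b):
    """1 - cosine similarity; small when two vectors point the same way,
    which suits bag-of-words document vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0

def weighted_knn_predict(train, query, k=3, distance=cosine_distance):
    """Each of the k nearest neighbors votes with weight 1/distance,
    so closer neighbors count more than distant ones."""
    neighbors = sorted(train, key=lambda pair: distance(pair[0], query))[:k]
    scores = defaultdict(float)
    for features, label in neighbors:
        d = distance(features, query)
        scores[label] += 1.0 / (d + 1e-9)  # small constant avoids division by zero
    return max(scores, key=scores.get)

# Toy term-count vectors for three labeled documents and a query document.
docs = [
    ([3, 0, 1, 0], "sports"),
    ([0, 2, 0, 4], "finance"),
    ([2, 1, 1, 0], "sports"),
]
print(weighted_knn_predict(docs, [1, 0, 2, 0], k=3))
```

Weighting votes by inverse distance is one common choice; it reduces the effect of tie votes and makes the prediction less sensitive to the exact value of K, which is one reason the slides suggest experimenting with K = 1, 3, 5, and so on.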