This document provides an overview of the K-Nearest Neighbor (K-NN) classification method, focusing on how a training set is used to predict or classify an input (e.g., <sunny, normal>). It discusses the parameter K, which defines the number of neighbors to consider (e.g., k = 3), and emphasizes the importance of choosing the right value through experimentation. It also explores distance metrics for determining neighbors, approaches for handling ties in votes, and highlights the relevance of K-NN in fields such as document similarity, image understanding, and case-based reasoning.
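As a concrete illustration of the procedure summarized above, the sketch below classifies a new input by majority vote among its k = 3 nearest training points. It is a minimal sketch only: the toy weather data, the numeric feature encoding, and the function names are assumptions for illustration, not part of the original slides.

```python
# Minimal K-NN classification sketch (illustrative; the toy weather data
# and numeric feature encoding are assumptions, not the slides' dataset).
from collections import Counter
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    Ties in the vote are broken arbitrarily (here, by first-seen order).
    """
    neighbors = sorted(train, key=lambda pair: distance(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy training set: <outlook, humidity> encoded numerically
# (sunny=0, overcast=1, rainy=2; normal=0, high=1).
train = [
    ([0, 0], "play"),      # sunny, normal
    ([0, 1], "no_play"),   # sunny, high
    ([1, 0], "play"),      # overcast, normal
    ([2, 1], "no_play"),   # rainy, high
    ([1, 1], "play"),      # overcast, high
]

print(knn_predict(train, [0, 0], k=3))  # predict for <sunny, normal>
```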
The K-Nearest Neighbor Method
- Used for prediction/classification: given an input x (e.g., <sunny, normal, ...>), find its neighbors in the training data.
- Number of neighbors = K (e.g., k = 3); often a parameter to be determined, along with the form of the distance function.
- Find the K neighbors in the training data closest to the input x; break ties arbitrarily.
- All K neighbors vote and the majority wins; weighted K-NN is a variant in which closer neighbors count more (see the sketch after this list).
- "K" is a variable: we often experiment with different values (K = 1, 3, 5, ...) to find the optimal one.
- Why important? K-NN is often a baseline; a new method must beat it to claim innovation.
- Forms of K-NN:
  - Document similarity: cosine distance (see the sketch after this list).
  - Case-based reasoning: an edited case base, sometimes better than using 100% of the data.
  - Image understanding: manifold learning as the distance metric.
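The outline above mentions weighted voting and cosine-based document similarity as common forms of K-NN. The sketch below shows one way these two variants might be combined; the term-count vectors, labels, and function names are illustrative assumptions, not material from the original slides.

```python
# Sketch of two K-NN variants named above: distance-weighted voting and
# cosine distance over document vectors (illustrative names and data).
from collections import defaultdict
import math

def cosine_distance(a, b):
    """1 - cosine similarity; small when two vectors point the same way,
    which suits bag-of-words document vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0

def weighted_knn_predict(train, query, k=3, distance=cosine_distance):
    """Each of the k nearest neighbors votes with weight 1/distance,
    so closer neighbors count more than distant ones."""
    neighbors = sorted(train, key=lambda pair: distance(pair[0], query))[:k]
    scores = defaultdict(float)
    for features, label in neighbors:
        d = distance(features, query)
        scores[label] += 1.0 / (d + 1e-9)  # small constant avoids division by zero
    return max(scores, key=scores.get)

# Toy term-count vectors for three labeled documents and a query document.
docs = [
    ([3, 0, 1, 0], "sports"),
    ([0, 2, 0, 4], "finance"),
    ([2, 1, 1, 0], "sports"),
]
print(weighted_knn_predict(docs, [1, 0, 2, 0], k=3))
```

Weighting votes by inverse distance is one common choice; it reduces the effect of tie votes and makes the prediction less sensitive to the exact value of K, which is one reason the slides suggest experimenting with K = 1, 3, 5, and so on.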