
Chapter 6. Classification and Prediction



  1. Chapter 6. Classification and Prediction • Eager learners: given a set of training tuples, construct a generalization model before receiving new tuples to classify • Classification by decision tree induction • Rule-based classification • Classification by back propagation • Support Vector Machines (SVM) • Associative classification

  2. Lazy vs. Eager Learning • Lazy learning (e.g., instance-based learning): simply stores the training data (or does only minor processing) and waits until it is given a test tuple • Eager learning (the methods discussed above): given a set of training tuples, constructs a classification model before receiving new (e.g., test) data to classify • Lazy learners spend less time in training but more time in predicting

  3. Lazy Learner: Instance-Based Methods • Typical approaches • k-nearest neighbor approach • Instances represented as points in a Euclidean space.

  4. The k-Nearest Neighbor Algorithm • All instances correspond to points in the n-D space • The nearest neighbors are defined in terms of Euclidean distance, dist(X1, X2) • The target function could be discrete- or real-valued • For discrete-valued targets, k-NN returns the most common value among the k training examples nearest to xq [Figure: a query point xq amid training instances labeled + and −]
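To make the majority-vote rule on this slide concrete, here is a minimal self-contained sketch of discrete-valued k-NN; the function names and the tiny data set are illustrative, not from the slides:

```python
import math
from collections import Counter

def euclidean(x1, x2):
    # dist(X1, X2) = sqrt(sum_i (x1_i - x2_i)^2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def knn_classify(training, xq, k):
    """Return the most common class among the k training tuples nearest to xq.

    training: list of (point, label) pairs; xq: query point (tuple of floats).
    """
    # Lazy learner: all the work happens at query time, by scanning the stored tuples.
    neighbors = sorted(training, key=lambda pair: euclidean(pair[0], xq))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training data for illustration only
train = [((1.0, 1.0), '+'), ((1.2, 0.8), '+'), ((4.0, 4.2), '-'), ((4.1, 3.9), '-')]
print(knn_classify(train, (1.1, 1.0), k=3))  # -> '+'
```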

  5. The k-Nearest Neighbor Algorithm • For real-valued prediction of a given unknown tuple, k-NN returns the mean of the values of the k nearest neighbors • Distance-weighted nearest-neighbor algorithm: weight the contribution of each of the k neighbors according to its distance to the query point xq, giving greater weight to closer neighbors • Averaging over the k nearest neighbors makes the method robust to noisy data
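The slide does not fix a weighting scheme; a common choice is w = 1 / dist(xq, xi)^2, which the following sketch assumes (names and data are illustrative):

```python
import math

def euclidean(x1, x2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def knn_predict_weighted(training, xq, k):
    """Distance-weighted real-valued prediction: closer neighbors count more.

    training: list of (point, value) pairs with real-valued targets.
    Assumes the common weight w = 1 / dist(xq, xi)^2 (not stated on the slide).
    """
    neighbors = sorted(training, key=lambda pair: euclidean(pair[0], xq))[:k]
    num, den = 0.0, 0.0
    for point, value in neighbors:
        d = euclidean(point, xq)
        if d == 0.0:                 # query coincides with a stored tuple
            return value
        w = 1.0 / (d * d)            # greater weight for closer neighbors
        num += w * value
        den += w
    return num / den

train = [((1.0,), 2.0), ((2.0,), 4.0), ((3.0,), 6.0)]  # hypothetical data
print(knn_predict_weighted(train, (1.5,), k=2))  # -> 3.0 (equidistant neighbors average)
```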

  6. The k-Nearest Neighbor Algorithm • How can I determine the value of k, the number of neighbors? • In general, the more training tuples there are, the larger the value of k can be • Nearest-neighbor classifiers can be extremely slow when classifying test tuples: O(n) comparisons per query, where n is the number of training tuples • By simple presorting and arranging the stored tuples into search trees, the number of comparisons can be reduced to O(log n)
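The slide does not name a specific search tree; k-d trees are the classic choice for this presorting, and SciPy ships one. A minimal sketch, assuming SciPy is installed:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.random((10_000, 3))          # n stored training tuples in 3-D

tree = cKDTree(points)                    # presort once when building the tree
dists, idx = tree.query([0.5, 0.5, 0.5], k=5)  # ~O(log n) per query vs. O(n) linear scan
print(idx)    # indices of the 5 nearest stored tuples
print(dists)  # their Euclidean distances to the query point
```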

  7. The k-Nearest Neighbor Algorithm • Example: k = 5

  8. Distance Measure
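The body of this slide is an image that did not survive the transcript. For reference, the Euclidean distance used throughout (see slide 4) between two points X1 = (x11, ..., x1n) and X2 = (x21, ..., x2n) is:

```latex
\mathrm{dist}(X_1, X_2) = \sqrt{\sum_{i=1}^{n} (x_{1i} - x_{2i})^2}
```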

  9. Distance Measure Example

  10. What is the prediction if k = 1? k = 3? k = 5?
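The data set for this exercise appears on the preceding image slides and is not preserved in the transcript. As a stand-in, the sketch below runs the same question against a small hypothetical 1-D data set, showing how the prediction can change as k grows:

```python
from collections import Counter

# Hypothetical 1-D training data (the slide's actual table is an image, not preserved)
train = [(1.0, 'A'), (1.5, 'A'), (2.0, 'B'), (3.0, 'B'), (3.5, 'B')]
xq = 1.9

for k in (1, 3, 5):
    neighbors = sorted(train, key=lambda p: abs(p[0] - xq))[:k]
    label = Counter(lbl for _, lbl in neighbors).most_common(1)[0][0]
    print(f"k={k}: prediction {label!r}")
# k=1 -> 'B' (single closest point), k=3 -> 'A' (two of three), k=5 -> 'B' (three of five)
```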
