
KDD Group Research Seminar Fall, 2001 - Presentation 8 – 11


Presentation Transcript


  1. KDD Group Research Seminar Fall, 2001 - Presentation 8 – 11
     Incremental Learning
     Friday, November 16
     James Plummer, jwp1924@ksu.edu
     Reference: Mitchell, Tom M. “Machine Learning”. McGraw-Hill Companies. 1997.

  2. Outline
     • Machine Learning
       • Extracting information from data
       • Forming concepts
     • The Data
       • Arrangement of Data
         • Attributes, Labels, and Instances
       • Categorization of Data
       • Results
     • MLJ (Machine Learning in Java)
       • Collection of Machine Learning algorithms
       • Current Inducers
     • Incremental Learning
       • Description of technique
       • Nearest Neighbor Algorithm
       • Distance-Weighted Algorithm
     • Advantages and Disadvantages
       • Gains and Losses

  3. Machine Learning
     • Sometimes called Data Mining
     • The process of extracting useful information from data
       • Marketing databases, medical databases, weather databases
       • Finding consumer purchase patterns
     • Used to form concepts
       • Predictions
       • Classifications
       • Numeric answers

  4. The Data
     • Arrangement of Data
       • A piece of data is a set of attributes $a_i$ which make up an instance $x_j$
       • Attributes can be considered evidence
       • Each instance has a label or category $f(x_j)$ (outcome value):
         $x_j = a_1, a_2, a_3, \ldots, a_i;\ f(x_j)$
       • A set of data is a set of instances
     • Categorization
       • A set of instances (the training data) is used as control for new query instances $x_q$
       • Calculate $\hat{f}(x_q)$ based on the training data
       • $\hat{f}(x_q)$ is the predicted value of the actual $f(x_q)$
     • Results
       • The number of correctly predicted values over the total number of query instances
       • $\hat{f}(x_q)_{\text{correct}} / f(x_q)_{\text{total}}$
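The arrangement of data and the Results ratio on slide 4 can be illustrated with a short Python sketch. The `Instance` type, the `accuracy` helper, the toy attribute values, and the trivial `always_yes` predictor are all hypothetical names and data introduced here for illustration; they do not come from the presentation or from MLJ.

```python
from typing import Callable, NamedTuple, Sequence


class Instance(NamedTuple):
    """One instance x_j: its attributes a_1..a_i plus its label f(x_j)."""
    attributes: tuple
    label: str


def accuracy(predict: Callable[[tuple], str], queries: Sequence[Instance]) -> float:
    """Correctly predicted query instances divided by total query instances."""
    if not queries:
        raise ValueError("need at least one query instance")
    correct = sum(1 for x_q in queries if predict(x_q.attributes) == x_q.label)
    return correct / len(queries)


# Hypothetical query instances (made-up attribute values and labels).
queries = [
    Instance(("sunny", "hot"), "No"),
    Instance(("rainy", "mild"), "Yes"),
    Instance(("sunny", "mild"), "Yes"),
    Instance(("rainy", "hot"), "No"),
]


def always_yes(attributes: tuple) -> str:
    """A deliberately naive stand-in for a learned predictor f^."""
    return "Yes"


print(accuracy(always_yes, queries))  # 0.5: two of the four labels are "Yes"
```

Any inducer that maps attributes to a predicted label could be passed in place of `always_yes`.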

  5. Data Example
     • Predict the values of Examples 6, 7, and 8 given data examples 1 through 5
     (Data table not preserved in the transcript; the only surviving cell values are: Yes, No, Yes)

  6. MLJ (Machine Learning in Java)
     • MLJ is a collection of learning algorithms
       • Inducers
       • Categorize data to learn concepts
     • Currently in Development
       • ID3
         • Uses decision trees
       • Naïve Bayes
         • Uses probability calculations (Bayes' theorem)
       • C4.5
         • Uses decision trees with pruning techniques
       • Incremental Learning
         • Uses comparison techniques
         • Soon to be added

  7. Incremental Learning
     • Instance Based Learning
     • k-Nearest Neighbor
       • All instances correspond to points in an $n$-dimensional space
       • The distance between two instances is determined by a distance function (equation not preserved in the transcript), where $a_r(x)$ is the $r$th attribute of instance $x$
       • Given a query instance $x_q$ to be categorized, the $k$ nearest neighbors are calculated
       • $\hat{f}(x_q)$ is assigned the most frequent value of the nearest $k$ $f(x_j)$
       • For $k = 1$, $\hat{f}(x_q)$ will be assigned $f(x_i)$ if $x_i$ is the closest instance in the space
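The k-Nearest Neighbor procedure on slide 7 can be sketched in Python as follows. The slide's distance equation did not survive in the transcript, so this sketch assumes the standard Euclidean distance over the attributes $a_r(x)$. The function names and the two-dimensional toy data are hypothetical and are not taken from the presentation or from MLJ.

```python
import math
from collections import Counter


def euclidean_distance(x_i, x_j):
    """Assumed distance: square root of the summed squared attribute differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))


def knn_classify(training, x_q, k=1):
    """Return the most frequent label among the k training instances nearest x_q.

    training is a list of (attributes, label) pairs.
    """
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the number of training instances")
    neighbors = sorted(training, key=lambda inst: euclidean_distance(inst[0], x_q))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training instances in a 2-dimensional space.
training = [
    ((1.0, 1.0), "Yes"),
    ((1.5, 2.0), "Yes"),
    ((2.0, 1.0), "Yes"),
    ((5.0, 5.0), "No"),
    ((6.0, 5.5), "No"),
]
x_q = (4.0, 4.0)

print(knn_classify(training, x_q, k=1))  # "No": the single closest point is (5.0, 5.0)
print(knn_classify(training, x_q, k=5))  # "Yes": three of the five neighbors say "Yes"
```

Note that there is no training step: all of the work happens when a query instance arrives, which is why the choice of $k$ alone can flip the answer for the same query.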

  8. Distance-Weighted Nearest Neighbor
     • Same as k-Nearest Neighbor
       • Effect of $f(x_j)$ on $\hat{f}(x_q)$ is based on $d(x_q, x_j)$
       • In the case $x_q = x_i$, then $\hat{f}(x_q) = f(x_i)$
     • Examine three cases for the 2-dimensional space to the right (figure not preserved in the transcript)
       • $k = 1$
       • $k = 5$
       • Weighted, $k = 5$
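A minimal Python sketch of the distance-weighted variant on slide 8 follows. The slide does not state the weighting formula, so this sketch assumes each neighbor votes with weight $1 / d(x_q, x_j)^2$, a common choice, together with Euclidean distance. The function names and toy data are hypothetical.

```python
import math
from collections import defaultdict


def euclidean_distance(x_i, x_j):
    """Assumed distance: square root of the summed squared attribute differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))


def weighted_knn_classify(training, x_q, k=5):
    """Classify x_q by letting each of the k nearest neighbors vote with weight 1/d^2.

    training is a list of (attributes, label) pairs.
    """
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the number of training instances")
    scored = [(euclidean_distance(x, x_q), label) for x, label in training]
    neighbors = sorted(scored, key=lambda pair: pair[0])[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        if distance == 0.0:
            # Exact match: x_q equals a training instance, so return its label.
            return label
        votes[label] += 1.0 / distance ** 2
    return max(votes, key=votes.get)


# Hypothetical training instances in a 2-dimensional space.
training = [
    ((1.0, 1.0), "Yes"),
    ((1.5, 2.0), "Yes"),
    ((2.0, 1.0), "Yes"),
    ((5.0, 5.0), "No"),
    ((6.0, 5.5), "No"),
]

print(weighted_knn_classify(training, (4.0, 4.0), k=5))  # "No": the two close points outweigh three far ones
print(weighted_knn_classify(training, (1.0, 1.0), k=5))  # "Yes": exact match with a training instance
```

On this toy data an unweighted majority vote with $k = 5$ would return "Yes" for the query (4.0, 4.0), because three of the five instances are labeled "Yes"; weighting by distance lets the two nearby "No" instances dominate instead.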

  9. Advantages and Disadvantages
     • Gains of using k-Nearest Neighbor
       • Individual attributes can be weighted differently
       • Change $d(x_i, x_q)$ to allow the nearest $x_i$ to have a stronger or weaker effect on $\hat{f}(x_q)$
       • Robust to noise in training data
       • Very effective when provided a large set of training data
       • Flexible; $\hat{f}(x_q)$ can be calculated in many useful ways
       • Very small training time
     • Losses
       • Not good when training data is insufficient
       • Not very effective if similar $x_i$ have dissimilar $f(x_i)$
       • More computation time needed to categorize new instances
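The first gain on slide 9, weighting individual attributes differently, can be sketched as a small change to the distance function. The per-attribute weight vector, the function name, and the sample points below are assumptions made for illustration; the presentation does not specify how the weights are applied.

```python
import math


def weighted_distance(x_i, x_q, attribute_weights):
    """Euclidean-style distance in which each attribute has its own weight.

    A weight of 0 ignores an attribute; a larger weight makes it matter more.
    """
    if not (len(x_i) == len(x_q) == len(attribute_weights)):
        raise ValueError("instances and weights must have the same length")
    return math.sqrt(
        sum(w * (a - b) ** 2 for w, a, b in zip(attribute_weights, x_i, x_q))
    )


# Hypothetical query and two training instances.
x_q = (0.0, 0.0)
close_on_first = (1.0, 10.0)   # close to x_q on attribute 1 only
close_on_second = (10.0, 1.0)  # close to x_q on attribute 2 only

equal_weights = (1.0, 1.0)
ignore_second = (1.0, 0.0)

# With equal weights the two instances are equally far from x_q.
print(weighted_distance(close_on_first, x_q, equal_weights)
      == weighted_distance(close_on_second, x_q, equal_weights))  # True

# Ignoring the second attribute makes close_on_first the nearer neighbor.
print(weighted_distance(close_on_first, x_q, ignore_second))   # 1.0
print(weighted_distance(close_on_second, x_q, ignore_second))  # 10.0
```

Substituting a function like this for the plain distance changes which training instances count as nearest, without changing the rest of the nearest-neighbor procedure.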

  10. References
     • Mitchell, Tom M. “Machine Learning”. McGraw-Hill Companies. 1997.
     • Witten and Frank. “Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations”. Morgan Kaufmann Publishers. 2000.
     * equation reduced for simplicity
