
The K Nearest Neighbor Algorithm (kNN)



  1. The K Nearest Neighbor Algorithm (kNN) Erik Zeitler Uppsala Database Laboratory

  2. Examination
     • Examination is split in two parts
       • Solve the assignment
       • Oral examination
     • During the oral examination
       • The instructor validates your program using a script
         • Non-working program → the examination ends immediately (“fail” grade is given) → you may re-do the examination later
       • The instructor will ask questions
         • on your implementation
         • on the method itself
     • All group members must take part in the solution.
     • Group members can get different grades on the same assignment.

  3. Grades

  4. Examination
     • Why do we have the oral part? Are we out to get you?
       • The assignments cover a good part of the course → understanding them will help you.
       • If you have problems solving the assignment, please ask during office hours.
       • The only way asking will affect your grade is that you might learn more.
     • Solving assignments
     • Understanding your own solution
     Different things!

  5. What you need to do
     • Sign up for oral exam
       • Groups of 2–3 students
       • Forms are on my office door, P1320
     • Implement a solution
       • Deadline: Submit by e-mail 24h before your oral exam
         • 1, 2: erik.zeitler@it.uu.se
         • 3, 4: gyozo.gidofalvi@it.uu.se
     • Answer the questions on the form
       • Bring one form per student
     • Prepare for oral exam:
       • Study the theory behind

  6. K Nearest Neighbor
     • Basic idea:
       • If it walks like a duck and it quacks like a duck → then it must be a duck
     • So how do we know how a duck walks and talks?
       • Either we ask the other ducks or, if they are unavailable,
       • we look at who else is walking and talking this way.

  7. Duck walking and talking
     • Assume that a duck
       • has an average step length of 5…15 cm
       • quacks at a frequency of 600…700 Hz
     • On the other hand, consider a cow:
       • its step length is 30…60 cm
       • it moos at 100…200 Hz
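
The duck/cow idea can be sketched in a few lines of Python. This is only an illustration, not the assignment solution: the training points below are made-up values chosen inside the ranges quoted on the slide, and `knn_classify` is a hypothetical helper name. It uses plain Euclidean distance on the raw measurements and a majority vote among the k closest samples.

```python
import math
from collections import Counter

# Made-up samples inside the ranges quoted on the slide:
# (step length in cm, call frequency in Hz) -> label
TRAINING = [
    ((5.0, 600.0), "duck"),
    ((10.0, 650.0), "duck"),
    ((15.0, 700.0), "duck"),
    ((30.0, 100.0), "cow"),
    ((45.0, 150.0), "cow"),
    ((60.0, 200.0), "cow"),
]


def knn_classify(query, training, k):
    """Return the majority label among the k training points closest to query."""
    # Sort all labeled samples by Euclidean distance to the query point.
    by_distance = sorted(training, key=lambda item: math.dist(query, item[0]))
    # Count the labels of the k nearest samples and take the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(knn_classify((12.0, 640.0), TRAINING, k=3))  # animal with short steps, high-pitched call
print(knn_classify((50.0, 120.0), TRAINING, k=3))  # animal with long steps, low-pitched call
```

Note that the frequency axis spans hundreds of Hz while step length spans tens of cm, so the unscaled distance is dominated by frequency.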

  8. Cows and Ducks in a Plot

  9. Enter the Chicken

  10. Classifying you using kNN
     • Each of you belongs to a group:
       • [F|STS|Int Masters|Exchange Students|Other]
     • Let’s classify each one using 1-NN and 3-NN
     • How do we select our distance measure?
     • How do we decide which of 1-NN and 3-NN is best?
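
One possible way to compare distance measures and values of k is leave-one-out evaluation: classify each labeled sample using all the others and count how often the prediction matches. The sketch below assumes that approach; the slide itself does not prescribe one. The 2-D points, the labels "a" and "b", and all function names are hypothetical.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Sum of absolute per-feature differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def knn_classify(query, training, k, distance):
    """Majority label among the k training samples nearest to query."""
    nearest = sorted(training, key=lambda item: distance(query, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(data, k, distance):
    """Fraction of samples classified correctly when each is held out in turn."""
    correct = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # every sample except the held-out one
        if knn_classify(point, rest, k, distance) == label:
            correct += 1
    return correct / len(data)


# Hypothetical data: two clusters plus one "a" sample sitting inside the "b" cluster.
DATA = [
    ((1.0, 1.0), "a"), ((1.0, 2.0), "a"), ((2.0, 1.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
    ((5.25, 5.25), "a"),
]

for k in (1, 3):
    for name, distance in (("euclidean", euclidean), ("manhattan", manhattan)):
        print(k, name, leave_one_out_accuracy(DATA, k, distance))
```

On this toy data the single stray sample misleads 1-NN for all of its neighbours, while 3-NN outvotes it; real data may behave differently, which is why the comparison has to be measured.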

  11. Things to Consider for the Assignment
     • Preprocessing
       • What are the ranges of the different measurements?
       • Is one characteristic more important than another?
         • If so, how can we reflect this?
         • If not, do we need to do something else?
       • You can assume: no missing points, no noise
     • Selecting training and testing data and choosing K
       • Is the data sorted in any way? If so, is this good or bad?
       • Are there different ways of subdividing the known data?
       • How do we know if the value of K is good or bad?
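
As one illustration of the preprocessing and data-selection questions above, the sketch below rescales each measurement to [0, 1] (min-max scaling), optionally multiplies features by a weight, and shuffles the data before splitting it. These are common choices, not necessarily the ones the assignment expects; the function names, the sample values, and the weights are all assumptions.

```python
import random


def fit_min_max(rows):
    """Per-feature (min, max), computed from the given rows only."""
    columns = list(zip(*rows))
    return [(min(c), max(c)) for c in columns]


def apply_min_max(row, params, weights=None):
    """Map each feature to [0, 1] using stored params, then apply optional weights."""
    weights = weights or [1.0] * len(row)
    scaled = []
    for value, (lo, hi), w in zip(row, params, weights):
        span = hi - lo
        # A constant feature carries no information; map it to 0.
        scaled.append(w * ((value - lo) / span if span else 0.0))
    return scaled


def shuffled_split(samples, test_fraction, seed):
    """Shuffle a copy before splitting, so sorted input does not bias the split."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (training, testing)


# Hypothetical measurements: (step length in cm, call frequency in Hz)
rows = [(5.0, 600.0), (15.0, 700.0), (30.0, 100.0), (60.0, 200.0)]
params = fit_min_max(rows)

print(apply_min_max((60.0, 100.0), params))
print(apply_min_max((60.0, 100.0), params, weights=[2.0, 1.0]))  # step length counts double

train, test = shuffled_split(list(range(8)), test_fraction=0.25, seed=0)
print(len(train), len(test))
```

If the known data is sorted by class, an unshuffled split can leave a class entirely out of the training part, which is the situation `shuffled_split` is meant to avoid.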

  12. Things to Consider for the Assignment
     • Classifying unknown data
       • Do we need to preprocess the unknown data?
       • Which data set should we use to classify the unknown data?
     • Complexity
       • What is the offline part of kNN and what is the online part?
       • What is the complexity for the offline and online parts of kNN?
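
To make the offline/online distinction concrete, here is a minimal brute-force sketch. It assumes min-max scaling and no index structure; the class name `BruteForceKNN` and the sample data are hypothetical. With n stored points of d features each, `fit` does O(n·d) work once, and every `predict` call scans all stored points, costing O(n·d) for the distances plus O(n log k) for selecting the k smallest. Unknown data is scaled with the parameters learned from the known data.

```python
import heapq
import math
from collections import Counter


class BruteForceKNN:
    def __init__(self, k):
        self.k = k

    def fit(self, points, labels):
        # Offline part: find per-feature minimum and span, then scale and
        # store every known point. O(n * d) time and memory.
        columns = list(zip(*points))
        self.lows = [min(c) for c in columns]
        self.spans = [(max(c) - min(c)) or 1.0 for c in columns]
        self.points = [self._scale(p) for p in points]
        self.labels = list(labels)
        return self

    def _scale(self, point):
        # Uses the parameters learned in fit, also for unknown queries.
        return [(v - lo) / s for v, lo, s in zip(point, self.lows, self.spans)]

    def predict(self, query):
        # Online part: one distance per stored point, then keep the k smallest.
        q = self._scale(query)
        nearest = heapq.nsmallest(
            self.k,
            zip(self.points, self.labels),
            key=lambda pair: math.dist(q, pair[0]),
        )
        return Counter(label for _, label in nearest).most_common(1)[0][0]


# Made-up samples: (step length in cm, call frequency in Hz)
points = [(5.0, 600.0), (10.0, 650.0), (15.0, 700.0),
          (30.0, 100.0), (45.0, 150.0), (60.0, 200.0)]
labels = ["duck", "duck", "duck", "cow", "cow", "cow"]

model = BruteForceKNN(k=3).fit(points, labels)
print(model.predict((12.0, 640.0)))
print(model.predict((50.0, 120.0)))
```

Spatial index structures such as k-d trees move more of the work into the offline part in exchange for faster queries; whether that pays off depends on n, d, and the number of queries.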
