
Pattern Classification and Evaluating



  1. Pattern Classification and Evaluating

  2. Contents • Introduction to Pattern Recognition • Pattern Classification • Evaluating a pattern

  3. What is Pattern Recognition? • The study of how machines can observe the environment, • learn to distinguish patterns of interest from their background, and • make sound and reasonable decisions about the categories of the patterns. • What is a pattern? • What kinds of categories do we have?

  4. What is a Pattern? • The opposite of chaos; an entity, vaguely defined, that could be given a name. • For example, a pattern could be • A fingerprint image • A handwritten cursive word • A human face • A speech signal

  5. Pattern Recognition Models • The four best known approaches • template matching • statistical classification • syntactic or structural matching • neural networks

  6. Pattern Recognition Models

  7. Pattern Representation • A pattern is represented by a set of d features, or attributes, viewed as a d-dimensional feature vector.

  8. Two Modes of a Pattern Recognition System • Classification mode: test pattern → Preprocessing → Feature Measurement → Classification • Training mode: training pattern → Preprocessing → Feature Extraction/Selection → Learning

  9. Pattern Classification • Pattern classification involves taking features extracted from the image and using them to classify image objects automatically. • This is done by developing classification algorithms that use the feature information. • The primary uses of pattern classification are in developing computer vision and image compression applications.

  10. Pattern Classification • Pattern classification is typically the final step in the development of a computer vision algorithm. • In computer vision applications, the goal is to identify objects in order for the computer to perform some vision-related task. • These tasks range from computer diagnosis of medical images to object classification for robot control.

  11. Pattern Classification • In image compression, we want to remove redundant information from the image and compress the important information as much as possible. • One way to compress information is to find a higher-level representation of it, which is exactly what feature analysis and pattern classification are all about.

  12. Pattern Classification • To develop a classification algorithm, we need to divide our data into two sets. • Training set: to develop the classification scheme • Test set: to test the classification algorithm • Both the training and the test sets should represent the images that will be seen in the application domain.

  13. Pattern Classification • Theoretically, a larger training set gives an increasingly higher success rate. • However, since we normally have a finite amount of data (images), the data are divided equally between the two sets. • After the data have been divided, work can begin on the development of the classification algorithm (Figure 6.4.1).
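The equal train/test split described above can be sketched as follows (a minimal illustration; the function name `split_data` and the 50/50 default are assumptions, not from the slides):

```python
import random

def split_data(samples, train_fraction=0.5, seed=0):
    """Shuffle the samples and divide them into a training set and a
    test set. The slides divide a finite data set equally between the
    two sets, so train_fraction defaults to 0.5."""
    rng = random.Random(seed)
    shuffled = list(samples)   # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

train, test = split_data(list(range(10)))   # 5 training, 5 test samples
```

Shuffling before splitting helps both sets represent the variation seen in the application domain.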

  14. Pattern Classification • The general approach is to use the information in the training set to classify the “unknown” samples in the test set. • It is assumed that all available samples have a known classification. • The success rate is measured by the number of correct classifications.

  15. Pattern Classification • The simplest method for identifying a sample from the test set is called the nearest neighbor method. • The object of interest is compared to every sample in the training set using a distance measure, a similarity measure, or a combination of measures.

  16. Pattern Classification • The “unknown” object is then identified as belonging to the same class as the closest example in the training set. • If a distance measure is used, this is indicated by the smallest number. • If a similarity measure is used, this is indicated by the largest number. • This process is computationally intensive and not robust.
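A minimal sketch of the nearest neighbor rule using a Euclidean distance measure (the representation of samples as `(feature_vector, label)` pairs and the helper name `nearest_neighbor` are illustrative assumptions):

```python
import math

def nearest_neighbor(unknown, training):
    """training: list of (feature_vector, class_label) pairs.
    With a distance measure, the match is the SMALLEST number."""
    best_label, best_dist = None, float("inf")
    for vec, label in training:
        d = math.dist(unknown, vec)   # Euclidean distance
        if d < best_dist:
            best_dist, best_label = d, label
    return best_label

training = [((0.0, 0.0), "A"), ((5.0, 5.0), "B")]
label = nearest_neighbor((1.0, 1.0), training)   # closest sample is class "A"
```

Note that every classification scans the whole training set, which is why the slides call the method computationally intensive.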

  17. Pattern Classification • We can make the nearest neighbor method more robust by selecting not just the vector it is closest to, but a group of close feature vectors. • This method is known as K-nearest neighbor method. • K can be assigned any integer.

  18. Pattern Classification • Then we assign the unknown feature vector to the class that occurs most often in the set of K-neighbors. • This is still very computationally expensive since we must compare each unknown sample to every sample in the training set. • Even worse, we normally want the training set to be as large as possible.
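The K-nearest neighbor vote described above can be sketched like this (function name and tuple representation are assumptions; `k` can be any integer):

```python
import math
from collections import Counter

def knn_classify(unknown, training, k=3):
    """Assign the unknown feature vector to the class that occurs
    most often among its k nearest training samples."""
    ranked = sorted(training, key=lambda sample: math.dist(unknown, sample[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

training = [((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"),
            ((5.0, 5.0), "B"), ((5.0, 4.0), "B")]
label = knn_classify((0.5, 0.5), training, k=3)   # majority of neighbors: "A"
```

The full sort makes the cost of each classification grow with the training set size, which is the expense the next slide on nearest centroid tries to avoid.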

  19. Pattern Classification • One way to reduce the amount of computation is by using a method called nearest centroid. • Here, we find the centroid vector that is the representative of the whole class. • The centroids are calculated by finding the average value for each vector component in the training set.

  20. Pattern Classification • The unknown sample only needs to be compared with the representative centroid. • This would reduce the number of comparisons and subsequently the amount of calculations. • Template matching is a pattern classification method that uses the raw image data as a feature vector.
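The nearest centroid scheme from the last two slides, averaging each vector component per class and then comparing the unknown sample with one centroid per class, can be sketched as (helper names are illustrative):

```python
import math

def class_centroids(training):
    """Average each feature component per class over the training set."""
    sums, counts = {}, {}
    for vec, label in training:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def nearest_centroid(unknown, centroids):
    """Compare the unknown sample with one representative per class."""
    return min(centroids, key=lambda lab: math.dist(unknown, centroids[lab]))

training = [((0.0, 0.0), "A"), ((2.0, 2.0), "A"), ((10.0, 10.0), "B")]
cents = class_centroids(training)        # "A" -> [1.0, 1.0], "B" -> [10.0, 10.0]
label = nearest_centroid((2.0, 1.0), cents)
```

The number of comparisons per sample drops from the training set size to the number of classes.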

  21. Pattern Classification • A template is devised, possibly via a training set, which is then compared to subimages by using a distance or similarity measure. • Typically, a threshold is set on this measure to determine when we have found a match. • More sophisticated methods using fuzzy logic, artificial neural networks, and probability density models are also commonly used.
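The thresholded template match might be sketched like this (sum of absolute differences is an assumed choice of distance measure, and real template matching would slide a 2-D template over every subimage position):

```python
def template_match(template, subimage, threshold):
    """Distance between template and subimage as the sum of absolute
    differences of the raw pixel values; a match is declared when the
    distance is at or below the threshold."""
    dist = sum(abs(t - s) for t, s in zip(template, subimage))
    return dist <= threshold, dist

# Flattened pixel values for illustration
matched, dist = template_match([10, 20, 30], [11, 20, 29], threshold=3)
```

Raising the threshold trades fewer misses for more false alarms, which is the FN/FP trade-off discussed in the evaluation slides below.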

  22. Evaluating a pattern recognition system • Recognition rate • Cross validation • Interpretation of the results

  23. Recognition rate • In your system specifications you need success criteria for your project (product) • HW, SW, real-time, recognition rate, … • Recognition rate = (number of correctly classified samples / number of tested samples) • Multiply by 100% and you have it in percentages • How do you test a system? • How do you present and interpret the results?
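The recognition rate formula above is a one-liner; the example counts (23 of 25 correct) are illustrative:

```python
def recognition_rate(n_correct, n_tested):
    """Recognition rate = correctly classified / tested samples."""
    return n_correct / n_tested

rate = recognition_rate(23, 25)   # e.g. 23 of 25 test samples correct
rate_pct = rate * 100.0           # multiply by 100% for percentages
error_pct = 100.0 - rate_pct      # Error % = 100% - recognition rate %
```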

  24. Test • The training data contains variations • Is this variation similar to the variations in “real life” data? • The system will never be better than the training data! • The right question to ask is how well the trained system generalizes • That is, how well does the system recognize UNKNOWN data? • NEVER TEST ON TRAINING DATA!!! • Testing on the training data does, however, provide an upper limit for the recognition rate • Test methods: • Cross-validation • M-fold cross-validation

  25. Methods for test • Cross-validation • Train on a% of the samples (a > 50) and test on the rest • a is typically 90, depending on the number of samples and the complexity of the system • M-fold cross-validation • Divide (randomly) all samples into M equally sized groups • Use M−1 groups to train the system and test on the rest • Do this M times and average the results
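The M-fold procedure can be sketched as a generator of train/test splits (the function name and the round-robin fold assignment are implementation assumptions):

```python
import random

def m_fold_splits(samples, m, seed=0):
    """Randomly divide the samples into m equally sized groups, then
    yield (train, test) pairs in which each group is the test set
    exactly once."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    folds = [shuffled[i::m] for i in range(m)]   # round-robin split
    for i in range(m):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test
```

Each sample is tested exactly once; averaging the recognition rate over the m runs gives the cross-validated estimate the slide describes.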

  26. Training of the system • Before we test we need to train our system • How much should the system be trained? • How much should the different parameters be tuned? • Danger of overfitting!

  27. Interpretation of the results • Recognition rate = (number of correctly classified samples / number of tested samples) • Multiply by 100% and you have it in percentages • Error % = 100% − (recognition rate × 100%) • Distribution of errors? • Confusion matrix • Example: 3 classes, 25 samples per class • Rows: input (the truth); columns: output (from the system)

  28. Confusion matrix • Rows: input (the truth); columns: output (from the system) • Provides insight into: • Are the errors equally distributed, or are the errors only associated with a few classes? • “Solution”: sub-divide the classes, delete some classes, use other features, post-processing, … • Is one class too big (“eats” many others)? • Which classes are close? • Etc.

  29. Confusion matrix – overview • Number of errors = incorrectly recognized + not recognized • For a 3×3 confusion matrix with entries a…i, the row sums are S1 = a + b + c, S2 = d + e + f, S3 = g + h + i • Total number of samples: S = S1 + S2 + S3 • Successes: T = trace = a + e + i (the matrix diagonal) • Number of errors = S − T
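The row sums and trace above can be computed directly from a confusion matrix (the function name and the example counts are illustrative, chosen to match the 3-class, 25-samples-per-class example):

```python
def confusion_stats(matrix):
    """matrix[i][j] = samples of true class i classified as class j.
    Returns (total samples S, successes T = trace, number of errors)."""
    total = sum(sum(row) for row in matrix)
    trace = sum(matrix[i][i] for i in range(len(matrix)))
    return total, trace, total - trace

# 3 classes, 25 samples per class (rows = the truth, columns = system output)
cm = [[23, 1, 1],
      [2, 22, 1],
      [0, 3, 22]]
S, T, errors = confusion_stats(cm)
```

Scanning the off-diagonal cells shows whether the errors cluster in a few classes, which is exactly the insight the slide asks for.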

  30. General representation of errors • Number of errors = incorrectly recognized + not recognized • The total number of errors can be represented in a table with input (the truth) as rows and output (from the system) as columns • How does the system respond to a random input?

  31. General representation of errors • Number of errors = incorrectly recognized + not recognized • Not recognized: Type II error, false negative (FN), false reject (FR), false reject rate (FRR), a “miss” • Incorrectly recognized: Type I error, false positive (FP), false accept (FA), false accept rate (FAR), a “ghost object” or false alarm

  32. Design your system wrt errors • One parameter (a threshold?) often controls FN and FP • Use this parameter when designing the system wrt errors • Bayes classifier: • Given an input, find the nearest class using the Mahalanobis distance r (sketched on the board) • Are we 100% sure the input originates from a known class? • Noise can result in unreliable data • Solution: besides the nearest class we also introduce a threshold on r • That is, accept only if r < TH; otherwise ignore this sample (not recognized: FN) • (Board: curves as a function of r)

  33. Design your system wrt errors • Choice of TH: DEPENDS ON THE APPLICATION! • Default: EER (equal error rate) or the overall minimum error • If we want a low FP (incorrectly recognized), we get more FN (not recognized), and these samples therefore need post-processing • “Re-try” (as with conveyor belts) • A “new” pattern recognizer with “new” features • Or … • General post-processing: store the likelihoods for each classified sample and use them in the following steps

  34. General representation of errors • Example: SETI • Find intelligent signals in input data • FN versus FP – are they equally important? • No! Missing a real signal (not recognized: FN, Type II error, a miss) is the costly error, while an incorrectly recognized signal (FP, Type I error, a false alarm) is Ok

  35. General representation of errors • Example: Access control to nuclear weapons • Is the person trying to enter ok? • FN versus FP – are they equally important? • No! Here rejecting an authorized person (not recognized: FN, Type II error, false reject) is Ok, while accepting an intruder (incorrectly recognized: FP, Type I error, false accept) is unacceptable

  36. Design your system wrt errors • When we have true negatives (correct rejections) • E.g. a system which assesses whether a person is sick or well • Alternative representation: the ROC curve • ROC = Receiver Operating Characteristic • (Sketched on the board) • X-axis: fraction of the well who were found to be sick • False positive rate (FPR): FP / (FP + TN) • Y-axis: fraction of the sick who were found to be sick • True positive rate (TPR): TP / (TP + FN) • ROC curves are good when comparing different systems • Only one curve with normalized axes: [0, 1] • The closer the curve is to the upper-left corner (TPR = 1, FPR = 0), the better the system
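The two axis formulas give one point on the curve per threshold setting; sweeping the threshold traces the full ROC curve (the helper name and the example counts are illustrative):

```python
def roc_point(tp, fp, tn, fn):
    """One (FPR, TPR) point on the ROC curve for a given threshold."""
    fpr = fp / (fp + tn)   # fraction of the well found to be sick
    tpr = tp / (tp + fn)   # fraction of the sick found to be sick
    return fpr, tpr

# e.g. 50 sick and 100 well people at one threshold setting
fpr, tpr = roc_point(tp=40, fp=10, tn=90, fn=10)
```

Because both axes are normalized to [0, 1], curves from different systems can be compared directly, regardless of class sizes.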

  37. What to remember (2/2) • Danger of overfitting! • Interpretation of results • Confusion matrix • Error = incorrectly recognized (FP) + not recognized (FN) • FP and FN depend on a threshold value • How is this threshold value defined in your system? • FP and FN can be illustrated directly or as a ROC curve • Test on data from unknown classes • Relevant for your project?

  38. What to remember (1/2) • Success criteria for your project • HW, SW, real-time, recognition rate • Test • You perform tests to see how well the trained system generalizes • NEVER TEST ON TRAINING DATA!!! • Cross-validation • Train on a% of the samples (a > 50) and test on the rest • What should a be? • M-fold cross-validation • Divide (randomly) all samples into M equally sized groups • Use M−1 groups to train the system and test on the rest • Do this M times and average the results
