
Introduction to Pattern Recognition
Jan Flusser, flusser@utia.cas.cz
Institute of Information Theory and Automation

Presentation Transcript


  1. Jan Flusser flusser@utia.cas.cz Institute of Information Theory and Automation Introduction to Pattern Recognition

  2. Pattern Recognition • Recognition (classification) = assigning a pattern/object to one of pre-defined classes

  3. Pattern Recognition • Recognition (classification) = assigning a pattern/object to one of pre-defined classes • Statistical (feature-based) PR - the pattern is described by features (an n-D vector in a metric space)

  4. Pattern Recognition • Recognition (classification) = assigning a pattern/object to one of pre-defined classes • Syntactic (structural) PR - the pattern is described by its structure. Formal language theory (class = language, pattern = word)

  5. Pattern Recognition • Supervised PR – training set available for each class

  6. Pattern Recognition • Supervised PR – training set available for each class • Unsupervised PR (clustering) – training set not available, No. of classes may not be known

  7. PR system - Training stage

  8. Desirable properties of the features • Invariance • Discriminability • Robustness • Efficiency, independence, completeness

  9. Desirable properties of the training set • It should contain typical representatives of each class including intra-class variations • Reliable and large enough • Should be selected by domain experts

  10. Classification rule setup • Equivalent to a partitioning of the feature space • Independent of the particular application

  11. PR system – Recognition stage

  12. An example – Fish classification

  13. The features: Length, width, brightness

  14. 2-D feature space

  15. Empirical observation • For a given training set, we can have several classifiers (several partitionings of the feature space)

  16. Empirical observation • For a given training set, we can have several classifiers (several partitionings of the feature space) • The training samples are not always classified correctly

  17. Empirical observation • For a given training set, we can have several classifiers (several partitionings of the feature space) • The training samples are not always classified correctly • We should avoid overtraining of the classifier

  18. Formal definition of the classifier • Each class is characterized by its discriminant function g(x) • Classification = maximization of g(x) • Assign x to class i iff g_i(x) ≥ g_j(x) for all j ≠ i • Discriminant functions define the decision boundaries in the feature space
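A minimal Python sketch of this maximization rule; the two discriminant functions and the test point below are illustrative, not taken from the lecture:

```python
import numpy as np

def classify(x, discriminants):
    """Assign x to the class whose discriminant function g_i(x) is largest.

    discriminants: list of callables, one g_i per class (illustrative setup).
    """
    scores = [g(x) for g in discriminants]
    return int(np.argmax(scores))

# Toy example with two hand-made linear discriminants (hypothetical numbers).
g = [lambda x: x[0] + 2 * x[1],        # class 0
     lambda x: 3.0 - x[0] - x[1]]      # class 1
print(classify(np.array([1.0, 0.5]), g))   # g_0 = 2.0 > g_1 = 1.5  ->  class 0
```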

  19. Minimum distance (NN) classifier • Discriminant function g(x) = 1/dist(x,w) • Various definitions of dist(x,w) • One-element training set → Voronoi polygons

  20. Voronoi polygons

  21. Minimum distance (NN) classifier • Discriminant function g(x) = 1/dist(x,w) • Various definitions of dist(x,w) • One-element training set → Voronoi polygons • NN classifier may not be linear • NN classifier is sensitive to outliers → k-NN classifier

  22. k-NN classifier Find the nearest training points until k samples belonging to one class are reached
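A small Python sketch of the rule as stated above: training points are scanned in order of increasing Euclidean distance until some class has collected k of them (array names are illustrative); with k = 1 it reduces to the NN classifier.

```python
import numpy as np

def knn_classify(x, X_train, y_train, k=3):
    """Scan training points from nearest to farthest and stop as soon as
    some class has accumulated k of them; return that class."""
    order = np.argsort(np.linalg.norm(X_train - x, axis=1))   # nearest first
    counts = {}
    for i in order:
        c = int(y_train[i])
        counts[c] = counts.get(c, 0) + 1
        if counts[c] == k:
            return c
```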

  23. Linear classifier Discriminant functions g(x) are linear; the decision boundaries are hyperplanes
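For illustration, a two-class linear discriminant in Python; the weight vector and bias below are made-up numbers, in practice they are set during the training stage:

```python
import numpy as np

def linear_discriminant(w, b):
    """g(x) = w.x + b; the set g(x) = 0 is a hyperplane in the feature space."""
    return lambda x: float(np.dot(w, x) + b)

g = linear_discriminant(np.array([1.0, -2.0]), 0.5)     # illustrative parameters
label = 1 if g(np.array([2.0, 0.3])) >= 0 else 2        # two-class decision -> 1
```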

  24. Bayesian classifier Assumption: feature values are random variables Statistical classifier, the decision is probabilistic It is based on the Bayes rule

  25. The Bayes rule P(ω_i | x) = p(x | ω_i) P(ω_i) / p(x), where p(x | ω_i) is the class-conditional probability, P(ω_i) the a priori probability, P(ω_i | x) the a posteriori probability, and p(x) = Σ_j p(x | ω_j) P(ω_j) the total probability
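A tiny numerical illustration of the rule with made-up numbers (two classes, one observed feature vector x):

```python
priors = [0.3, 0.7]                 # P(w_1), P(w_2), hypothetical values
likelihoods = [0.9, 0.2]            # p(x|w_1), p(x|w_2) at the observed x

p_x = sum(l * p for l, p in zip(likelihoods, priors))          # total probability
posteriors = [l * p / p_x for l, p in zip(likelihoods, priors)]
# posteriors ~ [0.66, 0.34]  ->  x is assigned to class w_1
```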

  26. Bayesian classifier Main idea: maximize the posterior probability P(ω_i | x) Since it is hard to do directly, we rather maximize p(x | ω_i) P(ω_i) (the denominator p(x) is the same for all classes) In case of equal priors, we maximize only p(x | ω_i)
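The same decision written as a short sketch; class_pdfs stands for the estimated class-conditional densities discussed on the following slides:

```python
import numpy as np

def bayes_decide(x, class_pdfs, priors):
    """Pick the class maximizing p(x|w_i) * P(w_i); with equal priors this is
    the same as maximizing the class-conditional density p(x|w_i) alone."""
    scores = [pdf(x) * P for pdf, P in zip(class_pdfs, priors)]
    return int(np.argmax(scores))
```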

  27. Equivalent formulation in terms of discriminant functions, e.g. g_i(x) = p(x | ω_i) P(ω_i) or its logarithm

  28. How to estimate the priors P(ω_i) and the densities p(x | ω_i)? • From the case studies performed before (OCR, speech recognition) • From the occurrence of each class in the training set • Assumption of equal priors • Parametric estimate (assuming the pdf is of a known form, e.g. Gaussian) • Non-parametric estimate (pdf is unknown or too complex)
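For the second option, a one-liner estimating the priors as relative class frequencies in the training set (the label array below is illustrative):

```python
import numpy as np

y_train = np.array([0, 0, 1, 1, 1, 2])              # illustrative class labels
priors = np.bincount(y_train) / len(y_train)        # -> [1/3, 1/2, 1/6]
```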

  29. Parametric estimate of Gaussian
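A sketch of the maximum-likelihood estimates of the Gaussian parameters from the training samples of one class (assuming the rows of X are the feature vectors):

```python
import numpy as np

def fit_gaussian(X):
    """ML estimates of the mean vector and covariance matrix of one class."""
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False, bias=True)   # bias=True: divide by n (ML estimate)
    return mu, sigma
```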

  30. d-dimensional Gaussian pdf
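The density itself is p(x) = exp(-(x-μ)' Σ^(-1) (x-μ)/2) / ((2π)^(d/2) |Σ|^(1/2)); a small Python evaluation of this formula:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """d-dimensional Gaussian density with mean mu and covariance matrix sigma."""
    d = len(mu)
    diff = x - mu
    maha_sq = diff @ np.linalg.solve(sigma, diff)        # (x-mu)' Sigma^-1 (x-mu)
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * maha_sq) / norm)
```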

  31. The role of covariance matrix

  32. Two-class Gaussian case in 2D Classification = comparison of two Gaussians

  33. Two-class Gaussian case – Equal cov. mat. Linear decision boundary

  34. Equal priors Maximizing the Gaussian p(x | ω_i) = minimizing the Mahalanobis distance to the class mean, i.e. classification by minimum Mahalanobis distance If the cov. matrix is diagonal with equal variances, then we get the “standard” minimum distance rule
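A sketch of this rule for equal priors and one shared covariance matrix; the class means and sigma would come from the training stage:

```python
import numpy as np

def mahalanobis_classify(x, means, sigma):
    """Assign x to the class whose mean is nearest in the Mahalanobis distance.
    If sigma is diagonal with equal variances, this degenerates to the
    ordinary minimum (Euclidean) distance rule."""
    d2 = [(x - m) @ np.linalg.solve(sigma, x - m) for m in means]
    return int(np.argmin(d2))
```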

  35. Non-equal priors Linear decision boundary still preserved

  36. General Gaussian case in 2D Decision boundary is a hyperquadric

  37. General Gaussian case in 3D Decision boundary is a hyperquadric

  38. More classes, Gaussian case in 2D

  39. What to do if the classes are not normally distributed? • Gaussian mixtures • Parametric estimation of some other pdf • Non-parametric estimation

  40. Non-parametric estimation – Parzen window
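A minimal 1-D Parzen-window estimate with a Gaussian window of width h (the kernel choice and variable names are illustrative, not prescribed by the lecture):

```python
import numpy as np

def parzen_estimate(x, samples, h):
    """Estimate the pdf at x as the average of Gaussian windows of width h
    centred on the training samples."""
    samples = np.asarray(samples, dtype=float)
    windows = np.exp(-0.5 * ((x - samples) / h) ** 2) / (h * np.sqrt(2 * np.pi))
    return float(windows.mean())
```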

  41. The role of the window size • Small window → overtraining • Large window → data smoothing • Continuous data → the size does not matter

  42. Parzen-window estimates for growing numbers of samples: n = 1, n = 10, n = 100, n = ∞

  43. The role of the window size: small window vs. large window

  44. Applications of the Bayesian classifier in multispectral remote sensing • Objects = pixels • Features = pixel values in the spectral bands (from 4 to several hundreds) • Training set – selected manually by means of thematic maps (GIS) and on-site observation • Number of classes – typically from 2 to 16

  45. Satellite MS image

  46. Other classification methods in RS • Context-based classifiers • Shape and textural features • Post-classification filtering • Spectral pixel unmixing

  47. Non-metric classifiers Typically for “YES – NO” features Feature metric is not explicitly defined Decision trees

  48. General decision tree

  49. Binary decision tree Any decision tree can be replaced by a binary tree

  50. Real-valued features • Node decisions are in the form of inequalities • Training = setting their parameters • Simple inequalities → stepwise decision boundary
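A toy hand-built binary tree over two real-valued features: each inner node tests one inequality whose threshold would be set during training (the values here are made up), and the resulting decision boundary is stepwise (axis-parallel):

```python
def tree_classify(x):
    """Tiny binary decision tree: inner nodes test inequalities, leaves are classes."""
    if x[0] <= 2.5:              # root node: inequality on feature 0
        return "class A"
    elif x[1] <= 1.0:            # second node: inequality on feature 1
        return "class B"
    else:
        return "class C"

print(tree_classify([3.0, 0.4]))     # feature 0 > 2.5, feature 1 <= 1.0 -> "class B"
```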
