
Performance Evaluation in Computer Vision


Presentation Transcript


  1. Performance Evaluation in Computer Vision Kyungnam Kim Computer Vision Lab, University of Maryland, College Park

  2. Contents • Error Estimation in Pattern Recognition • Jain et al., “Statistical Pattern Recognition: A Review”, IEEE PAMI 2000 (Section 7 Error Estimation). • Assessing and Comparing Algorithms • Adrian Clark and Christine Clark, “Performance Characterization in Computer Vision: A Tutorial”. • Receiver Operating Characteristic (ROC) curve • Detection Error Trade-off (DET) curve • Confusion Matrix • McNemar’s test • http://peipa.essex.ac.uk/benchmark/

  3. Error Estimation in Pattern Recognition • Reference - Jain et al., “Statistical Pattern Recognition: A Review”, IEEE PAMI 2000 (Section 7 Error Estimation). • It is very difficult to obtain a closed-form expression for error rate Pe. • In practice, the error rate must be estimated from all the available samples split into training and test sets. • Error estimate = percentage of misclassified test samples. • Reliable error estimate – (1) Large sample size, (2) Independent training and test samples.

  4. Error Estimation in Pattern Recognition • The error estimate (a function of the specific training and test sets used) is a random variable. • Given a classifier, let t be the number of misclassified test samples out of n; then t has a binomial distribution. • The maximum-likelihood estimate of Pe, denoted P̂e, is given by P̂e = t/n, with E(P̂e) = Pe and Var(P̂e) = Pe(1 − Pe)/n. • Because P̂e is a random variable, report a confidence interval, which shrinks as n increases.
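The following sketch (not from the slides) computes this maximum-likelihood estimate and its normal-approximation confidence interval; the function name, the example counts, and the 1.96 z-value for a 95% interval are illustrative assumptions.

```python
import math

def error_estimate(t, n, z=1.96):
    """Maximum-likelihood estimate of the error rate, t misclassified
    test samples out of n, with an approximate confidence interval
    based on Var = Pe(1 - Pe)/n.  The interval shrinks as n grows."""
    pe_hat = t / n
    half_width = z * math.sqrt(pe_hat * (1.0 - pe_hat) / n)
    return pe_hat, (max(0.0, pe_hat - half_width), min(1.0, pe_hat + half_width))

# e.g. 23 misclassified samples out of a 200-sample test set
print(error_estimate(23, 200))
```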

  5. Error Estimation in Pattern Recognition • Resubstitution (“leave all in”), leave-one-out and other versions of the cross-validation approach • Bootstrap: resampling based on the analogy population → sample, sample → sample • http://www.uvm.edu/~dhowell/StatPages/Resampling/Bootstrapping.html • http://www.childrens-mercy.org/stats/ask/bootstrap.asp • http://www.cnr.colostate.edu/class_info/fw663/bootstrap.pdf • http://www.maths.unsw.edu.au/ForStudents/courses/math3811/lecture9.pdf
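A rough illustration of the bootstrap analogy above (the sample is resampled just as the population was sampled); the `train_and_test` callback and all names are placeholders assumed for this sketch, not part of the slides.

```python
import random

def bootstrap_error(samples, labels, train_and_test, B=100, seed=0):
    """Bootstrap error estimate: draw B resamples of the data with
    replacement (sample -> resample, mirroring population -> sample),
    train on each resample and test on the samples left out.
    `train_and_test` is a user-supplied function returning an error rate."""
    rng = random.Random(seed)
    n = len(samples)
    errors = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]           # resample with replacement
        chosen = set(idx)
        held_out = [i for i in range(n) if i not in chosen]  # left-out samples act as a test set
        if not held_out:
            continue
        errors.append(train_and_test(
            [samples[i] for i in idx], [labels[i] for i in idx],
            [samples[i] for i in held_out], [labels[i] for i in held_out]))
    return sum(errors) / len(errors)
```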

  6. Error Estimation in Pattern Recognition • Receiver Operating Characteristic (ROC) curve (detailed later). • ‘Reject rate’: reject doubtful patterns near the decision boundary (low confidence). • A well-known reject option is to reject a pattern if its maximum a posteriori probability is below a threshold. • Trade-off between ‘reject rate’ and ‘error rate’.
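A minimal sketch of this reject option; the class names, posterior values, and threshold below are invented for illustration.

```python
def classify_with_reject(posteriors, threshold=0.9):
    """Return the maximum a posteriori class only if its posterior
    exceeds the threshold; otherwise reject the doubtful pattern."""
    best = max(posteriors, key=posteriors.get)
    return best if posteriors[best] >= threshold else None   # None = rejected

# Raising the threshold raises the reject rate but lowers the error rate.
print(classify_with_reject({"face": 0.55, "background": 0.45}))   # rejected (None)
print(classify_with_reject({"face": 0.97, "background": 0.03}))   # "face"
```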

  7. Next seminar: Dimensionality Reduction/Manifold Learning?

  8. [Figure: classification method]

  9. Assessing and Comparing Algorithms • Reference: Adrian Clark and Christine Clark, “Performance Characterization in Computer Vision: A Tutorial”. • http://peipa.essex.ac.uk/benchmark/tutorials/essex/tutorial.pdf • Use the same training and test sets. Some standard sets – FERET, PETS. • Is it enough simply to see which has the better success rate? No: a standard statistical test, McNemar’s test, is required. • Two types of testing: • Technology evaluation: the response of an underlying generic algorithm to factors such as adjustment of its tuning parameters, noisy input data, etc. • Application evaluation: how well an algorithm performs a particular task

  10. Assessing and Comparing Algorithms • Receiver Operating Characteristic (ROC) curve
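As a rough sketch of how an ROC curve is produced, the code below sweeps the decision threshold over classifier scores and records false-positive against true-positive rates; the scores and labels are made up for illustration.

```python
def roc_points(scores, labels):
    """For each threshold over the classifier scores, count true and
    false positives and return (FP rate, TP rate) pairs, i.e. ROC points."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points

scores = [0.9, 0.8, 0.7, 0.55, 0.4, 0.2]   # classifier confidences
labels = [1,   1,   0,   1,    0,   0]     # ground truth (1 = positive)
print(roc_points(scores, labels))
```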

  11. Assessing and Comparing Algorithms • Detection Error Trade-off (DET) curve • logarithmic scales on both axes • more spread out, easier to distinguish • close to linear

  12. Assessing and Comparing Algorithms • Detection Error Trade-off (DET) curve • Forensic applications: track down a suspect • High-security applications: ATM machines • EER (equal error rate) • Comparisons of algorithms tend to be performed with a specific set of tuning parameter values (running them with settings that correspond to the EER is probably the most sensible).
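A small sketch of locating the equal error rate by scanning thresholds until the false-acceptance and false-rejection rates meet; the data and the simple averaging at the crossing point are illustrative assumptions.

```python
def equal_error_rate(scores, labels):
    """Scan thresholds and return the one where the false-acceptance
    rate (FAR) and false-rejection rate (FRR) are closest, together
    with the approximate EER at that point."""
    pos = sum(labels)
    neg = len(labels) - pos
    best = None
    for thr in sorted(set(scores)):
        far = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0) / neg
        frr = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1) / pos
        if best is None or abs(far - frr) < abs(best[1] - best[2]):
            best = (thr, far, frr)
    thr, far, frr = best
    return thr, (far + frr) / 2.0

scores = [0.9, 0.8, 0.7, 0.55, 0.4, 0.2]
labels = [1,   1,   0,   1,    0,   0]
print(equal_error_rate(scores, labels))
```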

  13. Assessing and Comparing Algorithms • Crossing ROC curves • Comparisons of algorithms tend to be performed with a specific set of tuning parameter values (running them with settings that correspond to the EER is probably the most sensible).

  14. Assessing and Comparing Algorithms • Confusion Matrices
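A brief sketch of building a confusion matrix, counting how often each true class receives each predicted label; the class names and labels below are invented for the example.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows: true class, columns: predicted class, entries: counts."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

classes   = ["car", "person", "bike"]
truth     = ["car", "car", "person", "person", "person", "bike"]
predicted = ["car", "person", "person", "person", "bike", "bike"]
for cls, row in zip(classes, confusion_matrix(truth, predicted, classes)):
    print(cls, row)
```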

  15. Assessing and Comparing Algorithms • McNemar’s test (a form of chi-squared test): an appropriate statistical test must take into account not only the numbers of false positives etc., but also the number of tests. • http://www.zephryus.demon.co.uk/geography/resources/fieldwork/stats/chi.html • http://www.isixsigma.com/dictionary/Chi_Square_Test-67.htm

  16. Assessing and Comparing Algorithms • McNemar’s test If # of tests > 30, the central limit theorem applies
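A short sketch of McNemar's test for comparing two algorithms on the same test set; the Z form with continuity correction and the 1.96 significance threshold follow common practice and are assumptions here rather than details taken from the slide.

```python
import math

def mcnemar_z(n_sf, n_fs):
    """McNemar's test statistic for two algorithms A and B run on the
    same test data.  n_sf: cases where A succeeded and B failed;
    n_fs: cases where A failed and B succeeded.  When enough cases
    differ (roughly > 30, so the normal approximation holds),
    |Z| > 1.96 suggests a significant difference at the 5% level."""
    return (abs(n_sf - n_fs) - 1) / math.sqrt(n_sf + n_fs)

print(mcnemar_z(30, 10))   # the two algorithms disagree on 40 test cases
```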
