Symbol Recognition Contest 2009: current status
Philippe Dosch (1), Ernest Valveny (2) and Mathieu Delalandre (2)
(1) LORIA, QGAR team, Nancy, France
(2) CVC, DAG Group, Barcelona, Spain
GREC 2009 Workshop, La Rochelle, France
Thursday 23rd of July 2009
Introduction
• Context
  • Many recognition methods exist, sometimes very ad hoc and domain-dependent
  • Which are the most generic/robust ones?
    • Able to recognize a large variety of data, from different application domains
    • Robust to common noise and distortion found in documents
    • Easy to implement and/or tune
• Objective: measure their performance and robustness under different criteria and kinds of noise
Introduction
• Past recognition contests: ICPR’00, GREC’2003, GREC’2005 and GREC’2007
• Contest evolution
  • ICPR’00, GREC’2003, GREC’2005: segmented technical symbols
  • GREC’2007: segmented logos
  • GREC’2009: whole drawings (i.e. symbol localization)
• Agenda
  • By 31st of July: training datasets will be available at http://dag.cvc.uab.es/isrc2009/
  • By October: the contest will be run online at http://epeires.loria.fr/
• Interested people are invited to participate
Introduction
• Concerned data
Plan
• Recognition datasets (segmented technical symbols and logos)
• Localization datasets (drawings, queries)
• Conclusions
Recognition Datasets
• Basic dataset
  • 10-25 images/class
  • All classes included
• Scalability: subsets of the basic dataset with increasing number of classes (25, 50, 100, 150)
• Image degradations: application of increasing levels of degradation to the images of the basic dataset (for each kind of degradation)
• Geometric transformations: application of rotation and scaling to the images of the basic dataset
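The scalability subsets described above could be drawn along the following lines. This is only a sketch: the slides give the subset sizes (25, 50, 100, 150) but not how classes are picked, so the nesting of the subsets, the function name `scalability_subsets`, the seed and the class labels are all assumptions made for illustration.

```python
import random

def scalability_subsets(classes, sizes=(25, 50, 100, 150), seed=0):
    """Return a dict mapping each subset size to a list of class labels.

    Assumption (not stated in the slides): subsets are nested, so every
    larger subset contains all classes of the smaller ones.
    """
    if max(sizes) > len(classes):
        raise ValueError("not enough classes for the largest subset")
    rng = random.Random(seed)
    shuffled = list(classes)
    rng.shuffle(shuffled)
    # Prefixes of a single shuffled list are nested by construction.
    return {n: shuffled[:n] for n in sorted(sizes)}

# Hypothetical class labels standing in for the basic dataset.
all_classes = [f"symbol_{i:03d}" for i in range(150)]
subsets = scalability_subsets(all_classes)
for n, subset in subsets.items():
    print(n, "classes, first three:", subset[:3])
```

Nesting the subsets means that a change in recognition rate between two sizes comes only from the added classes, not from a different draw; independent draws per size would be an equally valid reading of the slide.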
Recognition Datasets
• [Figure: examples for noise A, noise B and noise E]
Recognition Datasets
• Training sets
Localization Datasets
• [Figure: building engine with symbol models and background image; steps (1) edit, (2) run, (3) display; labels C1-C4, c1, c2, M1-M4]
• [Figure: alignment, showing loaded symbol, symbol model, bounding box and control point; labels p, p1, p2, L, L1, L2, θ1, θ2]
Localization Datasets
• Background Dataset 1, 2, ..., n; Contest Dataset 1, 2, ..., n
• Image degradation
• Random selection of a test image with groundtruth
• Groundtruth
• Generation of queries
  1. Random selection of a document
  2. Random selection of a symbol
  3. Random crop
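The three query-generation steps can be sketched as follows. The groundtruth layout (a list of documents, each with a page size and labelled symbol bounding boxes), the function name `make_query` and the `max_margin` parameter are hypothetical. The slides also do not say how the random crop is taken; here it is assumed to be the symbol's bounding box grown by a random margin on each side and clamped to the page.

```python
import random

def make_query(documents, rng, max_margin=20):
    """Build one query from groundtruth (hypothetical data layout)."""
    # 1. Random selection of a document
    doc = rng.choice(documents)
    # 2. Random selection of a symbol in that document
    symbol = rng.choice(doc["symbols"])
    # 3. Random crop (assumption: symbol box plus random margins,
    #    clamped so the crop stays inside the page)
    x0, y0, x1, y1 = symbol["bbox"]
    width, height = doc["size"]
    crop = (
        max(0, x0 - rng.randint(0, max_margin)),
        max(0, y0 - rng.randint(0, max_margin)),
        min(width, x1 + rng.randint(0, max_margin)),
        min(height, y1 + rng.randint(0, max_margin)),
    )
    return {"document": doc["name"], "label": symbol["label"],
            "bbox": symbol["bbox"], "crop": crop}

# Made-up groundtruth for illustration only.
documents = [
    {"name": "drawing_01", "size": (800, 600),
     "symbols": [{"label": "door", "bbox": (100, 120, 160, 180)},
                 {"label": "sink", "bbox": (780, 10, 800, 40)}]},
    {"name": "drawing_02", "size": (640, 480),
     "symbols": [{"label": "table", "bbox": (0, 0, 50, 30)}]},
]
query = make_query(documents, random.Random(42))
print(query)
```

Passing an explicit `random.Random` instance keeps the generated queries reproducible, which matters if the same query set has to be served to every participant.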
Localization Datasets
• [Figure: examples for Level 1, Level 2 and Level 3]
Conclusions
• New feature of the contest: localization datasets
• Remaining work: performance characterization for localization, using a simple method (e.g. bounding box overlapping)
• Agenda
  • By 31st of July: training datasets will be available at http://dag.cvc.uab.es/isrc2009/
  • By October: the contest will be run online at http://epeires.loria.fr/
• Interested people are invited to participate; please contact us:
  philippe.dosch@loria.fr
  ernest.valveny@uab.cat
  mathieu@cvc.uab.es
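The bounding box overlapping mentioned under remaining work could be measured along these lines. This is a sketch, not the contest's actual metric: the slides leave the measure open, so the choice of intersection over union, the function names `overlap_ratio` and `is_localized`, and the 0.5 threshold are assumptions.

```python
def overlap_ratio(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Width and height of the intersection; zero when the boxes are disjoint.
    iw = max(0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0

def is_localized(detected, groundtruth, threshold=0.5):
    """Count a detection as correct when the overlap reaches the threshold.

    The threshold value is an assumption chosen for illustration.
    """
    return overlap_ratio(detected, groundtruth) >= threshold

groundtruth_box = (100, 120, 160, 180)
detected_box = (110, 125, 165, 185)
print(overlap_ratio(detected_box, groundtruth_box))
print(is_localized(detected_box, groundtruth_box))
```

A ratio normalized by the union penalizes both detections that are too large and detections that are too small, whereas normalizing by the groundtruth area alone would reward oversized boxes.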