Stable Affine Frames on Isophotes

Michal Perďoch, Jiří Matas and Štěpán Obdržálek
Czech Technical University Prague

Abstract

We propose a new affine-covariant feature, the Stable Affine Frame (SAF). SAFs lie on the boundary of extremal regions, i.e. on isophotes. Instead of requiring the whole isophote to be stable with respect to intensity perturbation, as in maximally stable extremal regions (MSERs), stability is required only locally, for the primitives constituting the three-point frames. We show experimentally on standard datasets that SAFs have repeatability comparable to the best affine-covariant detectors tested in the report of Mikolajczyk et al. [2] and consistently produce a significantly higher number of features per image. Moreover, the features cover images more evenly than MSERs, which facilitates robustness to occlusion.

Motivation

• The set of isophotes is a complete image representation.
• Isophotes possess desirable properties: they are invariant to projective or “elastic” transformations (homeomorphisms) of image coordinates and to monotonic transformations of image intensities.
• Observation from MSERs: not all parts of an isophote are stable as it evolves through the intensities in the image.
• A compact representation (i) covers the image evenly and (ii) avoids multiple responses in one place.

Outline

• Main idea: detect stable affine-covariant structures on isophotes.
• To find stable structures on isophotes, an ordering and a part (e.g. point) correspondence on isophotes have to be defined.
• Affine-invariant point-to-point correspondences on adjacent isophotes are defined via Local Affine Frames (LAFs).
• An extremal principle is used to select stable structures.

Algorithm for Detection of Stable Affine Frames

1. Isophote enumeration
  • Discrete isophotes are boundaries of extremal regions.
  • Extremal regions are computed efficiently using the union-find algorithm.
  • The partial ordering of nested extremal regions by intensity defines an adjacency relation between isophotes.
2. Local affine frames [3] on isophotes
  • LAFs are affine-covariant structures on isophotes.
  • Each LAF is represented as a homogeneous affine matrix A of the transformation that maps the canonical coordinate system to the frame in the image.
  • Two example constructions are shown, one dependent on local and one dependent on global properties of the isophote.
3. Correspondences on adjacent isophotes
  • Sequences of LAFs are formed on adjacent isophotes using a “geometric” similarity […], where A1, A2 are local affine frames and […] is a set of points in the canonical coordinate system.
  • A stable sequence consists of LAFs on adjacent isophotes that satisfy […]; µL is a parameter of the method.
4. Stable LAF selection
  • Greedily find the longest stable sub-sequence satisfying […]; if it is found and if […], then Ax is output as the Stable Affine Frame. Δ and µS are parameters of the method.
  • All frames overlapping the SAF in the sequence are suppressed and step 4 is repeated.

Figure: The construction of two LAFs (cyan and green) on concavities of an isophote.
Figure: Every 4th isophote, shown on a part of the image.
Figure: Three sequences of LAFs on a subset of isophotes.
Figure: Stable and unstable parts of the sequence.
Figure: The shape normalization in the construction of two LAFs (cyan and green), given by the center of gravity, the covariance matrix of the region and the extremal curvature.
Figure: Selected Stable Affine Frame.
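Step 1 relies on union-find to enumerate extremal regions. The following is a minimal sketch of that idea, not the authors' implementation: pixels are added in order of increasing intensity and merged with already-added 4-neighbours, so that after each intensity level the disjoint sets are exactly the connected components of the thresholded image. The function name `count_extremal_regions`, the 4-connectivity and the choice to report only component counts are assumptions made for illustration.

```python
import numpy as np

def find(parent, i):
    """Return the root of pixel i, shortening the path as we go."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return int(i)

def count_extremal_regions(image):
    """Map each intensity t to the number of 4-connected components of {image <= t}.

    Hypothetical helper: a real detector would also keep the region
    boundaries (the isophotes) and their nesting, not just the counts.
    """
    h, w = image.shape
    flat = image.ravel()
    order = np.argsort(flat, kind="stable")      # pixels by increasing intensity
    parent = np.full(h * w, -1, dtype=np.int64)  # -1 marks "not added yet"
    components = 0
    counts = {}
    for idx in order:
        idx = int(idx)
        parent[idx] = idx
        components += 1
        y, x = divmod(idx, w)
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w:
                n = ny * w + nx
                if parent[n] != -1:              # neighbour already in some region
                    ra, rb = find(parent, idx), find(parent, n)
                    if ra != rb:
                        parent[rb] = ra          # merge the two regions
                        components -= 1
        # Overwritten until the last pixel of this level has been processed.
        counts[int(flat[idx])] = components
    return counts

image = np.array([[0, 9, 0],
                  [9, 9, 9],
                  [0, 9, 5]])
print(count_extremal_regions(image))
```

Because regions only ever grow and merge as the threshold rises, the merge history gives the nesting of extremal regions that the adjacency relation between isophotes is built on.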
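To make the matrix representation of a LAF in step 2 concrete, here is a hedged sketch of the shape normalization mentioned in the figure caption: a homogeneous 3x3 matrix A is built from the center of gravity and the covariance matrix of a region, so that A maps canonical coordinates to image coordinates. The names `shape_normalizing_frame` and `to_canonical` are hypothetical, and using the Cholesky factor of the covariance is an assumption; it fixes the frame only up to a rotation, which the poster resolves with an extremal-curvature point that this sketch does not compute.

```python
import numpy as np

def shape_normalizing_frame(points):
    """Build a 3x3 homogeneous affine matrix A (canonical -> image).

    points: (N, 2) array of pixel coordinates of one region.
    Assumption: the linear part is the Cholesky factor L of the covariance
    (cov = L @ L.T); the remaining rotation is left unresolved here.
    """
    mean = points.mean(axis=0)            # center of gravity
    cov = np.cov(points.T, bias=True)     # 2x2 covariance matrix of the region
    L = np.linalg.cholesky(cov)
    A = np.eye(3)
    A[:2, :2] = L
    A[:2, 2] = mean
    return A

def to_canonical(A, points):
    """Map image points into the canonical coordinate system of frame A."""
    homog = np.column_stack([points, np.ones(len(points))])
    return (np.linalg.inv(A) @ homog.T).T[:, :2]

# Synthetic region: an affinely distorted point cloud.
rng = np.random.default_rng(0)
region = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [0.5, 2.0]]) + [40.0, 25.0]
A = shape_normalizing_frame(region)
normalized = to_canonical(A, region)
print(np.round(np.cov(normalized.T, bias=True), 6))  # close to the identity
```

In the canonical system the region has zero mean and identity covariance, which is what makes frames on different isophotes comparable independently of the affine distortion of the image.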

Experiments

• Comparison with the state-of-the-art affine detectors [2]
  • Evaluation of repeatability and number of correspondences of affine-covariant detectors on a standard dataset using the standard protocol.
  • Planar scenes with known ground-truth homography.
• Image coverage experiment
  • The SAF detector allows direct control of the spatial distribution of the detections through the maximum overlap thresholds, which reduces redundancy of the representation.

Figure: Coverage evaluation (darker areas are covered by more responses). (a) Graffiti scene; frames detected in both images by (b) MSER+LAF: 251 frames, 32.85% of 764 detected; (c) SAF: 319 frames, 44.74% of 713 detected.
Figure: Coverage evaluation (darker areas are covered by more responses). (a) Bikes scene; frames detected in both images by (b) MSER+LAF: 586 frames, 33.01% of 1776 detected; (c) SAF: 665 frames, 47.94% of 1387 detected.

Conclusions

• Contributions
  • SAF: a new affine-covariant region detector that is able to detect locally stable structures on isophotes.
  • Repeatability of the detector is comparable with other detectors [2]; the SAF detector provides the highest number of correspondences.
  • Controllable coverage of the image and thus complexity of the model.
  • Outperforms the MSER+LAF method on planar scenes.
• Extensions and future work
  • Can be easily extended to projective-covariant frames [4], or simplified to similarity-covariant frames (DoG, HarLap etc.).
  • Evaluation on non-planar scenes.
  • Improving running time, now approximately 10 s for an 800x600 image vs. 0.1 s for MSERs.

Figure: Ranking of detectors based on repeatability.
Figure: Ranking of detectors based on number of correspondences.

• Repeatability vs. number of features, comparison with MSER+LAF [3]
  • Study of the tradeoff between the number of responses and the quality of detector output in terms of
    • geometric repeatability (upper row);
    • “matchability”: the percentage of correct correspondences among all matches (lower row).
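The “matchability” measure can be illustrated with a small sketch that uses the known ground-truth homography of a planar scene. It is a simplification under stated assumptions: the standard protocol of [2] judges correspondences by region overlap, whereas here a match counts as correct when the projected point falls within a pixel tolerance. The function names and the default `tol` value are hypothetical.

```python
import numpy as np

def project(H, pts):
    """Apply a 3x3 homography H to an (N, 2) array of points."""
    homog = np.column_stack([pts, np.ones(len(pts))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def matchability(H, pts1, pts2, tol=2.0):
    """Percentage of tentative matches pts1[i] <-> pts2[i] consistent with H.

    Assumption: a point-distance test with tolerance `tol` (in pixels)
    stands in for the overlap-based criterion of the standard protocol.
    """
    if len(pts1) == 0:
        return 0.0
    err = np.linalg.norm(project(H, pts1) - pts2, axis=1)
    return 100.0 * np.count_nonzero(err <= tol) / len(pts1)

# Toy example: the ground truth is a pure translation by (10, 5).
H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, 5.0],
              [0.0, 0.0, 1.0]])
pts1 = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
pts2 = pts1 + [10.0, 5.0]
pts2[3] += 50.0                      # one wrong match
print(matchability(H, pts1, pts2))   # 75.0
```

The tentative matches passed in come from whatever descriptor and matching strategy is used, so the resulting percentage reflects that procedure as well as the detector.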
Note that “matchability” depends not only on the detector but also on the matching procedure.

References

[1] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. In BMVC, 2002.
[2] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 65(1-2):43-72, 2005.
[3] Š. Obdržálek and J. Matas. Object recognition using local affine frames on distinguished regions. In BMVC, 2002.
[4] C. Rothwell, A. Zisserman, D. Forsyth, and J. Mundy. Canonical frames for planar object recognition. In ECCV, LNCS 588. Springer, 1992.

Acknowledgements

The authors were supported by the Grant Agency of the Czech Technical University Prague, project CTU 0706313, and by the Czech Science Foundation, project 201/06/1821. Images and figure sources were kindly provided by Krystian Mikolajczyk.