This paper explores the intersection of traditional object recognition methods and the potential for learning rigid geometric models from 2D image evidence, also known as iconic models. We analyze how visual mechanisms from biology, such as foveated vision and visual attention, inform the development of our model architecture. Our approach involves learning primitive object models and their relationships through a systematic algorithm, allowing us to infer structured object models from scene sequences. Support for this research is provided by CNPq Brazil and DSC/COPIN/UFPB.
Structural Learning from Iconic Representations Herman M. Gomes* hmg@dsc.ufpb.br Robert B. Fisher rbf@dai.ed.ac.uk Institute of Perception, Action and Behaviour Division of Informatics Edinburgh University * Supported by CNPq Brazil and DSC/COPIN/UFPB.
Overview • Introduction • Learning primitive object models • Learning model relationships • Case study • Conclusions and future work SBIA/IBERAMIA 2000, Atibaia SP
Introduction • Traditional object recognition research • Geometric, symbolic or structure based recognition: CAD based 2D and 3D vision and 3D object recognition • Property, vector or feature based recognition: specific feature vectors, multiple filtering, global descriptors for shape, texture and colour, amongst others • Iconic or image based recognition: direct use of images, either complying with the traditional sensor architecture or using alternative representations • This work sits at the intersection of the above areas
Introduction • Question: Would it be possible to learn rigid geometric models from 2D image evidence (iconic object models) acquired from a sequence of scenes?
Introduction • Mechanisms from Biology • Foveated vision: a retina-like (log-polar) image representation has useful properties • Visual attention: fixation gives insight into where object features (or components) are likely to be found • Primal sketch: provides more compact representations for image data and cues for an attention mechanism
Introduction • System’s architecture [Figure: block diagram of the system. Block labels: Generic scenes, Foveate image, Extract primal sketch planes, Feature planes, Cluster objects, Primitive models, Model relationships, Model base, Update attention, Attention map]
Introduction • Image representation • Gaussian receptive field function • Local contrast normalisation for estimating original reflectance information • Primal sketch features (edges, bars, blobs and ends) learned and extracted using a neural network approach • Log-polar
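The log-polar representation listed on this slide can be illustrated with a small index mapping. This is only a sketch under assumptions not stated in the slides: the function name `log_polar_coords`, the ring and sector counts, and the radius limits are hypothetical, and the Gaussian receptive fields are left out entirely.

```python
import math

def log_polar_coords(x, y, cx, cy, n_rings, n_sectors, r_min, r_max):
    """Map an image point (x, y) to (ring, sector) indices around the fixation (cx, cy).

    Returns None for points outside the annulus [r_min, r_max).
    All parameters are illustrative assumptions, not values from the slides.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r < r_min or r >= r_max:
        return None
    # Ring index grows with the logarithm of the distance to the fixation point.
    ring = int(n_rings * math.log(r / r_min) / math.log(r_max / r_min))
    # Sector index grows linearly with the polar angle.
    theta = math.atan2(dy, dx) % (2 * math.pi)
    sector = int(n_sectors * theta / (2 * math.pi)) % n_sectors
    return ring, sector

base = log_polar_coords(3, 1, 0, 0, 8, 16, 1.0, 16.0)
scaled = log_polar_coords(6, 2, 0, 0, 8, 16, 1.0, 16.0)    # same point, scaled by 2
rotated = log_polar_coords(-1, 3, 0, 0, 8, 16, 1.0, 16.0)  # same point, rotated by 90 degrees
```

In this sketch, scaling about the fixation point changes only the ring index and rotating changes only the sector index, which is one reason such a mapping is convenient when generating scaled and rotated variations of a feature.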
Introduction • Retinal and log-polar images
Introduction • Some of the extracted features
Learning primitive object models • Definition • Primitive iconic model: a set of regions or object instances that are similar to each other, organised into a geometric model and specified by the relative scales, orientations, positions and similarity scores of each pair of image regions
Learning primitive object models
[Figure: scene with image regions labelled 1 to 13]
Model X: {1, 2, 3, 4, 5, 6, 9, 10, 11}
Model Y: {7, 8, 12, 13}
Model Y, relative scales and orientations, as (scale, orientation in degrees):
Y    | 7        | 8          | 12         | 13
7    | (1.0, 0) | (1.0, 180) | (0.8, 270) | (0.8, 90)
8    |          | (1.0, 0)   | (0.8, 90)  | (0.8, 270)
12   |          |            | (1.0, 0)   | (1.0, 180)
13   |          |            |            | (1.0, 0)
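The Model Y entries can be checked for internal consistency with a short sketch. It assumes, which the slides do not state, that a relative scale is a ratio of scales and a relative orientation is a difference of orientations modulo 360 degrees. The absolute poses in `poses` are hypothetical values chosen so that region 7 is the reference.

```python
def relative_pose(pose_i, pose_j):
    """Return (relative scale, relative orientation in degrees) of j with respect to i.

    Assumed definitions: scale ratio and orientation difference modulo 360.
    """
    (s_i, o_i), (s_j, o_j) = pose_i, pose_j
    return round(s_j / s_i, 3), (o_j - o_i) % 360

# Hypothetical absolute (scale, orientation) values, with region 7 as reference.
poses = {7: (1.0, 0), 8: (1.0, 180), 12: (0.8, 270), 13: (0.8, 90)}

# Upper triangle of the pairwise table, keyed by (row region, column region).
table = {(i, j): relative_pose(poses[i], poses[j])
         for i in poses for j in poses if i <= j}
```

Under these assumptions the computed upper triangle agrees with the values listed for Model Y, for example (0.8, 90) for regions 8 and 12 and (1.0, 180) for regions 12 and 13.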
Algorithm
for each scene image S_i do
    for each foveation point P_ij on the scene do
        obtain the object feature f_ij at the position P_ij
        if the model base is empty then
            create a new model and store f_ij in it
        else
            generate a set of scaled and rotated variations of f_ij
            find the model F_t that gives the highest similarity score C_max between its internal object features and one of the variations
            if C_max > threshold then
                store f_ij in F_t
                for all f_kl in F_t, store the similarity scores Sm(f_ij, f_kl), the relative scales rS(f_ij, f_kl) and the relative orientations rO(f_ij, f_kl)
            else
                create a new model and store f_ij in it
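A minimal Python sketch of the loop above, written under assumptions: features are opaque values, and the `similarity` and `variations` callables are supplied by the caller. The toy feature encoding `(kind, scale, angle)`, `toy_variations` and `toy_similarity` are hypothetical stand-ins for the iconic features and similarity function of the actual system.

```python
def learn_primitive_models(scenes, similarity, variations, threshold):
    """Cluster features into primitive models; scenes is a list of feature lists."""
    models = []  # each model: {"features": [...], "relations": {(new, old): (Sm, rS, rO)}}
    for scene in scenes:
        for f in scene:
            variants = list(variations(f))  # (scale, angle, transformed feature)
            best_model, c_max = None, float("-inf")
            for model in models:
                for g in model["features"]:
                    for _, _, v in variants:
                        score = similarity(v, g)
                        if score > c_max:
                            best_model, c_max = model, score
            if best_model is not None and c_max > threshold:
                new_idx = len(best_model["features"])
                for old_idx, g in enumerate(best_model["features"]):
                    # The best-matching variation supplies Sm, rS and rO for this pair.
                    best_model["relations"][(new_idx, old_idx)] = max(
                        (similarity(v, g), s, a) for s, a, v in variants)
                best_model["features"].append(f)
            else:
                models.append({"features": [f], "relations": {}})
    return models

def toy_variations(f):
    """Yield (scale, angle, variant) for a toy feature (kind, scale, angle)."""
    kind, scale, angle = f
    for s in (0.8, 1.0, 1.25):
        for a in (0, 90, 180, 270):
            yield s, a, (kind, round(scale * s, 2), (angle + a) % 360)

def toy_similarity(u, v):
    """Toy score: 1.0 for identical features, 0.0 otherwise."""
    return 1.0 if u == v else 0.0

scenes = [[("corner", 1.0, 0), ("blob", 1.0, 0)], [("corner", 0.8, 90)]]
models = learn_primitive_models(scenes, toy_similarity, toy_variations, threshold=0.5)
```

With this toy input the two corner instances end up in one model, linked by the scale and rotation that maps one onto the other, while the blob starts a model of its own. The empty model base needs no special case here because an empty list of models simply yields no candidate.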
Learning model relationships • Initial assumptions • The recognition of consistent geometric relations allows the inference of larger structured object models • At the moment we consider only 2D rigid body transformations
Learning model relationships • Overview of the solution • A graph is built from the output of the previous algorithm • Vertices represent instances of an image neighbourhood found in the scenes • Edges represent a relationship between two neighbourhoods • Intra- and inter-model relationships are inferred by means of the cliques found in the graph
Learning model relationships • Vertices • How can object feature instances of the same type that are structurally linked together be distinguished from those that are not? • How can correspondence be established between sets of feature instances that appear at the same position in all objects? • We build hypotheses at the vertices by taking the full combinatorial set of object feature instances and applying an evaluation function that tells how good a hypothesis is
Learning model relationships • Vertices • Defined as the Cartesian product of the sets f_i of object feature instances of the same model class found in each image i of the sequence; alternatively, for a sequence of N images: V = f_1 × f_2 × ... × f_N • Missing features: a wild-card ‘*’ is added to each f_i before computing the Cartesian product
Learning model relationships • Vertices • Ranking: a function of the similarity between a vertex's internal object instances • Pruning: allow at most K wild-cards ‘*’ per vertex during the vertex creation process (K << N) • We define: Sm(*, *) = Sm(*, f_j) = Sm(f_i, *) = 1
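Vertex creation with wild-cards and pruning might be sketched as follows. The function names are hypothetical, and the mean pairwise similarity used in `vertex_score` is an assumed stand-in, since the slides describe the ranking only as a function of the similarity between a vertex's internal instances.

```python
from itertools import combinations, product

WILDCARD = "*"

def make_vertices(instances_per_image, max_wildcards):
    """Cartesian product of per-image instance sets, each extended with a wild-card.

    Vertices holding more than max_wildcards wild-cards are discarded (the K limit).
    """
    extended = [list(instances) + [WILDCARD] for instances in instances_per_image]
    return [v for v in product(*extended) if v.count(WILDCARD) <= max_wildcards]

def vertex_score(vertex, similarity):
    """Assumed ranking: mean pairwise similarity, with any pair involving '*' scoring 1."""
    pairs = list(combinations(vertex, 2))
    total = sum(1.0 if WILDCARD in (x, y) else similarity(x, y) for x, y in pairs)
    return total / len(pairs)

# Three images holding 2, 1 and 2 instances of the same model class.
vertices = make_vertices([["a1", "a2"], ["b1"], ["c1", "c2"]], max_wildcards=1)
score = vertex_score(("a1", "*", "c1"), lambda x, y: 0.5)
```

Because every pair involving a wild-card scores 1, vertices with many wild-cards would rank artificially high under this assumed score, which is a plausible reason for keeping K small.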
Learning model relationships • Edges • An edge e(a, b) relates two compatible vertices • Two vertices a and b are defined as compatible if:
1. For each pair of feature instances (a_i, a_j) in a, which are related by a given scale and orientation R = (rS(a_i, a_j), rO(a_i, a_j)), there exists another pair (b_i, b_j) in b whose components are related through a similar relative scale and orientation; and
2. Each pair of feature instance co-ordinates (P_ai, P_bi) and (P_aj, P_bj) taken from the same vertex positions roughly defines the same vector angle A and length D, Q = (A, D), when taking into account the features' relative scales and orientations.
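The two compatibility conditions can be sketched as a predicate. Everything concrete here is an assumption made for illustration: instances are dictionaries with position `x`, `y`, scale `s` and orientation `o` in degrees, the tolerances are arbitrary, wild-cards are not handled, and the vector between two instances is normalised by the first instance's scale and orientation.

```python
import math
from itertools import combinations

def rel(p, q):
    """(rS, rO): scale ratio and orientation difference (degrees) of q with respect to p."""
    return q["s"] / p["s"], (q["o"] - p["o"]) % 360

def angle_diff(u, v):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(u - v) % 360
    return min(d, 360 - d)

def link(p, q):
    """Angle A and length D of the vector p -> q, expressed in p's own frame."""
    dx, dy = q["x"] - p["x"], q["y"] - p["y"]
    angle = (math.degrees(math.atan2(dy, dx)) - p["o"]) % 360
    return angle, math.hypot(dx, dy) / p["s"]

def compatible(a, b, tol_s=0.1, tol_o=10.0, tol_a=10.0, tol_d=0.1):
    """Check conditions 1 and 2 for vertices a and b (tuples of instances, one per scene)."""
    for i, j in combinations(range(len(a)), 2):
        (sa, oa), (sb, ob) = rel(a[i], a[j]), rel(b[i], b[j])
        if abs(sa - sb) > tol_s * sa or angle_diff(oa, ob) > tol_o:
            return False  # condition 1 fails: R_ij differs between a and b
        (ai, di), (aj, dj) = link(a[i], b[i]), link(a[j], b[j])
        if angle_diff(ai, aj) > tol_a or abs(di - dj) > tol_d * max(di, dj):
            return False  # condition 2 fails: Q differs between scenes i and j
    return True

# Scene 2 shows the same two-part object rotated by 90 degrees and scaled by 2.
a = ({"x": 0, "y": 0, "s": 1.0, "o": 0}, {"x": 5, "y": 5, "s": 2.0, "o": 90})
b = ({"x": 10, "y": 0, "s": 1.0, "o": 0}, {"x": 5, "y": 25, "s": 2.0, "o": 90})
b_wrong = (b[0], {"x": 25, "y": 5, "s": 2.0, "o": 90})
```

Here `compatible(a, b)` holds because the second scene is a rigid 2D transformation of the first, while `b_wrong` places the second part in a direction that the rotation does not explain.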
Learning model relationships • Understanding edge creation
a = (a_1, a_2, ..., a_N)
b = (b_1, b_2, ..., b_N)
R_ij^a = (rS(a_i, a_j), rO(a_i, a_j))
R_ij^b = (rS(b_i, b_j), rO(b_i, b_j))
Q_i^ab = (A(P_ai, P_bi), D(P_ai, P_bi))
Q_j^ab = (A(P_aj, P_bj), D(P_aj, P_bj))
[Figure: feature instances a_i, a_j, ..., a_N and b_i, b_j, ..., b_N in scenes i, j, ..., N, annotated with R_ij^a, R_ij^b, Q_i^ab and Q_j^ab]
Learning model relationships • Edge ranking [Figure: feature instances a_i, a_j, ..., a_N and b_i, b_j, ..., b_N in scenes i, j, ..., N, annotated with (rS, rO) and (A, D)]
Learning model relationships • Comparing relative scales [Figure: rS(a_i, a_j) and rS(b_i, b_j) across scenes i, j, ..., N]
Learning model relationships • Comparing relative orientations [Figure: rO(a_i, a_j) and rO(b_i, b_j) across scenes i, j, ..., N]
Learning model relationships • Comparing vector angles [Figure: A(a_i, b_i), D(a_i, b_i) and A(a_j, b_j), D(a_j, b_j) across scenes i, j, ..., N]
Learning model relationships • Comparing vector lengths
Learning model relationships • Edge pruning • Eliminate edges that link pairs of vertices with at least one common instance at the same position in the vertex list • Discard edges below a given threshold
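The two pruning rules could look like this in code. The edge container (a dictionary from vertex pairs to scores) and the function name are hypothetical, and exempting the wild-card from the shared-instance test is an assumption that the slides do not confirm.

```python
def prune_edges(edges, threshold):
    """Apply both pruning rules to edges given as {(a, b): score}.

    Vertices a and b are tuples of instance identifiers, one entry per scene.
    """
    kept = {}
    for (a, b), score in edges.items():
        # Rule 1: drop edges whose vertices share an instance at the same position.
        # Assumption: a shared wild-card '*' does not count as a shared instance.
        shares_instance = any(x == y and x != "*" for x, y in zip(a, b))
        # Rule 2: drop edges whose score is below the threshold.
        if not shares_instance and score >= threshold:
            kept[(a, b)] = score
    return kept

edges = {
    (("a1", "b1"), ("a2", "b2")): 0.9,  # kept
    (("a1", "b1"), ("a1", "b2")): 0.9,  # shares a1 at position 0
    (("a1", "b1"), ("a2", "b3")): 0.2,  # below threshold
    (("a1", "*"), ("a2", "*")): 0.8,    # shared wild-card only, kept by assumption
}
kept = prune_edges(edges, threshold=0.5)
```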
Learning model relationships • Cliques • Sets of vertices that are maximally connected (every pair of vertices in the set is linked by an edge) • A standard algorithm is used to find cliques in the graph G = (V, E) • Ranking • Thresholding
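The slides do not name the clique algorithm. As one possible choice, the sketch below uses the basic Bron–Kerbosch recursion without pivoting, which enumerates all maximal cliques; the adjacency-dictionary input format is an assumption.

```python
def maximal_cliques(adjacency):
    """Yield every maximal clique of an undirected graph {vertex: set of neighbours}.

    Basic Bron-Kerbosch recursion (no pivoting), chosen here for illustration only.
    """
    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored vertices.
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adjacency[v], x & adjacency[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adjacency), set())

# Small example graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
cliques = set(maximal_cliques(graph))
```

The resulting cliques could then be ranked and thresholded, as listed on the slide, using whatever score the edges and vertices carry.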
Case study • Initial setup • Scenes created from synthetically generated variations of the same set of real input images • Interest points selected manually • Similarity function defined as the cross-correlation of grey level and primal sketch planes
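A similarity function of this kind might be sketched as below. The slides say only that cross-correlation of grey level and primal sketch planes is used; the normalisation, the averaging over planes, the flat-list patch format and the function names are all assumptions.

```python
import math

def ncc(p, q):
    """Normalised cross-correlation of two equally sized patches given as flat lists."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    num = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p) *
                    sum((b - mean_q) ** 2 for b in q))
    return num / den if den else 0.0  # constant patches carry no correlation signal

def plane_similarity(planes_f, planes_g):
    """Assumed combination rule: mean NCC over corresponding planes of two features."""
    scores = [ncc(p, q) for p, q in zip(planes_f, planes_g)]
    return sum(scores) / len(scores)

# Two toy features, each with a grey-level plane and one primal sketch plane.
f = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
g = [[2.0, 4.0, 6.0, 8.0], [0.0, 1.0, 0.0, 1.0]]
same = plane_similarity(f, g)
```

Normalising makes the score insensitive to linear changes in contrast and brightness within a plane, which is why the doubled grey-level plane in `g` still matches `f` perfectly in this toy case.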
Primitive object feature models found
Cliques found
Conclusions and future work • We provided some evidence that structured models can be learned in the context of an iconic vision system by using a graph-based representation and algorithm • The relative positions of object features are recorded and, by using the features’ relative scales and orientations, possible relationships can be inferred
Conclusions and future work • There are still some issues that require further research: • Deciding on the best ranking functions and how they affect the final results • Learning primitive object models is computationally expensive • Vertex creation is exponential in the number of images • More complex case studies are currently being designed