Object Recognition in the Dynamic Link Architecture

Yang Ran, CMPS 828J

Outline • Background and introduction • System overview • The general algorithm in detail • Implementations of the algorithm • Experimental results • Further reading and conclusion

Background • Problem: To recognize human faces from single images out of a large gallery. • Challenges: Distortions in position, size, expression, and pose • Existing methods: • Appearance-based vs. shape-based • 2D vs. 3D

Background: Notation • Image: face image • Model: face gallery • Graph: a concise face description • Jet: a local description of the gray-value distribution based on the Gabor transform

System Overview • Faces are represented as rectangular graphs by layers of neurons • Each neuron represents a node and has a jet attached

Assumptions • The image domain and the model domain are bi-directionally connected by dynamic links. • These connections are plastic on a fast time scale, changing radically during a single recognition event • The strength of a connection between any two nodes in the image and a model is controlled by the jet similarity between them, which roughly corresponds to the number of features that are common to the two nodes

Key Factors • The basic representation is a labeled graph of edges and vertices, with wavelet responses bundled in jets • Edge labels: distance information • Vertex/node labels: wavelet responses • The graph should be able to deform to adapt to the variations of human faces

Preprocessing by Gabor Wavelets • Gabor wavelets are biologically motivated convolution kernels in the shape of plane waves restricted by a Gaussian envelope function
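A minimal sketch of such a kernel, assuming the form given in Lades et al. (first entry under References): a plane wave with wave vector k, multiplied by a Gaussian envelope, with a constant subtracted so that the kernel has no response to uniform brightness. The function name `gabor_kernel` and the default `sigma = 2*pi` are assumptions made for illustration.

```python
import cmath
import math


def gabor_kernel(kx, ky, x, y, sigma=2 * math.pi):
    """Complex value of a Gabor kernel with wave vector (kx, ky) at offset (x, y)."""
    k_sq = kx * kx + ky * ky
    r_sq = x * x + y * y
    # Gaussian envelope whose width scales with the wavelength.
    envelope = (k_sq / sigma ** 2) * math.exp(-k_sq * r_sq / (2 * sigma ** 2))
    # Plane wave minus a constant that removes the DC response.
    wave = cmath.exp(1j * (kx * x + ky * y)) - math.exp(-sigma ** 2 / 2)
    return envelope * wave
```

The real part of the returned value is even in (x, y) and the imaginary part is odd, which is the even/odd pairing that makes Gabor filters a plausible model of cortical cells.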

More on Gabor Wavelets: Why use them? • A good approximation to the sensitivity profiles of neurons found in the visual cortex of higher vertebrates • Cells come in pairs with even and odd symmetry, like the real and imaginary parts of a Gabor filter

Jets Generation • The set of convolution coefficients for kernels of different orientations and frequencies at one image pixel is called a jet • A jet describes a small patch of gray values around a given pixel • Sample W at five logarithmically spaced frequency levels and eight directions, indexed by u, v
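The sampling described above can be sketched as follows. This is an illustration under stated assumptions, not the presenter's code: the half-octave frequency spacing `pi * 2**(-(v + 2) / 2)`, the use of `u` for direction and `v` for frequency level, the kernel form, `sigma = 2*pi`, and the truncation window `radius` are all assumed; the image is a plain list of rows of gray values.

```python
import cmath
import math


def wave_vectors(n_freq=5, n_dir=8):
    """Return n_freq * n_dir wave vectors: log-spaced magnitudes, evenly spaced directions."""
    vectors = []
    for v in range(n_freq):
        k = math.pi * 2 ** (-(v + 2) / 2)  # assumed half-octave spacing
        for u in range(n_dir):
            phi = u * math.pi / n_dir
            vectors.append((k * math.cos(phi), k * math.sin(phi)))
    return vectors


def gabor(kx, ky, x, y, sigma=2 * math.pi):
    """Gabor kernel value (assumed form: Gaussian envelope times DC-free plane wave)."""
    k_sq = kx * kx + ky * ky
    envelope = (k_sq / sigma ** 2) * math.exp(-k_sq * (x * x + y * y) / (2 * sigma ** 2))
    return envelope * (cmath.exp(1j * (kx * x + ky * y)) - math.exp(-sigma ** 2 / 2))


def jet_at(image, px, py, radius=8):
    """Magnitudes of the Gabor responses of `image` at pixel (px, py)."""
    height, width = len(image), len(image[0])
    jet = []
    for kx, ky in wave_vectors():
        acc = 0j
        # Convolution at one pixel, truncated to a square window.
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                y, x = py + dy, px + dx
                if 0 <= y < height and 0 <= x < width:
                    acc += image[y][x] * gabor(kx, ky, -dx, -dy)
        jet.append(abs(acc))
    return jet
```

With the defaults, `jet_at` returns a list of 40 non-negative magnitudes, one per kernel. A practical implementation would compute all responses at once with FFT-based convolution rather than looping per pixel.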

Jets Generation (cont'd) • The magnitudes of (WI)(k_uv, x) form a feature vector located at x, which will be referred to as a jet • Evaluate the similarity by Elastic Graph Matching:
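One way to compare two jets, assuming the normalized dot product used as the vertex similarity in Lades et al. (first entry under References), is sketched here. The function name `jet_similarity` is hypothetical, and jets are plain lists of magnitudes.

```python
import math


def jet_similarity(jet_a, jet_b):
    """Normalized dot product of two jets; 1.0 means identical up to scale."""
    dot = sum(a * b for a, b in zip(jet_a, jet_b))
    norm = math.sqrt(sum(a * a for a in jet_a)) * math.sqrt(sum(b * b for b in jet_b))
    # Guard against an all-zero jet (for example, a blank image region).
    return dot / norm if norm > 0 else 0.0
```

Because of the normalization, scaling all magnitudes of one jet by a common factor leaves the similarity unchanged, which gives some tolerance to changes in contrast.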

Edge Labels • Derived from the neuron version, edges encode neighborhood relationships • They represent the topology of the vertices • Define • Quadratic comparison function
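A minimal sketch of edge labels and their quadratic comparison, assuming the definitions in Lades et al. (first entry under References): the label of an edge is the difference vector between its two vertex positions, and two edges are compared by the squared length of the difference of their labels. The function names are hypothetical.

```python
def edge_label(pos_i, pos_j):
    """Edge label: vector from vertex i to vertex j."""
    return (pos_j[0] - pos_i[0], pos_j[1] - pos_i[1])


def edge_cost(label_image, label_model):
    """Quadratic comparison: squared distance between two edge labels."""
    dx = label_image[0] - label_model[0]
    dy = label_image[1] - label_model[1]
    return dx * dx + dy * dy
```

An undistorted edge has cost zero, and the cost grows quadratically as the image graph is stretched or sheared relative to the model graph.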

Example • Graph representation of a face

Elastic Graph Matching Elastic matching of a model graph M to a target graph I amounts to a search for a set of vertex positions which simultaneously optimizes the matching of vertex labels and edge labels according to:
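A sketch of one such cost function, assuming the form used in Lades et al. (first entry under References): lambda times the summed quadratic edge costs, minus the summed jet similarities over all vertices. The function names and the argument layout (lists of positions, lists of jets, and a list of vertex-index pairs for the edges) are hypothetical.

```python
import math


def jet_similarity(jet_a, jet_b):
    """Normalized dot product of two jets (assumed vertex similarity)."""
    dot = sum(a * b for a, b in zip(jet_a, jet_b))
    norm = math.sqrt(sum(a * a for a in jet_a)) * math.sqrt(sum(b * b for b in jet_b))
    return dot / norm if norm > 0 else 0.0


def graph_cost(model_pos, model_jets, image_pos, image_jets, edges, lam):
    """Total cost: lam * (edge distortion) - (sum of vertex similarities)."""
    edge_term = 0.0
    for i, j in edges:
        # Difference between the image edge vector and the model edge vector.
        dx = (image_pos[j][0] - image_pos[i][0]) - (model_pos[j][0] - model_pos[i][0])
        dy = (image_pos[j][1] - image_pos[i][1]) - (model_pos[j][1] - model_pos[i][1])
        edge_term += dx * dx + dy * dy
    vertex_term = sum(jet_similarity(a, b) for a, b in zip(image_jets, model_jets))
    return lam * edge_term - vertex_term
```

The parameter `lam` controls rigidity: a large value penalizes any deformation of the graph, while a small value lets individual vertices move toward their best-matching jets.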

Elastic Graph Matching (cont'd) A heuristic algorithm is used to get close to the optimum within a reasonable time • Step 1: Find the approximate face position so that the image can be scaled and cut to a standard size • Step 2: Extract a graph from the target face image • Step 3: Match with the cost function • Refine position and size with λ = infinity • Local distortion
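The two phases of Step 3 can be sketched as below. This is a simplified illustration, not the published procedure: the rigid phase scans translations only (the size search is omitted), the distortion phase uses a deterministic greedy sweep rather than random vertex moves, and `jet_fn(x, y)`, the `model` dictionary layout, and all function names are assumptions.

```python
import math


def similarity(jet_a, jet_b):
    """Normalized dot product of two jets."""
    dot = sum(a * b for a, b in zip(jet_a, jet_b))
    norm = math.sqrt(sum(a * a for a in jet_a)) * math.sqrt(sum(b * b for b in jet_b))
    return dot / norm if norm > 0 else 0.0


def total_cost(model, pos, jet_fn, lam):
    """lam * edge distortion minus summed vertex similarities."""
    cost = 0.0
    for i, j in model["edges"]:
        dx = (pos[j][0] - pos[i][0]) - (model["pos"][j][0] - model["pos"][i][0])
        dy = (pos[j][1] - pos[i][1]) - (model["pos"][j][1] - model["pos"][i][1])
        cost += lam * (dx * dx + dy * dy)
    for p, jet in zip(pos, model["jets"]):
        cost -= similarity(jet_fn(*p), jet)
    return cost


def rigid_scan(model, jet_fn, offsets):
    """Phase 1 (lambda = infinity): move the undistorted grid, keep the best offset."""
    best_score, best_pos = None, None
    for ox, oy in offsets:
        pos = [(x + ox, y + oy) for x, y in model["pos"]]
        score = sum(similarity(jet_fn(*p), jet) for p, jet in zip(pos, model["jets"]))
        if best_score is None or score > best_score:
            best_score, best_pos = score, pos
    return best_pos


def local_distortion(model, pos, jet_fn, lam, sweeps=10):
    """Phase 2: shift single vertices by one pixel while the total cost decreases."""
    pos = list(pos)
    cost = total_cost(model, pos, jet_fn, lam)
    for _ in range(sweeps):
        improved = False
        for v in range(len(pos)):
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                trial = list(pos)
                trial[v] = (pos[v][0] + dx, pos[v][1] + dy)
                trial_cost = total_cost(model, trial, jet_fn, lam)
                if trial_cost < cost:
                    pos, cost, improved = trial, trial_cost, True
        if not improved:
            break
    return pos, cost
```

Here `jet_fn` stands in for a lookup into precomputed image jets, and `model` holds the gallery graph as `{"pos": [...], "jets": [...], "edges": [...]}`. Running the rigid scan first gives the distortion phase a good starting point, so the greedy sweeps only need to correct small local deformations.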

Experiments • Data Base • Technical Aspects • Results • Conclusions

Data Base As a face database we used galleries of 111 different persons. For most persons there is one neutral frontal view, one frontal view with a different facial expression, and two views rotated in depth by 15 and 30 degrees, respectively.

Technical Aspects • The CPU time needed for the recognition of one face against a gallery of 111 models is approximately 10 to 15 minutes on a Sun SPARCstation 10-512 with a 50 MHz processor.

Results-Office Items

Comparison of Two Galleries

More Results

More Results (cont'd)

Recognition Results Against Galleries Recognition results against a gallery of 20, 50, and 111 neutral frontal views

Conclusion • Close to the natural model: only a small number of examples is needed for face recognition • Gabor wavelet representations are robust to moderate lighting changes, shifts, and deformations • Elastic Graph Matching in the Dynamic Link Architecture is robust in face recognition

Conclusion (cont'd) • Having only a few images per person in the gallery does not provide sufficient information to handle 3D rotation • Rectangular grid vs. feature points

References • M. Lades, J.C. Vorbrüggen, J. Buhmann, J. Lange, C. von der Malsburg, R.P. Würtz, W. Konen. Distortion Invariant Object Recognition in the Dynamic Link Architecture. IEEE Transactions on Computers, 1993, 42(3):300-311. • Laurenz Wiskott, Jean-Marc Fellous, Norbert Krüger, et al. Face Recognition by Elastic Bunch Graph Matching. Proc. 7th Intern. Conf. on Computer Analysis of Images and Patterns, CAIP'97, Kiel.
