Object Recognition Tom McGrath CIS 601
What is object recognition? • Perception of objects is different for humans than for computers. • For humans: perception of familiar items. • For computers: perception of familiar patterns. • Are they really the same thing?
What do we mean by ‘objects’? • What we call object recognition may also be called pattern recognition. • A pattern is an arrangement of descriptors. • Descriptors may take many forms, but they are primarily vectors and strings.
More generally… • Object recognition is the process whereby observers are able to recognize three-dimensional objects despite receiving only two-dimensional input that varies greatly depending on viewing conditions.(2)
2 main approaches • Decision-theoretic • Patterns described using quantitative descriptors. • Structural • Patterns represented by symbolic information. • Strings, for example.
Decision Theoretic • Based on discriminant functions • Let x = (x1, x2, …, xn)T represent an n-dimensional pattern vector • Let w1, w2, …, wW denote W pattern classes.
Basic problem of decision-theoretic • We want to find W decision functions d1(x), d2(x), …, dW(x) with the property: • If a pattern x belongs to class wi, then di(x) > dj(x) for all j = 1, 2, …, W with j ≠ i.
In other words… • We are given a pattern x and a finite set of object classes. • We want to assign x to one of the classes. • To do so, we evaluate every decision function at x and assign x to the class whose function gives the largest value (the best fit).
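As a concrete illustration of these decision functions, the sketch below uses the minimum-distance classifier, one common choice in which each class is represented by its mean vector; the class means and test points are made up for the example, not taken from the lecture.

```python
import numpy as np

def min_distance_decision_functions(class_means):
    """Build one decision function per class using the minimum-distance
    classifier: d_j(x) = x.T @ m_j - 0.5 * m_j.T @ m_j (largest value wins)."""
    def make_d(m):
        return lambda x: x @ m - 0.5 * (m @ m)
    return [make_d(np.asarray(m, dtype=float)) for m in class_means]

def classify(x, decision_functions):
    """Assign pattern vector x to the class whose decision function is largest."""
    x = np.asarray(x, dtype=float)
    scores = [d(x) for d in decision_functions]
    return int(np.argmax(scores))

# Hypothetical example: two 2-D classes with means (1, 1) and (5, 5).
ds = min_distance_decision_functions([(1.0, 1.0), (5.0, 5.0)])
print(classify((1.5, 0.8), ds))   # -> 0
print(classify((4.0, 6.0), ds))   # -> 1
```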
Structural • Represents objects as strings, trees, graphs, etc. • Define descriptors and recognition rules based on these representations.
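To make the structural idea concrete, here is a toy sketch in which boundaries are encoded as 8-direction chain-code strings and similarity is scored by the fraction of matching positions; both the example chain codes and the matching rule are illustrative assumptions, not a specific method from these slides.

```python
# Toy structural example: boundaries encoded as 8-direction chain-code strings.
# The similarity rule (matched positions over the longer string's length) is
# just one simple illustrative choice of recognition rule.

def chain_code_similarity(a: str, b: str) -> float:
    """Fraction of positions where the two chain codes agree."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for i in range(n) if a[i] == b[i])
    return matches / max(len(a), len(b))

observed = "00112233"   # hypothetical chain code of an observed boundary
template = "00122233"   # hypothetical template boundary
print(chain_code_similarity(observed, template))  # -> 0.875
```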
What does finite classification imply? • The idea of a finite set of classes is quite limiting. • It corresponds with industry's use of object recognition, which is very application specific. • It also indicates that computer object recognition techniques lack some abilities that are simple for humans.
Differences in classification • Techniques thus far only classify objects based on their shape, color, texture, etc. These are only representative of the light reflected by an object. • Humans classify objects many ways, including an object’s function.
For example… • We classify a ring of rocks with a fire inside as a fire pit. • We classify a board as a joist once it is installed as support for the floor. • We classify our computer as a paperweight once it is more than five years old.
Correlation • Given an image, we want to find all places in the image that contain a subimage, also called a template. • Very useful for answering “Where is the x in this picture?”
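A minimal sketch of this idea using OpenCV's normalized cross-correlation, assuming OpenCV 4 is installed; the file names scene.png and template.png and the 0.8 threshold are placeholders, not files from the slides.

```python
import cv2
import numpy as np

# Hypothetical file names; any grayscale image and a small sub-image will do.
image    = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)

# Correlation image: each pixel scores how well the template fits at that spot.
corr = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)

# Best single match, plus all locations above a (placeholder) threshold.
_, max_val, _, max_loc = cv2.minMaxLoc(corr)
rows, cols = np.where(corr >= 0.8)
print("best score:", max_val, "at", max_loc, "| strong matches:", len(rows))
```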
Notice… • Recognition models typically rely on input from optical sensors. • Such input is represented entirely in two-dimensional space.
Is a 3D representation necessary? • The DARPA challenge was not successfully completed. • The Army's LADAR sensors, which provide depth data, have demonstrated more capability.
3D object recognition with neural trees • The first stage extracts features from the input range images. • These features are used in the second stage to group image pixels into different surface patches according to the six surface classes proposed by differential geometry.(4)
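The cited neural-tree system is not reproduced here, but as a rough sketch of the feature-extraction stage, the code below estimates mean (H) and Gaussian (K) curvature signs from a range image with NumPy; grouping pixels by the (sign H, sign K) pair is the usual basis for such surface classes, though the exact six-class scheme of the cited paper may differ.

```python
import numpy as np

def hk_signs(z, tol=1e-3):
    """Estimate mean (H) and Gaussian (K) curvature signs for a range image z.
    Pixels can then be grouped into surface patches by their (sign H, sign K) pair."""
    zy, zx   = np.gradient(z.astype(float))   # first derivatives (rows = y, cols = x)
    zyy, zyx = np.gradient(zy)
    zxy, zxx = np.gradient(zx)
    denom = 1.0 + zx**2 + zy**2
    H = ((1 + zx**2) * zyy - 2 * zx * zy * zxy + (1 + zy**2) * zxx) / (2 * denom**1.5)
    K = (zxx * zyy - zxy**2) / denom**2
    # Treat values below a tolerance as zero before taking signs.
    return (np.sign(np.where(np.abs(H) < tol, 0, H)),
            np.sign(np.where(np.abs(K) < tol, 0, K)))

# Hypothetical range image: a smooth bump on a flat background.
y, x = np.mgrid[-1:1:64j, -1:1:64j]
z = np.exp(-(x**2 + y**2) * 5)
sign_H, sign_K = hk_signs(z)
```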
Invariants • Basic idea: D(g(A), g(B)) = D(A, B) for all g in a transformation group G. • Limitation: there are very many possible transformations in G, and computation time becomes a problem.
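One familiar family of invariant descriptors is Hu's moment invariants, which are unchanged under translation, rotation, and scaling. The sketch below uses OpenCV on a synthetic shape to check that D(g(A)) ≈ D(A) for a rotation g; the shape and the 37° angle are arbitrary choices for illustration, not taken from the slides.

```python
import cv2
import numpy as np

# Make a simple binary shape and a rotated copy of it.
img = np.zeros((200, 200), np.uint8)
cv2.rectangle(img, (60, 80), (140, 120), 255, -1)
M = cv2.getRotationMatrix2D((100, 100), 37, 1.0)
rotated = cv2.warpAffine(img, M, (200, 200))

# Hu moment invariants: the descriptor of the rotated shape should be
# nearly identical to that of the original (up to discretization error).
hu_a = cv2.HuMoments(cv2.moments(img)).ravel()
hu_b = cv2.HuMoments(cv2.moments(rotated)).ravel()
print(np.round(hu_a, 6))
print(np.round(hu_b, 6))   # the two rows should be nearly identical
```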
Varying goals of object recognition • Are we looking for “that” object? • Face recognition • Are we looking for “one of those” objects? • Web search for 1987 Chevy pickup.
Notice… • Just because an object exists in an image doesn’t mean it is recognizable. • Example from Late Night with Conan O’Brien
Histogram approach… • Very bad results for images with: • Much noise • Small target objects • With tightly controlled conditions, moderate success can be achieved.
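A minimal sketch of histogram-based matching with OpenCV, assuming hypothetical file names target.png and scene.png; because the descriptor is a single global color histogram, it is easy to see why heavy noise or a small target swamps the signal.

```python
import cv2

# Hypothetical file names; the comparison is between global color histograms.
target = cv2.imread("target.png")
scene  = cv2.imread("scene.png")

def rgb_hist(img, bins=8):
    """Normalized 3-D color histogram flattened to a feature vector."""
    h = cv2.calcHist([img], [0, 1, 2], None, [bins] * 3, [0, 256] * 3)
    return cv2.normalize(h, h).flatten()

score = cv2.compareHist(rgb_hist(target), rgb_hist(scene), cv2.HISTCMP_CORREL)
print("histogram correlation:", score)   # near 1.0 means similar color content
```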
Correlation example • Find the flower
Create template • Actual template size: 32×32
Acquire input image • Actual image size: 1600×1200
Compute correlation image • Actual image size: 1600×1200
Show areas of best match • Actual image size: 1600×1200
Find flower with more noise • Source image: 1600×1200
Templates for a coin • Acquire a template:
Acquire target image • Actual size: 1600×1200
Structural approach to stapler • Acquire source stapler image
Compute the boundary • Image recreated from computed boundary:
Select boundary points • Boundary points at distance of 8:
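A sketch of how the boundary and the subsampled boundary points might be computed, assuming OpenCV 4 and a hypothetical stapler.png on a plain background; the exact procedure used for these slides is not stated.

```python
import cv2

# Hypothetical input image of the stapler on a plain background.
gray = cv2.imread("stapler.png", cv2.IMREAD_GRAYSCALE)

# Threshold and take the largest external contour as the object boundary.
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
boundary = max(contours, key=cv2.contourArea).reshape(-1, 2)

# "Boundary points at a distance of 8": keep every 8th boundary pixel.
points = boundary[::8]
print(len(boundary), "boundary pixels ->", len(points), "sampled points")
```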