Object Recognition by Discriminative Combinations of Line Segments, Ellipses and Appearance Features Professor: S. J. Wang Student: Y. S. Wang
Outline • Background • System Overview • Shape-Token • Codebook of Shape-Tokens • Code-Word Combination • Hybrid Detector • Experimental Result • Conclusion
Background • Contour-Based Detection Methods • Problems with contour fragments: • Large storage requirement for training. • Slow matching speed. • Not scale invariant. • The proposed solution is the Shape-Token.
Shape Token • What is a Shape-Token? • Constructing Shape-Tokens • Describing Shape-Tokens • Matching Shape-Tokens
What is a Shape-Token? • A combination of a line and an ellipse is used to represent a contour fragment: a line for a straight part, an ellipse for a curve. • Example: • Why shape-tokens? A few parameters are enough to describe a contour fragment.
Constructing Shape-Tokens • Extract shape primitives (line segments and ellipses) using [16], [17]. • Pair a reference primitive with its neighboring primitives. • Different-type combination: take the ellipse as the reference. • Same-type combination: consider each primitive as the reference in turn. • Three types of Shape-Tokens: Line-Line, Ellipse-Line, Ellipse-Ellipse.
Constructing Shape-Tokens • Line-Line: combine with a neighboring line that has any point falling inside a trapezium search area. • Ellipse-Line & Ellipse-Ellipse: circular search area; consider primitives that have any point within the search area and are weakly connected to the reference ellipse.
Describing Shape-Tokens • Descriptor components: • Orientation of each primitive. • Unit vector from the center of the reference primitive to the center of its neighbor. • Distance between the centers of the primitives. • Length and width of each primitive.
Matching Shape-Tokens • Dissimilarity Measure (Shape Distance)
Matching Shape-Tokens • To generalize to matching at multiple scales, the descriptor is normalized against the object scale.
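The slides do not give the exact descriptor symbols or the shape-distance formula, so the following is only a minimal sketch: hypothetical field names for the descriptor components listed above, scale normalization of the length-like fields, and a toy weighted-difference dissimilarity. All names and weights are assumptions, not the paper's definitions.

```python
import math

# Hypothetical shape-token descriptor fields (the paper's symbols differ):
# theta1, theta2 : orientations of the two primitives (radians)
# ux, uy         : unit vector from the reference center to the neighbor center
# d              : distance between primitive centers
# l1, w1, l2, w2 : length and width of each primitive
def make_descriptor(theta1, theta2, ux, uy, d, l1, w1, l2, w2):
    return {"theta1": theta1, "theta2": theta2, "ux": ux, "uy": uy,
            "d": d, "l1": l1, "w1": w1, "l2": l2, "w2": w2}

def normalize(desc, object_scale):
    """Divide the length-like entries by the object scale (the slide's
    normalization for multi-scale matching)."""
    out = dict(desc)
    for key in ("d", "l1", "w1", "l2", "w2"):
        out[key] = desc[key] / object_scale
    return out

def shape_distance(a, b, weights=None):
    """Toy dissimilarity: weighted sum of absolute field differences.
    The paper's actual shape distance D(.) may be defined differently."""
    weights = weights or {k: 1.0 for k in a}
    ang = lambda x, y: min(abs(x - y), 2 * math.pi - abs(x - y))  # wrap angles
    dist = weights["theta1"] * ang(a["theta1"], b["theta1"])
    dist += weights["theta2"] * ang(a["theta2"], b["theta2"])
    for k in ("ux", "uy", "d", "l1", "w1", "l2", "w2"):
        dist += weights[k] * abs(a[k] - b[k])
    return dist
```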
Codebook of Shape-Tokens • Extracting Shape-Tokens inside bounding boxes of training images. • Producing Code-words • Clustering by Shape • Clustering by Relative Positions • Selecting representative code-words into codebook for specific target object.
K-Medoid Method • Similar to the k-means method. • Procedure: • Randomly select k of the n data points as medoids. • Associate each data point with the closest medoid. • For each medoid m and each non-medoid data point o: swap m and o and compute the total cost of the configuration. • Select the configuration with the lowest cost. • Repeat the steps above until there is no change in the medoids.
K-Medoid Method • First two steps
K-Medoid Method • Third and fourth steps
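A minimal, unoptimized sketch of the swap-based procedure above; when clustering shape-tokens, `dist` would be the shape distance D(.). This is a generic k-medoid loop, not the authors' implementation.

```python
import random

def k_medoids(points, k, dist, max_iter=100, seed=0):
    """Pick k medoids at random, assign points to the nearest medoid, then try
    swapping each medoid with each non-medoid point and keep the cheapest
    configuration, repeating until the medoids stop changing."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)

    def total_cost(meds):
        return sum(min(dist(points[i], points[m]) for m in meds)
                   for i in range(len(points)))

    for _ in range(max_iter):
        best_cost, best_meds = total_cost(medoids), list(medoids)
        for mi in range(len(medoids)):
            for o in range(len(points)):
                if o in medoids:
                    continue
                trial = list(medoids)
                trial[mi] = o                      # swap medoid with point o
                cost = total_cost(trial)
                if cost < best_cost:
                    best_cost, best_meds = cost, trial
        if best_meds == medoids:                   # no change: converged
            break
        medoids = best_meds

    # final assignment of each point to its closest medoid
    assign = [min(medoids, key=lambda m: dist(p, points[m])) for p in points]
    return medoids, assign
```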
Clustering by Shape • Method: • Use the k-medoid method to cluster the shape-tokens of each type separately. • Repeat the step above until the dissimilarity value of every cluster is lower than a specific threshold. • Metric: • Dissimilarity value: average shape distance between the medoid and its members. • Threshold: 20% of the maximum of D(.).
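The slide does not spell out the exact control flow of the repeated clustering, so the sketch below is one plausible reading: clusters whose dissimilarity value exceeds the threshold are split again with k-medoid. The `k_medoids` argument can be the sketch above; all names are illustrative.

```python
def cluster_by_shape(tokens, dist, k_medoids, max_D, k=2):
    """Split clusters until every cluster's dissimilarity value (mean shape
    distance from the medoid to its members) is below 20% of max_D."""
    threshold = 0.2 * max_D
    queue, finished = [list(range(len(tokens)))], []
    while queue:
        idx = queue.pop()
        if len(idx) <= k:                      # too small to split further
            finished.append(idx)
            continue
        medoids, assign = k_medoids([tokens[i] for i in idx], k, dist)
        for m in set(assign):
            members = [idx[j] for j, a in enumerate(assign) if a == m]
            mean_d = sum(dist(tokens[i], tokens[idx[m]]) for i in members) / len(members)
            # stop if tight enough, trivially small, or the split made no progress
            if mean_d <= threshold or len(members) <= 1 or len(members) == len(idx):
                finished.append(members)
            else:
                queue.append(members)
    return finished                            # lists of token indices per cluster
```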
Clustering by relative positions • Target: partition the clusters obtained from the previous step into sub-clusters whose members have similar shape and similar position relative to the object centroid. • Relative position: the vector directed from the object centroid to the shape-token centroid. • Method: Mean-Shift.
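The sub-clustering step can be sketched with scikit-learn's MeanShift applied to the 2-D relative-position vectors; the bandwidth and the toy vectors below are assumptions, not values from the paper.

```python
import numpy as np
from sklearn.cluster import MeanShift

# One 2-D vector per shape-token in the cluster, pointing from the object
# centroid to the shape-token centroid (assumed normalized against object scale).
relative_positions = np.array([
    [0.10, -0.32], [0.12, -0.30], [0.11, -0.31],   # e.g. tokens near the head
    [-0.35, 0.20], [-0.33, 0.22],                   # e.g. tokens near a leg
])

ms = MeanShift(bandwidth=0.1)                        # bandwidth is a guess
labels = ms.fit_predict(relative_positions)          # sub-cluster id per token
print(labels, ms.cluster_centers_)                   # sub-clusters and centers
```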
Candidate Code-Word • The medoid of each sub-cluster. • Parameters: • Shape distance threshold: mean shape distance of the cluster plus one standard deviation. • Relative position center: mean of the relative-position vectors of the sub-cluster's members. • Radius: Euclidean distance between the relative-position center and each member's relative-position vector, plus one standard deviation.
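A small sketch of how these three parameters could be computed from one sub-cluster. The radius bullet is ambiguous about how per-member distances are aggregated, so the sketch takes their mean plus one standard deviation; function and variable names are illustrative, not the paper's.

```python
import numpy as np

def codeword_parameters(member_descs, member_positions, medoid_desc, shape_distance):
    """Derive (shape distance threshold, relative position center, radius)
    from a sub-cluster's member descriptors and relative-position vectors."""
    # Shape distance threshold: mean shape distance to the medoid + one std.
    d = np.array([shape_distance(medoid_desc, m) for m in member_descs])
    shape_threshold = d.mean() + d.std()

    # Relative position center: mean of the members' relative-position vectors.
    pos = np.asarray(member_positions, dtype=float)
    center = pos.mean(axis=0)

    # Radius: mean distance from the center to each member's vector + one std.
    r = np.linalg.norm(pos - center, axis=1)
    radius = r.mean() + r.std()
    return shape_threshold, center, radius
```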
Candidate Code-Words • Example: the Weizmann Horse dataset.
Selecting Candidates into Codebook • Intuition: cluster size. • Problem: many of the candidates selected that way belong to background clutter. • What kind of candidates do we prefer? • Distinctive shape. • Flexible enough to accommodate intra-class variations. • Precise location of its members.
Selecting Candidates into Codebook • Instead of using the cluster size directly, the author scores each candidate by a product of three values: • An intra-cluster shape similarity value, computed relative to the maximum of the range of shape distances for the type of candidate currently considered. • The number of unique training bounding boxes its members are extracted from. • Its value of .
Selecting Candidates into Codebook • One problem remains: • If the score is used to choose candidates directly, the selected code-words may have a poor spatial distribution over the object. • Solution: Radial Ranking Method.
Selecting Candidates into Codebook • Example: the Weizmann Horse dataset.
Code-Word Combination • Why code-word combinations? • A single code-word matched in the test image can be used to predict the object location, but it is less discriminative and easily matched in background clutter. • A combination of several code-words is far more discriminative.
Code-Word Combination • Matching a code-word combination: how a combination is matched in an image. • Finding all matched code-word combinations in training images: an exhaustive set of code-word combinations. • Learning discriminative xCCs (x-code-word combinations).
Matching a Code-Word Combination • Criteria: • Shape constraint: the shape distance between each code-word and some shape-token in the image must be less than that code-word's shape distance threshold. • Geometric constraint: the object-centroid predictions made by all code-words in the combination must concur.
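A hedged sketch of the two criteria. The dictionary fields, the use of a single tolerance `tol`, and the way centroid votes are compared are all assumptions; the paper may instead use each code-word's radius when checking agreement.

```python
import numpy as np

def match_combination(codewords, shape_tokens, shape_distance, tol=0.05):
    """Check whether a code-word combination is matched in an image.
    Each code-word dict carries a 'descriptor', a 'threshold' and a
    'relative_position' vector (object centroid -> token centroid);
    each shape-token dict carries a 'descriptor' and a 'centroid' (np arrays)."""
    predicted_centroids = []
    for cw in codewords:
        # Shape constraint: some shape-token must lie within the threshold.
        best = min(shape_tokens,
                   key=lambda t: shape_distance(cw["descriptor"], t["descriptor"]))
        if shape_distance(cw["descriptor"], best["descriptor"]) > cw["threshold"]:
            return False, None
        # Each matched code-word votes for an object centroid.
        predicted_centroids.append(best["centroid"] - cw["relative_position"])
    # Geometric constraint: all centroid votes must (roughly) concur.
    votes = np.array(predicted_centroids)
    if np.linalg.norm(votes - votes.mean(axis=0), axis=1).max() > tol:
        return False, None
    return True, votes.mean(axis=0)
```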
Matching a Code-Word Combination • Example:
Finding all matched code-word combinations in training images • Goal: find an exhaustive set of candidate code-word combinations. • Method (similar to a sliding-window search): • For each candidate window at a given scale and location in image I, check whether there is a match for each code-word; every combination of the matched code-words becomes a candidate combination.
Finding all matched code-word combinations in training images • A binary variable is specified to represent the matching condition of each code-word at a given scale and location.
Finding all matched code-word combinations in training images • If that variable is set, the code-word is said to be matched at that scale and location. • Any combination of these matched code-words produces a candidate combination. • Why is the geometric constraint not considered here?
Finding all matched code-word combinations in training images
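A sketch of the per-window matching indicators described above. The dictionary fields are hypothetical, the window's shape-tokens are assumed to be already normalized to the window scale, and only the shape constraint is checked, mirroring the slides.

```python
def matched_codewords(codewords, window_tokens, shape_distance):
    """Return one binary flag per code-word for a single candidate window:
    1 if some shape-token in the window passes that code-word's shape-distance
    threshold, else 0.  Any subset of the matched code-words then yields a
    candidate combination for the exhaustive enumeration."""
    return [
        1 if any(shape_distance(cw["descriptor"], t["descriptor"]) <= cw["threshold"]
                 for t in window_tokens) else 0
        for cw in codewords
    ]
```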
Learning Discriminative xCC • We would like to obtain an xCC that satisfies three constraints: • Shape constraint: closely tied to how the codebook was established. • Geometric constraint: agreement on the object location. • Structural constraint: a reasonable code-word combination for the different poses of the object.
Learning Discriminative xCC • Example:
Learning Discriminative xCC • A binary tree represents an xCC. • Each node is a decision statement:
Learning Discriminative xCC • An AdaBoost training procedure produces one xCC per iteration. • The binary tree depth k is chosen by 3-fold cross validation.
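As a rough stand-in for this step (not the authors' procedure), the boosting loop can be sketched with scikit-learn: each weak learner is a small decision tree over the binary code-word matching indicators, and each boosting round contributes one tree, analogous to one xCC per iteration. The data, depth, and round count below are placeholders.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Placeholder training matrix: one row per candidate window, one binary column
# per code-word (the matching indicators from the sliding-window step);
# y marks windows that actually contain the object.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 50))
y = rng.integers(0, 2, size=200)

# One small binary tree over code-word matches plays the role of one xCC;
# its depth (here 3) is the kind of value the slide says was picked by
# 3-fold cross validation.
tree = DecisionTreeClassifier(max_depth=3)
detector = AdaBoostClassifier(tree, n_estimators=20)   # 20 boosting rounds
detector.fit(X, y)
print(detector.predict(X[:5]))                         # window-level decisions
```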
Learning Discriminative xCC • Example:
Learning Discriminative xCC • Example:
Learning Discriminative xCC • Example:
Hybrid Detector xMCC • Incorporate SIFT as appearance information to enhance performance. • Procedure: the same as in the previous section.
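The slide only says that SIFT is added as appearance information, so the sketch below covers just the appearance side using OpenCV; the file name is a placeholder, and how the paper mixes appearance and shape code-words is not detailed here.

```python
import cv2

# Extract SIFT keypoints and 128-D descriptors from a training image; each
# (keypoint location, descriptor) pair can then be clustered into appearance
# code-words and combined with shape code-words into an xMCC.
img = cv2.imread("training_image.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
print(len(keypoints), descriptors.shape)
```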
Hybrid Detector xMCC • Example:
Hybrid Detector xMCC • Example:
Hybrid Detector xMCC • Example:
Experimental Result • Contour-only results under viewpoint change (trained on side views only).
Experimental Result • Contour-only results for discriminating object classes with similar shapes.
Experimental Result • Comparison with Shotton [6] on the Weizmann Horse test set. • Shotton [6] uses contour fragments and a fixed number of code-words per combination.
Experimental Result • Weizmann Horse Test Set.
Experimental Result • Graz-17 classes.
Experimental Result • Graz-17 dataset.
Experimental Result • Hybrid-method results
Conclusion • This article provides a contour-based method that exploits very simple and generic shape primitives, line segments and ellipses, for image classification and object detection. • Novelty: • Shape-Tokens reduce matching time and memory storage requirements. • No restriction on the number of shape-tokens per combination. • Combinations of different feature types are allowed.