Venus. Classification. Faces – Different. Faces – Same. Lighting affects appearance. Three-point alignment. Object Alignment.
Object Alignment. Given three model points P1, P2, P3 and three image points p1, p2, p3, there is a unique transformation (rotation R, translation d, scale s) that aligns the model with the image: sR Pi + d = pi (up to orthographic projection).
Alignment – comments • The projection is orthographic (combined with scaling). • The 3 points must be non-collinear. • The transformation is determined up to a reflection of the points about the image plane and a translation in depth.
Proof of the 3-point alignment: The three 3-D points are P1, P2, P3; we may assume they lie initially in the image plane. In the 2-D image we observe q1, q2, q3. The correspondence P1 → q1, P2 → q2, P3 → q3 defines a unique linear transformation of the plane, L(x). L is a 2×2 matrix, and it is easily recovered: fix the origin at P1 = q1; the two remaining points supply four linear equations for the elements of L. Now choose two orthogonal vectors E1 and E2 in the original plane of P1, P2, P3 and compute E1' = L(E1), E2' = L(E2). We seek a scaling S and rotation R such that the projection of SR(E1) is E1' and the projection of SR(E2) is E2'. Let V1 = SR(E1) and V2 = SR(E2) (without the projection). V1 is E1' plus a depth component, that is, V1 = E1' + c1·z, where z is a unit vector in the z direction; similarly V2 = E2' + c2·z. We wish to recover c1 and c2; this determines the transformation (it remains to show the solution is unique). Since E1 ⊥ E2, the scalar product V1·V2 = 0: (E1' + c1z)·(E2' + c2z) = 0, therefore c1c2 = −(E1'·E2'). The quantity −(E1'·E2') is measurable in the image; call it c12, so c1c2 = c12. Also |V1| = |V2|, therefore (E1' + c1z)·(E1' + c1z) = (E2' + c2z)·(E2' + c2z), which implies c1² − c2² = k12, where k12 = |E2'|² − |E1'|² is again measurable in the image. The two equations for c1, c2 are: c1c2 = c12 and c1² − c2² = k12, and they have a unique solution up to sign. One way of seeing this is to set the complex number Z = c1 + i·c2. Then Z² = (c1² − c2²) + 2i·c1c2 = k12 + 2i·c12, so Z² is measurable. Taking the square root yields Z, and hence c1, c2. There are exactly two roots, ±Z, giving the two mirror-reflection solutions.
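The depth-recovery step of the proof can be checked numerically. The following is a minimal sketch, assuming orthographic projection; the particular scale, rotation axis, and angle are made-up test values, not from the slides.

```python
import numpy as np

# Recover the depth components c1, c2 of two projected orthogonal vectors,
# using the two relations from the proof:
#   c1*c2 = -(E1' . E2')            (from V1 . V2 = 0)
#   c1^2 - c2^2 = |E2'|^2 - |E1'|^2 (from |V1| = |V2|)
# via the complex number Z = c1 + i*c2, so Z^2 = k12 + 2i*c12.
def recover_depths(E1p, E2p):
    c12 = -np.dot(E1p, E2p)
    k12 = np.dot(E2p, E2p) - np.dot(E1p, E1p)
    Z = np.sqrt(complex(k12, 2 * c12))
    return Z.real, Z.imag   # the other root, -Z, is the mirror solution

# Test case (hypothetical values): scale s = 2, rotation about the x-axis by 30 deg
s, a = 2.0, np.deg2rad(30)
R = np.array([[1, 0, 0],
              [0, np.cos(a), -np.sin(a)],
              [0, np.sin(a),  np.cos(a)]])
E1, E2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
V1, V2 = s * R @ E1, s * R @ E2          # un-projected transformed vectors

c1, c2 = recover_depths(V1[:2], V2[:2])  # use only the 2-D projections
print(np.allclose([abs(c1), abs(c2)], [abs(V1[2]), abs(V2[2])]))  # prints True
```

The absolute values are compared because, as the proof notes, the depths are recovered only up to a joint sign flip (the mirror-reflection ambiguity).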
Linear Combination of Views. O is a set of object points; I1, I2, I3 are three images of O from different views, and N is a novel view of O. Then N is a linear combination of I1, I2, I3.
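A small numerical sketch of this claim, under the orthographic-projection assumption of the alignment slides; the random point cloud and view angles are made up for the test.

```python
import numpy as np

# Under orthographic projection, each image coordinate of a rigid object is a
# linear function of its 3-D coordinates, so the coordinates of a novel view
# lie in the span of a few model views.
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 3))                 # 10 model points (hypothetical)

def view(angle):
    # orthographic image (x, y) after rotating the object about the y-axis
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    return (P @ R.T)[:, :2]

I1, I2, I3 = view(0.0), view(0.4), view(0.9)  # three model views
N = view(1.3)                                 # novel view

# Stack the model views' coordinate columns and solve for the coefficients
A = np.column_stack([I1, I2, I3])             # 10 x 6
coef, *_ = np.linalg.lstsq(A, N, rcond=None)
print(np.allclose(A @ coef, N, atol=1e-8))    # prints True: N is reconstructed
```

Rotation about a single axis is the simplest case; the same linear-span argument holds for general rotations.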
Structural Description (graph of relations between parts): G1 Above G2; G2 Touch/Above G3; G3 Left-of G4; G4 Right-of G3.
Mutual Information and Entropy. For a binary variable: H(C) = −[P(C=1) log P(C=1) + P(C=0) log P(C=0)]
Mutual information: I(C;F) = H(C) − H(C|F), where H(C|F) averages H(C) when F=1 and H(C) when F=0 over P(F).
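As a concrete sketch, the mutual information between a binary class C and a binary fragment detection F can be computed from a joint probability table; the probabilities below are hypothetical.

```python
import math

def entropy(ps):
    # Shannon entropy in bits of a probability vector
    return -sum(p * math.log2(p) for p in ps if p > 0)

def mutual_information(joint):
    # joint[c][f] = P(C=c, F=f); returns I(C;F) = H(C) - H(C|F)
    pc = [sum(row) for row in joint]
    pf = [joint[0][f] + joint[1][f] for f in (0, 1)]
    h_c_given_f = sum(
        pf[f] * entropy([joint[c][f] / pf[f] for c in (0, 1)])
        for f in (0, 1) if pf[f] > 0)
    return entropy(pc) - h_c_given_f

# Hypothetical fragment: detected in 40% of class images, 5% of non-class
# images, with equiprobable classes.
joint = [[0.475, 0.025],   # C=0 (non-class): F=0, F=1
         [0.300, 0.200]]   # C=1 (class):     F=0, F=1
print(round(mutual_information(joint), 3))
```

An informative fragment is one that is common in the class and rare outside it, which is exactly what drives I(C;F) up.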
Fragments Selection • For a set of training images: • Generate candidate fragments • Measure p(F|C), p(F|NC) • Compute mutual information • Select the optimal fragment • After k fragments: select the fragment maximizing the minimal addition in mutual information with respect to each of the first k fragments
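The max-min step above can be sketched as a greedy loop. This is a sketch only: the candidate names and the `pairwise_gain` values are hypothetical, standing in for precomputed additional-mutual-information scores.

```python
# Greedy max-min fragment selection: after k fragments are chosen, pick the
# candidate that maximizes its minimal *additional* mutual information with
# respect to each already-chosen fragment.
def select_fragments(candidates, pairwise_gain, first, k):
    # pairwise_gain(a, b): additional MI that candidate a contributes given
    # that fragment b is already selected (assumed precomputed).
    chosen = [first]
    while len(chosen) < k:
        best = max((c for c in candidates if c not in chosen),
                   key=lambda c: min(pairwise_gain(c, s) for s in chosen))
        chosen.append(best)
    return chosen

# Toy example: gains drop for candidates overlapping already-chosen fragments
gains = {("B", "A"): 0.05, ("C", "A"): 0.12, ("D", "A"): 0.10,
         ("B", "C"): 0.08, ("D", "C"): 0.03}
print(select_fragments("ABCD", lambda a, b: gains.get((a, b), 0.0), "A", 3))
# prints ['A', 'C', 'B']
```

Note that D, despite a high gain relative to A alone, loses to B because its gain relative to the already-chosen C is small; the min over chosen fragments penalizes redundancy.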
Face Fragments by Type. [Figure: example face fragments ranked 1st–4th, with merit values 0.20, 0.18, 0.18, 0.17, 0.16, 0.11, 0.10, 0.09 and weights 6.5, 5.5, 6.45, 5.45, 3.52, 2.9, 2.9, 2.86.]
Intermediate Complexity. [Figure: merit, weight, and relative mutual information plotted against relative object size and relative resolution; the curves peak at intermediate fragment size and resolution.]
Fragment ‘Weight’. Likelihood ratio: p(F|C) / p(F|NC). Weight of F: w(F) = log [ p(F|C) / p(F|NC) ]
Combining fragments: detectors D1, D2, …, Dk with weights w1, w2, …, wk
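The weight and combination slides together suggest a summed log-likelihood-ratio classifier. A minimal sketch, assuming independent fragment detections and equal class priors (so the decision threshold is 0); the detection probabilities are hypothetical.

```python
import math

def fragment_weight(p_f_given_c, p_f_given_nc):
    # weight of fragment F: log likelihood ratio log[p(F|C)/p(F|NC)]
    return math.log(p_f_given_c / p_f_given_nc)

def classify(detections, weights, threshold=0.0):
    # sum the weights of the detected fragments and threshold the total
    score = sum(w for d, w in zip(detections, weights) if d)
    return "class" if score > threshold else "non-class"

# Three hypothetical fragments with their in-class / out-of-class hit rates
weights = [fragment_weight(0.4, 0.05),
           fragment_weight(0.3, 0.10),
           fragment_weight(0.5, 0.20)]

print(classify([1, 0, 1], weights))   # prints "class": two strong fragments hit
print(classify([0, 0, 0], weights))   # prints "non-class": nothing detected
```

Summing log likelihood ratios is the naive-Bayes combination rule; the later slide on fragment counts notes that the independence assumption is only slightly violated in practice.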
Non-optimal Fragments: same total area covered (8× the object area), on a regular grid
Training & Test Images • Frontal faces without distinctive features (K:496,W:385) • Minimize background by cropping • Training images for extraction: 32 for each class • Training images for evaluation: 100 for each class • Test images: 253 for Western and 364 for Korean
Extracted Fragments. [Figure: top 10 Western and Korean fragments; fragment images omitted.] Scores: 0.92, 0.82, 0.77, 0.76, 0.75, 0.74, 0.72, 0.68, 0.67, 0.65. Weights: 3.42, 2.40, 1.99, 2.23, 1.90, 2.11, 6.58, 4.14, 4.12, 6.47.
Classifying novel images: detect fragments → compare summed weights Σ wF → decision: Westerner / Unknown / Korean
Effect of Number of Fragments • 7 fragments: 95% accuracy; 80 fragments: 100% • Indicates inherent redundancy among the features • And only a slight violation of the independence assumption
Comparison with Humans • The algorithm outperformed humans on low-resolution images