Structure from Motion
Structure from Motion. Introduction to Computer Vision CS223B, Winter 2005 Richard Szeliski. Today’s lecture. Calibration estimating focal length and optic center triangulation and pose Structure from Motion two-frame methods factorization bundle adjustment (non-linear least squares)
Structure from Motion
E N D
Presentation Transcript
Structure from Motion Introduction to Computer VisionCS223B, Winter 2005Richard Szeliski
Today’s lecture • Calibration • estimating focal length and optic center • triangulation and pose • Structure from Motion • two-frame methods • factorization • bundle adjustment (non-linear least squares) • robust statistics and RANSAC • correspondence Structure from Motion
Camera calibration • Determine camera parameters from known 3D points or calibration object(s) • internal or intrinsic parameters such as focal length, optical center, aspect ratio:what kind of camera? • external or extrinsic (pose)parameters:where is the camera? • How can we do this? Structure from Motion
Camera calibration – approaches • Possible approaches: • linear regression (least squares) • non-linear optimization • vanishing points • multiple planar patterns • panoramas (rotational motion) Structure from Motion
(Xc,Yc,Zc) uc f u Image formation equations Structure from Motion
Calibration matrix • Is this form of K good enough? • non-square pixels (digital video) • skew • radial distortion Structure from Motion
Camera matrix • Fold intrinsic calibration matrix K and extrinsic pose parameters (R,t) together into acamera matrix • M = K [R | t ] • (put 1 in lower r.h. corner for 11 d.o.f.) Structure from Motion
Camera matrix calibration • Directly estimate 11 unknowns in the M matrix using known 3D points (Xi,Yi,Zi) and measured feature positions (ui,vi) Structure from Motion
Camera matrix calibration • Linear regression: • Bring denominator over, solve set of (over-determined) linear equations. How? • Least squares (pseudo-inverse) • Is this good enough? Structure from Motion
Optimal estimation • Feature measurement equations • Likelihood of M given {(ui,vi)} Structure from Motion
Optimal estimation • Log likelihood of M given {(ui,vi)} • How do we minimize C? • Non-linear regression (least squares), because ûi and vi are non-linear functions of M Structure from Motion
Levenberg-Marquardt • Iterative non-linear least squares [Press’92] • Linearize measurement equations • Substitute into log-likelihood equation: quadratic cost function in Dm Structure from Motion
Levenberg-Marquardt • Iterative non-linear least squares [Press’92] • Solve for minimumHessian:error: • Does this look familiar…? Structure from Motion
Levenberg-Marquardt • What if it doesn’t converge? • Multiply diagonal by (1 + l), increase l until it does • Halve the step size Dm (my favorite) • Use line search • Other ideas? • Uncertainty analysis: covariance S = A-1 • Is maximum likelihood the best idea? • How to start in vicinity of global minimum? Structure from Motion
Camera matrix calibration • Advantages: • very simple to formulate and solve • can recover K [R | t] from M using QR decomposition [Golub & VanLoan 96] • Disadvantages: • doesn't compute internal parameters • more unknowns than true degrees of freedom • need a separate camera matrix for each new view Structure from Motion
Separate intrinsics / extrinsics • New feature measurement equations • Use non-linear minimization • Standard technique in photogrammetry, computer vision, computer graphics • [Tsai 87] – also estimates k1 (freeware @ CMU)http://www.cs.cmu.edu/afs/cs/project/cil/ftp/html/v-source.html • [Bogart 91] – View Correlation Structure from Motion
Separate intrinsics / extrinsics • How do we parameterize R and ΔR? • Euler angles: bad idea • quaternions: 4-vectors on unit sphere • use incremental rotation R(I + DR) • update with Rodriguez formula Structure from Motion
Intrinsic/extrinsic calibration • Advantages: • can solve for more than one camera pose at a time • potentially fewer degrees of freedom • Disadvantages: • more complex update rules • need a good initialization (recover K [R | t] from M) Structure from Motion
u0 u1 u2 Vanishing Points • Determine focal length f and optical center (uc,vc) from image of cube’s(or building’s) vanishing points[Caprile ’90][Antone & Teller ’00] Structure from Motion
(Xc,Yc,Zc) uc f u u0 u1 u2 Vanishing Points • X, Y, and Z directions, Xi = (1,0,0,0) … (0,0,1,0) correspond to vanishing points that are scaled version of the rotation matrix: Structure from Motion
u0 u1 u2 Vanishing Points • Orthogonality conditions on rotation matrix R, • ri¢rj = dij • Determine (uc,vc) from orthocenter of vanishing point triangle • Then, determine f2 from twoequations(only need 2 v.p.s if (uc,vc) known) Structure from Motion
Vanishing point calibration • Advantages: • only need to see vanishing points(e.g., architecture, table, …) • Disadvantages: • not that accurate • need rectahedral object(s) in scene Structure from Motion
Single View Metrology • A. Criminisi, I. Reid and A. Zisserman (ICCV 99) • Make scene measurements from a single image • Application: 3D from a single image • Assumptions • 3 orthogonal sets of parallel lines • 4 known points on ground plane • 1 height in the scene • Can still get an affine reconstruction without 2 and 3 Structure from Motion
Criminisi et al., ICCV 99 • Complete approach • Load in an image • Click on parallel lines defining X, Y, and Z directions • Compute vanishing points • Specify points on reference plane, ref. height • Compute 3D positions of several points • Create a 3D model from these points • Extract texture maps • Output a VRML model Structure from Motion
3D Modeling from a Photograph Structure from Motion
3D Modeling from a Photograph Structure from Motion
Multi-plane calibration • Use several images of planar target held at unknown orientations [Zhang 99] • Compute plane homographies • Solve for K-TK-1 from Hk’s • 1plane if only f unknown • 2 planes if (f,uc,vc) unknown • 3+ planes for full K • Code available from Zhang and OpenCV Structure from Motion
f=468 f=510 Rotational motion • Use pure rotation (large scene) to estimate f • estimate f from pairwise homographies • re-estimate f from 360º “gap” • optimize over all {K,Rj} parameters[Stein 95; Hartley ’97; Shum & Szeliski ’00; Kang & Weiss ’99] • Most accurate way to get f, short of surveying distant points Structure from Motion
Pose estimation • Once the internal camera parameters are known, can compute camera pose [Tsai87] [Bogart91] • Application: superimpose 3D graphics onto video • How do we initialize (R,t)? Structure from Motion
Pose estimation • Previous initialization techniques: • vanishing points [Caprile 90] • planar pattern [Zhang 99] • Other possibilities • Through-the-Lens Camera Control [Gleicher92]: differential update • 3+ point “linear methods”: [DeMenthon 95][Quan 99][Ameller 00] Structure from Motion
(Xc,Yc,Zc) uc f u Pose estimation • Solve orthographic problem, iterate [DeMenthon 95] • Use inter-point distance constraints [Quan 99][Ameller 00] Solve set of polynomial equations in xi2p Structure from Motion
Triangulation • Problem: Given some points in correspondence across two or more images (taken from calibrated cameras), {(uj,vj)}, compute the 3D location X Structure from Motion
Triangulation • Method I: intersect viewing rays in 3D, minimize: • X is the unknown 3D point • Cj is the optical center of camera j • Vj is the viewing ray for pixel (uj,vj) • sj is unknown distance along Vj • Advantage: geometrically intuitive X Vj Cj Structure from Motion
Triangulation • Method II: solve linear equations in X • advantage: very simple • Method III: non-linear minimization • advantage: most accurate (image plane error) Structure from Motion
Structure from motion • Given many points in correspondence across several images, {(uij,vij)}, simultaneously compute the 3D location xi and camera (or motion) parameters (K, Rj, tj) • Two main variants: calibrated, and uncalibrated (sometimes associated with Euclidean and projective reconstructions) Structure from Motion
Structure from motion • How many points do we need to match? • 2 frames: (R,t): 5 dof + 3n point locations 4n point measurements n 5 • k frames: 6(k–1)-1 + 3n 2kn • always want to use many more Structure from Motion
Two-frame methods • Two main variants: • Calibrated: “Essential matrix” Euse ray directions (xi, xi’ ) • Uncalibrated: “Fundamental matrix” F • [Hartley & Zisserman 2000] Structure from Motion
Essential matrix • Co-planarity constraint: • x’ ≈ R x + t • [t]x’ ≈[t] R x • x’T[t]x’ ≈x’ T[t] R x • x’ T E x = 0 withE =[t] R • Solve for E using least squares (SVD) • t is the least singular vector of E • R obtained from the other two s.v.s Structure from Motion
Fundamental matrix • Camera calibrations are unknown • x’ F x = 0 withF = [e] H = K’[t] R K-1 • Solve for F using least squares (SVD) • re-scale (xi, xi’ ) so that |xi|≈1/2 [Hartley] • e (epipole) is still the least singular vector of F • H obtained from the other two s.v.s • “plane + parallax” (projective) reconstruction • use self-calibration to determine K [Pollefeys] Structure from Motion
Three-frame methods • Trifocal tensor • [Hartley & Zisserman 2000] Structure from Motion
Factorization [Tomasi & Kanade, IJCV 92]
Structure [from] Motion • Given a set of feature tracks,estimate the 3D structure and 3D (camera) motion. • Assumption: orthographic projection • Tracks: (ufp,vfp), f: frame, p: point • Subtract out mean 2D position… • ufp = ifT spif: rotation,sp: position • vfp = jfT sp Structure from Motion
Measurement equations • Measurement equations • ufp = ifT spif: rotation,sp: position • vfp = jfT sp • Stack them up… • W = R S • R = (i1,…,iF, j1,…,jF)T • S = (s1,…,sP) Structure from Motion
Factorization • W = R2F3 S3P • SVD • W = U Λ V Λmust be rank 3 • W’ = (U Λ 1/2)(Λ1/2 V) = U’ V’ • MakeRorthogonal • R = QU’ , S = Q-1V’ • ifTQTQif = 1 … Structure from Motion
Results • Look at paper figures… Structure from Motion
Extensions • Paraperspective • [Poelman & Kanade, PAMI 97] • Sequential Factorization • [Morita & Kanade, PAMI 97] • Factorization under perspective • [Christy & Horaud, PAMI 96] • [Sturm & Triggs, ECCV 96] • Factorization with Uncertainty • [Anandan & Irani, IJCV 2002] Structure from Motion