Advanced Image Congealing Techniques for Optimal Alignment in Computer Vision
This presentation by Boris Kimelman covers image congealing: aligning multiple images of an object to a common canonical template. Image similarity is hard to measure because of illumination, occlusion, and misalignment, and the talk describes two main approaches to alignment: feature-based and similarity-based. Applications discussed include batch image alignment, facial contour detection, and video stabilization. The talk reviews congealing with SIFT descriptors (Huang, Jain, and Learned-Miller) and RASL, a robust alignment method based on sparse and low-rank decomposition.
Presentation Transcript
Image Congealing: (batch/multiple) image (alignment/registration) • Advanced Topics in Computer Vision (048921) • Boris Kimelman
Introduction • Dramatic increase in popularity of image and video sharing sites • Hard to measure image similarity: • Illumination • Occlusion • Misalignment
Problem Definition • Align many images of an object to a fixed canonical template • Two main approaches: • Feature based • Similarity based • Mathematically: given images $I_1, \dots, I_n$, compute transformations $\tau_1, \dots, \tau_n$ such that the images $I_1 \circ \tau_1, \dots, I_n \circ \tau_n$ are aligned
Applications • Batch image alignment (congealing) • Identification pre-processing • Video stabilization • Background segmentation • Facial contour detection • Inpainting
Congealing example • Input images • Input images realigned using the transformations computed by RASL
Unsupervised Joint Alignment of Complex Images • Gary B Huang, Vidit Jain, Erik Learned-Miller • ICCV 2007
Basic assumptions • Input images have similar structure and shape • Thus, low variability of pixel values at a specific location • Distribution field: empirical density function at each pixel, estimated from the pixel stack (the values at that location across all images)
Basic algorithm • Input: images • Iterate: • Compute the empirical distribution field for the image set • Find a transformation that reduces the entropy of the distribution field • Output: aligned images, distribution fields • Each stage increases image likelihood
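The iteration above can be sketched on toy data. This is a minimal illustration, not the authors' implementation: it assumes binary images, restricts the transformations to integer horizontal shifts, and uses greedy coordinate descent. The names `stack_entropy` and `congeal` are hypothetical.

```python
import numpy as np

def stack_entropy(images):
    """Sum over pixel locations of the binary entropy of each pixel stack."""
    p = images.mean(axis=0)               # empirical P(pixel = 1) at each location
    p = np.clip(p, 1e-12, 1 - 1e-12)      # avoid log(0)
    h = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    return float(h.sum())

def congeal(images, n_iters=10, steps=(0, -1, 1)):
    """Greedy coordinate descent over integer shifts (assumed transformation family).

    Each image in turn keeps the candidate shift that gives the lowest total
    entropy. Shift 0 is always a candidate, so the entropy never increases.
    """
    images = images.copy()
    shifts = np.zeros(len(images), dtype=int)
    for _ in range(n_iters):
        for i in range(len(images)):
            candidates = [np.roll(images[i], s, axis=-1) for s in steps]
            scores = []
            for c in candidates:
                trial = images.copy()
                trial[i] = c
                scores.append(stack_entropy(trial))
            best = int(np.argmin(scores))
            images[i] = candidates[best]
            shifts[i] += steps[best]
    return images, shifts

# Toy data: the same 1-D "bump" observed at different offsets.
base = np.zeros(20)
base[8:12] = 1.0
observed = np.stack([np.roll(base, s) for s in (0, 2, -1, 3, -2)])
aligned, shifts = congeal(observed)
print(stack_entropy(observed), stack_entropy(aligned))
```

Greedy descent can stop in a local minimum, which is one reason practical systems use richer transformation families and smoothed distribution fields.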
Funneling: new image alignment • Naive option: add the new image to the training set and re-run congealing • Instead, save the sequence of distribution fields and increase the likelihood of the new image at each iteration • Image funnel: new image → aligned image
Congealing Color Images • Attempt 1: choose the per-pixel feature to be color values • Attempt 2: choose the per-pixel feature as an indicator of edge presence • Attempt 3: choose the per-pixel feature as a SIFT descriptor
Congealing with SIFT descriptor (1) • Cluster SIFT descriptors using k-means • Congealing on hard assignments forces pixels to take a relatively small number of values • Use soft assignment of pixels to clusters (GMM EM) • Analogy with grayscale using a binary alphabet
Congealing with SIFT descriptor (2) • Window around pixel → SIFT vector and clusters → posterior distribution
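The soft assignment step can be illustrated as the posterior over clusters for each descriptor. This is a sketch under stated assumptions: the mixture parameters are taken as already fitted, every component shares one spherical variance `sigma`, and the function name `soft_assign` is hypothetical. It shows only the posterior computation, not the EM fit mentioned on the slide.

```python
import numpy as np

def soft_assign(descriptors, means, sigma=1.0, weights=None):
    """Posterior P(cluster | descriptor) under a spherical Gaussian mixture.

    descriptors: (n, d) array, one descriptor per pixel window.
    means:       (k, d) array of cluster centres (for example from k-means).
    sigma:       shared standard deviation (an assumption of this sketch).
    weights:     optional (k,) mixing weights; uniform if omitted.
    """
    k = means.shape[0]
    w = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, dtype=float)
    # Squared distance from every descriptor to every cluster centre.
    d2 = ((descriptors[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    log_post = np.log(w)[None, :] - d2 / (2.0 * sigma ** 2)
    log_post -= log_post.max(axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

means = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [4.9, 5.2], [2.5, 2.5]])
print(soft_assign(descriptors, means, sigma=2.0))
```

Each row is a distribution over clusters, so a pixel contributes fractional mass to several clusters instead of a single hard label.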
Labeled Faces in the Wild database • 13233 images • Size: 250X250X16MB • 5749 people • 1680 people with two or more images
Align for identification • Hyper-feature-based identifier
Evaluation • LFW database contribution • Novel: information-theoretic point of view; funneling process; demo code available • Results: no measure of alignment accuracy; comparison only against a face alignment algorithm • Writing level: convincing; illustrations would help
RASL: Robust Alignment by Sparse and Low-Rank Decomposition for Linearly Correlated Images • Yigang Peng, Arvind Balasubramanian, John Wright, Ma Yi • CVPR 2011
How to measure image similarity? • Learned-Miller: minimize the sum of entropies of pixel stacks • Least-squares congealing: minimize the norm of the differences between images • If the criterion is satisfied exactly, the matrix of stacked images has rank 1 • Generalize: lower the rank as much as possible
Basic Assumptions • Input images exhibit high linear correlation • If $I_1^0, \dots, I_n^0$ are $n$ aligned images then $A = [\mathrm{vec}(I_1^0) \mid \cdots \mid \mathrm{vec}(I_n^0)]$ is low rank • A practical assumption: errors are large in magnitude but sparse
Mathematical Formulation • The model is: $D = A + E$, where the columns of $D$ are the vectorized observed images • $A$ is a low-rank matrix that models the linear structure of the image batch • $E$ is a matrix of large but sparse errors that models corruption, occlusion, and shadows
Graphical Explanation • Matrix of corrupted observations = underlying low-rank matrix + sparse error matrix
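Modeling Misalignment • The model above assumes that the images are aligned • Instead of observing the aligned images $I_i^0$, we observe misaligned images $I_i$ such that $I_i \circ \tau_i = I_i^0$ for some transformations $\tau_i$ • Problem: given observations $I_1, \dots, I_n$, recover the aligned images and the transformations $\tau_1, \dots, \tau_n$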
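A small numerical check shows why misalignment breaks the low-rank assumption. This is an illustrative toy, assuming 1-D "images" that are scaled copies of one random template and circular shifts as the misalignment; none of these choices come from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
template = rng.standard_normal(50)
gains = np.array([1.0, 0.8, 1.2, 0.9, 1.1])

# Aligned batch: every column is a scaled copy of the template, so rank is 1.
aligned = np.stack([g * template for g in gains], axis=1)          # shape (50, 5)

# Misaligned batch: each column is circularly shifted by a different amount.
shifts = [0, 1, 2, 3, 4]
misaligned = np.stack(
    [np.roll(aligned[:, j], s) for j, s in enumerate(shifts)], axis=1
)

rank_aligned = np.linalg.matrix_rank(aligned)
rank_misaligned = np.linalg.matrix_rank(misaligned)
print(rank_aligned, rank_misaligned)
```

The rank of the stacked matrix rises as soon as the columns are shifted relative to each other, which is what makes rank usable as an alignment criterion.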
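Optimization Formulation (1) • Find a low-rank matrix $A$ and a sparse matrix $E$ such that $D \circ \tau = A + E$, where $D \circ \tau = [\mathrm{vec}(I_1 \circ \tau_1) \mid \cdots \mid \mathrm{vec}(I_n \circ \tau_n)]$ • In Lagrangian form: $\min_{A, E, \tau} \operatorname{rank}(A) + \gamma \|E\|_0$ subject to $D \circ \tau = A + E$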
Optimization Formulation (2) • Cost function is highly non-convex and discontinuous • Replace the cost function with its convex surrogate: $\min_{A, E, \tau} \|A\|_* + \lambda \|E\|_1$ subject to $D \circ \tau = A + E$
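With the transformations held fixed (images already aligned), the convex surrogate reduces to a low-rank plus sparse decomposition of a single matrix. The sketch below solves that special case with an inexact augmented Lagrangian scheme. It is an assumption-laden illustration and not the RASL solver itself: the alignment terms are omitted, the default weight `1/sqrt(max(m, n))` is a common choice for this kind of problem and not a value taken from the talk, and the names `shrink`, `svt`, and `low_rank_plus_sparse` are hypothetical.

```python
import numpy as np

def shrink(X, t):
    """Soft thresholding: proximal operator of t * (l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def svt(X, t):
    """Singular value thresholding: proximal operator of t * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, t)) @ Vt

def low_rank_plus_sparse(D, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Minimize ||A||_* + lam * ||E||_1 subject to D = A + E (D must be nonzero)."""
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    spec = np.linalg.norm(D, 2)                      # largest singular value
    Y = D / max(spec, np.abs(D).max() / lam)         # Lagrange multiplier estimate
    mu = 1.25 / spec                                 # penalty parameter
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    d_norm = np.linalg.norm(D, "fro")
    for _ in range(max_iter):
        A = svt(D - E + Y / mu, 1.0 / mu)            # update low-rank part
        E = shrink(D - A + Y / mu, lam / mu)         # update sparse part
        R = D - A - E                                # constraint residual
        Y = Y + mu * R
        mu *= rho
        if np.linalg.norm(R, "fro") <= tol * d_norm:
            break
    return A, E

# Synthetic batch: rank-2 structure plus about 5% large sparse corruptions.
rng = np.random.default_rng(0)
L_true = rng.standard_normal((100, 2)) @ rng.standard_normal((2, 60))
mask = rng.random(L_true.shape) < 0.05
S_true = mask * rng.uniform(-10.0, 10.0, size=L_true.shape)
A_hat, E_hat = low_rank_plus_sparse(L_true + S_true)
print(np.linalg.norm(A_hat - L_true) / np.linalg.norm(L_true))
```

In the full alignment problem this decomposition is solved repeatedly, once per linearization of the transformations.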
Nuclear norm • $\|A\|_*$ is the nuclear norm: $\|A\|_* = \sum_i \sigma_i(A)$ • $\sigma_i(A)$ are the singular values of $A$ • Resembles the replacement of the $\ell_0$-norm by the $\ell_1$-norm
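A short numerical comparison illustrates why the nuclear norm is a better-behaved objective than the rank. The example is a sketch on random data; the helper name `nuclear_norm` is hypothetical, and `np.linalg.norm(..., "nuc")` is NumPy's built-in equivalent.

```python
import numpy as np

def nuclear_norm(M):
    """Sum of the singular values of M."""
    return float(np.linalg.svd(M, compute_uv=False).sum())

rng = np.random.default_rng(0)
A = np.outer(rng.standard_normal(6), rng.standard_normal(4))   # exactly rank 1
noise = 1e-6 * rng.standard_normal(A.shape)                    # tiny perturbation

# Rank jumps from 1 to full rank under a negligible perturbation ...
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A + noise))
# ... while the nuclear norm changes only slightly.
print(nuclear_norm(A), nuclear_norm(A + noise))
```

The rank is discontinuous in the matrix entries, whereas the nuclear norm is convex and varies continuously.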
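Constraint Linearization • Linearize the non-linear constraint and iterate: $D \circ (\tau + \Delta\tau) \approx D \circ \tau + \sum_{i=1}^{n} J_i \, \Delta\tau \, \epsilon_i \epsilon_i^T$, where $J_i$ is the Jacobian of the $i$-th warped image with respect to $\tau_i$ and $\{\epsilon_i\}$ is the standard basis of $\mathbb{R}^n$ • Problem to solve: $\min_{A, E, \Delta\tau} \|A\|_* + \lambda \|E\|_1$ subject to $D \circ \tau + \sum_{i=1}^{n} J_i \, \Delta\tau \, \epsilon_i \epsilon_i^T = A + E$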
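The linearization step is a first-order Taylor expansion of the warped image in the transformation parameters. The sketch below checks this on a toy case, assuming a 1-D Gaussian-shaped "image" and a pure translation as the transformation; the function `image` and all parameter values are illustrative.

```python
import numpy as np

x = np.arange(100, dtype=float)

def image(coords):
    """Toy 1-D image: a smooth Gaussian bump centred at 50."""
    return np.exp(-((coords - 50.0) ** 2) / (2.0 * 5.0 ** 2))

tau, dtau = 2.0, 0.3                    # current translation and a small update

warped = image(x + tau)                 # I o tau
jacobian = np.gradient(warped, x)       # d(I o tau)/d(tau), by finite differences
exact = image(x + tau + dtau)           # I o (tau + dtau)
linearized = warped + jacobian * dtau   # first-order approximation

err_zero = np.abs(exact - warped).max()       # error if the update is ignored
err_lin = np.abs(exact - linearized).max()    # error of the linearization
print(err_zero, err_lin)
```

The approximation is only accurate for small updates, which is why the linearized problem has to be re-solved after each update of the transformations.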
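RASL Algorithm • Input: images $I_1, \dots, I_n$ and initial transformations $\tau_1, \dots, \tau_n$ • Iterate: • Compute Jacobian: $J_i = \frac{\partial}{\partial \zeta} \mathrm{vec}(I_i \circ \zeta) \big|_{\zeta = \tau_i}$ • Warp: $D \circ \tau = [\mathrm{vec}(I_1 \circ \tau_1) \mid \cdots \mid \mathrm{vec}(I_n \circ \tau_n)]$ • Solve: $(A^*, E^*, \Delta\tau^*) = \arg\min_{A, E, \Delta\tau} \|A\|_* + \lambda \|E\|_1$ subject to $D \circ \tau + \sum_{i=1}^{n} J_i \, \Delta\tau \, \epsilon_i \epsilon_i^T = A + E$ • Update: $\tau \leftarrow \tau + \Delta\tau^*$ • Output: $A^*$, $E^*$, $\tau$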
Region of Attraction • Perturb each image by a transformation • Alignment is successful if the maximum difference of the eye corners across all pairs of images is less than one pixel in the canonical frame
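The success criterion can be written as a short check. This is a sketch under assumptions: "difference" is interpreted as Euclidean distance between corresponding eye-corner points, the points are given in the canonical frame, and the function name `alignment_successful` is hypothetical.

```python
import numpy as np

def alignment_successful(corners, threshold=1.0):
    """Check the maximum eye-corner discrepancy over all image pairs.

    corners: array of shape (n_images, n_points, 2) holding the (x, y)
             position of each eye corner in the canonical frame.
    Returns True if every corresponding pair of points, over every pair of
    images, is closer than `threshold` pixels.
    """
    corners = np.asarray(corners, dtype=float)
    diffs = corners[:, None] - corners[None, :]    # (n, n, n_points, 2)
    dists = np.linalg.norm(diffs, axis=-1)         # (n, n, n_points)
    return bool(dists.max() < threshold)

good = [[[10.0, 20.0], [30.0, 20.0]],
        [[10.3, 20.2], [30.1, 19.8]]]
bad = good + [[[12.0, 20.0], [32.0, 20.0]]]
print(alignment_successful(good), alignment_successful(bad))
```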
Evaluation • Novel: unifying framework for image congealing; rank minimization as image similarity; code available • Results: comprehensive algorithm assessment; comparison against only one algorithm; extensive site about rank minimization • Writing level: convincing; advanced mathematics required (optimization)
Future issues • Multi-sensor congealing: complex relationship between corresponding pixels • Learned-Miller: occlusion removal by interspace alignment • RASL: mix between image spaces; funneling