Large Scale Discovery of Spatially Related Images

Large Scale Discovery of Spatially Related Images
Ondřej Chum and Jiří Matas, Center for Machine Perception, Czech Technical University, Prague

Related Vision Problems
• Organize my holiday snapshots
  • Schaffalitzky and Zisserman ECCV’02
• Find images containing a given “object” (“window”)
  • Sivic ICCV’03, Nister CVPR’06, Jegou CVPR’07, Philbin CVPR’07, Chum ICCV’07
• Find small “object” in a film
  • Sivic and Zisserman CVPR’04
• Match and reconstruct Saint Marco
  • Snavely, Seitz and Szeliski SIGGRAPH’06

This Work
• Find and match ALL spatially related images in a large database, using only visual information, i.e. not using (Flickr) tags, EXIF info, GPS, …

Visual Only Approach
• Large database (100 000 images in our experiments)
• Find spatially related clusters
• Fast method, even for sizes up to 2^50 images
• Probability of successful discovery of a spatial relation between images is independent of database size

Image Clustering and its Time Complexity
Standard approach (using image retrieval):
• Take each image in turn
• Use an image retrieval system to retrieve related images
• Compute connected components of the graph
• Quadratic in the size of the database D, i.e. O(D^2); the multiplicative constant at the quadratic term is ~1, so the method is quadratic even for small D
Proposed method:
• 1. Seed Generation – hashing
  • characterize images by pseudo-random numbers stored in a hash table
  • time complexity equal to the sum of variances of Poisson distributions
  • linear for database size D ≈ 2^50
• 2. Seed Growing – retrieval
  • complete the clusters only for cluster members, c << D; complexity O(cD)
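The standard approach on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: `retrieve` is a hypothetical callable standing in for an image retrieval system, and union-find is just one convenient way to obtain the connected components.

```python
from typing import Callable, Dict, Iterable, List, Set


def cluster_by_retrieval(num_images: int,
                         retrieve: Callable[[int], Iterable[int]]) -> List[Set[int]]:
    """Baseline: query with every image, then take connected components.

    `retrieve(i)` is a hypothetical function returning the ids of images
    judged related to image i. Issuing one query per image is what makes
    the baseline quadratic when each query scans the whole database.
    """
    parent = list(range(num_images))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for query in range(num_images):
        for hit in retrieve(query):
            root_q, root_h = find(query), find(hit)
            if root_q != root_h:
                parent[root_h] = root_q

    components: Dict[int, Set[int]] = {}
    for i in range(num_images):
        components.setdefault(find(i), set()).add(i)
    # Singletons are images with no related image; drop them.
    return [c for c in components.values() if len(c) > 1]


# Toy usage: the link structure below is made up for illustration.
toy_links = {0: [1], 1: [2], 2: [], 3: [], 4: [5], 5: []}
clusters = cluster_by_retrieval(6, lambda i: toy_links[i])
```

The proposed method avoids the outer loop over all D images by issuing retrieval queries only for images that already belong to a seed or a growing cluster.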

Building on Two Methods
• Fast (low recall) seed generation based on hashing
  • Chum, Philbin, Isard, and Zisserman: Scalable Near Identical Image and Shot Detection, CIVR 2007
• Thorough (high recall) seed growing based on image retrieval
  • Chum, Philbin, Sivic, Isard, and Zisserman: Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval, ICCV 2007

Image Representation
[Figure: feature detector, SIFT descriptor [Lowe’04], vector quantization against a visual vocabulary, giving a bag of words (e.g. 2 1 0 0 4 1 0 0 ...) and from it a set of words]
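The difference between the bag-of-words and set-of-words representations can be shown with a small sketch. The feature list below is a made-up example (one visual word id per detected feature), and the function names are chosen here for illustration only.

```python
from collections import Counter


def bag_of_words(word_ids):
    """Count how many features were quantized to each visual word."""
    return Counter(word_ids)


def set_of_words(word_ids):
    """Keep only which visual words are present, discarding the counts."""
    return set(word_ids)


# Hypothetical output of vector quantization for one image.
features = [0, 0, 1, 4, 4, 4, 4, 5]

bow = bag_of_words(features)
# Dense histogram over a toy vocabulary of 8 words.
histogram = [bow[w] for w in range(8)]
words = set_of_words(features)
```

The set of words is the sparse binary vector on which set overlap and min-Hash operate.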

Hypothesizing Seeds with min-Hash
• Spatially related images share visual words
• Problem: robustly estimate the set overlap of high dimensional sparse binary vectors in constant time, independent of the dimensionality (d ≈ 10^5)
• Set overlap is estimated probabilistically via min-Hash
• Similar approach to LSH (locality sensitive hashing)
Image similarity is measured as a set overlap (using the min-Hash algorithm):
sim(A1, A2) = |A1 ∩ A2| / |A1 ∪ A2|
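For reference, the exact set overlap (intersection over union) that min-Hash approximates can be computed directly. A minimal sketch; the function name is chosen here for illustration.

```python
def set_overlap(a1, a2):
    """Exact set overlap: |A1 ∩ A2| / |A1 ∪ A2|."""
    a1, a2 = set(a1), set(a2)
    union = a1 | a2
    if not union:
        # Two empty sets: define the overlap as 0 to avoid dividing by zero.
        return 0.0
    return len(a1 & a2) / len(union)


# Two toy sets of visual word ids sharing 2 of 4 distinct words.
example = set_overlap({1, 2, 3}, {2, 3, 4})
```

Computing this exactly for every image pair is what the hashing step is designed to avoid.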

min-Hash
• According to some (replicable) key, select a small number of non-zero elements
• Similar vectors should have similar selected elements
• Key: generate a random number (a hash) for each dimension, then choose the non-zero element with the minimal value of the key
[Figure: worked example showing sparse binary vectors and the min-Hash values selected for each]
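The selection rule above can be sketched in a few lines. This is an illustrative version only: it stores one explicit random key per vocabulary entry, whereas a large-scale implementation would more likely use hash functions; all names are assumptions made for this example.

```python
import random


def make_hash_keys(vocabulary_size, num_hashes, seed=0):
    """One random key per visual word, for each independent min-Hash."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(vocabulary_size)]
            for _ in range(num_hashes)]


def min_hashes(words, keys):
    """For each key, pick the word in the (non-empty) set with the smallest key value."""
    return [min(words, key=key.__getitem__) for key in keys]


def estimate_overlap(words_a, words_b, keys):
    """Fraction of min-Hashes on which the two sets agree."""
    ha, hb = min_hashes(words_a, keys), min_hashes(words_b, keys)
    return sum(x == y for x, y in zip(ha, hb)) / len(keys)


# Toy usage: true overlap is 30 / 90 = 1/3.
keys = make_hash_keys(vocabulary_size=100, num_hashes=500, seed=1)
a = set(range(0, 60))
b = set(range(30, 90))
estimate = estimate_overlap(a, b, keys)
```

Two sets agree on a single min-Hash with probability equal to their set overlap, which is why the agreeing fraction serves as an estimate.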

Seed Generation: Probability of Success
An image pair forms a seed if at least one of k s-tuples of min-Hashes agrees. The probability that an image pair is retrieved is a function of the similarity:
P(retrieved) = 1 − (1 − sim^s)^k
where s, k are user-controllable parameters of the method:
• s governs the size of the hashing table
• k is the number of hashing tables
• Successfully retrieved pair of images = at least one collision in one of the tables (equivalent to AND-OR)
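A sketch of the AND-OR scheme, assuming each image already has at least s·k min-Hash values. The closed form used in `retrieval_probability` follows from requiring all s min-Hashes of a tuple to agree (AND) in at least one of k tables (OR). The values s = 3 and k = 512 in the usage lines are an assumption made here, chosen because they approximately reproduce the probabilities quoted elsewhere in these slides.

```python
from collections import defaultdict
from itertools import combinations


def retrieval_probability(sim, s, k):
    """P(at least one of k s-tuples collides) = 1 - (1 - sim**s)**k."""
    return 1.0 - (1.0 - sim ** s) ** k


def find_seeds(min_hash_table, s, k):
    """Return image pairs that collide in at least one of k hash tables.

    `min_hash_table` maps an image id to a list of at least s*k min-Hash
    values; table t uses the t-th consecutive group of s values as its key.
    """
    seeds = set()
    for t in range(k):
        buckets = defaultdict(list)
        for image_id, hashes in min_hash_table.items():
            sketch = tuple(hashes[t * s:(t + 1) * s])
            buckets[sketch].append(image_id)
        for ids in buckets.values():
            # Every pair sharing a bucket is a candidate seed.
            seeds.update(combinations(sorted(ids), 2))
    return seeds


# Toy usage with made-up min-Hash values, s = 2 and k = 2.
toy = {"a": [1, 2, 3, 4], "b": [1, 2, 9, 9], "c": [7, 7, 7, 7]}
toy_seeds = find_seeds(toy, s=2, k=2)

# Probabilities for similarities 0.05, 0.06, 0.07 under the assumed s, k.
probs = [retrieval_probability(sim, s=3, k=512) for sim in (0.05, 0.06, 0.07)]
```

Because only bucket collisions are examined, the cost depends on the number of colliding pairs rather than on all D^2 image pairs.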

Probability of Retrieving an Image Pair
[Figure: probability of retrieval plotted against similarity (set overlap). Regions labelled: images of the same object and unrelated images; near duplicate images. Example image pairs: 13.9% (sim = 0.066), 100% (sim = 0.746), 100% (sim = 0.322), 8.9% (sim = 0.057), 99.5% (sim = 0.217), 5.1% (sim = 0.047)]

Spatially Related Images
[Figure: probability of retrieval (log scale) plotted against similarity (set overlap). Example image pairs: 5.1% (sim = 0.047), 18.9% (sim = 0.074); further pairs at 10.7%, 8.9%, 7.2%, 9.8%, 13.9%, 5.1%, 8.9%, 16.3%, 13.9%]

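Seed Generation
[Figure: a cluster whose related image pairs have retrieval probabilities 5%, 4%, 6%, 4%, 7%, 10%]
P(no seed) = product of (1 − p) over the pairs; running values 94.00%, 85.73%, and 68.88% with all six pairs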
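The figures on this slide are consistent with multiplying (1 − p) over the six pair probabilities shown, treating pairs as independent. A short check, with names chosen here for illustration:

```python
import math


def prob_no_seed(pair_probabilities):
    """Probability that none of the related pairs is retrieved,
    assuming the pairs are retrieved independently."""
    return math.prod(1.0 - p for p in pair_probabilities)


# Retrieval probabilities of the six image pairs from the slide.
pairs = [0.05, 0.04, 0.06, 0.04, 0.07, 0.10]
print(f"{prob_no_seed(pairs):.2%}")  # prints 68.88%
```

Each additional related pair multiplies in another factor below 1, so the failure probability can only go down as a cluster gains pairs.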

Seed Generation: Resemblance to RANSAC
• Related image pair ~ an all-inlier sample (there is no need to enumerate them all, one hit is sufficient)
• Probability of retrieving an image pair ~ fraction of inliers
• The number of related image pairs ~ how many times we can try
P(no seed), running values shown on the slide: 68.88%, 55.13%, 31.84%, 1.94%

At Least One Seed in Cluster
Estimate of the probability of failure, P(no seed), plotted against the cluster size. Assumption used in this plot: all images in the cluster are related.
similarity:               0.05   0.06   0.07
probability of retrieval: 6.2%   10.4%  16.1%
[Figure: P(no seed) against cluster size, one curve per similarity]
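Under the slide's assumption that all images in a cluster are related, a cluster of n images has n(n − 1)/2 related pairs. The sketch below additionally assumes that every pair has the same retrieval probability and that pairs are independent; it is meant to illustrate the shape of the curves, not to reproduce the authors' plot exactly.

```python
def prob_no_seed_in_cluster(cluster_size, pair_probability):
    """P(no seed) when every pair in the cluster is related and is
    retrieved independently with the same probability."""
    num_pairs = cluster_size * (cluster_size - 1) // 2
    return (1.0 - pair_probability) ** num_pairs


# Pair probabilities quoted on the slide for similarities 0.05, 0.06, 0.07.
for p in (0.062, 0.104, 0.161):
    row = [prob_no_seed_in_cluster(n, p) for n in (2, 4, 8, 16)]
    print(p, ["%.3f" % v for v in row])
```

The number of pairs grows quadratically with the cluster size, so the failure probability falls off quickly even when each individual pair is unlikely to be retrieved.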

Growing the Seed
• Application of Total Recall
• Combining average query expansion and transitive closure
• 3D geometric constraint (not only affine transformation)
• Tighter geometric constraints (10 pixel threshold)
[Figure: average query expansion (from possibly multiple coplanar structures): features are backprojected from the query to form an enhanced query. Transitive closure crawl]
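The transitive closure crawl can be sketched as a breadth-first traversal that re-queries with every newly found image. This is a simplified illustration: `retrieve_verified` is a hypothetical callable standing in for retrieval followed by spatial verification, and average query expansion is omitted.

```python
from collections import deque


def grow_seed(seed, retrieve_verified):
    """Grow a cluster from a seed by transitive closure.

    `seed` is an iterable of image ids; `retrieve_verified(i)` is a
    hypothetical function returning ids of spatially verified matches
    for image i. Queries are issued only for cluster members.
    """
    cluster = set(seed)
    queue = deque(cluster)
    while queue:
        query = queue.popleft()
        for hit in retrieve_verified(query):
            if hit not in cluster:
                cluster.add(hit)
                queue.append(hit)
    return cluster


# Toy usage: the match structure below is made up for illustration.
toy_matches = {0: [1], 1: [0, 2], 2: [3], 3: [], 4: [5], 5: [4]}
grown = grow_seed((0, 1), lambda i: toy_matches[i])
```

Since one query is issued per cluster member, a cluster of c images costs c retrieval queries, in line with the O(cD) complexity stated for seed growing.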

Summary of the Method
[Diagram: images, min-Hash seeds, spatial verification, query expansion, cluster skeleton. Failure outcomes shown: rejected seed, failed retrieval, missed cluster, unknown structure]

Experiment 1
Univ. of Kentucky Dataset [Nister & Stewenius]
• 2550 clusters of size 4 – very small clusters
• “partial” ground truth: “different” clusters share the same background
How many clusters have at least one seed? 46.9%
CONTRAST – DIFFERENT TASK: if we were looking for ALL results, not ANY (seed), the standard retrieval measure on this dataset would be only 1.63 out of 4

Experimental Validation: UKY dataset
In the University of Kentucky dataset the “average” similarity is slightly above 0.06
similarity:               0.05   0.06   0.07
probability of retrieval: 6.2%   10.4%  16.1%
[Figure: P(no seed) against cluster size, with the UKY result marked]

Experimental Results on 100k Images
• Images downloaded from Flickr
• Includes 11 Oxford Landmarks with manually labelled ground truth: All Souls, Hertford, Ashmolean, Keble, Balliol, Magdalen, Bodleian, Pitt Rivers, Christ Church, Radcliffe Camera, Cornmarket

Experimental Results on 100k Images
• Settings scalable to millions of images, also finding small clusters
• Settings scalable to billions of images, only finding larger clusters
Timing: 17 min 13 sec + 16 min 20 sec = 0.019 sec / image

Application – Object Labelling
Factorizing the clusters using multiple constraints:
• Matches between images
• Weak geometric constraints (coplanarity, disparity)
• Photographer’s psychology – tends to take pictures of single objects

Automatic 3D Reconstruction

Conclusions
• Novel method for fast clustering in large collections
• Combines a fast, low recall method (seed generation) and a thorough (total recall) method for seed growing
• Probability of finding a cluster rapidly increases with its size and is independent of the size of the database
• Can be incrementally updated as the database grows
• Efficient: 0.019 sec / image on a single PC
• Fully parallelizable
• State of the art near duplicate detection comes as a bonus (as a part of seed generation)

Thank you!
Technical Report available: http://cmp.felk.cvut.cz/~chum/papers/Chum-TR-08.pdf
Thanks to Daniel Martinec, Michal Perďoch, James Philbin, Jakub Pokluda
