Seminar on Image Similarity and Image Retrieval

Presentation by Feliks Beilis

Background
Object categorization and object class detection: how to find images in a database that match a specific query, for example any red car or any brown horse. Methods used: EMD on histograms, EMD on signatures.

Texture classification. Methods used: Gabor filters, patch matching, EMD. The left texture is the source. Are these 3 textures the same as the source?

Image recognition: finding a specific object, for example a person's face. Methods used: SIFT, color interest points. [Figure: the points that describe this picture (SIFT).]

Image editing with patch-matching algorithms: how to change an image using its existing data, and how to reconstruct an image. Methods used: NNF (nearest-neighbor field), editing tools with constraints.

Object categorization and object class detection
In this section we talk mostly about EMD (Earth Mover's Distance), although there are other methods for comparing histograms. We focus on EMD because it matches perceptual similarity for image retrieval better than the other methods. In IP we already discussed why EMD is better than the other methods, so I won't explain that part of the proof thoroughly.

Just a reminder of the other measures: Minkowski distance, Kullback-Leibler divergence, χ² statistics, quadratic-form distance.
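The four measures named in this reminder can be sketched in a few lines. This is only an illustration under assumptions, not the formulas from the original slide: histograms are plain lists that sum to 1, the χ² statistic is taken against the mean histogram, and the function names and the similarity matrix A are made up for the example.

```python
import math

def minkowski(h, k, p=2):
    """Minkowski-form distance L_p between two histograms."""
    return sum(abs(a - b) ** p for a, b in zip(h, k)) ** (1.0 / p)

def kullback_leibler(h, k, eps=1e-12):
    """Kullback-Leibler divergence of k from h (not symmetric)."""
    return sum(a * math.log((a + eps) / (b + eps)) for a, b in zip(h, k) if a > 0)

def chi_square(h, k):
    """Chi-square statistic against the mean histogram m = (h + k) / 2."""
    total = 0.0
    for a, b in zip(h, k):
        m = (a + b) / 2.0
        if m > 0:
            total += (a - m) ** 2 / m
    return total

def quadratic_form(h, k, A):
    """Quadratic-form distance sqrt((h-k)^T A (h-k)); A is a bin-similarity matrix."""
    d = [a - b for a, b in zip(h, k)]
    n = len(d)
    value = sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(max(value, 0.0))
```

The first three compare only bins with the same index, so a small color shift that moves mass into a neighboring bin is penalized as much as a large one. The quadratic form (through A) and EMD take the distance between bins into account.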

Histogram vs Signature
Signatures are derived from histograms and are represented by two quantities per cluster: the cluster mean M, a d-dimensional vector of bins, and the number of pixels that belong to that cluster. J is the index of the corresponding bin in the histogram. The EMD method described earlier for histograms can now be used on signatures. A more intuitive explanation follows.
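To make EMD on signatures concrete, here is a minimal sketch for the simplest case only. It assumes 1-D cluster means (for example a single hue value) and signatures with equal total weight; under those assumptions EMD reduces to the area between the two cumulative distributions. The general d-dimensional case needs a transportation-problem solver, which is not shown. The name emd_1d is hypothetical.

```python
def emd_1d(sig_a, sig_b):
    """EMD between two 1-D signatures given as lists of (position, weight).

    Assumes both signatures carry the same total weight; weights are
    normalized to 1, so the result is the mean distance the mass travels.
    """
    total_a = sum(w for _, w in sig_a)
    total_b = sum(w for _, w in sig_b)
    # Merge all cluster positions: +w for A and -w for B track the CDF difference.
    events = [(x, w / total_a) for x, w in sig_a]
    events += [(x, -w / total_b) for x, w in sig_b]
    events.sort(key=lambda e: e[0])

    work = 0.0
    surplus = 0.0            # CDF_A - CDF_B just left of the current position
    prev_x = events[0][0]
    for x, delta in events:
        work += abs(surplus) * (x - prev_x)   # carry the surplus across the gap
        surplus += delta
        prev_x = x
    return work
```

Note that the two signatures may have different numbers of clusters at arbitrary positions; no common binning is required, which is the practical difference from a histogram.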

[Figure: histogram of image A, histogram of image B, signature of image A, signature of image B.]

The Experiment
Our database contains 20,000 images. In the first experiment we identified 75 images of red cars, and from this set we chose 10 "good" images in which the background was green or grey. We performed ten queries, using a different "good" car image each time. EMD outperformed the other methods, and the results with signatures were much better than with histograms. For this experiment we used histograms with coarse binning and with fine binning. Averaged over the 20,000 images, coarse binning left 15.3 non-zero bins and fine binning left 39 non-zero bins.

[Figure] Middle: coarse binning. Bottom: fine binning.

The Experiment
In the second experiment the colors of the objects and the background are fairly similar. We took 157 images of brown horses in green fields; again 10 "good" images were chosen, and again we used coarse and fine histograms. For coarse binning, EMD on signatures outperformed the others, but Jeffrey divergence and χ² statistics outperformed EMD on histograms. (This can be explained by the fact that the distance is computed between more distant bin centers and is therefore less meaningful.) For fine binning, EMD outperformed the other measures, and signatures outperformed all the rest.

[Figure] Middle: coarse binning. Bottom: fine binning.

Conclusion
EMD has desirable properties for image retrieval; compared with the other methods it came out ahead in every respect examined here. As we saw, signatures give better results in image retrieval than histograms.

Texture classification
We focus on texture classification, mostly using Gabor texture features. While color is a purely point-wise property, texture involves a notion of spatial extent: a single point has no texture. If texture is defined in the frequency domain, the information about a texture is carried by a point and its neighbors.

Short Background on Gabor Filtering
A Gabor filter is similar to a Fourier filter but is limited to certain frequency bands; Gabor filters do an excellent job of image compaction. Gabor filters are defined by harmonic functions modulated by Gaussian distributions.
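A harmonic function modulated by a Gaussian can be written down directly. The sketch below builds the real part of a 2-D Gabor kernel; the parameter names (wavelength, theta, sigma, gamma, phase) follow a common convention and are assumptions, as are the concrete values used for the small filter bank, which are purely illustrative.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, phase=0.0):
    """Real part of a Gabor filter: a cosine wave under a Gaussian envelope.

    size should be odd so that the kernel has a central pixel.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the wave travels along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + phase)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# A bank of 4 scales x 6 orientations, i.e. 24 filters (values are illustrative).
bank = [gabor_kernel(15, wavelength, o * math.pi / 6, sigma=wavelength / 2)
        for wavelength in (4.0, 8.0, 16.0, 32.0) for o in range(6)]
```

Convolving an image with each kernel of such a bank and measuring the response magnitude gives one energy value per scale and orientation.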

[Figure: Fourier transform of the Gabor filter.]

Texture representation
After applying Gabor filters to an image at different orientations and scales, we obtain an array of magnitudes. These magnitudes describe the energy content of the image at each scale and orientation. The main purpose of texture-based retrieval is to find images or regions with similar textures. Since this similarity measure is not rotation invariant, similar textures with a different direction may be missed by the retrieval or get a low rank. An example is on the next page.

To solve this problem we suggested a simple circular shift. The orientation with the highest total energy is called the dominant orientation; we then circularly shift the features of the other images so that their dominant orientations line up.
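The circular shift can be sketched as follows, assuming the Gabor energies are stored as a small table with one row per scale and one column per orientation. Shifting the dominant orientation to column 0 is one possible alignment convention, chosen here for illustration; the function name is hypothetical.

```python
def align_to_dominant(energies):
    """Circularly shift the orientation columns so the dominant one comes first.

    energies[s][o] is the Gabor energy at scale s and orientation o.
    """
    n_orient = len(energies[0])
    # Total energy per orientation, summed over all scales.
    totals = [sum(row[o] for row in energies) for o in range(n_orient)]
    dominant = max(range(n_orient), key=lambda o: totals[o])
    return [row[dominant:] + row[:dominant] for row in energies]
```

After alignment, a texture and a rotated copy of it (rotated by a multiple of the orientation step) produce the same table, so any of the earlier distance measures can compare them directly.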

Results
Our database included 1000 images with different kinds of texture; it contained both natural images and texture images. In the first retrieval experiment all 15 similar textures were retrieved within the first 18 images, and only one image was irrelevant.

These results were obtained on a color image database of 360 different images; the same images were retrieved within the first 25 images.

How to apply EMD to textures
As we saw, a texture can be represented with Gabor filters as 24 bins (4 scales times 6 orientations). Once the texture is represented this way, we can treat it as a histogram or a signature and apply EMD or the other similarity measures.

Results
Database: 1744 texture patches were constructed. Using EMD we can find partial matches between textures. The query was 20% texture and 80% "don't care"; the first 16 patches were the same texture, followed by patches that partially contain the original texture.

[Figure: the original texture.]

We created a database of 250 images containing 25 zebras, then cropped a block of zebra-stripe pattern and asked for images with at least 20% of that pattern. The best 8 matches are shown above.

Similarly, we cropped a block of cheetah pattern and asked for images with at least 10% of that pattern. The best 12 matches are shown above.

Conclusion
In this chapter we talked about Gabor texture retrieval and focused mostly on rotation, but the approach can be extended to other methods. We also saw that the EMD measure can use Gabor features for texture retrieval. Textures are usually homogeneous and correspond to different parts of an image, so texture-based image retrieval is very useful.

Reminder
Patch matching can be achieved with SIFT algorithms or with histogram distances (as described earlier).

Image recognition
What is SIFT? SIFT (scale-invariant feature transform) is an algorithm used to detect and describe local features in images. We will mostly talk about the Harris corner detector.
Color Harris corner detector
Corners have long been considered useful interest points, and therefore they are used in many different algorithms. Color is also of great importance when matching images. In the RGB color cube, most interest points are found using intensity alone (luminance, the light returned from bright objects), which works well for studio photography or artificial images. In natural images, however, strong contrast changes may occur, so the relevant changes are not as noticeable with an intensity-based approach.

Harris corner detector reminder
Intuitively, Harris corner detection is based on the image derivatives along the X and Y axes, obtained by convolving with (-1,0,1) and (-1,0,1)^T. More formally, for the second moment matrix M with eigenvalues λ1 and λ2: det(M) = λ1·λ2, trace(M) = λ1 + λ2, and R = det(M) - k·trace(M)^2. If R is bigger than a threshold, we have found a corner.
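As a reminder in code form, here is a minimal sketch of the standard Harris cornerness measure R = det(M) - k·trace(M)^2 at a single pixel. The value k = 0.04, the unweighted 3x3 summation window (instead of a Gaussian window), and the function name are assumptions made for the example.

```python
def harris_response(image, x, y, k=0.04, radius=1):
    """Harris cornerness R = det(M) - k * trace(M)^2 at pixel (x, y).

    image is a list of rows of grey values; (x, y) must lie at least
    radius + 1 pixels away from the border.
    """
    sxx = syy = sxy = 0.0
    for v in range(y - radius, y + radius + 1):
        for u in range(x - radius, x + radius + 1):
            # First derivatives from the (-1, 0, 1) kernel and its transpose.
            ix = image[v][u + 1] - image[v][u - 1]
            iy = image[v + 1][u] - image[v - 1][u]
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy      # product of the eigenvalues of M
    trace = sxx + syy                # sum of the eigenvalues of M
    return det - k * trace * trace
```

R is positive at a corner (both eigenvalues large), negative on a straight edge (one eigenvalue large), and zero in a flat region, which is why a simple threshold on R is enough to pick corners.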

RGB vs. normalized RGB
With the RGB method, the corners are spread all over the image and are not concentrated in a specific area. With normalized RGB, the corners are found around the silhouette of the parrot, but the detection is unstable in dark areas, as can be seen at the bottom of the image.

Harris detector with other color spaces
Can we improve corner detection? Quasi-invariant colors are derived from RGB color with special equations (HSI, OCS, spherical color space).

Harris Detector with scale invariance
As we will see, scale has a huge impact on corner detection, so we use a "fixed scale" to improve the results. There are some drawbacks with images that are too large or too small. We use a function to set the "fixed scale", where E is the cornerness measurement for each pixel (part of the Harris algorithm), M is the second moment matrix, and t is the amount of scale change; the function also involves a convolution.

The optimal rescaling factor for images was determined by experiments with the Harris detector: 1.2 < t < √2. Now we can see the change in the Harris detector after "fixing the scale": the parrot is highly prioritized. More improvement is coming next.

Colored Scale invariant Harris Corner Detection
Now let us add color information to the scale decision. We build a function that maps the 3-dimensional color data to a 1-dimensional data set and combine it with the scale-invariant function we already know. Combining this information gives a different definition of interest points. When working in a quasi-invariant color space, the interest points are free of shading, illumination, and specular changes, so the lighting conditions do not affect the result. Natural, cluttered images of animals have varying lighting conditions, and this method overcomes that, as we will see now.

The background is structured, with strong illumination changes, and quasi-invariant HSI still found exactly the scorpion.

Image retrieval
For the retrieval experiment we captured 1000 images; for every image, 18 images were taken at different rotations, giving a database of 18,000 images. As can be seen, quasi-invariant color outperformed the other approaches.

Conclusion
The methods explained above performed much better than luminance-based methods. Color-based scale selection leads to better stability; it can also be transformed into various color spaces, and we can take advantage of these varying color properties. In retrieval scenarios our approach was much more stable, which leads to higher retrieval rates.

Image editing and reconstruction
What is image editing? As digital and computational photography have matured, researchers have developed methods for high-level editing. We can now resize an image while keeping a good likeness of the original; we can also erase an unwanted portion of an image and let automatic image completion fill in the data. Image reshuffling algorithms allow us to take a portion of an image and move it around so that the remainder still resembles the original image. These algorithms depend on user intervention to obtain the best results, because the user knows what he expects from the modified image.

NNF – nearest neighbor field
An algorithm that finds, for each patch in image A, the most similar patch in image B. To be efficient, our algorithm relies on 3 key ideas:
• Dimensionality of offset space: it searches the 2D space of possible patch offsets, achieving greater speed and efficiency than a standard kd-tree search.
• Natural structure of images: searching independently for each pixel ignores the natural structure in images; our algorithm exploits that structure, which improves efficiency.
• The law of large numbers: a random choice for a single patch is probably a bad guess, but the bigger the field of random guesses, the better the chance that some patches get a correct offset.

Example of patch matching: a good estimate for the match is enough; it doesn't need to be perfect.

Phases of the Algorithm
Propagation searches for good matches among the neighboring patches. In the figure, the blue patch (b) propagates from the red patch above it and the green patch to its left, and then (c) searches in neighborhoods within a certain radius.

The Algorithm
The outcome of the PatchMatch algorithm is an offset map: a 2D field of 2D vectors with the same dimensions as the source image. Each vector stores the location of the best match currently known.
1. Initialization: random, except in areas where we have initial information, called constraints, which we talk about later.
2. Propagation step.
3. Random search.
Steps 2 and 3 are executed consecutively for each pixel.

Propagation step: the natural correlation is exploited. Assume that the red patch is the best match. When we move to the left of our target, the black patch, we can try this offset as our guess; there is a good chance that this offset will also be a match. The same is done below the current patch. We take the best match of these 3 patches (using the patch-match measure); in this manner we propagate toward the bottom left. When propagation stops, we go to the search step (next).

Search step: we take a random unit vector and scale it by a decreasing radius; when the radius falls below a certain value we stop. The scaled random vector is added to the current best offset and the patches are compared. If the new match is better, it replaces the current best from that moment on; if not, the search step is repeated with a different random vector. The whole algorithm is then applied multiple times.
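The three phases (random initialization, propagation, random search) can be put together in a compact sketch. This is a simplified illustration under assumptions, not the original implementation: images are small greyscale lists of rows, the patch measure is a sum of squared differences, the scan direction alternates between iterations, and the random search samples uniformly inside a square whose radius is halved each time. All function names are hypothetical.

```python
import random

def patch_distance(a, b, ax, ay, bx, by, p):
    """Sum of squared differences between the p x p patches at (ax, ay) and (bx, by)."""
    return sum((a[ay + j][ax + i] - b[by + j][bx + i]) ** 2
               for j in range(p) for i in range(p))

def patchmatch(a, b, p=3, iterations=5, seed=0):
    """Approximate nearest-neighbour field from image a to image b.

    Returns nnf[y][x] = (dx, dy, cost): the patch of a at (x, y) is matched
    to the patch of b at (x + dx, y + dy).
    """
    rng = random.Random(seed)
    ah, aw = len(a) - p + 1, len(a[0]) - p + 1   # valid patch positions in a
    bh, bw = len(b) - p + 1, len(b[0]) - p + 1   # valid patch positions in b

    def candidate(x, y, bx, by):
        return (bx - x, by - y, patch_distance(a, b, x, y, bx, by, p))

    # 1. Initialization: a uniformly random offset for every patch.
    nnf = [[candidate(x, y, rng.randrange(bw), rng.randrange(bh))
            for x in range(aw)] for y in range(ah)]

    def try_improve(x, y, bx, by):
        # Keep the candidate only if it lies inside b and lowers the cost.
        if 0 <= bx < bw and 0 <= by < bh:
            cand = candidate(x, y, bx, by)
            if cand[2] < nnf[y][x][2]:
                nnf[y][x] = cand

    for it in range(iterations):
        # Alternate the scan order so good offsets spread in both directions.
        step = 1 if it % 2 == 0 else -1
        ys = range(ah) if step == 1 else range(ah - 1, -1, -1)
        xs = range(aw) if step == 1 else range(aw - 1, -1, -1)
        for y in ys:
            for x in xs:
                # 2. Propagation: reuse the offsets of the two already-visited neighbours.
                for nx, ny in ((x - step, y), (x, y - step)):
                    if 0 <= nx < aw and 0 <= ny < ah:
                        dx, dy, _ = nnf[ny][nx]
                        try_improve(x, y, x + dx, y + dy)
                # 3. Random search: sample around the best match with a shrinking radius.
                radius = max(bw, bh)
                while radius >= 1:
                    dx, dy, _ = nnf[y][x]
                    try_improve(x, y,
                                x + dx + rng.randint(-radius, radius),
                                y + dy + rng.randint(-radius, radius))
                    radius //= 2
    return nnf
```

Because a candidate is accepted only when it lowers the stored cost, the total matching error can never increase from one iteration to the next; the result is an approximation, not a guaranteed exact nearest neighbor for every patch.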

The top image is reconstructed using patches from the bottom image; after 5 iterations the image is complete.

Real world implementation
Efficiency: our algorithm is much faster and uses much less memory than a kd-tree. For a 7x7 patch size we found our algorithm to be 20x to 100x faster, using about 20x less memory. For smaller patches we obtain smaller speedups. We also made a GPU implementation (8800 GTS video card) of the NNF that is 7x faster than the CPU implementation.

Editing tools
Now we talk about the novel interactive editing tools enabled by our algorithm. By modifying the search in various ways we can introduce local constraints on the offsets to give the user control over the synthesis process. We will mostly focus on:
• Search Space Constraints
• Deformation Constraints
• Hard Constraints (Reshuffling)
