COMP 9517 Computer Vision Frequency Techniques COMP 9517 S2, 2009
Frequency Domain vs. Spatial Domain
• Spatial domain
  • The image plane itself
  • Direct manipulation of pixels
  • Changes in pixel position correspond to changes in the scene
• Frequency domain
  • Fourier transform of an image
  • Directly related to the rate of change of pixel values
  • Changes in pixel position correspond to changes in the spatial frequency
Frequency Domain Overview
• Frequency in an image
  • High frequencies correspond to pixel values that change rapidly across the image
  • Low frequencies correspond to large-scale features in the image
• Frequency domain
  • Defined by the values of the Fourier transform and its frequency variables (u, v)
Frequency Domain Overview
• Frequency domain processing
Fourier Series
• Periodic functions can be represented as a weighted sum of sines and cosines
• Even functions that are not periodic (but whose area under the curve is finite) can be expressed as the integral of sines and/or cosines multiplied by a weighting function
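Fourier Transformation
• For a continuous function f(x) of a single variable, the Fourier transform F(u) is defined by:
$$F(u) = \int_{-\infty}^{\infty} f(x)\, e^{-j 2\pi u x}\, dx \qquad (1)$$
  where $j = \sqrt{-1}$.
• Given F(u), we can recover f(x) using the inverse Fourier transform:
$$f(x) = \int_{-\infty}^{\infty} F(u)\, e^{j 2\pi u x}\, du \qquad (2)$$
• (1) and (2) constitute a Fourier transform pair.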
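Two Dimensional Fourier Transformation
• In two dimensions, we have:
$$F(u, v) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\, e^{-j 2\pi (u x + v y)}\, dx\, dy \qquad (3)$$
$$f(x, y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F(u, v)\, e^{j 2\pi (u x + v y)}\, du\, dv \qquad (4)$$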
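Discrete Fourier Transformation
• In one dimension, for a sequence f(x) with x = 0, 1, ..., M-1:
$$F(u) = \frac{1}{M} \sum_{x=0}^{M-1} f(x)\, e^{-j 2\pi u x / M}, \quad u = 0, 1, \dots, M-1 \qquad (5)$$
$$f(x) = \sum_{u=0}^{M-1} F(u)\, e^{j 2\pi u x / M}, \quad x = 0, 1, \dots, M-1 \qquad (6)$$
• Note that the location of the 1/M factor does not matter, so long as the product of the two multipliers is 1/M.
• Also, in the discrete case, the Fourier transform and its inverse always exist.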
Discrete Fourier Transformation
• Consider Euler's formula:
$$e^{j\theta} = \cos\theta + j \sin\theta$$
• Substituting this expression into (5), and noting that cos(-θ) = cos(θ) and sin(-θ) = -sin(θ), we obtain:
$$F(u) = \frac{1}{M} \sum_{x=0}^{M-1} f(x) \left[ \cos\frac{2\pi u x}{M} - j \sin\frac{2\pi u x}{M} \right]$$
• Each term of F(u) depends on all values of f(x), multiplied by sines and cosines of different frequencies.
• The domain over which the values of F(u) range is called the frequency domain, as u determines the frequency of the components of the transform.
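The equivalence between the exponential and the sine/cosine forms of the 1-D DFT can be checked numerically. The following is a minimal sketch in plain Python; the function names and the four-sample test signal are illustrative assumptions, and the 1/M factor is placed on the forward transform as in the slides.

```python
import cmath
import math

def dft_exponential(f):
    """Forward DFT using the complex exponential, with the 1/M factor in front."""
    M = len(f)
    return [sum(f[x] * cmath.exp(-2j * math.pi * u * x / M) for x in range(M)) / M
            for u in range(M)]

def dft_trigonometric(f):
    """Same transform written with cosines and sines via Euler's formula."""
    M = len(f)
    result = []
    for u in range(M):
        real = sum(f[x] * math.cos(2 * math.pi * u * x / M) for x in range(M)) / M
        imag = -sum(f[x] * math.sin(2 * math.pi * u * x / M) for x in range(M)) / M
        result.append(complex(real, imag))
    return result

def idft(F):
    """Inverse DFT (no extra factor, because 1/M was applied in the forward pass)."""
    M = len(F)
    return [sum(F[u] * cmath.exp(2j * math.pi * u * x / M) for u in range(M))
            for x in range(M)]

signal = [1.0, 2.0, 4.0, 3.0]        # hypothetical test signal
F_exp = dft_exponential(signal)
F_trig = dft_trigonometric(signal)
recovered = idft(F_exp)
```

Both forms give the same coefficients, F(0) equals the mean of the samples, and the inverse transform returns the original signal up to rounding error.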
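2-D Discrete Fourier Transformation
• Digital images are 2-D discrete functions. For an image f(x, y) of size M × N:
$$F(u, v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2\pi (u x / M + v y / N)}$$
$$f(x, y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v)\, e^{j 2\pi (u x / M + v y / N)}$$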
Frequency Domain Filtering
• Frequency is directly related to rate of change, so frequencies in the Fourier transform can be related to intensity variations in the image.
• The slowest-varying frequency component, at u = v = 0, corresponds to the average gray level of the image.
• Low frequencies correspond to slowly varying components in the image, for example large areas of similar gray levels.
• Higher frequencies correspond to faster gray level changes, such as edges and noise.
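These relationships can be illustrated with NumPy. The sketch below assumes a random 8 × 8 test image and a checkerboard pattern, both hypothetical. `np.fft.fft2` does not apply the 1/MN factor, so it is applied by hand to match the convention in these slides.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8)).astype(float)  # hypothetical test image
M, N = image.shape

# Apply the 1/MN factor by hand so that F[0, 0] is the average gray level.
F = np.fft.fft2(image) / (M * N)
dc = F[0, 0]

# A checkerboard changes sign at every pixel: the fastest possible variation.
x, y = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
checker = (-1.0) ** (x + y)
F_checker = np.fft.fft2(checker) / (M * N)

# Location of the strongest frequency component of the checkerboard.
peak = tuple(int(i) for i in
             np.unravel_index(np.argmax(np.abs(F_checker)), F_checker.shape))
```

Here `dc` matches `image.mean()`, and all the energy of the checkerboard sits at the highest representable frequency, (M/2, N/2) in the uncentered transform.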
Procedure for Filtering in the Frequency Domain
1. Multiply the input image by (-1)^(x+y) to center the transform
2. Compute the DFT F(u, v) of the resulting image
3. Multiply F(u, v) by a filter G(u, v)
4. Compute the inverse DFT h*(x, y) of the product
5. Obtain the real part h(x, y) of the result of step 4
6. Multiply the result by (-1)^(x+y)
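The six steps map almost line for line onto NumPy calls. This is a minimal sketch, not a reference implementation: the function name is an assumption, and NumPy puts the 1/MN factor on the inverse transform rather than the forward one, which does not change the filtered result.

```python
import numpy as np

def filter_in_frequency_domain(image, G):
    """Filter `image` with the centered frequency-domain filter `G` (same shape)."""
    M, N = image.shape
    x, y = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    sign = (-1.0) ** (x + y)

    F = np.fft.fft2(image * sign)      # steps 1 and 2: center, then DFT
    product = F * G                    # step 3: apply the filter
    h_complex = np.fft.ifft2(product)  # step 4: inverse DFT
    h = np.real(h_complex)             # step 5: keep the real part
    return h * sign                    # step 6: undo the centering

rng = np.random.default_rng(0)
image = rng.random((4, 6))             # hypothetical test image
all_pass = np.ones(image.shape)        # a filter that changes nothing
unchanged = filter_in_frequency_domain(image, all_pass)
```

With an all-pass filter the procedure returns the input image, which is a quick sanity check before substituting a real filter.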
Example: Notch Filter
• We wish to force the average value of an image to zero. We can achieve this by setting F(0, 0) = 0 and then taking the inverse transform.
• So choose the filter function as:
$$G(u, v) = \begin{cases} 0 & \text{if } (u, v) = (M/2, N/2) \\ 1 & \text{otherwise} \end{cases}$$
  (after centering, the F(0, 0) term is located at (M/2, N/2))
• A filter that attenuates high frequencies while allowing low frequencies to pass through is called a lowpass filter.
• A filter that attenuates low frequencies while allowing high frequencies to pass through is called a highpass filter.
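A short NumPy sketch of the notch filter follows. It assumes an image with even dimensions, so that the centered zero-frequency term falls exactly at (M/2, N/2); the random test image is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
image = rng.random((6, 8)) * 255.0     # hypothetical image, even dimensions
M, N = image.shape

x, y = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
sign = (-1.0) ** (x + y)

# Centered transform: the zero-frequency term now sits at (M/2, N/2).
F = np.fft.fft2(image * sign)

# Notch filter: zero at the center, one everywhere else.
G = np.ones((M, N))
G[M // 2, N // 2] = 0.0

result = np.real(np.fft.ifft2(F * G)) * sign
```

The output has zero mean and equals the input with its average gray level subtracted; every other frequency component is untouched.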
Convolution Theorem
• Let F(u, v) and H(u, v) be the Fourier transforms of f(x, y) and h(x, y). Then
$$f(x, y) * h(x, y) \Leftrightarrow F(u, v)\, H(u, v)$$
• That is, f(x, y) * h(x, y) and F(u, v)H(u, v) constitute a Fourier transform pair: the spatial convolution (LHS) can be obtained by taking the inverse transform of the RHS, and conversely the RHS can be obtained as the forward Fourier transform of the LHS.
• Analogously, convolution in the frequency domain reduces to multiplication in the spatial domain, and vice versa.
• Using this theorem, we can also show that filters in the spatial and frequency domains constitute a Fourier transform pair.
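For the DFT, the convolution in the theorem is circular (periodic). The sketch below checks the theorem on small random arrays; the helper name and array sizes are assumptions. It uses NumPy's normalization (no factor on the forward transform), under which the plain circular sum matches the product of transforms with no extra constant.

```python
import numpy as np

def circular_convolve2d(f, h):
    """Direct 2-D circular convolution of two arrays of the same shape."""
    M, N = f.shape
    out = np.zeros((M, N))
    for x in range(M):
        for y in range(N):
            for m in range(M):
                for n in range(N):
                    out[x, y] += f[m, n] * h[(x - m) % M, (y - n) % N]
    return out

rng = np.random.default_rng(3)
f = rng.random((4, 4))                 # hypothetical image
h = rng.random((4, 4))                 # hypothetical filter, same size

direct = circular_convolve2d(f, h)                               # spatial domain
via_fft = np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(h)))  # frequency domain
```

The two results agree to rounding error. The direct sum costs on the order of (MN)^2 operations, while the FFT route costs on the order of MN log(MN).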
Exploiting the Correspondence
• If the filters in the spatial and frequency domains are the same size, filtering is computationally more efficient in the frequency domain.
• But spatial filters are usually smaller.
• Filtering is also more intuitive in the frequency domain, so design the filter there.
• Then take the inverse transform, and use the resulting filter as a guide for designing smaller filters in the spatial domain.
Example of Smoothing an Image
• In the spatial domain, we simply convolve the image with a Gaussian kernel
• In the frequency domain, we can multiply the Fourier transform of the image by a filter to achieve the same effect
Example of Smoothing an Image
1. Multiply the input image by (-1)^(x+y) to center the transform
2. Compute the DFT F(u, v) of the resulting image
3. Multiply F(u, v) by a filter G(u, v)
4. Compute the inverse DFT h*(x, y) of the product
5. Obtain the real part h(x, y) of the result of step 4
6. Multiply the result by (-1)^(x+y)
Figure: the product F(u, v)G(u, v) in the frequency domain corresponds to the convolution f(x, y) * g(x, y) in the spatial domain.
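The smoothing example can be sketched with a centered Gaussian lowpass filter. The image (a constant level plus noise), the filter width `sigma = 4.0`, and the function name are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_lowpass(shape, sigma):
    """Centered Gaussian lowpass filter G(u, v) with peak value 1 at the center."""
    M, N = shape
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    U, V = np.meshgrid(u, v, indexing="ij")
    return np.exp(-(U ** 2 + V ** 2) / (2.0 * sigma ** 2))

rng = np.random.default_rng(2)
image = 100.0 + rng.normal(0.0, 20.0, size=(32, 32))   # hypothetical noisy image
M, N = image.shape

x, y = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
sign = (-1.0) ** (x + y)

G = gaussian_lowpass(image.shape, sigma=4.0)
F = np.fft.fft2(image * sign)                  # centered transform of the image
smoothed = np.real(np.fft.ifft2(F * G)) * sign
```

Because G equals 1 at the center, the average gray level is preserved, while every other frequency is attenuated, so the smoothed image has a smaller standard deviation than the input.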
Gaussian Filter
• Gaussian filters are important because their shapes are easy to specify, and both the forward and inverse Fourier transforms of a Gaussian function are real Gaussian functions.
• Let H(u) be a one-dimensional Gaussian filter specified by:
$$H(u) = A\, e^{-u^2 / (2\sigma^2)}$$
  where σ is the standard deviation of the Gaussian curve.
• The corresponding filter in the spatial domain is
$$h(x) = \sqrt{2\pi}\, \sigma A\, e^{-2\pi^2 \sigma^2 x^2}$$
• This is usually a lowpass filter.
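The Gaussian transform pair can be checked numerically by approximating the inverse Fourier integral of H(u) with a midpoint Riemann sum. This is a plain-Python sketch; the values A = 2 and σ = 1.5, the integration limits, and the function names are assumptions.

```python
import math

def H(u, A, sigma):
    """Frequency-domain Gaussian filter."""
    return A * math.exp(-u * u / (2.0 * sigma * sigma))

def h_closed_form(x, A, sigma):
    """Spatial-domain counterpart given by the closed-form expression."""
    return (math.sqrt(2.0 * math.pi) * sigma * A
            * math.exp(-2.0 * math.pi ** 2 * sigma ** 2 * x * x))

def h_numeric(x, A, sigma, limit=10.0, steps=20000):
    """Midpoint Riemann sum of the inverse Fourier integral of H.

    H is even, so the sine part cancels and only the cosine part remains.
    """
    du = 2.0 * limit * sigma / steps
    total = 0.0
    for k in range(steps):
        u = -limit * sigma + (k + 0.5) * du
        total += H(u, A, sigma) * math.cos(2.0 * math.pi * u * x) * du
    return total

A, sigma = 2.0, 1.5                    # hypothetical parameters
xs = [0.0, 0.05, 0.1, 0.2]
errors = [abs(h_numeric(x, A, sigma) - h_closed_form(x, A, sigma)) for x in xs]
```

The numerical and closed-form values agree closely. Note the reciprocal widths: a broad H(u) (large σ) gives a narrow h(x), and vice versa.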
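DoG Filter
• A difference of Gaussians (DoG) may be used to construct highpass filters:
$$H(u) = A\, e^{-u^2 / (2\sigma_1^2)} - B\, e^{-u^2 / (2\sigma_2^2)}$$
  with A ≥ B and σ1 > σ2.
• The corresponding filter in the spatial domain is
$$h(x) = \sqrt{2\pi}\, \sigma_1 A\, e^{-2\pi^2 \sigma_1^2 x^2} - \sqrt{2\pi}\, \sigma_2 B\, e^{-2\pi^2 \sigma_2^2 x^2}$$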
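A small plain-Python sketch shows the highpass behaviour of the DoG filter near the origin and the shape of its spatial counterpart. The parameters A = B = 1, σ1 = 3, σ2 = 1 and the function names are assumptions chosen so that the zero-frequency response is exactly zero.

```python
import math

def dog_frequency(u, A, B, sigma1, sigma2):
    """Difference-of-Gaussians filter in the frequency domain."""
    return (A * math.exp(-u * u / (2.0 * sigma1 ** 2))
            - B * math.exp(-u * u / (2.0 * sigma2 ** 2)))

def dog_spatial(x, A, B, sigma1, sigma2):
    """Corresponding spatial-domain filter."""
    two_pi_sq = 2.0 * math.pi ** 2
    return math.sqrt(2.0 * math.pi) * (
        sigma1 * A * math.exp(-two_pi_sq * sigma1 ** 2 * x * x)
        - sigma2 * B * math.exp(-two_pi_sq * sigma2 ** 2 * x * x))

A, B, sigma1, sigma2 = 1.0, 1.0, 3.0, 1.0    # hypothetical parameters

# Response at zero frequency and at two higher frequencies.
responses = [dog_frequency(u, A, B, sigma1, sigma2) for u in (0.0, 1.0, 2.0)]

center = dog_spatial(0.0, A, B, sigma1, sigma2)     # narrow positive peak
surround = dog_spatial(0.3, A, B, sigma1, sigma2)   # broad negative lobe
```

The response is zero at u = 0 and grows with frequency near the origin, so the average gray level is suppressed. In the spatial domain the filter has a positive center surrounded by negative values.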
Image Pyramids
• An image pyramid is a collection of decreasing-resolution images arranged in the shape of a pyramid.
Image Pyramids
• System block diagram for creating image pyramids
1. Compute a reduced-resolution approximation of the input image (mean, Gaussian, subsampling)
2. Upsample the output of step 1
3. Compute the difference between the prediction of step 2 and the input to step 1
Image Pyramids
• Two image pyramids and their statistics
• Upsample and filter the lowest-resolution approximation image
• Add the prediction residual of the next-higher level of the Laplacian pyramid
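The construction and reconstruction steps can be sketched in NumPy. This is one possible instance under simplifying assumptions: the reduction is a 2 × 2 block mean, the upsampling is pixel replication with no interpolation filter, and the image dimensions are powers of two. All function names are hypothetical.

```python
import numpy as np

def reduce_resolution(image):
    """Step 1: approximate the image at half resolution using 2 x 2 block means."""
    M, N = image.shape
    return image.reshape(M // 2, 2, N // 2, 2).mean(axis=(1, 3))

def upsample(image):
    """Step 2: double the resolution by pixel replication."""
    return np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)

def build_pyramids(image, levels):
    """Return the approximation pyramid and the prediction residual pyramid."""
    approximations = [image]
    residuals = []
    for _ in range(levels):
        current = approximations[-1]
        smaller = reduce_resolution(current)
        prediction = upsample(smaller)
        residuals.append(current - prediction)   # step 3: prediction residual
        approximations.append(smaller)
    return approximations, residuals

def reconstruct(coarsest, residuals):
    """Rebuild the image: upsample, then add the next finer residual."""
    image = coarsest
    for residual in reversed(residuals):
        image = upsample(image) + residual
    return image

rng = np.random.default_rng(4)
image = rng.random((16, 16))                     # hypothetical input image
approximations, residuals = build_pyramids(image, levels=3)
rebuilt = reconstruct(approximations[-1], residuals)
```

Starting from the coarsest approximation and adding back the residuals level by level recovers the original image exactly, up to rounding.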
Scale-invariant Feature Transform
• Scale-invariant feature transform (SIFT): a local feature detection algorithm
• A solution to the correspondence problem
• Desirable feature characteristics
  • Scale invariance
  • Rotation invariance
  • Illumination invariance
  • Viewpoint invariance
SIFT Algorithm
1. Scale-space extrema detection
2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor
SIFT Algorithm
• Scale-space extrema detection (detect interest points / keypoints)
  • The image is convolved with Gaussian filters at different scales:
$$L(x, y, k\sigma) = G(x, y, k\sigma) * I(x, y)$$
  • The differences of successive Gaussian-blurred images are taken:
$$D(x, y, \sigma) = L(x, y, k_i\sigma) - L(x, y, k_j\sigma)$$
  • Keypoints are then taken as maxima/minima of the Difference of Gaussians (DoG) that occur at multiple scales
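A simplified sketch of this step in NumPy follows. It is not Lowe's implementation: there are no octaves or downsampling, the Gaussian blur is applied as a frequency-domain multiplication with periodic boundaries, and the base scale 1.6, the ratio k = √2, and the synthetic blob image are assumptions.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Blur by multiplying the DFT with the transform of a spatial Gaussian."""
    M, N = image.shape
    fu = np.fft.fftfreq(M)[:, None]    # frequencies in cycles per pixel
    fv = np.fft.fftfreq(N)[None, :]
    transfer = np.exp(-2.0 * np.pi ** 2 * sigma ** 2 * (fu ** 2 + fv ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

def dog_stack(image, sigma0=1.6, k=2 ** 0.5, num_scales=5):
    """Differences of successive Gaussian-blurred images, stacked along axis 0."""
    blurred = [gaussian_blur(image, sigma0 * k ** i) for i in range(num_scales)]
    return np.stack([blurred[i + 1] - blurred[i] for i in range(num_scales - 1)])

def local_extrema(dog):
    """Return (scale, x, y) of samples that beat all 26 neighbors in the stack."""
    S, M, N = dog.shape
    found = []
    for s in range(1, S - 1):
        for x in range(1, M - 1):
            for y in range(1, N - 1):
                cube = dog[s - 1:s + 2, x - 1:x + 2, y - 1:y + 2]
                others = np.delete(cube.ravel(), 13)   # drop the center sample
                value = dog[s, x, y]
                if value > others.max() or value < others.min():
                    found.append((s, x, y))
    return found

# Hypothetical test image: one Gaussian blob of standard deviation 3 at (16, 16).
x, y = np.meshgrid(np.arange(32), np.arange(32), indexing="ij")
blob = np.exp(-((x - 16) ** 2 + (y - 16) ** 2) / (2.0 * 3.0 ** 2))

dog = dog_stack(blob)
extrema = local_extrema(dog)
```

The blob center shows up as a minimum of the DoG stack in both space and scale, at the DoG level whose blur is closest to the size of the blob.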
SIFT Algorithm
• Lowe, David G., 'Distinctive Image Features from Scale-Invariant Keypoints', International Journal of Computer Vision, Vol. 60, No. 2, 2004, pp. 91-110
SIFT Algorithm
• Keypoint localization
  • Scale-space extrema detection produces too many keypoint candidates, some of which are unstable.
  • The next step in the algorithm is to perform a detailed fit to the nearby data for accurate location, scale, and ratio of principal curvatures.
  • For each candidate keypoint:
    • Interpolation of nearby data is used to accurately determine its position
    • Keypoints with low contrast are removed
    • Responses along edges are eliminated
SIFT Algorithm
• After the scale-space extrema are detected, the SIFT algorithm
  • discards low-contrast keypoints
  • filters out keypoints located on edges
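The two rejection tests can be sketched in plain Python. The thresholds are not given in the slides: the contrast threshold 0.03 (for pixel values in [0, 1]) and the curvature ratio r = 10 are the values suggested in Lowe (2004) and are used here as assumptions. The function names and the sample Hessian entries are hypothetical.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Check the ratio of principal curvatures using the 2 x 2 Hessian of the DoG.

    A keypoint is kept when trace^2 / det < (r + 1)^2 / r.
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0.0:                     # curvatures of opposite sign: reject
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r

def keep_keypoint(value, dxx, dyy, dxy, contrast_threshold=0.03, r=10.0):
    """Keep a candidate only if it has enough contrast and is not edge-like."""
    if abs(value) < contrast_threshold:
        return False
    return passes_edge_test(dxx, dyy, dxy, r)

blob_like = keep_keypoint(value=-0.17, dxx=-1.0, dyy=-1.0, dxy=0.0)
edge_like = keep_keypoint(value=-0.17, dxx=-10.0, dyy=-0.1, dxy=0.0)
low_contrast = keep_keypoint(value=0.01, dxx=-1.0, dyy=-1.0, dxy=0.0)
```

A blob has similar curvature in both directions and is kept. An edge has one large and one small curvature and is rejected, as is any candidate whose DoG value is close to zero.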
SIFT Algorithm
• Orientation assignment
  • Compute the gradient for each blurred image
  • For the region around each keypoint:
    • Create an orientation histogram with 36 bins
    • Weight each point with a Gaussian window of 1.5σ
    • Create a keypoint for every peak with value ≥ 0.8 × the maximum bin
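A sketch of the orientation histogram in NumPy is shown below. It assumes central-difference gradients, a square patch centered on the keypoint, and treats a "peak" as a bin that is at least as large as both of its circular neighbors; these choices and the function name are assumptions, not details from the slides.

```python
import numpy as np

def dominant_orientations(patch, scale, num_bins=36, peak_ratio=0.8):
    """Return the orientation histogram and the peak orientations in degrees."""
    patch = np.asarray(patch, dtype=float)
    gx = np.zeros_like(patch)
    gy = np.zeros_like(patch)
    gx[1:-1, :] = (patch[2:, :] - patch[:-2, :]) / 2.0   # gradient along x (rows)
    gy[:, 1:-1] = (patch[:, 2:] - patch[:, :-2]) / 2.0   # gradient along y (columns)

    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0

    # Gaussian window of 1.5 times the keypoint scale, centered on the patch.
    M, N = patch.shape
    x, y = np.meshgrid(np.arange(M) - (M - 1) / 2.0,
                       np.arange(N) - (N - 1) / 2.0, indexing="ij")
    window = np.exp(-(x ** 2 + y ** 2) / (2.0 * (1.5 * scale) ** 2))

    bin_width = 360.0 / num_bins
    bins = (angle // bin_width).astype(int) % num_bins
    hist = np.bincount(bins.ravel(), weights=(magnitude * window).ravel(),
                       minlength=num_bins)

    peaks = [b * bin_width for b in range(num_bins)
             if hist[b] >= peak_ratio * hist.max()
             and hist[b] >= hist[(b - 1) % num_bins]
             and hist[b] >= hist[(b + 1) % num_bins]]
    return hist, peaks

# Hypothetical patches: intensity ramps along x and along y.
rows, cols = np.meshgrid(np.arange(9), np.arange(9), indexing="ij")
hist_x, peaks_x = dominant_orientations(rows, scale=2.0)
hist_y, peaks_y = dominant_orientations(cols, scale=2.0)
```

A ramp along x produces a single peak at 0 degrees and a ramp along y a single peak at 90 degrees. A patch with two strong gradient directions would yield two peaks, and so two keypoints at the same location.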
SIFT Algorithm
• Keypoint descriptor
  • Find the blurred image of the closest scale
  • Sample the points around the keypoint
  • Rotate the gradients and coordinates by the previously computed orientation
  • Separate the region into subregions
  • Create a histogram with 8 bins for each subregion
  • Weight the samples with a Gaussian N(σ), σ = 1.5 × region width
  • Use trilinear interpolation (weight factor 1 - d) to place the samples in histogram bins
SIFT Algorithm
• Keypoint descriptor
  • The feature descriptor is computed as a set of orientation histograms on 4 × 4 pixel neighborhoods.
  • The orientation histograms are relative to the keypoint orientation.
  • The contribution of each pixel is weighted by the gradient magnitude, and by a Gaussian with σ equal to 1.5 times the scale of the keypoint.
SIFT Algorithm
• Keypoint descriptor
  • Histograms contain 8 bins each, and each descriptor contains a 4 × 4 array of histograms around the keypoint.
  • This leads to a SIFT feature vector with 4 × 4 × 8 = 128 elements. This vector is normalized to enhance invariance to changes in illumination.
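The effect of the final normalization can be sketched as follows. The histogram values are random placeholders and the function name is an assumption; only the unit-length normalization mentioned above is shown.

```python
import numpy as np

def assemble_descriptor(histograms):
    """Flatten a 4 x 4 grid of 8-bin histograms into a unit-length 128-vector."""
    histograms = np.asarray(histograms, dtype=float)
    if histograms.shape != (4, 4, 8):
        raise ValueError("expected a 4 x 4 array of 8-bin histograms")
    vector = histograms.ravel()
    norm = np.linalg.norm(vector)
    return vector / norm if norm > 0.0 else vector

rng = np.random.default_rng(5)
histograms = rng.random((4, 4, 8))     # placeholder orientation histograms

descriptor = assemble_descriptor(histograms)
# A contrast change multiplies every gradient magnitude by the same constant.
descriptor_brighter = assemble_descriptor(3.0 * histograms)
```

Scaling all gradient magnitudes by a constant, as a change in image contrast does, leaves the normalized descriptor unchanged.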
References
• Rafael C. Gonzalez and Richard E. Woods (2002). Digital Image Processing, 2nd edition, Addison Wesley.
• Lowe, David G. (1999). "Object recognition from local scale-invariant features". Proceedings of the International Conference on Computer Vision, Vol. 2, pp. 1150-1157.
• Lowe, David G. (2004). "Distinctive Image Features from Scale-Invariant Keypoints". International Journal of Computer Vision, Vol. 60, No. 2, pp. 91-110.
Acknowledgement
• Some material, including images and tables, was drawn from the textbook Digital Image Processing by Gonzalez and Woods, and from P.C. Rossin's presentation.