CS 414 Multimedia Systems Design Lecture 4 Digital Image Representation

1. CS 414 - Spring 2008 CS 414 � Multimedia Systems Design Lecture 4 � Digital Image Representation Klara Nahrstedt Spring 2008

2. CS 414 - Spring 2008 Administrative Group Directories will be established by Friday, 1/25 MP1 will be out on 1/25

3. Images � Capturing and Processing CS 414 - Spring 2008

4. Capturing Real-World Images Picture � two dimensional image captured from a real-world scene that represents a momentary event from the 3D spatial world CS 414 - Spring 2008

5. Image Concepts An image is a function of intensity values over a 2D plane I(r,s) Sample function at discrete intervals to represent an image in digital form matrix of intensity values for each color plane intensity typically represented with 8 bits Sample points are called pixels

6. Digital Images Samples = pixels Quantization = number of bits per pixel Example: if we would sample and quantize standard TV picture (525 lines) by using VGA (Video Graphics Array), video controller creates matrix 640x480pixels, and each pixel is represented by 8 bit integer (256 discrete gray levels) CS 414 - Spring 2008

7. Image Representations Black and white image single color plane with 2 bits Grey scale image single color plane with 8 bits Color image three color planes each with 8 bits RGB, CMY, YIQ, etc. Indexed color image single plane that indexes a color table Compressed images TIFF, JPEG, BMP, etc.

8. Digital Image Representation (3 Bit Quantization) CS 414 - Spring 2008

9. Color QuantizationExample of 24 bit RGB Image CS 414 - Spring 2008

10. Image Representation Example

11. Graphical Representation CS 414 - Spring 2008

12. Image Properties (Color) CS 414 - Spring 2008

13. Color Histogram CS 414 - Spring 2008

14. Image Properties (Texture) Texture � small surface structure, either natural or artificial, regular or irregular Texture Examples: wood barks, knitting patterns Statistical texture analysis describes texture as a whole based on specific attributes: regularity, coarseness, orientation, contrast, � CS 414 - Spring 2008

15. Texture Examples CS 414 - Spring 2008

16. Spatial and Frequency Domains Spatial domain refers to planar region of intensity values Frequency domain think of each color plane as a sinusoidal function of changing intensity values apply DFT (Discrete Fourier Transform) to subsets of pixels for compression

17. Convolution Filters Filter an image by replacing each pixel in the source with a weighted sum of its neighbors Define the filter using a convolution mask, also referred to as a kernel non-zero values in small neighborhood, typically centered around a central pixel generally have odd number of rows/columns

18. Convolution Filter CS 414 - Spring 2008 Here is a mathematician's domain. Most of filters are using convolution matrix. With the Convolution Matrix filter, if the fancy takes you, you can build a custom filter. What is a convolution matrix? It's possible to get a rough idea of it without using mathematical tools that only a few ones know. Convolution is the treatment of a matrix by another one which is called "kernel". The Convolution Matrix filter uses a first matrix which is the Image to be treated. The image is a bi-dimensionnal collection of pixels in rectangular coordinates. The used kernel depends on the effect you want. GIMP uses 5x5 or 3x3 matrices. We will consider only 3x3 matrices, they are the most used and they are enough for all effects you want. If all border values of a kernel are set to zero, then system will consider it as a 3x3 matrix. The filter studies successively every pixel of the image. For each of them, which we will call the "initial pixel", it multiplies value of this pixel and values of the 8 surrounding pixels by the kernel corresponding value. Then it adds the results, and the initial pixel is set to takes this final result value. A simple example: On the left is the image matrix: each pixel is marked with its value. The initial pixel has a red border. The kernel action area has a green border. In the middle is the kernel and, on the right is the convolution result. Here is what happened: the filter read successively, from left to right and from top to bottom, all the pixels of the kernel action area. It multiplied the value of each of them by the kernel corresponding value and added results: (100*0)+(50*1)+(50*0)*(100*0)+(100*0) +(100*0)+(100*0)+(100*0)+(100*0)+(100*0) = 50. The initial pixel took the value 50. Previously, when the initial pixel had value=50, it took the value 100 of the above pixel (the filter doesn't work on the image but on a copy) and so disappeared into the "100" background pixels. As a graphical result, the initial pixel moved a pixel downwards. Here is a mathematician's domain. Most of filters are using convolution matrix. With the Convolution Matrix filter, if the fancy takes you, you can build a custom filter. What is a convolution matrix? It's possible to get a rough idea of it without using mathematical tools that only a few ones know. Convolution is the treatment of a matrix by another one which is called "kernel". The Convolution Matrix filter uses a first matrix which is the Image to be treated. The image is a bi-dimensionnal collection of pixels in rectangular coordinates. The used kernel depends on the effect you want. GIMP uses 5x5 or 3x3 matrices. We will consider only 3x3 matrices, they are the most used and they are enough for all effects you want. If all border values of a kernel are set to zero, then system will consider it as a 3x3 matrix. The filter studies successively every pixel of the image. For each of them, which we will call the "initial pixel", it multiplies value of this pixel and values of the 8 surrounding pixels by the kernel corresponding value. Then it adds the results, and the initial pixel is set to takes this final result value. A simple example: On the left is the image matrix: each pixel is marked with its value. The initial pixel has a red border. The kernel action area has a green border. In the middle is the kernel and, on the right is the convolution result. Here is what happened: the filter read successively, from left to right and from top to bottom, all the pixels of the kernel action area. It multiplied the value of each of them by the kernel corresponding value and added results: (100*0)+(50*1)+(50*0)*(100*0)+(100*0) +(100*0)+(100*0)+(100*0)+(100*0)+(100*0) = 50. The initial pixel took the value 50. Previously, when the initial pixel had value=50, it took the value 100 of the above pixel (the filter doesn't work on the image but on a copy) and so disappeared into the "100" background pixels. As a graphical result, the initial pixel moved a pixel downwards.

19. Mean Filter

20. Mean Filter

21. Common 3x3 Filters Low/High pass filter Blur operator H/V Edge detector

22. Example

23. Edge Detection Identify areas of strong intensity contrast filter useless data; preserve important properties Fundamental technique e.g., use gestures as input identify shapes, match to templates, invoke commands

24. Edge Detection

25. Characteristics of Edges (1D) Identify high slope in first derivative Pixel is on an edge if value of the gradient exceeds a threshold Edges characterize boundaries and are therefore a problem of fundamental importance in image processing. Edges in images are areas with strong intensity contrasts � a jump in intensity from one pixel to the next. Edge detecting an image significantly reduces the amount of data and filters out useless information, while preserving the important structural properties in an image. There are many ways to perform edge detection. However, the majority of different methods may be grouped into two categories, gradient and Laplacian. The gradient method detects the edges by looking for the maximum and minimum in the first derivative of the image. The Laplacian method searches for zero crossings in the second derivative of the image to find edges. An edge has the one-dimensional shape of a ramp and calculating the derivative of the image can highlight its location. Suppose we have the following signal, with an edge shown by the jump in intensity below: (figure above) If we take the gradient of this signal (which, in one dimension, is just the first derivative with respect to t) we get the following: (lower figure) Clearly, the derivative shows a maximum located at the center of the edge in the original signal. This method of locating an edge is characteristic of the �gradient filter� family of edge detection filters and includes the Sobel method. A pixel location is declared an edge location if the value of the gradient exceeds some threshold. As mentioned before, edges will have higher pixel intensity values than those surrounding it. So once a threshold is set, you can compare the gradient value to the threshold value and detect an edge whenever the threshold is exceeded. Furthermore, when the first derivative is at a maximum, the second derivative is zero.Edges characterize boundaries and are therefore a problem of fundamental importance in image processing. Edges in images are areas with strong intensity contrasts � a jump in intensity from one pixel to the next. Edge detecting an image significantly reduces the amount of data and filters out useless information, while preserving the important structural properties in an image. There are many ways to perform edge detection. However, the majority of different methods may be grouped into two categories, gradient and Laplacian. The gradient method detects the edges by looking for the maximum and minimum in the first derivative of the image. The Laplacian method searches for zero crossings in the second derivative of the image to find edges. An edge has the one-dimensional shape of a ramp and calculating the derivative of the image can highlight its location. Suppose we have the following signal, with an edge shown by the jump in intensity below: (figure above) If we take the gradient of this signal (which, in one dimension, is just the first derivative with respect to t) we get the following: (lower figure) Clearly, the derivative shows a maximum located at the center of the edge in the original signal. This method of locating an edge is characteristic of the �gradient filter� family of edge detection filters and includes the Sobel method. A pixel location is declared an edge location if the value of the gradient exceeds some threshold. As mentioned before, edges will have higher pixel intensity values than those surrounding it. So once a threshold is set, you can compare the gradient value to the threshold value and detect an edge whenever the threshold is exceeded. Furthermore, when the first derivative is at a maximum, the second derivative is zero.

26. Simple Edge Detection Example: Let assume single line of pixels Calculate 1st derivative (gradient) of the intensity of the original data Using gradient, we can find peak pixels in image If I(x) represents intensity of pixel x and I�(x) represents gradient (in 1D), then the gradient can be calculated by convolving the original data with a mask (-1/2 0 +1/2) I�(x) = -1/2 *I(x-1) + 0*I(x) + �*I(x+1) CS 414 - Spring 2008

27. Sobel Operator CS 414 - Spring 2008 SOBEL EXPLANATION The mask is slid over an area of the input image, changes that pixel's value and then shifts one pixel to the right and continues to the right until it reaches the end of a row. It then starts at the beginning of the next row. The example below shows the mask being slid over the top left portion of the input image represented by the green outline. The formula shows how a particular pixel in the output image would be calculated. The center of the mask is placed over the pixel you are manipulating in the image. And the I & J values are used to move the file pointer so you can mulitply, for example, pixel (a22) by the corresponding mask value (m22). It is important to notice that pixels in the first and last rows, as well as the first and last columns cannot be manipulated by a 3x3 mask. This is because when placing the center of the mask over a pixel in the first row (for example), the mask will be outside the image boundaries. SOBEL EXPLANATION The mask is slid over an area of the input image, changes that pixel's value and then shifts one pixel to the right and continues to the right until it reaches the end of a row. It then starts at the beginning of the next row. The example below shows the mask being slid over the top left portion of the input image represented by the green outline. The formula shows how a particular pixel in the output image would be calculated. The center of the mask is placed over the pixel you are manipulating in the image. And the I & J values are used to move the file pointer so you can mulitply, for example, pixel (a22) by the corresponding mask value (m22). It is important to notice that pixels in the first and last rows, as well as the first and last columns cannot be manipulated by a 3x3 mask. This is because when placing the center of the mask over a pixel in the first row (for example), the mask will be outside the image boundaries.

28. Basic Method Step 1: filter noise using mean filter Step 2: compute spatial gradient Step 3: mark points > threshold as edges

29. Mark Edge Points Given gradient at each pixel and threshold mark pixels where gradient > threshold as edges Canny algorithm extends basic method Stages of the Canny algorithm [edit] Noise reduction Because the Canny edge detector uses a filter based on the first derivative of a Gaussian, it is susceptible to noise present on raw unprocessed image data, so to begin with the raw image is convolved with a Gaussian filter. The result is as a slightly blurred version of the original which is not affected by a single noisy pixel to any significant degree. [edit] Finding the intensity gradient of the image An edge in an image may point in a variety of directions, so the Canny algorithm uses four filters to detect horizontal, vertical and diagonal edges in the blurred image. For each pixel in the result, the direction of the filter which gives the largest response magnitude is determined. This direction together with the filter response then gives an estimated intensity gradient at each point in the image. [edit] Non-maximum suppression Given estimates of the image gradients, a search is then carried out to determine if the gradient magnitude assumes a local maximum in the gradient direction. From this stage referred to as non-maximum suppression, a set of edge points, in the form of a binary image, is obtained. These are sometimes referred to as "thin edges". [edit] Tracing edges through the image and hysteresis thresholding Intensity gradients which are large are more likely to correspond to edges than if they are small. It is in most cases impossible to specify a threshold at which a given intensity gradient switches from corresponding to an edge into not doing so. Therefore Canny uses thresholding with hysteresis. Thresholding with hysteresis requires two thresholds - high and low. Making the assumption that important edges should be along continuous curves in the image allows us to follow a faint section of a given line and to discard a few noisy pixels that do not constitute a line but have produced large gradients. Therefore we begin by applying a high threshold. This marks out the edges we can be fairly sure are genuine. Starting from these, using the directional information derived earlier, edges can be traced through the image. While tracing an edge, we apply the lower threshold, allowing us to trace faint sections of edges as long as we find a starting point. Once this process is complete we have a binary image where each pixel is marked as either an edge pixel or a non-edge pixel. From complementary output from the edge tracing step, the binary edge map obtained in this way can also be treated as a set of edge curves, which after further processing can be represented as polygons in the image domain. [edit] Differential edge detection A more refined approach to obtain edges with sub-pixel accuracy is by using the following differential approach of detecting zero-crossings of the second-order directional derivative in the gradient direction (Lindeberg 1998) that satisfy a sign-condition on the third-order directional derivative in the same direction (for more details, please see the relations between edge detection and ridge detection in the article on ridge detection) where Lx, Ly ... Lyyy denote partial derivatives computed from a scale-space representation L obtained by smoothing the original image with a Gaussian kernel. In this way, the edges will be automatically obtained as continuous curves with subpixel accuracy. Hysteresis thresholding can also be applied to these differential and subpixel edge segments. [edit] Parameters The Canny algorithm contains a number of adjustable parameters, which can affect the computation time and effectiveness of the algorithm. The size of the Gaussian filter: the smoothing filter used in the first stage directly affects the results of the Canny algorithm. Smaller filters cause less blurring, and allow detection of small, sharp lines. A larger filter causes more blurring, smearing out the value of a given pixel over a larger area of the image. Larger blurring radii are more useful for detecting larger, smoother edges - for instance, the edge of a rainbow. Thresholds: the use of two thresholds with hysteresis allows more flexibility than in a single-threshold approach, but general problems of thresholding approaches still apply. A threshold set too high can miss important information. On the other hand, a threshold set too low will falsely identify irrelevant information (such as noise) as important. It is difficult to give a generic threshold that works well on all images. No tried and tested approach to this problem yet exists. To experiment with the parameters of the Canny algorithm, the on-line Canny application on http://matlabserver.cs.rug.nl can be useful. [edit] Conclusion The Canny algorithm is adaptable to various environments. Its parameters allow it to be tailored to recognition of edges of differing characteristics depending on the particular requirements of a given implementation. In Canny's original paper, the derivation of the optimal filter led to a Finite Impulse Response filter, which can be slow to compute in the spatial domain if the amount of smoothing required is important (the filter will have a large spatial support in that case). For this reason, it is often suggested to use Rachid Deriche's Infinite Impulse Response form of Canny's filter, which is recursive, and which can be computed in a short, fixed amount of time for any desired amount of smoothing. The second form is suitable for real time implementations in FPGAs or DSPs, or very fast embedded PCs. In this context, however, the regular recursive implementation of the Canny operator does not give a good approximation of rotational symmetry and therefore gives a bias towards horizontal and vertical edges. [edit] References Canny, J., A Computational Approach To Edge Detection, IEEE Trans. Pattern Analysis and Machine Intelligence, 8:679-714, 1986. R. Deriche, Using Canny's criteria to derive an optimal edge detector recursively implemented, Int. J. Computer Vision, Vol. 1, pp. 167-187, April 1987. Stages of the Canny algorithm [edit] Noise reduction Because the Canny edge detector uses a filter based on the first derivative of a Gaussian, it is susceptible to noise present on raw unprocessed image data, so to begin with the raw image is convolved with a Gaussian filter. The result is as a slightly blurred version of the original which is not affected by a single noisy pixel to any significant degree. [edit] Finding the intensity gradient of the image An edge in an image may point in a variety of directions, so the Canny algorithm uses four filters to detect horizontal, vertical and diagonal edges in the blurred image. For each pixel in the result, the direction of the filter which gives the largest response magnitude is determined. This direction together with the filter response then gives an estimated intensity gradient at each point in the image. [edit] Non-maximum suppression Given estimates of the image gradients, a search is then carried out to determine if the gradient magnitude assumes a local maximum in the gradient direction. From this stage referred to as non-maximum suppression, a set of edge points, in the form of a binary image, is obtained. These are sometimes referred to as "thin edges". [edit] Tracing edges through the image and hysteresis thresholding Intensity gradients which are large are more likely to correspond to edges than if they are small. It is in most cases impossible to specify a threshold at which a given intensity gradient switches from corresponding to an edge into not doing so. Therefore Canny uses thresholding with hysteresis. Thresholding with hysteresis requires two thresholds - high and low. Making the assumption that important edges should be along continuous curves in the image allows us to follow a faint section of a given line and to discard a few noisy pixels that do not constitute a line but have produced large gradients. Therefore we begin by applying a high threshold. This marks out the edges we can be fairly sure are genuine. Starting from these, using the directional information derived earlier, edges can be traced through the image. While tracing an edge, we apply the lower threshold, allowing us to trace faint sections of edges as long as we find a starting point. Once this process is complete we have a binary image where each pixel is marked as either an edge pixel or a non-edge pixel. From complementary output from the edge tracing step, the binary edge map obtained in this way can also be treated as a set of edge curves, which after further processing can be represented as polygons in the image domain. [edit] Differential edge detection A more refined approach to obtain edges with sub-pixel accuracy is by using the following differential approach of detecting zero-crossings of the second-order directional derivative in the gradient direction (Lindeberg 1998) that satisfy a sign-condition on the third-order directional derivative in the same direction (for more details, please see the relations between edge detection and ridge detection in the article on ridge detection) where Lx, Ly ... Lyyy denote partial derivatives computed from a scale-space representation L obtained by smoothing the original image with a Gaussian kernel. In this way, the edges will be automatically obtained as continuous curves with subpixel accuracy. Hysteresis thresholding can also be applied to these differential and subpixel edge segments. [edit] Parameters The Canny algorithm contains a number of adjustable parameters, which can affect the computation time and effectiveness of the algorithm. The size of the Gaussian filter: the smoothing filter used in the first stage directly affects the results of the Canny algorithm. Smaller filters cause less blurring, and allow detection of small, sharp lines. A larger filter causes more blurring, smearing out the value of a given pixel over a larger area of the image. Larger blurring radii are more useful for detecting larger, smoother edges - for instance, the edge of a rainbow. Thresholds: the use of two thresholds with hysteresis allows more flexibility than in a single-threshold approach, but general problems of thresholding approaches still apply. A threshold set too high can miss important information. On the other hand, a threshold set too low will falsely identify irrelevant information (such as noise) as important. It is difficult to give a generic threshold that works well on all images. No tried and tested approach to this problem yet exists. To experiment with the parameters of the Canny algorithm, the on-line Canny application on http://matlabserver.cs.rug.nl can be useful. [edit] Conclusion The Canny algorithm is adaptable to various environments. Its parameters allow it to be tailored to recognition of edges of differing characteristics depending on the particular requirements of a given implementation. In Canny's original paper, the derivation of the optimal filter led to a Finite Impulse Response filter, which can be slow to compute in the spatial domain if the amount of smoothing required is important (the filter will have a large spatial support in that case). For this reason, it is often suggested to use Rachid Deriche's Infinite Impulse Response form of Canny's filter, which is recursive, and which can be computed in a short, fixed amount of time for any desired amount of smoothing. The second form is suitable for real time implementations in FPGAs or DSPs, or very fast embedded PCs. In this context, however, the regular recursive implementation of the Canny operator does not give a good approximation of rotational symmetry and therefore gives a bias towards horizontal and vertical edges. [edit] References Canny, J., A Computational Approach To Edge Detection, IEEE Trans. Pattern Analysis and Machine Intelligence, 8:679-714, 1986. R. Deriche, Using Canny's criteria to derive an optimal edge detector recursively implemented, Int. J. Computer Vision, Vol. 1, pp. 167-187, April 1987.

30. Compute Edge Direction Calculation of Rate of Change in Intensity Gradient Use 2nd derivative Example: (5 7 6 4 152 148 149) Use convolution mask (+1 -2 +1) I��(x) = 1*I(x-1) -2*I(x) + 1*I(x+1) Peak detection in 2nd derivate is a method for line detection. CS 414 - Spring 2008 Detecting an edge Taking an edge to be a change in intensity taking place over a number of pixels, edge detection algorithms generally compute a derivative of this intensity change. To simplify matters, we can consider the detection of an edge in one dimension. In this instance, our data can be a single line of pixel intensities. For instance, we can intuitively say that there should be an edge between the 4th and 5th pixels in the following 1-dimensional data: 5 7 6 4 152 148 149 To firmly state a specific threshold on how large the intensity change between two neighbouring pixels must be for us to say that there should be an edge between these pixels is, however, not always an easy problem. Indeed, this is one of the reasons why edge detection may be a non-trivial problem unless the objects in the scene are particularly simple and the illumination conditions can be well controlled. [edit] Computing the 1st derivative Many edge-detection operators are based upon the 1st derivative of the intensity - this gives us the intensity gradient of the original data. Using this information we can search an image for peaks in the intensity gradient. If I(x) represents the intensity of pixel x, and I'(x) represents the first derivative (intensity gradient) at pixel x, we therefore find that: For higher performance image processing, the 1st derivative can therefore be calculated (in 1D) by convolving the original data with a mask: -1/2 0 +1/2 [edit] Computing the 2nd derivative Some other edge-detection operators are based upon the 2nd derivative of the intensity. This is essentially the rate of change in intensity gradient. In the ideal continuous case, detection of zero-crossings in the second derivative captures local maxima in the gradient. Peak detection in the second derivative, on the other hand, is a method for line detection, provided that the image operators are expressed at a proper scale. As noted above, a line is a double edge, hence we will see an intensity gradient on one side of the line, followed immediately by the opposite gradient on the opposite site. Therefore we can expect to see a very high change in intensity gradient where a line is present in the image. To find lines, we can alternatively search for zero-crossings in the second derivative of the image gradient. If I(x) represents the intensity at point x, and I"(x) is the second derivative at point x: Again most algorithms use a convolution mask to process quickly the image data: +1 -2 +1 [edit] Thresholding Once we have calculated our derivative, the next stage is to apply a threshold, to determine where the result suggest an edge to be present. The lower the threshold, the more lines will be detected, and the results become increasingly susceptible to noise, and also to picking out irrelevant features from the image. Conversely a high threshold may miss subtle lines, or segmented lines. A commonly used compromise is thresholding with hysteresis. This method uses multiple thresholds to find edges. We begin by using the upper threshold to find the start of a line. Once we have a start point, we trace the edge's path through the image pixel by pixel, marking an edge whenever we are above the lower threshold. We stop marking our edge only when the value falls below our lower threshold. This approach makes the assumption that edges are likely to be in continuous lines, and allows us to follow a faint section of an edge we have previously seen, without meaning that every noisy pixel in the image is marked down as an edge. [edit] Edge detection operators 1st order: Roberts Cross, Prewitt, Sobel, Canny 2nd Order: Marr-Hildreth, zero-crossings of the second-order derivative in the gradient direction. Currently, the Canny operator (or variations of this operator) is the most commonly used edge detection method. A large number of edge detection operators have been published but so far none has shown significant advantages over the Canny-type operators in general situations. In his original work, Canny studied the problem of designing an optimal pre-smoothing filter for edge detection, and then showed that this filter could be well approximated by a first-order Gaussian derivative kernel. Canny also introduced the notion of non-maximum suppression, which means that edges are defined as points where the gradient magnitude assumes a maximum in the gradient direction. On a discrete grid, the non-maximum suppression stage can be implemented by estimating the gradient direction using first-order derivatives, then rounding off the gradient direction to multiples of 45 degrees, and finally comparing the values of the gradient magnitude in the estimated gradient direction. A more refined approach to obtain edges with sub-pixel accuracy is by using the following differential approach of detecting zero-crossings of the second-order directional derivative in the gradient direction (Lindeberg 1998) that satisfy a sign-condition on the third-order directional derivative in the same direction (for more details, please see the relations between edge detection and ridge detection in the article on ridge detection) where Lx, Ly ... Lyyy denote partial derivatives computed from a scale-space representation L obtained by smoothing the original image with a Gaussian kernel. In this way, the edges will be automatically obtained as continuous curves with subpixel accuracy. Hysteresis thresholding can also be applied to these differential and subpixel edge segments. [edit] Noise Reduction Edge detection is complicated with false edges created by image noise. The number of false edges can be lowered by using image noise reduction techniques before detecting edges. Detecting an edge Taking an edge to be a change in intensity taking place over a number of pixels, edge detection algorithms generally compute a derivative of this intensity change. To simplify matters, we can consider the detection of an edge in one dimension. In this instance, our data can be a single line of pixel intensities. For instance, we can intuitively say that there should be an edge between the 4th and 5th pixels in the following 1-dimensional data: 5 7 6 4 152 148 149 To firmly state a specific threshold on how large the intensity change between two neighbouring pixels must be for us to say that there should be an edge between these pixels is, however, not always an easy problem. Indeed, this is one of the reasons why edge detection may be a non-trivial problem unless the objects in the scene are particularly simple and the illumination conditions can be well controlled. [edit] Computing the 1st derivative Many edge-detection operators are based upon the 1st derivative of the intensity - this gives us the intensity gradient of the original data. Using this information we can search an image for peaks in the intensity gradient. If I(x) represents the intensity of pixel x, and I'(x) represents the first derivative (intensity gradient) at pixel x, we therefore find that: For higher performance image processing, the 1st derivative can therefore be calculated (in 1D) by convolving the original data with a mask: -1/2 0 +1/2 [edit] Computing the 2nd derivative Some other edge-detection operators are based upon the 2nd derivative of the intensity. This is essentially the rate of change in intensity gradient. In the ideal continuous case, detection of zero-crossings in the second derivative captures local maxima in the gradient. Peak detection in the second derivative, on the other hand, is a method for line detection, provided that the image operators are expressed at a proper scale. As noted above, a line is a double edge, hence we will see an intensity gradient on one side of the line, followed immediately by the opposite gradient on the opposite site. Therefore we can expect to see a very high change in intensity gradient where a line is present in the image. To find lines, we can alternatively search for zero-crossings in the second derivative of the image gradient. If I(x) represents the intensity at point x, and I"(x) is the second derivative at point x: Again most algorithms use a convolution mask to process quickly the image data: +1 -2 +1 [edit] Thresholding Once we have calculated our derivative, the next stage is to apply a threshold, to determine where the result suggest an edge to be present. The lower the threshold, the more lines will be detected, and the results become increasingly susceptible to noise, and also to picking out irrelevant features from the image. Conversely a high threshold may miss subtle lines, or segmented lines. A commonly used compromise is thresholding with hysteresis. This method uses multiple thresholds to find edges. We begin by using the upper threshold to find the start of a line. Once we have a start point, we trace the edge's path through the image pixel by pixel, marking an edge whenever we are above the lower threshold. We stop marking our edge only when the value falls below our lower threshold. This approach makes the assumption that edges are likely to be in continuous lines, and allows us to follow a faint section of an edge we have previously seen, without meaning that every noisy pixel in the image is marked down as an edge. [edit] Edge detection operators 1st order: Roberts Cross, Prewitt, Sobel, Canny 2nd Order: Marr-Hildreth, zero-crossings of the second-order derivative in the gradient direction. Currently, the Canny operator (or variations of this operator) is the most commonly used edge detection method. A large number of edge detection operators have been published but so far none has shown significant advantages over the Canny-type operators in general situations. In his original work, Canny studied the problem of designing an optimal pre-smoothing filter for edge detection, and then showed that this filter could be well approximated by a first-order Gaussian derivative kernel. Canny also introduced the notion of non-maximum suppression, which means that edges are defined as points where the gradient magnitude assumes a maximum in the gradient direction. On a discrete grid, the non-maximum suppression stage can be implemented by estimating the gradient direction using first-order derivatives, then rounding off the gradient direction to multiples of 45 degrees, and finally comparing the values of the gradient magnitude in the estimated gradient direction. A more refined approach to obtain edges with sub-pixel accuracy is by using the following differential approach of detecting zero-crossings of the second-order directional derivative in the gradient direction (Lindeberg 1998) that satisfy a sign-condition on the third-order directional derivative in the same direction (for more details, please see the relations between edge detection and ridge detection in the article on ridge detection) where Lx, Ly ... Lyyy denote partial derivatives computed from a scale-space representation L obtained by smoothing the original image with a Gaussian kernel. In this way, the edges will be automatically obtained as continuous curves with subpixel accuracy. Hysteresis thresholding can also be applied to these differential and subpixel edge segments. [edit] Noise Reduction Edge detection is complicated with false edges created by image noise. The number of false edges can be lowered by using image noise reduction techniques before detecting edges.

31. Compute Edge Direction Compute direction of maximum change

32. Summary Other Important Image Processing Operations Image segmentation Image recognition Formatting Conditioning Marking Grouping Extraction Matching Image Synthesis CS 414 - Spring 2008

CS 414 Multimedia Systems Design Lecture 4 Digital Image Representation

CS 414 Multimedia Systems Design Lecture 4 Digital Image Representation

Presentation Transcript

CS 414 – Multimedia Systems Design Lecture 1 - Introduction

CS 414 – Multimedia Systems Design Lecture 4 – Digital Image Representation

CS 414 – Multimedia Systems Design Lecture 1 - Introduction

CS 414 – Multimedia Systems Design Lecture 22 – Multimedia Session Protocols

CS 414 – Multimedia Systems Design Lecture 3 – Digital Audio Representation

CS 414 – Multimedia Systems Design Lecture 22 – Multimedia Session Protocols

CS 414 – Multimedia Systems Design Lecture 21 – Multimedia Session Protocols

CS 414 – Multimedia Systems Design Lecture 1 - Introduction

CS 414 – Multimedia Systems Design Lecture 3 – Digital Audio Representation

CS 414 – Multimedia Systems Design Lecture 18 – Multimedia Transport Subsystem (Part 4)

CS 414 – Multimedia Systems Design Lecture 5 – Digital Video Representation

CS 414 – Multimedia Systems Design Lecture 4 – Digital Image Representation

CS 414 – Multimedia Systems Design Lecture 5 – Digital Video Representation

CS 414 – Multimedia Systems Design Lecture 5 – Digital Video Representation

CS 414 – Multimedia Systems Design Lecture 20 – Multimedia Session Protocols

CS 414 – Multimedia Systems Design Lecture 3 – Digital Audio Representation

CS 414 – Multimedia Systems Design Lecture 4 – Visual Perception and Digital Image Representation

CS 414 – Multimedia Systems Design Lecture 3 – Digital Audio Representation

CS 414 – Multimedia Systems Design Lecture 22 – Multimedia Session Protocols