Highlights Lecture on the image part (10)
Automatic Perception 16
Volker Krüger
Aalborg Media Lab, Aalborg University, Copenhagen
vok@media.aau.dk
Agenda
• Will take you through the highlights in the context of a typical MED3 project
• Will not go into details
  • For details see: book, slides, PE-questions, keywords, exercises
• Programming Languages
Why should MED-students learn about digital images?
• Because images are fun
• Understand the medium: images
• Understand how images can be manipulated in order to create visual effects
  • Rotation, scaling, blurring, etc.
  • As done in standard image manipulation tools
  • Remove parts of an image
  • Combine graphics with real images
  • Combine (part of) one image with another
• Generate control signals for an application (project)
  • Understand how to find and follow (well-defined) objects in an image
  • Recognize objects (many industrial applications)
Fundamental Steps in Computer Vision
[Diagram: Problem domain → Image acquisition → Preprocessing → Segmentation → Representation and description → Recognition and Interpretation → Result, together with a Knowledge base. Example result: "Actor sitting", Point 1: 22,33, Point 2: 24,39, …..]
Image Acquisition
• System setup
  • Define a working area (distance from the camera, etc.)
• Choosing a camera
  • Field of view, resolution, color vs. B/W, optics
• The lighting
• Special background
• Special clothing
Preprocessing
• Grey level enhancement
  • Point processing
• Remove noise
  • Neighbor operation
  • Median filter
  • Mean filter
• Prepare image for further processing
  • Down sample
  • Select region of interest (ROI)
  • Scale or rotate image
  • Blur
  • Convolution
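The two noise filters named above are both neighbor operations: each output pixel is computed from the 3x3 window around it. A minimal sketch in plain Python, assuming a grey-level image stored as a list of rows of integers; the function names and the choice to leave border pixels unchanged are assumptions made for illustration only.

```python
def filter3x3(img, reduce):
    """Apply a 3x3 neighbor operation; `reduce` maps the 9 window values to one value.

    Border pixels are copied unchanged (one of several possible border policies).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = reduce(window)
    return out


def mean_filter(img):
    # Integer mean of the 9 window values.
    return filter3x3(img, lambda win: sum(win) // len(win))


def median_filter(img):
    # Middle value of the sorted window.
    return filter3x3(img, lambda win: sorted(win)[len(win) // 2])


noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
print(median_filter(noisy)[1][1], mean_filter(noisy)[1][1])
```

On a single bright noise pixel the median removes the outlier entirely, while the mean only spreads it out, which is why the median is often preferred for isolated noise pixels.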
Segmentation
• Separating the foreground pixels from the background pixels
• Often, we want a binary image as output
Segmentation
• Methods
  • Intensity/color
  • Motion (the object is moving)
  • Edges
  • Region growing
• Post-segmentation: Remove noise in the binary image
Segmentation using Intensity/color
• Thresholding (like Chroma-keying)
• Algorithm: for each pixel
  • If THmin < pixel value < THmax
  • Then pixel belongs to object
  • Otherwise pixel belongs to background
• Pros and cons:
  • Very simple
  • Requires that the object has a unique intensity/color
  • Supported by EyesWeb
  • Partially supported by Jitter
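The per-pixel rule above can be sketched in a few lines of plain Python. This is only an illustration, not what the EyesWeb or Jitter blocks do internally; the function name `threshold` and the list-of-rows image representation are assumptions.

```python
def threshold(img, th_min, th_max):
    """Return a binary image: 1 where th_min < pixel < th_max (object), else 0 (background)."""
    return [[1 if th_min < p < th_max else 0 for p in row] for row in img]


grey = [[10, 120, 200],
        [130, 90, 140]]
print(threshold(grey, 100, 150))
```

For a color image the same test would be applied per channel (for example, one range each for R, G and B), with a pixel counted as object only if all channels fall inside their ranges.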
Segmentation using Motion
• Assume that only the object is moving => motion can be used to find the object
• Algorithm: for each pixel
  • Subtract the current image from a background image OR from the previous image => distance
  • If distance > TH then the pixel belongs to the object
• Pros and cons:
  • Simple (very advanced background subtraction is not simple!)
  • Requires that the object pixels each have a different value than the background
  • Supported by EyesWeb (not the very advanced version!)
  • Not supported by Jitter
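A minimal sketch of the simple version of this algorithm, assuming grey-level images of equal size stored as lists of rows and the absolute difference as the distance. The function name is hypothetical. Passing the previous frame instead of a background image gives image differencing.

```python
def subtract_background(current, reference, th):
    """Mark a pixel as object (1) when it differs from the reference image by more than th."""
    return [[1 if abs(c - r) > th else 0 for c, r in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(current, reference)]


background = [[50, 50],
              [50, 50]]
frame = [[52, 200],
         [50, 10]]
print(subtract_background(frame, background, th=20))
```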
Segmentation using Edges
• Edges = intensity/color changes between object and background
• Defines the border between objects and background
• Algorithm: for each pixel
  • Calculate the magnitude of the gradient
  • If magnitude > Threshold, then edge pixel
• Pros and cons:
  • Relatively simple (Canny is more difficult)
  • Requires that the border pixels of the object each have a different value than the background
  • Supported by EyesWeb
  • Partially supported by Jitter
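The slide does not say which gradient operator is used. The sketch below assumes the 3x3 Sobel kernels, a grey-level image stored as a list of rows, and border pixels that are simply never marked as edges. The function names are hypothetical.

```python
import math


def gradient_magnitude(img):
    """Gradient magnitude per pixel using 3x3 Sobel kernels (border pixels stay 0)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal change: right column minus left column.
            gx = ((img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1])
                  - (img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1]))
            # Vertical change: bottom row minus top row.
            gy = ((img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1])
                  - (img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1]))
            mag[y][x] = math.hypot(gx, gy)
    return mag


def edge_pixels(img, th):
    """Binary edge image: 1 where the gradient magnitude exceeds th."""
    return [[1 if m > th else 0 for m in row] for row in gradient_magnitude(img)]


step = [[0, 0, 100, 100]] * 3  # vertical intensity step between columns 1 and 2
print(edge_pixels(step, th=200))
```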
Segmentation using Region growing
• Assume the object has a uniform intensity/color which is different from the immediate background
• Algorithm:
  • Find a seed point (pixel) inside the object
  • "Grow" an object region using connectivity
• Pros and cons:
  • Difficult: how to find the seed point…
  • Can find a blob even though pixels with a similar intensity/color exist in other parts of the image
  • Probably not supported by EyesWeb (unconfirmed)
  • Not supported by Jitter
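A minimal sketch of region growing, assuming the seed point is already known, 4-connectivity, and a similarity test that compares each candidate pixel with the seed's value. The function name, the `tolerance` parameter and the (row, column) seed convention are assumptions.

```python
from collections import deque


def region_grow(img, seed, tolerance):
    """Grow a region from seed=(row, col) over 4-connected pixels close to the seed value."""
    h, w = len(img), len(img[0])
    sy, sx = seed
    seed_value = img[sy][sx]
    region = [[0] * w for _ in range(h)]
    region[sy][sx] = 1
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not region[ny][nx]
                    and abs(img[ny][nx] - seed_value) <= tolerance):
                region[ny][nx] = 1
                queue.append((ny, nx))
    return region


img = [[200, 200, 0, 200],
       [200, 0, 0, 200],
       [0, 0, 0, 200]]
print(region_grow(img, seed=(0, 0), tolerance=10))
```

The bright column on the right has the same intensity as the seed but is not connected to it, so it stays outside the region. A plain threshold would have included it.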
Post-segmentation: Remove noise in the Binary image
• After segmentation, noise often remains
• Noise:
  • Wrongly segmented regions
  • Camera noise => noise pixels
• Methods:
  • Those from preprocessing
    • Median filter
    • Mean filter
  • Morphology
• Supported by EyesWeb
• "mean" is supported by Jitter
Morphology
• Erosion
• Dilation
• Opening
• Closing
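The four operations can be sketched on a binary image as follows. The sketch assumes a 3x3 square structuring element and treats pixels outside the image as background. Both are illustrative choices, and the function names are hypothetical.

```python
def _window(img, y, x):
    """3x3 neighborhood of (y, x); positions outside the image count as 0."""
    h, w = len(img), len(img[0])
    return [img[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else 0
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def erode(img):
    # A pixel survives only if its whole 3x3 neighborhood is foreground.
    return [[1 if all(_window(img, y, x)) else 0 for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img):
    # A pixel becomes foreground if any pixel in its 3x3 neighborhood is foreground.
    return [[1 if any(_window(img, y, x)) else 0 for x in range(len(img[0]))]
            for y in range(len(img))]


def opening(img):
    # Erosion followed by dilation: removes small specks.
    return dilate(erode(img))


def closing(img):
    # Dilation followed by erosion: fills small holes.
    return erode(dilate(img))


blob_with_hole = [[0, 0, 0, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 1, 0, 1, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 0, 0, 0]]
print(closing(blob_with_hole))
```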
Representation
• Represent each blob by a number of features
• Finding the blobs:
  • Connected component analysis
  • Find all segmented pixels which are connected
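Connected component analysis can be sketched as a flood fill that is restarted at every foreground pixel not yet visited. The version below assumes 4-connectivity and returns each blob as a list of (row, column) pixels. The function name `find_blobs` is hypothetical.

```python
from collections import deque


def find_blobs(binary):
    """Return a list of blobs; each blob is a list of 4-connected (row, col) pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # New, unvisited foreground pixel: collect everything connected to it.
            blob = []
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs


binary = [[1, 1, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 1, 1]]
for blob in find_blobs(binary):
    print(sorted(blob))
```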
Representation - Features
• Representing each object by a set of features (=characteristics)
• Calculated from the list of pixels (blob)
• Features:
  • Center of gravity
  • Bounding box
  • Area
  • Perimeter = length of contour
  • Compactness
  • Circularity
  • Orientation
  • Many other features: see the notes
• Some are supported by EyesWeb
• CoG is supported by Jitter
[Figure: one BLOB]
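A few of these features can be computed directly from a blob's pixel list, as in the sketch below. The blob is assumed to be a list of (row, column) pixels. Compactness is taken here as blob area divided by bounding-box area, which is one common definition and an assumption; check the course notes for the definition used in the course. The function name and dictionary keys are hypothetical.

```python
def blob_features(blob):
    """Compute area, center of gravity, bounding box and compactness of one blob."""
    rows = [r for r, _ in blob]
    cols = [c for _, c in blob]
    area = len(blob)  # number of pixels in the blob
    center_of_gravity = (sum(rows) / area, sum(cols) / area)
    bounding_box = (min(rows), min(cols), max(rows), max(cols))
    box_height = bounding_box[2] - bounding_box[0] + 1
    box_width = bounding_box[3] - bounding_box[1] + 1
    # Assumed definition: fraction of the bounding box covered by the blob.
    compactness = area / (box_height * box_width)
    return {
        "area": area,
        "center_of_gravity": center_of_gravity,
        "bounding_box": bounding_box,
        "compactness": compactness,
    }


l_shape = [(0, 0), (1, 0), (1, 1)]
print(blob_features(l_shape))
```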
Recognition
• Which one of the blobs found by segmentation is the correct blob (=object)?
• Finding the object in the image
  • Directly: Template matching
• Which of the blobs matches the model we have of the object?
  • Feature matching
Template Matching
[Figure: input image, template, and output]
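The slide does not name a similarity measure. As a sketch, the code below slides the template over the input image and scores each position with the sum of absolute differences (SAD), where 0 means a perfect match. SAD, the function name and the returned tuple are assumptions, and other measures such as correlation are equally possible.

```python
def match_template(img, template):
    """Return (best_sad, row, col) of the template position with the lowest SAD score."""
    ih, iw = len(img), len(img[0])
    th, tw = len(template), len(template[0])
    best = None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            # Sum of absolute differences between the template and the image patch.
            sad = sum(abs(img[y + j][x + i] - template[j][i])
                      for j in range(th) for i in range(tw))
            if best is None or sad < best[0]:
                best = (sad, y, x)
    return best


image = [[0, 0, 0, 0],
         [0, 9, 8, 0],
         [0, 7, 9, 0],
         [0, 0, 0, 0]]
template = [[9, 8],
            [7, 9]]
print(match_template(image, template))
```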
Feature Matching
• Distance in "feature-space"
  • Feature 1: Area
  • Feature 2: Circularity
  • 2 dimensional feature space
[Figure: plot with axes Feature 1 and Feature 2 showing the model and the BLOBs in the image. Labels: "Measure the distance!", "MATCH!"]
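Measuring the distance in feature space can be sketched as below, assuming the Euclidean distance and feature vectors given as (area, circularity) tuples. The function name and the example numbers are made up for illustration.

```python
import math


def closest_blob(model, blobs):
    """Return (index, distance) of the blob whose feature vector is nearest to the model."""
    distances = [math.dist(model, features) for features in blobs]
    best = min(range(len(blobs)), key=lambda i: distances[i])
    return best, distances[best]


model = (100.0, 0.9)        # (area, circularity) of the object we are looking for
blobs = [(400.0, 0.5),      # feature vectors of the blobs found in the image
         (105.0, 0.8),
         (30.0, 0.9)]
print(closest_blob(model, blobs))
```

Area and circularity live on very different scales, so in practice the features would usually be normalized before distances are compared. Otherwise the area dominates the result.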
Learning the Parameters
• Off-line training
• All the intensity/color/edge thresholds, background images and model parameters (for recognition) need to be set
• Simple solution
  • Play around with the values until it works
  • Not very scientific!
• Instead: learn the values from a number of representative test images
  • The more test images the better!
  • Learn the means and variances (or color histograms)
  • Look at the histograms of your images!
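One way to learn thresholds instead of guessing them is to collect object pixel values from the test images and derive a range from their mean and standard deviation. The sketch below assumes grey-level samples and a range of mean ± k standard deviations. The factor `k`, the function name and the sample values are assumptions.

```python
from statistics import mean, pstdev


def learn_threshold_range(samples, k=2.0):
    """Return (th_min, th_max) as mean -/+ k standard deviations of the training samples."""
    m = mean(samples)
    s = pstdev(samples)  # population standard deviation of the samples
    return m - k * s, m + k * s


# Hypothetical grey levels of object pixels picked from several test images.
object_pixels = [100, 110, 90, 105, 95]
th_min, th_max = learn_threshold_range(object_pixels)
print(th_min, th_max)
```

The resulting pair can be used as THmin and THmax in intensity thresholding. A larger `k` accepts more object pixels at the price of more background pixels passing the test.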
Programming Languages
• EyesWeb
• Jitter
• Java
Java
• Everything is possible
• "Hard" to get started
• Find image processing functions on the web or on the CD for the book
EyesWeb
• Very easy to program
• Lots of available functionality
• Limited by the blocks UNLESS you write your own blocks!
• See the tutorials on the web
• Program blocks in C, supported by IPL and OpenCV (very powerful)
Jitter
• Easy to program
• An extension of Max/MSP
• Lots of available functionality for sound
• Limited blocks for vision
• Limited by the blocks UNLESS you write your own blocks!
• See the tutorials on the web
Pros and Cons
• Java
  • If you know how to program or want to learn
  • Do the image processing in EyesWeb/Jitter and send control signals to Java for visualization
• EyesWeb
  • If you don't want to spend too much time programming
• EyesWeb including your own blocks
  • A good compromise
• Jitter: More programming needed!
• Ask the students from previous MED3-years!
Final comments
• Before the exam you need to know what goes on inside the EyesWeb/Jitter blocks you are using!
• The course is evaluated through the project exam (using the PE-questions and keywords)
• Prepare by:
  • Reading the book
  • Looking at the slides
  • Discussing the PE-questions and keywords within the groups