
4054 Machine Vision

Dr. Simon Prince, Dept. Computer Science, University College London. http://www.cs.ucl.ac.uk/staff/s.prince/4054.htm



  1. 4054 Machine Vision. Dr. Simon Prince, Dept. Computer Science, University College London. http://www.cs.ucl.ac.uk/staff/s.prince/4054.htm

  2. Course Mailing List. IMPORTANT! You must sign up to the course mailing list. Regardless of the module code under which you are taking this course, please sign up by sending an email to 4054-request@cs.ucl.ac.uk with the word “join” in the subject line. You should receive a confirmation that you have joined. If you do not, please contact the helpdesk. Course announcements will be made on this mailing list. All announcements to this list will be assumed to have been received.

  3. Schedule. LECTURES: Tuesday 12-1pm, MPEB 1.13; Thursday 10-12pm, Wilkins JBR Meeting Room; Thursday 2-3pm, South Wing Committee Room; Thursday 4-5pm, MPEB 113; Friday 11-12, Roberts 110. PRACTICAL SESSIONS: Thursday 3-4pm, MPEB 4.17; Friday 3-5pm, MPEB 1.05. Demonstrator: Alastair Moore. Information via: http://www.cs.ucl.ac.uk/staff/A.Moore/teaching.htm

  4. Materials • Lecture Notes • Slides • Key Papers • Practicals

  5. Time Commitment • 30 one-hour lectures • 15 hours of problem classes (do the programming practicals in these) • Expect to spend ten hours a week on: • Reading papers and reviewing class notes • Working through proofs and implementing algorithms

  6. Books Essentially, there are no good books, but these are the best of a bad lot

  7. Books Good for tracking (free online) Good for geometry (1 free chapter online)

  8. Books Good for geometry (1 free chapter online)

  9. Assumed Knowledge • Linear Algebra up to and including the Singular Value Decomposition • Probability and random variables (course 3006, ongoing) • Familiarity with the multivariate normal distribution

  10. Problems "In mathematics you don't understand things. You just get used to them." John von Neumann, 1903-1957. How to get used to them? Do problems from the appropriate book. Implement the algorithms in the lecture notes. Work through the proofs in the text and appendices. Familiarity with mathematics, and understanding of it, come only with use.

  11. Exam • 2.5 Hours • Choose 3 questions from 5 • No restrictions • The exam is in February.

  12. Coursework. To complete COMP4054 Machine Vision, you must complete two assignments. The first is a Matlab implementation drawn from the first half of the course. The second is a critical literature review drawn from the second half of the course. In both cases, you have a choice of topic. Practical 1: Programming Assignment. Deadline: 21/12/07. Practical 2: Literature Review. Deadline: 8/1/08. No extensions except in the most serious circumstances.

  13. Practical #1 • In Practical 1 you must complete one of five short Matlab assignments of your choosing. Although you are only required to submit one, you are strongly recommended to complete all five. They are designed to help you understand aspects of the course and will help with your revision. Supporting materials are available from the main COMP4054 website. • Topics: • Geometry of a single camera • Geometry of multiple cameras • Dense stereo vision • Background subtraction • Face detection

  14. Practical #1 • Regardless of which of the five projects you choose, the format of the report will be the same. It should consist of 2-5 pages containing: • a short literature review of the project area • a description of the techniques in succinct mathematical terms • a description of what was done • the results obtained • relevant figures to explain your method and results • an analysis and critique of the results • suggestions for further improvements to the method • code (suitably commented), included in an appendix

  15. Practical #2 • Write a 2000 word literature review on one of the following topics: • Object class recognition • Face Detection (i.e. finding faces in images) • Facial Identity Recognition • Object Tracking • Representations of Shape in Computer Vision • Your literature review should include: • An overview of the history of the topic • An in-depth description of 3-4 critical papers of your choosing • A discussion of other papers and relation to these critical papers • A discussion of how success is quantified in this area, and a description of what you think the state of the art is • A discussion of the ways in which current methods are deficient • Suggestions for likely directions of future work in this area

  16. Website, Mail, Office Hours. Website: http://www.cs.ucl.ac.uk/staff/s.prince/4054.htm Mail: s.prince@cs.ucl.ac.uk (with 4054 in the subject line). Problems: I will be available on Monday afternoons, 3-5pm, in Room 5.06 if you have serious problems.

  17. What is computer vision? Computer vision is concerned with developing artificial systems that extract information from image or video data. Computer vision is an expanding academic field with increasing attendance at major conferences. There is also considerable interest from industry, with many major players (Intel, Mitsubishi, Microsoft etc.) establishing computer vision laboratories.

  18. Computer Vision Tasks. Tasks for computer vision might include: • Reconstruction - the attempt to build a three-dimensional model of the scene from one or more images • Camera Tracking - identifying the movement of the camera relative to the scene in a fixed image sequence • Object Detection - establishing that a certain type of object (e.g. a dog) is in the scene • Segmentation - establishing exactly which pixels belong to a certain object • Scene Parsing - establishing a complete understanding of a scene, where we have information about what object is at each pixel and how these objects occlude each other

  19. Computer Vision Tasks • Identity Recognition - having found an object (e.g. a face), draw inferences about whether it is a particular face • Image Enhancement - increase the resolution of the image (super-resolution), remove noise (denoising), or fill in missing areas (in-painting) • Generation - having learnt something about a type of object or scene, use the model to generate new images of this type of object • Object Tracking - apply a dynamic model to follow an object and monitor changes in its appearance over time • Object Description - establish characteristics of an object once we have found it, for example the sex, age or expression of a human face

  20. Applications of Computer Vision • Optical Character Recognition • Robotics • Security

  21. Applications of Computer Vision • Augmented Reality • Image Retrieval

  22. Applications of Computer Vision • Medical Image Analysis • Industrial Inspection • Model Building

  23. Applications of Computer Vision • Autonomous Vehicles • Military Applications

  24. Relationship with other fields Also a very close relationship with graphics – vision is inverse graphics!

  25. Brief History of Computer Vision • 1970s: the lack of computing power dictated the algorithms; emphasis on low-level vision and binary images • 1980s: close relationship with researchers investigating animal vision • 1990s: geometry of multiple views of the same scene, and estimation of camera pose and scene geometry

  26. Brief History of Computer Vision 2000s: several trends have emerged • Machine learning and vision have grown much closer • Discrete optimization techniques have found widespread use • Much more emphasis on benchmarking and quantitative evaluation • Trend towards larger training datasets

  27. Why is computer vision hard? 1. Dimensionality of Input Space. Consider an RGB image at VGA (640x480) resolution with 8-bit intensity resolution per channel (256 levels). The total number of possible images is 256^(640x480x3), or roughly 10^2,200,000. Conclusion: Almost none of the possible images have ever been seen!
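The size of the input space can be checked by working in logarithms, since the count itself is far too large to write out. The following is a minimal sketch; the function name and its parameters are illustrative and not part of the course materials.

```python
import math

def log10_image_count(width, height, channels=3, bits_per_channel=8):
    """Return log10 of the number of distinct images of the given format.

    Each of the width*height*channels values can take 2**bits_per_channel
    settings, so the count is 2**(total bits). We return its base-10 log.
    """
    total_bits = width * height * channels * bits_per_channel
    return total_bits * math.log10(2)

# VGA, RGB, 8 bits per channel (the case on the slide).
exponent = log10_image_count(640, 480)
print(f"Number of possible images is about 10^{exponent:,.0f}")
```

For the VGA case the exponent comes out at roughly 2.2 million, so the count has over two million decimal digits.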

  28. Why is computer vision hard? 2. Vision is an inverse problem. The mapping from scenes to images is many-to-one, so the data we receive do not uniquely determine the scene. How can we possibly establish what is out there? 3. Speed. A VGA camera at 30 Hz delivers about 27 MB of data per second. How can we process all this?
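The data-rate figure follows from simple arithmetic, assuming uncompressed RGB frames with one byte per colour channel. A small sketch (the helper name is hypothetical):

```python
def raw_video_rate(width, height, channels=3, bytes_per_channel=1, fps=30):
    """Return the uncompressed video data rate in bytes per second."""
    bytes_per_frame = width * height * channels * bytes_per_channel
    return bytes_per_frame * fps

# VGA colour video at 30 frames per second.
rate = raw_video_rate(640, 480)
print(f"{rate:,} bytes/s, i.e. about {rate / 1e6:.1f} MB/s")

# Time available to process each frame if we want to keep up with the camera.
frame_budget_ms = 1000 / 30
print(f"Per-frame processing budget: {frame_budget_ms:.1f} ms")
```

Any algorithm that is to run in real time has to touch each of the 921,600 values in a frame within that per-frame budget.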

  29. Some Mitigating Factors • We know a lot about the generative model (graphics) • We usually have a lot of prior knowledge about what we expect to see in the image (helps with non-uniqueness) • There is a large amount of training data readily available.

  30. Course Overview • 1. Geometry of a Single Camera • Image transformations • How do 3D points project to pixels? • Special cases of imaging [Figures: Augmented Reality Tracking; Image Mosaicing]
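As a taste of how 3D points project to pixels, here is a minimal sketch of the ideal pinhole camera model. It assumes the point is already expressed in camera coordinates, a single focal length in pixel units, and no lens distortion; the function name and the numerical values are illustrative, not taken from the course notes.

```python
def project_point(point_3d, focal_length, principal_point):
    """Project a 3D point (camera coordinates) to pixel coordinates.

    Pinhole model: u = f * X / Z + cx, v = f * Y / Z + cy.
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    cx, cy = principal_point
    u = focal_length * x / z + cx
    v = focal_length * y / z + cy
    return u, v

# A point 2 units in front of the camera, seen by a VGA camera whose
# principal point is assumed to be at the image centre.
pixel = project_point((0.5, -0.25, 2.0), focal_length=500.0,
                      principal_point=(320.0, 240.0))
print(pixel)  # (445.0, 177.5)
```

The division by Z is what makes the mapping many-to-one: every point along the same ray through the camera centre lands on the same pixel.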

  31. Course Overview • 2. Geometry of multiple cameras • Stereo vision • Epipolar geometry • Finding and matching distinctive keypoints • Shape from silhouette [Figures: Models from Sparse Stereo Vision; Shape from Silhouette]

  32. Course Overview • 3. Inference at Individual Pixels • Generative approach • Parametric vs. non-parametric models • Mixture Models [Figure: Colour-Based Segmentation]

  33. Course Overview • 4. Markov Random Fields: Connecting Pixels • MCMC Methods to solve MRFs • Exact MAP Inference in MRFs (graph cuts) • Binary vs. Multi-label cases [Figure: Dense stereo vision]

  34. Course Overview • 5. Models of Texture • Models of small (~ 5x5) patches of pixels • Repairing natural images • Texture synthesis [Figure: Image In-painting]

  35. Course Overview • 6. Models of Objects • Model larger regions of the image • Generative models for pixel covariance • Factor analysers, mixtures of factor analysers [Figure: Face Detection]

  36. Course Overview • 7. Sparse Models of Objects • Model only sparse, but distinctive features • Bag of words model • Constellation models [Figure: Object Class Recognition]

  37. Course Overview • 8. Face Recognition • Subspace models for recognition • Within- and between-individual variance • Recognition across pose [Figure: Face Models; panels m+2F(:,1), m+2F(:,2), m+2F(:,3) vs. m+2G(:,1), m+2G(:,2), m+2G(:,3)]

  38. Course Overview • 9. Models of Shape • Point distribution model • Active Shape Models • Active Appearance Models [Figures: Active Appearance Models; Active Shape Models]

  39. Course Overview • 10. Tracking • The Kalman Filter • Extensions of the Kalman Filter • Particle Filtering [Figure: Examples of Tracking Objects]
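To give a flavour of the tracking material, the sketch below shows a scalar Kalman filter under a random-walk motion model. This is an illustrative simplification chosen here, not the formulation from the lectures: the state is a single number, and all names and noise values are assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    Returns a list of (state estimate, estimate variance) pairs,
    one per measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append((x, p))
    return estimates

# Repeated noiseless readings of a constant value of 1.0 (hypothetical data).
estimates = kalman_1d([1.0] * 50, process_var=0.0, measurement_var=1.0)
print(estimates[0], estimates[-1])
```

With these settings the estimate moves from the prior towards the measured value, and the variance shrinks with every update. Setting process_var above zero makes the filter trust new measurements more.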
