Visual Scene Understanding (CS 598)

Visual Scene Understanding (CS 598). Derek Hoiem. Course Number: 46411. Instructor: Derek Hoiem. Room: Siebel Center 1109. Class Time: Tuesday and Thursday 11:00am – 12:15pm. Office Hours: Tuesday and Thursday 12:15-1pm; by appointment. Contact: dhoiem@uiuc.edu, Siebel 3312.

Presentation Transcript


  1. Visual Scene Understanding (CS 598) Derek Hoiem Course Number: 46411 Instructor: Derek Hoiem Room:  Siebel Center 1109 Class Time:  Tuesday and Thursday 11:00am – 12:15pm Office Hours:  Tuesday and Thursday 12:15-1pm; by appointment Contact: dhoiem@uiuc.edu, Siebel 3312

  2. Today • Introductions • Overview of logistics • Overview of class material

  3. Vision: What is it good for? • Biological (Humans): items 1–10 (not preserved) • Technological (Computers): items 1–10 (not preserved) Note: Unfortunately, these got erased when my computer crashed

  4. Course Logistics

  5. Class Content Overview • Tutorials and Perspectives • Paper reading • Spatial Inference • Objects • Actions • Context and Integration

  6. Visual Scene Understanding Visual scene understanding is the ability to infer general principles and current situations from imagery in a way that helps achieve goals.

  10. I. Spatial Inference

  11. Getting Around

  14. Spatial Inference: applications • Automated Vehicles • Household Robots • Graphics Applications • Predict object size/position

  15. Spatial Inference: open questions • How do we represent space? • Surface orientations, depth maps, voxels? • How do we infer it from available sensory data (image, stereo, motion, laser range finder)?
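
To make the representation question on slide 15 concrete, here is a minimal sketch of how two of the listed representations relate: a depth map is back-projected into 3D points and then quantized into occupied voxels. The pinhole camera model, the parameter names (`focal`, `cx`, `cy`, `voxel_size`), and the toy depth values are assumptions for illustration, not something the slides specify.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def depth_map_to_points(depth: List[List[float]], focal: float,
                        cx: float, cy: float) -> List[Point]:
    """Back-project a depth map into camera-frame 3D points.

    Assumes a pinhole camera with focal length `focal` (in pixels) and
    principal point (cx, cy). Non-positive depths are treated as missing.
    """
    points = []
    for v, row in enumerate(depth):          # v: pixel row
        for u, z in enumerate(row):          # u: pixel column
            if z <= 0:
                continue
            x = (u - cx) * z / focal
            y = (v - cy) * z / focal
            points.append((x, y, z))
    return points


def points_to_voxels(points: List[Point], voxel_size: float) -> Set[Voxel]:
    """Quantize 3D points into the set of occupied voxel indices."""
    return {
        (math.floor(x / voxel_size),
         math.floor(y / voxel_size),
         math.floor(z / voxel_size))
        for x, y, z in points
    }


# Toy 2x2 depth map (hypothetical values); 0.0 marks a missing measurement.
toy_depth = [[2.0, 2.0],
             [0.0, 4.0]]
toy_points = depth_map_to_points(toy_depth, focal=1.0, cx=0.5, cy=0.5)
toy_voxels = points_to_voxels(toy_points, voxel_size=1.0)
```

The depth map is compact but tied to one viewpoint, while the voxel set is viewpoint-independent and its size depends on the chosen resolution. That trade-off is one reason the choice of representation remains an open question.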

  16. II. Objects

  17. Finding Things and Observing Them Image classification: Are there any dogs? Photo credit: iansand – flickr.com

  18. Finding Things and Observing Them Object Localization: Where are the dog(s)?

  19. Finding Things and Observing Them Verification: Is this a dog?

  20. Finding Things and Observing Them Description: Furry, small, nice, side view

  21. Finding Things and Observing Them Identification: My friend Sally?

  22. Recognizing Stuff • Sky • Water • Sand

  23. Object Recognition: applications • Photo Search • Security • Robots

  24. Object Recognition: open questions • How many examples does it take to learn one category well? • How many examples does it take to learn 100 categories well? • How do these answers depend on the level of supervision? • Can recognition be solved with simple methods and massive amounts of data? • How can we quickly recognize an object? • How can we scale up to deal with thousands of categories?

  25. III. Actions

  26. Taking Action [Saxena et al. 2008]

  27. Recognizing Actions KTH Dataset Figure from Laptev et al. 2008

  28. Recognizing Actions Figure from Laptev et al. 2008

  29. Reading Emotions Photo credit: Comstok

  30. Actions: applications • Video Search • Security

  31. Actions: open questions • How are actions defined? • Does it make sense to categorize them? • If not, how do we recognize them? • What are good visual representations for inferring actions? • How can we recognize activities?

  32. IV. Context and Integration [Hoiem et al. 2008]

  33. Context and Integration • Objects + scene categories → better detection • Movement + objects → action/activity recognition • Space + objects → navigation [Hoiem et al. 2008]

  34. Context and Integration: applications • Everything that vision is good for

  35. Context and Integration: open questions • Should context be explicit (e.g., “cars drive on the road”) or implicit (feature-based)? • How do we model and learn the interactions between different processes and scene characteristics? • How do we deal with the growing complexity as more and more pieces are put together?
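
As an illustration of what "explicit" context could look like, the sketch below rescales a detector's confidence using a hand-written prior on how likely an object is in a given scene type. This is a generic Bayesian update, not the method of Hoiem et al. 2008. The function name, the prior table, and all numbers are hypothetical, and it assumes the detector score is a probability calibrated under a 50/50 prior.

```python
def rescore_with_context(detector_prob: float, context_prior: float) -> float:
    """Combine a detector probability with an explicit context prior.

    Assumes `detector_prob` was calibrated under a uniform (0.5) prior,
    so it can be treated as a normalized likelihood and re-weighted by
    `context_prior` using Bayes' rule.
    """
    for name, value in (("detector_prob", detector_prob),
                        ("context_prior", context_prior)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    present = detector_prob * context_prior
    absent = (1.0 - detector_prob) * (1.0 - context_prior)
    if present + absent == 0.0:
        raise ValueError("detector and context are in total contradiction")
    return present / (present + absent)


# Hypothetical explicit rules of the "cars drive on the road" kind.
CAR_PRIOR_BY_SCENE = {"street": 0.9, "unknown": 0.5, "kitchen": 0.1}

car_scores = {
    scene: rescore_with_context(0.6, prior)
    for scene, prior in CAR_PRIOR_BY_SCENE.items()
}
```

With a neutral prior of 0.5 the score is unchanged. A supportive scene raises it and an implausible scene lowers it. An implicit, feature-based approach would instead learn this interaction from data rather than from a hand-written table.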

  36. General Problems in Computer Vision • Better understanding of limitations and their sources • Need new experimental paradigms • Improve generalization • Aim to generalize across datasets, categories, and tasks • Work on knowledge sharing and transfer • Vision as a way of learning about the world • Integration into AI • Systems that acquire knowledge over time

  37. Successes of Computer Vision • Point matching (e.g. 2d3) • Tracking • Structure from motion • Stitching • Product inspection • Multiview 3d reconstruction • Face recognition and modeling • Object recognition on pre-2000 datasets • Interactive segmentation (ongoing)

  38. To Do • Register on bulletin board • Post comments on Thursday's reading (due tomorrow) • Look over schedule and decide which days to present (due next Tues) • Start thinking about projects • Let me know if you want a specific pairing (due Tues)

  39. Questions?

  40. Goals • Make you a better researcher (esp. in vision) • More knowledge • Better critical thinking skills • Improved communication skills • Improved research skills

  41. Grades • Participation: 25% • Posting • Class discussion • Presentation: 25% • Projects: 50% • Proposal, progress report, final paper, and oral
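
The weights on slide 41 can be read as a weighted sum. The sketch below assumes each component is scored on a 0–100 scale and combined linearly; the slide gives only the percentages, so the scale, the function name, and the example scores are assumptions.

```python
# Weights taken from the Grades slide; they sum to 1.0.
GRADE_WEIGHTS = {
    "participation": 0.25,
    "presentation": 0.25,
    "projects": 0.50,
}


def course_score(component_scores: dict) -> float:
    """Return the weighted total of component scores (assumed 0-100 each)."""
    missing = set(GRADE_WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(GRADE_WEIGHTS[name] * component_scores[name]
               for name in GRADE_WEIGHTS)


# Hypothetical example scores.
example_total = course_score(
    {"participation": 80, "presentation": 90, "projects": 100}
)
```

Because projects carry half the weight, a 10-point change in the project score moves the total by 5 points, twice the effect of the same change in participation or presentation.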

  42. Policies • Attendance required (see syllabus) • Give credit where due • No formal prerequisites • Everything needs to be on time

  43. Reading • Read well • Post comments to bulletin board at least 24 hours before class

  44. Presentations • Presenter • Everyone does two • Good quality coverage of topic (40 min) • See syllabus for guidelines • Sign up by next Tuesday (at latest) • TBAs are your choice (decide at least 4 weeks in advance) • Demonstrator • If all days are taken, pair up • One person’s job will be to demonstrate some aspect of the algorithm (e.g., where it succeeds and fails) by running it on many examples • May require implementation • Note taker

  45. Projects • Timeline • Proposal: Feb 12 (3 ½ weeks!) • Progress report: Mar 19 • Presentation: paper May 5, oral later • Progress report • Presentation • Paper • Oral • In pairs • Can choose partner or be randomly paired • Suggestions on web • Potentially will lead to publication (e.g. NIPS)

  46. To Do • Register on bulletin board • Post comments on Thursday's reading (due tomorrow) • Look over schedule and decide which days to present (due next Tues) • Start thinking about projects • Let me know if you want a specific pairing (due Tues)

  47. Questions?
