1 / 30

Machine learning & object recognition

Machine learning & object recognition. Cordelia Schmid Jakob Verbeek. Content of the course. Visual object recognition Machine learning. Visual recognition - Objectives. Retrieval of particular objects and scenes Accuracy and scalability to large databases. …. person. glass.

karaburke
Télécharger la présentation

Machine learning & object recognition

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Machine learning & object recognition Cordelia Schmid Jakob Verbeek

  2. Content of the course • Visual object recognition • Machine learning

  3. Visual recognition - Objectives • Retrieval of particular objects and scenes • Accuracy and scalability to large databases …

  4. person glass drinking indoors Visual object recognition - Objectives • Detection of object categories • is there a … in this picture • More generally: relevance of labels (action, place, ...)

  5. Visual recognition - Objectives • Localization of object categories • where are the … in this image • Predict bounding boxes around category instances

  6. Visual recognition - Objectives • Semantic segmentation of (object) categories • Which pixels correspond to …. • Possibly identifying different category instances

  7. Visual recognition - Objectives • Human pose estimation • Self-occlusion and clutter

  8. Visual recognition - Objectives • Human action recognition in video • Interaction of people and objects, temporal dynamics

  9. Visual recognition - Objectives • Human action action localization in time, or space-time

  10. Visual recognition - Objectives • Image captioning: Given an image produce a natural language sentence description of the image content

  11. Difficulties: within object variations Variability: Camera position, Illumination,Internal parameters Within-object variations

  12. Difficulties: within-class variations

  13. Visual recognition • Robust image description • Appropriate descriptors for objects and categories • Statistical modeling and machine learning • Capture the variability within and between specific object and scenes and the same for categories

  14. Robust image description • Scale and affine-invariant keypoint detectors • Robust keypoint descriptors

  15. Robust image description • Matching despite significant viewpoint changes

  16. Why machine learning? • Early approaches: simple features + handcrafted models • Can handle only few images, simple tasks L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

  17. Why machine learning? • Early approaches: manual programming of rules • Tedious, limited and does not take into accout the data Y. Ohta, T. Kanade, and T. Sakai, “An Analysis System for Scenes Containing objects with Substructures,” International Joint Conference on Pattern Recognition, 1978.

  18. Internet images, personal photo albums Movies, news, sports Why machine learning? • Today: Lots of data, complex tasks

  19. Medical and scientific images Surveillance and security Why machine learning? • Today: Lots of data, complex tasks

  20. Why machine learning? • Today: Lots of data, complex tasks • Instead of trying to encode rules directly, learn them from examples of inputs and desired outputs

  21. Types of learning problems • Supervised • Classification • Regression • Unsupervised • Clustering • Manifold learning • Semi-supervised • Reinforcement learning • Active learning • ….

  22. Supervised learning • Given training examples of inputs and corresponding outputs, produce the “correct” outputs for new inputs • Two main scenarios: • Classification: outputs are discrete variables (category labels). Learn a decision boundary that separates one class from the other (separate images with and without cars in them) • Regression: also known as “curve fitting” or “function approximation.” Learn a continuous input-output mapping from examples (estimate the human pose parameters given an image)

  23. Unsupervised Learning • Given only unlabeled data as input, learn some sort of structure from the data • The objective is often more vague or subjective than in supervised learning • This is more of an exploratory/descriptive data analysis

  24. Unsupervised Learning • Clustering: Discover groups of “similar” data points

  25. Unsupervised Learning • Dimensionality reduction, manifold learning • Discover a lower-dimensional surface on which the data lives

  26. Unsupervised Learning • Density estimation • Find a function that approximates the probability density of the data (i.e., value of the function is high for “typical” points and low for “atypical” points) • Can be used for anomaly detection

  27. Other types of learning • Semi-supervised learning: lots of data is available, but only small portion is labeled (e.g. since labeling is expensive) • Why is learning from labeled and unlabeled data better than learning from labeled data alone? ?

  28. Other types of learning • Active learning: the learning algorithm can choose its own training examples, or ask a “teacher” for an answer on selected inputs

  29. Master Internships • Internships are available in the LEAR group • Action localization (C. Schmid) • Image captioning (J. Verbeek) • If you are interested send an email to us

  30. Image captioning • Given an image produce a natural language sentence description of the image content

More Related