EECS 442 – Computer vision

Databases for object recognition and beyond • Segments of this lecture are courtesy of Prof. F. Li, R. Fergus and A. Zisserman

Caltech 101 • Pictures of objects belonging to 101 categories. • About 40 to 800 images per category. Most categories have about 50 images. • The size of each image is roughly 300 x 200 pixels.

Caltech 101 images

Caltech-101: Drawbacks • Smallest category size is 31 images • Too easy? • Left-right aligned • Rotation artifacts • Saturated performance

Results up to 2007 -- recent methods obtain almost 100%

Caltech-256 • Smallest category size now 80 images • About 30K images • Harder • Not left-right aligned • No artifacts • Performance is halved • More categories • New and larger clutter category

Caltech 256 images: baseball-bat, dog, basketball-hoop, kayak, traffic-light

The PASCAL Visual Object Classes (VOC) Dataset and Challenge • Mark Everingham, Luc Van Gool, Chris Williams, John Winn, Andrew Zisserman

Dataset Content • 20 classes: aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, sofa, train, TV/monitor • Real images downloaded from Flickr, not filtered for “quality” • Complex scenes, scale, pose, lighting, occlusion, ...

Annotation • Complete annotation of all objects • Annotated in one session with written guidelines • Occluded: object is significantly occluded within the bounding box (BB) • Truncated: object extends beyond the BB • Difficult: not scored in evaluation • Pose: e.g. facing left
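
The per-object flags listed on this slide can be pictured as one small record per bounding box. The sketch below is only an illustration under stated assumptions: the class name `ObjectAnnotation`, its field names, the helper `scored_objects`, and the sample boxes are all hypothetical and are not taken from the VOC development kit. It shows one way the "difficult" flag could exclude objects from scoring.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ObjectAnnotation:
    """One annotated object. Field names are illustrative, not an official schema."""
    label: str                        # one of the 20 class names
    bbox: Tuple[int, int, int, int]   # (xmin, ymin, xmax, ymax) in pixels
    pose: str = "unspecified"         # e.g. "left" for an object facing left
    occluded: bool = False            # significantly occluded within the BB
    truncated: bool = False           # object extends beyond the BB
    difficult: bool = False           # not scored in evaluation


def scored_objects(objects: List[ObjectAnnotation]) -> List[ObjectAnnotation]:
    """Keep only the objects that would count in evaluation (hypothetical helper)."""
    return [obj for obj in objects if not obj.difficult]


# Made-up annotations for a single image, for illustration only.
image_objects = [
    ObjectAnnotation("horse", (48, 60, 310, 280), pose="left"),
    ObjectAnnotation("person", (120, 10, 200, 150), truncated=True),
    ObjectAnnotation("dog", (5, 200, 40, 230), occluded=True, difficult=True),
]

print([obj.label for obj in scored_objects(image_objects)])
```

Keeping "difficult" as a flag rather than deleting such objects means the annotation stays complete while the evaluation can still skip them.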

Examples Aeroplane Bicycle Bird Boat Bottle Bus Car Cat Chair Cow

History • New dataset annotated annually • Annotation of test set is withheld until after challenge

Other recent datasets • ESP [von Ahn et al., 2006] • LabelMe [Russell et al., 2005] • TinyImage [Torralba et al., 2007] • Lotus Hill [Yao et al., 2007] • MSRC [Shotton et al., 2006]

3D object dataset [Savarese & Fei-Fei 07]

Poses: 72 • 8 azimuth angles • 3 zenith angles • 3 distances • Instances: 1 to 10 • ~7000 images!
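
The pose count on this slide is the product of the three viewing parameters, which a short enumeration makes explicit. This is a sketch under assumptions: the slide gives only the counts (8, 3, 3), so the evenly spaced azimuth values and the level names below are placeholders, and the number of categories is not stated on the slide but only inferred from the arithmetic.

```python
from itertools import product

# Assumption: 8 evenly spaced azimuths; the slide gives only the count.
AZIMUTHS_DEG = [i * 45 for i in range(8)]
# Assumption: placeholder names for the 3 zenith angles and 3 distances.
ZENITH_LEVELS = ["low", "middle", "high"]
DISTANCE_LEVELS = ["near", "medium", "far"]

# Every (azimuth, zenith, distance) combination is one pose.
poses = list(product(AZIMUTHS_DEG, ZENITH_LEVELS, DISTANCE_LEVELS))

INSTANCES = 10                               # instances 1 to 10, from the slide
images_per_category = len(poses) * INSTANCES

# Inferred, not stated on the slide: how many categories ~7000 images implies.
implied_categories = 7000 / images_per_category

print(len(poses), images_per_category, round(implied_categories, 1))
# prints: 72 720 9.7
```

Under these assumptions, roughly ten categories with ten instances each would account for the quoted total of about 7000 images.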

ImageNet: the largest dataset for object categories to date • J. Deng, H. Su, K. Li, L. Fei-Fei • ~20K categories • 14 million images • ~700 images per category • Free to the public at www.image-net.org

http://www.image-net.org

ImageNet is a knowledge ontology • Taxonomy • Partonomy • The “social network” of visual concepts • Hidden knowledge and structure among visual concepts • Prior knowledge • Context
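
The two relation types named on this slide can be pictured as two small lookup tables: a taxonomy stores "is-a" links and a partonomy stores "part-of" links. The sketch below is a toy illustration only. The concept names, the dictionaries `IS_A` and `PART_OF`, and the helper `ancestors` are hypothetical and do not reproduce the actual hierarchy or any API of the dataset.

```python
# Toy taxonomy: child concept -> parent concept ("is-a").
IS_A = {
    "husky": "dog",
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
}

# Toy partonomy: part -> whole ("part-of").
PART_OF = {
    "wheel": "car",
    "tail": "dog",
}


def ancestors(concept, is_a=IS_A):
    """Walk up the is-a links and return every more general concept, nearest first."""
    chain = []
    while concept in is_a:
        concept = is_a[concept]
        chain.append(concept)
    return chain


print(ancestors("husky"))   # ['dog', 'mammal', 'animal']
print(PART_OF["tail"])      # dog
```

Walking up the is-a chain is one way such a structure could supply prior knowledge: an image labelled with a specific concept is implicitly also an example of each of its ancestors.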

More Datasets… • UIUC Cars (2004): S. Agarwal, A. Awan, D. Roth • CMU/VASC Faces (1998): H. Rowley, S. Baluja, T. Kanade • FERET Faces (1998): P. Phillips, H. Wechsler, J. Huang, P. Rauss • COIL Objects (1996): S. Nene, S. Nayar, H. Murase • MNIST digits (1998-10): Y. LeCun & C. Cortes • KTH human action (2004): I. Laptev & B. Caputo • Sign Language (2008): P. Buehler, M. Everingham, A. Zisserman • Segmentation (2001): D. Martin, C. Fowlkes, D. Tal, J. Malik • 3D Textures (2005): S. Lazebnik, C. Schmid, J. Ponce • CUReT Textures (1999): K. Dana, B. Van Ginneken, S. Nayar, J. Koenderink • CAVIAR Tracking (2005): R. Fisher, J. Santos-Victor, J. Crowley • Middlebury Stereo (2002): D. Scharstein, R. Szeliski

Links to datasets • The next tables summarize some of the available datasets for training and testing object detection and recognition algorithms. These lists are far from exhaustive. • Databases for object localization • Databases for object recognition • On-line annotation tools • Collections
