On-the-Fly Person Retrieval System for Un-Annotated Videos

On-the-fly Specific Person Retrieval • Omkar M. Parkhi, Andrea Vedaldi and Andrew Zisserman • 24th May 2012 • University of Oxford

Overview • Input: a large collection of un-annotated videos • Search for people with textual queries, e.g. “Barack Obama”, “George Bush”, “Courteney Cox” • Output: ranked shots • On-the-fly, i.e. with no previous knowledge or model for these queries

Scrubs Data Set • 12 Episodes from Seasons 1-5 and 8 • 5 hours of video data • About 400k frames, partitioned into 5k shots • About 300k near frontal face detections • 768 x 576 MPEG2 format

Demo • Search for “Courteney Cox” in Scrubs dataset. • Steps: • Download example images from Google • Train a ranking function • Apply ranking function to video collection
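
The three demo steps can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slides only say "fast linear classifier", so plain logistic regression trained by gradient descent is swapped in here, the descriptors are random 16-dimensional stand-ins for the real face descriptors, and scoring a track by its best face is an assumption. The names `train_linear_ranker` and `rank_tracks` are hypothetical.

```python
import numpy as np

def train_linear_ranker(pos, neg, epochs=200, lr=0.1, reg=1e-3):
    """Fit a linear scoring function w.x + b with L2-regularised logistic loss.

    Assumption: logistic regression stands in for the unspecified
    "fast linear classifier" named on the slides.
    """
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(query person)
        w -= lr * (X.T @ (p - y) / len(y) + reg * w)
        b -= lr * np.mean(p - y)
    return w, b

def rank_tracks(track_descriptors, w, b):
    """Return track ids sorted best first.

    Assumption: a track is scored by its highest-scoring face.
    """
    scores = {tid: float(np.max(d @ w + b)) for tid, d in track_descriptors.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Synthetic stand-ins: "downloaded" positives, generic negatives, two face tracks.
rng = np.random.default_rng(0)
dim = 16  # toy dimension, far smaller than a real face descriptor
pos = rng.normal(loc=1.0, size=(20, dim))
neg = rng.normal(loc=-1.0, size=(200, dim))
w, b = train_linear_ranker(pos, neg)

tracks = {
    "track_a": rng.normal(loc=1.0, size=(5, dim)),   # resembles the positives
    "track_b": rng.normal(loc=-1.0, size=(5, dim)),  # resembles the negatives
}
ranking = rank_tracks(tracks, w, b)
print(ranking)
```

Because only the positives depend on the query, the negative descriptors and the video descriptors can be computed ahead of time, which is what keeps the query-time work small.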

DEMO

On-the-fly person retrieval system • OFF-LINE PROCESSING: Video Collection → Face Tracks → Facial Features & Descriptors; Negative Training Images → Facial Features & Descriptors • ON-LINE PROCESSING: Text Query “Courteney Cox” → Google Image Search “Courteney Cox” → Facial Features & Descriptors → Fast Linear Classifier → Ranking → Results

Detection and Tracking • Viola-Jones face detection on each frame • Tracking measures “connectedness” of a pair of faces by the point tracks intersecting both • Doesn’t require contiguous detections • No drift • Faces clustered into tracks • [Everingham et al. 2006, Apostoloff & Zisserman 2007]
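
One way to picture the "connectedness" measure is the sketch below. The slide does not give a formula, so the ratio used here (point tracks through both faces divided by point tracks through either face) is an assumption, and the data layout and function names are hypothetical.

```python
from typing import Dict, Set, Tuple

Box = Tuple[float, float, float, float]      # (x_min, y_min, x_max, y_max)
PointTrack = Dict[int, Tuple[float, float]]  # frame index -> (x, y)
Face = Tuple[int, Box]                       # (frame index, bounding box)

def tracks_through(face: Face, point_tracks: Dict[int, PointTrack]) -> Set[int]:
    """Ids of point tracks whose position in the face's frame lies inside its box."""
    frame, (x0, y0, x1, y1) = face
    hits = set()
    for tid, track in point_tracks.items():
        if frame in track:
            x, y = track[frame]
            if x0 <= x <= x1 and y0 <= y <= y1:
                hits.add(tid)
    return hits

def connectedness(face_a: Face, face_b: Face,
                  point_tracks: Dict[int, PointTrack]) -> float:
    """Assumed measure: shared point tracks / point tracks touching either face."""
    a = tracks_through(face_a, point_tracks)
    b = tracks_through(face_b, point_tracks)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy data: detections exist in frames 0 and 2 only (frame 1 has no detection).
point_tracks = {
    0: {0: (10.0, 10.0), 2: (12.0, 10.0)},    # passes through both faces
    1: {0: (15.0, 15.0), 2: (17.0, 15.0)},    # passes through both faces
    2: {0: (100.0, 100.0), 2: (100.0, 100.0)},  # background point
    3: {2: (14.0, 12.0)},                     # starts late, only in second face
}
face_a = (0, (5.0, 5.0, 20.0, 20.0))
face_b = (2, (7.0, 5.0, 22.0, 20.0))
score = connectedness(face_a, face_b, point_tracks)
print(round(score, 3))
```

The two detections are two frames apart, which illustrates why the approach does not need contiguous detections: the point tracks bridge the gap.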

Scrubs Data Set • 12 Episodes from Seasons 1-5 and 8 • 5 hours of video data • About 400k frames, partitioned into 5k shots • 300k face detections • 6k face tracks • 768 x 576 MPEG2 format

Detecting facial feature points • Pictorial structure model • Joint model of feature appearance and position • [Felzenszwalb and Huttenlocher 2004, Everingham et al. 2006]

Face Appearance Representation • Affine transformation of face to canonical frame • Independent photometric normalization of parts • Represent gradients over circle centred on facial feature points • Feature descriptor is a 3849-dimensional vector • [Everingham et al. 2006]
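
The first bullet, mapping a face to a canonical frame, can be illustrated with a least-squares affine fit between detected feature points and fixed canonical positions. This is a sketch under stated assumptions: the canonical coordinates, the four-point layout and the function names are invented for illustration, and the slides do not say how the transformation is estimated.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine A with dst ~= A @ [x, y, 1] for each point."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    homog = np.hstack([src, np.ones((len(src), 1))])   # shape (n, 3)
    sol, *_ = np.linalg.lstsq(homog, dst, rcond=None)  # shape (3, 2)
    return sol.T                                       # shape (2, 3)

def apply_affine(A, pts):
    """Apply a 2x3 affine transform to an (n, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    return pts @ A[:, :2].T + A[:, 2]

# Hypothetical canonical positions: two eyes, nose tip, mouth centre.
canonical = np.array([[30.0, 30.0], [70.0, 30.0], [50.0, 55.0], [50.0, 75.0]])

# Simulate a detected face: the canonical layout rotated, scaled and shifted.
theta = np.deg2rad(12.0)
rot = 1.3 * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
detected = canonical @ rot.T + np.array([120.0, 45.0])

A = fit_affine(detected, canonical)   # maps image coordinates to the canonical frame
aligned = apply_affine(A, detected)
print(np.round(aligned, 3))
```

With noisy real detections the fit would not be exact; the least-squares solution then gives the affine map that minimises the squared distance to the canonical positions.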

Negative Training Images • Combination of faces from • Random downloaded images • Labeled Faces in the Wild dataset • Caltech Faces dataset • About 16k face detections • Caltech 10,000 Web Faces: http://www.vision.caltech.edu/Image_Datasets/Caltech_10K_WebFaces/ • Labeled Faces in the Wild: http://vis-www.cs.umass.edu/lfw/

On-the-fly Person Retrieval • OFF-LINE PROCESSING: Video Collection → Face Tracks → Facial Features & Descriptors; Negative Training Images → Facial Features & Descriptors • ON-LINE PROCESSING: Text Query “Courteney Cox” → Google Image Search “Courteney Cox” → Facial Features & Descriptors → Fast Linear Classifier → Ranking → Results

DEMO

TRECVid 2011 (IACC.1.B) • About 200 hours of video data • 8k videos • MPEG4, 320 x 240 pixels • 130k shots • About 3 million face detections • 25,535 face tracks

DEMO

Facial attributes – FaceTracer project • Examples: • gender: male, female • age: baby, child, youth, middle age, senior • race: white, black, asian • smiling, mustache, eye-wear, hair colour • Method: • person-independent training set for each attribute • facial feature representation • discriminative training of a classifier for each attribute • Reference: N. Kumar, P. N. Belhumeur and S. K. Nayar, “FaceTracer: A Search Engine for Large Collections of Images with Faces”, European Conference on Computer Vision (ECCV), 2008. http://www.cs.columbia.edu/CAVE/projects/face_search/

DEMO

Quantitative Performance - Scrubs Dataset • Performance evaluation for 3 guest actors (Brendan Fraser, Courteney Cox and Michael J. Fox) • 12 dataset videos split into training and test sets (3 training, 9 testing) • Annotations: • Manual labeling of training and test set for each actor • Manual labeling of positive training images from Google • Negative training images from Caltech Faces dataset

Quantitative Performance - Scrubs Dataset • Retrieval Average Precision (AP)
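
For reference, retrieval Average Precision over a ranked list can be computed as in the sketch below. The slides do not state which AP variant was used, so this is the common non-interpolated form, and it assumes every relevant item appears somewhere in the ranked list. The function name and the toy labels are illustrative only.

```python
def average_precision(ranked_labels):
    """Non-interpolated AP: mean of precision@k over the ranks k of relevant items.

    ranked_labels: iterable of truthy/falsy values, best-ranked item first.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_labels, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0

# Toy ranking of five face tracks: 1 = shows the queried actor, 0 = does not.
labels = [1, 0, 1, 1, 0]
ap = average_precision(labels)
print(round(ap, 4))
```

In this toy ranking the relevant tracks sit at ranks 1, 3 and 4, so AP is the mean of 1/1, 2/3 and 3/4.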

Quantitative Performance - Scrubs Dataset • Using more training data per track

Future Work • Exploring sources for positive examples • Better feature representations • Combination of attributes and identities

Any Questions?
