This paper presents innovative methods for estimating the position of objects in images, focusing on dynamic elements termed "Supporters". By learning motion couplings, we're able to infer the position of occluded or rapidly changing objects. The approach involves extracting local image features and using probabilistic models to predict object locations. Our implementation and experimental results highlight the effectiveness of Supporters in both general and medical applications, demonstrating improved recall and precision in object tracking scenarios.
Tracking the Invisible: Learning Where the Object Might Be Helmut Grabner1, Jiri Matas2, Luc Van Gool1,3, Philippe Cattin4 1ETH Zurich, 2Czech Technical University, Prague, 3K.U. Leuven, 4University of Basel
Outline • 1. Introduction • 2. Learning the Motion Couplings • 3. Practical Implementation • 4. Experimental Results • 5. Conclusion
Outline • 1. Introduction
Introduction • Goal: Estimate the Position of an Object
Introduction • Many temporary, but potentially very strong, links exist between a tracked object and parts of the image. • We discover the dynamic elements – called SUPPORTERS – that predict the position of the target. [L. Cerman, J. Matas, V. Hlavac, Sputnik Tracker, SCIA, 2009] [M. Yang, Y. Wu, G. Hua, Context-Aware Visual Tracking, PAMI, 2009]
SUPPORTERS • 1. Come with different strengths. • 2. Change over time. • SUPPORTERS help in tracking objects that change their appearance very quickly, are occluded, or move outside the image.
Tracking Carl • Given the position of the object of interest in the frame. • Local image features are extracted from the whole image (yellow points). Image features: our implementation uses Harris points [5] and describes them with a SIFT-inspired descriptor [10].
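As a rough illustration of the feature-extraction step, here is a minimal pure-NumPy Harris response; this is a simplified stand-in for the detector cited as [5], and the SIFT-inspired descriptor of [10] is not reproduced:

```python
import numpy as np

def harris_response(img, k=0.04):
    """Per-pixel Harris corner response R = det(M) - k * trace(M)^2."""
    img = img.astype(float)
    Iy, Ix = np.gradient(img)                      # image gradients
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy

    def smooth(a):
        # 3x3 window sum: a cheap substitute for Gaussian weighting.
        s = np.zeros_like(a)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                s += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
        return s

    Sxx, Syy, Sxy = smooth(Ixx), smooth(Iyy), smooth(Ixy)
    return Sxx * Syy - Sxy * Sxy - k * (Sxx + Syy) ** 2
```

Keypoints (the yellow points) would then be taken as local maxima of R above a threshold.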
Local Image Features as Supporters • These image features are usually divided into object points and points belonging to the background.
Local Image Features as Supporters • Object points lie on the object surface and thus always have a strong correlation with the object motion (green points).
Local Image Features as Supporters • When the object gets occluded or changes its appearance, the supporters can help infer its position.
Outline • 2. Learning the Motion Couplings
Problem Formulation • We want to learn a model of P(x|I), predicting the position x of an object in the image I.
Implicit Shape Model – Features • After training from a large database of labeled images, the model can be used to detect objects from that object class in test images. • Indicator functions P(f|I)
Implicit Shape Model – Object Displacement • Each feature subsequently casts probabilistic votes for possible object positions, where the hypothesis score is obtained as a sum over all votes. • P(x|f): votes for the object position x.
Voting • Each feature f casts a Gaussian vote for the object position: P(x|f) = N(x; μ, Σ), where μ is the mean and Σ is the covariance matrix.
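The voting step can be sketched as accumulating Gaussian votes from all supporters on a score map and taking the maximum. This is a hedged illustration, not the paper's code; `predict_position` and the toy supporter tuples are our assumptions:

```python
import numpy as np

def gaussian_vote(xs, ys, mean, cov):
    # Evaluate a 2-D Gaussian N(x; mean, cov) on the voting grid.
    inv = np.linalg.inv(cov)
    dx, dy = xs - mean[0], ys - mean[1]
    m = inv[0, 0] * dx * dx + 2 * inv[0, 1] * dx * dy + inv[1, 1] * dy * dy
    return np.exp(-0.5 * m) / (2 * np.pi * np.sqrt(np.linalg.det(cov)))

def predict_position(supporters, shape):
    """Sum all P(x|f) votes; the hypothesis score is the sum over all votes.

    supporters: list of (feature position (x, y), voting vector mu, covariance Sigma).
    Returns the (x, y) position with the highest score and the score map.
    """
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    score = np.zeros(shape)
    for (px, py), mu, cov in supporters:
        score += gaussian_vote(xs, ys, (px + mu[0], py + mu[1]), cov)
    iy, ix = np.unravel_index(int(np.argmax(score)), shape)
    return (ix, iy), score
```

A supporter with an uncertain coupling simply contributes a flatter Gaussian, so it influences the maximum less than a reliable one.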
Reliable Information & Motion Coupling • Feature point x(i) = (x(i), y(i)) is associated with a feature f(i) and the position x* = (x*, y*) of the target. • The parameter α ϵ [0, 1] determines the weighting of the past estimates. • The current image It and the object position xt* are included using P(f|It), the indicator of feature f in the current image, and P(xt* − xt(i)|f), the current relative object position.
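The α-weighted blending of past and current estimates of a supporter's relative object position could look like this. This is a sketch under stated assumptions: the class name, the update form new = α·old + (1−α)·current, and the loose initial covariance prior are ours, not the paper's:

```python
import numpy as np

class MotionCoupling:
    """Running estimate of one supporter's voting vector with exponential forgetting.

    alpha weights the past estimates; each observed relative object
    position blends into the mean and covariance of the vote.
    """
    def __init__(self, offset, alpha=0.9):
        self.alpha = alpha
        self.mean = np.asarray(offset, dtype=float)   # mu: mean relative object position
        self.cov = np.eye(2) * 25.0                   # Sigma: loose initial prior (assumed)

    def update(self, offset):
        offset = np.asarray(offset, dtype=float)
        d = offset - self.mean
        # new = alpha * old + (1 - alpha) * current observation
        self.mean = self.alpha * self.mean + (1 - self.alpha) * offset
        self.cov = self.alpha * self.cov + (1 - self.alpha) * np.outer(d, d)
```

A supporter whose offsets keep agreeing ends up with a tight covariance (a strong, reliable coupling); an erratic one stays diffuse.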
Implementation • Store all detected feature points in a database DB. The entries (d, r, Ф, Σ) store the descriptor and the estimated voting vector, encoded by the distance r, the angle Ф, and the covariance matrix Σ.
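Encoding a voting vector by its radius and angle, as in the DB entries above, is a plain polar conversion; the helper names here are illustrative:

```python
import math

def encode_vote(dx, dy):
    """Encode a Cartesian voting vector as (distance r, angle phi) for a DB entry."""
    return math.hypot(dx, dy), math.atan2(dy, dx)

def decode_vote(r, phi):
    """Recover the Cartesian voting vector from its polar encoding."""
    return r * math.cos(phi), r * math.sin(phi)
```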
Updates • Adjust the voting vector: the radius r and the relative angle Ф. • Update the covariance matrix Σ.
Outline • 3. Practical Implementation
Outline • 4. Experimental Results
[11] S. Stalder, H. Grabner, and L. Van Gool. Beyond semi-supervised tracking: Tracking should be as simple as detection, but not simpler than recognition. In Proc. IEEE WS on On-line Learning for Computer Vision, 2009.
ETH-Cup: Improving Object Tracking • Recall: TP/(TP+FN) • Precision: TP/(TP+FP)
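These measures follow directly from the slide's definitions; the tiny helper below (name and zero-guard behavior are ours) makes them concrete:

```python
def recall_precision(tp, fp, fn):
    """Recall = TP/(TP+FN); Precision = TP/(TP+FP). Returns (recall, precision).

    Counts: tp = true positives, fp = false positives, fn = false negatives.
    Empty denominators are mapped to 0.0 by convention (an assumption).
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision
```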
Medical Application • (a) Manually labeled valve positions, (b–f) the found valve positions in 5 cardiac phases. As the confidence of the voting space was too low in cardiac phase 16, the valve positions there were interpolated (indicated by the dotted lines) from neighboring images.
Outline • 5. Conclusion
Conclusion • SUPPORTERS help in tracking objects that 1. change their appearance very quickly, or 2. are occluded or move outside the image.