Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words
Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words. Analysis and Recognition of Video Data. Tamir Nuriel.
Interest Points Detector • Gaussian smoothing in the spatial dimensions. • Gabor filters in the temporal dimension. • Interest points are taken at local maxima of the filter response. • Extract a spatial-temporal cube around each interest point.
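The detector steps above can be sketched in NumPy. This is a minimal illustration, not the presenter's implementation: it assumes the video is a grayscale array with axes (t, y, x), that the temporal filters form a quadrature pair of Gabor filters whose squared outputs are summed, and that interest points are strict local maxima of that response. The parameter values (`sigma`, `tau`, `omega`) and the cuboid half-size are hypothetical defaults chosen only for the example.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel, normalised to sum to 1."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    return kernel / kernel.sum()

def gabor_pair(tau, omega):
    """Even and odd 1D Gabor filters (assumed form: Gaussian envelope, frequency omega)."""
    radius = int(np.ceil(3 * tau))
    t = np.arange(-radius, radius + 1)
    envelope = np.exp(-t**2 / tau**2)
    h_even = -np.cos(2 * np.pi * omega * t) * envelope
    h_odd = -np.sin(2 * np.pi * omega * t) * envelope
    return h_even, h_odd

def convolve_axis(volume, kernel, axis):
    """Convolve every 1D slice along `axis` with `kernel`, keeping the input size."""
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), axis, volume)

def detector_response(video, sigma=1.5, tau=2.0, omega=0.25):
    """Gaussian smoothing over (y, x), Gabor filtering over t, then summed squares."""
    g = gaussian_kernel(sigma)
    h_even, h_odd = gabor_pair(tau, omega)
    smoothed = convolve_axis(convolve_axis(video.astype(float), g, 1), g, 2)
    even = convolve_axis(smoothed, h_even, 0)
    odd = convolve_axis(smoothed, h_odd, 0)
    return even**2 + odd**2

def local_maxima(response, n):
    """Return up to n (t, y, x) strict local maxima, strongest first."""
    T, H, W = response.shape
    padded = np.pad(response, 1, mode="constant", constant_values=-np.inf)
    is_max = np.ones(response.shape, dtype=bool)
    for dt in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dt == dy == dx == 0:
                    continue
                neighbour = padded[1 + dt:1 + dt + T, 1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
                is_max &= response > neighbour
    coords = np.argwhere(is_max)
    order = np.argsort(response[is_max])[::-1][:n]
    return coords[order]

def extract_cuboid(video, point, half_size=(3, 4, 4)):
    """Cut a cube centred on `point`; borders are handled by edge padding."""
    ht, hy, hx = half_size
    padded = np.pad(video, ((ht, ht), (hy, hy), (hx, hx)), mode="edge")
    t, y, x = point
    return padded[t:t + 2 * ht + 1, y:y + 2 * hy + 1, x:x + 2 * hx + 1]
```

A typical call would be `points = local_maxima(detector_response(video), 200)` followed by `extract_cuboid(video, p)` for each point. The inputs must be at least as long as the filter kernels along each axis, because `np.convolve(..., mode="same")` otherwise returns the kernel length.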
Descriptor • Brightness gradients in the x, y and t directions. • The computed gradients are concatenated to form a vector, which is then projected to a lower-dimensional space using principal component analysis (PCA). • Alternative: instead of reducing the dimension with PCA, use a histogram of the gradients in each direction.
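Both descriptor variants can be sketched briefly. The code below assumes cuboids are arrays with axes (t, y, x) and brightness scaled to [0, 1]. PCA is done with a plain SVD, and the number of components, the number of histogram bins and the histogram range are hypothetical choices not stated on the slide.

```python
import numpy as np

def gradient_descriptor(cuboid):
    """Concatenate the brightness gradients along x, y and t into one vector."""
    gt, gy, gx = np.gradient(cuboid.astype(float))
    return np.concatenate([gx.ravel(), gy.ravel(), gt.ravel()])

def fit_pca(descriptors, n_components):
    """Return (mean, basis); basis rows are the leading principal directions."""
    mean = descriptors.mean(axis=0)
    _, _, vt = np.linalg.svd(descriptors - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(descriptors, mean, basis):
    """Project centred descriptors onto the PCA basis."""
    return (descriptors - mean) @ basis.T

def gradient_histograms(cuboid, bins=8, value_range=(-1.0, 1.0)):
    """Alternative descriptor: one normalised histogram per gradient direction."""
    gt, gy, gx = np.gradient(cuboid.astype(float))
    hists = []
    for g in (gx, gy, gt):
        h, _ = np.histogram(g, bins=bins, range=value_range)
        hists.append(h / h.sum())
    return np.concatenate(hists)
```

The PCA basis would normally be fitted once on descriptors pooled from the training videos and then reused for every new cuboid, so that all descriptors live in the same low-dimensional space.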
Codebook Formation • The codebook is constructed by clustering the descriptors with the k-means algorithm, using Euclidean distance as the clustering metric. • The center of each resulting cluster is defined to be a spatial-temporal codeword.
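As a rough sketch of codebook formation, the following implements plain Lloyd-style k-means with Euclidean distance and then turns the descriptors of one video into a histogram of codeword counts. The random initialisation, iteration cap and function names are assumptions made for the example.

```python
import numpy as np

def assign_codewords(descriptors, centers):
    """Index of the nearest center (squared Euclidean distance) for each descriptor."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans_codebook(descriptors, k, n_iter=50, seed=0):
    """Cluster descriptors with k-means; the returned centers are the codewords."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[start].astype(float)
    for _ in range(n_iter):
        labels = assign_codewords(descriptors, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers

def word_histogram(descriptors, centers):
    """Bag-of-words vector for one video: how often each codeword occurs."""
    labels = assign_codewords(descriptors, centers)
    return np.bincount(labels, minlength=len(centers))
```

Stacking `word_histogram` outputs for all videos gives the video-by-codeword count matrix that a topic model such as pLSA takes as input.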
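Learning the Action Models by pLSA • Maximizing the likelihood $\prod_i \prod_j P(w_i \mid d_j)^{n(w_i, d_j)}$, where $P(w_i \mid d_j) = \sum_k P(w_i \mid z_k)\, P(z_k \mid d_j)$ and $n(w_i, d_j)$ is the number of times codeword $w_i$ occurs in video $d_j$. • E-step: $P(z_k \mid w_i, d_j) \propto P(w_i \mid z_k)\, P(z_k \mid d_j)$. • M-step: $P(w_i \mid z_k) \propto \sum_j n(w_i, d_j)\, P(z_k \mid w_i, d_j)$ and $P(z_k \mid d_j) \propto \sum_i n(w_i, d_j)\, P(z_k \mid w_i, d_j)$.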
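The E-step and M-step named above can be illustrated with the standard pLSA updates. The sketch assumes a count matrix with one row per video and one column per codeword. The random initialisation, the iteration count and the small `eps` guard against division by zero are choices made for the example, not details from the slide.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=100, seed=0, eps=1e-12):
    """Fit pLSA by EM. counts[d, w] = n(w, d). Returns P(w|z) and P(z|d)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(w|z) * P(z|d); shape (d, z, w)
        post = p_z_d[:, :, None] * p_w_z[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + eps
        # M-step: re-estimate both distributions from the expected counts
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_w_z, p_z_d

def log_likelihood(counts, p_w_z, p_z_d, eps=1e-12):
    """Sum over d, w of n(w, d) * log P(w|d), with P(w|d) = sum_z P(w|z) P(z|d)."""
    p_w_d = p_z_d @ p_w_z
    return float((counts * np.log(p_w_d + eps)).sum())
```

One plausible way to read off a category for each video is `p_z_d.argmax(axis=1)`, which picks the latent topic with the highest probability for that video. Since the topics are learned without labels, mapping topic indices to action names still has to be done separately.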
Experimental results • Example patches from different actions in the KTH dataset (figure not reproduced in the transcript).
Experimental results • Marking patches in video
Experimental results • Confusion Matrix
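For readers unfamiliar with the term, a confusion matrix counts how often each true action class is predicted as each class. The helper below is a generic sketch, and integer class labels are assumed. It does not reproduce the results shown on the slide.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes):
    """cm[i, j] = number of samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(true_labels), np.asarray(predicted_labels)), 1)
    return cm

def row_normalise(cm):
    """Turn counts into per-class rates; rows with no samples stay all zero."""
    totals = cm.sum(axis=1, keepdims=True)
    return cm / np.maximum(totals, 1)
```

The diagonal of the row-normalised matrix gives the per-class recognition rate, and `np.trace(cm) / cm.sum()` gives the overall accuracy.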
References • J. C. Niebles, H. Wang and L. Fei-Fei, “Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words”, International Journal of Computer Vision, in press, 2008. • C. Schuldt, I. Laptev and B. Caputo, “Recognizing Human Actions: A Local SVM Approach”, in Proc. ICPR, 2004. • L. Zelnik-Manor and M. Irani, “Event-Based Analysis of Video”, in Proc. CVPR, 2001.