Rapid Estimation of Camera Motion from Compressed Video with Application to Video Annotation

Yap-Peng Tan, Drew D. Saur, Sanjeev R. Kulkarni, Peter J. Ramadge
Presented by Alex Summers

The Authors: Yap-Peng Tan
• B.S., National Taiwan University
• MA & PhD, Princeton University
• Associate Professor, Nanyang Technological University
• Since the year 2000: 12 journal papers, 29 conference papers, 11 patents, 9 PhD students
• “A Vision-Based Approach to Early Detection of Drowning Incidents in Swimming Pools”

The Authors: Drew D. Saur
• B.S.E., Princeton University
• Joined Andersen Consulting in 1996
• Performance Analysis
• 2 publications
• No public home page
A completely different Drew D. Saur:
• Author of http://macorchard.com
• Proficient in high-resolution game programming with Commodore/Microsoft BASIC and 6502 machine code
• Didn’t respond to my email questioning his identity

The Authors: Sanjeev R. Kulkarni
The Degrees:
• Clarkson University: 1983 B.S. Mathematics, 1984 B.S. Electrical Engineering, 1985 M.S. Mathematics
• Stanford: 1985 M.S. Electrical Engineering
• MIT: 1991 PhD Electrical Engineering
Also:
• Supervised Yap-Peng Tan’s PhD
• Received two Young Investigator Awards
• 150 publications (4 secret “laser” projects)
• “A Multiple Target Tracking Algorithm Supporting RV/Decoy Discrimination During Deployment (SECRET)”

The Authors: Peter J. Ramadge
• B.Sc./M.E. Electrical Engineering (University of Newcastle, Aus)
• PhD, University of Toronto
• Professor of EE at Princeton
• Current research interests: video and image processing, hybrid dynamic systems, computer-based control
• Possibly 7 children
• “Stochastic Optimization of Regenerative Systems using Infinitesimal Perturbation Analysis”

The Paper
• Published February 2000
• Citations: Citeseer 11, Google Scholar 37
• 19 references (1 by the authors)
• Also: pictures, diagrams of results, plenty of maths

The Problem
• Digital video is becoming widespread
• Libraries of digital video exist
• Content-based browsing is desirable
• Searching and annotation of video
• Video is usually compressed (MPEG)
• Decompressed video is very large
• Decompressing video takes time

Overview of the method
• Work only with compressed video
• Identify events of interest
• Characterise in terms of camera motion
• Use data easily accessible from MPEG
• Divide video into “temporal segments”
• Estimate camera motion
• Use this data to classify regions of video

Overview of the method
• Work only with compressed video
• For example, basketball video..

Overview of the method
• Identify events of interest. For example, in basketball video: fast breaks, full court advances, shots at the basket
• Characterise in terms of camera motion. For example, in basketball video: periods of persistent camera motion, camera zoom-in, pauses in camera motion

Overview of the method
• Use data easily accessible from MPEG: not the full decoded frames, but MPEG-specific data obtainable by partial decoding (20%)
• Divide video into “temporal segments”: separate on scene changes. For example, in basketball video: switches between wide-angle and close-up shots

Overview of the method
• Estimate camera motion: analyse the motion vectors. This is the clever bit (more to come..)
• Use this data to classify regions of video: heuristics developed by hand, based on prior knowledge of the video
• Obtains an annotated copy of the clip

Introduction to MPEG
• Begin with uncompressed video
• Series of (numbered) frames
• Each contains full info about the image
• How can we make this smaller?

Introduction to MPEG
• Naïve approach: delete frames
• Would result in a poor sample rate
• Can we fill in the space “cheaply”?

Filling in the blanks Another video…

Filling in the blanks
• Compare current frame with previous
• Divide into blocks
• Identify close matches between blocks
• If match found, encode “motion vector”
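
The block-matching steps above can be sketched in a few lines. This is only an illustration of the general idea, assuming greyscale frames held as NumPy arrays, an exhaustive search and a sum-of-absolute-differences (SAD) score; the function name, block size and search range are hypothetical, and real MPEG encoders use faster, more elaborate searches.

```python
import numpy as np

def block_motion_vectors(prev, curr, block=8, search=4):
    """For each block of `curr`, find the best-matching block in `prev`.

    Returns an array of shape (rows, cols, 2). Each entry is (dx, dy): the
    offset from the block's position in `curr` to its best match in `prev`.
    """
    h, w = curr.shape
    rows, cols = h // block, w // block
    vectors = np.zeros((rows, cols, 2), dtype=int)
    for r in range(rows):
        for c in range(cols):
            y0, x0 = r * block, c * block
            target = curr[y0:y0 + block, x0:x0 + block].astype(int)
            best = None
            # Try every displacement inside the search window.
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = y0 + dy, x0 + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue  # candidate block would fall outside the frame
                    cand = prev[y:y + block, x:x + block].astype(int)
                    sad = np.abs(target - cand).sum()
                    if best is None or sad < best[0]:
                        best = (sad, dx, dy)
            vectors[r, c] = best[1], best[2]
    return vectors
```

With this convention, image content that shifts two pixels to the right between frames gives interior blocks the vector (-2, 0), because the matching block sits two pixels to the left in the previous frame.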

Filling in the blanks
• Yellow frames are P (predicted) frames
• Encoded using motion vectors
• Predicted from previous frame above

Filling in the blanks
• Green frames are B (bidirectional) frames
• Predicted from both previous and next I/P frame

The Method - idea
• Motion vectors estimate camera motion
• Use P-frames from MPEG format
• Very noisy measurement: aspects of the scene may not move; some move faster than others (?); moving features change colour (no match); subtler problems (see later)
• Need to “best-fit” the data found
• Use numerical analysis techniques

The Method - parameters
• Consider transformation of image points
• Parameterise transformation: translation (assumed zero – fixed camera), zoom, rotation, pan/tilt, perspective effects
• Mathematical model
• Problem becomes estimating parameters
• Iterative “least-squares” fitting is used
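
To make the parameter-fitting idea concrete, here is a minimal sketch. It does not use the paper's own parameterisation: it assumes a simplified four-parameter model in which a block centred at (x, y), measured from the image centre, has motion vector u = a·x − b·y + c, v = b·x + a·y + d, with a standing in for zoom, b for rotation, and c, d for pan and tilt. The function name, the rejection rule and the tolerances are likewise assumptions.

```python
import numpy as np

def fit_camera_motion(points, vectors, iterations=5, k=2.5, min_tol=0.5):
    """Iterative least-squares fit of (a, b, c, d) with outlier rejection.

    points  : (n, 2) float array of block centres (x, y)
    vectors : (n, 2) float array of motion vectors (u, v)
    Returns the fitted parameters and a boolean mask of vectors kept.
    """
    x, y = points[:, 0], points[:, 1]
    n = len(points)
    ones, zeros = np.ones(n), np.zeros(n)
    # Two linear equations per vector, interleaved as u0, v0, u1, v1, ...
    A = np.empty((2 * n, 4))
    A[0::2] = np.column_stack([x, -y, ones, zeros])
    A[1::2] = np.column_stack([y, x, zeros, ones])
    rhs = vectors.reshape(-1)
    keep = np.ones(n, dtype=bool)
    params = np.zeros(4)
    for _ in range(iterations):
        rows = np.repeat(keep, 2)
        params, *_ = np.linalg.lstsq(A[rows], rhs[rows], rcond=None)
        residual = np.linalg.norm((A @ params - rhs).reshape(n, 2), axis=1)
        # Keep vectors close to the current fit (hypothetical rule: a
        # multiple of the median residual, never tighter than min_tol pixels).
        tol = max(k * np.median(residual[keep]), min_tol)
        new_keep = residual <= tol
        if new_keep.sum() < 2 or np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return params, keep
```

Refitting after discarding poorly explained vectors is one common way to make a least-squares fit tolerate noisy measurements such as independently moving players; the paper's exact iteration scheme may differ.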

The Method - refinements
• Problem: block matching may not respect movement. Solution: exclude eccentricities in the results
• Problem: zero motion vectors are used if reasonable. Solution: exclude blocks with zero motion vectors
• Problem: scenes may be stationary, so most will be zero motion vectors. Solution: if mostly zero motion vectors, set camera motion to be zero
• Problem: P-frames are not evenly distributed. Solution: interpolate in the gaps
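
Two of these refinements are simple enough to sketch. The code below is an assumption-laden illustration, not the paper's procedure: the 80% cut-off for "mostly zero", the linear interpolation, the frame-type string and both function names are hypothetical.

```python
import numpy as np

def usable_vectors(vectors, zero_fraction=0.8):
    """Drop zero motion vectors before fitting.

    Returns None when at least `zero_fraction` of the vectors are zero,
    signalling that the camera motion should simply be set to zero.
    """
    vectors = np.asarray(vectors)
    nonzero = np.any(vectors != 0, axis=1)
    if np.mean(~nonzero) >= zero_fraction:
        return None
    return vectors[nonzero]

def interpolate_estimates(frame_types, p_frame_values):
    """Spread one estimate per P-frame over every frame of the sequence.

    frame_types    : sequence of "I", "B" or "P", one entry per frame
    p_frame_values : one parameter estimate (e.g. pan) per P-frame, in order
    Frames before the first or after the last P-frame take the nearest value.
    """
    p_index = [i for i, t in enumerate(frame_types) if t == "P"]
    return np.interp(np.arange(len(frame_types)), p_index, p_frame_values)
```

For a frame pattern such as "IBBPBBPBBP", the three P-frame estimates are spread linearly across the B-frames that lie between them.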


The Method – application
• Applied techniques to basketball video
• Segment into close-up and wide-angle
• Estimate camera motion for wide-angle
• Use pan and zoom estimation to detect: fast breaks and full-court advances; shots at the basket
• Criteria are identified by hand, e.g. a fast break followed by camera zoom-in probably indicates a shot
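
A hand-written rule of the kind described above might look like the following sketch. Everything numeric here is hypothetical (the pan and zoom thresholds, the minimum run length, the look-ahead window), as are the function names and event labels; the slides do not give the actual criteria.

```python
def persistent_runs(values, threshold, min_length):
    """Return (start, end) pairs where the value stays beyond +/-threshold,
    with the same sign, for at least min_length consecutive frames."""
    runs, start, current = [], None, 0
    for i, v in enumerate(list(values) + [0.0]):  # sentinel closes a trailing run
        sign = int(v >= threshold) - int(v <= -threshold)
        if sign != current:
            if current != 0 and i - start >= min_length:
                runs.append((start, i))
            start, current = i, sign
    return runs

def annotate(pan, zoom, pan_thr=4.0, zoom_thr=0.02, min_length=15, window=30):
    """Label fast breaks (persistent pan) and probable shots (zoom-in soon after)."""
    events = []
    for start, end in persistent_runs(pan, pan_thr, min_length):
        events.append(("fast break", start, end))
        # Rule from the slide: a fast break followed by zoom-in suggests a shot.
        following = zoom[end:end + window]
        if any(z >= zoom_thr for z in following):
            events.append(("probable shot", end, min(end + window, len(zoom))))
    return events
```

Here `pan` and `zoom` are per-frame estimates for one wide-angle segment; each event is a label with a half-open frame range.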

The Method - results
• Comparison with estimation from uncompressed video (figure not reproduced in this transcript)

The Method - results
• Four basketball sequences are used to test the techniques
• Aim to identify fast breaks (FBs) and shots on basket
• 92% of FBs are successfully identified
• 8% of regions identified as FBs are falsely identified
• 97% of shots on basket are successfully identified
• 25% of regions identified are not shots on the basket
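
Read in standard retrieval terms, these figures correspond to recall (the share of true events found) and precision (the share of detections that are correct). The snippet below only restates the percentages above under that reading; the F1 score it computes is not reported in the slides and is included purely as a convenient single-number summary.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Assumed reading of the slide: "8% falsely identified" means precision 0.92,
# and "25% are not shots" means precision 0.75.
fast_breaks = {"recall": 0.92, "precision": 1 - 0.08}
shots = {"recall": 0.97, "precision": 1 - 0.25}

for name, m in (("fast breaks", fast_breaks), ("shots", shots)):
    print(name, round(f1(m["precision"], m["recall"]), 3))
```

On this reading, shot detection finds nearly every shot, but one detection in four is a false alarm.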

Summary
• Techniques for estimating camera motion from MPEG-compressed files are shown
• Applying the techniques, it is possible to generate an annotated video file
• Manual or more-detailed analysis may be required to correct inaccuracies
• Once an annotated file is obtained: possible to search the file for events of interest; useful for building “highlight” programmes; possible to analyse a sports team’s play

My Evaluation
• Well-written, flows
• Maths provided to back up the ideas
• Compares well with similar non-compressed approach
• Correctly identifies almost all points of interest
• Relies heavily on human interaction
• Possibly dubious evaluation criteria
• False positives
• Tiny pictures!

Questions
• Could algorithms be trained to learn the criteria for identifying interesting features?
• Similarly, to self-adjust for anomalies in the data?
• Should the motion vectors in the B-frames be used, also?
• How successful would the approach be for e.g. football, when much of the background may be constant colour?
