An H.264-based Scheme for 2D to 3D Video Conversion

An H.264-based Scheme for 2D to 3D Video Conversion Mahsa T. Pourazad PanosNasiopoulos Rabab K. Ward IEEE Transactions on Consumer Electronics 2009

Outline • Introductionto 3D television • 2D-to-3D Conversion Scheme • Camera motion Correction • Correction of Displacement Estimates • Perceptual Depth Enhancement • Performance Evaluation • Conclusion

Introduction • 3D television • Stereoscopic • Multi-view • 2D plus depth • 3D display

Introduction • 2D to 3D video streams • 2D video stream + Depth map • Depth Image Based Rendering(DIBR) [1] • 2 different viewpoints (projected on left and right retinas) [1] L. Zhang, “Stereoscopic image generation based on depth images for 3D TV,” IEEE Trans. Broadcasting, vol. 51, no.2, pp191-199, 2005.

Introduction • Depth map estimation • Light, shade, relative size, motion parallax, partial occlusion, textural gradient, geometric perspective…… • Manual, semi automatic or automatic • Machine learning • Extract depth from blur • Edge information • Motion vector information • H.264/AVC standard • Can’t work on static objects

2D-to-3D Conversion Scheme • Use abs(MVx) for estimating the depth map • Depth of point P can be easily obtained if the disparity d is known.

2D-to-3D Conversion Scheme • H.264/AVC Motion vector estimation • variable block sizes • Quarter-pixel matching accuracy • Correction • Moving camera • Object boundary • Perceptual depth enhancement

Camera Motion Correction • Camera panning • Recognize camera motion • Adjust “Skip Mode” • Adjust net motion • Zoom in/out • Check the tendency of the camera • MVs are scaled accordingly [2] [2] D. Kim, D. Min, K. Sohn, “Stereoscopic video generation method using motion analysis,” 3DTV Conf. pp. 1-4, 2007.

Correction of Displacement Estimates • Is this motion vector correct? • Readjust MVs by making it equal to the median MV Motion vector is very different from neighbors’ ? Yes Check the variance of the corresponding block in residual frame Object boundary pixels? No MV=median of neighbors’ MV

Perceptual Depth Enhancement • Non-linear scaling model • The further the object is, the smaller the scaling factor. • The enhanced disparity value (N uniformly spaced depth layer) Ex: Layer 0(i=0, S(0)=Zfar/Znear) Layer N-1(i=N-1, S(N-1)=1)

Performance Evaluation • Video sequences • “Interview”, “Orbi” • True Depth Maps • Captured by 3D-depth range camera (Zcam) • 0 to 255 (256 depth layers) • JM12.2 version of the H.264/AVC standard • Compare with [3] [3] I. Ideses, L. P. Yaroslavsky, and B. Fishbain, “Real-time 2D to 3D video conversion,” Journal of Real-Time Image Processing, vol. 2, no. 1, pp. 3-9, 2007.

Performance Evaluation Orbi Interview Video sequence Recorded depth map

Performance Evaluation Estimated depth map by [3] Estimated depth map by our approach

Performance Evaluation • 15 people graded the videos from 1 to 10 of 3D perception and visual quality

Performance Evaluation

Performance Evaluation • Badly matched pixelsin the estimated depth (Th=1) Percentage of correctly matched pixels

Conclusion • This paper present a efficient method that estimates the depth map of a 2D video sequence using its H.264/AVC estimated motion information. • It can be implemented in real-time at the receiver-end, without increasing the transmission bandwidth requirement.

An H.264-based Scheme for 2D to 3D Video Conversion