
3D HUMAN BODY POSE ESTIMATION BY SUPERQUADRICS

Ilya Afanasyev, Massimo Lunardelli, Nicolo' Biasi, Luca Baglivo, Mattia Tavernini, Francesco Setti and Mariolino De Cecco. Department of Mechanical and Structural Engineering (DIMS), Mechatronics Lab.


Presentation Transcript


  1. 3D HUMAN BODY POSE ESTIMATION BY SUPERQUADRICS Ilya Afanasyev, Massimo Lunardelli, Nicolo' Biasi, Luca Baglivo, Mattia Tavernini, Francesco Setti and Mariolino De Cecco Department of Mechanical and Structural Engineering (DIMS), Mechatronics Lab. EU-FP7-Marie Curie COFUND-Trentino Project N° 226070

  2. Content • Introduction • The input data description • The algorithm description • Demo of Test Results • Conclusions

  3. Introduction We present a 3D reconstruction and human body pose estimation system that uses a Superquadric (SQ) mathematical model and a RANSAC search with least-squares fitting and verification. [Pipeline diagram: video frames from the multicamera system (input data from the VERITAS project) → preprocessing: segmentation → 3D point cloud → fitting SQs to the 3D data → final Human Body pose model.]

  4. Our starting point We use a multiple stereo system (8 pairs of cameras) and a garment with special clothing marks to recover the 3D human body surface with superimposed colored markers. The multicamera system and garment belong to the EU FP7-ICT VERITAS project: http://veritas-project.eu/

  5. Segmentation The segmentation is based on clothing analysis (i.e. recognition of the special clothing marks on the garment) and divides the Human Body into 9 parts (body, arms, forearms, hips and legs). The garment doesn't have a hood, so our Human Body SQ model doesn't have a head. The multicamera system and garment belong to the EU FP7-ICT VERITAS project: http://veritas-project.eu/

  6. What is the input data? • A 3D video of Human Body movement was captured with the multi-camera system and consists of 119 frames. • The 3D data are processed offline separately for every frame and contain the 3D coordinates of approximately 2100 data points of the Human Body pose. • The 3D data points are accompanied by a segmentation matrix whose elements assign every point to the body or to a specific limb. As a result of the clothing segmentation we have approximately 800 data points for the body, 30-70 points for the left/right arms, 15-25 points for the forearms, 300-600 points for the hips, and 80-150 points for the legs. The multicamera system and garment belong to the EU FP7-ICT VERITAS project: http://veritas-project.eu/

  7. What is the proposed method? We propose a hierarchical RANSAC-based model-fitting technique with a composite SQ model of the human body (HB) and limbs. SQ models make it possible to describe objects of complex geometry with few parameters and yield a simple minimization function for estimating an object's pose. We assume the shape and dimensions of the body and limbs are known a-priori, with correct anthropometric parameters in the metric coordinate system. The algorithm recovers the 3D position of the body as the largest object ("Body Pose Search") and then restores the poses of the human limbs ("Limbs Pose Search"). To cope with measurement noise and outliers, the object pose is estimated with a RANSAC SQ-fitting technique. We control the fitting quality by setting inlier thresholds for the limbs and the body.

  8. HB Pose Estimation algorithm [Flowchart of the hierarchical pose estimation algorithm. Inlier threshold for the body: 55%; for the limbs: 60%. A code sketch of this hierarchy is given below.]
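The slides do not include an implementation, but the hierarchy above can be sketched roughly in Python as follows. The helper names ransac_fit_sq and ransac_fit_limb_pair are hypothetical placeholders for the Body Pose Search and Limb Pose Search detailed on the later slides; the thresholds are the ones shown in the flowchart.

    # Rough sketch of the hierarchical search, assuming segmentation labels
    # per point and a-priori SQ parameters per segment (slide 9).
    def estimate_human_pose(points, labels, sq_params):
        # 1. Body first: it is the largest segment, so its pose anchors the model.
        body_pts = points[labels == "body"]
        T_body = ransac_fit_sq(body_pts, sq_params["body"],
                               n_sample=6, inlier_threshold=0.55)
        # 2. Limbs next: each arm-forearm / hip-leg pair is fitted jointly,
        #    reusing T_body as the fixed parent transform.
        limb_poses = {}
        for pair in [("left arm", "left forearm"), ("right arm", "right forearm"),
                     ("left hip", "left leg"), ("right hip", "right leg")]:
            pair_pts = tuple(points[labels == name] for name in pair)
            limb_poses[pair] = ransac_fit_limb_pair(pair_pts, sq_params, T_body,
                                                    n_sample=3,
                                                    inlier_threshold=0.60)
        return T_body, limb_poses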

  9. Human Body model in Superquadrics We present the Human Body (HB) as a model of 9 superquadrics (superellipsoids). HB anthropometric parameters: the shape parameters ε1 = ε2 = 0.5; the scaling parameters:
→ Body: a1 = 0.095, a2 = 0.18, a3 = 0.275 (m).
→ Arms: a1 = a3 = 0.055, a2 = 0.15 (m).
→ Forearms: a1 = a3 = 0.045, a2 = 0.13 (m).
→ Hips: a1 = a2 = 0.075, a3 = 0.2 (m).
→ Legs: a1 = a2 = 0.05, a3 = 0.185 (m).
Abbreviations: B – body, LA/RA – Left/Right Arms, LF/RF – Left/Right Forearms, LH/RH – Left/Right Hips, LL/RL – Left/Right Legs. LS – Left Shoulder, E – Elbow, LHJ – Left Hip Joint, ηLA – angle position of the Left Shoulder, K – Knee, etc.

  10. Human Body model in Superquadrics The explicit (parametric) equation of the superquadric, which is usually used for SQ representation and visualization, is
x(η, ω) = a1 · cos^ε1(η) · cos^ε2(ω)
y(η, ω) = a2 · cos^ε1(η) · sin^ε2(ω)
z(η, ω) = a3 · sin^ε1(η)
The implicit equation of the superquadric is used for the mathematical modeling when fitting the 3D data:
F(x, y, z) = ( (x/a1)^(2/ε2) + (y/a2)^(2/ε2) )^(ε2/ε1) + (z/a3)^(2/ε1) = 1
where x, y, z are coordinates in the superquadric frame; η, ω are the spherical coordinates; a1, a2, a3 are the scaling parameters; ε1, ε2 (also denoted a4, a5) are the shape parameters.
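As an illustration, here is a minimal Python sketch of the implicit (inside-outside) function, assuming the standard superellipsoid form above with the scaling and shape parameters from slide 9:

    import numpy as np

    def inside_outside(x, y, z, a1, a2, a3, eps1, eps2):
        """F < 1: point inside the SQ, F = 1: on the surface, F > 1: outside."""
        xy = (np.abs(x / a1) ** (2.0 / eps2) +
              np.abs(y / a2) ** (2.0 / eps2)) ** (eps2 / eps1)
        return xy + np.abs(z / a3) ** (2.0 / eps1)

    # Example: a point on the surface of the "body" SQ from slide 9 gives F = 1.
    print(inside_outside(0.095, 0.0, 0.0, 0.095, 0.18, 0.275, 0.5, 0.5))  # -> 1.0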

  11. Body model in Superquadrics The position of the Human Body is defined by the following rotation and translation sequence of the Body superquadric: 1. Translation of the center of the BODY to (xc, yc, zc) along the x, y, z coordinates. 2. Rotation α about x (clockwise). 3. Rotation β about y (clockwise). 4. Rotation γ about z (clockwise). The rotation matrix of the BODY is RBODY; the transformation matrix of the BODY is TBODY.
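A hedged sketch of how RBODY and TBODY can be assembled, assuming the rotations are applied in the order listed above (about x, then y, then z) and that "clockwise" corresponds to the sign convention of the standard rotation matrices below:

    import numpy as np

    def rot_x(a):
        c, s = np.cos(a), np.sin(a)
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

    def rot_y(b):
        c, s = np.cos(b), np.sin(b)
        return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

    def rot_z(g):
        c, s = np.cos(g), np.sin(g)
        return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

    def t_body(alpha, beta, gamma, xc, yc, zc):
        """4x4 homogeneous transform mapping body-SQ coordinates to the world."""
        T = np.eye(4)
        T[:3, :3] = rot_z(gamma) @ rot_y(beta) @ rot_x(alpha)   # R_BODY
        T[:3, 3] = [xc, yc, zc]                                 # translation
        return T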

  12. Limb models in Superquadrics The position of the Left Shoulder relative to the center of the body coordinate system is estimated from the SQ explicit equation. The transformation Left Shoulder - Left Arm (LS-LA) can be expressed with the following rotation and translation sequence: 1. Rotation α about x (clockwise). 2. Rotation β about z (anticlockwise). 3. Rotation γ about y (clockwise). 4. Translation of the SQ center by the distance a2 along y. RLA denotes the rotation matrix of the Left Arm.

  13. Limb models in Superquadrics The full transformation for every point of the system "Body - Left Forearm" (B-LF) is obtained by composing the link transforms, where PB and PLF are the coordinates of the Body and Left Forearm points, respectively. The transformation Elbow - Left Forearm (E-LF) is created by: 1. Rotation δLF about x (clockwise). 2. Translation of the SQ center by -a2 along y. The transformations Body - Left Shoulder (B-LS) and Left Arm - Elbow (LA-E) are built analogously from the corresponding rotations and translations.
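A small sketch of how the chain Body → Left Shoulder → Left Arm → Elbow → Left Forearm composes, assuming every link is a 4x4 homogeneous transform as in the previous sketch (the function name is illustrative, not taken from the slides):

    import numpy as np

    def body_to_left_forearm(T_b_ls, T_ls_la, T_la_e, T_e_lf):
        """Compose the B-LS, LS-LA, LA-E and E-LF transforms into the full
        B-LF transform, so that P_B = T_full @ P_LF (homogeneous coordinates)."""
        return T_b_ls @ T_ls_la @ T_la_e @ T_e_lf

    # Usage: p_lf = np.array([x, y, z, 1.0]); p_b = body_to_left_forearm(...) @ p_lf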

  14. RANSAC Body Pose Search We use the RANSAC ("RANdom SAmple Consensus") algorithm to find the body pose hypothesis, i.e. 6 variables: 3 rotation angles (α, β, γ) and 3 translation coordinates (xC, yC, zC). From these variables we can calculate the transformation matrix TBODY. We fit a model described by the superquadric implicit equation to the 3D data of the body. We take 6 points in the world coordinate system (xWi, yWi, zWi) from the approximately 800 data points of the body and transform them to the SQ-centered coordinate system (xSi, ySi, zSi) using TBODY. Then we calculate the inside-outside function according to the superquadric implicit equation in the world coordinate system.
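A minimal sketch of this step, assuming TBODY maps SQ-frame coordinates to world coordinates (so its inverse maps the sampled world points into the SQ-centered frame) and using the implicit function from slide 10:

    import numpy as np

    def world_to_sq(points_w, T_body):
        """(N, 3) world coordinates -> (N, 3) coordinates in the SQ-centered frame."""
        pts_h = np.hstack([points_w, np.ones((len(points_w), 1))])   # homogeneous
        return (np.linalg.inv(T_body) @ pts_h.T).T[:, :3]

    def sq_residuals(points_w, T_body, a1, a2, a3, eps1, eps2):
        """Inside-outside value minus 1 for each point (zero on the SQ surface)."""
        x, y, z = world_to_sq(points_w, T_body).T
        f = (np.abs(x / a1) ** (2 / eps2) +
             np.abs(y / a2) ** (2 / eps2)) ** (eps2 / eps1) + \
            np.abs(z / a3) ** (2 / eps1)
        return f - 1.0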

  15. RANSAC Body Pose Search The inside-outside function has 11 parameters: 5 parameters are known (a1, a2, a3, ε1, ε2) and 6 parameters (α, β, γ, xC, yC, zC) have to be found by minimizing the cost function. Thus we fit the SQ model to the random dataset by minimizing the inside-outside function of the distance to the SQ surface. We used both the Trust-Region and the Levenberg-Marquardt algorithm for the nonlinear least-squares minimization. After that we evaluate the number of inliers by comparing the distance between every point of the 3D point cloud and the SQ model with a distance threshold t (to accelerate the calculations we took t = 2 cm).
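A hedged sketch of the refinement and inlier count, reusing t_body from the slide-11 sketch and sq_residuals / world_to_sq from the slide-14 sketch. SciPy's least_squares offers both solvers named on the slide (method='trf' for Trust-Region Reflective, method='lm' for Levenberg-Marquardt). The point-to-surface distance below uses the common radial approximation d = |p| · |1 - F(p)^(-ε1/2)|, since the slides state only the 2 cm threshold, not the exact distance formula:

    import numpy as np
    from scipy.optimize import least_squares

    def fit_body_pose(sample_pts, a, eps, pose0):
        """Refine pose = (alpha, beta, gamma, xc, yc, zc); a = (a1, a2, a3) and
        eps = (eps1, eps2) are the fixed, a-priori known SQ parameters."""
        def residuals(pose):
            return sq_residuals(sample_pts, t_body(*pose), *a, *eps)
        return least_squares(residuals, pose0, method='trf').x   # or method='lm'

    def count_inliers(all_pts, pose, a, eps, t=0.02):
        """Count points whose approximate distance to the SQ surface is below t."""
        T = t_body(*pose)
        p = world_to_sq(all_pts, T)
        f = sq_residuals(all_pts, T, *a, *eps) + 1.0              # F(p)
        d = np.linalg.norm(p, axis=1) * np.abs(1.0 - f ** (-eps[0] / 2.0))
        return int(np.sum(d < t))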

  16. RANSAC Limb Pose Search Analogously to the Body Pose Search, we perform a RANSAC Limb Pose Search. The main differences between the RANSAC Body and Limb fitting are: • SQ pairs of limbs are used: arm-forearm and hip-leg. • s = 3 points are picked for every limb (the body transformation matrix TBODY obtained from the Body Pose Search is reused). • 4 variables are used for the Limb Pose Search: 4 rotation angles (α, β, γ, δ). • The joint cost function of the SQ pair is minimized, considering the two limbs simultaneously, where LA and LF denote the Left Arm and Left Forearm limbs, respectively (as an example).
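A sketch of the joint cost for an arm-forearm pair under these assumptions: the four angles (α, β, γ for the shoulder, δ for the elbow) are optimized together while TBODY stays fixed; the chain constructors and SQ parameter dictionaries are hypothetical names, not taken from the slides; sq_residuals is the helper from the slide-14 sketch:

    import numpy as np

    def limb_pair_residuals(angles, arm_pts, forearm_pts, T_body, sq_arm, sq_fa):
        alpha, beta, gamma, delta = angles
        # Hypothetical constructors of the chained transforms from slides 12-13.
        T_arm = (T_body @ t_body_to_shoulder(sq_arm)
                 @ t_shoulder_to_arm(alpha, beta, gamma, sq_arm))
        T_fa = T_arm @ t_arm_to_elbow(sq_arm) @ t_elbow_to_forearm(delta, sq_fa)
        r_arm = sq_residuals(arm_pts, T_arm, *sq_arm["a"], *sq_arm["eps"])
        r_fa = sq_residuals(forearm_pts, T_fa, *sq_fa["a"], *sq_fa["eps"])
        # The joint cost stacks both limbs, so they are fitted simultaneously.
        return np.concatenate([r_arm, r_fa])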

  17. Demo of Test Results For most of the 3D video frames, the fraction of inliers is more than 65%. At the top: left – a pose of a human in the garment, right – the point cloud. At the bottom: left – the result of RANSAC fitting to the 3D data (pink points – inliers, cyan – outliers), right – the final pose estimation.

  18. Demo of Test Results A second example frame with the same layout as the previous slide; again the fraction of inliers is more than 65% (pink points – inliers, cyan – outliers).

  19. Demo of Test Results The lack of data points for the arms and forearms causes displacements of the upper limb poses from one video frame to the next. This spoils the impression of the Human Body movement when the individual frames processed by the RANSAC SQ-fitting are assembled into a video. The problem can be addressed in the future by refining the 3D Human Body Pose Estimation algorithm, improving the 3D data point acquisition process, or using other sensors and segmentation techniques.

  20. Conclusions • 3D real data of the Human Body was obtained with a multi-camera system and structured by the special clothing analysis. • The human body was modeled by a composite SuperQuadric (SQ) model representing the body and limbs with correct, a-priori known anthropometric dimensions. • The proposed method is based on a hierarchical RANSAC object search with robust least-squares fitting of the SQ model to the 3D data: first the body, then the limbs. • The solution is verified by evaluating the matching score (the number of inliers corresponding to an a-priori chosen distance threshold) and comparing this score with the admissible inlier threshold for the body and limbs. • For most of the 3D video frames the fraction of inliers is more than 65%, which means that the algorithm works well. • The method can be useful for applications dealing with 3D Human Body recognition, localization and pose estimation. • The method will also work with any 3D point cloud data acquired by other sensors and segmented by other algorithms.

  21. Acknowledgements Ilya Afanasyev worked on the creation of the algorithms for 3D object recognition and pose estimation with support from the EU FP7-Marie Curie-COFUND Trentino program. The 3D data acquisition and segmentation were carried out by the UniTN team in the framework of the VERITAS project funded by the EU FP7. The authors are very grateful to colleagues from the Mechatronics department, University of Trento (UniTN), namely Alberto Fornaser. Grazie!!
