EEC-693/793 Applied Computer Vision with Depth Cameras
Lecture 8
Wenbing Zhao
wenbing@ieee.org
Outline
• Human skeleton tracking
Skeleton Tracking
• Real-Time Human Pose Recognition in Parts from Single Depth Images, by J. Shotton et al. at Microsoft Research Cambridge & Xbox Incubation
• http://research.microsoft.com/apps/pubs/default.aspx?id=145347
• Real-time human pose recognition is difficult because of the wide variation in body poses, body sizes, clothing, heights, and so on
• Kinect uses a rendering pipeline that matches the incoming raw depth data from the sensor against machine-learned sample data
• The machine-learned data is collected from base characters with different poses, hair types, and clothing, and in different rotations and views
• The machine-learned data is labeled with individual body parts and matched against the incoming depth data to identify which part of the body each pixel belongs to
• The rendering pipeline processes the data in several steps to track human body parts from the depth data
The Rendering Pipeline Processes
• From the depth image, we can easily segment out the human body object
• In the absence of any other logic, the sensor does not know whether this object is a human body or something else
• To start recognizing a human body, each individual pixel of incoming depth data is matched against the machine-learned data
• The machine-learned data is labeled and has associated values to match against the incoming data
• Matching is based on the probability that the incoming data matches the machine-learned data
The Rendering Pipeline Processes
• The next step is to label the body parts by creating segments
• Kinect uses a trained tree structure (known as a decision tree) to match the data for a specific type of human body
• Eventually, every single pixel passes through this tree and is matched to a body part (see the sketch below)
• Once the different body parts are identified, the sensor places the joint points at the locations with the highest matching probability
• With the identified joint points and the movement of those joints, the sensor can track the movement of the complete body
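To make the per-pixel classification concrete, below is a minimal, illustrative sketch (not the actual Kinect implementation) of walking one depth pixel down a trained decision tree whose split tests compare depth differences at two learned offsets, in the spirit of Shotton et al. The Node and BodyPart types, the offset scaling, and the label set are all assumptions made for illustration.

using System;

// Hypothetical label set; the real classifier distinguishes roughly 31 body parts.
enum BodyPart { Head, Torso, LeftHand, RightHand, Other }

// One node of a trained decision tree (illustrative structure).
class Node
{
    public bool IsLeaf;
    public BodyPart Label;          // meaningful only at a leaf
    public int U1, V1, U2, V2;      // learned 2D pixel offsets for the split feature
    public float Threshold;         // learned split threshold
    public Node Left, Right;
}

static class PixelClassifier
{
    // Depth-difference feature: compare the depth at two offsets around (x, y).
    // Offsets are scaled by 1/depth so the feature is roughly depth-invariant.
    static float Feature(short[] depth, int width, int height, int x, int y, Node n)
    {
        float d = Math.Max((int)depth[y * width + x], 1);
        int x1 = Clamp(x + (int)(n.U1 * 1000 / d), width);
        int y1 = Clamp(y + (int)(n.V1 * 1000 / d), height);
        int x2 = Clamp(x + (int)(n.U2 * 1000 / d), width);
        int y2 = Clamp(y + (int)(n.V2 * 1000 / d), height);
        return depth[y1 * width + x1] - depth[y2 * width + x2];
    }

    static int Clamp(int v, int max) { return Math.Max(0, Math.Min(v, max - 1)); }

    // Walk a single pixel down the tree until a leaf provides its body-part label.
    public static BodyPart Classify(Node root, short[] depth, int width, int height, int x, int y)
    {
        Node node = root;
        while (!node.IsLeaf)
            node = Feature(depth, width, height, x, y, node) < node.Threshold ? node.Left : node.Right;
        return node.Label;
    }
}

In the full approach described by Shotton et al., a forest of such trees votes for every pixel, and the joint proposals are then derived from the resulting per-part probability maps.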
The Rendering Pipeline Processes
• Each joint position is measured by three coordinates (x, y, z); see the snippet below
• x and y define the position of the joint
• z represents the distance from the sensor
• To get the proper coordinates, the sensor computes three views of the same image: the front, left, and top views, which together define the 3D body proposal
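In the SDK, these coordinates surface as the SkeletonPoint of each joint, with X, Y, and Z expressed in meters in skeleton space. A minimal sketch of reading them, assuming skeleton holds a tracked Skeleton obtained from a SkeletonFrame:

// using Microsoft.Kinect;
Joint head = skeleton.Joints[JointType.Head];
if (head.TrackingState == JointTrackingState.Tracked)
{
    SkeletonPoint p = head.Position;   // X, Y, Z in meters, relative to the sensor
    Console.WriteLine("Head at x={0:F2} m, y={1:F2} m, z={2:F2} m", p.X, p.Y, p.Z);
}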
Skeleton Tracking
• The Kinect for Windows SDK provides a set of APIs that allow easy access to the skeleton joints
• The SDK supports tracking of up to 20 joint points
• Tracking state: Tracked, Not Tracked, or Position Only
• Tracking modes: default and seated
• Default mode: detects the user based on the distance of the subject from the background
• Seated mode: uses movement to detect the user and distinguish him or her from the background, such as a couch or chair
Skeleton Tracking
• Kinect can fully track up to two users
• It can detect up to six users (the remaining four with position-only tracking); see the sketch below
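A minimal sketch of how the two tracking levels appear in code when iterating the skeleton array (totalSkeleton is the same array filled by CopySkeletonDataTo() in the frame-ready handler shown later):

// Count fully tracked vs. position-only users in the current frame.
int fullyTracked = 0, positionOnly = 0;
foreach (Skeleton s in totalSkeleton)
{
    if (s.TrackingState == SkeletonTrackingState.Tracked)
        fullyTracked++;                 // all 20 joints are available
    else if (s.TrackingState == SkeletonTrackingState.PositionOnly)
        positionOnly++;                 // only the overall skeleton Position is available
}
Console.WriteLine("Fully tracked: {0}, position only: {1}", fullyTracked, positionOnly);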
Skeleton Tracking
• Seated skeleton: up to 10 joints
• The seated pipeline provides a different segmentation mask than the default pipeline:
• Continuity of the segmentation mask is not guaranteed outside of the arms, head, and shoulder areas
• The seated segmentation mask doesn't correspond exactly to the player outline like the standing (full-body) mask does
• The seated pipeline environment has less data, with more noise and variability, than the standing environment
• The seated mode uses more resources than the default pipeline and yields a lower throughput (in frames per second) on the same scene

kinect.SkeletonStream.TrackingMode = SkeletonTrackingMode.Seated;
Capturing and Processing Skeleton Data
• Enable the skeleton stream channel
• Attach the event handler to the skeleton stream channel
• Process the incoming skeleton frames
• Render a joint on the UI

this.sensor = KinectSensor.KinectSensors[0];
this.sensor.SkeletonStream.Enable();
this.sensor.SkeletonFrameReady += skeletonFrameReady;

void skeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
}
Processing Skeleton Data

Skeleton[] totalSkeleton = new Skeleton[6];

void skeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using (SkeletonFrame skeletonFrame = e.OpenSkeletonFrame())
    {
        if (skeletonFrame == null)
        {
            return;
        }
        skeletonFrame.CopySkeletonDataTo(totalSkeleton);
        Skeleton firstSkeleton = (from trackskeleton in totalSkeleton
                                  where trackskeleton.TrackingState == SkeletonTrackingState.Tracked
                                  select trackskeleton).FirstOrDefault();
        if (firstSkeleton == null)
        {
            return;
        }
        if (firstSkeleton.Joints[JointType.HandRight].TrackingState == JointTrackingState.Tracked)
        {
            this.MapJointsWithUIElement(firstSkeleton);
        }
    }
}
Render the Right-Hand Joint on the UI
• We have to map the coordinates from skeleton space to regular image space
Render the Right-Hand Joint on the UI

private void MapJointsWithUIElement(Skeleton skeleton)
{
    Point mappedPoint = ScalePosition(skeleton.Joints[JointType.HandRight].Position);
    Canvas.SetLeft(righthand, mappedPoint.X);
    Canvas.SetTop(righthand, mappedPoint.Y);
}

• depthPoint holds the X and Y coordinates corresponding to the skeleton joint point

private Point ScalePosition(SkeletonPoint skeletonPoint)
{
    DepthImagePoint depthPoint = this.sensor.CoordinateMapper.MapSkeletonPointToDepthPoint(
        skeletonPoint, DepthImageFormat.Resolution640x480Fps30);
    return new Point(depthPoint.X, depthPoint.Y);
}
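If the joint were rendered on top of the color (RGB) stream instead of the depth image, the same CoordinateMapper offers MapSkeletonPointToColorPoint. A minimal sketch, assuming the color stream is enabled at 640x480 (ScalePositionToColor is an illustrative name, not part of the lecture project):

private Point ScalePositionToColor(SkeletonPoint skeletonPoint)
{
    ColorImagePoint colorPoint = this.sensor.CoordinateMapper.MapSkeletonPointToColorPoint(
        skeletonPoint, ColorImageFormat.RgbResolution640x480Fps30);
    return new Point(colorPoint.X, colorPoint.Y);
}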
Build TrackingHand App
• Create a new C# WPF project with the name TrackingHand
• Add the Microsoft.Kinect reference
• Design the GUI
• Add the WindowLoaded() method in the XAML file
• Add the code
GUI Design
• Add a Canvas control, then add an Ellipse control inside the Canvas (a code-behind alternative is sketched below)
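If you prefer to build the GUI from code-behind instead of the XAML designer, a minimal sketch follows; the names skeletonCanvas and righthand are illustrative, and the later slides assume an Ellipse named righthand exists so that MapJointsWithUIElement() can move it.

// Needs System.Windows.Controls, System.Windows.Shapes, and System.Windows.Media.
Canvas skeletonCanvas = new Canvas { Width = 640, Height = 480 };
Ellipse righthand = new Ellipse
{
    Width = 20,
    Height = 20,
    Fill = Brushes.Red      // any visible brush works
};
skeletonCanvas.Children.Add(righthand);
this.Content = skeletonCanvas;   // make the canvas the window's content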
Adding Code
• Add member variables:

KinectSensor sensor;
Skeleton[] totalSkeleton = new Skeleton[6];

• WindowLoaded() method (WindowClosing() is the same as before):

private void WindowLoaded(object sender, RoutedEventArgs e)
{
    this.sensor = KinectSensor.KinectSensors[0];
    this.sensor.SkeletonStream.TrackingMode = SkeletonTrackingMode.Seated;
    this.sensor.SkeletonStream.Enable();
    this.sensor.SkeletonFrameReady += skeletonFrameReady;
    // start the sensor.
    this.sensor.Start();
}
Adding Code
• Event handler for skeleton frames:

void skeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using (SkeletonFrame skeletonFrame = e.OpenSkeletonFrame())
    {
        if (skeletonFrame == null)
        {
            return;
        }
        skeletonFrame.CopySkeletonDataTo(totalSkeleton);
        Skeleton firstSkeleton = (from trackskeleton in totalSkeleton
                                  where trackskeleton.TrackingState == SkeletonTrackingState.Tracked
                                  select trackskeleton).FirstOrDefault();
        if (firstSkeleton == null)
        {
            return;
        }
        if (firstSkeleton.Joints[JointType.HandRight].TrackingState == JointTrackingState.Tracked)
        {
            this.MapJointsWithUIElement(firstSkeleton);
        }
    }
}
Adding Code
• For UI display:

private void MapJointsWithUIElement(Skeleton skeleton)
{
    Point mappedPoint = ScalePosition(skeleton.Joints[JointType.HandRight].Position);
    Canvas.SetLeft(righthand, mappedPoint.X);
    Canvas.SetTop(righthand, mappedPoint.Y);
    //this.textBox1.Text = "x=" + mappedPoint.X + ", y=" + mappedPoint.Y;
}

private Point ScalePosition(SkeletonPoint skeletonPoint)
{
    DepthImagePoint depthPoint = this.sensor.CoordinateMapper.MapSkeletonPointToDepthPoint(
        skeletonPoint, DepthImageFormat.Resolution640x480Fps30);
    return new Point(depthPoint.X, depthPoint.Y);
}
Challenge Task
• For advanced students, please modify the project to make it a drawing app
• Show all traces of the hand movement (a sketch of the trace-drawing part is given below)
• Add a virtual button to clear the traces and start a new drawing
• Add a small palette chooser for changing the color of the drawing point (an Ellipse)
• Note that you must add code such that the button/palette is pushed/selected using a gesture
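A minimal sketch of the trace-drawing part, assuming the Canvas is named skeletonCanvas (a hypothetical name; use whatever name your XAML gives it) and a Brush field currentColor that the palette chooser updates; instead of moving a single Ellipse, it drops a small new Ellipse at every mapped hand position:

private Brush currentColor = Brushes.Red;   // updated by the palette chooser

// Call this from skeletonFrameReady() in place of (or after) MapJointsWithUIElement().
private void DrawTrace(Skeleton skeleton)
{
    Point mappedPoint = ScalePosition(skeleton.Joints[JointType.HandRight].Position);
    Ellipse dot = new Ellipse { Width = 10, Height = 10, Fill = currentColor };
    Canvas.SetLeft(dot, mappedPoint.X);
    Canvas.SetTop(dot, mappedPoint.Y);
    skeletonCanvas.Children.Add(dot);       // one dot per frame forms the trace
}

// Called when the virtual "clear" button is selected by gesture.
private void ClearTraces()
{
    skeletonCanvas.Children.Clear();
    skeletonCanvas.Children.Add(righthand); // keep the cursor ellipse if one is used
}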