
GPU-Enhanced Real-Time Video Segmentation



Presentation Transcript


  1. Adam Wagner, Kevin Forbes: GPU-Enhanced Real-Time Video Segmentation

  2. Motivation
     • Take advantage of the GPU architecture for a highly parallel, data-intensive application
     • Enhance image segmentation using Microsoft Kinect IR depth images
     • Reduce frame-to-frame segmentation overhead with optical flow and iterative simulated annealing
     • Reference paper: “Depth-supported real-time video segmentation with the Kinect”
     • The algorithm uses the Potts model and the Metropolis method for segmentation on the GPU
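
To make the Potts/Metropolis bullet concrete, here is a minimal CPU sketch in C++ of one Metropolis sweep over a label image. It is an illustration under stated assumptions, not the presenters' or the paper's code: the names `LabelImage`, `localEnergy`, `metropolisAccept`, and `metropolisSweep` are hypothetical, the coupling `J` is taken as a constant over 4-connected neighbours (the paper may instead weight neighbours by colour or depth similarity), and the sweep is sequential rather than parallel.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical container: one Potts state (segment label) per pixel, row-major.
struct LabelImage {
    int width;
    int height;
    std::vector<int> labels;

    int at(int x, int y) const {
        return labels[static_cast<std::size_t>(y) * width + x];
    }
};

// Local Potts energy of pixel (x, y) if it carried `state`:
// every 4-connected neighbour in the same state contributes -J.
// A constant J is an assumption made for this sketch.
double localEnergy(const LabelImage& img, int x, int y, int state, double J) {
    static const int dx[4] = {1, -1, 0, 0};
    static const int dy[4] = {0, 0, 1, -1};
    double energy = 0.0;
    for (int k = 0; k < 4; ++k) {
        const int nx = x + dx[k];
        const int ny = y + dy[k];
        if (nx < 0 || ny < 0 || nx >= img.width || ny >= img.height) continue;
        if (img.at(nx, ny) == state) energy -= J;
    }
    return energy;
}

// Metropolis rule: always accept a move that does not raise the energy,
// otherwise accept with probability exp(-deltaE / T). `u` is uniform in [0, 1).
bool metropolisAccept(double deltaE, double temperature, double u) {
    assert(temperature > 0.0);
    return deltaE <= 0.0 || u < std::exp(-deltaE / temperature);
}

// One sequential sweep: propose a random new state for every pixel.
// Returns the number of accepted state changes.
int metropolisSweep(LabelImage& img, int numStates, double J,
                    double temperature, std::mt19937& rng) {
    std::uniform_int_distribution<int> pickState(0, numStates - 1);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    int accepted = 0;
    for (int y = 0; y < img.height; ++y) {
        for (int x = 0; x < img.width; ++x) {
            const int oldState = img.at(x, y);
            const int newState = pickState(rng);
            if (newState == oldState) continue;
            const double deltaE = localEnergy(img, x, y, newState, J) -
                                  localEnergy(img, x, y, oldState, J);
            if (metropolisAccept(deltaE, temperature, uniform(rng))) {
                img.labels[static_cast<std::size_t>(y) * img.width + x] = newState;
                ++accepted;
            }
        }
    }
    return accepted;
}
```

Simulated annealing, as mentioned in the motivation, would call `metropolisSweep` repeatedly while lowering `temperature`; the cooling schedule is not given in the slides, so none is assumed here.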

  3. Implementation
     No base source code was available
     Software frameworks:
     • OpenCV – image capture, transformations, optical flow
     • OpenNI – Kinect middleware
     • CUDA – NVIDIA GPGPU architecture
     Testbed: rcl1.engr.arizona.edu
     • CPU: Quad-core Intel Xeon 5160, 3.0 GHz
     • GPU: NVIDIA GeForce GTX 480
       • 480 CUDA cores
       • GDDR5 memory
       • Threads/block = 1024
       • Shared memory/block = 48 KB

  4. Methodology
     • Primary effort focused on parallelizing the segmentation algorithm
     • Without source code, the algorithm was written from scratch for the CPU, then parallelized
     • Memory indexing was rearranged to improve coalescing of global loads/stores
     • Much later in the semester, some code became available from the paper's authors
     • The image is divided into thread blocks on the GPU
     • Image data is loaded from global memory into each block's shared memory
     • Each thread performs the state update for a single pixel
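
The indexing ideas on this slide can be sketched on the CPU in C++. The slides do not say which rearrangement was used, so the interleaved-to-planar conversion below is only one common option, shown as an assumption: with planar channels, neighbouring threads that handle neighbouring pixels read neighbouring addresses. The names `PlanarImage`, `toPlanar`, `PixelCoord`, and `pixelForThread` are hypothetical, and the block dimensions are parameters rather than the values used in the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical planar (structure-of-arrays) image: one array per channel.
struct PlanarImage {
    int width;
    int height;
    std::vector<std::uint8_t> c0;
    std::vector<std::uint8_t> c1;
    std::vector<std::uint8_t> c2;
};

// Convert an interleaved 3-channel image (c0 c1 c2 c0 c1 c2 ...) to planar form.
// In the interleaved layout, consecutive pixels of one channel are 3 bytes apart;
// in the planar layout they are adjacent, which favours coalesced access.
PlanarImage toPlanar(const std::vector<std::uint8_t>& interleaved,
                     int width, int height) {
    const std::size_t n = static_cast<std::size_t>(width) * height;
    assert(interleaved.size() == 3 * n);
    PlanarImage out{width, height,
                    std::vector<std::uint8_t>(n),
                    std::vector<std::uint8_t>(n),
                    std::vector<std::uint8_t>(n)};
    for (std::size_t i = 0; i < n; ++i) {
        out.c0[i] = interleaved[3 * i];
        out.c1[i] = interleaved[3 * i + 1];
        out.c2[i] = interleaved[3 * i + 2];
    }
    return out;
}

// Pixel handled by a given thread when the image is tiled into 2-D thread blocks.
struct PixelCoord {
    int x;
    int y;
    bool inside;  // false for threads that fall past the image border
};

// Mirrors the usual CUDA mapping x = blockIdx.x * blockDim.x + threadIdx.x
// (and likewise for y), written as a plain function for illustration.
PixelCoord pixelForThread(int blockX, int blockY, int threadX, int threadY,
                          int blockDimX, int blockDimY, int width, int height) {
    const int x = blockX * blockDimX + threadX;
    const int y = blockY * blockDimY + threadY;
    return {x, y, x < width && y < height};
}

// Row-major index of pixel (x, y) within one planar channel.
std::size_t linearIndex(int x, int y, int width) {
    return static_cast<std::size_t>(y) * width + x;
}
```

With the testbed's limit of 1024 threads per block, a square tile could be at most 32 x 32 threads; threads whose `inside` flag is false would simply skip the update.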

  5. Results
     • Input image: 512 x 384
     • RGB-to-HSV conversion
     • 2000 Metropolis iterations
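
For reference, the RGB-to-HSV step named above can be sketched per pixel in C++ with the standard hexcone formula. This is not the project's code (the project lists OpenCV, which provides its own conversion); `Hsv` and `rgbToHsv` are hypothetical names, inputs are assumed normalized to [0, 1], and hue is returned in degrees, which differs from OpenCV's 8-bit convention.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical result type: hue in degrees [0, 360), saturation and value in [0, 1].
struct Hsv {
    double h;
    double s;
    double v;
};

// Standard hexcone RGB-to-HSV conversion for one pixel with r, g, b in [0, 1].
Hsv rgbToHsv(double r, double g, double b) {
    assert(r >= 0.0 && r <= 1.0 && g >= 0.0 && g <= 1.0 && b >= 0.0 && b <= 1.0);
    const double maxC = std::max({r, g, b});
    const double minC = std::min({r, g, b});
    const double delta = maxC - minC;

    double h = 0.0;  // hue is undefined for greys; 0 is used by convention
    if (delta > 0.0) {
        if (maxC == r) {
            h = 60.0 * std::fmod((g - b) / delta, 6.0);
        } else if (maxC == g) {
            h = 60.0 * ((b - r) / delta + 2.0);
        } else {
            h = 60.0 * ((r - g) / delta + 4.0);
        }
        if (h < 0.0) h += 360.0;  // fmod keeps the sign of its first argument
    }

    const double s = (maxC > 0.0) ? delta / maxC : 0.0;
    return {h, s, maxC};
}
```

Each pixel is converted independently of every other pixel, which is why this step maps naturally onto one GPU thread per pixel.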

  6. Conclusions
     • The parallelized algorithm shows vast improvement over the CPU version
     • Makes real-time video processing a possibility
     • The implementation does not match the paper
     • More improvement is possible through:
       • Use of simpler data types
       • Still more finely tuned memory arrangement
       • Increasing the work done by each thread
