
Dense Image Over-segmentation on a GPU

Alex Rodionov, 4/24/2009



Presentation Transcript


  1. Alex Rodionov 4/24/2009 Dense Image Over-segmentation on a GPU

  2. Outline
      1) Problem/Motivation
      2) Goals
      3) Overview of Algorithm
      4) Implementation
      5) Results
      6) Future work & Conclusions

  3. Problem

  4. Applications
      • Computer vision (object recognition, intelligent segmentation, ...)
      • Apply Algorithm X to a grid of superpixels vs. a grid of pixels
      • Image compression

  5. Goals
      • Run at interactive framerates
      • Fast enough for interactive use (video)
      • Segment a 640x480 image in 200 ms or less (5+ FPS)

  6. The Algorithm
      0) Preprocess image (convert to grayscale, smooth)
      1) Calculate speed map
      2) Place N seed points throughout the image
      3) Initialize a distance function to these seeds
      4) Evolve distance function one timestep
      5) Superpixel boundaries: pixels where distance function == 0

  7. 0) Preprocess
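The preprocessing slide only names two operations, grayscale conversion and smoothing. A minimal sketch of both follows; the Rec. 601 luma weights and the 3x3 box filter are assumptions chosen for illustration, since the talk does not say which conversion or filter was used, and `to_grayscale` and `box_blur3` are hypothetical names.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to luminance.

    Assumption: Rec. 601 luma weights; the slides do not name a formula.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]


def box_blur3(img):
    """Smooth with a 3x3 box filter, averaging only in-bounds neighbours."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += img[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out
```

Each output pixel depends only on a fixed neighbourhood of the input, which is why this stage maps naturally onto one GPU thread per pixel.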

  8. 1) Calculate Speed - Function of image gradient magnitude (edge strength)
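As an illustration of a speed map driven by edge strength, the sketch below computes a finite-difference gradient magnitude and maps it to a speed that is 1 in flat regions and falls toward 0 on strong edges. The exponential mapping and the `scale` parameter are assumptions; the slide only states that speed is a function of gradient magnitude.

```python
import math


def gradient_magnitude(img):
    """Gradient magnitude using central differences (one-sided at borders)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx = (img[y][x1] - img[y][x0]) / max(x1 - x0, 1)
            gy = (img[y1][x] - img[y0][x]) / max(y1 - y0, 1)
            mag[y][x] = math.hypot(gx, gy)
    return mag


def speed_map(img, scale=10.0):
    """Hypothetical mapping: speed = exp(-|gradient| / scale)."""
    return [[math.exp(-m / scale) for m in row]
            for row in gradient_magnitude(img)]
```

With a mapping of this shape, growing superpixels slow down as they approach image edges, so boundaries tend to settle there.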

  9. 2) Place seeds
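One simple way to spread roughly N seeds over the image is a regular grid whose cell shape follows the image aspect ratio. The sketch below assumes that layout; the slide does not describe the placement rule, and `place_seeds` is a hypothetical helper.

```python
import math


def place_seeds(width, height, n_seeds):
    """Return (x, y) seed positions at the centres of a rows x cols grid.

    The grid is chosen so that rows * cols is close to n_seeds and the
    cells are roughly square; the exact count may differ slightly.
    """
    cols = max(1, round(math.sqrt(n_seeds * width / height)))
    rows = max(1, round(n_seeds / cols))
    seeds = []
    for r in range(rows):
        for c in range(cols):
            x = int((c + 0.5) * width / cols)
            y = int((r + 0.5) * height / rows)
            seeds.append((x, y))
    return seeds
```

For example, `place_seeds(640, 480, 1000)` yields a grid of close to 1000 seeds, all inside the image.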

  10. 3) Init Distance Function - Pixel value = distance to closest seed
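Taken literally, "pixel value = distance to closest seed" can be written as a brute-force reference. This sketch is for clarity only: it costs O(pixels x seeds) and is not presented as what the GPU code does. `init_distance` is a hypothetical name.

```python
import math


def init_distance(width, height, seeds):
    """Brute-force distance map: Euclidean distance to the closest seed."""
    return [[min(math.hypot(x - sx, y - sy) for sx, sy in seeds)
             for x in range(width)]
            for y in range(height)]


# Two seeds at the ends of a 5-pixel row.
row = init_distance(5, 1, [(0, 0), (4, 0)])[0]
```

A reference like this is mainly useful for checking a faster distance transform against known-correct values on small images.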

  11. 4) Evolve the function
      - Superpixels grow over time
      - Evolution is governed by a partial differential equation
      - d(pixel)/dt is a function of spatial derivatives, speed, and proximity to other superpixels
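To make the evolution step concrete, here is a deliberately simplified explicit update for a level-set function `psi` that is negative inside a superpixel: psi_new = psi - dt * speed * |grad psi|, with a first-order upwind gradient. This is an assumption-laden sketch: it omits the proximity-to-other-superpixels term the slide mentions and any curvature smoothing, and `evolve_step` is a hypothetical name.

```python
import math


def evolve_step(psi, speed, dt):
    """One explicit timestep of psi_t = -speed * |grad psi| (speed >= 0).

    Uses a first-order upwind gradient so that regions where psi < 0
    expand outward. Borders are handled by clamping neighbour indices.
    """
    h, w = len(psi), len(psi[0])
    new = [row[:] for row in psi]
    for y in range(h):
        for x in range(w):
            c = psi[y][x]
            dxm = c - psi[y][max(x - 1, 0)]        # backward difference in x
            dxp = psi[y][min(x + 1, w - 1)] - c    # forward difference in x
            dym = c - psi[max(y - 1, 0)][x]        # backward difference in y
            dyp = psi[min(y + 1, h - 1)][x] - c    # forward difference in y
            grad = math.sqrt(max(dxm, 0.0) ** 2 + min(dxp, 0.0) ** 2 +
                             max(dym, 0.0) ** 2 + min(dyp, 0.0) ** 2)
            new[y][x] = c - dt * speed[y][x] * grad
    return new
```

Every pixel reads only its four neighbours from the previous timestep, so a step of this form can run as one independent thread per pixel, with the whole update repeated once per timestep.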

  12. 5) Extract boundary - Zero-crossings of distance function define boundary
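A zero-crossing test can be sketched as a sign comparison between each pixel and its right and lower neighbours. The convention that negative values mean "inside" is an assumption, and `zero_crossings` is a hypothetical helper.

```python
def zero_crossings(psi):
    """Mark pixels where psi changes sign toward the right or lower neighbour."""
    h, w = len(psi), len(psi[0])
    boundary = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            inside = psi[y][x] < 0
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w and (psi[ny][nx] < 0) != inside:
                    boundary[y][x] = True
    return boundary
```

Checking only two of the four neighbours is enough to catch every sign change once, and it marks the pixel on the near side of each crossing.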

  13. Result

  14. GPU Implementation
      • Original implementation of TurboPixels done in MATLAB, parts accelerated with C
      • GPU Implementation is C++ accelerated with CUDA
      • Not all parts of original algorithm mappable to GPU (algorithms not parallel!)

  15. GPU Implementation
      • Example: Distance Transform
        • For each pixel, get distance to nearest pixel-of-importance
        • Used in initializing distance function
        • Used during evolution to get nearest superpixel boundary point
        • Original algorithm used Fast Marching Method to calculate (uses global data structure, not parallel)
        • Replace with a GPU-friendly substitute
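The slides do not say which GPU-friendly substitute replaced Fast Marching. One well-known parallel option is jump flooding, sketched below purely as an illustration and not as the method used in the talk. In each pass every pixel looks at neighbours `step` pixels away and keeps the closest source seen so far, and `step` halves after each pass. The result is approximate in general, although it is exact for the small cases shown. `jump_flood_distance` is a hypothetical name.

```python
import math


def jump_flood_distance(width, height, sources):
    """Approximate distance to the nearest source via jump flooding."""
    if not sources:
        raise ValueError("at least one source pixel is required")
    # nearest[y][x] is the (x, y) of the closest source found so far, or None.
    nearest = [[None] * width for _ in range(height)]
    for sx, sy in sources:
        nearest[sy][sx] = (sx, sy)
    step = 1
    while step * 2 < max(width, height):
        step *= 2
    while step >= 1:
        nxt = [row[:] for row in nearest]   # double buffer, as on a GPU
        for y in range(height):
            for x in range(width):
                best = nearest[y][x]
                best_d = (math.inf if best is None
                          else math.hypot(x - best[0], y - best[1]))
                for dy in (-step, 0, step):
                    for dx in (-step, 0, step):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < height and 0 <= xx < width:
                            cand = nearest[yy][xx]
                            if cand is not None:
                                d = math.hypot(x - cand[0], y - cand[1])
                                if d < best_d:
                                    best, best_d = cand, d
                nxt[y][x] = best
        nearest = nxt
        step //= 2
    return [[math.hypot(x - nearest[y][x][0], y - nearest[y][x][1])
             for x in range(width)]
            for y in range(height)]
```

The appeal for a GPU is that each pass is a fixed nine-sample gather per pixel with no shared queue or heap, and the number of passes grows only logarithmically with image size.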

  16. GPU Implementation
      • Assignment map
        • Array of superpixel IDs
        • Keeps track of superpixel coverage, ownership
        • Prevents merging of superpixels
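The role of an assignment map can be shown with a toy version: an array holding one superpixel ID per pixel, where a pixel is claimed only if all of its already-assigned neighbours agree on a single ID. Here growth is a plain one-pixel dilation, which is an assumption standing in for the evolving distance function, and the names `UNASSIGNED`, `init_assignment`, and `grow_assignment` are hypothetical.

```python
UNASSIGNED = -1


def init_assignment(width, height, seeds):
    """Give each seed pixel its own superpixel ID; everything else is free."""
    amap = [[UNASSIGNED] * width for _ in range(height)]
    for sid, (x, y) in enumerate(seeds):
        amap[y][x] = sid
    return amap


def grow_assignment(amap):
    """Grow every superpixel by one pixel without letting any two merge."""
    h, w = len(amap), len(amap[0])
    new = [row[:] for row in amap]
    for y in range(h):
        for x in range(w):
            if amap[y][x] != UNASSIGNED:
                continue  # an owned pixel never changes owner
            ids = set()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and amap[ny][nx] != UNASSIGNED:
                    ids.add(amap[ny][nx])
            if len(ids) == 1:
                new[y][x] = ids.pop()
            # Two or more different IDs: leave the pixel free as a boundary.
    return new
```

Pixels that sit between two different superpixels stay unassigned, which is one simple way to keep neighbouring regions from merging.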

  17. Performance Optimizations
      • Use CUDA arrays and textures when a kernel performs random or neighbor accesses
      • Little use of shared memory
        • Kernels either read once and write once, or have complex access patterns not easily handled with shared memory
      • Loop unrolling, aligning image array sizes, ...

  18. Results

  19. Sample Result
      • Platform: NVIDIA GTX280
      • Image size: 640x480
      • Time: 443 ms (2.25 FPS)
      • Timesteps: 122
      • Superpixels: 1000
      • Software implementation: 30 sec on 481x321 image
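The reported figures can be put side by side with a little arithmetic. The per-pixel throughput comparison is a rough normalization and an assumption on my part, because the GPU and software runs used different image sizes and the slides give no like-for-like measurement.

```python
gpu_ms = 443.0        # reported GPU time for one 640x480 image
goal_ms = 200.0       # stated goal (5 FPS)
timesteps = 122       # reported number of timesteps

fps = 1000.0 / gpu_ms                  # frames per second actually achieved
needed_speedup = gpu_ms / goal_ms      # further speedup required to hit the goal
ms_per_timestep = gpu_ms / timesteps   # average cost of one evolution step

# Rough per-pixel throughput: GPU on 640x480 vs. software on 481x321 in 30 s.
gpu_px_per_s = (640 * 480) / (gpu_ms / 1000.0)
sw_px_per_s = (481 * 321) / 30.0
throughput_ratio = gpu_px_per_s / sw_px_per_s

print(f"{fps:.2f} FPS, {needed_speedup:.2f}x short of goal, "
      f"{ms_per_timestep:.2f} ms per timestep, "
      f"about {throughput_ratio:.0f}x software throughput per pixel")
```

In round numbers this is about 2.26 FPS (the slide truncates it to 2.25), a bit over 2x short of the 200 ms goal, roughly 3.6 ms per timestep, and on the order of 135x the software implementation's per-pixel throughput.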

  20. Conclusions
      • Implemented a TurboPixels-like image oversegmentation algorithm on a GPU
      • Performance goal of 5 FPS on 640x480 not quite attained
      • Achieved significant speedup over software implementation (although one written mostly in MATLAB...)

  21. Future Work
      • Algorithmic optimizations:
        • Evolve area around the expanding boundary, instead of evolving everything
        • Use variable-length timesteps to reduce number of timesteps and amount of work
      • Application to video:
        • Once it runs fast enough, then what?
        • Modify algorithm to take advantage of inter-frame coherence
