
Performance modeling in GPGPU computing


Presentation Transcript


  1. Performance modeling in GPGPU computing. Wenjing Xu. Professor: Dr. Box

  2. What’s GPGPU? GPU-accelerated computing is the use of a graphics processing unit (GPU) together with a CPU to accelerate scientific, engineering, and enterprise applications.

  3. What's modeling? A model is a simplified representation of a system or phenomenon, and the most explicit way to describe one. We use the parameters we set to build formulas for analyzing the system.

  4. Related work. Hong and Kim [3] introduce two metrics, Memory Warp Parallelism (MWP) and Computation Warp Parallelism (CWP), to describe the GPU parallel architecture. Zhang and Owens [4] develop a performance model based on their microbenchmarks so that they can identify bottlenecks in a program. Supada [5] proposes a performance model in which memory latencies vary depending on the data type and the type of memory.
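The CWP/MWP idea from Hong and Kim [3] can be sketched in a few lines. The formulas below follow the intuition described above (warps overlapping compute with one warp's memory wait, and warps with memory requests in flight); all cycle counts and warp counts are illustrative assumptions, not measurements from any device.

```python
# Illustrative sketch of Hong and Kim's CWP/MWP metrics [3].
# All numbers used below are made-up example values, not measurements.

def cwp(mem_cycles, comp_cycles, active_warps):
    """Computation Warp Parallelism: how many warps the SM can
    execute while one warp waits on memory (capped by the warps
    actually resident on the SM)."""
    return min((mem_cycles + comp_cycles) / comp_cycles, active_warps)

def mwp(mem_latency, issue_interval, active_warps):
    """Memory Warp Parallelism: how many warps can have memory
    requests in flight at once (capped by the warps resident)."""
    return min(mem_latency / issue_interval, active_warps)

# Example: 400-cycle memory latency, 20 compute cycles per memory
# request, a memory request issued every 10 cycles, 24 active warps.
print(cwp(400, 20, 24))   # 21.0
print(mwp(400, 10, 24))   # 24
```

Comparing the two (here MWP > CWP) indicates which side, computation or memory, bounds the kernel in this style of model.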

  5. Introduction and background. Different applications and devices cannot use the same settings. The goal is to find the relationship between the parameters in this model and to choose the best block size for each application on each device to reach peak performance.

  6. Different data sizes combined with different block sizes yield different performance.

  7. How the GPU works

  8. Memory latency hiding
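The latency-hiding point can be made concrete with a Little's-law style estimate: to keep the execution units busy while warps wait on memory, the SM needs roughly latency / issue interval warps in flight. The numbers below are illustrative assumptions, not GTX 650 measurements.

```python
import math

# Rough latency-hiding estimate (Little's law): while one warp waits
# on a memory access, other resident warps must be able to issue
# instructions, so we need about latency / issue_interval warps.
# The cycle counts are illustrative assumptions only.

def warps_to_hide(latency_cycles, cycles_per_issue):
    """Minimum resident warps needed so that memory latency is
    fully overlapped with useful work."""
    return math.ceil(latency_cycles / cycles_per_issue)

# ~400-cycle global-memory latency, one instruction issued every 4 cycles:
print(warps_to_hide(400, 4))  # 100
```

This is why the slides stress resident thread count: too few warps and the SM stalls, but (as the conclusion notes) each extra warp also consumes registers and shared memory.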

  9. The structure of threads

  10. Specification of GeForce GTX 650

  11. Parameters

  12. Block size setting under the thread limitation: NMB >= NTB = N * NTW >= NRT / NRB (N is an integer)
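The constraint above (block size NTB is a whole number N of warps of NTW threads, at most the per-block maximum NMB, and at least NRT / NRB so the resident-block limit NRB does not cap occupancy) can be enumerated directly. The hardware limits below are illustrative Kepler-like assumptions, not the slide's GTX 650 specification.

```python
# Enumerate block sizes NTB satisfying the slide's constraint
#   NMB >= NTB = N * NTW >= NRT / NRB   (N an integer).
# The limits are illustrative (Kepler-like) assumptions.

N_MB = 1024   # max threads per block
N_TW = 32     # threads per warp
N_RT = 2048   # max resident threads per SM
N_RB = 16     # max resident blocks per SM

valid = [n * N_TW
         for n in range(1, N_MB // N_TW + 1)
         if n * N_TW >= N_RT / N_RB]
print(valid)  # 128, 160, 192, ..., 1024
```

With these assumed limits the smallest admissible block is 128 threads: anything smaller would need more than NRB resident blocks to fill the SM's NRT thread slots.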

  13. Memory resource

  14. Block size setting under stream multiprocessor resources: MR / MTR >= N * NTB, N * NTB <= NRT, N <= MSM / MSB (N is an integer)
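Reading MR as registers per SM, MTR as registers per thread, MSM as shared memory per SM, and MSB as shared memory per block (an interpretation of the slide's symbols, stated here as an assumption), the three inequalities bound N, the number of blocks resident on one SM. A sketch with illustrative Kepler-like limits:

```python
# Blocks per SM under the slide's resource constraints:
#   MR / MTR >= N * NTB,   N * NTB <= NRT,   N <= MSM / MSB.
# Hardware values are illustrative (Kepler-like) assumptions.

def blocks_per_sm(N_TB, M_TR, M_SB,
                  M_R=65536, M_SM=49152, N_RT=2048, N_RB=16):
    """Max integer N blocks of N_TB threads that fit on one SM, given
    M_TR registers per thread and M_SB bytes of shared memory per block."""
    by_regs   = M_R // (M_TR * N_TB)            # register-file limit
    by_thread = N_RT // N_TB                    # resident-thread limit
    by_smem   = M_SM // M_SB if M_SB else N_RB  # shared-memory limit
    return min(by_regs, by_thread, by_smem, N_RB)

# 256-thread blocks, 32 registers per thread, 4 KB shared memory per block:
print(blocks_per_sm(256, 32, 4096))  # 8
```

Whichever of the three limits binds first determines occupancy, which is the balance point the conclusion refers to.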

  15. Conclusion. Although more threads can hide memory access latency, the more threads used, the more resources needed. Finding the balance point between the resource limits and memory latency is a shortcut to peak performance. Across different applications and devices this performance model shows its advantage: it is adaptable and, without any rework or redesign, lets an application run at its best tuning.
