GPU Programming Overview

Presentation Transcript

  1. GPU Programming Overview • Spring 2011 • 류승택

  2. What is a GPU? • GPU stands for Graphics Processing Unit • Put simply, it is the processor that resides on your graphics card • GPUs allow us to achieve the unprecedented graphics capabilities now available in games (Demo: NVIDIA GTX 400)

  3. Introduction • GPGPU (General-Purpose Computation on GPUs) • The first commodity, programmable parallel architecture • GPU evolution is driven by the computer game market • Advantage of data parallelism • GPUs can be more than 10x faster than CPUs for appropriate problems • Advantage of commodity hardware • GPUs are inexpensive • GPUs are ubiquitous • Desktops, laptops, PDAs, cell phones • Achieving this speedup • Requires a large amount of GPU-specific knowledge

  4. Motivation • Challenge Statement • GPGPU signifies the dawn of the desktop parallel computing age

  5. Why Program on the GPU? Graph from: http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/CUDA_C_Programming_Guide.pdf

  6. Why Program on the GPU? • Compute • Intel Core i7 – 4 cores – 100 GFLOPS • NVIDIA GTX280 – 240 cores – 1 TFLOPS • Memory Bandwidth • System Memory – 60 GB/s • NVIDIA GT200 – 150 GB/s • Install Base • Over 200 million NVIDIA G80s shipped

  7. How did this happen? • Games demand advanced shading • Fast GPUs = better shading • Need for speed = continued innovation • The gaming industry has overtaken the defense, finance, oil and healthcare industries as the main driving factor for high performance processors.

  8. NVIDIA GPU Evolution Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

  9. Real-time Rendering • Graphics hardware enables real-time rendering • Real-time means a display rate of more than 10 images per second • 3D Scene = Collection of 3D primitives (triangles, lines, points) • Image = Array of pixels

  10. Graphics Review • Modeling • Rendering • Animation

  11. Graphics Review: Modeling • Modeling • Polygons vs Triangles • How do you store a triangle mesh? • Implicit Surfaces • Height maps • …
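
One common answer to the slide's question about storing a triangle mesh is an indexed layout: a list of unique vertex positions plus a list of index triples. The sketch below is only an illustration under that assumption; the names `vertices`, `triangles`, `triangle_positions`, and `floats_stored` are hypothetical and do not come from the slides.

```python
# Hypothetical indexed triangle mesh for a unit quad (illustrative names).
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (1.0, 1.0, 0.0),
    (0.0, 1.0, 0.0),
]
# Each triangle is three indices into `vertices`.
# The two triangles of the quad share vertices 0 and 2.
triangles = [(0, 1, 2), (0, 2, 3)]


def triangle_positions(mesh_vertices, mesh_triangles, i):
    """Return the three vertex positions of triangle i."""
    return tuple(mesh_vertices[k] for k in mesh_triangles[i])


def floats_stored(mesh_vertices, mesh_triangles):
    """Compare position storage: indexed layout vs. unshared 'triangle soup'."""
    indexed = 3 * len(mesh_vertices)   # 3 floats per unique vertex
    soup = 9 * len(mesh_triangles)     # 3 vertices x 3 floats per triangle
    return indexed, soup
```

Sharing vertices through indices means a position is stored once even when several triangles use it; the index list itself is extra storage that the count above leaves out.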

  12. Triangles Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

  13. Triangles Image courtesy of A K Peters, Ltd. www.virtualglobebook.com. Imagery from NASA Visible Earth: visibleearth.nasa.gov.

  14. Triangles

  15. Triangles

  16. Implicit Surfaces Images from GPU Gems 3: http://http.developer.nvidia.com/GPUGems3/gpugems3_ch01.html

  17. Height Maps Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

  18. Graphics Review: Rendering • Rendering • Goal: Assign color to pixels • Two Parts • Visible surfaces • What is in front of what for a given view • Shading • Simulate the interaction of material and light to produce a pixel color

  19. Rasterization • What about ray tracing?

  20. Visible Surfaces Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

  21. Visible Surfaces • Z-Buffer / Depth Buffer • Fragment vs Pixel Image courtesy of A K Peters, Ltd. www.virtualglobebook.com
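
The z-buffer idea can be sketched in a few lines. This is a minimal illustration, assuming that smaller depth values are closer to the viewer and that fragments arrive as `(x, y, z, rgb)` tuples; the function name and data layout are hypothetical. It also shows the fragment/pixel distinction: several fragments may compete for one pixel, and only the nearest one ends up in the image.

```python
def rasterize_with_depth_test(fragments, width, height, clear_color=(0, 0, 0)):
    """Resolve visibility per pixel with a depth buffer.

    fragments: iterable of (x, y, z, rgb) candidates produced by rasterization.
    Assumes smaller z means closer to the viewer.
    """
    depth = [[float("inf")] * width for _ in range(height)]
    color = [[clear_color] * width for _ in range(height)]
    for x, y, z, rgb in fragments:
        inside = 0 <= x < width and 0 <= y < height
        if inside and z < depth[y][x]:
            # This fragment is nearer than anything seen so far at this pixel.
            depth[y][x] = z
            color[y][x] = rgb
    return color, depth
```

Because the test is applied per fragment, the result does not depend on the order in which triangles are drawn, as long as no two fragments at the same pixel have equal depth.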

  22. Shading Images courtesy of A K Peters, Ltd. www.virtualglobebook.com

  23. Shading Image from GPU Gems 3: http://http.developer.nvidia.com/GPUGems3/gpugems3_ch14.html

  24. Graphics Pipeline • Vertex Transforms → Primitive Assembly → Rasterization and Interpolation → Raster Operations → Frame Buffer • Raster operations: Scissor Test • Stencil Test • Depth Test • Blending

  25. Graphics Pipeline Images courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

  26. Graphics Pipeline Images courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

  27. Graphics Pipeline Images courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

  28. Graphics Pipeline Images courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

  29. Graphics Review: Animation • Move the camera and/or agents, and re-render the scene • In less than about 16.7 ms per frame (60 fps)
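
The per-frame time budget follows directly from the target frame rate. A small sketch, with the hypothetical helper name `frame_budget_ms`:

```python
def frame_budget_ms(frames_per_second):
    """Time available to update and render one frame, in milliseconds."""
    return 1000.0 / frames_per_second
```

At 60 fps this gives roughly 16.67 ms per frame; at the 10 images per second used earlier as the threshold for real-time, it gives 100 ms.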

  30. Evolution of the Programmable Graphics Pipeline • Pre-GPU • Fixed-function GPU • Programmable GPU • Unified Shader Processors

  31. Early 90s – Pre GPU Slide from Mike Houston: http://s09.idav.ucdavis.edu/talks/01-BPS-SIGGRAPH09-mhouston.pdf

  32. OpenGL Pipeline

  33. OpenGL Pipeline

  34. GPU Shader • Fixed functionalities • Programmable functionalities • Flexible memory access

  35. Stream Program => GPU • A stream is a sequence of data (could be numbers, colors, RGBA vectors, …)
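
The stream model can be illustrated on the CPU: a small function (often called a kernel) is applied independently to every element of a stream. The sketch below assumes a stream of RGBA tuples; `run_kernel` and `darken` are hypothetical names, and a real GPU would run the per-element calls in parallel rather than in a loop.

```python
def run_kernel(kernel, stream):
    """Apply `kernel` to each stream element independently.

    No element depends on another, which is what lets a GPU
    process many elements at the same time.
    """
    return [kernel(element) for element in stream]


def darken(rgba):
    """Example kernel: halve the color channels, keep alpha unchanged."""
    r, g, b, a = rgba
    return (r * 0.5, g * 0.5, b * 0.5, a)
```

For example, `run_kernel(darken, pixels)` maps an input stream of colors to an output stream of the same length.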

  36. Vertex Shader • Vertex transformation • Once per vertex • Input attributes • Normal • Texture coordinates • Colors
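
As a rough picture of "once per vertex", the sketch below applies a 4x4 matrix to each vertex position in homogeneous coordinates. It is plain Python rather than shader code, and the names `transform_vertex`, `translation`, and `run_vertex_stage` are hypothetical; a real vertex shader would also pass along the other input attributes (normal, texture coordinates, colors).

```python
def transform_vertex(matrix, position):
    """Multiply a 4x4 row-major matrix by the position (x, y, z, 1)."""
    x, y, z = position
    v = (x, y, z, 1.0)
    return tuple(
        sum(matrix[row][col] * v[col] for col in range(4))
        for row in range(4)
    )


def translation(tx, ty, tz):
    """Build a 4x4 translation matrix."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def run_vertex_stage(matrix, positions):
    """Invoke the per-vertex transform once for every input vertex."""
    return [transform_vertex(matrix, p) for p in positions]
```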

  37. Geometry Shader • Geometry composition • Once per primitive • Input primitives • Points, lines, triangles • Lines and triangles with adjacency • Output primitives • Points, line strips or triangle strips • Outputs between 0 and n primitives

  38. Fragment Shader • Per-pixel (or fragment) composition • Once per fragment • Operations on interpolated values • Vertex attributes • User-defined varying variables

  39. GPU Shader

  40. Programming Graphics Hardware

  41. PC Architecture

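  42. Bus Interface • ISA (Industry Standard Architecture) • Bus interface • In use since the XT and AT era of the 1980s • Theoretical maximum speed of 16 MB/s • Causes a severe bottleneck for peripherals • Now used only to connect devices where throughput matters little, such as sound cards and modems • PCI (Peripheral Component Interconnect) • Parallel connection • Successor to ISA, used to connect peripheral devices • Smaller than an ISA slot, and supports IRQ sharing • The common 32-bit, 33 MHz version runs at 133 MB/s; the 64-bit, 66 MHz version at 533 MB/s • Most peripherals use the PCI interface • Image labels: PCI, AGP, ISA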

  43. Bus Interface • AGP (Accelerated Graphics Port) • Developed by Intel • Based on PCI, but with a transfer rate more than twice that of PCI • Runs at 66 MHz by default • AGP = 2 x PCI (AGP 2x = 2 x AGP) • AGP 1x peaks at 264 MB/s • AGP 2x peaks at 533 MB/s • For 3D graphics cards • PCIe (PCI Express) • Serial connection (cheap, scalable) • Up to 8.0 GB/s of bandwidth (PCIe = 2 x AGP 8x) • Intel, ATI, and NVIDIA, which lead the worldwide graphics market, have firmly endorsed this new standard as the next-generation graphics interface • AGP, which was created because of PCI's limitations and had been the dominant interface for graphics processing units (GPUs), is being replaced by PCI Express • Image labels: PCIe x1, PCIe x16, PCI, GeForce 7800 GTX (PCIe x16)
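
The peak figures quoted for the parallel buses (ISA, PCI, AGP) can be approximated as bus width in bytes times clock rate times transfers per clock, giving megabytes per second. The sketch below assumes that simple model and ignores protocol overhead; `peak_bandwidth_mb_s` is a hypothetical helper, not part of any library.

```python
def peak_bandwidth_mb_s(bus_width_bits, clock_mhz, transfers_per_clock=1):
    """Theoretical peak of a parallel bus in megabytes per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz * transfers_per_clock


isa_16bit = peak_bandwidth_mb_s(16, 8)      # 16-bit ISA at 8 MHz
agp_1x = peak_bandwidth_mb_s(32, 66)        # 32-bit AGP, one transfer per clock
agp_2x = peak_bandwidth_mb_s(32, 66, 2)     # 32-bit AGP, two transfers per clock
```

With a 66 MHz clock this gives 264 MB/s for AGP 1x and 528 MB/s for AGP 2x; the commonly quoted 533 MB/s comes from using the nominal 66.66 MHz clock instead.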

  44. Generation I: 3dfx Voodoo (1996) • One of the first true 3D game cards • Worked by supplementing a standard 2D video card • Did not do vertex transformations: these were done on the CPU • Did do texture mapping and z-buffering • Diagram labels: Vertex Transforms, Primitive Assembly, Rasterization and Interpolation, Raster Operations, Frame Buffer; CPU, PCI, GPU • Image from “7 years of Graphics”

  45. 1995-1998: Texture Mapping and Z-Buffer • PCI: Peripheral Component Interconnect • 3dfx’s Voodoo

  46. Texture Mapping

  47. Texture Mapping: Perspective-Correct Interpolation

  48. Texture Mapping: Perspective-Correct Interpolation
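
Perspective-correct interpolation can be sketched for a single attribute along an edge. The usual approach, assumed here, is to interpolate `u / w` and `1 / w` linearly in screen space and then divide, where `w` is each vertex's clip-space w; the function names below are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return (1.0 - t) * a + t * b


def affine_interpolate(u0, u1, t):
    """Screen-space linear interpolation (not perspective-correct)."""
    return lerp(u0, u1, t)


def perspective_correct_interpolate(u0, w0, u1, w1, t):
    """Interpolate u/w and 1/w linearly in screen space, then divide."""
    numerator = lerp(u0 / w0, u1 / w1, t)
    denominator = lerp(1.0 / w0, 1.0 / w1, t)
    return numerator / denominator
```

With `u0 = 0`, `w0 = 1`, `u1 = 1`, `w1 = 3`, the screen-space midpoint gets a texture coordinate of about 0.25 rather than the 0.5 that affine interpolation would give, because the far end of the edge is compressed on screen. When both vertices have the same `w`, the two methods agree.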

  49. Aside: Mario Kart 64 • High fragment load / low vertex load Image from: http://www.gamespot.com/users/my_shoe/

  50. Aside: Mario Kart Wii • High fragment load / low vertex load? Image from: http://wii.ign.com/dor/objects/949580/mario-kart-wii/images/