CS179: GPU Programming

Presentation Transcript


  1. CS179: GPU Programming Lecture 9: Lab 4 Recitation

  2. Today • 3D Textures • PBOs • Fractals • Raytracing • Lighting/Phong Shading • Memory Coalescing

  3. 3D Textures • Recall advantages of textures: • Not global memory, faster accesses • Still available to all threads/blocks • Larger in size • Better caching • Filtering, clamping, sampling, etc.

  4. 3D Textures • 3D textures store volume data • Could be used for rendering smoke, particles, fractals, etc. • Allocate a 3D cudaArray to make 3D texture • cudaMalloc3DArray gives correct pitch • Declare texture in device • texture<type, 3, mode> tex • Access on device with texture sampling • tex3D(tex, x, y, z)

  5. 3D Textures • Some texture properties you can set: • tex.normalized: whether texture coordinates are normalized to [0, 1] or given in texel units • tex.filterMode: linear or point filtering • tex.addressMode[x]: wrap or clamp (for each dimension) • Bind texture to array • cudaBindTextureToArray • Unbinding is typical, but probably not necessary • All of this is done for you in lab 4!

  6. PBOs • Pixel Buffer Objects (PBOs) • Store pixel data • Used to easily render in OpenGL • Recall lab 3 • VBOs stored vertex data • Vertex data remained on GPU -- minimal transfer to/from CPU • Rendered via OpenGL on GPU • Same story here • Pixels instead of verts, but same idea

  7. PBOs • Initialize: • glGenBuffersARB(n, &pbo) • n is the number of buffer names to generate • Bind to OpenGL: • glBindBufferARB(target, pbo) • target is the buffer binding point, e.g. GL_PIXEL_UNPACK_BUFFER_ARB • Assign Data: • glBufferDataARB(target, size, data, usage) • data is a pointer to the data, usage tells us how often we read/write • Map to CUDA: • cudaGLMapBufferObject/cudaGLUnmapBufferObject

  8. Fractals • Fractals: infinite complexity given by simple instructions • “Self-similar, recursive” • Difficult for us to process (but nice for a computer!) • Many different kinds (we’ll look at Julia Set) • How to render on CUDA: • Calculate fractal volume or area • Copy into texture • Volume render as PBO • What is a Julia Set?

  9. Mandelbrot Set • “Father” of Julia Set

  10. Mandelbrot Set

  11. Mandelbrot Set • Simpler than it looks • Recursively defined by $z_{n+1} = z_n^2 + c$ • c is a complex constant • $z_0 = 0$ • Three possible results based on c: • Converges (black space) • Stays in a finite orbit (boundary) • Escapes to infinity (blue area)

  12. Mandelbrot Set • Computed by iteratively computing $z_n$ • Assume that once the orbit passes some threshold, it escapes and we can stop checking… • $\|z_n\| > 2$, for example • Coloring is often based on rate of escape • Don’t need more than a few dozen iterations to see behavior • Demo?
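The escape test on this slide can be sketched in plain C++ (host-side, so it compiles without a GPU; in a kernel the same loop would run once per pixel). The function name `escape_iterations` and the `max_iter` parameter are assumptions made for illustration, not lab code.

```cpp
#include <cassert>
#include <complex>

// Iterates z_{n+1} = z_n^2 + c from z_0 = 0 and returns the first n at which
// |z_n| > 2. Returns max_iter if the orbit stays bounded for the whole budget.
int escape_iterations(std::complex<double> c, int max_iter) {
    assert(max_iter >= 0);
    std::complex<double> z(0.0, 0.0);
    for (int n = 0; n < max_iter; ++n) {
        if (std::norm(z) > 4.0) return n;  // std::norm is |z|^2, so this tests |z| > 2
        z = z * z + c;
    }
    return max_iter;
}
```

The returned count is one possible input for the rate-of-escape coloring mentioned on the slide.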

  13. Julia Set • Each pixel of the Mandelbrot set has a corresponding Julia Set

  14. Julia Set • Idea: instead of starting with $z_0 = 0$, let $z_0 = c_0$ • Changing $c_0$ will change the Julia Set dramatically!

  15. Julia Sets • Why are they useful? • Nothing really practical yet • But they look cool! • Can teach us about chaos, model weather, coastlines, etc. • Parallelizable problem, so good for us!

  16. Julia Sets • Lab 4 is even more exciting than Julia Sets… • 4D Julia Sets!

  17. Julia Sets • 4D: Using quaternions instead of complex numbers • Quaternions: 4D extension of complex numbers • $i^2 = j^2 = k^2 = ijk = -1$ • $ij = k = -ji$, $jk = i = -kj$, $ki = j = -ik$ • Many uses in graphics • Rotations • Kinematics • Visualizations • Etc. • We give you some nice quaternion functions (sqr_quat, mul_quat, etc.)
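To make the multiplication rules concrete, here is a minimal host-side C++ sketch of a quaternion product and square. The `Quat` layout and the names `quat_mul`, `quat_sqr`, and `quat_eq` are hypothetical stand-ins; the lab's own `mul_quat` and `sqr_quat` may use a different type and component order.

```cpp
#include <cassert>

// Hypothetical layout: q = r + i*I + j*J + k*K. The lab's own type may differ.
struct Quat { double r, i, j, k; };

// Hamilton product a*b. Order matters: quaternion multiplication is not commutative.
Quat quat_mul(const Quat& a, const Quat& b) {
    return Quat{
        a.r * b.r - a.i * b.i - a.j * b.j - a.k * b.k,
        a.r * b.i + a.i * b.r + a.j * b.k - a.k * b.j,
        a.r * b.j - a.i * b.k + a.j * b.r + a.k * b.i,
        a.r * b.k + a.i * b.j - a.j * b.i + a.k * b.r};
}

// q*q. The cross terms of the vector part cancel, leaving 2*r*(i, j, k).
Quat quat_sqr(const Quat& q) {
    return Quat{q.r * q.r - q.i * q.i - q.j * q.j - q.k * q.k,
                2.0 * q.r * q.i, 2.0 * q.r * q.j, 2.0 * q.r * q.k};
}

bool quat_eq(const Quat& a, const Quat& b) {
    return a.r == b.r && a.i == b.i && a.j == b.j && a.k == b.k;
}

// Usage: checks ij = k = -ji and i^2 = -1 from the identities above.
void check_quaternion_identities() {
    const Quat I{0.0, 1.0, 0.0, 0.0}, J{0.0, 0.0, 1.0, 0.0}, K{0.0, 0.0, 0.0, 1.0};
    assert(quat_eq(quat_mul(I, J), K));
    assert(quat_eq(quat_mul(J, I), Quat{0.0, 0.0, 0.0, -1.0}));
    assert(quat_eq(quat_sqr(I), Quat{-1.0, 0.0, 0.0, 0.0}));
}
```

A dedicated square is worth having because the Julia iteration squares a quaternion at every step, and it needs fewer multiplications than the general product.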

  18. Julia Sets • How do we render a 4D object? • Projection: taking nD slices of an (n+1)D object • Ex.: MRI Scan - 2D images of a 3D volume • For 4D Julia set, render volume slices of 4D object • Think of it as a time-evolving object • Slice is one frame in time • Now we have 3 parameters: • $z_0$ - starting point for Julia set • c - constant for Mandelbrot set • $z_p$ - slicing plane for projection

  19. Julia Sets • How to render: • Transform each coordinate in volume texture to quaternion • q = (pos.x, pos.y, pos.z, dot((pos, 1), plane)) • Implemented for you as pos_to_quat • Store escape speed or convergence in volume texture • Volume render - raytracing

  20. Raytracing • Kind of what it sounds like: tracing rays • Start at some origin ray.o • Step in direction ray.d • If we collide with something, render it! • To check shadows, raytrace back toward light - if object hit, then in shadow • Raytracing used for super high-def images • Can also be used to calculate lighting, volumes, etc.

  21. Raytracing

  22. Raytracing

  23. Raytracing • Might not work great for fractals • Fractals are infinitely thin, so we might skip over many details • Use a distance estimator function • Gives lower bound for distance to set from any point in space • Let $z'_n$ also be iteratively computed as $z'_{n+1} = 2 z_n z'_n$, $z'_0 = (1, 0, 0, 0)$

  24. Raytracing • Rendering this distance function isosurface is okay • Usage: • Iterate $z_n$ and $z'_n$ until we escape or reach maximum iterations • Return the distance estimate from the previous slide • Render all pixels “close enough” to set in volume
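The iteration described on these two slides can be sketched as follows. This is a simplified host-side C++ version that uses complex numbers in place of quaternions, so $z'_0 = 1$ rather than $(1, 0, 0, 0)$. The transcript does not include the distance formula itself, so the return expression $|z_n| \ln|z_n| / (2|z'_n|)$ is an assumption: it is a commonly used estimate, not necessarily the one on the slide. The name `julia_distance` is likewise hypothetical and is not the lab's `d_juliaDist`.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Estimated distance from z0 to the Julia set of z -> z^2 + c.
// Returns 0.0 when the orbit does not escape within max_iter iterations.
double julia_distance(std::complex<double> z0, std::complex<double> c, int max_iter) {
    assert(max_iter >= 0);
    std::complex<double> z = z0;
    std::complex<double> dz(1.0, 0.0);   // z'_0 = 1
    bool escaped = false;
    for (int n = 0; n < max_iter; ++n) {
        if (std::norm(z) > 4.0) {        // |z_n| > 2: treat as escaped
            escaped = true;
            break;
        }
        dz = 2.0 * z * dz;               // z'_{n+1} = 2 z_n z'_n (must use the old z_n)
        z = z * z + c;                   // z_{n+1} = z_n^2 + c
    }
    if (!escaped) return 0.0;
    const double r = std::abs(z);
    const double dr = std::abs(dz);
    // Assumed estimate. dr is zero only if the orbit passed exactly through 0,
    // in which case IEEE division yields +infinity.
    return 0.5 * r * std::log(r) / dr;
}
```

Updating `dz` before `z` matters: swapping the two lines would use $z_{n+1}$ in the derivative recurrence. For c = 0 the Julia set is the unit circle, and the sketch returns ln 2 ≈ 0.69 at z0 = 2, below the true distance of 1, as a lower bound should be.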

  25. Raytracing • Better idea: use a bit of raytracing • Load volume data with distances to set • Store in volume texture • Raytrace along a ray through texture • Stop once we see distance is very low, under some epsilon • Each ray handled by one thread, so pretty quick

  26. Raytracing • Better raytracing: • Current model: step along ray by step * ray.d • step = some small constant, e.g. 0.005 • What if we are 0.5 units away? • Don’t need to step by 0.005 • Use adaptive sampling: • step = factor * dist • factor = 0.01-0.5 works well • No need to worry about thread divergence
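The adaptive stepping rule can be sketched in host-side C++ as below. The names `Vec3` and `march_adaptive`, the `t_max` cutoff, and the use of a callable in place of a volume-texture lookup are all assumptions made so the example is self-contained.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Marches from origin o along unit direction d, stepping by factor * dist(p).
// Returns the ray parameter t of the first point with dist(p) < epsilon,
// or -1.0 if the ray reaches t_max without a hit.
template <typename DistFn>
double march_adaptive(const Vec3& o, const Vec3& d, const DistFn& dist,
                      double factor, double epsilon, double t_max) {
    assert(factor > 0.0 && epsilon > 0.0);
    double t = 0.0;
    while (t < t_max) {
        const Vec3 p{o.x + t * d.x, o.y + t * d.y, o.z + t * d.z};
        const double dd = dist(p);     // in the lab this would be a texture sample
        if (dd < epsilon) return t;    // close enough to the set: count it as a hit
        t += factor * dd;              // far away -> big step, nearby -> small step
    }
    return -1.0;
}
```

Because every non-hit step advances by at least `factor * epsilon`, the loop terminates. With `factor` at or below 1 and a distance function that is a true lower bound, a step cannot jump past the surface.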

  27. Raytracing • Calculating ray: • Inverse matrix needed to calculate where we are looking • invViewMatrix given to you, calculated for you • Pass it into constant memory c_invViewMatrix on GPU • ray.o = invViewMat * (0, 0, 0, 1) • ray.d = invViewMat * (u, v, -2.0) • u, v are screen coordinates -- calculate these based on 2D thread index

  28. Lighting • Once we hit fractal, render it! • What color? • Depends on lighting, shading model, material properties… • You get to color it however you like • Something with some complexity would be good • We suggest Phong shading

  29. Phong Shading • 3 Components: Ambient, diffuse, specular

  30. Phong Shading • Ambient: Just a flat color • amb = amb_color;

  31. Phong Shading • Diffuse: Adds soft shadows and highlights based on normal • diff = diff_color * cos(a) • a is angle between light and surface normal • Remember to use normalized vectors! • [Figure: surface normal N, light vector L, angle a between them]

  32. Phong Shading • Specular: adds in bright highlights • spec = spec_color * dot(R, eye)^S • R is L reflected across N • eye = vector to eye • S = shininess (higher S = smaller, tighter highlight) • [Figure: reflected vector R, normal N, light vector L, eye vector]

  33. Phong Shading • Final output color is just sum of components: • out = amb + diff + spec • Main info we need to know: • Light direction (up to you, just hardcode it) • Normal (must compute) • Eye vector (this is just -ray.d)
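Putting the three components together, a single-channel version of the model might look like the host-side C++ sketch below. The names `Vec3`, `dot`, and `phong` are hypothetical. Clamping negative dot products to zero is a common convention added here as an assumption; the slides do not mention it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Single-channel Phong intensity. N, L and E must be unit vectors:
// N = surface normal, L = direction to the light, E = direction to the eye.
double phong(const Vec3& N, const Vec3& L, const Vec3& E,
             double amb, double diff, double spec, double shininess) {
    assert(shininess > 0.0);
    const double n_dot_l = dot(N, L);    // cos(a) for unit vectors
    if (n_dot_l <= 0.0) return amb;      // light is behind the surface (assumed clamp)
    // R is L reflected across N: R = 2 (N.L) N - L
    const Vec3 R{2.0 * n_dot_l * N.x - L.x,
                 2.0 * n_dot_l * N.y - L.y,
                 2.0 * n_dot_l * N.z - L.z};
    const double r_dot_e = std::max(0.0, dot(R, E));
    return amb + diff * n_dot_l + spec * std::pow(r_dot_e, shininess);
}
```

For an RGB result, the same function can be evaluated once per channel with per-channel `amb`, `diff`, and `spec` values.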

  34. Phong Shading • Calculating Normal via gradient: sample volume texture • For each component (x, y, z): • Sample texture at component + some offset (x + 0.01) • Sample texture at component - some offset (x - 0.01) • Calculate difference per component • Resulting differences are normal components! • We can also directly sample d_juliaDist • This can be pretty slow, but normals will be smoother • Up to you, if you’d like
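The central-difference recipe above can be sketched in host-side C++ as follows. The callable `f` stands in for either a volume-texture sample or a direct call to the distance function; `Vec3` and `gradient_normal` are hypothetical names, not the lab's `d_juliaNormal`.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Approximates the unit normal at p as the normalized gradient of the scalar
// field f, using central differences with offset h along each axis.
template <typename Field>
Vec3 gradient_normal(const Field& f, const Vec3& p, double h) {
    assert(h > 0.0);
    const double gx = f(Vec3{p.x + h, p.y, p.z}) - f(Vec3{p.x - h, p.y, p.z});
    const double gy = f(Vec3{p.x, p.y + h, p.z}) - f(Vec3{p.x, p.y - h, p.z});
    const double gz = f(Vec3{p.x, p.y, p.z + h}) - f(Vec3{p.x, p.y, p.z - h});
    const double len = std::sqrt(gx * gx + gy * gy + gz * gz);
    if (len == 0.0) return Vec3{0.0, 0.0, 0.0};  // flat region: no usable normal
    return Vec3{gx / len, gy / len, gz / len};
}
```

The usual 1/(2h) scale factor of a central difference is omitted because it cancels when the vector is normalized.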

  35. Coalesced Memory • Recap: coalesced memory gets better access rates • Must be in-order, aligned, and together • Comes into play with thread indexing • index = threadIdx.x + blockDim.x * (blockIdx.x + gridDim.x * blockIdx.y); • index = threadIdx.x + blockDim.y * (blockIdx.y + gridDim.y * blockIdx.x);

  36. Your Task • Some prelab questions will be available • All TODO code is in frac_kernel.cu • Host code: • Copy necessary memory into GPU • Map/Unmap buffers • Run kernels (2 this time, one to compute fractal, one to render) • Use timer events to record time of recalculation • Device code: • d_setfractal: loads volume data with data from d_juliaDist • d_juliaDist: returns estimated distance to Julia set at given point • d_juliaNormal: samples texture to calculate normal • d_render: raytraces to render isosurface, uses shading model to color fractal

  37. Your Task • GPU architecture: • Indexing is made easiest with 1D block, 2D grid • Defined for you, see globally defined consts and dim3s • Space is bounded by region $[-2, 2]^3$ • You’ll need to convert back and forth between this space and texture array indices • Feel free to play with any architecture/setup • In general, feel free to play with anything! • Coloring can be really cool… • Try other functions ($z^3 + c$, for example)
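One way to convert between the $[-2, 2]$ range and array indices along a single axis is sketched below in host-side C++. The voxel-center convention and the names `index_to_coord` and `coord_to_index` are assumptions; the lab code may use a different mapping.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Assumed convention: voxel i of n covers [-2 + 4i/n, -2 + 4(i+1)/n)
// and is represented by the coordinate of its center.
double index_to_coord(int i, int n) {
    assert(n > 0 && i >= 0 && i < n);
    return -2.0 + 4.0 * (i + 0.5) / n;
}

// Inverse mapping: returns the index of the voxel containing x.
int coord_to_index(double x, int n) {
    assert(n > 0);
    const int i = static_cast<int>(std::floor((x + 2.0) / 4.0 * n));
    return std::min(std::max(i, 0), n - 1);  // clamp points on or outside the boundary
}
```

Applying the same pair of functions to each of x, y, and z gives the 3D conversion.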

  38. Your Task • Extra Credit: • Raytracing: use raytracing to render shadows (10pts) • Once we hit surface, trace back toward light source • If we hit surface again, the original surface pixel was in shadow, make it slightly darker • Adaptive Detailing: higher detail when we’re zoomed in (5pts) • Allows us to see the infiniteness of the fractal • Essentially, just adjust epsilon based on distance to camera • epsilon: how close we must be to fractal to be considered a “hit”
