Afrigraph 2004 Tutorial A: Part II
Ray Tracing Based Approaches
Ingo Wald, MPI Informatik, Saarbrücken, Germany

Outline
• Motivation: Why Ray Tracing for Massive Models?
• Offline systems
  • Pharr: Memory Coherent Ray Tracing
  • Kilauea: Data-parallelism with ray forwarding
  • Christensen: Multiresolution caching
• Interactive Systems
  • Utah: Realtime Ray Tracing on Supercomputers
    • Massively complex visualization applications
  • Saarland 2001: Distributed RTRT for Massive Models
    • Cluster-based approach w/ demand loading, caching and reordering
  • Saarland 2004: Caching-based Massive Model Rendering
    • Single-PC based w/ demand loading and proxies for missing data
• Summary and Open Problems

Why Ray Tracing for Massive Models?
• Summary Part I:
  • Rendering time is linear in #triangles
    → Too many triangles take too long to render
    → Cannot render all triangles at interactive rates
    → Have to somehow reduce #triangles (culling, simplification, approximation, ...)
    → …
• The entire chain of argument depends on the initial axiom “Rendering time is linear in #triangles”
• But: This only applies to brute-force rasterization, …
• … NOT to ray tracing!

Ray Tracing: Simple Algorithm
1.) Create ray from eye through pixel
2.) Trace ray into scene
  • Find objects near the ray (traverse spatial data structures)
  • Compute ray-object intersection tests
  • Determine closest hit point
  [Figures: Grid (2D), Octree (2D)]
3.) Compute color of ray (“shade” the ray)
  • Maybe including secondary rays… (won’t discuss that here)
4.) Display final image

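
The four steps above can be sketched in a few lines. This is only an illustrative toy, not code from the tutorial: the scene is two hypothetical spheres, the "image" is a grid of characters, and step 2 uses a brute-force loop over all objects instead of the grid or octree traversal a real system would use. All function names are assumptions made for this sketch.

```python
import math

def ray_through_pixel(x, y, width, height):
    """Step 1: ray from an eye at the origin through pixel (x, y) on an image plane at z = -1."""
    px = (x + 0.5) / width * 2.0 - 1.0
    py = 1.0 - (y + 0.5) / height * 2.0
    length = math.sqrt(px * px + py * py + 1.0)
    return (0.0, 0.0, 0.0), (px / length, py / length, -1.0 / length)

def hit_sphere(origin, direction, center, radius):
    """Ray-sphere test for a unit-length direction; returns nearest positive distance or None."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 1e-6 else None

def trace(origin, direction, spheres):
    """Step 2: closest hit over all objects (brute force, no spatial data structure)."""
    closest = None
    for center, radius, color in spheres:
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and (closest is None or t < closest[0]):
            closest = (t, color)
    return closest

def render(width, height, spheres, background="."):
    """Steps 3 and 4: 'shade' each ray with the hit object's character, return image rows."""
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            origin, direction = ray_through_pixel(x, y, width, height)
            hit = trace(origin, direction, spheres)
            row += hit[1] if hit else background
        rows.append(row)
    return rows

# Hypothetical scene: a small sphere in front of a larger one.
SPHERES = [((0.0, 0.0, -3.0), 1.0, "#"), ((0.0, 0.0, -6.0), 3.0, "o")]
IMAGE = render(9, 9, SPHERES)
```

Printing the rows of `IMAGE` shows the small sphere occluding the large one. Because each pixel keeps only its closest hit, the per-pixel visibility and occlusion handling fall out of the algorithm itself.
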
Ray Tracing: Properties
• Simple algorithm, with many advantages
  • Support for advanced shading and global illumination
    • Not directly related to massive model rendering…
  • Supports instantiation
  • Visibility culling built in
  • Occlusion culling built in, too
    • Per-pixel visibility
  • Trivially parallelizable
  • Logarithmic scalability in scene size
    • Due to traversal of (hierarchical) spatial acceleration structures
• For complex models, “log scalability” is most important!

Ray Tracing: Log. Scalability
• Logarithmic scalability in practice:
  • Example from isosurface ray tracing (similar for polygons)
  • Marschner-Lobb dataset: synthetically generated
  • From 32x32x32 voxels (32 kVox) to 1024x1024x1024 (1 GVox)
  • Rendering time *= 2, for scene size *= 10^4.5!

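
The numbers on this slide can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that traversal cost is proportional to the depth of an idealized balanced binary hierarchy over the voxels. Real traversal cost depends on many other factors, so this is a plausibility check and not a performance model.

```python
import math

small = 32 ** 3      # 32x32x32 voxels (32 kVox)
large = 1024 ** 3    # 1024x1024x1024 voxels (1 GVox)

growth = large / small                  # factor by which the scene grows
growth_exponent = math.log10(growth)    # the same factor as a power of ten

# Assumption: cost ~ depth of a balanced binary hierarchy = log2(#voxels)
depth_small = math.log2(small)
depth_large = math.log2(large)
depth_ratio = depth_large / depth_small

print(f"scene size grows by {growth:.0f}x = 10^{growth_exponent:.2f}")
print(f"hierarchy depth grows from {depth_small:.0f} to {depth_large:.0f} levels ({depth_ratio:.1f}x)")
```

Under this assumption the scene grows by a factor of about 10^4.5 while the hierarchy depth only doubles, which is consistent with the observed doubling of rendering time.
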
Why use Ray Tracing for Massive Model Rendering?
• Logarithmic scalability:
  • Essentially solves the complexity problem
  • Number of triangles not the main problem any more…

Why use Ray Tracing for Massive Model Rendering?
• #Triangles not the main problem any more
• Proof by example: “Sunflowers” scene
  • ~one billion triangles
  • Plus shadows, textures, transparency, …
  • Rendered interactively on a few PCs

Ray Tracing for Massive Model Rendering
• #Triangles not the main problem any more
• So, where’s the problem?
  → Main problem: Efficient scene storage and access!
• The storage problem
  • “Logarithmic cost” assumes all data is in memory
  • “Sunflowers” example?
    • Only possible through instantiation → special case!
  • For general complex models: usually not the case
    • Boeing 777: 40 GB on disk
    • Lawrence-Livermore dataset: 24 GB (already compressed)
      • Full model actually 270 x 24 GB (270 time slices)
  → Storage is the worst individual problem of ray tracing massively complex models (MCMs).

“Memory Coherent Ray Tracing” [Pharr, Siggraph97]
• Basic observation
  • Ray tracing ideally suited for complex models
  • But:
    • Complex scenes won’t fit into memory
    • Typical depth-first ray tracing destroys coherence
      • W/ global illumination: scene access usually (almost) random
    • Excessive OS paging and thrashing, huge rendering times…
• Basic idea
  • Do “manual” (explicit) caching/paging on suitable entities
  • Reorder computations to achieve coherence
    • Based on scheduler to minimize disk I/O
  → Avoid page thrashing, significantly improve performance

Memory Coherent Ray Tracing - Reordering Computations
• Typical ray tracing approach: depth-first
  • Trace first ray until hit point is found
  • Trace all secondary rays of first ray in depth-first manner
  • Start new pixel only after all of the old one’s rays are finished…
• Problem: Highly incoherent scene access
  • Primary ray of 2nd pixel needs similar data as primary ray of 1st pixel
  • But: Secondary rays of 1st pixel have already swapped out that data…
• Solution: Improve scene access coherence by reordering
  • Explicit caching
    • Know what’s in memory at what time!
  • Reordering of rays
    • Based on “ray scheduler” that tries to minimize I/O
  → Trade uncontrolled OS paging for controlled disk I/O

MCRT – Geometry Cache
• Caching framework
  • Subdivide scene into grid of “voxels”
    • Each voxel small enough to fit into memory
  • Voxel contains all necessary data
    • Triangles, vertices, local acceleration data structure, ...
    • If a triangle overlaps two voxels, replicate it
    • Voxel contains all data needed to trace a ray through it
  • Build voxels in preprocessing step, store on disk
• Perform explicit caching on these voxels
  • Keep cache of currently loaded voxels (“geometry cache”)
  • Use fixed-size cache
    • If a new voxel needs to be loaded, discard an old one.
• Plus: texture cache, displacement cache, … (no details here)

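
A fixed-size voxel cache of this kind might look as follows. The slide only says that an old voxel is discarded when a new one must be loaded; the least-recently-used (LRU) eviction policy used here is an assumption of this sketch, as are the names `GeometryCache` and `load_voxel`.

```python
from collections import OrderedDict

class GeometryCache:
    """Fixed-size cache of self-contained voxels (LRU eviction is an assumption)."""

    def __init__(self, capacity, load_voxel):
        self.capacity = capacity        # maximum number of voxels held in memory
        self.load_voxel = load_voxel    # hypothetical callback that reads a voxel from disk
        self.voxels = OrderedDict()     # voxel_id -> voxel data, oldest first
        self.loads = 0                  # number of (expensive) disk loads so far

    def __contains__(self, voxel_id):
        return voxel_id in self.voxels

    def get(self, voxel_id):
        if voxel_id in self.voxels:
            self.voxels.move_to_end(voxel_id)   # mark as most recently used
            return self.voxels[voxel_id]
        if len(self.voxels) >= self.capacity:
            self.voxels.popitem(last=False)     # discard the least recently used voxel
        data = self.load_voxel(voxel_id)
        self.loads += 1
        self.voxels[voxel_id] = data
        return data

# Usage: a cache that holds two voxels, fed with a short access sequence.
cache = GeometryCache(2, lambda voxel_id: f"geometry-{voxel_id}")
for voxel_id in [0, 1, 0, 2, 1]:
    cache.get(voxel_id)
```

Because the cache knows exactly which voxels are resident, a scheduler can query it (`voxel_id in cache`) before deciding which rays to process next.
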
MCRT – Ray Scheduling
• Each ray consists of
  • Origin, direction, current hit information, …
  • Ref. to the pixel it belongs to
  • Weight factor that specifies how much the ray may contribute to the pixel
• Once a ray hits an emissive object
  • Add its contribution (ray weight * surface emission) to the pixel
• Rays can be traced in any order
• Store queue of “active” rays per voxel
  • I.e., list of all rays that demand intersection with that voxel
• Perform scheduling based on voxel queue

MCRT – Ray Scheduling
• Basic scheduling algorithm
    Generate eye rays and place them into respective queue(s)
    While there are queued rays
        Choose a voxel to process
        For each ray in voxel
            Intersect ray with voxel’s geometry
            If there is an intersection
                Run shader and compute BRDF
                Insert spawned rays into voxel’s queue (w/ proper weights)
                If surface is emissive, store contribution to image
                Terminate ray
            else
                Advance ray to the next voxel queue along its way

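
The loop above can be made concrete with a deliberately tiny model. In this sketch (an illustration only, not Pharr's implementation) the scene is a one-dimensional row of voxels, each either empty or holding one surface with hypothetical `emission` and `reflectance` values. Rays move one voxel per step, "choose a voxel" simply picks the longest queue, and a reflected ray is queued in the neighbouring voxel so that it does not hit the same surface again.

```python
from collections import defaultdict, deque

def render(voxels, eye_rays, min_weight=1e-3):
    """voxels: list of None (empty) or {'emission': float, 'reflectance': float}.
    eye_rays: list of (pixel, start_voxel, step) with step = +1 or -1."""
    image = defaultdict(float)
    queues = defaultdict(deque)              # voxel index -> rays waiting for that voxel
    for pixel, voxel, step in eye_rays:
        queues[voxel].append((pixel, step, 1.0))   # eye rays start with weight 1
    processed = []                           # order in which voxels were scheduled
    while any(queues.values()):
        voxel = max(queues, key=lambda v: len(queues[v]))   # choose a voxel to process
        processed.append(voxel)
        rays, queues[voxel] = queues[voxel], deque()
        for pixel, step, weight in rays:
            surface = voxels[voxel]
            if surface is not None:
                # Hit: store emissive contribution, spawn a weighted secondary ray,
                # then terminate the current ray.
                image[pixel] += weight * surface["emission"]
                bounced = weight * surface["reflectance"]
                back = voxel - step
                if bounced > min_weight and 0 <= back < len(voxels):
                    queues[back].append((pixel, -step, bounced))
            else:
                # No hit: advance the ray to the next voxel queue along its way.
                nxt = voxel + step
                if 0 <= nxt < len(voxels):
                    queues[nxt].append((pixel, step, weight))
    return dict(image), processed

# Usage: a light source in voxel 0 and a mirror-like surface in voxel 2.
SCENE = [{"emission": 2.0, "reflectance": 0.0}, None, {"emission": 0.0, "reflectance": 0.5}]
IMAGE, ORDER = render(SCENE, [("a", 1, +1), ("b", 1, -1)])
```

Pixel "a" sees the light only via the reflecting surface and therefore receives weight 0.5 times emission 2.0. The final image does not depend on the order in which the queues are drained, which is what gives the scheduler its freedom.
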
Memory Coherent Ray Tracing: Scheduling Overview
[Figure: scheduling overview]

MCRT – Results
• Reordering significantly improves coherence
  • Always work on similar rays (same queue) at the same time
  • In particular: First iteration → all primary rays in same queue!
• Scheduling can significantly reduce disk I/O
  • Can prefer rays from voxels whose geometry is already loaded
  • When loading, can prefer voxels with many active rays
    • Amortize loading cost over as many rays as possible
• Significantly less paging
• Can efficiently ray trace scenes much larger than memory

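
The two scheduling preferences listed above can be expressed as a small selection function. This is a simplified sketch under the assumption that the scheduler only sees queue lengths and the set of cached voxels. The actual scheduler in the paper may weigh costs and benefits differently, and `choose_voxel` is a name invented here.

```python
def choose_voxel(queue_lengths, cached):
    """Pick the next voxel to process.

    queue_lengths: dict voxel_id -> number of rays waiting in that voxel's queue
    cached:        set of voxel ids whose geometry is currently in memory
    """
    candidates = {v: n for v, n in queue_lengths.items() if n > 0}
    if not candidates:
        return None
    # Preference 1: voxels that are already loaded cost no disk I/O.
    in_memory = {v: n for v, n in candidates.items() if v in cached}
    pool = in_memory or candidates
    # Preference 2: among those, amortize the work over as many rays as possible.
    return max(pool, key=pool.get)
```

For example, with queues `{"a": 5, "b": 2, "c": 9}` and only `"b"` cached, the function picks `"b"`; once nothing useful is cached it falls back to the voxel with the most waiting rays.
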
MCRT – Results
• Can efficiently ray trace scenes much larger than memory
  • Lake scene, rendered using path tracing and environment illumination
  • MCRT: can be rendered at 10% cache size
  • Keep in mind: This was 1997!

MCRT – Results
• Significantly less paging → much faster rendering

Kilauea: Data-Parallel Ray Tracing
• So far: Ray tracing ideally suited for complex models…
  • … if one can store the model
• Pharr: Reduce disk I/O if memory is too small for the scene
• Kilauea: Use combined memory of several PCs
  • Combined memories of multiple PCs can hold entire scene
    • … if there are enough PCs
  • But: need data-parallel approach

Data-Parallel Ray Tracing
• Typical data-parallel ray tracing [Reinhard et al.]
  • Distribute scene data over multiple rendering clients (RCs)
    • Subdivide scene into grid of voxels, distribute voxels over RCs
  • NO client has the entire scene → need communication between different clients
    • Always send ray to the client holding the voxel it needs to traverse
    • If ray leaves current RC’s voxel: forward to RC holding the next voxel
  • Similar to MCRT, except:
    • Forward to other RC instead of delaying in a queue
• Can easily scale scene size
  • As long as sum(all RC memories) > scene size

Data-Parallel Ray Tracing
• Can easily scale scene size
  • As long as sum(all RC memories) > scene size
• Problem
  • High communication demands
    • Bad communication/compute ratio (slow)
  • Bottlenecks at “hot-spot” voxels (e.g. voxel w/ camera)
  → Very bad scalability in compute power

Kilauea Approach
• “As usual”: Distribute scene data over multiple RCs
• But: no ray “forwarding”
  • Instead: Broadcast ray to all clients
  • Trace on all clients in parallel
    • Each with whatever scene data it has (no paging)
  • Result is the minimum of all hits found at any client
• Kilauea Results
  • Like typical data-parallel RT: Can efficiently scale memory
  • Plus: Less communication (no forwarding)
  • Plus: Reduced hot-spotting (all clients work on any ray)

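
The broadcast scheme described on this slide can be illustrated with a toy model. In the sketch below (an illustration only, not Kilauea's code) a ray is one-dimensional, given as an origin on the x axis plus a direction of +1 or -1, each primitive is a named point on that axis, and each "client" is simply a list holding its share of the primitives. The function names are assumptions.

```python
def local_hit(ray, primitives):
    """Nearest hit of the ray against the primitives stored on ONE client, or None."""
    origin, direction = ray
    best = None
    for position, name in primitives:
        t = (position - origin) / direction     # distance along the ray
        if t > 0.0 and (best is None or t < best[0]):
            best = (t, name)
    return best

def broadcast_trace(ray, clients):
    """Send the same ray to every client and keep the closest of all reported hits."""
    hits = [local_hit(ray, primitives) for primitives in clients]   # conceptually in parallel
    hits = [h for h in hits if h is not None]
    return min(hits) if hits else None

# Usage: three clients, each holding a different part of the scene.
CLIENTS = [[(5.0, "wall")], [(2.0, "door"), (-3.0, "tree")], []]
RESULT = broadcast_trace((0.0, 1.0), CLIENTS)
```

Note that every client tests every ray, even when another client already holds a closer hit; this is the redundant work that limits scalability in compute power.
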
Kilauea Results
• Additionally: Same approach also for other data
  • Not only geometry, also for huge photon maps
  → Allows highly detailed photon maps even in complex scenes

Kilauea Results
• Kilauea: Efficiently combine multiple PCs’ memories
• But: Many clients do too much work
  • Client traces ray even if it ended much earlier on a different client
  • “Early ray termination” and “occlusion culling” partially disabled
  → Rather bad scalability in compute power
• Partial fix: Hybrid data-parallel / image-space parallelism
  • E.g. for 10 GB scene on 30 1 GB PCs:
    • Build 3 micro-clusters of 10 1 GB PCs each
    • Data-parallel on each of the 3 micro-clusters
    • Image-space parallelism among micro-clusters

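
The micro-cluster example is simple arithmetic, sketched below. The helper `plan_micro_clusters` is hypothetical and assumes that each PC's full memory is usable for scene data, with no overhead for the operating system or the renderer itself.

```python
import math

def plan_micro_clusters(scene_gb, pc_memory_gb, num_pcs):
    """Return (PCs per micro-cluster, number of micro-clusters).

    Each micro-cluster must hold the whole scene in its combined memory
    (data-parallel inside); the micro-clusters then split the image among
    themselves (image-space parallel across).
    """
    pcs_per_cluster = math.ceil(scene_gb / pc_memory_gb)
    if pcs_per_cluster > num_pcs:
        raise ValueError("combined memory of all PCs is too small for the scene")
    num_clusters = num_pcs // pcs_per_cluster   # leftover PCs stay idle in this sketch
    return pcs_per_cluster, num_clusters

print(plan_micro_clusters(10, 1, 30))   # the slide's example: 10 GB scene, 30 PCs with 1 GB each
```

For the slide's numbers this yields 3 micro-clusters of 10 PCs each, so each ray is traced by 10 clients instead of all 30.
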
Christensen 2003: Multiresolution Geometry Caching
• Keep cache of loaded geometry voxels
  • Like MCRT
• Plus: Maintain multiresolution representation of scene
  • Defined by different tessellation levels of NURBS-modelled scene
• Optimize cache use by selecting proper level
  • Select proper level by tracking “ray differentials” [Igehy]
  • “Coherent rays: detailed geometry”…
  • “Incoherent rays: coarse geometry”…
  → Efficiently avoids thrashing of the cache by incoherent rays

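
One way to picture the level selection is to compare a ray's footprint (the width derived from its ray differential) with the edge length of each tessellation level. The sketch below is a guess at the general idea and not the published method: it assumes level 0 is the coarsest, that each finer level halves the edge length, and that the coarsest level whose edges are no larger than the footprint is good enough. `select_level` and its parameters are hypothetical.

```python
def select_level(footprint, coarsest_edge, num_levels):
    """Return a tessellation level for a ray with the given footprint width.

    Level 0 is the coarsest; every further level halves the edge length
    (an assumption of this sketch). Narrow, coherent rays get fine levels,
    wide, incoherent rays get coarse ones.
    """
    level = 0
    edge = coarsest_edge
    while level < num_levels - 1 and edge > footprint:
        edge /= 2.0
        level += 1
    return level
```

With `coarsest_edge=1.0` and 5 levels, a footprint of 0.3 selects level 2, while a very wide footprint stays at level 0 and so touches only the small, coarse representation in the cache.
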
Christensen: Multiresolution Geometry Caching
• Efficient cache utilization:
  • Can render even highly complex models
• But: Only applicable to a special case
  • Needs multiresolution-suitable representation (tessellated NURBS)
  • Not easily applicable to the general (“triangle soup”) problem
  • Not considered in any more detail here…

Interactive Ray Tracing
• Parker/Muuss: Interactive Ray Tracing
  • Idea: Use ray tracing also for interactive applications
  • Already used for highly complex models
    • Muuss: Outdoor scenes
    • Parker: Complex visualization datasets
• Muuss: Complex outdoor scenes
  • Directly ray trace CSG models → no tessellation required
  • Plus: use instantiation to create outdoor complexity
  • Complex scenes equivalent to multiple millions of triangles (but moderate storage)
  • Achieve interactive performance through parallelization

Parker et al.: Interactive Ray Tracing
• Shared-memory supercomputer: cc-NUMA architecture
  • Also “distributed” memory, caching and demand-loading…
  • … done entirely by HW
• Directly ray trace non-polygonal data
  • Direct ray tracing of isosurfaces
  → Massively complex datasets (visible female) without tessellation
• But: no special techniques for handling complex models
  → No more details in this “massive model” tutorial…

Saarland 2001: Distributed IRT for Massively Complex Models
• Basic Idea
  • Interactive ray tracing now (2001) possible
    • Can we also handle massively complex models?
  • Data-parallel approaches are problematic in communication and scalability
  • Memory Coherent Ray Tracing worked fine for complex models
    • Can we do something similar at interactive rates?
• Main Problem
  • MCRT heavily depends on extensive reordering
  • Interactivity significantly limits reordering capabilities
• Therefore
  • Assume that at least the working set fits into memory
    • Often need only a small part of the scene for each frame (occlusion, ...)
  • Use reordering only for hiding loading latency

Saarland 2001: Distributed IRT for Massively Complex Models
• Cluster-based interactive ray tracing system
  • One server, multiple clients
  • Image-space load balancing (tile-based)
  • For complex models: Can’t replicate scene on each client
• Caching framework
  • Similar to MCRT: Voxelize geometry
  • Caching based on self-contained voxels
  • Use kd-tree instead of grid, otherwise exactly the same
• Each client maintains its own cache of scene data
  • Get voxels from (centralized) model server
  • Fetch over network

Saarland 2001: Distributed IRT for Massively Complex Models
• “Only” caching doesn’t suffice
  • Loading latency too high
  • Entire client stalls for several milliseconds on cache miss
  • Hide latency by reordering
• Reordering: If any ray (would) stall
  • Suspend stalling ray
  • Start execution on new ray
  • Fetch data asynchronously
  • Resume stalled ray once data is available
• Much simpler framework than MCRT
  • No reordering except on cache miss
  • Not for secondary rays, global illumination, …

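
The suspend-and-resume idea can be simulated with a simple discrete clock. In this sketch (an illustration under stated assumptions, not the system's code) each ray needs exactly one voxel, tracing a ray takes one time unit, and a missing voxel arrives `fetch_latency` time units after it was first requested. The function name and return values are invented for the example.

```python
from collections import defaultdict, deque

def trace_with_latency_hiding(rays, cached, fetch_latency):
    """rays: list of (ray_id, voxel_id); cached: voxel ids already in the local cache.
    Returns (order in which rays finished, time units spent stalled)."""
    cached = set(cached)
    ready = deque(rays)
    suspended = defaultdict(list)   # voxel_id -> rays waiting for that voxel
    pending = {}                    # voxel_id -> arrival time of the asynchronous fetch
    finished, clock, stalled = [], 0, 0
    while ready or suspended:
        # Resume stalled rays whose data has arrived in the meantime.
        for voxel in [v for v, arrival in pending.items() if arrival <= clock]:
            del pending[voxel]
            cached.add(voxel)
            ready.extend(suspended.pop(voxel))
        if not ready:
            # No other ray to work on: the client really has to wait.
            next_arrival = min(pending.values())
            stalled += next_arrival - clock
            clock = next_arrival
            continue
        ray_id, voxel = ready.popleft()
        if voxel in cached:
            finished.append(ray_id)     # trace the ray (one time unit)
            clock += 1
        else:
            if voxel not in pending:    # fetch asynchronously, only once per voxel
                pending[voxel] = clock + fetch_latency
            suspended[voxel].append((ray_id, voxel))   # suspend, move on to a new ray
    return finished, stalled

RAYS = [(0, "A"), (1, "B"), (2, "A"), (3, "B"), (4, "A")]
ORDER, STALL = trace_with_latency_hiding(RAYS, {"A"}, fetch_latency=2)
```

With a short latency the rays for voxel "A" keep the client busy while "B" loads, so no time is lost. With a long latency the client eventually runs out of other rays and stalls, which is why all missing data must be loadable within the same frame.
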
Saarland 2001: Distributed IRT for Massively Complex Models
• Results
  • Caching allows for rendering models larger than memory
    • At least if the footprint of a frame is small enough
    • Powerplant: 12.5 MTri, several GB on disk (2001!)
    • Rendered on 5-7 PCs w/ 200-400 MB geometry cache
  • Reordering can hide “most” loading stalls
    • Must be able to load all missing data in the same frame
  • Only for ray casting plus simple shadows/reflections
    • Full global illumination touches too much geometry
  • Depends on low loading bandwidth, i.e. on temporal coherence…
