This paper presents an innovative algorithm for computing cache-efficient layouts of Bounding Volume Hierarchies (BVHs), critical data structures used in ray tracing, collision detection, and visibility culling. The proposed method optimizes the arrangement of BVH nodes and associated geometric primitives to improve memory access speed, leveraging probabilistic models that account for spatial and parent-child locality. This research addresses various geometric applications and proposes a layout construction method applicable to any spatial partitioning hierarchy, including kd-trees.
Cache-Efficient Layouts of Bounding Volume Hierarchies (BVHs) • Sung-Eui Yoon, Lawrence Livermore National Laboratory • Dinesh Manocha, Univ. of North Carolina at Chapel Hill
Goal • Compute cache-coherent layouts of bounding volume hierarchies (BVHs) • For various geometric applications • Handle any kind of BVH and spatial partitioning hierarchy (e.g., kd-trees)
Bounding Volume Hierarchies (BVHs) • Widely used data structures in: • Ray tracing • Collision detection • Visibility culling • (Figures: ray tracing; dynamic simulation)
Bounding Volumes (BVs) • Axis-aligned bounding boxes (AABBs) • Oriented bounding boxes (OBBs) [Gottschalk et al. 96] • Spheres [Hubbard 93] • Discrete orientation polytopes (k-DOPs) [Klosowski et al. 98] • (Figure: triangles of a mesh)
Layout of BVHs • Nodes (and triangles) of BVHs are stored in arrays • What is a good layout? • How to compute cache-efficient layouts? • (Figure: a layout method maps the BVH nodes A, B, C, D, E to a 1D layout of nodes)
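To make "nodes stored in arrays" concrete, here is a minimal sketch of two common 1D layouts, breadth-first and depth-first, which the results slides later use as baselines. The five-node tree shape below is an assumption made only for illustration (the slide figure does not survive in this transcript), and the function names are hypothetical.

```python
from collections import deque

# Hypothetical tree using the node labels from the slide; the shape is assumed.
children = {"A": ["B", "C"], "B": ["D", "E"], "C": [], "D": [], "E": []}

def breadth_first_layout(children, root):
    """Return the nodes in level order (one possible 1D layout)."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children[node])
    return order

def depth_first_layout(children, root):
    """Return the nodes in pre-order (another possible 1D layout)."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(children[node]))  # keep left child first
    return order

def to_index_arrays(children, order):
    """Map each node to its array position and rewrite child links as indices."""
    position = {node: i for i, node in enumerate(order)}
    child_index = [[position[c] for c in children[node]] for node in order]
    return position, child_index

bfs = breadth_first_layout(children, "A")
dfs = depth_first_layout(children, "A")
position, child_index = to_index_arrays(children, bfs)
```

Both layouts store the same tree; only the array positions, and therefore which nodes share a cache block, differ.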
Motivation • Lower growth rate of data access speed • Growth rates during 1993 – 2004 (chart values; series labels not recoverable): 46X, 20X, 1.5X • Courtesy: http://www.hcibook.com/e3/online/moores-law/
Memory Hierarchies and Caches • CPU • Fast memory or cache • Slow memory • Disk • Block transfer • Access times (figure labels; level assignment not recoverable): 1 sec, 10^-6 sec, 10^-4 sec
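Because data moves between memory levels in blocks, the cost of a traversal depends on how many distinct blocks it touches rather than how many nodes it visits. The sketch below counts those blocks for a given layout. The seven-node tree, the access path, and the block size of two nodes are all made-up values for illustration.

```python
def blocks_touched(layout, accessed, nodes_per_block):
    """Count the distinct blocks that hold the accessed nodes."""
    position = {node: i for i, node in enumerate(layout)}
    return len({position[n] // nodes_per_block for n in accessed})

# Hypothetical root-to-leaf access on a seven-node binary tree.
path = ["A", "C", "G"]
breadth_first = list("ABCDEFG")  # path nodes land at positions 0, 2, 6
path_first = list("ACGFBDE")     # path nodes land at positions 0, 1, 2

scattered = blocks_touched(breadth_first, path, nodes_per_block=2)
packed = blocks_touched(path_first, path, nodes_per_block=2)
```

With these inputs the breadth-first layout touches three blocks and the path-first layout touches two, which is the kind of difference a cache-efficient layout aims for.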
Main Contributions • An algorithm computing cache-efficient layouts of BVHs • Probabilistic model • Simple layout construction method • Applicable to spatial partitioning hierarchies
Related Work • Mesh layouts • Cache-coherent layouts of meshes and graphs [Yoon et al. 05, Yoon and Lindstrom 06] • Require an input graph that represents access patterns on a BVH • Layouts of search trees • [Gil and Itai 99, Alstrup et al. 03] • Require the probability that each node will be accessed • Layouts of BVHs • Studied in collision detection [Ericson 04] and ray tracing [Havran 97] • Blocking-based layouts [Terdiman 03, van Emde Boas 77]
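For readers unfamiliar with the van Emde Boas layout cited above and used as a baseline in the results, here is a minimal sketch for a complete binary tree. It assumes heap-style node indices (root 1, children 2i and 2i+1) and splits the height with the smaller half on top; other write-ups round the split the other way. The function name is hypothetical.

```python
def veb_layout(root, height):
    """Return heap indices of a complete binary tree in van Emde Boas order.

    `height` counts levels, so a single node has height 1.
    """
    if height == 1:
        return [root]
    top_h = height // 2          # levels in the top subtree
    bottom_h = height - top_h    # levels in each bottom subtree
    order = veb_layout(root, top_h)
    # Descendants of `root` at depth top_h are the roots of the bottom subtrees.
    first_bottom_root = root << top_h
    for i in range(1 << top_h):
        order += veb_layout(first_bottom_root + i, bottom_h)
    return order

layout4 = veb_layout(1, 4)
```

Each recursive piece is stored contiguously, so a subtree of any size tends to fit in few blocks regardless of the actual block size.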
Outline • Probabilistic model • Layout computation • Results
Traversals of Collision Queries on BVHs • Take two objects • Two 3D objects for collision detection • One 3D object and one ray for ray tracing • (Figure: BVH1 and BVH2)
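A collision query of this kind can be sketched as a simultaneous traversal of two hierarchies. The version below is a simplified assumption, not the talk's implementation: it uses axis-aligned boxes, recurses depth-first, and always descends the first hierarchy before the second. `Node`, `overlap`, and `collide` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    lo: tuple  # minimum corner of the axis-aligned box
    hi: tuple  # maximum corner of the axis-aligned box
    children: list = field(default_factory=list)

def overlap(a, b):
    """True if the two axis-aligned boxes intersect."""
    return all(a.lo[k] <= b.hi[k] and b.lo[k] <= a.hi[k] for k in range(3))

def collide(a, b, visited, hits):
    """Record every node pair that is accessed and every overlapping leaf pair."""
    visited.append((a.name, b.name))
    if not overlap(a, b):
        return  # prune: nothing below this pair can collide
    if not a.children and not b.children:
        hits.append((a.name, b.name))
        return
    if a.children:  # descend the first hierarchy while it has children
        for child in a.children:
            collide(child, b, visited, hits)
    else:
        for child in b.children:
            collide(a, child, visited, hits)

left = Node("L", (0, 0, 0), (1, 1, 1))
right = Node("R", (3, 0, 0), (4, 1, 1))
bvh1 = Node("A", (0, 0, 0), (4, 1, 1), [left, right])
bvh2 = Node("B", (3.5, 0, 0), (5, 1, 1))

visited, hits = [], []
collide(bvh1, bvh2, visited, hits)
```

The `visited` list is the access pattern that a layout should serve well: children are only reached through an intersected parent, and siblings are often visited together.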
Two Localities • Parent-child locality • Spatial locality
Parent-child Locality • (Figure: nodes A and B in BVH1 and BVH2)
Spatial Locality • (Figure: nodes C, D, and E in BVH1 and BVH2)
Probabilistic Model • Quantify localities in a uniform way • Measure the probability for localities • Based on geometric relationships between bounding volumes
Probabilistic Model • Pr(n): probability that a node, n, will be accessed during runtime traversal • Two major factors • Prob. that p is accessed, Pr(p) • Conditional prob. that p is also intersected given g is intersected, Pr(Xp = 1 | Xg = 1) • Pr(n) = Pr(p) · Pr(Xp = 1 | Xg = 1), where Xp (or Xg) is a boolean random variable indicating collision between p (or g) and b • (Figure: nodes g, p, n and a bounding volume b; g is intersected, p is accessed and intersected)
Probability Computation • Pr(Xp = 1 | Xg = 1): conditional prob. that p is also intersected given g is intersected • Nothing is known about b in advance • (Figure: nodes g, p, n and a bounding volume b; g and p are intersected)
Contact Space • Contact space of b against p and g • Denoted as Sp and Sg • (Figure labels: b, g, p, n, Sp∩Sg; Sp = p, Sg = g)
Contact Space • Assume b is a sphere • Computed from Minkowski sum • Configuration space, in general • Too expensive to compute • (Figure: Sp, Sg, and Sp∩Sg for two positions of b)
Approximate Probability Computation • Assume b is a point, a degenerate case • An exact value is not required • Only 5% incorrect decisions compared with considering many other cases • Equivalent to the surface area heuristic (SAH) [MacDonald and Booth 90, Havran 00]
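The point approximation can be sketched as a top-down pass that multiplies the parent's access probability by the conditional probability that the parent is intersected given that the grandparent is. Several choices below are assumptions rather than details from the slides: the conditional is taken as a ratio of bounding-box surface areas (in the spirit of SAH; a volume ratio can be passed instead), boxes are axis-aligned, and the root is treated as always intersected. All names are hypothetical.

```python
def surface_area(lo, hi):
    """Surface area of an axis-aligned box given its two corners."""
    dx, dy, dz = (hi[k] - lo[k] for k in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)

def volume(lo, hi):
    """Volume of an axis-aligned box, an alternative measure."""
    dx, dy, dz = (hi[k] - lo[k] for k in range(3))
    return dx * dy * dz

def access_probabilities(children, boxes, root, measure=surface_area):
    """Propagate Pr(n) = Pr(p) * Pr(Xp = 1 | Xg = 1) from the root downward."""
    prob = {root: 1.0}
    stack = [(root, None)]  # pairs of (node p, its parent g)
    while stack:
        p, g = stack.pop()
        # Assumed conditional: measure(p) / measure(g); 1 when p is the root.
        cond = 1.0 if g is None else measure(*boxes[p]) / measure(*boxes[g])
        for n in children[p]:
            prob[n] = prob[p] * cond
            stack.append((n, p))
    return prob

# Hypothetical four-node hierarchy: g is the root, p and q its children.
children = {"g": ["p", "q"], "p": ["n"], "q": [], "n": []}
boxes = {
    "g": ((0, 0, 0), (2, 2, 2)),
    "p": ((0, 0, 0), (1, 2, 2)),
    "q": ((1, 0, 0), (2, 2, 2)),
    "n": ((0, 0, 0), (1, 1, 2)),
}
prob_area = access_probabilities(children, boxes, "g")
prob_volume = access_probabilities(children, boxes, "g", measure=volume)
```

Since only the relative order of probabilities drives the clustering decisions, a cheap ratio like this may be enough even when it is not exact.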
Outline • Probabilistic model • Layout computation • Results
Overview of Layout Algorithm • Cache-oblivious layout computation • Does not assume any particular cache block size • Designed to work well with various (geometric) block sizes [Yoon and Lindstrom 06] • Two main steps in recursion • Cluster construction with parent-child locality • Layout of clusters with spatial locality
Clustering • Minimize the working set size during collision queries • Maximize the sum of probabilities of nodes in a cluster • NP-complete even for cache-aware layout given a search query [Gil and Itai 99]
Greedy Clustering • Employ top-down greedy clustering • Compute balanced sized clusters • Maintain convexity [Gil and Itai 99] • (Figure: a cluster over nodes with example probabilities 0.5, 0.9, 0.8, 0.1)
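One way to read "top-down greedy clustering" is to grow a cluster from a subtree root by repeatedly taking the most probable node on its frontier. A node only enters the frontier after its parent joins the cluster, so every cluster stays a connected piece of the tree. This is a sketch under assumptions: the fixed `max_size` stands in for the balanced-size rule, which the slide does not spell out, and the tree and probabilities are made up (reusing the values shown in the figure).

```python
import heapq

def greedy_cluster(children, prob, root, max_size):
    """Grow one cluster from `root`; return it plus the roots of child clusters."""
    cluster = []
    frontier = [(-prob[root], root)]  # max-heap via negated probability
    while frontier and len(cluster) < max_size:
        _, node = heapq.heappop(frontier)
        cluster.append(node)
        for child in children[node]:
            heapq.heappush(frontier, (-prob[child], child))
    child_roots = [node for _, node in sorted(frontier)]
    return cluster, child_roots

def cluster_all(children, prob, root, max_size):
    """Apply greedy_cluster repeatedly until every node belongs to a cluster."""
    clusters, pending = [], [root]
    while pending:
        cluster, roots = greedy_cluster(children, prob, pending.pop(0), max_size)
        clusters.append(cluster)
        pending.extend(roots)
    return clusters

# Hypothetical tree and access probabilities.
children = {"r": ["a", "b"], "a": ["c", "d"], "b": [], "c": [], "d": []}
prob = {"r": 1.0, "a": 0.9, "b": 0.5, "c": 0.8, "d": 0.1}
clusters = cluster_all(children, prob, "r", max_size=3)
```

Here the first cluster takes `r`, `a`, and `c` because `c` (0.8) is more likely to be accessed than `b` (0.5), even though `b` is closer to the root.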
Layout of Clusters • Uses cache-oblivious layouts of meshes [Yoon et al. 05] • Spatial locality
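The talk orders clusters with the cache-oblivious mesh layout of [Yoon et al. 05], which is not reproduced here. As a plainly different stand-in that also places spatially close clusters near each other, the sketch below sorts clusters by the Morton (Z-order) code of their centroids. The centroid inputs, the bounds, and the function names are assumptions for illustration only.

```python
def morton3(x, y, z, bits=10):
    """Interleave the low `bits` bits of three non-negative integers."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code

def order_clusters(centroids, lo, hi, bits=10):
    """Sort cluster ids by the Morton code of their centroid within [lo, hi]."""
    scale = (1 << bits) - 1

    def key(cluster_id):
        point = centroids[cluster_id]
        quantized = [
            int((point[k] - lo[k]) / (hi[k] - lo[k]) * scale) for k in range(3)
        ]
        return morton3(*quantized, bits)

    return sorted(centroids, key=key)

# Hypothetical cluster centroids inside the unit cube.
centroids = {"c0": (0.1, 0.1, 0.1), "c1": (0.9, 0.9, 0.9), "c2": (0.2, 0.1, 0.1)}
order = order_clusters(centroids, lo=(0, 0, 0), hi=(1, 1, 1))
```

A space-filling curve is a much cruder locality measure than the graph-based mesh layout, so this should be read only as an illustration of ordering clusters by spatial proximity.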
Outline • Probabilistic model • Layout computation • Results
Results • Collision detection • Use oriented bounding box (OBB) [Gottschalk et al. 96] • Breadth-first tree traversal • Ray tracing • Use kd-tree [Wald 04] • Depth-first tree traversal
Collision Detection – Robot and Power Plant Models • Robot: 20k triangles • Power plant: 1M triangles
Collision Detection – Performance Comparison I • 41% to 500% performance improvement • (Chart: collision time (ms/100) and working set size (KB) for different layouts: van Emde Boas layout, our cache-oblivious layout, breadth-first layout, cache-oblivious mesh layout, depth-first layout)
Collision Detection – Performance Comparison II • 35% to 2600% performance improvement • (Chart: collision time (ms/100) and working set size (KB) for different layouts: van Emde Boas layout, our layout, breadth-first layout, cache-oblivious mesh layout, depth-first layout)
Cache-Oblivious Layout vs Cache-Aware Layout • Cache-aware layouts • Take advantage of block size information (4KB) • Our cache-oblivious layout • Minor performance degradation: 8% compared to cache-aware layouts
Ray Tracing – Lucy Model • 28 million triangles • Pentium IV with 1GB
Ray Tracing – Performance Comparison • 77% to 180% performance improvement • (Chart: working set size (MB) and render time (sec) for different layouts: our layout, van Emde Boas layout, depth-first layout, breadth-first layout)
Major Differences over Other Layouts • Commonly used layouts • Consider connectivity of trees • Two improvements of our layouts • Probabilistic model based on geometry • Layout method considering two different localities
Limitations • No guarantee that our layout always improves the performance • May not improve the performance of computationally intensive queries (e.g., exact penetration depth computation) • Assumes that collision algorithm does not use front tracking
Advantages • Generality • Works with any geometric hierarchy • Does not require cache parameters • Usability • Can gain performance improvement without modifying code • Replaces only data layouts
Conclusion • Cache-efficient layouts of BVHs • Probabilistic model • Simple layout construction method • Applied to collision detection and ray tracing
Ongoing and Future Work • Extend to other proximity and LOD queries [Yoon et al. 06] • Investigate other geometric hierarchies • Improve the quality of hierarchies • Apply to deforming models [Lauterbach et al. 06]
Acknowledgements • Model contributors • Funding agencies • Army Research Office • DARPA • Intel • Lawrence Livermore National Laboratory • Microsoft • National Science Foundation • Office of Naval Research • RDECOM
Acknowledgements • Russ Gayle • Ted Kim • Ming Lin • Peter Lindstrom • Brandon Lloyd • Valerio Pascucci • Stephane Redon • LLNL data analysis group members • Anonymous reviewers
Questions? Thanks!
UCRL-PRES-223220 This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-ENG-48. Note: this talk is not supported or sanctioned by DoE, UC, LLNL, CASC
BVHs of Massive Models • Complex and massive models • High memory requirement • Can have gigabyte data size • (Figures: Double eagle tanker (82M triangles); Isosurface (472M); St. Matthew (372M))
Memory Hierarchies • Register: speed 10^0 ns, size 1KB • Caches: speed 10^1 ns, size 1MB • Main memory: speed 10^2 ns, size 1GB • Disk storage: speed 10^4 ns, size > 1GB