
EAVL Extreme-scale Analysis and Visualization Library

Jeremy Meredith. SDAV Next-Gen Library Meeting, September 2012.


Presentation Transcript


  1. EAVL: Extreme-scale Analysis and Visualization Library • Jeremy Meredith • SDAV Next-Gen Library Meeting, September 2012

  2. History • Originally ORNL LDRD • Jeremy Meredith, Sean Ahern, Dave Pugmire • plus Rob Sisneros joined as a postdoc • Many hours sitting in conference rooms arguing over things like “what does it mean to have one of your dimensions be unstructured?” • then determine what to do that’s practical without falling off the data modeling deep end... • Exascale focus

  3. Approaching the Exascale Problems • Update traditional data model to handle modern simulation codes and a wider range of data. • Investigate how an updated data and execution model can achieve the necessary computational, I/O, and memory efficiency. • Explore methods for visualization algorithm developers to achieve these efficiency gains and better support exascale architectures.

  4. Data Modeling Challenges

  5. A Traditional Data Set Model [Figure: Data Set, with types Rectilinear, Structured, Unstructured]

  6. Challenge: Non-Physical Data Analysis • Graph Data • topologically 0D vertices, 1D edges • non-spatial; storing X/Y/Z values is wasted space • Pure Parameter Studies • e.g. reaction rate of combustion • FOUR “spatial” dimensions • e.g. methane concentration vs oxygen concentration vs temperature vs pressure • a more complex reaction means higher dimensionality [Figure: axes labeled methane, oxygen, pressure, temperature]

  7. Challenge: Molecular Data (e.g., LAMMPS, VASP) • To represent using vtkPolyData or vtkUnstructuredGrid: • VTK_VERTEX cells for the atoms • VTK_LINE cells for the bonds • Any field data must exist on both element types • Not only inefficient: • dummy bond strengths on the atoms? • dummy atomic numbers on the bonds? • But also incorrect: • e.g. average(BondStrength) uses dummy values from atoms? [Figure: molecule with two C and four H atoms; AtomicNum = 6 6 1 1 1 1 on the atoms, BondStr = 1 1 1 1 2 on the bonds]

  8. Challenge: Side Sets (e.g. Exodus, flux surfaces) • The flow from A to B is defined on a set of faces • The flux variable is defined only on those faces • do you combine them into a single mesh? • waste space on dummy values, potentially introducing errors • or create a separate mesh and lose the mapping info? • horribly expensive and error-prone to recalculate mapping [Figure: the flux surface lives inside the volumetric mesh, between regions A and B]

  9. Challenge: Dimensionality, Refinement (e.g. GenASiS) • (a) seven (or eight) dimensional mesh • f(x,y,z,ϴ,ϕ,λ,F)=E, plus time • (b) refinement occurs on a per-cell basis • can’t assume per-block refinement • sometimes referred to as “unstructured AMR”

  10. Challenge: Unique Mesh Topologies (e.g. MADNESS) • MADNESS does not have a traditional mesh • Just a quad-tree with polynomial coefficients • Up to 30 refinement levels / tree depth [Figure: spatial structure alongside the internal tree representation, with a root and nodes 1 to 5] www.vacet.org

  11. Challenge: Very High Order Fields (e.g. MADNESS) • Legendre polynomial series at each tree node • Each tree node has K^dim coefficients • K can be up to approx. 20 • i.e. 400 coeffs per tree node in 2D, 8000 in 3D
      Example with K=3, dim=2 (9 coefficients):
      0.834 0.592 0.003
      0.592 0.003 0.010
      0.003 0.010 0.007

  12. The EAVL Data Model

  13. A Traditional Data Set Model (again) [Figure: Data Set, with types Rectilinear, Structured, Unstructured]

  14. The EAVL Data Set Model [Figure: a Data Set holds CellSet, Coords, and Field objects; CellSet kinds are Explicit, Structured, QuadTree, Subset]

  15. Example: An Unstructured Grid (with interleaved coordinates) eavlExplicitCellSet eavlDataSet eavlCoordinates eavlField

  16. Example: An Unstructured Grid (with separated coordinates) eavlExplicitCellSet eavlDataSet eavlCoordinates eavlField #0 eavlField #1 eavlField #2

  17. Example: A Curvilinear Grid eavlStructuredCellSet eavlDataSet eavlCoordinates eavlField #0 eavlField #1 eavlField #2

  18. Example: A Rectilinear Grid eavlStructuredCellSet eavlDataSet eavlCoordinates eavlField #0 eavlField #1 eavlField #2

  19. Example: High-Dimensional Grid eavlStructuredCellSet eavlDataSet eavlCoordinates eavlField #0 eavlField #1 eavlField #2 eavlField #3 eavlField #4

  20. Example: Geospatial Data eavlStructuredCellSet eavlDataSet eavlCoordinates eavlCoordinates eavlField #0 eavlField #1 eavlField #2

  21. Example: Molecular Data eavlExplicitCellSet #0 eavlExplicitCellSet #1 eavlDataSet eavlCoordinates eavlField #0 eavlField #1 eavlField #2
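The molecular case can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `CellSet`, `Field`, `DataSet`, `averageOn`, and `makeMolecule` are hypothetical names, not EAVL classes. The point is that a field bound to one cell set (the bonds) can be averaged without ever touching dummy values on the other (the atoms).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical types for illustration only; these are not EAVL classes.
struct CellSet {
    std::string name;
    int numCells;
};

struct Field {
    std::string name;
    int cellSetIndex;           // which cell set the values live on
    std::vector<float> values;  // one value per cell of that cell set
};

struct DataSet {
    std::vector<CellSet> cellSets;
    std::vector<Field> fields;
};

// Average a field over exactly the cells it is defined on.
float averageOn(const DataSet &ds, const std::string &fieldName) {
    for (const Field &f : ds.fields) {
        if (f.name != fieldName) continue;
        assert(static_cast<int>(f.values.size()) ==
               ds.cellSets[f.cellSetIndex].numCells);
        if (f.values.empty()) return 0.0f;
        float sum = 0.0f;
        for (float v : f.values) sum += v;
        return sum / static_cast<float>(f.values.size());
    }
    return 0.0f;  // field not found
}

// The six-atom, five-bond molecule from the molecular data challenge slide.
DataSet makeMolecule() {
    DataSet ds;
    ds.cellSets = {{"atoms", 6}, {"bonds", 5}};
    ds.fields = {{"AtomicNum", 0, {6.0f, 6.0f, 1.0f, 1.0f, 1.0f, 1.0f}},
                 {"BondStr", 1, {1.0f, 1.0f, 1.0f, 1.0f, 2.0f}}};
    return ds;
}
```

Calling `averageOn(makeMolecule(), "BondStr")` averages five bond values only; no atom contributes a placeholder.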

  22. Example: Face-centered Data eavlExplicitCellSet eavlAllFacesOfExplicit eavlDataSet eavlCoordinates eavlField #0 eavlField #1

  23. Filtering in EAVL

  24. Data flow networks in EAVL (or not) • A “Filter” is a stage in a data flow network • Creates a new data set from an old one • Many operations do not change a mesh structure (assuming data model is sufficiently descriptive) • Arithmetic expressions: only modify fields • External facelist: points and structure remain • Feature edges: just a new cell set with old points • Smooth, displace, elevate: only modify coordinates • So: eavlMutator is an alternative to eavlFilter • Modifies a data set in-place

  25. eavlMutator • In-place data set modification • Support for destructive in-place operation • free memory as you go • Execute multiple mutators simultaneously on the same data set (barring conflicts) • e.g. displace (coords) + threshold (cells) concurrently • How about data flow network support? • encapsulate an eavlMutator through an eavlFilterFromMutator facade • Of course, some operations are natively eavlFilters • can facade through eavlMutatorFromFilter (?)
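A minimal sketch of the mutator and facade idea, assuming much simpler types than the real library: `DataSet`, `Mutator`, `Displace`, and `filterFromMutator` here are hypothetical stand-ins, and the actual eavlMutator and eavlFilterFromMutator interfaces may look quite different. The mutator edits the data set in place; the facade recovers filter semantics by running the same mutator on a copy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; the real eavlDataSet and eavlMutator differ.
struct DataSet {
    std::vector<float> coords;  // one coordinate per point, for brevity
    std::vector<float> field;   // one field value per point
};

// A mutator modifies the data set it is given, in place.
struct Mutator {
    virtual ~Mutator() = default;
    virtual void execute(DataSet &ds) = 0;
};

// Displace touches only the coordinates; the field is left alone.
struct Displace : Mutator {
    float offset;
    explicit Displace(float o) : offset(o) {}
    void execute(DataSet &ds) override {
        for (float &c : ds.coords) c += offset;
    }
};

// Facade: give a mutator filter semantics by working on a copy.
DataSet filterFromMutator(Mutator &m, const DataSet &input) {
    assert(input.coords.size() == input.field.size());
    DataSet output = input;  // the input is left untouched
    m.execute(output);
    return output;
}
```

Because `Displace` never reads or writes `field`, it could in principle run alongside a mutator that only touches cells, which is the kind of conflict-free pairing the slide mentions.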

  26. Example: Thresholding an RGrid (a) • Explicit cells can be combined with structured coordinates. eavlStructuredCellSet eavlExplicitCellSet eavlCoordinates eavlCoordinates eavlField#0 eavlField#1 eavlField#2 eavlField#0 eavlField#1 eavlField#2

  27. Example: Thresholding an RGrid (b) • A second Cell Set can be added which refers to the first one eavlStructuredCellSet eavlSubset eavlStructuredCellSet eavlCoordinates eavlCoordinates eavlField#0 eavlField#1 eavlField#2 eavlField#0 eavlField#1 eavlField#2
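Variant (b) can be sketched as below, under the assumption that a subset is just a list of surviving cell indices into a parent cell set. `SubsetCellSet` and `thresholdToSubset` are hypothetical names and not the eavlSubset interface. Coordinates, fields, and the structured cell set itself are untouched; only indices are produced.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch; not the eavlSubset interface.
struct SubsetCellSet {
    int parentCellSet;             // index of the cell set being referenced
    std::vector<int> parentCells;  // indices of the surviving parent cells
};

// Keep the cells whose value lies in [lo, hi]; only indices are stored.
SubsetCellSet thresholdToSubset(const std::vector<float> &cellValues,
                                int parentCellSet, float lo, float hi) {
    assert(lo <= hi);
    SubsetCellSet subset{parentCellSet, {}};
    for (std::size_t c = 0; c < cellValues.size(); ++c) {
        if (cellValues[c] >= lo && cellValues[c] <= hi)
            subset.parentCells.push_back(static_cast<int>(c));
    }
    return subset;
}
```

The memory cost is proportional to the number of surviving cells, which is why this form avoids rewriting a rectilinear grid as explicit connectivity.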

  28. Example: Structured External Facelist • Add six new subset-cell sets to original mesh x6 eavlStructSubset eavlStructSubset eavlStructuredCellSet eavlStructCellSet eavlStructSubset eavlCoordinates eavlCoordinates eavlField#0 eavlField#1 eavlField#2 eavlField#0 eavlField#1 eavlField#2

  29. Example: Elevating a Structured Grid • No problem-sized data modifications. • Interleaved and separated coordinates can be used simultaneously. eavlStructuredCellSet eavlStructuredCellSet eavlCoordinates eavlCoordinates eavlField#0 eavlField#1 eavlField#0 eavlField#1

  30. Example: Elevating a Regular Grid • No problem-sized data modifications. • Some axes on logical dims, with others on the points. eavlStructuredCellSet eavlStructuredCellSet eavlCoordinates eavlCoordinates eavlField#0 eavlField#1 eavlField#2 eavlField#0 eavlField#1 eavlField#2

  31. Dealing With Concurrency

  32. Concurrency at Multiple Levels • Distributed Parallelism • Message passing still works well • Avoid global communication • local domain interconnectivity information • Hybrid (e.g. spatiotemporal) parallelism • Task Parallelism • Fine-grain dependency tracking • e.g. displace (coords) + threshold (cells) concurrently • eavlMutator helps • single eavlDataSet container class helps • Thread Parallelism • Fine-grain data parallelism; CUDA, OpenMP

  33. Data Parallelism for Developers • Functor + iterator paradigm • Iteration patterns for mesh topologies • CUDA + OpenMP execution back-ends

  34. A Simple Data-Parallel Operation
      // Internal library API provides this
      void CellToCellDivide(Field &a, Field &b, Field &c) {
          for_each(i) c[i] = a[i] / b[i];
      }
      // Algorithm developer writes this
      void CalculateDensity(...) {
          //...
          CellToCellDivide(mass, volume, density);
      }

  35. Functor + Iterator Approach
      void CalculateDensity(...) {
          //...
          CellToCellBinaryOp(mass, volume, density, Divide());
      }
      template <class T>
      void CellToCellBinaryOp(Field &a, Field &b, Field &c, T f) {
          for_each(i) f(a[i], b[i], c[i]);
      }
      struct Divide {
          void operator()(float &a, float &b, float &c) { c = a / b; }
      };
      Callouts: Internal Library API Provides This; Algorithm Developer Writes This

  36. Custom Functor
      void CalculateDensity(...) {
          //...
          CellToCellBinaryOp(mass, volume, density, MyFunctor());
      }
      template <class T>
      void CellToCellBinaryOp(Field &a, Field &b, Field &c, T f) {
          for_each(i) f(a[i], b[i], c[i]);
      }
      struct MyFunctor {
          void operator()(float &a, float &b, float &c) { c = a + 2*log(b); }
      };
      Callouts: Internal Library API Provides This; Algorithm Developer Writes These
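A compilable version of the functor plus iterator pattern, as a sketch rather than the library's implementation. It assumes `Field` is simply a `std::vector<float>` and replaces the slides' `for_each(i)` pseudocode with an ordinary loop; a real back-end would dispatch that loop to CUDA or OpenMP.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Field = std::vector<float>;  // assumed stand-in for the slides' Field

// "Library" side: applies any binary functor cell by cell.
template <class Functor>
void CellToCellBinaryOp(const Field &a, const Field &b, Field &c, Functor f) {
    assert(a.size() == b.size());
    c.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        f(a[i], b[i], c[i]);
}

// "Developer" side: only the per-cell arithmetic.
struct Divide {
    void operator()(float a, float b, float &c) const { c = a / b; }
};

struct MyFunctor {
    void operator()(float a, float b, float &c) const {
        c = a + 2 * std::log(b);
    }
};

Field CalculateDensity(const Field &mass, const Field &volume) {
    Field density;
    CellToCellBinaryOp(mass, volume, density, Divide());
    return density;
}
```

Swapping `Divide()` for `MyFunctor()` changes the arithmetic without touching the iteration code, which is the separation the two callouts describe.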

  37. Functor Efficiency on CPU and GPU • Data: noise.silo • Surface normal

  38. Binding Values to Functors
      struct ScaleByConst {
          float scale;
          ScaleByConst(float s) : scale(s) { }
          void operator()(float &a, float &b) { b = a * scale; }
      };
      void CalculateDensity(...) {
          //...
          cell_volume = mesh_volume / mesh_numcells;
          CellToCellUnaryOp(mass, density, ScaleByConst(1.0/cell_volume));
      }
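The same idea as a self-contained sketch. The `CellToCellUnaryOp` driver below is a hypothetical `std::vector` version written for this example, and the `CalculateDensity` signature is assumed; only `ScaleByConst` follows the slide closely. The constant is stored in the functor when it is constructed, so the per-cell call needs no extra arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The constant is captured once, when the functor is built.
struct ScaleByConst {
    float scale;
    explicit ScaleByConst(float s) : scale(s) {}
    void operator()(float a, float &b) const { b = a * scale; }
};

// Hypothetical unary driver standing in for CellToCellUnaryOp.
template <class Functor>
void CellToCellUnaryOp(const std::vector<float> &in,
                       std::vector<float> &out, Functor f) {
    out.resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        f(in[i], out[i]);
}

// Uniform cells: density = mass * (1 / cell_volume).
std::vector<float> CalculateDensity(const std::vector<float> &mass,
                                    float meshVolume, int numCells) {
    assert(numCells > 0 && meshVolume > 0.0f);
    const float cellVolume = meshVolume / static_cast<float>(numCells);
    std::vector<float> density;
    CellToCellUnaryOp(mass, density, ScaleByConst(1.0f / cellVolume));
    return density;
}
```

Precomputing `1 / cellVolume` once turns a per-cell division into a per-cell multiplication.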

  39. Data Parallelism Basics

  40. Map with 1 input, 1 output • Simplest data-parallel operation. Each result item can be calculated from its corresponding input item alone.
      index    0  1  2  3  4  5  6  7  8  9 10 11
      x        3  7  0  1  4  0  0  4  5  3  1  0
      result   6 14  0  2  8  0  0  8 10  6  2  0
      struct f { float operator()(float x) { return x*2; } };

  41. Map with 2 inputs, 1 output • With two input arrays, the functor takes two inputs. You can also have multiple outputs.
      index    0  1  2  3  4  5  6  7  8  9 10 11
      x        3  7  0  1  4  0  0  4  5  3  1  0
      y        2  4  2  1  8  3  9  5  5  1  2  1
      result   5 11  2  2 12  3  9  9 10  4  3  1
      struct f { float operator()(float a, float b) { return a+b; } };
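Both map shapes correspond directly to the two overloads of `std::transform` in the C++ standard library. The sketch below is sequential and uses hypothetical helper names (`mapDouble`, `mapAdd`); it illustrates the access pattern, not how EAVL schedules the work.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One input, one output: result[i] depends only on x[i].
std::vector<float> mapDouble(const std::vector<float> &x) {
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), result.begin(),
                   [](float v) { return v * 2; });
    return result;
}

// Two inputs, one output: result[i] depends only on x[i] and y[i].
std::vector<float> mapAdd(const std::vector<float> &x,
                          const std::vector<float> &y) {
    assert(x.size() == y.size());
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), y.begin(), result.begin(),
                   [](float a, float b) { return a + b; });
    return result;
}
```

Since no output element depends on any other, the iterations can run in any order or all at once.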

  42. Scatter with 1 input (and thus 1 output) • Possibly inefficient, risks of race conditions and uninitialized results. (Can also scatter to larger array if desired.) Often used in a scatter_if–type construct.
      index    0  1  2  3  4  5  6  7  8  9 10 11
      x        3  7  0  1  4  0  0  4  5  3  1  0
      indices  2  4  1  5  5  0  4  2  1  2  1  4
      result   0  1  3  ?  0  4   (slot 3 is never written)
      No functor
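A sequential sketch of scatter, with a hypothetical `scatter` helper and an explicit fill value so that unwritten slots are visible. In this serial loop the last writer to a slot wins deterministically; a parallel version has no such guarantee, which is the race the slide warns about.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// result[indices[i]] = x[i]; slots no index points at keep the fill value.
std::vector<int> scatter(const std::vector<int> &x,
                         const std::vector<int> &indices,
                         std::size_t outSize, int fill) {
    assert(x.size() == indices.size());
    std::vector<int> result(outSize, fill);
    for (std::size_t i = 0; i < x.size(); ++i) {
        assert(indices[i] >= 0 &&
               static_cast<std::size_t>(indices[i]) < outSize);
        result[indices[i]] = x[i];  // several i may target the same slot
    }
    return result;
}
```

With the slide's `x` and `indices`, an output size of 6, and a fill of -1, slot 3 keeps the fill value because no index refers to it.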

  43. Gather with 1 input (and thus 1 output) • Unlike scatter, no risk of uninitialized data or race condition. Plus, parallelization is over a shorter indices array, and caching helps more, so can be more efficient.
      index    0  1  2  3  4  5  6  7  8  9 10 11
      x        3  7  0  1  4  0  0  4  5  3  1  0
      indices  1  9  6  9  3
      result   7  3  0  3  1
      No functor
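Gather is the read-side counterpart, sketched here with a hypothetical `gather` helper. Every output slot is written exactly once, by its own iteration, so there is nothing to race on and nothing left uninitialized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// result[i] = x[indices[i]]; one write per output element.
std::vector<int> gather(const std::vector<int> &x,
                        const std::vector<int> &indices) {
    std::vector<int> result(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i) {
        assert(indices[i] >= 0 &&
               static_cast<std::size_t>(indices[i]) < x.size());
        result[i] = x[indices[i]];
    }
    return result;
}
```

The loop runs over the indices array, so its length sets the amount of parallel work regardless of how large `x` is.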

  44. Reduction with 1 input (and thus 1 output) • Example: max-reduction. Sum is also common. Often a fat-tree-based implementation.
      index    0  1  2  3  4  5  6  7  8  9 10 11
      x        3  7  0  1  4  0  0  4  5  3  1  0
      result   7
      struct f { float operator()(float a, float b) { return a>b ? a : b; } };
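One way to picture a tree-shaped reduction is the pairwise loop below. It is a sequential sketch with hypothetical names (`treeReduce`, `Max`, `Sum`), not the library's implementation: each pass combines elements a fixed stride apart, and the combinations within a pass are independent of one another, so a parallel back-end could run them together.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Max {
    float operator()(float a, float b) const { return a > b ? a : b; }
};

struct Sum {
    float operator()(float a, float b) const { return a + b; }
};

// Pairwise (tree-shaped) reduction: about log2(n) passes over the array.
template <class Functor>
float treeReduce(std::vector<float> v, Functor f) {
    assert(!v.empty());
    for (std::size_t stride = 1; stride < v.size(); stride *= 2) {
        // Each iteration of this inner loop touches a disjoint pair.
        for (std::size_t i = 0; i + stride < v.size(); i += 2 * stride)
            v[i] = f(v[i], v[i + stride]);
    }
    return v[0];
}
```

The functor must be associative for the tree order to give the same answer as a left-to-right loop; max is, and so is sum for the small integer values used on the slide.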

  45. Inclusive Prefix Sum (a.k.a. Scan) with 1 input/output • Value at result[i] is sum of values x[0]..x[i]. Surprisingly efficient parallel implementation. Basis for many more complex algorithms.
      index    0  1  2  3  4  5  6  7  8  9 10 11
      x        3  7  0  1  4  0  0  4  5  3  1  0
      result   3 10 10 11 15 15 15 19 24 27 28 28
      No functor.

  46. Exclusive Prefix Sum (a.k.a. Scan) with 1 input/output • Initialize with zero, value is sum of only up to x[i-1]. May be more commonly used than inclusive scan.
      index    0  1  2  3  4  5  6  7  8  9 10 11
      x        3  7  0  1  4  0  0  4  5  3  1  0
      result   0  3 10 10 11 15 15 15 19 24 27 28
      No functor.
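The two scans differ only in whether element i includes x[i]. The sequential reference below uses `std::partial_sum` for the inclusive form and a running total for the exclusive form; the helper names are hypothetical, and this does not show the parallel algorithm the slides allude to.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// result[i] = x[0] + ... + x[i]
std::vector<int> inclusiveScan(const std::vector<int> &x) {
    std::vector<int> result(x.size());
    std::partial_sum(x.begin(), x.end(), result.begin());
    return result;
}

// result[i] = x[0] + ... + x[i-1], with result[0] = 0
std::vector<int> exclusiveScan(const std::vector<int> &x) {
    std::vector<int> result(x.size());
    int running = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        result[i] = running;  // sum of everything before x[i]
        running += x[i];
    }
    return result;
}
```

A common use of the exclusive form: if `x` holds how many outputs each input produces, `exclusiveScan(x)` gives each input the offset at which to write them.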

  47. Data Parallelism on Meshes

  48. Example: Surface Normal • For each 2D cell (i.e. each polygon): • Get three adjacent points • Pair-wise vector subtract • Cross product • Data-parallel: • Repeat for all cells

  49. Example: Surface Normal • OUTPUT: • 3-component surface normals array on the mesh CELLS • example: length = 4 • INPUT: • 3-dimensional coordinates array on the mesh NODES • example: length = 9
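The per-cell recipe (three points, two subtractions, one cross product) can be sketched as follows. The `Vec3` alias, the quad connectivity layout, and the function names are assumptions made for this example, and the normals are left unnormalized. The input lives on nodes and the output on cells, matching the 9-node, 4-cell example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec3 = std::array<float, 3>;

Vec3 subtract(const Vec3 &a, const Vec3 &b) {
    return {a[0] - b[0], a[1] - b[1], a[2] - b[2]};
}

Vec3 cross(const Vec3 &u, const Vec3 &v) {
    return {u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]};
}

// One unnormalized normal per cell, from the first three nodes of each quad.
std::vector<Vec3> surfaceNormals(const std::vector<Vec3> &nodeCoords,
                                 const std::vector<std::array<int, 4>> &quads) {
    std::vector<Vec3> normals(quads.size());
    for (std::size_t c = 0; c < quads.size(); ++c) {
        for (int id : quads[c])
            assert(id >= 0 && static_cast<std::size_t>(id) < nodeCoords.size());
        const Vec3 &p0 = nodeCoords[quads[c][0]];
        const Vec3 &p1 = nodeCoords[quads[c][1]];
        const Vec3 &p2 = nodeCoords[quads[c][2]];
        normals[c] = cross(subtract(p1, p0), subtract(p2, p0));
    }
    return normals;
}
```

Each cell's normal depends only on that cell's nodes, so the outer loop is the part a data-parallel back-end would distribute.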

  50. Under the Covers: Node-to-Cell on CPU
      void NodeToCellOp3::ExecuteCPU() {
          #pragma omp parallel for
          for (int i = 0; i < input->NumCells(); i++) {
              // get the node indices of cell i
              int nNodes, nodeIds[8];
              float nodeValues[3][8];
              conn.GetCellNodes(i, nNodes, nodeIds);
              // get coordinates for nodes
              for (int n = 0; n < nNodes; n++) {
                  nodeValues[0][n] = array0[nodeIds[n]];
                  nodeValues[1][n] = array1[nodeIds[n]];
                  nodeValues[2][n] = array2[nodeIds[n]];
              }
              // call functor
              functor(nodeValues[0], nodeValues[1], nodeValues[2],
                      &out0[i], &out1[i], &out2[i]);
          }
      }
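A reduced, compilable sketch of the node-to-cell pattern with a single scalar node field instead of three coordinate arrays. `Connectivity`, `nodeToCellOp`, and `Average` are hypothetical names with a fixed node count per cell assumed for brevity; the real `NodeToCellOp3` and its connectivity interface are only partly visible in the slide. The OpenMP pragma is guarded so the code also builds without OpenMP.

```cpp
#include <cassert>
#include <vector>

// Hypothetical explicit connectivity: a fixed number of nodes per cell.
struct Connectivity {
    int nodesPerCell;
    std::vector<int> nodeIds;  // nodesPerCell entries per cell
    int numCells() const {
        return static_cast<int>(nodeIds.size()) / nodesPerCell;
    }
};

// Per-cell functor: the mean of the gathered node values.
struct Average {
    float operator()(const float *values, int n) const {
        float sum = 0.0f;
        for (int k = 0; k < n; ++k) sum += values[k];
        return sum / static_cast<float>(n);
    }
};

// Gather each cell's node values, then hand them to the functor.
template <class Functor>
std::vector<float> nodeToCellOp(const Connectivity &conn,
                                const std::vector<float> &nodeField,
                                Functor functor) {
    assert(conn.nodesPerCell > 0 && conn.nodesPerCell <= 8);
    const int numCells = conn.numCells();
    std::vector<float> out(numCells);
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (int cell = 0; cell < numCells; ++cell) {
        float nodeValues[8];  // private to each iteration
        for (int n = 0; n < conn.nodesPerCell; ++n)
            nodeValues[n] =
                nodeField[conn.nodeIds[cell * conn.nodesPerCell + n]];
        out[cell] = functor(nodeValues, conn.nodesPerCell);
    }
    return out;
}
```

Every iteration reads shared node data but writes only its own `out[cell]`, so this is a gather and needs no synchronization.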
