I/O for Structured-Grid AMR
Phil Colella
Lawrence Berkeley National Laboratory
Coordinating PI, APDEC CET
Block-Structured Local Refinement (Berger and Oliger, 1984) • Refined regions are organized into rectangular patches. • Refinement is performed in time as well as in space (see the schematic below).
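Time refinement works by subcycling: a level refined in space by a ratio r also takes r correspondingly smaller time steps per coarse step. A minimal schematic of the recursion, with illustrative names (integratePatches, synchronize, refRatio) that are not Chombo's actual interface:

```cpp
// Schematic Berger-Oliger subcycling; all names here are illustrative.
void integratePatches(int level, double t, double dt); // single-level update
void synchronize(int coarseLevel);  // average fine data onto coarse, fix fluxes

const int refRatio    = 4;   // assumed space (and time) refinement ratio
const int finestLevel = 2;   // assumed finest level index

void advanceLevel(int level, double t, double dt)
{
  integratePatches(level, t, dt);             // advance this level by dt
  if (level < finestLevel)
  {
    double dtFine = dt / refRatio;
    for (int k = 0; k < refRatio; ++k)        // subcycle the finer level
      advanceLevel(level + 1, t + k * dtFine, dtFine);
    synchronize(level);                       // restore coarse-fine consistency
  }
}
```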
Stakeholders SciDAC projects: • Combustion, astrophysics (cf. John Bell’s talk). • MHD for tokamaks (R. Samtaney). • Wakefield accelerators (W. Mori, E. Esarey). • AMR visualization and analytics collaboration (VACET). • AMR elliptic solver benchmarking / performance collaboration (PERI, TOPS). Other projects: • ESL edge plasma project - 5D gridded data (LLNL, LBNL). • Cosmology - AMR fluids + PIC (F. Miniati, ETH). • Systems biology - PDEs in complex geometry (A. Arkin, LBNL). Larger structured-grid AMR community: Norman (UCSD), Abel (SLAC), Flash (Chicago), SAMRAI (LLNL)… We all talk to each other and have common requirements.
Chombo: a Software Framework for Block-Structured AMR Requirement: support a wide variety of block-structured AMR applications with a common software framework. • Mixed-language model: C++ for higher-level data structures, Fortran for regular single-grid calculations (illustrated below). • Reusable components: component design based on mapping mathematical abstractions to classes. • Build on public-domain standards: MPI, HDF5, VTK. Previous work: BoxLib (LBNL/CCSE), KeLP (Baden et al., UCSD), FIDIL (Hilfinger and Colella).
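As an illustration of the mixed-language model, a higher-level C++ loop can hand each rectangular patch to a Fortran kernel through a C binding. This is a hypothetical sketch: the names (smoothpatch_, Patch) are invented for illustration, and the real Chombo interface generates such bindings with its Fortran-interface macros rather than writing them by hand.

```cpp
#include <vector>

// Hypothetical binding to a Fortran single-grid kernel; illustrative only.
extern "C" void smoothpatch_(double* phi, const int* nx, const int* ny,
                             const double* dx);

struct Patch { int nx, ny; std::vector<double> data; };

// C++ owns the containers and the SPMD loop over locally owned patches;
// Fortran does the regular stencil arithmetic on each rectangular patch.
void smoothMyPatches(std::vector<Patch>& myPatches, double dx)
{
  for (Patch& p : myPatches)
    smoothpatch_(p.data.data(), &p.nx, &p.ny, &dx);
}
```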
Layered Design • Layer 1. Data and operations on unions of boxes - set calculus, rectangular array library (with interface to Fortran), data on unions of rectangles, with SPMD parallelism implemented by distributing boxes over processors (see the sketch after this list). • Layer 2. Tools for managing interactions between different levels of refinement in an AMR calculation - interpolation, averaging operators, coarse-fine boundary conditions. • Layer 3. Solver libraries - AMR-multigrid solvers, Berger-Oliger time-stepping. • Layer 4. Complete parallel applications. • Utility layer. Support and interoperability libraries - API for HDF5 I/O, visualization package implemented on top of VTK, C APIs.
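To make the Layer-1 "set calculus" concrete, here is a minimal self-contained stand-in (not the actual Chombo classes, which are dimension-independent and much richer) showing box intersection and refinement:

```cpp
#include <algorithm>
#include <iostream>

// Minimal stand-in for a Layer-1 box: an axis-aligned index rectangle in 2D.
struct Box {
  int lo[2], hi[2];                       // inclusive index bounds
  bool ok() const { return lo[0] <= hi[0] && lo[1] <= hi[1]; }
};

// Set-calculus primitive: intersection of two boxes.
Box intersect(const Box& a, const Box& b) {
  Box r;
  for (int d = 0; d < 2; ++d) {
    r.lo[d] = std::max(a.lo[d], b.lo[d]);
    r.hi[d] = std::min(a.hi[d], b.hi[d]);
  }
  return r;
}

// Refine a box by an integer ratio (index space of the finer level).
Box refine(const Box& b, int ratio) {
  Box r;
  for (int d = 0; d < 2; ++d) {
    r.lo[d] = b.lo[d] * ratio;
    r.hi[d] = (b.hi[d] + 1) * ratio - 1;
  }
  return r;
}

int main() {
  Box a{{0, 0}, {15, 15}}, b{{8, 8}, {23, 23}};
  Box c = intersect(a, b);
  std::cout << "overlap ok: " << c.ok() << "\n";    // 1
  Box f = refine(c, 2);                             // 2x finer index space
  std::cout << f.lo[0] << ".." << f.hi[0] << "\n";  // 16..31
  return 0;
}
```

Primitives like these are what the higher layers compose, for example to locate coarse-fine interfaces for the Layer-2 boundary operators.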
Distributed Data on Unions of Rectangles Provides a general mechanism for distributing data defined on unions of rectangles onto processors, and for communication between processors. • Metadata of which all processors have a copy: BoxLayout is a collection of Boxes and processor assignments. • template <class T> LevelData<T> and other container classes hold data distributed over multiple processors. For each k = 1, ..., nGrids, an “array” of type T corresponding to the box B_k is located on processor p_k. • Straightforward APIs for copying, exchanging ghost-cell data, and iterating over the arrays on your processor in an SPMD manner (sketched below).
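A sketch of how this container API is used, based on the names in the text; the exact constructor signatures and headers may differ across Chombo releases:

```cpp
#include "DisjointBoxLayout.H"
#include "FArrayBox.H"
#include "LevelData.H"

// Build the replicated metadata and a distributed field, then touch only
// the locally owned boxes. (LevelData is built on the disjoint variant of
// BoxLayout, so the input boxes are assumed non-overlapping.)
void fillAndExchange(const Vector<Box>& boxes, const Vector<int>& procIDs)
{
  DisjointBoxLayout layout(boxes, procIDs); // every rank holds a full copy

  // One data component per cell, one ghost cell on each side of every box.
  LevelData<FArrayBox> phi(layout, 1, IntVect::Unit);

  // SPMD iteration: each processor visits only the boxes assigned to it.
  for (DataIterator dit = layout.dataIterator(); dit.ok(); ++dit)
    phi[dit()].setVal(0.0);

  // Fill ghost cells from neighboring boxes, communicating where needed.
  phi.exchange();
}
```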
Typical I/O requirements • Loads are balanced to fill available memory on all processors. • Typical output data size corresponding to a single time slice: 10% - 100% of total memory image. • Current problems scale to 100 - 1000 processors. • Combustion and astrophysics simulations write one file / processor; other applications use Chombo API for HDF5.
HDF5 I/O [Figure: boxes 1-9 distributed across processors P0-P2, then laid out end-to-end as one flattened dataset on disk.] • The HDF5 file layout mirrors a file system: the disk file is the root group “/”, groups are subdirectories, and attributes and datasets play the role of files. Attribute: small metadata that multiple processes in an SPMD program can write out redundantly. Dataset: large data; each processor writes out only what it owns. • Chombo API for HDF5 • Parallel-neutral: the processor layout can change when output data is read back in. • Dataset creation is expensive: create only one dataset for each LevelData. The data for each patch is written at offsets from the origin of that dataset (sketched below).
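In terms of the underlying HDF5 calls, the pattern of one dataset per LevelData, with each processor writing its patches at precomputed offsets, looks roughly like this (a schematic, not the Chombo implementation; the file name, dataset name, and 1-D flattening are illustrative):

```cpp
#include <hdf5.h>
#include <mpi.h>

// Schematic: every rank opens the same file collectively, one dataset is
// created per LevelData, and each rank writes only the flattened patches
// it owns, at offsets computed from the replicated BoxLayout metadata.
void writeLevel(MPI_Comm comm, const double* myData,
                hsize_t myOffset, hsize_t myCount, hsize_t totalCount)
{
  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);       // parallel file access
  hid_t file = H5Fcreate("level0.hdf5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

  // One dataset for the whole level: dataset creation is the expensive
  // collective operation, so it happens once, not once per patch.
  hid_t filespace = H5Screate_simple(1, &totalCount, NULL);
  hid_t dset = H5Dcreate2(file, "level0_data", H5T_NATIVE_DOUBLE, filespace,
                          H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

  // Each rank selects the slab holding its own flattened patch data.
  H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &myOffset, NULL,
                      &myCount, NULL);
  hid_t memspace = H5Screate_simple(1, &myCount, NULL);
  H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, H5P_DEFAULT, myData);

  H5Sclose(memspace); H5Sclose(filespace);
  H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
}
```

Because dataset creation is a collective metadata operation, doing it once per LevelData rather than once per patch is what keeps the metadata cost bounded as the number of patches grows.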
Performance Analysis (Shan and Shalf, 2006) [Figure: per-processor patch data mapped into the flattened level dataset.] • Observed performance of HDF5 applications in Chombo: no weak scaling. More detailed measurements indicate two causes: misalignment with disk-block boundaries, and lack of aggregation.
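Both causes have standard knobs at the HDF5 / MPI-IO layer; a hedged sketch of how they might be set (the threshold, alignment size, and hints are illustrative, not settings reported by Shan and Shalf):

```cpp
#include <hdf5.h>
#include <mpi.h>

// File-access property list tuned for alignment and write aggregation.
hid_t makeTunedFapl(MPI_Comm comm)
{
  MPI_Info info;
  MPI_Info_create(&info);
  // ROMIO hint asking MPI-IO to aggregate small writes (collective buffering).
  MPI_Info_set(info, "romio_cb_write", "enable");

  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, comm, info);
  // Place objects of 64 KB or more on (assumed) 1 MB disk-block boundaries.
  H5Pset_alignment(fapl, 64 * 1024, 1024 * 1024);
  MPI_Info_free(&info);
  return fapl;
}

// Dataset-transfer property list requesting collective rather than
// independent writes, so many small per-patch writes can be combined.
hid_t makeCollectiveXfer()
{
  hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
  H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
  return dxpl;
}
```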
Future Requirements • Weak scaling to 10^4 processors. • Need for finer time resolution will add another 10x in data. • Other data types: sparse data, particles. • One file / processor doesn’t scale. • Interfaces to VACET, FastBit.
Potential for Collaboration with SDM • Common AMR data API developed under SciDAC I. • APDEC weak scaling benchmark for solvers could be extended to I/O. • Minimum buy-in: high-level API, portability, sustained support.