I/O for Structured-Grid AMR
Phil Colella
Lawrence Berkeley National Laboratory
Coordinating PI, APDEC CET
Block-Structured Local Refinement (Berger and Oliger, 1984) • Refined regions are organized into rectangular patches. • Refinement is performed in time as well as in space (see the schematic below).
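Time refinement works by subcycling: a level refined in space by a ratio r also takes r correspondingly smaller time steps per coarse step. A minimal schematic of the recursion, with illustrative names (integratePatches, synchronize, refRatio) that are not Chombo's actual interface:

```cpp
// Schematic Berger-Oliger subcycling; all names here are illustrative.
void integratePatches(int level, double t, double dt); // single-level update
void synchronize(int coarseLevel);  // average fine data onto coarse, fix fluxes

const int refRatio    = 4;   // assumed space (and time) refinement ratio
const int finestLevel = 2;   // assumed finest level index

void advanceLevel(int level, double t, double dt)
{
  integratePatches(level, t, dt);             // advance this level by dt
  if (level < finestLevel)
  {
    double dtFine = dt / refRatio;
    for (int k = 0; k < refRatio; ++k)        // subcycle the finer level
      advanceLevel(level + 1, t + k * dtFine, dtFine);
    synchronize(level);                       // restore coarse-fine consistency
  }
}
```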
Stakeholders SciDAC projects: • Combustion, astrophysics (cf. John Bell’s talk). • MHD for tokamaks (R. Samtaney). • Wakefield accelerators (W. Mori, E. Esarey). • AMR visualization and analytics collaboration (VACET). • AMR elliptic solver benchmarking / performance collaboration (PERI, TOPS). Other projects: • ESL edge plasma project - 5D gridded data (LLNL, LBNL). • Cosmology - AMR fluids + PIC (F. Miniati, ETH). • Systems biology - PDEs in complex geometry (A. Arkin, LBNL). Larger structured-grid AMR community: Norman (UCSD), Abel (SLAC), Flash (Chicago), SAMRAI (LLNL)… We all talk to each other and have common requirements.
Chombo: a Software Framework for Block-Structured AMR Requirement: support a wide variety of block-structured AMR applications with a common software framework. • Mixed-language model: C++ for higher-level data structures, Fortran for regular single-grid calculations (illustrated below). • Reusable components: component design based on mapping mathematical abstractions to classes. • Build on public-domain standards: MPI, HDF5, VTK. Previous work: BoxLib (LBNL/CCSE), KeLP (Baden et al., UCSD), FIDIL (Hilfinger and Colella).
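As an illustration of the mixed-language model, a higher-level C++ loop can hand each rectangular patch to a Fortran kernel through a C binding. This is a hypothetical sketch: the names (smoothpatch_, Patch) are invented for illustration, and the real Chombo interface generates such bindings with its Fortran-interface macros rather than writing them by hand.

```cpp
#include <vector>

// Hypothetical binding to a Fortran single-grid kernel; illustrative only.
extern "C" void smoothpatch_(double* phi, const int* nx, const int* ny,
                             const double* dx);

struct Patch { int nx, ny; std::vector<double> data; };

// C++ owns the containers and the SPMD loop over locally owned patches;
// Fortran does the regular stencil arithmetic on each rectangular patch.
void smoothMyPatches(std::vector<Patch>& myPatches, double dx)
{
  for (Patch& p : myPatches)
    smoothpatch_(p.data.data(), &p.nx, &p.ny, &dx);
}
```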
Layered Design • Layer 1. Data and operations on unions of boxes - set calculus, rectangular array library (with interface to Fortran), data on unions of rectangles, with SPMD parallelism implemented by distributing boxes over processors (see the sketch after this list). • Layer 2. Tools for managing interactions between different levels of refinement in an AMR calculation - interpolation, averaging operators, coarse-fine boundary conditions. • Layer 3. Solver libraries - AMR-multigrid solvers, Berger-Oliger time-stepping. • Layer 4. Complete parallel applications. • Utility layer. Support and interoperability libraries - API for HDF5 I/O, visualization package implemented on top of VTK, C APIs.
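To make the Layer-1 "set calculus" concrete, here is a minimal self-contained stand-in (not the actual Chombo classes, which are dimension-independent and much richer) showing box intersection and refinement:

```cpp
#include <algorithm>
#include <iostream>

// Minimal stand-in for a Layer-1 box: an axis-aligned index rectangle in 2D.
struct Box {
  int lo[2], hi[2];                       // inclusive index bounds
  bool ok() const { return lo[0] <= hi[0] && lo[1] <= hi[1]; }
};

// Set-calculus primitive: intersection of two boxes.
Box intersect(const Box& a, const Box& b) {
  Box r;
  for (int d = 0; d < 2; ++d) {
    r.lo[d] = std::max(a.lo[d], b.lo[d]);
    r.hi[d] = std::min(a.hi[d], b.hi[d]);
  }
  return r;
}

// Refine a box by an integer ratio (index space of the finer level).
Box refine(const Box& b, int ratio) {
  Box r;
  for (int d = 0; d < 2; ++d) {
    r.lo[d] = b.lo[d] * ratio;
    r.hi[d] = (b.hi[d] + 1) * ratio - 1;
  }
  return r;
}

int main() {
  Box a{{0, 0}, {15, 15}}, b{{8, 8}, {23, 23}};
  Box c = intersect(a, b);
  std::cout << "overlap ok: " << c.ok() << "\n";    // 1
  Box f = refine(c, 2);                             // 2x finer index space
  std::cout << f.lo[0] << ".." << f.hi[0] << "\n";  // 16..31
  return 0;
}
```

Primitives like these are what the higher layers compose, for example to locate coarse-fine interfaces for the Layer-2 boundary operators.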
Distributed Data on Unions of Rectangles Provides a general mechanism for distributing data defined on unions of rectangles onto processors, and for communication between processors. • Metadata of which all processors have a copy: BoxLayout is a collection of Boxes and processor assignments. • template <class T> LevelData<T> and other container classes hold data distributed over multiple processors. For each k = 1, ..., nGrids, an “array” of type T corresponding to the box B_k is located on processor p_k. • Straightforward APIs for copying, exchanging ghost-cell data, and iterating over the arrays on your processor in an SPMD manner (sketched below).
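A sketch of how this container API is used, based on the names in the text; the exact constructor signatures and headers may differ across Chombo releases:

```cpp
#include "DisjointBoxLayout.H"
#include "FArrayBox.H"
#include "LevelData.H"

// Build the replicated metadata and a distributed field, then touch only
// the locally owned boxes. (LevelData is built on the disjoint variant of
// BoxLayout, so the input boxes are assumed non-overlapping.)
void fillAndExchange(const Vector<Box>& boxes, const Vector<int>& procIDs)
{
  DisjointBoxLayout layout(boxes, procIDs); // every rank holds a full copy

  // One data component per cell, one ghost cell on each side of every box.
  LevelData<FArrayBox> phi(layout, 1, IntVect::Unit);

  // SPMD iteration: each processor visits only the boxes assigned to it.
  for (DataIterator dit = layout.dataIterator(); dit.ok(); ++dit)
    phi[dit()].setVal(0.0);

  // Fill ghost cells from neighboring boxes, communicating where needed.
  phi.exchange();
}
```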
Typical I/O requirements • Loads are balanced to fill available memory on all processors. • Typical output data size corresponding to a single time slice: 10% - 100% of total memory image. • Current problems scale to 100 - 1000 processors. • Combustion and astrophysics simulations write one file / processor; other applications use Chombo API for HDF5.
HDF5 I/O [Figure: boxes 1-9 distributed across processors P0-P2, then laid out end-to-end as one flattened dataset on disk.] • The HDF5 file layout mirrors a file system: the disk file is the root group “/”, groups are subdirectories, and attributes and datasets play the role of files. Attribute: small metadata that multiple processes in an SPMD program can write out redundantly. Dataset: large data; each processor writes out only what it owns. • Chombo API for HDF5 • Parallel-neutral: the processor layout can change when output data is read back in. • Dataset creation is expensive: create only one dataset for each LevelData. The data for each patch is written at offsets from the origin of that dataset (sketched below).
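In terms of the underlying HDF5 calls, the pattern of one dataset per LevelData, with each processor writing its patches at precomputed offsets, looks roughly like this (a schematic, not the Chombo implementation; the file name, dataset name, and 1-D flattening are illustrative):

```cpp
#include <hdf5.h>
#include <mpi.h>

// Schematic: every rank opens the same file collectively, one dataset is
// created per LevelData, and each rank writes only the flattened patches
// it owns, at offsets computed from the replicated BoxLayout metadata.
void writeLevel(MPI_Comm comm, const double* myData,
                hsize_t myOffset, hsize_t myCount, hsize_t totalCount)
{
  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);       // parallel file access
  hid_t file = H5Fcreate("level0.hdf5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

  // One dataset for the whole level: dataset creation is the expensive
  // collective operation, so it happens once, not once per patch.
  hid_t filespace = H5Screate_simple(1, &totalCount, NULL);
  hid_t dset = H5Dcreate2(file, "level0_data", H5T_NATIVE_DOUBLE, filespace,
                          H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

  // Each rank selects the slab holding its own flattened patch data.
  H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &myOffset, NULL,
                      &myCount, NULL);
  hid_t memspace = H5Screate_simple(1, &myCount, NULL);
  H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, H5P_DEFAULT, myData);

  H5Sclose(memspace); H5Sclose(filespace);
  H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
}
```

Because dataset creation is a collective metadata operation, doing it once per LevelData rather than once per patch is what keeps the metadata cost bounded as the number of patches grows.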
Performance Analysis (Shan and Shalf, 2006) [Figure: per-processor patch data mapped into the flattened level dataset.] • Observed performance of HDF5 applications in Chombo: no weak scaling. More detailed measurements indicate two causes: misalignment with disk-block boundaries, and lack of aggregation.
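Both causes have standard knobs at the HDF5 / MPI-IO layer; a hedged sketch of how they might be set (the threshold, alignment size, and hints are illustrative, not settings reported by Shan and Shalf):

```cpp
#include <hdf5.h>
#include <mpi.h>

// File-access property list tuned for alignment and write aggregation.
hid_t makeTunedFapl(MPI_Comm comm)
{
  MPI_Info info;
  MPI_Info_create(&info);
  // ROMIO hint asking MPI-IO to aggregate small writes (collective buffering).
  MPI_Info_set(info, "romio_cb_write", "enable");

  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, comm, info);
  // Place objects of 64 KB or more on (assumed) 1 MB disk-block boundaries.
  H5Pset_alignment(fapl, 64 * 1024, 1024 * 1024);
  MPI_Info_free(&info);
  return fapl;
}

// Dataset-transfer property list requesting collective rather than
// independent writes, so many small per-patch writes can be combined.
hid_t makeCollectiveXfer()
{
  hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
  H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
  return dxpl;
}
```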
Future Requirements • Weak scaling to 10^4 processors. • Need for finer time resolution will add another 10x in data. • Other data types: sparse data, particles. • One file / processor doesn’t scale. • Interfaces to VACET, FastBit.
Potential for Collaboration with SDM • Common AMR data API developed under SciDAC I. • APDEC weak scaling benchmark for solvers could be extended to I/O. • Minimum buy-in: high-level API, portability, sustained support.