
Distributed Array Component based on Global Arrays

Manoj Krishnan, Jarek Nieplocha. High Performance Computing Group, Pacific Northwest National Laboratory. CCA Forum. Overview: Global Arrays, Distributed Array Component, Core Capabilities, Applications, Future Work.



Presentation Transcript


  1. Distributed Array Component based on Global Arrays. Manoj Krishnan, Jarek Nieplocha, High Performance Computing Group, Pacific Northwest National Laboratory. CCA Forum.

  2. Overview • Global Arrays • Distributed Array Component • Core Capabilities • Applications • Future Work

  3. Global Arrays
     • shared memory model in the context of distributed dense arrays
     • complete environment for parallel code development
     • compatible with MPI
     • ~140 functions
     • data locality control similar to the distributed memory/message passing model
     [Figure: a physically distributed dense array presented as a single, shared data structure with global indexing, e.g., A(4,3) rather than buf(7) on task 2]
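
The global-indexing idea on this slide can be illustrated with a small index-translation sketch. It is not the Global Arrays API: `BlockLayout`, `Location`, and `locate` are hypothetical names, and the sketch assumes a regular 2-D block distribution, zero-based indices, row-major task ranks, and local buffers padded to the full block size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration, not the Global Arrays API: a regular block
// distribution of a rows x cols array over a prow x pcol process grid.
struct BlockLayout {
    std::size_t rows, cols;  // global array shape
    std::size_t prow, pcol;  // process grid shape

    // Block extents, rounded up so every element has an owner.
    std::size_t blockRows() const { return (rows + prow - 1) / prow; }
    std::size_t blockCols() const { return (cols + pcol - 1) / pcol; }
};

struct Location {
    std::size_t task;    // owning task (row-major rank in the process grid)
    std::size_t offset;  // row-major offset inside that task's local buffer
};

// Translate a zero-based global index (i, j) into (owner task, local offset).
// Assumes each local buffer is padded to blockRows() x blockCols().
inline Location locate(const BlockLayout& l, std::size_t i, std::size_t j) {
    assert(i < l.rows && j < l.cols);
    const std::size_t br = l.blockRows(), bc = l.blockCols();
    const std::size_t pi = i / br, pj = j / bc;  // process grid coordinates
    const std::size_t li = i % br, lj = j % bc;  // coordinates inside the block
    return Location{pi * l.pcol + pj, li * bc + lj};
}
```

With a global view, the caller writes only `(i, j)`; the owner and buffer offset that a message-passing code would track by hand are derived by the library layer.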

  4. Global Array Model of Computations
     [Figure: (1) copy from the shared object to local memory (1-sided communication: get); (2) compute/update in local memory; (3) copy from local memory back to the shared object (1-sided communication: put)]
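
The get / compute / put cycle in this model can be sketched with a single-process stand-in. `SharedVector` and `incrementPatch` are hypothetical names invented for illustration; a real global array spans the memory of many tasks and moves data with one-sided communication rather than plain copies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-process stand-in for a shared object (hypothetical, for illustration).
class SharedVector {
public:
    explicit SharedVector(std::size_t n) : data_(n, 0.0) {}

    // "get": copy the half-open range [lo, hi) into local memory.
    std::vector<double> get(std::size_t lo, std::size_t hi) const {
        assert(lo <= hi && hi <= data_.size());
        return std::vector<double>(data_.begin() + lo, data_.begin() + hi);
    }

    // "put": copy a local buffer back into the shared object, starting at lo.
    void put(std::size_t lo, const std::vector<double>& local) {
        assert(lo + local.size() <= data_.size());
        for (std::size_t k = 0; k < local.size(); ++k) data_[lo + k] = local[k];
    }

private:
    std::vector<double> data_;
};

// One get / compute / put cycle on the patch [lo, hi): add delta to each element.
inline void incrementPatch(SharedVector& shared, std::size_t lo, std::size_t hi,
                           double delta) {
    std::vector<double> local = shared.get(lo, hi);  // copy to local memory
    for (double& x : local) x += delta;              // compute/update locally
    shared.put(lo, local);                           // copy back to shared object
}
```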

  5. Structure of GA
     • application interfaces: Fortran 77, C, C++, Python, SIDL
     • distributed arrays layer: memory management, index translation
     • Message Passing: process creation, run-time environment
     • ARMCI: portable 1-sided communication (put, get, locks, etc.)
     • system-specific interfaces: LAPI, GM/Myrinet, Elan/Quadrics, threads, VIA, ..

  6. Distributed Array Component
     [Figure: GA component with its ports: GA Classic, DADF, Linear Algebra]
     • GAComponent: Classic and SIDL interfaces
     • 36+98 (direct+indirect) Global Arrays classic methods are available through GAClassicPort
     • GADADFPort provides methods, proposed by the Data Working Group of the CCA Forum, for creating array descriptors and templates

  7. GA Classic Port
     • GAClassicPort
        • provides public interfaces for creating and accessing distributed arrays, i.e., GlobalArray objects
     • GlobalArray
        • encapsulates all details of the data distribution, addressing, and data access
        • offers a set of operations for
           • one-sided data transfer (get, put, scatter, gather, etc.)
           • collective array operations
           • supporting operations for data locality control and queries

  8. class GAClassicPort : public virtual ::classic::gov::cca::Port {
         /* array creation methods, for example */
         virtual GlobalArray* createGA(…) = 0;
         virtual GlobalArray* createGA_Ghosts(…) = 0;
         /* utility operations like reduce, broadcast, etc. */
         virtual void brdcst(void *buf, int lenbuf, int root) = 0;
         /* cluster & process information, e.g. rank, size */
         nnodes(), clusterNnodes(), clusterNodeid(), clusterNprocs
         /* interprocess synchronization: locks, barrier */
         lock(), unlock(), sync(), fence(), createMutexes(), …
     }
     /* Total: 36 methods available through this port */
     class GlobalArray {
         /* one-sided communication operations */
         put(), get(), accumulate(), scatter, gather, ...
         /* collective array operations (whole and patch) */
         copy(), scale(), add(), gemm(), update_ghosts(), ...
         /* element-wise operations, ghost cell methods, matrix operations, etc. */
     }
     /* Total: 98 methods available */

  9. Core Capabilities
     • Distributed array
        • dense arrays, 1-7 dimensions
        • four data types: integer, real, double precision, double complex
        • global rather than per-task view of data structures
        • user control over data distribution: regular and irregular
     • Collective and shared-memory style operations
     • Support for ghost cells
     • Interfaces to third-party parallel numerical libraries
        • PeIGS, ScaLAPACK, SUMMA, TAO

  10. GA DADF Port
     • Provides a standard interface for defining, creating, and querying distributed arrays
     • Supports creating, cloning, and destroying arrays, array templates, and descriptors
     • DADF: Distributed Array Descriptor Factory, proposed by the Data Working Group of the CCA Forum
     • DADF Array
        • creates a distributed array
     • DADF Template
        • virtual multi-dimensional array to which one or more actual distributed arrays may be aligned
     • DADF Descriptor
        • used to query an existing distributed array

  11. class DADFPort : public virtual ::classic::gov::cca::Port {
         /* methods to create/clone/destroy descriptors, arrays, templates */
         virtual DistArrayDescriptor* createDescriptor(..) = 0;
         virtual DistArray* createArray(…) = 0;
         virtual DistArrayTemplate* createTemplate(…) = 0;
         ...
     }
     class DistArray {
         /** Set data type. */
         virtual int setDataType(const enum DataType type) = 0;
         /** Associate this data object with distribution template. */
         virtual int setTemplate(DistArrayTemplate * & templ) = 0;
         /** Sets this process's location in the process topology. */
         virtual int setMyProcCoords(const int procCoords[]) = 0;
         /** Align object to template with identity mapping. */
         virtual int setIdentityAlignmentMap() = 0;
         /** Signal that data object is completely defined. */
         virtual int commit() = 0;
         ... /* set of query & miscellaneous functions */
     }
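
The DistArray setters followed by `commit()` suggest a define-then-commit protocol. The sketch below mimics that protocol with hypothetical stand-ins (`MockTemplate`, `MockDistArray`); it does not implement the DADF interfaces. Two conventions are assumptions not stated on the slide: a return value of 0 means success, and setters are rejected after a successful commit.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; the real DADF types are only partly shown on the slide.
enum DataType { Integer, Real, DoublePrecision, DoubleComplex };
struct MockTemplate { int ndims; };  // number of template dimensions

class MockDistArray {
public:
    int setDataType(DataType type) {
        if (committed_) return -1;
        type_ = type;
        hasType_ = true;
        return 0;
    }
    int setTemplate(MockTemplate* templ) {
        if (committed_ || templ == nullptr) return -1;
        templ_ = templ;
        return 0;
    }
    // Needs the template first, because it supplies the number of coordinates.
    int setMyProcCoords(const int procCoords[]) {
        if (committed_ || templ_ == nullptr) return -1;
        coords_.assign(procCoords, procCoords + templ_->ndims);
        return 0;
    }
    int setIdentityAlignmentMap() {
        if (committed_) return -1;
        aligned_ = true;
        return 0;
    }
    // Succeeds only once every part of the definition has been supplied.
    int commit() {
        if (committed_ || !hasType_ || templ_ == nullptr || coords_.empty() || !aligned_)
            return -1;
        committed_ = true;
        return 0;
    }
    bool isCommitted() const { return committed_; }
    DataType dataType() const { return type_; }

private:
    DataType type_ = Integer;
    bool hasType_ = false;
    MockTemplate* templ_ = nullptr;
    std::vector<int> coords_;
    bool aligned_ = false;
    bool committed_ = false;
};
```

Separating definition from commit lets an implementation validate the whole description once and allocate storage only when it is complete.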

  12. Class Hierarchy
     [Figure: class diagram with DistArrayTemplate, DistArray, DADFTemplate, DADFDescriptor, DADFArray; example DAs: GA, X]

  13. GA/TAO Interoperability
     [Figure: GA and TAO components connected through CCA Services; labels: addProvidesPort, registerUsesPort, getPort("ga"); ports: GA, DADF, LA]
     • TAO: optimization component (Toolkit for Advanced Optimization, ANL) that provides advanced optimization algorithms
     • GA provides TAO with core linear algebra support for manipulating vectors, matrices, and linear solvers through LinearAlgebraPort (LA)
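
The provides/uses wiring named in the figure can be sketched with a toy registry. `MockServices`, `SimpleLA`, and the `dot` method are hypothetical, and the simplified signatures here are assumptions that differ from the actual CCA Services interface; in a real framework each component has its own services object and the framework connects the ports.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

struct Port { virtual ~Port() {} };

// Hypothetical port exposing a single linear algebra operation.
struct LinearAlgebraPort : Port {
    virtual double dot(const double* x, const double* y, int n) const = 0;
};

struct SimpleLA : LinearAlgebraPort {
    double dot(const double* x, const double* y, int n) const override {
        double s = 0.0;
        for (int k = 0; k < n; ++k) s += x[k] * y[k];
        return s;
    }
};

// Toy registry standing in for the framework's services object.
class MockServices {
public:
    // Provider side: publish a port under a name.
    void addProvidesPort(Port* port, const std::string& name) { provided_[name] = port; }
    // User side: declare the intent to use a named port.
    void registerUsesPort(const std::string& name) { used_.insert(name); }
    // User side: fetch the port; null if undeclared or not provided.
    Port* getPort(const std::string& name) const {
        if (used_.count(name) == 0) return nullptr;
        auto it = provided_.find(name);
        return it == provided_.end() ? nullptr : it->second;
    }

private:
    std::map<std::string, Port*> provided_;
    std::set<std::string> used_;
};
```

The user component depends only on the abstract port type, so a different provider can be swapped in without changing the caller.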

  14. GA Component in Applications (I)
     [Figure: GA and LJMD components connected through CCA Services; labels: addProvidesPort, registerUsesPort, getPort("ga"); ports: GA, DADF, LA]
     • Lennard-Jones Molecular Dynamics
     • Force decomposition method and dynamic load balancing (improves performance over the traditional message-passing version by S. Plimpton, Sandia)
     • Component overhead is negligible (<1%)
     • Good scaling (simulation of 12,000 atoms yields a speedup of 7.86 on 8 processors)
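
For reference, the reported speedup can be converted to parallel efficiency using the standard definition (speedup divided by processor count); the symbols E, S, and p are introduced here and do not appear on the slide:

```latex
E = \frac{S(p)}{p} = \frac{7.86}{8} \approx 0.98
```

That is, roughly 98% efficiency on 8 processors for the 12,000-atom run.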

  15. GA Component in Applications (II) • Chemistry: Molecular geometry optimization (between GA and TAO)

  16. [Figure: component wiring diagram with GA, CFD Solver, and Visualization components connected through CCA Services; labels: addProvidesPort, registerUsesPort, getPort("ga"); ports: GA, DADF, LA]

  17. Application Areas
     • electronic structure
     • biology
     • glass flow simulation
     • visualization and image analysis
     • thermal flow simulation
     • material sciences
     • molecular dynamics
     • others: financial security forecasting, astrophysics, geosciences

  18. Future Work
     • Additional capabilities in the GA component, including operations needed to support more TAO optimization algorithms
        • this will also involve new nonblocking communication interfaces
     • Implementation of a component that interfaces with secondary storage (parallel I/O)
     • Verify component usability for large applications
     • Study the performance and overhead associated with CCA
     • ESI (or any generic solver) interfaces to the distributed array component

  19. Feedback
     • Goal: provide a generic distributed array component
     • We would like to know
        • applications that need distributed array components
        • functionality that applications expect
        • additions/modifications required
        • suggestions to make it more generic
        • whether DADF should include communication interfaces (put/get)
     • Priorities will be set based on feedback
