1 / 91

Overview of the Global Arrays Parallel Software Development Toolkit

Overview of the Global Arrays Parallel Software Development Toolkit. Jarek Nieplocha 1 , Bruce Palmer 1 , Manojkumar Krishnan 1 , Vinod Tipparaju 1, P. Saddayappan 2 1 Pacific Northwest National Laboratory 2 Ohio State University. Outline. Writing, Building, and Running GA Programs

terrancec
Télécharger la présentation

Overview of the Global Arrays Parallel Software Development Toolkit

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Overview of the Global ArraysParallel Software Development Toolkit Jarek Nieplocha1 , Bruce Palmer1, Manojkumar Krishnan1, Vinod Tipparaju1, P. Saddayappan2 1Pacific Northwest National Laboratory 2Ohio State University

  2. Outline • Writing, Building, and Running GA Programs • Basic Calls • Intermediate Calls • Advanced Calls Global Arrays Tutorial

  3. Writing, Building and Running GA programs • Writing GA programs • Compiling and linking • Running GA programs • For detailed information • GA Webpage • GA papers, APIs, user manual, etc. • (Google: Global Arrays) • http://www.emsl.pnl.gov/docs/global/ • GA User Manual • http://www.emsl.pnl.gov/docs/global/user.html • GA API Documentation • GA Webpage => User Interface • http://www.emsl.pnl.gov/docs/global/userinterface.html • GA Support/Help • hpctools@pnl.gov or hpctools@emsl.pnl.gov • 2 Mailing lists: GA User Forum, GA Announce Global Arrays Tutorial

  4. Writing GA Programs • GA Definitions and Data types • C programs include files:ga.h, macdecls.h' • Fortran programs should include the files: mafdecls.fh, global.fh. • GA Initialize, GA_Terminate --> initializes and terminates GA library #include <stdio.h> #include "mpi.h“ #include "ga.h" #include "macdecls.h" int main( int argc, char **argv ) { MPI_Init( &argc, &argv ); GA_Initialize(); printf( "Hello world\n" ); GA_Terminate(); MPI_Finalize(); return 0; } Global Arrays Tutorial

  5. Writing GA Programs • GA requires the following functionalities from a message passing library (MPI/TCGMSG) • initialization and termination of processes • Broadcast, Barrier • a function to abort the running parallel job in case of an error • The message-passing library has to be, • initialized before the GA library • terminated after the GA library is terminated • GA Compatible with MPI #include <stdio.h> #include "mpi.h“ #include "ga.h" #include "macdecls.h" int main( int argc, char **argv ) { MPI_Init( &argc, &argv ); GA_Initialize(); printf( "Hello world\n" ); GA_Terminate(); MPI_Finalize(); return 0; } Global Arrays Tutorial

  6. Compiling and Linking GA Programs • 2 ways • GA Makefile in global/testing • Your Makefile • GA Makefile in global/testing • To compile and link your GA based program, for example “app.c” (or “app.f”, ..) • Copy to $GA_DIR/global/testing, and type • make app.x or gmake app.x • Compile any test program in GA testing directory, and use the appropriate compile/link flags in your program Global Arrays Tutorial

  7. Compiling and Linking GA Programs (cont.) • Your Makefile • Please refer to the INCLUDES, FLAGS and LIBS variables, which will be printed at the end of successful GA installation on your platform • You can use these variables in your Makefile • For example: gcc $(INCLUDES) $(FLAGS) –o ga_test ga_test.c $(LIBS) INCLUDES = -I./include -I/msrc/apps/mpich-1.2.6/gcc/ch_shmem/include LIBS = -L/msrc/home/manoj/GA/cvs/lib/LINUX -lglobal -lma -llinalg -larmci -L/msrc/apps/mpich-1.2.6/gcc/ch_shmem/lib -lmpich –lm For Fortran Programs: FLAGS= -g -Wall -funroll-loops -fomit-frame-pointer -malign-double -fno-second-underscore -Wno-globals For C Programs: FLAGS = -g -Wall -funroll-loops -fomit-frame-pointer -malign-double -fno-second-underscore -Wno-globals • NOTE: Please refer to GA user manual chapter 2 for more information Global Arrays Tutorial

  8. Running GA Programs • Example: Running a test program “ga_test” on 2 processes • mpirun -np 2 ga_test • Running GA program is same as MPI Global Arrays Tutorial

  9. Setting GA Memory Limits • GA offers an optional mechanism that allows a programmer to limit aggregate memory consumption used by GA for storing Global Array data • Good for systems with limited-memory (e.g. BlueGene/L) • GA uses Memory Allocator (MA) library for internal temporary buffers • allocated dynamically, and deallocated when operation  completes. • Users should specify memory usage of their application during GA initialization • Can be specified in MA_Init(..) call after GA_Initialize() ! Initialization of MA and setting GA memory limits call ga_initialize() if (ga_uses_ma()) then status = ma_init(MT_DBL, stack, heap+global) else status = ma_init(mt_dbl,stack,heap) call ga_set_memory_limit(ma_sizeof(MT_DBL,global,MT_BYTE)) endif if(.not. status) ... !we got an error condition here Global Arrays Tutorial

  10. Outline • Writing, Building, and Running GA Programs • Basic Calls • Intermediate Calls • Advanced Calls Global Arrays Tutorial

  11. GA Basic Operations • GA programming model is very simple. • Most of the parallel programs can be written with these basic calls • GA_Initialize, GA_Terminate • GA_Nnodes, GA_Nodeid • GA_Create, GA_Destroy • GA_Put, GA_Get • GA_Sync Global Arrays Tutorial

  12. GA Initialization/Termination • There are two functions to initialize GA: • Fortran • subroutine ga_initialize() • subroutine ga_initialize_ltd(limit) • C • void GA_Initialize() • void GA_Initialize_ltd(size_t limit) • To terminate a GA program: • Fortran subroutine ga_terminate() • C void GA_Terminate() integer limit - amount of memory in bytes per process     [input] program main #include “mafdecls.h” #include “global.fh” integer ierr c call mpi_init(ierr) call ga_initialize() c write(6,*) ‘Hello world’ c call ga_terminate() call mpi_finilize() end Global Arrays Tutorial

  13. Parallel Environment - Process Information • Parallel Environment: • how many processes are working together (size) • what their IDs are (ranges from 0 to size-1) • To return the process ID of the current process: • Fortran integer function ga_nodeid() • C int GA_Nodeid() • To determine the number of computing processes: • Fortran integer function ga_nnodes() • C int GA_Nnodes() • To determine the coordinates of a processor: • Fortran subroutine nga_proc_topology(ga, proc, coordinates) • C void NGA_Proc_topology(int g_a, int proc, int coordinates[]) integer g_a [input] integer proc [input] integer coordinates(ndim) coordinates in processor grid [output] Global Arrays Tutorial

  14. Parallel Environment - Process Information (EXAMPLE) program main #include “mafdecls.h” #include “global.fh” integer ierr,me,nproc call mpi_init(ierr) call ga_initialize() me = ga_nodeid() size = ga_nnodes() write(6,*) ‘Hello world: My rank is ’ + me + ‘ out of ‘ + ! size + ‘processes/nodes’ call ga_terminate() call mpi_finilize() end Sample Output: mpirun –np 4 helloworld Hello world: My rank is 0 out of 4 processes/nodes Hello world: My rank is 1 out of 4 processes/nodes Hello world: My rank is 2 out of 4 processes/nodes Hello world: My rank is 3 out of 4 processes/nodes Global Arrays Tutorial

  15. GA Data Types • C Data types • C_INT - int • C_LONG - long • C_FLOAT - float • C_DBL - double • C_DCPL - double complex • Fortran Data types • MT_F_INT - integer (4/8 bytes) • MT_F_REAL - real • MT_F_DBL - double precision • MT_F_DCPL - double complex Global Arrays Tutorial

  16. Creating/Destroying Arrays • To create an array with regular distribution: • Fortran logical function nga_create(type, ndim, dims, name, chunk, g_a) • C int NGA_Create(int type, int ndim, int dims[], char *name, int chunk[]) character*(*) name - a unique character string [input] integer type - GA data type [input] integer dims() - array dimensions [input] integer chunk() - minimum size that dimensions should be chunked into [input] integer g_a - handle for future references [output] dims(1) = 5000 dims(2) = 5000 chunk(1) = -1 !Use defaults chunk(2) = -1 if (.not.nga_create(MT_F_DBL,2,dims,’Array_A’,chunk,g_a)) + call ga_error(“Could not create global array A”,g_a) Global Arrays Tutorial

  17. Creating/Destroying Arrays (cont.) • To create an array with irregular distribution: • Fortran logical function nga_create_irreg (type, ndim, dims, array_name, map, nblock, g_a) • C int NGA_Create_irreg(int type, int ndim, int dims[], int nblock[], int map[]) character*(*) name - a unique character string [input] integer type - GA datatype [input] integer dims (ndim)    - array dimensions [input] integer nblock(ndim) - no. of blocks each dimension is divided into [input] integer map(s) - starting index of each block [input] integer g_a - integer handle for future references [output] Global Arrays Tutorial

  18. Creating/Destroying Arrays (cont.) • Example of irregular distribution: • The distribution is specified as a Cartesian product of distributions for each dimension. The array indices start at 1. • The figure demonstrates distribution of a 2-dimensional array 8x10 on 6 (or more) processors. block[2]={3,2}, the size of map array is s=5 and array map contains the following elements map={1,3,7, 1, 6}. • The distribution is nonuniform because, P1 and P4 get 20 elements each and processors P0,P2,P3, and P5 only 10 elements each. 5 5 P0 P3 2 P1 P4 4 P2 P5 2 block(1) = 3 block(2) = 2 map(1) = 1 map(2) = 3 map(3) = 7 map(4) = 1 map(5) = 6 if (.not.nga_create_irreg(MT_F_DBL,2,dims,’Array_A’,map,block,g_a)) + call ga_error(“Could not create global array A”,g_a) Global Arrays Tutorial

  19. Creating/Destroying Arrays (cont.) • To duplicate an array: • Fortran logical function ga_duplicate(g_a, g_b, name) • C int GA_Duplicate(int g_a, char *name) • Global arrays can be destroyed by calling the function: • Fortran subroutine ga_destroy(g_a)   • C void GA_Destroy(int g_a) integer g_a, g_b; character*(*) name; name - a character string [input] g_a - Integer handle for reference array [input] g_b - Integer handle for new array [output] call nga_create(MT_F_INT,dim,dims, + ‘array_a’,chunk,g_a) call ga_duplicate(g_a,g_b,‘array_b’) call ga_destroy(g_a) Global Arrays Tutorial

  20. Put/Get • Put copies data from the local array to the global array section: • Fortran subroutine nga_put(g_a, lo, hi, buf, ld) • C void NGA_Put(int g_a, int lo[], int hi[], void *buf, int ld[]) • Get copies data from a global array section to the local array: • Fortran subroutine nga_get(g_a, lo, hi, buf, ld) • C void NGA_Get(int g_a, int lo[], int hi[], void *buf, int ld[]) integer g_a global array handle [input] integer lo(),hi() limits on data block to be moved [input] double precision/complex/integer buf local buffer [output] integer ld() array of strides for local buffer [input] Global Arrays Tutorial

  21. Put/Get (cont.) • Example of get operation: • transfer data from the (7:15,1:8) section of a 2-dimensional 15 x10 global array into a local buffer which is a 10 x10 array • lo={7,1}, hi={15,8}, ld={10} global lo local hi double precision buf(10,10) : : call nga_get(g_a,lo,hi,buf,ld) Global Arrays Tutorial

  22. Accumulate • Accumulate combines the data from the local array with data in the global array section: • Fortran subroutine nga_acc(g_a, lo, hi, buf, ld, alpha) • C void NGA_Acc(int g_a, int lo[], int hi[], void *buf, int ld[], void *alpha) integer g_a array handle [input] integer lo(), hi() limits on data block to be moved [input] double precision/complex buf local buffer [input] integer ld() array of strides for local buffer [input] double precision/complex alpha arbitrary scale factor [input] global local ga(i,j) = ga(i,j)+alpha*buf(k,l) Global Arrays Tutorial

  23. Sync • Sync is a collective operation • It acts as a barrier, which synchronizes all the processes and ensures that all the Global Array operations are complete at the call • The functions are: • Fortran subroutine ga_sync() • C void GA_Sync() sync Global Arrays Tutorial

  24. Message-Passing Wrappers to Reduce/Broadcast Operations root • To broadcast from process root to all other processes: • Fortran subroutine ga_brdcst(type, buf, lenbuf, root) • C void GA_Brdcst(void *buf, int lenbuf, int root) integer type       [input] ! message type for broadcast byte buf(lenbuf) [input/output] integer lenbuf [input] integer root [input] double precision buf(lenbuf) : size = 8*lenbuf call ga_brdcst(msg_id,buf,size,0) Global Arrays Tutorial

  25. Message-Passing Wrappers to Reduce/Broadcast Operations (cont.) • To sum elements across all nodes and broadcast result to all nodes: • Fortran • Subroutine ga_igop(type, x, n, op) • Subroutine ga_dgop(type, x, n, op) • C • Void GA_Igop(long x[], int n, char *op) • Void GA_Dgop(double x[], int n, char *op) integer type [input] <type> x(n) vector of elements [input/output] character*(*) OP operation [input] integer n number of elements [input] double precision rbuf(len) : call ga_dgop(msg_id,rbuf,len,’+’) Global Arrays Tutorial

  26. Shared Object Shared Object get copy to local memory copy to shared object put compute/update local memory local memory local memory Global Array Model of Computations Global Arrays Tutorial

  27. Locality Information • Discover array elements held by each processor • Fortran nga_distribution(g_a,proc,lo,hi) • C void NGA_Distribution(int g_a, int proc, int *lo, int *hi) integer g_a array handle [input] integer proc processor ID [input] Integer lo(ndim) lower index [output] Integer hi(ndim) upper index [output] do iproc = 1, nproc write(6,*) ‘Printing g_a info for processor’,iproc call nga_distribution(g_a,iproc,lo,hi) do j = 1, ndim write(6,*) j,lo(j),hi(j) end do dnd do Global Arrays Tutorial

  28. Example: Matrix Multiply global arrays representing matrices = • nga_put nga_get = • dgemm local buffers on the processor Global Arrays Tutorial

  29. Matrix Multiply (a better version) more scalable! (less memory, higher parallelism) = • atomic accumulate get = • dgemm local buffers on the processor Global Arrays Tutorial

  30. Outline • Writing, Building, and Running GA Programs • Basic Calls • Intermediate Calls • Advanced Calls Global Arrays Tutorial

  31. Basic Array Operations • Whole Arrays: • To set all the elements in the array to zero: • Fortran subroutine ga_zero(g_a) • C void GA_Zero(int g_a) • To assign a single value to all the elements in array: • Fortran subroutine ga_fill(g_a, val) • C void GA_Fill(int g_a, void *val) • To scale all the elements in the array by factorval: • Fortran subroutine ga_scale(g_a, val) • C void GA_Scale(int g_a, void *val) Global Arrays Tutorial

  32. 0 1 2 3 4 5 0 1 2 6 7 8 3 4 5 6 7 8 Basic Array Operations (cont.) • Whole Arrays: • To copy data between two arrays: • Fortran subroutine ga_copy(g_a, g_b) • C void GA_Copy(int g_a, int g_b) • Arrays must be same size and dimension • Distribution may be different call ga_create(MT_F_INT,ndim,dims, ‘array_A’,chunk_a,g_a) call nga_create(MT_F_INT,ndim,dims, ‘array_B’,chunk_b,g_b) ... Initialize g_a .... call ga_copy(g_a, g_b) “g_a” “g_b” Global Arrays g_a and g_b distributed on a 3x3 process grid Global Arrays Tutorial

  33. 0 1 2 0 1 2 3 4 5 3 4 5 6 7 8 6 7 8 Basic Array Operations (cont.) • Patch Operations: • The copy patch operation: • Fortran subroutine nga_copy_patch(trans, g_a, alo, ahi, g_b, blo, bhi) • C void NGA_Copy_patch(char trans, int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[]) • Number of elements must match “g_a” “g_b” Copy Global Arrays Tutorial

  34. Basic Array Operations (cont.) • Patches (Cont): • To set only the region defined by lo and hi to zero: • Fortran subroutine nga_zero_patch(g_a, lo, hi) • C void NGA_Zero_patch(int g_a, int lo[] int hi[]) • To assign a single value to all the elements in a patch: • Fortran subroutine nga_fill_patch(g_a, lo, hi, val) • C voidNGA_Fill_patch(int g_a, int lo[] int hi[], void *val) • To scale the patch defined by lo and hi by the factor val: • Fortran subroutine nga_scale_patch(g_a, lo, hi, val) • C voidNGA_Scale_patch(int g_a, int lo[] int hi[], void *val) • The copy patch operation: • Fortran subroutine nga_copy_patch(trans, g_a, alo, ahi, g_b, blo, bhi) • C void NGA_Copy_patch(char trans, int g_a, int alo[], int ahi[], int g_b, int blo[], int bhi[]) Global Arrays Tutorial

  35. Scatter/Gather • Scatter puts array elements into a global array: • Fortran subroutine nga_scatter(g_a, v, subscrpt_array, n) • C void NGA_Scatter(int g_a, void *v, int *subscrpt_array[], int n) • Gather gets the array elements from a global array into a local array: • Fortran subroutine nga_gather(g_a, v, subscrpt_array, n) • C void NGA_Gather(int g_a, void *v, int *subscrpt_array[], int n) integer g_a array handle [input] double precision v(n) array of values [input/output] Integer n number of values [input] integer subscrpt_array location of values in global array [input] Global Arrays Tutorial

  36. Scatter/Gather (cont.) • Example of scatter operation: • Scatter the 5 elements into a 10x10 global array • Element 1 v[0] = 5 subsArray[0][0] = 2 subsArray[0][1] = 3 • Element 2 v[1] = 3 subsArray[1][0] = 3 subsArray[1][1] = 4 • Element 3 v[2] = 8 subsArray[2][0] = 8 subsArray[2][1] = 5 • Element 4 v[3] = 7 subsArray[3][0] = 3 subsArray[3][1] = 7 • Element 5 v[4] = 2 subsArray[4][0] = 6 subsArray[4][1] = 3 • After the scatter operation, the five elements would be scattered into the global array as shown in the figure. integer subscript(ndim,nlen) : call nga_scatter(g_a,v,subscript,nlen) Global Arrays Tutorial

  37. Cluster Information • To return the total number of nodes that the program is running on: • Fortran integer function ga_cluster_nnodes() • C int GA_Cluster_nnodes() • To return the node ID of the process: • Fortran integer function ga_cluster_nodeid() • C int GA_Cluster_nodeid() Global Arrays Tutorial

  38. Cluster Information (cont.) • To return the number of processors available on node inode: • Fortran integer function ga_cluster_nprocs(inode) • C int GA_Cluster_nprocs(int inode). • To return the processor ID associated with node inode and the local processor ID iproc: • Fortran integer function ga_cluster_procid(inode, iproc) • C int GA_Cluster_procid(int inode, int iproc) integer inode [input] inode [input] integer inode,iproc [input] inode,iproc [input] Global Arrays Tutorial

  39. Cluster Information (cont.) • Example: • 2 nodes with 4 processors each. Say, there are 7 processes created. • Assume 4 processes on node 0 and 3 processes on node 1. • In this case: • number of nodes=2, node id is either 0 or 1 (e.x., nodeid of process 2 is 0) • number of processes in node 0 is 4 and node 1 is 3 • The global rank of each process and the local rank (rank of the process within the node.i.e.cluster_procid) is shown in the paranthesis. Global Arrays Tutorial

  40. Read and Increment • Read_inc remotely updates a particular element in an integer global array: • Fortran integer function nga_read_inc(g_a, subscript, inc) • C long NGA_Read_inc(int g_a, int subscript[], long inc) • Applies to integer arrays only • Example: can be used as a global counter integer g_a [input] integer subscript(ndim), inc [input] c Create task counter call nga_create(MT_F_INT,one,one,chunk,g_counter) call ga_zero(g_counter) : itask = nga_read_inc(g_counter,one,one) ... Translate itask into task ... Global Arrays Tutorial

  41. Outline • Writing, Building, and Running GA Programs • Basic Calls • Intermediate Calls • Advanced Calls Global Arrays Tutorial

  42. Access • To provide direct access to local data in the specified patch of the array owned by the calling process: • Fortran subroutine nga_access(g_a, lo, hi, index, ld) • C void NGA_Access(int g_a, int lo[], int hi[], void *ptr, int ld[]) • Processes can access the local position of the global array • Process “0” can access the specified patch of its local position of the array • Avoids memory copy 0 1 2 Access: gives a pointer to this local patch 3 4 5 6 7 8 call nga_create(MT_F_DBL,2,dims,’Array’,chunk,g_a) : call nga_distribution(g_a,me,lo,hi) call nga_access(g_a,lo,hi,index,ld) call do_subroutine_task(dbl_mb(index),ld(1)) subroutine do_subroutine_task(a,ld1) double precision a(ld1,*) Global Arrays Tutorial

  43. Locality Information (cont.) • Global Arrays support abstraction of a distributed array object • Object is represented by an integer handle • A process can access its portion of the data in the global array • To do this, the following steps need to be taken: • Find the distribution of an array, which part of the data the calling process own • Access the data • Operate on the data: read/write • Release the access to the data Global Arrays Tutorial

  44. Non-blocking Operations • The non-blocking APIs are derived from the blocking interface by adding a handle argument that identifies an instance of the non-blocking request. • Fortran • subroutine nga_nbput(g_a, lo, hi, buf, ld, nbhandle) • subroutine nga_nbget(g_a, lo, hi, buf, ld, nbhandle) • subroutine nga_nbacc(g_a, lo, hi, buf, ld, alpha, nbhandle) • subroutine nga_nbwait(nbhandle) • C • void NGA_NbPut(int g_a, int lo[], int hi[], void *buf,int ld[], ga_nbhdl_t* nbhandle) • void NGA_NbGet(int g_a, int lo[], int hi[], void *buf, int ld[], ga_nbhdl_t* nbhandle) • void NGA_NbAcc(int g_a, int lo[], int hi[], void *buf, int ld[], void *alpha, ga_nbhdl_t* nbhandle) • int NGA_NbWait(ga_nbhdl_t* nbhandle) integer nbhandle - non-blocking request handle [output/input] Global Arrays Tutorial

  45. Non-Blocking Operations double precision buf1(nmax,nmax) double precision buf2(nmax,nmax) : call nga_nbget(g_a,lo1,hi1,buf1,ld1,nb1) ncount = 1 do while(.....) if (mod(ncount,2).eq.1) then ... Evaluate lo2, hi2 call nga_nbget(g_a,lo2,hi2,buf2,nb2) call nga_wait(nb1) ... Do work using data in buf1 else ... Evaluate lo1, hi1 call nga_nbget(g_a,lo1,hi1,buf1,nb1) call nga_wait(nb2) ... Do work using data in buf2 endif ncount = ncount + 1 end do Global Arrays Tutorial

  46. SUMMA Matrix Multiplication Issue NB Get A and B blocks do (until last chunk) issue NB Get to the next blocks wait for previous issued call compute A*B (sequential dgemm) NB atomic accumulate into “C” matrix done Computation Comm. (Overlap) C=A.B A B Advantages: - Minimum memory - Highly parallel - Overlaps computation and communication - latency hiding - exploits data locality - patch matrix multiplication (easy to use) - dynamic load balancing = patch matrix multiplication Global Arrays Tutorial

  47. SUMMA Matrix Multiplication:Improvement over PBLAS/ScaLAPACK Global Arrays Tutorial

  48. Processor Groups • To create a new processor group: • Fortran integer function ga_pgroup_create(list, size) • C intGA_Pgroup_create(int *list, int size) • To assign a processor groups: • Fortran subroutine ga_set_pgroup(g_a, p_handle) • C          voidGA_Set_pgroup(int g_a, int p_handle) integer g_a - global array handle [input] integer p_handle - processor group handle [input] list[size] list of processor IDs in group [input] size number of processors in group [input] Global Arrays Tutorial

  49. Processor Groups group A group B group C world group Global Arrays Tutorial

  50. Processor Groups (cont.) • To set the default processor group • Fortran subroutine ga_pgroup_set_default(p_handle) • C voidGA_Pgroup_set_default(int p_handle) • To access information about the processor group: • Fortran • integer function ga_pgroup_nnodes(p_handle) • integer function ga_pgroup_nodeid(p_handle) • C • intGA_Pgroup_nnodes(int p_handle) • intGA_Pgroup_nodeid(int p_handle) integer p_handle - processor group handle [input] Global Arrays Tutorial

More Related