MPI and High Performance Computing: Systems and Programming Barry Britt, Systems Administrator Department of Computer Science Iowa State University
Purpose To give you: • … an overview of some new system-level MPI functions • … access to tools that you need to compile and run MPI jobs • … some instruction in the creation and use of Makefiles • … some instruction on how to tell time in C programs.
Makefiles • GNU Make • Enables the end user to build and install a package without worrying about the details. • Automatically figures out which files it needs to update based on which source files have changed. • Not language dependent • Not limited to building a package; can be used to install or uninstall
Makefile Rules • A rule tells Make how to execute a series of commands in order to build a target from source files. • Specifies a list of dependencies • Dependencies should include ALL files the target depends on
target: dependencies ...
	commands ...
Example Makefile for C Source
CC=gcc
CFLAGS=-Wall
INCLUDES=
BINARIES=rand test

.SUFFIXES: .c .o

.c.o:
	$(CC) $(CFLAGS) -c $*.c

all: $(BINARIES)

rand.o: rand.c
test.o: test.c

rand: rand.o
	$(CC) $(CFLAGS) -o rand rand.o

test: test.o
	$(CC) $(CFLAGS) -o test test.o

clean:
	rm -f a.out core *.o $(BINARIES)
Example Makefile for C Source CC=gcc CFLAGS=-Wall INCLUDES= BINARIES=rand test • Variables • CC selects the compiler; here it is GCC • For MPI programs, set it to mpicc, not gcc • CFLAGS holds the flags passed on every compile • -Wall: enable all common compiler warnings • (the -c flag, compile without linking, is supplied by the .c.o suffix rule below, not by CFLAGS)
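For an MPI build, only the compiler variable needs to change, since every rule compiles and links through $(CC). A minimal sketch (the hello_mpi target name is hypothetical):
CC=mpicc
CFLAGS=-Wall
BINARIES=hello_mpi

hello_mpi: hello_mpi.o
	$(CC) $(CFLAGS) -o hello_mpi hello_mpi.o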
Example Makefile for C Source clean: rm -f a.out core *.o $(BINARIES) • Target “clean”. Use by typing • make clean • Rule states: • In my current directory, run: • rm -f a.out core *.o $(BINARIES) • rm -f a.out core *.o rand test
Example Makefile for C Source .SUFFIXES: .c .o .c.o: $(CC) $(CFLAGS) -c $*.c • Makefile instruction on how to handle .c files and turn them into object (.o) files • Compile using $(CC) value with $(CFLAGS) • Compile each individual file into its appropriate .o file
Example Makefile for C Source rand.o: rand.c test.o: test.c rand: rand.o $(CC) $(CFLAGS) -o rand rand.o test: test.o $(CC) $(CFLAGS) -o test test.o • Target: rand or test • Run $(CC) $(CFLAGS) -o rand rand.o • gcc -Wall -o rand rand.o • If you were going to include external libraries to link, they would be linked at the end of the rule.
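If, for instance, rand called functions from math.h, the library flag would go after the object files; a sketch (the use of math.h here is hypothetical):
rand: rand.o
	$(CC) $(CFLAGS) -o rand rand.o -lm
Library flags come last so the linker can resolve the references the object files make into them.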
Random Generator for Matrices • Rand • -f: filename to which to write the matrix • -c: number of matrix columns • -r: number of matrix rows • -h: help documentation • -s: seed • -m: max integer in matrix cells
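A hypothetical invocation using the flags above (file name, sizes, and values are made up for illustration):
./rand -f matrix.dat -r 100 -c 100 -s 42 -m 1000
This would write a 100-row by 100-column matrix to matrix.dat, seeded with 42, with no cell value above 1000.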
Random Generator for Matrices • Completely random generation for an m by n matrix • Uses a random seed to create the matrix • Output file • First line contains the number of rows and the number of columns • Subsequent lines contain matrix cell values, one per line.
Random Generator for Matrices • For a Matrix with row length m, cell A[i,j] is on line: • m * i + j + 2 • Lines are not zero-indexed for the purpose of this calculation. • Therefore, for a 5 x 5 matrix (zero-indexed): • A[0, 0] is on line 2 • A[0, 1] is on line 3 • A[4, 4] is on line 26 • A[2, 3] is on line 15
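A minimal C sketch of a reader for this file format, following the layout described above (it assumes the two dimensions on the first line are whitespace-separated, which %d tolerates; error handling is omitted):
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    FILE *fp = fopen(argv[1], "r");
    int rows, cols, i;
    int *cells;

    /* First line: number of rows and number of columns */
    fscanf(fp, "%d %d", &rows, &cols);

    /* Remaining lines: one cell value per line, row-major order */
    cells = malloc(rows * cols * sizeof(int));
    for (i = 0; i < rows * cols; i++)
        fscanf(fp, "%d", &cells[i]);

    /* A[i][j] sits at cells[cols * i + j], mirroring the
       line-number formula cols * i + j + 2 above */
    printf("A[0][0] = %d\n", cells[0]);

    free(cells);
    fclose(fp);
    return 0;
}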
Calculating Running Time in C
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>   /* for sleep() */

int main()
{
    struct timeval begin, end;
    double time;

    gettimeofday(&begin, NULL);
    sleep(10);
    gettimeofday(&end, NULL);

    time = (end.tv_sec - begin.tv_sec)
         + ((end.tv_usec - begin.tv_usec) / 1000000.0);

    printf("This program ran for %f seconds\n", time);
    return 0;
}
C Time • Includes seconds and microseconds • Used by the gettimeofday() system call • gettimeofday() • Returns the number of seconds (and microseconds) since the UNIX Epoch • Is this completely accurate? • No, but it's VERY close (within a few microseconds).
C Time • You MUST use the timeval struct for the gettimeofday() call • On UNIX systems, you need to include sys/time.h to use this. • Calculation of time is: (end seconds - begin seconds) + ((end microseconds - begin microseconds) / 1000000) • You can calculate: • Program run time • Algorithm execution time
PBS (Torque/Maui) • hpc-class job submission system • qsub • All queues are managed by the scheduler. • PBS scripts can be created at: • http://hpcgroup.public.iastate.edu/HPC/hpc-class/hpc-class_script_writer.html
Example script
#!/bin/csh
#PBS -o BATCH_OUTPUT
#PBS -e BATCH_ERRORS
#PBS -lvmem=256Mb,pmem=256Mb,mem=256Mb,nodes=16:ppn=2,cput=2:00:00,walltime=1:00:00

# Change to directory from which qsub was executed
cd $PBS_O_WORKDIR

time mpirun -np 32 <program>
PBS Variables • -l (resources) • vmem: total virtual memory • pmem: per-task memory • mem: total aggregate memory • nodes: total number of nodes • ppn: processors per node • cput: total CPU time, summed across processors • walltime: elapsed wall-clock time for the whole job
PBS Variables • vmem = pmem = mem • total CPUs = nodes * ppn • cput = walltime * ppn
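Applied to the example script above: nodes=16 and ppn=2 give total CPUs = 16 * 2 = 32 (hence mpirun -np 32), and walltime=1:00:00 with ppn=2 gives cput = 1:00:00 * 2 = 2:00:00.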
PBS (Torque/Maui) • Based on the previous script • BATCH_OUTPUT contains the output from the batch job • BATCH_ERRORS contains the error information from the batch job
Some other important information • Max CPU: 32 for classwork • Max memory: 2.0 GB • Max swap: 2.0 GB • Short queue: • 4 nodes per job; 16 total CPUs • 1 hour per job • 2 total jobs per user
MPI Communication • Blocking Communication: • MPI_Send • MPI_Recv • MPI_Send → Basic blocking send operation. Routine returns only after the application buffer in the sending task is free for reuse. • MPI_Recv → Receive a message and block until the requested data is available in the application buffer in the receiving task.
MPI Communication • Non-blocking Communication • MPI_Isend | MPI_Irecv • MPI_Wait | MPI_Test • MPI_Isend → Identifies an area in memory to serve as a send buffer. Processing continues without waiting for the message to be copied out from the buffer. • MPI_Irecv → Identifies an area in memory to serve as a receive buffer. Processing continues immediately without waiting for the message to be received and copied into the buffer. • MPI_Test → Check the status of a non-blocking send or receive • MPI_Wait → Block until a specified non-blocking send or receive operation has completed
Why non-blocking communication? • In some cases, it can increase performance. • If there is expensive work to do anyway, it lets you overlap that work with communication: • Disk I/O • Heavy processing on already-received data • BE CAREFUL!!! • If you reuse or read a buffer before the non-blocking operation that owns it has completed, the behavior is undefined: your program can crash or silently compute with garbage. See the sketch below.
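A minimal sketch of that overlap pattern between two ranks (the tag value and the placement of the "expensive work" are illustrative; run with at least 2 processes):
#include <mpi.h>
#include <stdio.h>

#define TAG 0   /* tag value assumed; any matching value works */

int main(int argc, char **argv)
{
    int rank, out = 42, in = 0;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Start the send, then keep computing while it is in flight */
        MPI_Isend(&out, 1, MPI_INT, 1, TAG, MPI_COMM_WORLD, &req);
        /* ... expensive work that does NOT touch 'out' goes here ... */
        MPI_Wait(&req, &status);   /* only now is 'out' safe to reuse */
    } else if (rank == 1) {
        MPI_Irecv(&in, 1, MPI_INT, 0, TAG, MPI_COMM_WORLD, &req);
        /* ... expensive work that does NOT touch 'in' goes here ... */
        MPI_Wait(&req, &status);   /* only now is 'in' valid to read */
        printf("Received %d\n", in);
    }

    MPI_Finalize();
    return 0;
}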
#include <mpi.h>
#include <stdio.h>

#define TAG 0   /* tag value assumed; must match in send and recv */

int master(void);
int slave(void);

int main(int argc, char **argv)
{
    int myRank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

    if (myRank == 0)
        master();
    else
        slave();

    MPI_Finalize();
    return 0;
}

int master(void)
{
    int i, size, my_answer = 0, their_work = 0;
    MPI_Status status;

    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Collect one partial sum from every slave rank */
    for (i = 1; i < size; i++) {
        MPI_Recv(&their_work, 1, MPI_INT, i, TAG, MPI_COMM_WORLD, &status);
        my_answer += their_work;
    }

    printf("The answer is: %d\n", my_answer);
    return 0;
}
int slave(void)
{
    int i, myRank, size, namelength, work = 0;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &namelength);

    /* Each slave sums its own contiguous block of 1..100 */
    printf("[%s]: Adding the numbers %d to %d = ", name,
           (100 / (size - 1)) * (myRank - 1) + 1,
           (100 / (size - 1)) * myRank);

    for (i = (100 / (size - 1)) * (myRank - 1) + 1;
         i <= myRank * (100 / (size - 1)); i++) {
        work = work + i;
    }

    printf("%d\n", work);

    /* Report the partial sum back to the master (rank 0) */
    MPI_Send(&work, 1, MPI_INT, 0, TAG, MPI_COMM_WORLD);
    return 0;
}
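Assuming the two listings are saved together as sum.c (file name is hypothetical), the example can be built and run with the tools above:
mpicc -Wall -o sum sum.c
mpirun -np 5 ./sum
Note that the block arithmetic divides 100 by (size - 1), so the partial sums cover all of 1..100 exactly only when size - 1 divides 100 evenly; with 5 processes (4 slaves of 25 numbers each) the master prints 5050.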