


  1. MPI and High Performance Computing: Systems and Programming. Barry Britt, Systems Administrator, Department of Computer Science, Iowa State University

  2. Purpose To give you: • … an overview of some new system-level MPI functions • … access to tools that you need to compile and run MPI jobs • … some instruction in the creation and use of Makefiles • … some instruction on how to measure running time in C programs.

  3. Makefiles

  4. Makefiles • GNU Make • Enables the end user to build and install a package without worrying about the details. • Automatically figures out which files it needs to update based on which source files have changed. • Not language dependent • Not limited to building a package; can be used to install or uninstall

  5. Makefile Rules • A rule tells Make how to execute a series of commands in order to build a target from source files. • Specifies a list of dependencies • Dependencies should include ALL files that the target depends on • The general form is (command lines must begin with a tab):

target: dependencies ...
        commands
        ...

  6. Example Makefile for C Source

CC=gcc
CFLAGS=-Wall
INCLUDES=
BINARIES=rand test

.SUFFIXES: .c .o

.c.o:
        $(CC) $(CFLAGS) -c $*.c

all: $(BINARIES)

rand.o: rand.c
test.o: test.c

rand: rand.o
        $(CC) $(CFLAGS) -o rand rand.o

test: test.o
        $(CC) $(CFLAGS) -o test test.o

clean:
        rm -f a.out core *.o $(BINARIES)

  7. Example Makefile for C Source CC=gcc CFLAGS=-Wall INCLUDES= BINARIES=rand test • Variables • CC is set to use the GCC compiler • For MPI programs, set it to mpicc, not gcc • CFLAGS holds the compiler flags • -Wall: enable all warnings • The -c flag (compile only, do not link) is passed in the .c.o suffix rule

  8. Example Makefile for C Source clean: rm -f a.out core *.o $(BINARIES) • Target “clean”. Invoke it by typing • make clean • The rule states: • In the current directory, run: • rm -f a.out core *.o $(BINARIES) • which expands to: rm -f a.out core *.o rand test

  9. Example Makefile for C Source .SUFFIXES: .c .o .c.o: $(CC) $(CFLAGS) -c $*.c • Makefile instruction on how to handle .c files and turn them into object (.o) files • Compile using $(CC) value with $(CFLAGS) • Compile each individual file into its appropriate .o file

  10. Example Makefile for C Source rand.o: rand.c test.o: test.c rand: rand.o $(CC) $(CFLAGS) -o rand rand.o test: test.o $(CC) $(CFLAGS) -o test test.o • Target: rand or test • Run $(CC) $(CFLAGS) -o rand rand.o • gcc -Wall -o rand rand.o • If you were going to include external libraries to link, they would be linked at the end of the rule.
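
For example (an illustrative sketch only, not part of the original Makefile): if rand called functions from the math library, the extra library would be added at the end of the link rule:

rand: rand.o
        $(CC) $(CFLAGS) -o rand rand.o -lm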

  11. Random Matrix Generation

  12. Random Generator for Matrices • Rand • -f: filename to which to write the matrix • -c: number of matrix columns • -r: number of matrix rows • -h: help documentation • -s: seed • -m: max integer in matrix cells
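
Assuming the flags above, a typical invocation might look like the following (the file name and values are placeholders, not taken from the original slides):

./rand -f matrix.dat -r 4 -c 5 -s 42 -m 100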

  13. Random Generator for Matrices • Completely random generation for an m by n matrix • Uses a random seed to create the matrix • Output file • First line contains the number of rows and the number of columns • Subsequent lines contain matrix cell values, one per line.

  14. Random Generator for Matrices • For a Matrix with row length m, cell A[i,j] is on line: • m * i + j + 2 • Lines are not zero-indexed for the purpose of this calculation. • Therefore, for a 5 x 5 matrix (zero-indexed): • A[0, 0] is on line 2 • A[0, 1] is on line 3 • A[4, 4] is on line 26 • A[2, 3] is on line 15
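
A minimal C sketch of the formula above; the helper name is hypothetical, and m is the number of columns (the row length):

#include <stdio.h>

/* 1-indexed file line that holds A[i][j]; line 1 holds the dimensions. */
static int cell_line(int m, int i, int j)
{
    return m * i + j + 2;
}

int main(void)
{
    /* Checks against the 5 x 5 examples on the slide above */
    printf("A[0,0] -> line %d\n", cell_line(5, 0, 0));   /* 2  */
    printf("A[2,3] -> line %d\n", cell_line(5, 2, 3));   /* 15 */
    printf("A[4,4] -> line %d\n", cell_line(5, 4, 4));   /* 26 */
    return 0;
}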

  15. Calculating Run Time in C

  16. Calculating Running Time in C

#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>   /* for sleep() */

int main()
{
    struct timeval begin, end;
    double time;

    gettimeofday(&begin, NULL);
    sleep(10);                                  /* the work being timed */
    gettimeofday(&end, NULL);

    time = (end.tv_sec - begin.tv_sec)
         + ((end.tv_usec - begin.tv_usec) / 1000000.0);

    printf("This program ran for %f seconds\n", time);
    return 0;
}

  17. C Time • The timeval struct holds seconds and microseconds • It is the type used by the gettimeofday() system call • gettimeofday() • Returns the number of seconds (and microseconds) since the UNIX Epoch • Is this completely accurate? • No, but it's VERY close (within a few microseconds).

  18. C Time • You MUST use the timeval struct for the gettimeofday() call • On UNIX systems, you need to include sys/time.h to use this. • Calculation of time is: (end seconds – begin seconds) + ((end microseconds – begin microseconds) / 1000000) • You can calculate: • Program run time • Algorithm execution time

  19. Using the PBS Job Submission System

  20. PBS (Torque/Maui) • hpc-class job submission system • qsub • All queues are managed by the scheduler. • PBS scripts can be created at: • http://hpcgroup.public.iastate.edu/HPC/hpc-class/hpc-class_script_writer.html

  21. Example script

#!/bin/csh
#PBS -o BATCH_OUTPUT
#PBS -e BATCH_ERRORS
#PBS -lvmem=256Mb,pmem=256Mb,mem=256Mb,nodes=16:ppn=2,cput=2:00:00,walltime=1:00:00

# Change to directory from which qsub was executed
cd $PBS_O_WORKDIR

time mpirun -np 32 <program>

  22. PBS Variables • -l (resources) • vmem: total virtual memory • pmem: per-task memory • mem: total aggregate memory • nodes – total number of nodes • ppn – processors per node • cput – total CPU time summed over all processors • walltime – elapsed wall-clock time for the job

  23. PBS Variables • vmem = pmem = mem • total CPUs = nodes * ppn • cput = walltime * ppn
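
Applying these relations to a hypothetical request (values illustrative only): 4 nodes with 2 processors per node and one hour of wall time gives

#PBS -l nodes=4:ppn=2,walltime=1:00:00,cput=2:00:00
# total CPUs = nodes * ppn = 4 * 2 = 8
# cput = walltime * ppn = 1:00:00 * 2 = 2:00:00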

  24. PBS (Torque/Maui) • Based on the previous script • BATCH_OUTPUT contains the output from the batch job • BATCH_ERRORS contains the error information from the batch job

  25. Some other important information • Max CPU – 32 for classwork • Max memory – 2.0 GB • Max swap – 2.0 GB • Short queue - • 4 nodes per job; 16 total CPUs • 1 hour per job • 2 total jobs per user

  26. MPI Blocking vs. Non-Blocking Communication

  27. MPI Communication • Blocking Communication: • MPI_Send • MPI_Recv • MPI_Send → Basic blocking send operation. Routine returns only after the application buffer in the sending task is free for reuse. • MPI_Recv → Receive a message and block until the requested data is available in the application buffer in the receiving task.
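
A minimal sketch of a blocking exchange, assuming at least two processes; this is a generic example, not code from the slides:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, value = 0;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* Returns only once the send buffer (value) is safe to reuse */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Blocks until the message has arrived in value */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("Rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}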

  28. MPI Communication • Non-blocking Communication • MPI_Isend | MPI_Irecv • MPI_Wait | MPI_Test • MPI_Isend → Identifies an area in memory to serve as a send buffer. Processing continues without waiting for the message to be copied out of the buffer. • MPI_Irecv → Identifies an area in memory to serve as a receive buffer. Processing continues immediately without waiting for the message to be received and copied into the buffer. • MPI_Test → check the status of a non-blocking send or receive • MPI_Wait → block until a specified non-blocking send or receive operation has completed
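
A minimal sketch of the non-blocking pattern, assuming two processes: each rank posts its operation, may overlap other computation, and must call MPI_Wait before reusing or reading the buffer (generic example, not from the slides):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, value = 0;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Isend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);
        /* ... overlap useful work here; value must not be modified ... */
        MPI_Wait(&request, &status);   /* send buffer is now reusable */
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        /* ... overlap work that does not read value ... */
        MPI_Wait(&request, &status);   /* only now is value safe to read */
        printf("Rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}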

  29. Why non-blocking communication? • In some cases, it can increase performance. • If there is an expensive operation you need to do, it lets you overlap that work with communication • Disk I/O • Heavy processing on already-received data • BE CAREFUL!!! • If you access a buffer before its non-blocking operation has completed, your program WILL fail.

  30.

#include <stdio.h>
#include <mpi.h>

#define TAG 0                 /* message tag; any value agreed on by sender and receiver */

int master(void);
int slave(void);

int main(int argc, char **argv)
{
    int myRank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

    if (myRank == 0)
        master();
    else
        slave();

    MPI_Finalize();
    return 0;
}

int master()
{
    int i, size, my_answer = 0, their_work = 0;
    MPI_Status status;

    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Collect one partial sum from each slave and accumulate it */
    for (i = 1; i < size; i++) {
        MPI_Recv(&their_work, 1, MPI_INT, i, TAG, MPI_COMM_WORLD, &status);
        my_answer += their_work;
    }

    printf("The answer is: %d\n", my_answer);
    return 0;
}

  31.

int slave()
{
    int i, myRank, size, namelength, work = 0;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &namelength);

    /* Each slave sums its own contiguous block of the numbers 1..100 */
    printf("[%s]: Adding the numbers %d to %d = ", name,
           (100 / (size - 1)) * (myRank - 1) + 1,
           (100 / (size - 1)) * myRank);

    for (i = (100 / (size - 1)) * (myRank - 1) + 1;
         i <= myRank * (100 / (size - 1)); i++) {
        work = work + i;
    }

    printf("%d\n", work);

    /* Send the partial sum to the master (rank 0) */
    MPI_Send(&work, 1, MPI_INT, 0, TAG, MPI_COMM_WORLD);
    return 0;
}
