
Parallel Computing @ ISDS






Presentation Transcript


  1. Parallel Computing @ ISDS
  Chris Hans, 29 November 2004

  2. Organization
  • Basics of Parallel Computing
    • Structural
    • Computational
  • Coding for Distributed Computing
  • Examples
  • Resources at Duke
    • CSEM Cluster

  3. Basics of Distributed Computing
  • Online tutorial (Livermore Nat. Lab.): http://www.llnl.gov/computing/tutorials/parallel_comp/
  • Serial computing: one computer, one CPU
  • Parallel computing: multiple computers working at the same time

  4. Various Setups
  • Collection of workstations
    • PVM (Parallel Virtual Machine)
    • R (rpvm, Rmpi, SNOW)
    • LAM-MPI (Local Area Multicomputer)
    • Matlab
  • Dedicated cluster
    • Master/slave model
    • MPI

  5. Network Layout
  • Basic layout [diagram omitted]
  • Each node has: CPU(s), memory

  6. Designing a Parallel Program
  Do the nodes need to interact?
  • "Embarrassingly parallel": very little communication
    • Monte Carlo
  • "Shamelessly parallel": communication needed
    • Heat diffusion models
    • Spatial models
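To make the "embarrassingly parallel" idea concrete, here is a minimal single-machine sketch: independent Monte Carlo draws are split across workers, and the only communication is combining one number per worker at the end. It uses C++ std::thread in place of the MPI processes discussed in the talk, and the function names and the Normal(5, 1) toy target are illustrative assumptions, not code from the slides.

```cpp
#include <cassert>
#include <numeric>
#include <random>
#include <thread>
#include <vector>

// Each worker computes its own sample mean from an independent stream.
double partial_mean(unsigned seed, int nreps) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> draw(5.0, 1.0);  // toy target: mean 5
    double sum = 0.0;
    for (int i = 0; i < nreps; ++i) sum += draw(gen);
    return sum / nreps;
}

// Split the work across nworkers threads; combine at the end.
double parallel_mc_mean(int nworkers, int nreps_per_worker) {
    std::vector<double> means(nworkers);
    std::vector<std::thread> pool;
    for (int w = 0; w < nworkers; ++w)
        pool.emplace_back([&means, w, nreps_per_worker] {
            means[w] = partial_mean(1000u + w, nreps_per_worker);
        });
    for (auto& t : pool) t.join();
    // "Very little communication": one number returned per worker.
    return std::accumulate(means.begin(), means.end(), 0.0) / nworkers;
}
```

A heat-diffusion or spatial model would instead require workers to exchange boundary values every iteration, which is where MPI's message-passing routines earn their keep.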

  7. Message Passing Interface
  • The "MPI" standard: easy-to-use functions that manage the communication between nodes.

  8. Master/Slave Model
  • Organize the layout: one master node and several slaves [diagram omitted]

  9. Master/Slave Model
  • The master divides the task into pieces...
  • ...while the slaves "listen" to the network, waiting for work.
  • The master sends out the work to be done...
  • ...the slaves do the work...
  • ...while the master waits for the answers.
  • The slaves return the results.

  10. Example: Monte Carlo [figure omitted]

  11. Example: Monte Carlo [figure omitted]

  12. Code
  • Write ONE program, the same for the master and the slaves.
  • Run the program on EACH node.
  • Each program has to figure out whether it is the master or a slave.
  • MPI provides this identity (the process rank).

  13. Pseudo Code
  The same program runs on the master and on each slave (Slaves 1-3); every copy branches on its ID:

  LOAD DATA;
  GET ID;
  IF (ID == MASTER) {
    MASTER();
  } ELSE {
    SLAVE();
  }
  ...

  14. Pseudo Code (continued)
  Master (the MASTER block continues on the next slide):

  MASTER {
    GET NP;              // find # of nodes
    for (i in 1:NP) {
      // "Tell process i to compute the mean for 1000 samples"
    }
    RET = RES = 0;
    WHILE (RET < NP) {   // wait for results
      ANS = RECEIVE();
      RES += ANS / NP;
      RET++;
    }

  Slaves 1-3 (identical code on each):

  SLAVE {
    ANS = 0;
    // wait for orders: receive NREPS
    for (i in 1:NREPS) {
      ANS += DRAW();
    }
    SEND(ANS / NREPS);
    RETURN TO MAIN;
  };

  15. Pseudo Code (continued)
  Master (continued):

    PRINT RES;
    RETURN TO MAIN;
  }
  ...
  FINALIZE();

  Slaves 1-3 (continued, identical on each):

  ...
  FINALIZE();
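The pseudocode above can be sketched as a runnable single-machine analogue, with std::async standing in for the MPI send/receive calls and threads standing in for slave nodes. The names slave/master and the Normal(0, 1) draws are assumptions for illustration; the talk's actual implementation uses MPI across cluster nodes.

```cpp
#include <cassert>
#include <future>
#include <random>
#include <vector>

// Slave: receive NREPS, average NREPS draws, send the answer back.
double slave(unsigned id, int nreps) {
    std::mt19937 gen(id);                        // per-slave seed
    std::normal_distribution<double> draw(0.0, 1.0);
    double ans = 0.0;
    for (int i = 0; i < nreps; ++i) ans += draw(gen);
    return ans / nreps;                          // SEND(ANS / NREPS)
}

// Master: hand each of NP slaves its work, then wait for all answers.
double master(int np, int nreps) {
    std::vector<std::future<double>> pending;
    for (int i = 1; i <= np; ++i)                // "tell process i to compute
        pending.push_back(std::async(std::launch::async, slave, i, nreps));
    double res = 0.0;
    for (auto& f : pending)                      // ANS = RECEIVE();
        res += f.get() / np;                     // RES += ANS / NP;
    return res;                                  // PRINT RES;
}
```

The structure mirrors the slides: the master blocks until every "message" (future) arrives, exactly as the pseudocode's WHILE(RET < NP) receive loop does.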

  16. Example: C++ Code
  • Large dataset: N = 40, p = 1,000
  • One outcome variable, y
  • Calculate R² for all one-variable regressions
  • parallel_R2.cpp
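The file parallel_R2.cpp itself is not reproduced in the transcript, but the per-predictor computation it parallelizes is standard: for a one-variable regression, R² equals the squared sample correlation between that predictor and y, so each of the p = 1,000 predictors can be scored independently. A minimal sketch of that kernel (the function name is an assumption):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// R^2 of the one-variable regression y ~ x: the squared sample
// correlation, (S_xy)^2 / (S_xx * S_yy), using centered sums.
double r_squared(const std::vector<double>& x, const std::vector<double>& y) {
    const int n = static_cast<int>(x.size());
    double sx = 0.0, sy = 0.0;
    for (int i = 0; i < n; ++i) { sx += x[i]; sy += y[i]; }
    const double mx = sx / n, my = sy / n;
    double sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (int i = 0; i < n; ++i) {
        sxx += (x[i] - mx) * (x[i] - mx);
        syy += (y[i] - my) * (y[i] - my);
        sxy += (x[i] - mx) * (y[i] - my);
    }
    return (sxy * sxy) / (sxx * syy);
}
```

In the master/slave setup, the master would assign each slave a block of predictor columns and collect the 1,000 R² values back, one block per message.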

  17. CSEM Cluster @ Duke
  • Computational Science, Engineering and Medicine Cluster: http://www.csem.duke.edu/Cluster/clustermain.htm
  • Shared machines
    • Core nodes
    • Contributed nodes

  18. CSEM Cluster Details
  • 4 dual-processor head nodes
  • 64 dual-processor shared nodes (Intel Xeon 2.8 GHz)
  • 40 dual-processor "stat" nodes (Intel Xeon 3.1 GHz)
  • 161 dual-processor "other" nodes
    • Owners get priority

  19. [figure slide; no recoverable text]

  20. Using the Cluster
  • Access is limited
  • ssh -l cmh27 cluster1.csem.duke.edu
  • Data stored locally on the cluster
  • Compile using mpicc, mpif77, mpif90
  • The cluster uses the SGE queuing system

  21. Queuing System
  • Submit your job with resource requests:
    • memory usage
    • number of CPUs (nodes)
  • SGE assigns nodes and schedules the job
    • least-loaded machines fitting the requirements
  • Jobs run outside of SGE are killed

  22. Compiling
  Linked libraries, etc.:

  g++ -c parallel_R2.cpp -I/opt/mpich-1.2.5/include -L/opt/mpich-1.2.5/lib
  mpicc parallel_R2.o -o parallel_R2.exe -lstdc++ -lg2c -lm

  23. Submitting a Job
  • Create a queue script:

  #!/bin/tcsh
  #
  #$ -S /bin/tcsh -cwd
  #$ -M username@stat.duke.edu -m b     # email me when the job begins...
  #$ -M username@stat.duke.edu -m e     # ...and when it finishes
  #$ -o parallel_R2.out -j y            # output file
  #$ -pe low-all 5                      # priority and number of nodes requested
  cd /home/stat/username/               # your home directory on the CSEM cluster
  mpirun -np $NSLOTS -machinefile $TMPDIR/machines parallel_R2.exe

  24. Submitting a Job
  • Type:

  [cmh27@head1 cmh27]$ qsub parallel_R2.q
  Your job 93734 ("parallel_R2.q") has been submitted.

  • Monitoring: http://clustermon.csem.duke.edu/ganglia/

  25. Downloadable Files
  You can download the slides, example C++ code, and queuing script at: http://www.isds.duke.edu/~hans/tutorials.html
