
High Performance Computing Workshop (Statistics) HPC 101


Presentation Transcript


  1. High Performance Computing Workshop (Statistics) HPC 101 Dr. Charles J Antonelli, LSAIT ARS, January 2013

  2. Credits • Contributors: • Brock Palen (CoE-IT CAC) • Jeremy Hallum (MSIS) • Tony Markel (MSIS) • Bennet Fauber (CoE-IT CAC) • LSAIT ARS • UM CoE-IT CAC

  3. Roadmap • Flux Mechanics • High Performance Computing • Flux Architecture • Flux Batch Operations • Introduction to Scheduling

  4. Flux Mechanics

  5. Using Flux • Three basic requirements to use Flux: • A Flux account • A Flux allocation • An MToken (or a Software Token)

  6. Using Flux • A Flux account • Allows login to the Flux login nodes • Develop, compile, and test code • Available to members of the U-M community, free • Get an account by visiting https://www.engin.umich.edu/form/cacaccountapplication

  7. Using Flux • A Flux allocation • Allows you to run jobs on the compute nodes • Current rates: • $18 per core-month for Standard Flux • $24.35 per core-month for BigMem Flux • $8 subsidy per core-month for LSA and Engineering • Details at http://www.engin.umich.edu/caen/hpc/planning/costing.html • To inquire about Flux allocations, please email flux-support@umich.edu

  8. Using Flux • An MToken (or a Software Token) • Required for access to the login nodes • Improves cluster security by requiring a second means of proving your identity • You can use either an MToken or an application for your mobile device (called a Software Token) for this • Information on obtaining and using these tokens at http://cac.engin.umich.edu/resources/loginnodes/twofactor.html

  9. Logging in to Flux • ssh flux-login.engin.umich.edu • MToken (or Software Token) required • You will be randomly connected to a Flux login node • Currently flux-login1 or flux-login2 • Firewalls restrict access to flux-login. To connect successfully, either • Physically connect your ssh client platform to the U-M campus wired network, or • Use VPN software on your client platform, or • Use ssh to log in to an ITS login node, and ssh to flux-login from there

  10. Modules • The module command allows you to specify what versions of software you want to use:
    module list -- Show loaded modules
    module load name -- Load module name for use
    module avail -- Show all available modules
    module avail name -- Show versions of module name
    module unload name -- Unload module name
    module -- List all options
• Enter these commands at any time during your session • A configuration file allows default module commands to be executed at login • Put module commands in the file ~/privatemodules/default • Don’t put module commands in your .bashrc / .bash_profile

  11. Flux environment • The Flux login nodes have the standard GNU/Linux toolkit: • make, autoconf, awk, sed, perl, python, java, emacs, vi, nano, … • Watch out for source code or data files written on non-Linux systems • Use these tools to analyze and convert source files to Linux format • file • dos2unix, mac2unix

  12. Lab 1 Task: Invoke R interactively on the login node
    module load R
    module list
    R
    q()
• Please run only very small computations on the Flux login nodes, e.g., for testing
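For a sense of scale, an interactive login-node session should stay at the level of quick sanity checks. A minimal, purely illustrative example (not part of the workshop materials):

    x <- rnorm(1000)     # simulate 1,000 standard normal values
    summary(x)           # quick numeric summary; small enough for a shared login node
    q(save = "no")       # leave R without saving the workspace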

  13. Lab 2 Task: Run R in batch mode
    module load R
• Copy sample code to your login directory:
    cd
    cp ~cja/stats-sample-code.tar.gz .
    tar -zxvf stats-sample-code.tar.gz
    cd ./stats-sample-code
• Examine lab2.pbs and lab2.R • Edit lab2.pbs with your favorite Linux editor • Change the #PBS -M email address to your own

  14. Lab 2 Task: Run R in batch mode • Submit your job to Flux:
    qsub lab2.pbs
• Watch the progress of your job:
    qstat -u uniqname
  where uniqname is your own uniqname
• When complete, look at the job’s output:
    less lab2.out
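To make the batch workflow concrete, here is a sketch of the kind of R script such a job might run; it is an assumption for illustration only, not the contents of lab2.R:

    # batch-sketch.R -- illustrative only; the real lab2.R may differ
    set.seed(42)
    x <- matrix(rnorm(1e6), ncol = 10)    # simulate a modest data set
    fit <- lm(x[, 1] ~ x[, -1])           # a small serial model fit
    print(summary(fit))                   # printed output lands in the job's output file

Inside the PBS script the job would typically run the script non-interactively, e.g. with R CMD BATCH --vanilla, as the later snow lab does.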

  15. Lab 3 Task: Use the multicore package in R The multicore package allows you to use multiple cores on a single node
    module load R
    cd ~/stats-sample-code
• Examine lab3.pbs and lab3.R • Edit lab3.pbs with your favorite Linux editor • Change the #PBS -M email address to your own

  16. Lab 3 Task: Use the multicore package in R • Submit your job to Flux:
    qsub lab3.pbs
• Watch the progress of your job:
    qstat -u uniqname
  where uniqname is your own uniqname
• When complete, look at the job’s output:
    less lab3.out
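The heart of the multicore approach is mclapply(), which forks worker processes that share the node's memory. The sketch below is illustrative only (it is not the contents of lab3.R); on recent R versions the same mclapply() interface is provided by the bundled parallel package:

    library(multicore)                       # or library(parallel) on newer R
    # Estimate pi by simulation, one forked worker per core on a single node
    sim <- function(i) {
      p <- matrix(runif(2e6), ncol = 2)      # 1e6 random points in the unit square
      4 * mean(rowSums(p^2) <= 1)            # fraction inside the quarter circle, times 4
    }
    est <- mclapply(1:4, sim, mc.cores = 4)  # run 4 replicates in parallel
    print(mean(unlist(est)))                 # combine the per-worker estimates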

  17. Lab 4 Task: Another multicore example in R
    module load R
    cd ~/stats-sample-code
• Examine lab4.pbs and lab4.R • Edit lab4.pbs with your favorite Linux editor • Change the #PBS -M email address to your own

  18. Lab 4 Task: Another multicore example in R • Submit your job to Flux:
    qsub lab4.pbs
• Watch the progress of your job:
    qstat -u uniqname
  where uniqname is your own uniqname
• When complete, look at the job’s output:
    less lab4.out

  19. Lab 5 Task: Run snow interactively in R The snow package allows you to use cores on multiple nodes
    module load R
    cd ~/stats-sample-code
• Examine lab5.R • Start an interactive PBS session:
    qsub -I -V -l procs=3 -l walltime=30:00 -A stats_flux -l qos=flux -q flux

  20. Lab 5 Task: Run snow interactively in R
    cd $PBS_O_WORKDIR
• Run snow in the interactive PBS session:
    R CMD BATCH --vanilla lab5.R lab5.out
… ignore any “Connection to lifeline lost” message
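In outline, a snow script starts a cluster of workers, ships work to them, and shuts them down. The sketch below is an assumption for illustration, not the actual lab5.R; in particular the cluster type ("MPI" here, which requires the Rmpi package) depends on how the site has configured R:

    library(snow)
    cl <- makeCluster(3, type = "MPI")        # start 3 workers; the type is an assumption
    # Each worker computes the mean of its own simulated sample
    res <- clusterApply(cl, 1:3, function(i) mean(rnorm(1e6)))
    print(unlist(res))                        # gather the three results on the master
    stopCluster(cl)                           # shut the workers down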

  21. Lab 6 Task: Run snowfall in R The snowfall package is similar to snow, and allows you to change the number of cores used without modifying your R code
    module load R
    cd ~/stats-sample-code
• Examine lab6.pbs and lab6.R • Edit lab6.pbs with your favorite Linux editor • Change the #PBS -M email address to your own

  22. Lab 6 Task: Run snowfall in R • Submit your job to Flux:
    qsub lab6.pbs
• Watch the progress of your job:
    qstat -u uniqname
  where uniqname is your own uniqname
• When complete, look at the job’s output:
    less lab6.out
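snowfall wraps snow behind a small set of sf* functions, so the worker count (and cluster type) can be changed at sfInit() without touching the rest of the code. A minimal sketch, again illustrative rather than the contents of lab6.R:

    library(snowfall)
    sfInit(parallel = TRUE, cpus = 4)        # change cpus here; the rest of the code stays the same
    sim <- function(i) mean(rnorm(1e6))      # a small per-task computation
    res <- sfLapply(1:8, sim)                # distribute 8 tasks over the workers
    print(mean(unlist(res)))
    sfStop()                                 # release the workers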

  23. Lab 7 Task: Run parallel MATLAB Distribute parfor iterations over multiple cores on multiple nodes • Do this once:
    mkdir ~/matlab/
    cd ~/matlab
    wget http://cac.engin.umich.edu/resources/software/matlabdct/mpiLibConf.m

  24. Lab 7 Task: Run parallel MATLAB • Start an interactive PBS session:
    module load matlab
    qsub -I -V -l nodes=2:ppn=3 -l walltime=30:00 -A stats_flux -l qos=flux -q flux
• Start MATLAB:
    matlab -nodisplay

  25. Lab 7 Task: Run parallel MATLAB • Set up a matlabpool:
    sched = findResource('scheduler', 'type', 'mpiexec')
    set(sched, 'MpiexecFileName', '/home/software/rhel6/mpiexec/bin/mpiexec')
    set(sched, 'EnvironmentSetMethod', 'setenv')
    % use the 'sched' object when calling matlabpool
    % the syntax for matlabpool must use the (sched, N) format
    matlabpool(sched, 6)
… ignore “Found pre-existing parallel job(s)” warnings

  26. Lab 7 Task: Run parallel MATLAB • Run a simple parfor:
    tic
    x = 0;
    parfor i = 1:100000000
      x = x + i;
    end
    toc
• Close the matlabpool:
    matlabpool close

  27. Compiling Code • Assuming default module settings • Use mpicc/mpiCC/mpif90 for MPI code • Use icc/icpc/ifort with -openmp for OpenMP code • Serial code, Fortran 90:
    ifort -O3 -ipo -no-prec-div -xHost -o prog prog.f90
• Serial code, C:
    icc -O3 -ipo -no-prec-div -xHost -o prog prog.c
• MPI parallel code:
    mpicc -O3 -ipo -no-prec-div -xHost -o prog prog.c
    mpirun -np 2 ./prog

  28. Lab Task: compile and execute simple programs on the Flux login node • Copy sample code to your login directory:
    cd
    cp ~brockp/cac-intro-code.tar.gz .
    tar -xvzf cac-intro-code.tar.gz
    cd ./cac-intro-code
• Examine, compile & execute helloworld.f90:
    ifort -O3 -ipo -no-prec-div -xHost -o f90hello helloworld.f90
    ./f90hello
• Examine, compile & execute helloworld.c:
    icc -O3 -ipo -no-prec-div -xHost -o chello helloworld.c
    ./chello
• Examine, compile & execute MPI parallel code:
    mpicc -O3 -ipo -no-prec-div -xHost -o c_ex01 c_ex01.c
    … ignore the “feupdateenv is not implemented and will always fail” warning
    mpirun -np 2 ./c_ex01
    … ignore runtime complaints about missing NICs

  29. Makefiles • The make command automates your code compilation process • Uses a makefile to specify dependencies between source and object files • The sample directory contains a sample makefile • To compile c_ex01: make c_ex01 • To compile all programs in the directory: make • To remove all compiled programs: make clean • To make all the programs using 8 compiles in parallel: make -j8

  30. High Performance Computing

  31. Advantages of HPC • Cheaper than the mainframe • More scalable than your laptop • Buy or rent only what you need • COTS hardware • COTS software • COTS expertise

  32. Disadvantages of HPC • Serial applications • Tightly-coupled applications • Truly massive I/O or memory requirements • Difficulty/impossibility of porting software • No COTS expertise

  33. Programming Models • Two basic parallel programming models • Message-passing: The application consists of several processes running on different nodes and communicating with each other over the network • Used when the data are too large to fit on a single node, and simple synchronization is adequate • “Coarse parallelism” • Implemented using MPI (Message Passing Interface) libraries • Multi-threaded: The application consists of a single process containing several parallel threads that communicate with each other using synchronization primitives • Used when the data can fit into a single process, and the communications overhead of the message-passing model is intolerable • “Fine-grained parallelism” or “shared-memory parallelism” • Implemented using OpenMP (Open Multi-Processing) compilers and libraries • Both models can be combined in the same application
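The same split shows up in the R labs above: forked workers that share one node's memory roughly correspond to the shared-memory model, while a snow-style cluster of separate worker processes corresponds to message passing. A brief sketch using R's bundled parallel package (assumed available; cluster types may differ on Flux):

    library(parallel)
    f <- function(i) sum(rnorm(1e5))

    # Shared-memory style: forked workers on a single node
    shared <- mclapply(1:4, f, mc.cores = 4)

    # Message-passing style: independent worker processes that could live on other nodes
    cl <- makeCluster(4)                 # a socket cluster here; MPI would be used to span nodes
    passed <- parLapply(cl, 1:4, f)
    stopCluster(cl)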

  34. Good parallel • Embarrassingly parallel • Folding@home, RSA Challenges, password cracking, … • Regular structures • Divide & conquer, e.g., Quicksort • Pipelined: N-body problems, matrix multiplication • O(n²) -> O(n)

  35. Less good parallel • Serial algorithms • Those that don’t parallelize easily • Irregular data & communications structures • E.g., surface/subsurface water hydrology modeling • Tightly-coupled algorithms • Unbalanced algorithms • Master/worker algorithms, where the worker load is uneven

  36. Amdahl’s Law If you enhance a fraction f of a computation by a speedup S, the overall speedup is: Speedup = 1 / ((1 − f) + f/S)
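As a worked illustration (not from the slides): if 95% of a program parallelizes, the overall speedup can never exceed 1 / (1 − 0.95) = 20, no matter how many cores are used. A one-line R helper makes this easy to explore:

    amdahl <- function(f, S) 1 / ((1 - f) + f / S)   # overall speedup from Amdahl's Law
    amdahl(0.95, 8)     # about 5.9x when the parallel part is sped up 8x
    amdahl(0.95, Inf)   # 20x, the limiting case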

  37. Amdahl’s Law

  38. Flux Architecture

  39. The Flux cluster (diagram): login nodes, compute nodes, data transfer node, storage

  40. Behind the curtain (diagram): login nodes, compute nodes (nyx, flux, shared), data transfer node, storage

  41. A Flux node: 48 GB RAM, 12 Intel cores, local disk, Ethernet, InfiniBand

  42. A newer Flux node: 64 GB RAM, 16 Intel cores, local disk, Ethernet, InfiniBand

  43. A Flux BigMem node: 1 TB RAM, 40 Intel cores, local disk, Ethernet, InfiniBand

  44. Flux hardware (January 2012)
• 8,016 Intel cores; 200 Intel BigMem cores
• 632 Flux nodes; 5 Flux BigMem nodes
• 48 GB RAM/node; 1 TB RAM/BigMem node
• 4 GB RAM/core (average); 25 GB RAM/BigMem core
• 4X InfiniBand network (interconnects all nodes) • 40 Gbps, <2 µs latency • Latency an order of magnitude less than Ethernet • Lustre filesystem • Scalable, high-performance, open • Supports MPI-IO for MPI jobs • Mounted on all login and compute nodes

  45. Flux software • Default software: • Intel compilers with OpenMPI for Fortran and C • Optional software: • PGI compilers • Unix/GNU tools • gcc/g++/gfortran • Licensed software: • Abaqus, ANSYS, Mathematica, MATLAB, R, Stata SE, … • See http://cac.engin.umich.edu/resources/software/index.html • You can choose software using the module command

  46. Flux network • All Flux nodes are interconnected via InfiniBand and a campus-wide private Ethernet network • The Flux login nodes are also connected to the campus backbone network • The Flux data transfer node will soon be connected over a 10 Gbps connection to the campus backbone network • This means • The Flux login nodes can access the Internet • The Flux compute nodes cannot • If InfiniBand is not available for a compute node, code on that node will fall back to Ethernet communications

  47. Flux data • Lustre filesystem mounted on /scratch on all login, compute, and transfer nodes • 342 TB of short-term storage for batch jobs • Large, fast, short-term • NFS filesystems mounted on /home and /home2 on all nodes • 40 GB of storage per user for development & testing • Small, slow, long-term

  48. Flux data • Flux does not provide large, long-term storage • Alternatives: • ITS Value Storage • Departmental server • CAEN can mount your storage on the login nodes • Issue the df -kh command on a login node to see what other groups have mounted

  49. Globus Online • Features • High-speed data transfer, much faster than SCP or SFTP • Reliable & persistent • Minimal client software: Mac OS X, Linux, Windows • GridFTP Endpoints • Gateways through which data flow • Exist for XSEDE, OSG, … • UMich: umich#flux, umich#nyx • Add your own server endpoint: contact flux-support • Add your own client endpoint! • More information • http://cac.engin.umich.edu/resources/loginnodes/globus.html

  50. Flux Batch Operations
