
Getting Started with HPC On Iceberg



Presentation Transcript


  1. Getting Started with HPC On Iceberg Michael Griffiths Corporate Information and Computing Services The University of Sheffield Email m.griffiths@sheffield.ac.uk

  2. Outline • Overview of grid computing • Accessing Resources • Running Programs and Managing Jobs • Getting Help

  3. Types of Grids • Cluster Grid • Beowulf clusters • Enterprise Grid, Campus Grid, Intra-Grid • Departmental clusters, servers and PC network • Cloud, Utility Grid • Access resources over the internet on demand • Global Grid, Inter-grid • White Rose Grid, National Grid Service, Particle physics data grid

  4. ‘iceberg’ the HPC Cluster at the Computer Centre • Processor Cores: 1544 + 8 GPUs • Performance: 14 TFLOPs • Main Memory: 4448 GB • User filestore: 45 TB • Temporary file store: 80 TB

  5. ‘iceberg’ the HPC Cluster at the Computer Centre
  • AMD-based cluster containing:
  • 96 nodes each with 4 cores and 16 GB of memory
  • 31 nodes each with 8 cores and 32 GB of memory
  • TOTAL AMD CORES = 632, TOTAL MEMORY = 2528 GB
  • The 8-core nodes are connected to each other via 16 Gbit/s InfiniBand for MPI jobs
  • The 4-core nodes are connected via much slower 1 Gbit/s Ethernet for MPI jobs
  • Scratch space on each node is 400 GB
  • Intel Westmere-based cluster, supplied by Dell and integrated by ALCES:
  • 71 nodes each with 12 cores and 24 GB of memory (i.e. 2 * 6-core Intel X5650)
  • 5 nodes each with 12 cores and 48 GB of memory
  • 8 Nvidia Tesla Fermi M2070 GPU units for GPU programming
  • TOTAL INTEL CPU CORES = 912, TOTAL MEMORY = 1920 GB
  • Scratch space on each node is 400 GB
  • Total GPU memory = 48 GB
  • Each GPU unit is capable of about 1 TFLOP of single-precision floating point performance, or 0.5 TFLOPs at double precision, yielding up to 8 TFLOPs of GPU processing power in total.

  6. Iceberg Cluster • There are two head-nodes for the cluster, Iceberg(1) and Iceberg(2); users log in to a head node and reach the workers via qsh, qsub or qrsh. • There are 203 worker machines in the cluster. • All workers share the same user filestore. [Slide diagram: logins arrive at HEAD NODE1 / HEAD NODE2, which dispatch work to the worker nodes via qsh, qsub and qrsh.]

  7. Review: Software 1 • Hardware: AMD Opteron / Intel Westmere • Operating system: Redhat 64-bit Scientific Linux • Scheduler: Sun Grid Engine v6 • Compilers: Portland, GNU • MPI: OpenMPI • Monitoring: Ganglia

  8. Review: Software 2 • Maths and Statistical • Matlab, scilab • R+ • Engineering and Finite Element • Fluent, gambit, fidap and tgrid • Ansys • Abaqus • DYNA • Visualisation • IDL 6.1 • Paraview • OpenDX

  9. Review: Software 3 • Development • MPI: mvapich2, openmpi • OpenMP • NAG (Mark 20)

  10. Software 4: Compilers • C and Fortran programs may be compiled using the GNU or Portland Group compilers. Invoking these compilers is summarized in the following table:
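  A minimal sketch of typical invocations, assuming the standard GNU (gcc, g77) and Portland Group (pgcc, pgf77, pgf90) command names; check module avail for the versions actually installed on iceberg:

    # GNU compilers
    gcc -o myprog myprog.c      # C
    g77 -o myprog myprog.f      # Fortran 77
    # Portland Group compilers (load the module first, e.g. module add compilers/pgi/10.2)
    pgcc  -o myprog myprog.c    # C
    pgf77 -o myprog myprog.f    # Fortran 77
    pgf90 -o myprog myprog.f90  # Fortran 90/95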

  11. Accessing 1: Registration • Registration • Details at http://www.shef.ac.uk/wrgrid/register • All staff can have an account on iceberg by simply emailing ucards-reg@sheffield.ac.uk .

  12. Managing Your Password • You must synchronise your passwords so that your iceberg password is the same as your campus password • Visit the CICS password management page • http://www.shef.ac.uk/cics/password • Login using your campus username and password • Options on the CICS password management page • Display account information (see if you have an iceberg account) • Synchronise passwords (make your iceberg password the same as your campus password) • Change password

  13. Working Remotely • Unlike the Managed Windows machines, you can access and use iceberg remotely from any location. • Line-mode access to iceberg from all platforms, and full graphical access from Apple or Linux platforms or from Windows machines using Exceed, do not require a VPN connection. • Currently, remote access via the web browser (SGD) requires a VPN connection, but we are hoping to remove this restriction shortly. • See the following URL for further details: http://www.sheffield.ac.uk/wrgrid/using/access

  14. Accessing 2: Logging in • ssh clients: PuTTY, SSH Secure Shell Client • Sun Global Desktop • X-Windows: Exceed 3D (just start Exceed and log in using an ssh client), Cygwin • Note: when using SSH Secure Shell Client • From the menu: Edit -> Settings • Select: Connection -> Tunneling • Tick “Tunnel X11 connections”
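  For example, from a Linux, Mac or Cygwin terminal (the hostname iceberg.shef.ac.uk is an assumption here; use the address given in the access documentation):

    # log in with X11 forwarding so graphical programs display locally
    ssh -X your_username@iceberg.shef.ac.uk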

  15. Accessing 3: Linux • For end users, things are much the same • RedHat Enterprise 5 (Scientific Linux) • BASH is the default shell (use the up and down keys for command history, type “history” to list it, use tab for auto-completion) • Environment variables in BASH are set with export ENVIRONMENT_VAR="setting"; aliases are set with the alias command (see the sketch below)
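  A minimal sketch, with illustrative alias and variable names (e.g. added to ~/.bashrc):

    alias ll='ls -l'                  # an alias: shorthand for a longer command
    export MY_SETTING="some_value"    # an environment variable: note no $ on the left-hand side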

  16. Setting Environment for Different Applications • The module command enables your environment to be correctly configured for the applications you want to run on iceberg • module avail • This command shows the different applications which are available • module add • Sets up your environment for a particular application • e.g. module add compilers/pgi/10.2 • Sets your environment to use the PGI compilers version 10.2 • module list • Shows the list of modules you currently have loaded
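  A typical session, using the PGI module named on the slide (other module names will differ):

    module avail                   # list the applications available on iceberg
    module add compilers/pgi/10.2  # set up the environment for the PGI 10.2 compilers
    module list                    # show the modules currently loaded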

  17. Working with files • To copy a file: cp my_file my_new_file • To move (or rename) a file: mv my_file my_new_file • To delete a file: rm my_file • To list the contents of a file: less file_name • To make a new directory (i.e. folder): mkdir new_directory • To copy a file into another directory: cp my_file other_directory • To move a file into another directory: mv my_file other_directory • To remove a directory and all its contents: rm -R directory (use with care) • Wildcards: * matches any sequence of characters. For example: cp *.dat my_directory

  18. Resources 1: Filestore • Two areas of filestore are available on iceberg: • A permanent, secure, backed-up area in your home directory /home/username • A data directory /data/username • Not backed up to tape • Data is mirrored on the storage server

  19. Resources 2: Scratch area • Temporary data storage on the local compute nodes • I/O is much faster than the NFS-mounted /home and /data • /fastdata uses the “lustre”-based parallel file system • Data is not visible to other worker nodes and is not backed up • Create a directory named after your username in /scratch on a worker and work from this directory (see the sketch below) • Data in the /scratch area is deleted periodically when it is not being accessed by any process or job
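  For example, on a worker node (a minimal sketch using the $USER environment variable for your username):

    mkdir -p /scratch/$USER   # create your own directory on the worker's local disk
    cd /scratch/$USER         # work from here for fast local I/O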

  20. Resources 3: Storage Allocations • Storage allocations for each area are as follows: • Your home directory has a filestore quota of 5 GB, but you can request additional space. • If you change directory to /data you will see a directory labelled with your username. • In /data you can store 50 GB of files; you can request additional space. • /fastdata is not under the quota system; however, data older than 90 days will be deleted by the housekeeping application

  21. Resources 4: Important Notes • The data area is not backed up. • Check your quota regularly: if you go over quota the account will become frozen and you will need to contact iceberg-admins • Check your quota using the command quota • If you exceed your quota, use the RM command (note the upper case)

  22. Resources 5: Transferring Data • It is always advisable to have a backup of your files on media in a physically separate location. • It is also often necessary to copy files between different platforms so as to use them with software that does not exist, or is not practical to use, on all platforms. • This can be done from your own machines, laptops etc. or from the Managed Windows machines in the IT centres. • Command-line tools such as scp and sftp (sketched below) • Use graphical sftp tools such as WinSCP for Windows or gFTP for Linux • http://winscp.net/eng/index.php • If you decide to transfer files between your own workstation and iceberg, there are plenty of file-transfer programs available to you. See URL: http://www.sheffield.ac.uk/wrgrid/using/access
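  A command-line sketch using scp and sftp (the hostname and file names are illustrative assumptions):

    # copy a single file from your workstation to your iceberg home directory
    scp results.dat your_username@iceberg.shef.ac.uk:~/
    # or start an interactive transfer session
    sftp your_username@iceberg.shef.ac.uk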

  23. Running programs on iceberg • Iceberg is the gateway to the cluster of worker nodes and the only node where direct logging in is allowed. • Iceberg’s main purpose is to allow access to the worker nodes, NOT to run cpu-intensive programs. • All cpu-intensive computations must be performed on the worker nodes. This is achieved with the qsh command for interactive jobs and the qsub command for batch jobs. • Once you log into iceberg, taking advantage of the power of a worker node for interactive work is done simply by typing qsh and working in the new shell window that is opened. This apparently trivial task will in fact have queried all the worker nodes for you and started a session on the least loaded worker in the cluster. • However, if you come in via the Web Interface (i.e. SGD) you are put straight onto one of the worker nodes. • The next set of slides assumes that you are already working on one of the worker nodes (a qsh session).

  24. Sun Global Desktop – The MyApps Portal • Start a session on the head node • Start an interactive session on a worker • Start an application on a worker • Help

  25. Managing Jobs 1: Sun Grid Engine Overview • Resource management system, job scheduler, batch system… • Can schedule serial and parallel jobs • Serial jobs run in individual host queues • Parallel jobs must include a parallel environment request (-pe <pe_name> N)

  26. Job scheduling on the cluster • The SGE master node manages the queues (Queue-A, Queue-B, Queue-C) and allocates jobs to slots on the SGE worker nodes according to: Queues, Policies, Priorities, Share/Tickets, Resources, Users/Projects. [Slide diagram: jobs N, O, U, X, Y and Z queued at the SGE master node and dispatched into slots across the worker nodes.]

  27. Managing Jobs 2: Job Scheduling • Job schedulers work predominantly with “batch” jobs - require no user input or intervention once started • Installation here also supports interactive use via “qsh”

  28. Managing Jobs 3: Working with SGE jobs • There are a number of commands for querying and modifying the status of a job running or queued by SGE • qsub (submit a job to SGE) • qstat (query job status) • qdel (delete a job)

  29. Managing Jobs 4: Submitting Serial Jobs • Create a submit script (example.sh):
#!/bin/bash
# Scalar benchmark
echo "This code is running on"
/bin/hostname
/bin/date
  • The job is submitted to SGE using the qsub command: $ qsub example.sh

  30. Managing Jobs 5: Options Used with SGE

  31. Managing Jobs 6: Options Used with SGE
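  As a sketch, commonly used options can be given on the qsub command line or as #$ directives in the submit script. The job name, output file and email address below are illustrative, and resource names such as h_rt are standard Sun Grid Engine but should be checked against the iceberg documentation:

    #!/bin/bash
    #$ -N my_job                # give the job a name
    #$ -cwd                     # run the job from the current working directory
    #$ -o my_job.out            # file for standard output
    #$ -j y                     # merge standard error into the output file
    #$ -l h_rt=08:00:00         # wall-clock time limit (hh:mm:ss)
    #$ -M me@sheffield.ac.uk    # illustrative address for email notifications
    #$ -m be                    # email at the beginning and end of the job
    ./my_app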

  32. Managing Jobs 7: qsub • qsub arguments: qsub -o outputfile -j y -cwd ./submit.sh • OR as #$ directives in the submit script:
#!/bin/bash
#$ -o outputfile
#$ -j y
#$ -cwd
/home/horace/my_app

  33. Managing Jobs 8: Interactive Use • Interactive but with a dedicated resource • “qsh” • Then use as your desktop machine • Fluent, matlab…

  34. Managing Jobs 9: Deleting Jobs with qdel • Individual job: $ qdel 151 (response: gertrude has registered the job 151 for deletion) • List of jobs: $ qdel 151 152 153 • All jobs running under a given username: $ qdel -u <username>

  35. Managing Jobs 9: Monitoring Jobs with qstat • To list the status and node properties of all nodes: qstat (add -f to get a full listing) • Information about your own jobs and queues is provided by the qstat -u username command, e.g. qstat -u fred • To monitor a job and show its memory usage: qstat -f -j jobid | grep usage

  36. Managing Jobs 10: qstat Example
job-ID  prior    name        user      state  submit/start at       queue                           slots  ja-task-ID
----------------------------------------------------------------------------------------------------------------
206951  0.51000  INTERACTIV  bo1mrl    r      07/05/2005 09:30:20   bigmem.q@comp58.iceberg.shef.a    1
206933  0.51000  do_batch4   pc1mdh    r      07/04/2005 16:28:20   long.q@comp04.iceberg.shef.ac.    1
206700  0.51000  l100-100.m  mb1nam    r      07/04/2005 13:30:14   long.q@comp05.iceberg.shef.ac.    1
206698  0.51000  l50-100.ma  mb1nam    r      07/04/2005 13:29:44   long.q@comp12.iceberg.shef.ac.    1
206697  0.51000  l24-200.ma  mb1nam    r      07/04/2005 13:29:29   long.q@comp17.iceberg.shef.ac.    1
206943  0.51000  do_batch1   pc1mdh    r      07/04/2005 17:49:45   long.q@comp20.iceberg.shef.ac.    1
206701  0.51000  l100-200.m  mb1nam    r      07/04/2005 13:30:44   long.q@comp22.iceberg.shef.ac.    1
206705  0.51000  l100-100sp  mb1nam    r      07/04/2005 13:42:07   long.q@comp28.iceberg.shef.ac.    1
206699  0.51000  l50-200.ma  mb1nam    r      07/04/2005 13:29:59   long.q@comp30.iceberg.shef.ac.    1
206632  0.56764  job_optim2  mep02wsw  r      07/03/2005 22:55:30   parallel.q@comp43.iceberg.shef   18
206600  0.61000  mrbayes.sh  bo1nsh    r      07/02/2005 11:22:19   parallel.q@comp51.iceberg.shef   24
206911  0.51918  fluent      cpp02cg   r      07/04/2005 14:19:06   parallel.q@comp52.iceberg.shef    4
206954  0.51000  INTERACTIV  php04awb  r      07/05/2005 10:06:17   short.q@comp01.iceberg.shef.ac    1

  37. Managing Jobs 11: Monitoring Job Output • The following is an example of submitting an SGE job and checking the output it produces:
qsub -pe mpich 8 myjob.sh
job <131> submitted
qstat -f    (is the job running?)
tail -f myjob.sh.o131

  38. Managing Jobs 12: SGE Job Output • When a job is queued it is allocated a job number. Once it starts to run, output sent to standard output and standard error is spooled to files called • <script>.o<jobid> • <script>.e<jobid>

  39. Managing Jobs 13: Reasons for Job Failures • SGE cannot find the binary file specified in the job script • Required input files are missing from the startup directory • An environment variable is not set (LM_LICENSE_FILE etc.) • Hardware failure (e.g. mpi ch_p4 or ch_gm errors)

  40. Managing Jobs 14: SGE Job Arrays • Add to the qsub command, or to the script file (with #$ at the beginning of the line): “-t 1-10” • This would create 10 tasks from one job • Each task has $SGE_TASK_ID set in its environment (see the sketch below)
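  A minimal sketch of an array job script (the input file naming is an illustrative assumption):

    #!/bin/bash
    #$ -t 1-10        # create 10 tasks from this single job
    #$ -cwd
    # each task sees its own value of $SGE_TASK_ID (1..10)
    ./my_app input_${SGE_TASK_ID}.dat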

  41. Specifying The Memory Requirements of a Job • Policies that apply to queues • The default memory requirement for each job is 4 GB • Jobs will be killed if their memory use exceeds the amount requested • Determine the memory requirements of a job as follows: qstat -f -j jobid | grep mem • The reported figures indicate: the currently used memory (vmem), the maximum memory needed since startup (maxvmem), and the cumulative memory_usage*seconds (mem) • When you next run the job, use the reported value of vmem to specify its memory requirement (see the sketch below) • The qtop command has been provided to identify how much resource your current jobs are using
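  For example (a sketch; the resource name used here, h_vmem, is the generic Sun Grid Engine one and may differ on iceberg, so check the local documentation):

    qstat -f -j <jobid> | grep mem    # report vmem, maxvmem and mem for a running job
    qsub -l h_vmem=8G my_job.sh       # request 8 GB when resubmitting the job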

  42. Memory Issues • Programs using <2 GB of memory require no modification • Large memory use is associated with the heap or data memory segment; if this exceeds 2 GB use the following compiler flags • C/C++ compilers: pgcc -mcmodel=medium • Fortran compilers: pgf77/pgf90/pgf95 -mcmodel=medium • g77 -mcmodel=medium

  43. Useful Links for Memory Issues • 64 bit programming memory issues • http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/64-bit.html • Understanding Memory • http://www.ualberta.ca/CNS/RESEARCH/LinuxClusters/mem.html

  44. Managing Jobs 15: SGE Parallel Environments • Parallel environments on Iceberg • ompigige • openmp • openmpi-ib • mvapich2-ib • See later (an example MPI submit script is sketched below)
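  A sketch of a 4-slot MPI submit script using one of the parallel environments listed above (the module name and the explicit -np argument are generic assumptions):

    #!/bin/bash
    #$ -pe openmpi-ib 4        # request 4 slots in the openmpi-ib parallel environment
    #$ -cwd
    module add mpi/openmpi     # illustrative module name; check module avail
    mpirun -np 4 ./my_mpi_app  # launch the MPI program on the allocated slots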

  45. Managing Jobs 16: Job Queues on Iceberg

  46. Managing Jobs 17: Interactive Computing • Software that runs interactively should not be run on the head node. • Instead, you must run interactive jobs on an execution node (see the ‘qsh’ command). • The time limit for interactive work is 8 hours. • Interactive work on the head node will be killed off.

  47. Getting help • Web site • http://www.shef.ac.uk/wrgrid/ • Documentation • http://www.shef.ac.uk/wrgrid/using • Training (also uses the learning management system) • http://www.shef.ac.uk/wrgrid/training • Contacts • http://www.shef.ac.uk/wrgrid/contacts.html

  48. Thank you for your attention Any questions ?
