In a cluster computing environment, users work with many machines, each potentially running different software on different hardware. This can lead to problems such as slow responses or machine downtime. The Portable Batch System (PBS) sits between users and the computing resources. Through user interfaces, job queuing, scheduling, and monitoring, users can submit tasks and manage jobs efficiently across uniform hardware and software. With its two components, user commands and system daemons, PBS manages resources and executes tasks in interactive, batch, and parallel computing scenarios.
Using Clusters - User Perspective
Pre-cluster scenario
• Many different computers: prithvi, apah, tejas, vayu, akash, agni, aatish, falaq, narad, qasid …
• Different software on each of them
• Different hardware capabilities
• The desired machine may be down
• Only a few machines are in the top bracket, so response may be slow
Cluster
• One machine in place of many computers
• Same software everywhere
• Same hardware
• A few systems being down is no problem
• The machine can be used as an interactive server, batch server, sequential machine, or parallel machine
User Interface to Cluster
• Like an OS, which sits between one machine and the user
• This interface sits between the user and a group of machines
• Users <-> Interface <-> machines
Components
• Queuing: collecting user jobs/requests in the form of batch jobs
• Scheduling: selecting which user jobs to run and which machines to run them on
• Monitoring: enforcing usage policy and tracking job and machine status
Portable Batch System (PBS)
• Two components: user commands and system daemons
• A GUI equivalent to the user commands is also available
• User commands cover tasks such as submit, monitor, modify, and delete
• Daemons:
• Server: manages the resources of the whole cluster
• Scheduler: selects the executer and its resources
• Executer: a node and processor selected by the scheduler
Running a job:
• 1. Create a file containing OS and PBS commands, with the #PBS directives placed before the first OS command:
  #PBS -l ncpus=4
  ./a.out
• 2. Submit the job with: qsub [options] <file_containing_OS/PBS commands>
• The -I option creates an interactive session
• The -q option selects the queue
Checking the status of a job:
• tracejob job_number
9/05/2006 20:19:36 S Job Queued at request of santh@hncn17, owner = santh@hncn17, job name = SCR_LB70-m5stat, queue = workq
9/05/2006 20:19:36 S Job Modified at request of Scheduler@hncn17
9/05/2006 20:19:36 S enqueuing into workq, state 1 hop 1
9/05/2006 20:19:36 A queue=workq
9/05/2006 22:39:36 L Considering job to run
9/05/2006 22:39:36 L Not enough of the right type of nodes available
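The create, submit, and trace steps above can be put together roughly as follows. This is a minimal sketch, not a site-specific recipe: the file name job.pbs is hypothetical, workq is simply the queue that appears in the log above, and ./a.out is assumed to sit in the directory the job is submitted from. The script only calls qsub and tracejob when the PBS client commands are installed, and tracejob may additionally need access to the server logs.

```shell
#!/bin/sh
# Hypothetical names: job.pbs is an illustrative file name,
# workq is the queue shown in the tracejob log.
JOB_SCRIPT=job.pbs
QUEUE=workq

# 1. Create the job file. The #PBS directive comes before the first
#    command, since directives after a command are not read.
cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/sh
#PBS -l ncpus=4
# Move to the directory the job was submitted from (assumed to hold a.out).
cd "$PBS_O_WORKDIR"
./a.out
EOF

# 2. Submit to the chosen queue, but only if PBS is installed here.
if command -v qsub >/dev/null 2>&1; then
    if JOB_ID=$(qsub -q "$QUEUE" "$JOB_SCRIPT"); then
        echo "submitted: $JOB_ID"
        # 3. Trace the job by its number (the part before the first dot).
        tracejob "${JOB_ID%%.*}"
    fi
else
    echo "qsub not found; wrote $JOB_SCRIPT without submitting"
fi
```

For an interactive session instead of a batch run, the same queue option can be combined with -I, for example qsub -I -q workq.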
• Modifying a job: qalter -l walltime=20:00 job_identifier
• Deleting a job: qdel 17
• Sending signals: qsig -s signal job_identifier
• Jobs can be moved between queues
• Parallel jobs are run through the mpirun command
• Checkpointing is possible
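The job-management commands above could be exercised along these lines. This is a sketch under assumptions: job 17 is the example number from the slide, SIGUSR1 is an arbitrary signal chosen for illustration, and qmove with the workq destination is one way the queue move might be done. The helper run is hypothetical; it prints each command and executes it only when that command is installed.

```shell
#!/bin/sh
# Job number 17 is the example from the slide; pass another id as $1.
JOB_ID=${1:-17}

# Hypothetical helper: echo the command, then run it only if it exists.
run() {
    echo "+ $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

# Change the requested wall-clock time of a queued job.
run qalter -l walltime=20:00 "$JOB_ID"

# Send a signal to a running job (SIGUSR1 is an illustrative choice).
run qsig -s SIGUSR1 "$JOB_ID"

# Move the job to another queue (workq is an assumed destination).
run qmove workq "$JOB_ID"

# Delete the job.
run qdel "$JOB_ID"
```

On a machine without PBS the script only prints the four commands, which makes it safe to read through before trying it on a real cluster.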
• The three daemons are pbs_server, pbs_mom, and pbs_sched (the scheduler)
• A compute node runs only pbs_mom
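One way to see which of these daemons is present on a given node is to look for the processes by name. The sketch below assumes the pgrep utility is available and that the scheduler process is named pbs_sched, as it usually is in PBS installations; daemon_status is a hypothetical helper. On a compute node, only pbs_mom would be expected to show as running.

```shell
#!/bin/sh
# Hypothetical helper: report whether a process with this exact name exists.
# If pgrep itself is missing, the daemon is reported as not running.
daemon_status() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo "$1: running"
    else
        echo "$1: not running"
    fi
}

# Daemon names are assumptions about the local installation.
for d in pbs_server pbs_sched pbs_mom; do
    daemon_status "$d"
done
```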