
The Grid

The Grid. Constantinos Kourouyiannis Ξ Architecture Group. Contents. What is the Grid? Commands of the Grid Description of the commands Basic operations – Scripts Job submission Job status Job cancellation Sites that Grid uses. What is the Grid?.


Presentation Transcript


  1. The Grid Constantinos Kourouyiannis Ξ Architecture Group

  2. Contents • What is the Grid? • Commands of the Grid • Description of the commands • Basic operations – Scripts • Job submission • Job status • Job cancellation • Sites that Grid uses

  3. What is the Grid? • service for sharing computer power and data storage capacity over the Internet • "work in progress", with the underlying technology still in a prototype phase, and being developed by hundreds of researchers and software engineers around the world.

  4. Resource sharing • Problem: coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations • Sharing: • direct access to computers, software, data, and other resources • highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs. A set of individuals and/or institutions defined by such sharing rules forms what we call a virtual organization (VO).

  5. Commands • edg-job-submit <jdl_file> • edg-job-status <job_ID> • edg-job-get-output <job_ID> • edg-job-list-match <jdl_file> • edg-job-cancel <job_ID>

  6. Description of commands • edg-job-submit This is the command to submit a job to the grid. The command requires as input a JDF (Job Description File) and returns a job ID (edg_jobId). • edg-job-status This command prints the status of a job previously submitted using edg-job-submit. The job status request is sent to the LB (Logging and Bookkeeping service) that provides the requested information.

  7. Description of commands (cont.) • edg-job-get-output This command retrieves the output files of a job that was submitted with edg-job-submit using a job description file that includes the OutputSandbox attribute. Once the job has finished executing, the files it generated, as listed in the OutputSandbox attribute, are stored temporarily on the RB (Resource Broker) machine; the user downloads them by running edg-job-get-output with the job ID returned by edg-job-submit. • edg-job-list-match Displays the identifiers of the resources that the user is authorized to use and that satisfy the job requirements in the JDF.

  8. Description of commands (cont.) • edg-job-cancel This command cancels a job previously submitted using edg-job-submit. Before cancellation, it prompts the user for confirmation. The cancel request is sent to the Network Server.

  9. The Job Description File • This file describes the necessary inputs, generated outputs and resource requirements of a job using the JDL (Job Description Language). • Example:
       Executable = "/bin/echo";
       Arguments = "Hello World";
       StdOutput = "message.txt";
       StdError = "stderror";
       OutputSandbox = {"message.txt", "stderror"};
       Requirements = other.LRMSType == "PBS";
       Rank = other.FreeCPUs;
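The example job description can be written to a file from the shell with a here-document. This is a minimal sketch: the file name HelloWorld.jdl is the one used on the later edg-job-list-match slide, every attribute is terminated by a semicolon, and the script is assumed to run in a writable working directory.

```shell
#!/bin/bash
# Write the Hello World job description to HelloWorld.jdl.
# The quoted 'EOF' keeps the shell from expanding anything inside the JDL text.
cat > HelloWorld.jdl <<'EOF'
Executable    = "/bin/echo";
Arguments     = "Hello World";
StdOutput     = "message.txt";
StdError      = "stderror";
OutputSandbox = {"message.txt", "stderror"};
Requirements  = other.LRMSType == "PBS";
Rank          = other.FreeCPUs;
EOF
```

The resulting file is what would be passed to edg-job-list-match or edg-job-submit.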

  10. The Job Description File (cont.) • The parameters Requirements and Rank control the resource matching for the job. • The expression given for the requirements specifies the constraints necessary for a job to run. The job will only be submitted to resources which satisfy this condition. • If more than one resource matches, then the rank is used to determine which is the most desirable resource and hence the one to which the job is submitted (Higher values are more desirable).
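The filter-then-rank idea can be illustrated with a toy script. This is only a sketch of the principle, not how the Resource Broker is implemented: the resource names, LRMS types and free-CPU counts below are invented, the requirement is hard-coded as "LRMS type is PBS", and the rank is the free-CPU count.

```shell
#!/bin/bash
# Toy matchmaking: filter by a requirement, then pick the highest rank.
# Columns: CE id, LRMS type, free CPUs. All values are invented for illustration.
candidates='ce-a.example.org PBS 4
ce-b.example.org LSF 32
ce-c.example.org PBS 12'

# Requirements: keep only PBS resources. Rank: free CPUs, higher is better.
best=$(printf '%s\n' "$candidates" |
  awk '$2 == "PBS" && ($3 + 0 > max || best == "") { max = $3 + 0; best = $1 }
       END { print best }')

echo "$best"
```

Here ce-b.example.org has the most free CPUs but is never considered, because it fails the requirement; rank only orders the resources that already match.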

  11. edg-job-list-match • This command returns a list of available computing element IDs (CEIds). E.g. edg-job-list-match HelloWorld.jdl returns:
       COMPUTING ELEMENT IDs LIST
       The following CE(s) matching your job requirements have been found:
       *CEId*
       grid20.bo.ingv.it:2119/jobmanager-pbs-infinite
       grid20.bo.ingv.it:2119/jobmanager-pbs-long
       grid20.bo.ingv.it:2119/jobmanager-pbs-short
       gridba2.ba.infn.it:2119/jobmanager-lcgpbs-infinite
       gridba2.ba.infn.it:2119/jobmanager-lcgpbs-long

  12. edg-job-submit • edg-job-submit HelloWorld.jdl returns:
       ***********************************************************
       JOB SUBMIT OUTCOME
       The job has been successfully submitted to the Network Server.
       Use edg-job-status command to check job current status.
       Your job identifier (edg_jobId) is:
       - https://edt003.cnaf.infn.it:9000/NyIYE_a8igk4f0CLX
       ***********************************************************
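Because the later commands all take the job identifier as input, a script usually needs to pull it out of the submit outcome. The sketch below assumes the output format shown on this slide, where the identifier is the only https:// URL in the text; extract_job_id is a hypothetical helper name, and the sample text is copied from the slide rather than produced by a real submission.

```shell
#!/bin/bash
# Print every https:// job identifier found on standard input.
extract_job_id() {
  grep -o 'https://[^[:space:]]*'
}

# Sample text copied from the submit outcome shown on the slide.
sample='The job has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status.
Your job identifier (edg_jobId) is:
- https://edt003.cnaf.infn.it:9000/NyIYE_a8igk4f0CLX'

job_id=$(printf '%s\n' "$sample" | extract_job_id)
echo "$job_id"
```

On a machine with the grid tools installed, the same helper could be fed from a pipe, for example edg-job-submit HelloWorld.jdl | extract_job_id, provided the output really has this shape.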

  13. edg-job-get-output • When the status was requested, the job had finished and the output had been pushed back to the resource broker. States seen in the normal processing of jobs are: Accepted, Waiting, Running, Done, and OutputReady. Abnormal execution usually ends with an Aborted status. • edg-job-get-output https://edt003.cnaf.infn.it:9000/NyIYE_a8igk4f0CLX returns:
       Retrieving files from host edt003.cnaf.infn.it
       **************************************************************************
       JOB GET OUTPUT OUTCOME
       Output sandbox files for the job:
       - https://edt003.cnaf.infn.it:9000/NyIYE_a8igk4f0CLX
       have been successfully retrieved and stored in the directory:
       /tmp/jobOutput/NyIYrqE_a8igk4f0CLXNKA
       **************************************************************************
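A monitoring script has to decide what to do for each of these states. The sketch below maps a state name to a next step. The state names are the ones listed on this slide; the function name next_action, the action labels, and the choice to keep waiting while a job is in the Done state are assumptions made for illustration.

```shell
#!/bin/bash
# Map a job state name to the next step a monitoring script might take.
# State names come from the slide; the action labels are made up.
next_action() {
  case "$1" in
    Accepted|Waiting|Running|Done) echo "wait" ;;    # still moving through the normal sequence
    OutputReady)                   echo "fetch" ;;   # output is on the RB: run edg-job-get-output
    Aborted)                       echo "fail" ;;    # abnormal end: report and stop
    *)                             echo "unknown" ;; # anything not listed on the slide
  esac
}

next_action Running
next_action OutputReady
```

Keeping the decision in one function means the polling loop around it only has to obtain the state string, however edg-job-status happens to print it.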

  14. Script for job submission • The following script submits a job:
       #!/bin/bash
       ./submit.sh -a "-config largeConfig.dat -intervalCycles 0 go.ss 9 9" -b go.ss go.jdl go.out go.err
       exit
     • The above script creates a jdl file for the job, which contains the necessary information about the job. The jdl file is then submitted to Grid machines with the command edg-job-submit. When the job is submitted, a unique address is created for it and written to a file specified by the user with the -o option of the edg-job-submit command:
       edg-job-submit -o jobIDs.lst --config-vo edg_wl_ui.conf --config edg_wl_ui_cmd_var.conf $1

  15. Command to get the status of the jobs • We can check the status of the jobs with the command edg-job-status -i jobIDs.lst, where jobIDs.lst is the file containing the unique addresses of the jobs. • The user is shown a numbered list of the jobs and asked whether they want the status of a specific job or of all jobs. To see all jobs they type "a"; otherwise they type the number of the job whose status they want. • The command is automated for all the jobs by using:
       edg-job-status -i jobIDs.lst <<EOF >jobStatus.txt
       a
       EOF
     • The above command generates the status of all the jobs contained in the file jobIDs.lst and saves it in the file jobStatus.txt.
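The automation relies on a here-document supplying the answer that the interactive prompt would otherwise wait for. The mechanism can be tried without grid access using a stand-in: choose_job below is a hypothetical function that only mimics a "pick a job or all" prompt and is not edg-job-status.

```shell
#!/bin/bash
# Stand-in for an interactive command: it prompts, then reads one line.
# It is NOT edg-job-status; it only mimics the "pick a job or all" prompt.
choose_job() {
  printf 'Choose a job number or "a" for all: ' >&2
  read -r choice
  if [ "$choice" = "a" ]; then
    echo "status of all jobs"
  else
    echo "status of job $choice"
  fi
}

# The here-document supplies the answer, so nobody has to type it.
# Standard output goes to jobStatus.txt; the prompt on stderr is discarded.
choose_job <<EOF > jobStatus.txt 2>/dev/null
a
EOF

cat jobStatus.txt
```

The here-document body must start on the line after the command and the closing EOF must sit on a line by itself, which is why the command cannot be typed as a single line.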

  16. Command to cancel jobs • To cancel the jobs we previously submitted, we use the command: edg-job-cancel -i jobIDs.lst • This command cancels every job listed in the file jobIDs.lst. Before cancelling, it asks the user to confirm.

  17. Sites that Grid uses • http://goc.grid.sinica.edu.tw/gstat/SouthEasternEurope.html • The above webpage shows the 16 grid sites that the virtual organization SEE (SEE-VO) supports. By clicking a site name, the user can check out the number of total CPUs and free CPUs.
