Glexec/SCAS Pilot: IN2P3-CC status

This status report covers the initial plan and the set-up issues for the pilot deployment of Glexec/SCAS at IN2P3-CC. It also gives an overview of grid job management and of the latest BQS JM enhancements.

jameshall
Presentation Transcript


  1. Glexec/SCAS Pilot: IN2P3-CC status
     Pierre Girard, CCIN2P3
     T1-T2, 2009-02-03

  2. Content
     • Grid deployment at CCIN2P3
     • Initial plan for pilot of Glexec/SCAS
     • Setting-up issues
     • Conclusion
     Pierre Girard - Glexec/SCAS: IN2P3-CC status

  3. Grid Job Management at CCIN2P3
     • Several grid WN versions at a time
     • No MW installed locally on the workers
     • [Diagram labels: four Computing Elements; BQS; Anastasie; WNs; shared FS (afs.in2p3.fr) on AFS holding the WN profiles Glite-WN-3.1.26-glexec, Glite-WN-3.1.26-prod, Glite-WN-3.1.19-prod, Glite-WN-3.1.666-pps and Globus4-WN]

  4. Overview of grid job submission
     • [Diagram, with the steps numbered as on the slide]
     • 1: The grid job (credentials, RSL, U-job) is submitted to the Computing Element
     • 2: The Job Manager wraps the job locally as lcg0507012233-1234.sh
     • 3: The wrapped job is passed to BQS with qsub
     • 4: The U-job is spawned on a WN (SL4.5, Glite-WN)
     • Wrapper header shown on the slide:
       #!/bin/sh
       #PBS -q T
       #PBS -l M=2200MB
       #PBS -l T=3801600
       #PBS -l scratch=16250MB
       #PBS -l platform=LINUX
       #PBS --share T1prod
       …
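The local job wrapping step can be illustrated with a minimal sketch that emits the header lines shown on the slide. The function name `build_bqs_header` and its parameter names are hypothetical; only the directive lines themselves come from the slide. The unit of the `T` limit is not stated there, and the rest of the wrapper is elided on the slide, so neither is filled in here.

```python
def build_bqs_header(job_class, mem_mb, cpu_limit, scratch_mb, platform, share):
    """Return the BQS directive header of a wrapped job as a string.

    The directive layout is copied from the slide; this helper itself is
    a hypothetical illustration, not the actual BQS Job Manager code.
    """
    lines = [
        "#!/bin/sh",
        f"#PBS -q {job_class}",            # BQS class, e.g. T
        f"#PBS -l M={mem_mb}MB",           # memory request
        f"#PBS -l T={cpu_limit}",          # CPU limit (unit not stated on the slide)
        f"#PBS -l scratch={scratch_mb}MB", # scratch space request
        f"#PBS -l platform={platform}",    # platform
        f"#PBS --share {share}",           # VO share
    ]
    return "\n".join(lines) + "\n"


# Values taken from the example on the slide.
header = build_bqs_header("T", 2200, 3801600, 16250, "LINUX", "T1prod")
print(header)
```

Keeping the header generation in one function makes it easy to see which submission parameters the Job Manager has to decide for every job.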

  5. WN profile selection by BQS JobManager
     • [Diagram: the submission flow of the previous slide extended to six steps; part of the step numbering is lost in the transcript]
     • 1: The grid job (credentials, RSL, U-job) is submitted to the Computing Element
     • 2: The BQS JM wraps the job locally as lcg0507012233-1234.sh
     • 3, 4: The WN profile is set from the BQS-JM config rules (here Glite-WN-3.1.26-glexec) and the job is passed to BQS with qsub
     • 5: The U-job is spawned on a WN (SL4.5)
     • 6: The job is dynamically linked to the selected WN profile on AFS
     • Available WN profiles: Glite-WN-3.1.26-glexec, Glite-WN-3.1.26-prod, Glite-WN-3.1.19-prod, Glite-WN-3.1.666-pps, Globus4-WN

  6. Latest BQS JM enhancements
     • BQS JM control
       • Submission policy (deny, accept)
       • Forbearance management if BQS becomes unresponsive
     • BQS JM outputs
       • BQS submission parameters
         • Class: A (= short), G (= medium), T (= long), J (= verylong)
         • Amount of {Mem, CPU, Scratch}
         • Farm name
         • Platform (SL3, SL4, SL5)
         • Logical resources (list of): u_dcache_atlas, u_dcache_alice, u_OracleStress_atlas, …
         • VO share
       • Wrapped data
         • WN profile to be used
           profilesDirectory = /afs/in2p3.fr/grid/profiles/glite/3.1.25-0/SL4_64/WN32
         • Site name
         • AFS token (or not)

  7. Latest BQS JM enhancements
     • BQS JM configuration capabilities
       • (Most of) the BQS JM outputs are determined by configuration rules
       • A rule is basically an assignment
         Ex.: SubmissionPolicy = ACCEPT
       • But a rule can be conditioned on some job input data (in precedence order)
         • Mapped account
         • Mapped group
         • CE queue
         Ex.: UserSubmissionPolicy_atlas050 = DENY
              # Specific requirements for ATLAS with queue verylong
              GroupVirtualQueueMaxMem_atlas_verylong = default
              GroupVirtualQueueMaxCPU_atlas_verylong = max
              GroupVirtualQueueMaxScratch_atlas_verylong = default
     • Configuration syntax
       • Is quite ugly
       • Does not allow conditions to be combined
       • But seems enough for now
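The rule precedence described on this slide can be sketched as a lookup that tries the most specific key first. This is only an illustration under assumptions: the key patterns `User<Key>_<account>` and `Group<Key>_<group>` are inferred from the examples on the slide, the functions `parse_rules` and `resolve` are hypothetical, and CE-queue conditions are left out because the slide does not show how a queue-only key is spelled.

```python
def parse_rules(text):
    """Parse 'Key = value' lines, skipping blank lines and '#' comments."""
    rules = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        rules[key.strip()] = value.strip()
    return rules


def resolve(rules, key, account=None, group=None):
    """Return the value of `key`, preferring account, then group, then default.

    The naming of the conditioned keys is an assumption inferred from the
    slide's examples, not a documented BQS JM syntax.
    """
    candidates = []
    if account:
        candidates.append(f"User{key}_{account}")   # highest precedence
    if group:
        candidates.append(f"Group{key}_{group}")    # next precedence
    candidates.append(key)                          # unconditioned default
    for name in candidates:
        if name in rules:
            return rules[name]
    return None


RULES = parse_rules("""
SubmissionPolicy = ACCEPT
UserSubmissionPolicy_atlas050 = DENY   # one account is denied
""")

print(resolve(RULES, "SubmissionPolicy", account="atlas050"))
print(resolve(RULES, "SubmissionPolicy", account="atlas001"))
```

Because each rule carries a single condition in its key name, a lookup like this cannot express "account X on queue Y", which matches the limitation the slide notes.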

  8. Glexec deployment at IN2P3-CC
     • Glexec is a tool to be deployed on the WN
       • Used by the VOs to manage the "real user jobs" within a pilot job
       • Has a setuid capability (the pilot job forks the "real user job" under another account)
       • Site authorization per "real user job", based on the real user's proxy
     • How the deployment was planned
       • Deploy the Glite-WN/Glexec relocated on AFS
       • Use the configuration capabilities to redirect the pilot jobs to this deployment
         profilesDirectory = /afs/in2p3.fr/grid/profiles/glite/3.1.25-0/SL4_64/WN32
         UserProfilesDirectory_dteam049 = /afs/in2p3.fr/grid/profiles/glite/3.1.25-0/SL4_32/WN32_GLEXEC
     • Sounded easy…

  9. Glexec deployment issues at IN2P3-CC
     • Glexec must be installed locally on the worker
       • The absolute path of the configuration file is hardcoded
         • /opt/glite/etc/glite.conf
         • Only one MW configuration possible
       • Dynamic library configuration (due to setuid)
         • /etc/ld.so.conf
         • Only one MW installation possible
       • Log configuration (syslog)
         • Not so problematic for now
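The dynamic library constraint is presumably a consequence of how the Linux dynamic loader treats set-user-ID programs: it runs them in secure-execution mode and ignores `LD_LIBRARY_PATH`, so libraries have to be found through trusted system settings such as `/etc/ld.so.conf`. The sketch below only checks the setuid bit of a file; the helper names are hypothetical, and the check is a simplification (set-group-ID bits and file capabilities also trigger secure execution).

```python
import os
import stat
import tempfile


def mode_is_setuid(mode):
    """Return True if the permission bits in `mode` include set-user-ID."""
    return bool(mode & stat.S_ISUID)


def is_setuid(path):
    """Return True if the file at `path` has the set-user-ID bit set."""
    return mode_is_setuid(os.stat(path).st_mode)


# An ordinary temporary file has no setuid bit, so a binary like it could
# still be pointed at relocated libraries through LD_LIBRARY_PATH.
with tempfile.NamedTemporaryFile() as tmp:
    print(tmp.name, "setuid:", is_setuid(tmp.name))

# Mode 4755 (setuid, rwxr-xr-x) is the kind of mode a setuid tool carries.
print("mode 4755 setuid:", mode_is_setuid(0o4755))
```

Running `is_setuid` on the installed binaries of a relocated distribution is one quick way to spot the components that cannot rely on environment-based library paths.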

  10. Glexec deployment in use at IN2P3-CC
     • We are part of the "SCAS Pilot Service"
       • Asked to provide SCAS/glexec in production
       • Load test for SCAS services by ATLAS and LHCb
     • Deployment done
       • Usable by both LHCb and Dteam
       • Through the T1 CEs
       • According to specific VOMS roles/groups
     • But
       • Deployment issues
         • They break our WN setup strategy
         • The relocatable distribution was not ready (home-made)
       • First tests with LHCb
         • Were not satisfactory
         • Raised some questions

  11. Glite-3.2.0 (SL5) at IN2P3-CC
     • Glite-WN only
     • Deployed on AFS
     • Tested with a test CE on the BQS farm "lcg"
     • Will be activated
       • As soon as SL5 workers enter production (done)
       • A queue will be added to the T2 (T1?) CEs
