WMS priorities for TCG work in progress


Presentation Transcript


  1. WMS priorities for TCG (work in progress) Laura, Stefano & Francesco

  2. Layout • Also take into account VOMS and accounting issues related to job management • Look first at 3.0, then beyond; estimate FTE per issue; try a time-priority ordering at the end • glexec at the very beginning, as it is not really WMS • FTE estimates and assessments of the potential value of features not directly present in the user requests (e.g. most of the gLite CE) come from Francesco (plus some educated guesses); of course, once priorities are defined, JRA1 should come back with a real schedule

  3. glexec on WN • Not really a gLite WMS issue • JRA1/security development at Nikhef • Fabric issue • Deploy a SUID executable on all WNs • No work needed from WMS or DGAS • APEL to charge resources to the pilot submitter • Timeline (J. White + our guess) • Tool by June • For PPS by July

  4. WMS 3.0: list for “production” • gLite RB talking to the LCG CE is first priority. Test priorities: • Single jobs à la LCG-RB (SC4 requisite) • Shallow resubmission for those • Bulk submission • gLite CE next. Motivations: • Better performance and stability • New Condor (clustered connections, lease for job delegation) • Not using GRAM for single jobs • Needed for “requirements passing to batch” (via BLAH; will transition to CREAM smoothly) • gLite CE: large issue of interface/support with Condor: needs dedicated SA3 effort (integration with external components moves from JRA1 to SA3 in EGEE-2) • The gLite CE cannot be made to work by JRA1 alone
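
To make the slide's three test priorities concrete, here is a minimal JDL sketch of a single job with shallow resubmission enabled, followed by the same idea expressed as a collection for bulk submission. Attribute names follow the gLite 3 WMS JDL; the executables and file names are placeholders.

    [
      Type = "Job";
      Executable = "/bin/hostname";
      StdOutput = "std.out";
      StdError = "std.err";
      OutputSandbox = {"std.out", "std.err"};
      // shallow resubmission: retry only jobs that never started running
      RetryCount = 0;
      ShallowRetryCount = 3;
    ]

    // bulk submission: a collection groups many jobs into one submit call
    [
      Type = "Collection";
      Nodes = {
        [ Executable = "/bin/hostname"; StdOutput = "std.out"; OutputSandbox = {"std.out"}; ],
        [ Executable = "/bin/date"; StdOutput = "std.out"; OutputSandbox = {"std.out"}; ]
      };
    ]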

  5. Beyond 3.0 • No all-encompassing 3.1, 3.2, 3.3 • Adding features with new releases of single components • Starting from what is already coded and tagged for 3.1

  6. Beyond 3.0, by group • WMS scalability • WMS high availability • WMS failure tracking/reduction • Job accounting • VOMS-aware WMS • SDJ • MPI • Friendliness / miscellanea

  7. WMS scalability (304, 306) • High priority, but not critical for SC4 • WMS+ATLAS+CMS brainstorming 21/12/05 • Most promising mid-term ideas: • Relevant attribute in matchmaking (bulk mm) • 6 FTE*months (JRA1/WM) for possible large factors (x10?) • Route sandboxes away from the RB (the original plan anyhow) • Sandbox as URL (gsiftp) in gLite 3.0 documentation • Adding HTTP opens the way to Squid. Add encryption? • Maybe HTTP is already there via g-u-c? • User's HTTP server, or a WMS service (not the RB)? • 2 FTE*months (JRA1/UI)? • Encourage output sandboxes away from the RB (SA1?) • User storage on an SE • “Tactical” SE dedicated to the RB • VO to provide a solution? • CREAM: simpler, lighter, faster?
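
As a sketch of the “sandboxes away from the RB” idea at the JDL level (host names and paths below are hypothetical): the input sandbox is pulled from a gsiftp URL and the output sandbox is pushed to one, so neither transits the RB disk.

    [
      Executable = "analysis.sh";
      StdOutput = "std.out";
      // input sandbox fetched from a gridftp server instead of the RB
      InputSandbox = {"gsiftp://se.example.org/grid/user/analysis.sh"};
      // output sandbox delivered directly to user storage, not the RB
      OutputSandbox = {"std.out"};
      OutputSandboxDestURI = {"gsiftp://se.example.org/grid/user/std.out"};
    ]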

  8. WMS (RB) high availability • Hot standby (303) • “Standard” Linux HA: available now • Two machines; one dies, the other takes over • Multiple RBs plus a network file system: • N RBs using an NFS/AFS shared disk; hot-swap RBs to replace failed ones with the same name, IP, certs, status and jobs within minutes • 1 FTE*month (JRA1/WM) + N (~3) months of testing • One network alias for a pool of RBs (302) • Design required • Major issue • No estimate

  9. Job failure tracking/reduction • Prologue (in gLite 3.1 now) (user's sanity checks; shallow resubmission on failure) • Epilogue (same logic, semantics to be defined)?? • 1 FTE*month (JRA1/WM) (once defined) • Sanity checks in the job wrapper (from the WMS point of view, e.g. gridftp to the RB, to prevent the Maradona error) • 1 FTE*month (JRA1/WM) + testing • Misconfiguration report • When a sanity check fails, drop a line to someone • Design required; SA1 to be involved • 1 FTE*month (JRA1/WM) + testing • Better error reporting from the UI (in gLite 3.1 now) • CREAM: removes the GRAM pains, better maintainability
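
A minimal JDL sketch of the prologue/epilogue idea: Prologue is in the gLite 3.1 tag per the slide, while Epilogue is shown only to illustrate the proposed symmetric semantics; the script names are hypothetical.

    [
      Executable = "run.sh";
      InputSandbox = {"run.sh", "check_site.sh", "report_status.sh"};
      // user sanity check executed before the payload; a non-zero exit
      // can trigger shallow resubmission on another CE
      Prologue = "check_site.sh";
      // proposed post-exec hook, semantics still to be defined
      Epilogue = "report_status.sh";
      ShallowRetryCount = 3;
    ]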

  10. Job accounting • DGAS + APEL • All in 3.1 • High priority • No new development required • APEL already extensively used on most sites • DGAS already in use at INFN sites • Both, however, are used without the mods for the gLite CE

  11. “VOMS”: scheduling & priorities • Shorter latencies for shorter jobs (307) • To be studied; possibly solved with requirements passthrough via BLAH and/or VOView • VOView (GLUE 1.2) (in 3.1, high priority) • Pass requirements to batch systems • Need an adaptor for PBS, MAUI, BQS… • 1 FTE*month JRA1/RB for each (?) • G-Pbox (in 3.1, after VOView)
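
With VOView in the information system, per-VO queue state becomes visible to matchmaking. A sketch of a short job steering itself through the Requirements expression, using attribute names from the GLUE 1.x schema (the thresholds are illustrative):

    [
      Executable = "short_job.sh";
      InputSandbox = {"short_job.sh"};
      // with VOView these values are published per VO, not per queue
      Requirements = other.GlueCEStateFreeCPUs > 0
                  && other.GlueCEPolicyMaxWallClockTime >= 30;
    ]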

  12. SDJ • JDL mods in UI: done • SDJ CE: effort = question for SA3 • Optional config for willing sites; should not be a serious problem • glogin: effort = SA3 • Security/deployment issue • No way to limit usage once advertised • Very dangerous • Should probably not encourage use by deploying it on the UI • GRAM fork queue is on the “to be removed” list anyhow
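
The “JDL mods in UI” for SDJ amount to a single flag; a minimal sketch, assuming the ShortDeadlineJob attribute as proposed for gLite, which the RB uses to match only queues at sites that opted in:

    [
      Executable = "quick_scan.sh";
      InputSandbox = {"quick_scan.sh"};
      // marks the job for short-deadline queues at sites that opted in
      ShortDeadlineJob = true;
    ]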

  13. MPI • ~All in 3.0 + 3.1 as far as we can tell • Prologue+epilogue to provide pre/post-exec for MPI; epilogue still to be put in place (see job failures, slide 9) • But ATLAS and CMS have no MPI experts: input from other NA4 communities may be needed here…
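
For reference, the classic JDL shape of an MPI job in the 3.0/3.1 line, with JobType and NodeNumber as in the LCG/gLite user guides (the binary and input names are placeholders):

    [
      JobType = "MPICH";
      // number of WN slots requested for the parallel job
      NodeNumber = 8;
      Executable = "mpi_app";
      Arguments = "input.dat";
      InputSandbox = {"mpi_app", "input.dat"};
    ]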

  14. Friendliness / miscellanea • Smarter UI round robin (301) • 1 FTE*month JRA1/UI (clear specs needed) • File perusal on the WN directory (partly in 3.0) • Interactive access to running jobs (top, ls) (310) • 1 FTE*month JRA1/RB for partial functionality • Undefined for full functionality (needs design) • (VO solution?) • 311/312 = CEMon+CREAM • Code ready (JRA1/CREAM) • Need a place where users can look at it
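
File perusal, the “partly in 3.0” item above, is driven from the JDL plus a UI command; a minimal sketch, assuming the PerusalFileEnable/PerusalTimeInterval attributes documented for the gLite WMS:

    [
      Executable = "long_job.sh";
      InputSandbox = {"long_job.sh"};
      StdOutput = "std.out";
      // let the UI fetch snapshots of selected files while the job runs
      PerusalFileEnable = true;
      // seconds between snapshots (sites may enforce a minimum)
      PerusalTimeInterval = 120;
    ]

The snapshots are then retrieved from the UI with the glite-wms-job-perusal command.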

  15. Simplifications • Items to be removed from the “Markus list” because they are absorbed into others • 305 • 309 • 409

  16. Priorities • gLite 3.0 to work • gLite RB single jobs to LCG CE • gLite RB shallow resubmission • MPI support • gLite RB bulk submission • gLite CE • Add from current gLite 3.1 tag (next slide) • New development (next-next slide)

  17. 3.1 priorities for test/integration • JRA1/security (glexec) • JRA1/UI • SDJ support • MaxCPUperJobs • Better error reporting • Sandboxes by URL • JRA1/RB • New Condor • Information pass-through to LRMS?? • JRA1/CE-BLAH • Support for accounting • JRA1/WM • GLUE 1.2 (VOView) • Job prologue (also UI) • Shallow resubmission of DAGs • Independent groups in JRA1 are involved and could proceed in parallel. • From the integration (SA3) standpoint this means: • DGAS+APEL first • Then the gLite 3.1 WMS RB • New Condor in the RB and the gLite CE as soon as available

  18. 3.1 not ready for PPS • CREAM: never exposed to users • JobProvenance: never exposed to users; do VOs really (still) want it? It will anyhow improve LB performance • G-Pbox: use case to be understood, especially with VOView in; very limited exposure to users • Information pass-through to LRMS: will be available, but until usage is better understood it may be in flux and need evolution • Further thinking needed: NEED A PREVIEW TEST BED, CAN'T HAVE PPS ONLY

  19. WMS development threads (prioritized; M = FTE*months) • JRA1/UI: 1M better RR, 1M epilogue, 1M (Squid)? • JRA1/WM: 1M (HA), 1M (sanity checks), 1M epilogue, 1M (misconf. report), 6M (bulk mm) • JRA1/RB+CE: 1~3M requirements to LRMS, 1M epilogue, 1M file perusal • JRA1/WMS longer term (not really serial): CREAM integration, top/ps/ls on running jobs, G-Pbox, DNS alias plus scalable pool (?)
