This document reflects on the DELPHI experiment's significant transition from IBM mainframe systems to RISC/UNIX computing platforms, which took place around 1996. It covers the challenges faced, such as limited VAX resources for physics analysis, and the various tasks undertaken to facilitate the migration, including cross-compilation, software certification, and user support. The retrospective draws on personal notes from the era, acknowledging potential errors while highlighting key teams and motivations during the transition.
DELPHI experience • I left the collaboration in 1997, so I won’t talk about data preservation but about another costly effort • Change of offline compute platform from IBM mainframe to RISC/UNIX • Caveat: this recollection covers a period 20 years ago and is reconstructed from my own notes and mails from that period. Any error is entirely mine…
Compute platform • Up to ~1992 the main computing platform @CERN was the IBM mainframe CERNVM • As a physicist one quickly learned about • VM/CMS, REXX, XEDIT • DAQ and production learned VAX/VMS • All offline code cross-compiled and certified • YPATCHY with CARds and CRAdles to control the versioning and platform differences • DELPHI’s own farms in p8 and the basement of bat. 27 were entirely VAX-based, at least to begin with • But for normal physics analysis there were never enough VAX compute resources to motivate a serious take-up
DELPHI offline in numbers • 16 sub-detectors • ~1 Hz trigger rate, average event size 50 kB, peak event size 200 kB (1990) • Yearly total DST (Data Summary Tape) volume • 1989: 3x IBM 3480 cartridges • 1990: 38x • 1991: 146x • Source code was pretty much all FORTRAN 77, formatted for punch-card compatibility • Main packages • DELSIM: 0.5-1M SLOC (DELPHI didn’t use Geant3) • DELANA: >1M SLOC
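For scale (my own conversion, assuming the nominal ~200 MB capacity of an uncompressed IBM 3480 cartridge): 3 cartridges ≈ 0.6 GB (1989), 38 ≈ 7.6 GB (1990), 146 ≈ 29 GB (1991).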
Migrate to RISC/UNIX • Compute cycles on CERNVM were precious • If you wanted to run a >1 hr job • go to the offline computing office and give your justification • Priority was given to: • Higgs search and Z0 resonance precision measurements • QCD studies (like mine) were queued behind • Strong motivation to seek other resources
Benchmark that convinced me • 2 × 60 = 120 CPU s/event • 2 × 38 × 24 × 60 × 60 / 55 000 ≈ 119 CPU s/event
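My reading of these two numbers, reconstructed rather than taken from the slide: the mainframe cost was about 2 CPU minutes (120 CPU s) per event, while two RISC workstations left running for 38 days (24 × 60 × 60 s per day) got through 55 000 events at roughly the same ≈119 CPU s/event — i.e. an equivalent per-event CPU cost, but on hardware that was actually available for the analysis.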
Moving to RISC/UNIX • Core teams (~1993): • DELPHI • Alfonso Lopez-Fernandez (CERN) (later Gilbert Grosdidier) • D. M. Edsall (Iowa State University Ames HEP Group) • Luiz Mundim (LAFEX-CBPF/PUC) • O. Bärring (CERN) • CN (IT) SHIFT • Frédéric Hemmer • Christian Boissat • CN Public Login (+AFS): Tony Cass • Tasks • Create an attractive and reliable compute and storage resource • More for less ($/IBM cycle vs. $/RISC cycle) • Cross-compile and certify • Mirror production and analysis software repos • Nightly builds + tests (see the sketch after this list) • Facilitate migration (REXX → sh, csh) • Write targeted user guides (UNIX, SHIFT, FATMEN, …) • User support, public presentations and publicity • Rough cost: • Hardware (provided by CN): negligible (probably ~10^-2 x IBM) • People: DELPHI ~6 person-years for the full migration • Dedicated efforts completed in ~1995-96. Daily tasks (support etc.) integrated with general offline activities
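To make the “nightly builds + tests” item concrete, here is a minimal sketch of what such a certification loop could have looked like on the UNIX side; the package names, paths, test driver and mail recipient are hypothetical placeholders, not DELPHI’s actual scripts:

    #!/bin/sh
    # Hypothetical nightly certification loop (illustration only, not DELPHI's actual scripts).
    # For each package: rebuild from the mirrored source tree, run a short reference job,
    # and compare its output against the certified result from the IBM side.
    DATE=`date +%Y%m%d`
    LOG=$HOME/nightly/$DATE.log
    for pkg in delsim delana                        # package names are placeholders
    do
      cd $HOME/src/$pkg || exit 1
      make clean all >> $LOG 2>&1 || { echo "$pkg: build FAILED" >> $LOG; continue; }
      ./run_test_job.sh > $pkg.out 2>&1             # hypothetical test driver
      if diff -q $pkg.out $HOME/ref/$pkg.ref >> $LOG 2>&1
      then
        echo "$pkg: output matches certified reference" >> $LOG
      else
        echo "$pkg: output DIFFERS from reference" >> $LOG
      fi
    done
    mail -s "nightly build report $DATE" offline-support < $LOG   # recipient is a placeholder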