
HEPiX/HEPNT report



  1. HEPiX/HEPNT report
     Helge Meinhard, Alberto Pace, Denise Heagerty / CERN-IT
     Computing Seminar, 05 November 2003

  2. HEPiX/HEPNT Autumn 2003 (1)
     • Held 20 – 24 October at TRIUMF, Vancouver
     • Format:
       • Mon – Wed: Site reports, HEPiX and HEPNT talks
       • Thu: Large Cluster SIG on security issues
       • Fri am: Parallel sessions on storage, security and Windows issues
     • Excellent organisation by Corrie Kost / TRIUMF
     • Weather not too tempting to skip sessions
     • Full details: http://www.triumf.ca/hepix2003/

  3. HEPiX/HEPNT Autumn 2003 (2)
     • 76 participants, of which 11 from CERN
       • Barring, Durand, Heagerty, Iven, Kleinwort, Lopienski, Meinhard, Neilson, Pace, Silverman, D Smith
     • 59 talks, of which 19 from CERN
     • Vendor presence (Ibrix, Panasas, RedHat, Microsoft)
     • Friday pm: WestGrid
     • Next meetings:
       • Spring: 24 – 28 May in Edinburgh
       • Autumn: BNL expressing interest

  4. Highlights
     • Unix-related (me)
     • Windows-related (Alberto Pace)
     • Security-related (Denise Heagerty)

  5. Site reports: Hardware (1)
     • Major investments: Xeons, Solaris, IBM SP, Athlon MP
     • Disappointing experience with Hyper-Threading (HT)
     • Increasing interest:
       • Blades (e.g. WestGrid – 14 blades with 2 x Xeon 3.06 GHz each in a 7U chassis; see the density sketch after this slide)
       • AMD Opteron
     • US sites require cluster mgmt software with HW acquisitions
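
  The WestGrid blade figures above illustrate why the physical limits on the
  next slide are becoming pressing. A minimal back-of-envelope sketch in
  Python, assuming a standard 42U rack and a rough per-CPU power draw
  (neither figure is from the talk):

    blades_per_chassis = 14
    units_per_chassis = 7
    rack_units = 42          # standard full-height rack (assumption)
    watts_per_cpu = 80       # rough draw of a 3.06 GHz Xeon (assumption)

    chassis_per_rack = rack_units // units_per_chassis           # 6 chassis
    cpus_per_rack = chassis_per_rack * blades_per_chassis * 2    # 168 CPUs
    kw_per_rack = cpus_per_rack * watts_per_cpu / 1000           # ~13.4 kW
    print(f"{cpus_per_rack} CPUs per rack, ~{kw_per_rack:.1f} kW to power and cool")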

  6. Site reports: Hardware (2)
     • Physical limits becoming ever more important:
       • Floor space
       • UPS
       • Cooling power
       • Weight capacity per unit of floor space
     • Disk storage:
       • Some reports of bad experience with IDE-based file servers
       • No clear tendency

  7. Site reports: Software (1)
     • RedHat 6.x diminishing, but still in production use at many sites
     • Solaris 9 being rolled out
     • Multiple compilers needed on Linux (IN2P3: 6), but not considered a big problem
     • SLAC looking at Solaris/x86
     • AFS not considered a problem at all
     • SLAC organising a ‘best practices’ workshop (complementing LISA and USENIX workshops) – see http://www.slac.stanford.edu/~alfw/OpenAFS_Best.pdf

  8. Site reports: Software (2)
     • NFS in use at large scale
     • Kerberos 5: no clear preference for MIT vs. Heimdal vs. Microsoft; lots of home-grown glue in use to keep the realms synchronised
     • Reports about migrating away from Remedy
     • DESY and GSI happy with SuSE and Debian (except for laptops)
     • Condor getting more popular, considered as an LSF replacement (see the submit-file sketch after this slide); Sun GridEngine mentioned as well
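
  For context on the Condor interest: Condor jobs are described in a submit
  file rather than through bsub-style command options as in LSF. A minimal
  sketch follows; the keywords are standard Condor submit syntax, but the
  executable name and file paths are invented for illustration.

    # job.sub – hypothetical Condor submit description file
    universe   = vanilla                  # plain serial job
    executable = simulate                 # user program (invented name)
    arguments  = --events 1000
    output     = simulate.$(Process).out  # stdout of each job instance
    error      = simulate.$(Process).err  # stderr of each job instance
    log        = simulate.log             # Condor's own event log
    queue 10                              # submit 10 instances

  The file would then be submitted with: condor_submit job.sub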

  9. CERN talks
     • Castor evolution (Durand)
     • Fabric mgmt tools (Kleinwort)
     • CVS status and tools (Lopienski)
     • Solaris service update (Lopienski)
     • Console management (Meinhard)
     • ADC tests and benchmarks (Iven)
     • New HEPiX scripts (Iven)
     • LCG deployment status and issues (Neilson)
     • LCG scalability issues (D Smith)
     • Windows and/or security related talks

  10. RedHat support (1)
      • Tue: talk by Don Langley / RedHat
      • Described the new model and the technical features of RHEL 3, released the day after
      • RHEL releases every 12 – 18 months, with guaranteed support for 5 years
      • Yearly subscriptions (per machine) grant access to sources, binaries, updates, and to support (at different levels)
      • Said that RedHat would be able to find the right model for HEP
      • Reactions: not everyone was convinced; no clear commitment to react to our needs; not at the right level

  11. RedHat support (2)
      • Wed: interactive discussion
      • Labs currently using RedHat wish to stay and go for RHEL; a HEP-wide agreement is preferred
      • The high level of HEP-internal Linux support must be taken into account by RedHat
      • HEP- or site-wide licences much preferred over per-node licences
      • SLAC, FNAL and CERN to contact RedHat jointly in order to negotiate for HEP
      • Other HEP sites should be able to join if they so wish

  12. Other highlights (1)
      • PDSF Host Database project (Shane Canon)
        • Inventory mgmt, purchase information, technical details, connectivity, …
        • Similar objectives to some combination of BIS/CDB, HMS, LanDB, … (see the schema sketch after this slide)
      • Unix and AFS backup at FNAL (Jack Schmidt)
        • Investigated TSM, Veritas, Amanda, and some smaller vendors
        • Decided to go for TiBS (True incremental Backup System, a Carnegie Mellon offspring) – 1.6 TB in 5 hours
        • Large disk cache of backup data on the server
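
  The PDSF talk described objectives rather than an implementation. Purely
  as an illustration of the kind of data such a host database ties together,
  here is a hypothetical SQL sketch (all table and column names invented):

    -- Hypothetical host-database schema; names and columns are invented.
    CREATE TABLE host (
        hostname      VARCHAR(64) PRIMARY KEY,
        serial_number VARCHAR(32),   -- inventory mgmt
        purchase_date DATE,          -- purchase information
        cpu_model     VARCHAR(64),   -- technical details
        ram_mb        INTEGER
    );

    CREATE TABLE network_interface ( -- connectivity
        hostname    VARCHAR(64) REFERENCES host(hostname),
        mac_address CHAR(17),
        ip_address  VARCHAR(15),
        switch_port VARCHAR(32),
        PRIMARY KEY (hostname, mac_address)
    );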

  13. Other highlights (2)
      • Mosix and PBS clustering at TRIUMF (Steve McDonald)
        • Challenge: provide interactive and batch services on a small budget
        • 7 dual-processor systems; three run OpenMosix all the time (one of them serving as head node), the rest run OpenPBS while jobs are present and migrate to Mosix when there are none (see the sketch after this slide)
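
  A minimal sketch of such a mode switch, run periodically on each dual-use
  node. It assumes OpenPBS's qstat and openMosix's mosctl utilities are
  installed; the exact mosctl subcommands varied between openMosix versions,
  so treat them as assumptions rather than as TRIUMF's actual setup.

    #!/usr/bin/env python
    # Hypothetical PBS/Mosix mode switch (e.g. run from cron).
    import subprocess

    def pbs_jobs_running() -> bool:
        """True if qstat -r reports any running PBS job."""
        out = subprocess.run(["qstat", "-r"], capture_output=True, text=True)
        return " R " in out.stdout

    if pbs_jobs_running():
        subprocess.run(["mosctl", "block"])    # refuse incoming migrated processes
        subprocess.run(["mosctl", "expel"])    # send guest processes back home
    else:
        subprocess.run(["mosctl", "noblock"])  # rejoin the Mosix pool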

  14. Mass storage workshop
      • A meta-meeting: discussed what and how to discuss
      • Joining in by VRVS: FNAL, RAL, IN2P3, DESY, FZK, …
      • Launched a forum for mass storage systems (MSS) and their interoperability
        • E-mail list: hep-forum-mss@cern.ch
        • Each site to report (to Don Petravick / FNAL) about capabilities and needs concerning WAN interfaces, security, monitoring and protocols, file transfer protocols, mgmt protocols, and replica systems
      • Next meeting: VRVS conference in December
      • Next HEPiX: the LCSIG will be on storage

  15. My personal comments
      • Excellent means of building relationships of trust between centres
      • No impression of cheating by anybody
      • Clear, concrete steps towards sharing tools…
        • LAL using the CERN printer wizard
        • CERN using SLAC console software
        • A lot of interest in ELFms
      • … and even when not sharing implementations, sharing ideas and information is very beneficial
