
Tier1A Status

Andrew Sansum, 30 January 2003


Presentation Transcript


  1. Tier1A Status. Andrew Sansum, 30 January 2003

  2. Overview
     • Systems
     • Staff
     • Projects

  3. Lots of Services
     • Testbeds
     • CPU Farm
     • CDF
     • Babar Suns
     • AFS
     • Disk Farm
     • Datastore
     • Core Services
     • Support Systems

  4. Lots of Operating Systems
     • Production Farm
       • Redhat 6.2 (close to end of life)
       • Redhat 7.2 (in production / Babar)
       • Redhat 7.3 (close to trial service, for LHC)
     • CDF Service
       • Redhat 7.1 (Kerberised Fermi distribution)
       • Redhat 7.3 (possible future release)
     • Solaris Service
       • Solaris 2.6 / Solaris 8
     • EDG Testbed(s) - Redhat 6.2 -> Redhat 7.3

  5. Lots of EDG Testbeds!
     • Production Testbed (CE, SE, 3*WN+NM)
     • Development Testbed (CE, SE, 1*WN)
     • RGMA Testbed (CE, SE, WN and RB)
     • WP5 SE
     • WP3/WP5 development systems
     • EDG UI
     • CE for Redhat 7.2 service

  6. Lots of Grid Testbeds! (diagram labels: Babar, Tier1A, SAMGrid)

  7. New Hardware
     • Disk
       • Expect 40 TB
       • Continue with existing IDE technology, but from a different manufacturer
     • CPU
       • Expect 100 CPUs
       • Move to Pentium 4 or possibly AMD

  8. Some New Staff
     • Users
     • Experiment Support Staff (RAL and elsewhere)
     • GridPP Staff: Traylen, Radden, Bly
     • ESC/PPD System Staff: Wheeler, White, Sansum, Saunders, Ross, Folkes, Strong
     • Management: Kelsey, Gordon, Sansum, ...
     • BITD Support: Networking, Operations, User Reg, AFS

  9. Lots of New Projects
     • Basic fabric performance monitoring (ganglia)
     • Resource CPU accounting (based on PBS accounts/mysql)
     • New CA in production
     • New batch scheduler (MAUI)
     • Deploy new helpdesk (end March)
     • Network performance tests (CERN/Bristol - also maybe WP7)
     • Get ready for LCG (February deployment?)
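The CPU accounting item above can be illustrated with a minimal sketch. It assumes the usual PBS accounting record layout (`date time;record type;job id;key=value ...`, with `E` marking a finished job) and simply totals `resources_used.cput` per user in memory. The sample records, user names and host name are hypothetical, and the MySQL loading step mentioned on the slide is not shown.

```python
from collections import defaultdict


def hms_to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_seconds_by_user(lines):
    """Sum CPU time per user over job-end ('E') accounting records."""
    totals = defaultdict(int)
    for line in lines:
        # Assumed layout: timestamp;record_type;job_id;space-separated key=value pairs
        parts = line.strip().split(";", 3)
        if len(parts) != 4 or parts[1] != "E":
            continue  # skip malformed lines and non-end records
        fields = dict(
            item.split("=", 1) for item in parts[3].split() if "=" in item
        )
        user = fields.get("user")
        cput = fields.get("resources_used.cput")
        if user and cput:
            totals[user] += hms_to_seconds(cput)
    return dict(totals)


# Hypothetical sample records, for illustration only.
SAMPLE = [
    "01/30/2003 10:00:00;E;101.batch.example;user=alice queue=prod "
    "resources_used.cput=01:30:00 resources_used.walltime=02:00:00",
    "01/30/2003 11:00:00;E;102.batch.example;user=bob queue=prod "
    "resources_used.cput=00:10:30 resources_used.walltime=00:15:00",
    "01/30/2003 11:05:00;S;103.batch.example;user=alice queue=prod",
    "01/30/2003 12:00:00;E;104.batch.example;user=alice queue=prod "
    "resources_used.cput=00:30:00 resources_used.walltime=00:45:00",
]

for user, seconds in sorted(cpu_seconds_by_user(SAMPLE).items()):
    print(f"{user}: {seconds / 3600:.2f} CPU hours")
```

Only end-of-job records are counted, so a job that is started but never finishes contributes nothing to the totals.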

  10. Ganglia Monitoring
     • Urgently needed live performance and utilisation monitoring
     • RAL Ganglia Monitoring (live)
     • RAL Ganglia Monitoring (static)
     • Scalable solution based on multicast
     • Very rapidly deployable - reasonable support on all Tier1A hardware
     • See: http://ganglia.sourceforge.net/

  11. New CA Deployed
     • Now fully deployed by E-Science Centre (Jens + Alastair Mills)
     • In use in UK core GRID
     • Several PP have RAs defined
     • Approved by EDG - not yet in distribution
     • Once in EDG - termination date for old CA will be set

  12. New Scheduler (MAUI)
     • With Redhat 7.2, now using MAUI scheduler over PBS
     • Some problems with MAUI scheduling on wallclock time - now corrected
     • Testing algorithms, but essentially have a range of strategies we can apply
     • Will make changes to queue structure in due course

  13. New Helpdesk Software
     • Old helpdesk (Remedy) - mail based, unfriendly
     • With additional staff, urgently need to deploy new solution
     • Expect new system to be based on free software (Bugzilla, Request Tracker …)
     • Hope that deployed system will also meet needs of Testbed and Tier 2 sites
     • Expect deployment by end of March

  14. Network Performance Tests
     • Simon Metson, Nick White, +….
     • Preparing for CMS production. Must be able to move data to CERN at 100-200 Mbit/s
     • Currently aggregate 350 Mbit/s to Bristol - but under 100 Mbit/s to CERN
     • Main problem seems to be within CMS infrastructure
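To put the rates above in perspective, here is a small arithmetic sketch of how long a transfer takes at each of them. The 1 TB dataset size is a hypothetical figure chosen for illustration, decimal units are assumed (1 TB = 10^12 bytes, 1 Mbit = 10^6 bits), and protocol overhead is ignored.

```python
def transfer_hours(data_terabytes: float, rate_mbit_per_s: float) -> float:
    """Hours needed to move data_terabytes at a sustained rate in Mbit/s."""
    bits = data_terabytes * 1e12 * 8           # decimal terabytes to bits
    seconds = bits / (rate_mbit_per_s * 1e6)   # Mbit/s to bit/s
    return seconds / 3600.0


# Rates quoted on the slide; the 1 TB dataset is a hypothetical example.
for rate in (100, 200, 350):
    print(f"1 TB at {rate} Mbit/s: {transfer_hours(1.0, rate):.1f} hours")
```

At a sustained 100 Mbit/s, each terabyte takes roughly 22 hours, and the time halves at 200 Mbit/s.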

  15. BaBar Batch CPU Use at RAL MOU

  16. Successes (2002)
     • Five additional staff online since January 2002
     • Fully engaged in EDG testbed. Making an impact in EDG: Steve
     • Tier1A installation went very well in March/April/May
     • Tier A service ramp up excellent:
       • Most successful of the Tier A services. SLAC seem pleased - so far.

  17. Challenges
     • Complete 2002/2003 tender/deployment
     • Carry out major EU tenders for 2003/2004
     • Expand use of Tier 1
     • Need to evolve strategy to cope with diversity of requirements
     • Deploy the LCG Testbed (What/When?)
     • Enhance automation / out of hours cover
     • Improve reporting to GridPP - accountability
