Explore the current and future LCG deployment in the UK, covering Tier 1 and Tier 2 centers, user support, middleware development, network operations, and security measures. Learn about the GridPP initiative and the integration with international projects like EGEE.
LCG Deployment in the UK John Gordon GridPP10
You’ve heard about LCG… • … so what’s happening in the UK? • LCG Deployment, now and future • The wider UK picture • ….and what’s this EGEE? • The Plan
A. Management Structure (in LCG context) • [Organisation chart: CB and PMB, linked to LCG, ARDA, EGEE and the experiments • Deployment Board: Tier1/Tier2, testbeds, rollout; service specification & provision • User Board: requirements; application development; user feedback • Areas: Metadata, Storage, Workload, Network, Security, Info. Mon.]
Recent LCG • [Map: ScotGrid, NorthGrid, SouthGrid, London Grid] • Tier1 + 10 other sites • DCs • Tier2 structure • Support structure • GOC Monitoring • LCG Accounting
GridPP Summary: From Prototype to Production • [Timeline diagram, 2001 → 2004 → 2007] • Separate Experiments, Resources, Multiple Accounts: CERN Computer Centre, RAL Computer Centre, 19 UK Institutes • Prototype Grids: CERN Prototype Tier-0 Centre, UK Prototype Tier-1/A Centre, 4 UK Prototype Tier-2 Centres • 'One' Production Grid: CERN Tier-0 Centre, UK Tier-1/A Centre, 4 UK Tier-2 Centres • Experiments and projects shown: BaBar, CDF, D0, ATLAS, LHCb, ALICE, CMS, BaBarGrid, SAMGrid, EDG, GANGA, ARDA, LCG, EGEE
Vision • GridPP2 should deliver a production quality grid • Meeting the computing needs of UK Particle Physics • Autonomous and self-supporting with its own identity • Participating in LCG, EGEE, BaBarGrid, SAMGrid, and any others desired by its members • Part of an integrated UK Grid • Independent but integrated, separate but seamless
Delivery Plans • Keep up with LCG • Participate in LHC Data Challenges • TierA for BaBar and BaBarGrid • Participate in LCG Service Challenges • Use by other VOs • Put in place the structure to deliver this • …..and more
Production Team • Deployment • User Support • Middleware Support • Applications Support • Network Support • Security • Operations
UK Tier-2 Centres • NorthGrid ****: Daresbury, Lancaster, Liverpool, Manchester, Sheffield • SouthGrid *: Birmingham, Bristol, Cambridge, Oxford, RAL PPD, Warwick • ScotGrid *: Durham, Edinburgh, Glasgow • LondonGrid ***: Brunel, Imperial, QMUL, RHUL, UCL • Current UK Status: 11 Sites via LCG
Tier2 Centres • [Map: ScotGrid, NorthGrid, SouthGrid, London Grid] • UK model of distributed Tier2 Centres • Managerial and organisational ‘centre’ • Tier2 is free to organise internally • so I cannot describe it yet • Tier2 is smaller than an EGEE Region • but some aspects of the model may be useful (their own VO? own RB?) • May hide some of the internal structure (CE, GIIS?)
Deployment • A Team to roll out software across UK • Software release certification, installation support, site certification • Specialist support for sysadmins • Consists of staff from T1 + T2
User Support • Migrate from mailing list to problem-tracking • From sysadmin support to user support • Managed Helpdesk • for assignment, tracking, escalation • We already have a lot of experience • but we haven’t yet encapsulated it in FAQs etc.
Middleware, Security and Network Development • [Diagram: three areas (Security, Middleware, Networking) covering Network Monitoring, Configuration Management, Grid Data Management, Storage Interfaces, Information Services, Security] • M/S/N builds upon UK strengths as part of international development
Middleware Support • GridPP2 Middleware development should have an emphasis on delivery and support • Middleware teams should support their software area • T2 assigned 5 specialist support posts • Integrate support effort into Production Team
Applications Support • Stephen Burke – roaming support • 2 T1 experiment-facing people • UK experiments • Get deployment and middleware support working with experiments • to ensure successful UK involvement in experiments’ use of Grid.
Network Support • Mark Leese (CCLRC-DL) • Rolled out network monitoring to UK Core e-Science programme • GridPP2 role in network support • Network optimisation • Participation in service challenges • Hopefully using lightpaths
Security • New Security Officer (to be appointed) • Security operations • Consultants • Kelsey - Joint EGEE-LCG Security • Jensen – technical advice to CA/ middleware • McNab – e-Science Security Centre • Track UK developments (Permis, Shibboleth)
GOC • Secure Database Management via HTTPS / X.509 • [Diagram: GOC GridSite and MySQL; Monitoring; Resource Centre (RC) Resources & Site Information for EDG, LCG-1, LCG-2, …; services: bdii, ce, se, rb]
Operations • LCG Operations centre • EGEE ROC • Monitor GridPP (and NGS and GridIreland) • Developed tools for LCG, reuse for GridPP • Continue developing for EGEE • EGEE CIC running grid-wide services • Accounting
Wider Support • GSC • UK helpdesk • UK E-Science CA • Training • Our own and EGEE(NeSC)
Other UK Grids • NGS • National Grid Service • 4 large clusters + 2 UK Supercomputers • Already using VDT and BDII • ETF • Developing UK OGSA/WSRF Grid • UK Grid Operations Centre Director • Speaking next • Should all be part of EGEE
EGEE • UK/I Region in EGEE covers GridPP, NGS, and Grid Ireland – one of 10 regions • EGEE’s aim is to integrate national grids • Not to interfere or impose limits on them • All of the work I have described, short of actually running the Resource Centres, is EGEE work • Many sites are actually signed up to EGEE so we can report it formally as such • Many of you will be asked to report work to EGEE (timesheets, quarterly reports) but this shouldn’t be an imposition • The development of GridPP will be aligned with EGEE • But EGEE is not well defined, so we plan GridPP and participate in the developing EGEE to learn, adopt, and influence.
EGEE Issues • EGEE=LCG? • non-European sites in LCG • non-LCG sites in EGEE • Platform Support • non-Linux, free Linux (cf. RHEL) • Integrated user support • Support for new VOs • Security, security, security
The Next Steps • Just appointed Jeremy Coles • as GridPP Production Manager • Grid Definition • define GridPP • get buy-in of stakeholders • Production Team • build the team • Workplan
Production Manager Tasks • Develop work plan (deliverables/milestones) • Compile problems and issues list (implement tracking) • Organise a GridPP deployment group workshop • Better establish GridPP identity – address UK specific needs • Review/develop operating procedures to maintain GridPP service • Get GridPP more involved at UK/experiment software meetings • Coordinate UK Tier-2 resource input to LCG and EGEE • Work with other grids to establish a single production grid.
Running a production service: areas to be reviewed and developed • Main areas to be considered (transparency, control, accountability, security, improvement) • Grid accounting • Who needs to know what and in what form? Where are the gaps in LCG accounting? • Grid monitoring • Service-level management tools. Efficiency of resource usage. Replication issues. • Detailed metrics to be agreed • Real-time notification and problem resolution • Management & reporting • Grid management: VO setup procedures; adding new Tier-2 resources • Frequency, structure and content of reports to be agreed (e.g. resource usage, job success rates against targets) • Security • Processes and procedures (e.g. incident handling) • Mechanics of trust model defined: identity, privacy, policy and authority (e.g. how are rights revoked? Appeals.) • Misuse of resources (intrusion), user & usage audits • Support • Installation (joining) requirements/guidelines • Integration & helpdesk requirements • Library – deployment documentation. User feedback – mechanism to inform future developments • Training • For new GridPP users and new operations staff • Middleware release strategy (and stabilisation!) • Tier-2 management • Service levels (SLAs/MoUs to be developed) • Resource, quota and priority handling • Resource • Maintenance plans • Audit • Of Grid usage by user/VO
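The reporting item above mentions job success rates against targets. As a minimal sketch of what such a report could compute (the record format, the VO names used as sample data, the target values, and the function names `success_rates` and `below_target` are all assumptions for illustration, not part of any LCG or GridPP tool):

```python
from collections import defaultdict


def success_rates(jobs):
    """Return {vo: fraction of jobs that succeeded} from (vo, succeeded) records."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for vo, succeeded in jobs:
        totals[vo] += 1
        if succeeded:
            passed[vo] += 1
    return {vo: passed[vo] / totals[vo] for vo in totals}


def below_target(rates, targets, default_target=0.9):
    """List VOs whose success rate is under the agreed target, sorted by name."""
    return sorted(
        vo for vo, rate in rates.items()
        if rate < targets.get(vo, default_target)
    )


# Illustrative records and targets only; these are not real usage figures.
jobs = [
    ("atlas", True), ("atlas", True), ("atlas", False), ("atlas", True),
    ("babar", True), ("babar", False),
    ("lhcb", True), ("lhcb", True),
]
targets = {"atlas": 0.7, "babar": 0.8, "lhcb": 0.95}

rates = success_rates(jobs)
flagged = below_target(rates, targets)
print(rates)
print(flagged)
```

The per-VO grouping matches the audit item (Grid usage by user/VO). The actual metrics and targets are still listed as to be agreed, so the thresholds here are placeholders.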
Vision • GridPP2 should deliver a production quality grid • Meeting the computing needs of UK Particle Physics • Autonomous and self-supporting with its own identity • Participating in LCG, EGEE, BaBarGrid, SAMGrid, and any others desired by its members • Part of an integrated UK Grid • Independent but integrated, separate but seamless
Challenge • LCG has given us a good base • We now have a critical mass based on LCG2 • Make it a production-quality grid • Attract the satellite grids (UKQCD, BaBar) • And bring in other experiments • Participate fully in LCG and EGEE • Without alienating non-LHC experiments
Can we do it? Yes, we can!