
Update on EU DataGrid progress and plans for EGEE

Fabrizio Gagliardi, EU DataGrid Project Leader (Fabrizio.Gagliardi@cern.ch, www.edg.org). Overview: project outline, ATLAS task force, CMS stress test, tutorials, relationships with other grid projects, future directions (EGEE), summary.


Presentation Transcript


  1. Update on EU DataGrid progress and plans for EGEE • Fabrizio Gagliardi, EU DataGrid Project Leader • Fabrizio.Gagliardi@cern.ch • www.edg.org

  2. Overview • Project Outline • ATLAS task force • CMS stress test • Tutorials • Relationships with other grid projects • Future Directions (EGEE) • Summary EDG/EGEE status

  3. The Project • 9.8 M Euros EU funding over 3 years • 90% for middleware and applications (HEP, Earth Obs. and Bio Med.) • Three-year phased developments & demos (2001-2003) • Total of 21 partners • Research and academic institutes as well as industrial companies • Related projects and activities: • DataTAG (2002-2003) • CrossGrid (2002-2004) • GRIDSTART (2002-2004) • Grace (2002-2004)

  4. DataGRID project priorities • After initial middleware development and testbed deployment, effort has been refocused on quality and stability • Quality Policy Statement published: http://eu-datagrid.web.cern.ch/eu-datagrid/WP12/default.htm • List of priorities defined at the project retreat: http://documents.cern.ch/age?a021130 • Followed up at the project conference: http://www.tomiexpress.hu/datagrid/ • Show-stoppers found by users on the application testbed are the highest priority • Incremental improvements to the current release are driven by the needs of the applications (HEPCAL)

  5. ATLAS-EDG Task Force (by Oxana Smirnova) • ATLAS is eager to use Grid tools for the Data Challenges • ATLAS Data Challenges are already on the Grid (NorduGrid, USA) • DC1 phase 2 (to start in October) is expected to use Grid tools to a greater extent • The ATLAS-EDG Task Force was put together in August with two aims: • To assess the usability of the EDG testbed for immediate production tasks • To introduce Grid awareness to the ATLAS collaboration • The Task Force has representatives from both ATLAS and EDG: 40 members (!) on the mailing list, ca. 10 of them working nearly full-time • The initial task: to process 5 input partitions of the Dataset 2000 on the EDG Testbed + one non-EDG site (Karlsruhe); if this works, continue with other datasets

  6. Achievements (by Oxana Smirnova): • A team of hard-working people across Europe • ATLAS software (release 3.2.1) is packaged into relocatable RPMs, distributed and validated elsewhere • The DC1 production script is “gridified” and a submission script has been produced • A user-friendly testbed status monitor is deployed • 5 Dataset 2000 input files are replicated to 5 sites (2 @ each) • After fixing the “long jobs” problem, 50% of the planned challenge has been performed (5 researchers × 10 jobs); unfortunately, only the CERN testbed was fully available • With the rest of the testbed being fixed, jobs are getting scheduled and executed elsewhere • Second test: 4 input files (ca. 400 MB each) replicated to 4 sites; 250 jobs submitted, adjusted to run ca. 4 hours each. The jobs were distributed across the whole testbed by the Resource Broker
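
The job counts on slide 6 can be checked with a few lines of arithmetic. This is only a sketch: it assumes that "50% of the planned challenge" refers to the 5 × 10 jobs quoted in the same bullet, which the slide does not state outright, and the helper name `planned_jobs` is hypothetical.

```python
def planned_jobs(done_jobs: int, done_fraction: float) -> float:
    """Total planned jobs implied by a completed count and a completed fraction."""
    return done_jobs / done_fraction


# First test: 5 researchers x 10 jobs, quoted as 50% of the plan (assumed reading).
first_test_done = 5 * 10
first_test_planned = planned_jobs(first_test_done, 0.5)

# Second test: 250 jobs, each adjusted to run about 4 hours.
second_test_job_hours = 250 * 4

if __name__ == "__main__":
    print(f"first test: {first_test_done} jobs done, {first_test_planned:.0f} implied by the plan")
    print(f"second test: about {second_test_job_hours} job-hours in total")
```

Under that reading the first test covers 50 of roughly 100 planned jobs, and the second test amounts to about 1000 job-hours spread across the testbed.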

  7. Summary (by Oxana Smirnova) • Advantages of the Grid: • Possibility to execute tasks and move files over a distributed computing infrastructure using a single personal certificate (no need to memorize dozens of passwords) • Possibility to distribute the workload adequately and automatically, without logging in explicitly to each remote system • Possibility to do worldwide production in a perfectly coordinated way, using identical software (RPMs), scripts and databases • Where we are now: • Several Grid toolkits are on the market • EDG is probably the most elaborate, but still in development • This development goes much faster with the help of users running real applications • The common efforts of the ATLAS-EDG Task Force proved that it is already possible to execute real tasks on the EDG Testbed • Thanks to all the members for their efforts so far, but there’s more to be done!

  8. CMS/EDG stress test status • Andrea Sciabà, on behalf of the CMS & EDG collaboration • CCS general meeting, December 3, 2002

  9. Sites and resources

  10. CMSIM events vs. time

  11. Current issues • The biggest problems are related to the Information System: • Symptom: no resources are found. Cause: instability of the MDS when it is overloaded. Solution: submitting jobs at a lower rate improves the chances of success • Symptom: the RB gets stuck (no job ever starts). Cause: under investigation • Symptom: grid elements disappear from the II. Cause: services on some machines stopped working. Solution: restart the services • Symptom: timeouts when copying the input sandbox • Symptom: log file lost (“Stdout does not contain useful data”). Cause: several (no free files/inodes, broken connection between CE & RB, …) • Problems related to the replica manager: • Symptom: file registration in the RC fails from time to time

  12. Current issues • None of these problems is a show-stopper, and they affect only a fraction of the jobs! • Fixes already exist for some of them (but are not yet deployed)
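
The workaround listed on slide 11 for the overloaded MDS (submit jobs at a lower rate) can be sketched as a throttled submission loop. This is a minimal illustration, not the task force's actual script: `submit_throttled`, its parameters, and the `submit` callable are all hypothetical, and `submit` stands in for whatever command or API actually hands a job to the Resource Broker.

```python
import time
from typing import Callable, Iterable, List, Tuple


def submit_throttled(
    jobs: Iterable[str],
    submit: Callable[[str], bool],
    min_interval_s: float = 30.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], List[str]]:
    """Submit jobs one at a time, pausing between submissions.

    `submit` is a hypothetical callable that returns True when the job
    was accepted. Failed submissions are retried with a growing pause.
    """
    succeeded: List[str] = []
    failed: List[str] = []
    for job in jobs:
        for attempt in range(1, max_retries + 1):
            if submit(job):
                succeeded.append(job)
                break
            # Wait longer after each failed attempt before retrying.
            sleep(min_interval_s * attempt)
        else:
            # Every attempt failed: record the job and move on.
            failed.append(job)
        # Keep the overall submission rate low.
        sleep(min_interval_s)
    return succeeded, failed
```

The `sleep` parameter is injectable so the pacing can be checked without waiting. The interval and retry count are placeholders; the slides give no concrete submission rate.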

  13. Conclusions • 50000 events (FZ files) produced in ~2 days! • The CMS Task Force has made impressive progress and the first results are promising. A few issues have been identified and solutions are being worked out or applied • The task force as a whole shows fruitful cooperation between CMS and EDG!

  14. Tutorials • The tutorials are aimed at users wishing to "gridify" their applications using EDG software and are organized over 2 full consecutive days • Approx. 100 people have followed the tutorial since August • Sessions: October 3 & 4 (CERN); October 31 & November 1 (CERN); December 2 & 3 (Edinburgh); December 5 & 6 (Italy); December 9 & 10 (NIKHEF); December 12 (Cracow) • More sessions will be organised in the future: http://hep-proj-grid-tutorials.web.cern.ch/hep-proj-grid-tutorials/ • Day 1: Introduction to Grid computing and overview of the DataGrid project; Security; Testbed overview; Job submission; lunch; hands-on exercises: job submission • Day 2: Data management; LCFG, fabric mgmt & sw distribution & installation; Applications and use cases; Future directions; lunch; hands-on exercises: data mgmt

  15. Related Grid projects • Through links with sister projects, there is the potential for a truly global scientific applications grid • Figure labels: GriPhyN, PPDG, iVDGL

  16. Interaction with sister projects • No strict boundaries, with a large cross-fertilization of ideas, software and people • DataGRID is learning from the experiences in these projects • CrossGrid: using the same security certs; testbed sites install EDG software; extending it for the needs of intensive interactive applications; participating in the EDG testing activities; representatives in each project's architecture & management groups • DataTAG (EDT): EDT is deploying EDG sw to investigate inter-operability with US projects (iVDGL, GriPhyN, PPDG); results feed back into EDG software releases (e.g. GLUE-compatible information providers/consumers) • NorduGrid: using the same security certs; involved in EDG architecture work; good ideas for gatekeeper and MDS configuration; helped develop GDMP and GSI extensions for Replica Catalog; involved in Glue schema work; security policy; middleware testing; working in WP8 (HEP applications) • iVDGL/GriPhyN/PPDG: US members in EDG architecture group; looking for common packaging and toolkit usage solutions

  17. Plans for the future • Further development in 2003 • Further iterative improvements to middleware driven by LCG and users' needs • More extensive testbeds providing more computing resources • Prepare EDG software for future migration to Open Grid Services Architecture • Interaction with LCG • LCG intends to make use of the DataGRID middleware • LCG is contributing to DataGRID • Testbed support and infrastructure • Get access to more computing resources in HEP computing centres • Testing and verification • Reinforce the testing group and maintain a certification testbed • Fabric management and middleware development • New EU project (EGEE) • Make plans to preserve the current major asset of the project: probably the largest Grid development team in the world • EoI for FP6 (www.cern.ch/egee-ei)

  18. EGEE vision: Enabling Grids for E-science in Europe • Goal • Create a general, production-quality European Grid infrastructure on top of present and future EU RN infrastructure • Build on • EU and EU member states' major investment in Grid technology • Several pioneering prototype results • Largest Grid development team in the world • Goal can be achieved for about €100m/4 years on top of the national and regional initiatives • Approach • Leverage current and planned national and regional Grid programmes (e.g. LCG) • Work closely with relevant industrial Grid developers, NRNs and US • Figure labels: Applications, EGEE, Geant network

  19. Work done so far • EoI for FP6 (www.cern.ch/egee-ei) submitted on June 7th • Several follow-up meetings • An editorial board and an Interim Task Force established to prepare a position paper and a presentation for an EU Grid workshop in Brussels on October 3-4 • Both bodies extended to follow up with the EU (IST02, ER02, individual contacts)

  20. GÉANT and GRIDs: The model • Diagram labels: GRIDs use GÉANT infrastructure; Application areas; GÉANT profits from technological innovation; GRIDs empowered GÉANT; R&D on GRIDs; GRIDs platforms; GÉANT network; International dimension

  21. Instruments (credit: K. Baxevanidis, EU) • Diagram labels: Research Infrastructures; IST Programme; Structuring the ERA Programme; GÉANT, GRIDs, other ICT-RI • Budget figures shown in the diagram: 665 M Euro; 100 + 200 M Euro; 2.655 M Euro; 3.825 M Euro • Instruments, first list: Integrated Projects; Networks of Excellence; Specific Targeted Projects; Coordinated actions; Support actions • Instruments, second list: Integrated Infrastructure Initiatives; Coordinated actions; Support actions • Separate calls for proposals! • More info on: http://www.cordis.lu/ist/fp6/activities.htm

  22. Communication Network Development Call • 45-47 Million Euros available in the first EU call (Dec 17th, 2002) • It will be hard to get the whole budget: we will need to share it with one, two or more projects, and a lot of competition is to be expected (1200 EoIs received in this area!) • Focus on support and integration of already established Grid infrastructures • Build a Grid production layer on top of the EU RN infrastructure • No major funds for H/W, CS research or application development (to a first approximation)

  23. Integrated Infrastructure Initiative (I3) • Three lines of funding supported (with possible budget breakdown): • Networking activities (nothing to do with networks…): • This is the overhead: management, coordination, dissemination and outreach (7-10% of the total funding) • Specific service activities: • Provision and procurement of Grid services (60% of total funding) • Joint research activity • Engineering development to improve the services provided by the Grid infrastructure (20% of total funding) • Application support and focused R&D (10% of total funding)
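
The possible budget breakdown on slide 23 can be tallied with a short calculation. This is an illustrative sketch: the percentages come from the slide, while the 100 M Euro total is an assumption borrowed from the "about €100m/4 years" figure on slide 18, not an amount tied to this breakdown. The names `SHARES`, `share_total` and `breakdown` are hypothetical.

```python
from typing import Dict, Tuple

# Percentage shares quoted on the slide, as (low, high) ranges.
SHARES: Dict[str, Tuple[int, int]] = {
    "networking activities": (7, 10),
    "specific service activities": (60, 60),
    "engineering development": (20, 20),
    "application support and focused R&D": (10, 10),
}


def share_total() -> Tuple[int, int]:
    """Sum of the low and high ends of all quoted shares, in percent."""
    low = sum(lo for lo, _ in SHARES.values())
    high = sum(hi for _, hi in SHARES.values())
    return low, high


def breakdown(total_meur: float) -> Dict[str, Tuple[float, float]]:
    """Split a total (in M Euro) into (low, high) amounts per funding line."""
    return {
        name: (total_meur * lo / 100, total_meur * hi / 100)
        for name, (lo, hi) in SHARES.items()
    }


if __name__ == "__main__":
    print("shares sum to", share_total(), "percent")
    # Illustrative total only (assumption, see lead-in).
    for name, (lo, hi) in breakdown(100.0).items():
        print(f"{name}: {lo:.1f} to {hi:.1f} M Euro")
```

The quoted shares add up to between 97% and 100%, so the breakdown is only self-consistent at the upper end of the networking-activities range.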

  24. Networking activities • Coordination and management of the participating Grid infrastructures • Management structure to be defined • Dissemination, training and outreach • Leverage EDG and other project tutorials • Proposal from Terena received • User clubs, industry forum etc.

  25. Specific service activities • Integration of major national and international Grid infrastructures • Two-tier structure: • 1st Tier: Major Grid centres (6-8). Must satisfy a minimum level of Grid resources and staffing • 2nd Tier: POPs in all other Geant supported countries • EU resources for doubling the 1st-tier centres' Grid support staff, a central operation centre and a distributed call and support centre • Interface to Geant follow-on project • Mostly staff and overhead (computer fabrics and storage provided by the partners)

  26. Joint research activity • Focus on hardening and re-engineering of Middleware • Leverage current EU Grid projects and international Grid technology developers (large and established M/W development community) • 8-10 WPs with critical mass in a single geographical center, dedicated WP managers hired by the project and reporting to the project technical management (possible international and industrial participation) • Quality assurance group, integration, certification and distribution group with industrial quality • International senior advisory group for project review, long term technology development and direction

  27. Additional activities • Application support: • High-level interface and portals • User requirements (à la HEPCAL) • CS focused activity: • Long-term CS issues for production-quality Grids

  28. Distribution of responsibilities • Motivation: provide a transparent, effective process for proposal preparation • EGEE Executive Committee: • Responsible for defining Work Packages and setting up Task Forces to deliver technical content for the proposal. Max ~10 persons for an effective process • Should represent stakeholders with major, proven computer and human resources to contribute to EGEE • US has observer status (Ian Foster) • EGEE technical advisory board: • Advises the Executive Committee on the overall architecture and specific technical issues • US participation confirmed

  29. Distribution of responsibilities • EGEE Editorial board: • Responsible for gathering input from task forces, overall editing of the proposal, filling out administrative forms and maintaining the timeline • EGEE National Partners board: • Responsible for coordinating and communicating with interested parties at national/regional level. Ideally one person per country/region • Consulted by the Executive Committee during preparation of the proposal, to ensure adequate transparency; must be seen as impartial • EGEE interest group: • All institutes, companies and organisations interested in remaining informed about the progress of the EGEE proposal. Includes potential subcontractors for different workpackages

  30. EGEE proposal timeline: Tentative Schedule (continued) • EU call out on Dec 17th • Draft 1: overall project structure, end of February 2003 • Draft 2: with detailed workpackages, end of March 2003 • Final proposal including admin and management, end of April 2003 • Submission by May 6th 2003 • First feedback from EU in June-July • Contract negotiation late summer, fall ’03 • Contract signature by the end of ’03 • Start of project Q1-Q2 ’04

  31. Summary • The ATLAS/DataGRID task force has been a successful experience for EDG • The CMS stress test, still ongoing, is a major advance in production-quality performance ahead of the next EU EDG review on February 4-5 • Deployment of a very large production Grid testbed is being explored with the EU (EGEE) • This needs to be done in close collaboration with LCG and the US Grid developers, for the maximum benefit of the LCG experiments and for potential application to other international scientific communities (also good for the long-term future of HEP…)
