
ATLAS Software & Computing Status and Plans

DOE/NSF Review – January 2003, LBNL. Dario Barberis, University of Genoa (Italy).


Presentation Transcript


  1. DOE/NSF Review – January 2003, LBNL
ATLAS Software & Computing Status and Plans
Dario Barberis, University of Genoa (Italy)

  2. Foreword
• I have been designated by the ATLAS Management to be the next Computing Coordinator, and the ATLAS Collaboration Board has been asked to endorse this proposal (e-mail vote by the Collaboration Board in progress)
• The main parts of this talk were prepared with contributions from the outgoing Computing Coordinator, N. McCubbin, and several other members of the Computing Steering Group
• The organizational changes outlined at the end of this talk are still proposals under discussion within the ATLAS Collaboration

  3. Outline
• Data Challenges
• GRID
• Geant4
• LCG
• Computing Organization
• Software development plans

  4. DC0: readiness & continuity tests (December 2001 – June 2002)
• “3 lines” for “full” simulation:
  • 1) Full chain with new geometry (as of January 2002): Generator -> (Objy) -> Geant3 -> (Zebra->Objy) -> Athena reconstruction -> (Objy) -> Analysis (sketched schematically below)
  • 2) Reconstruction of ‘Physics TDR’ data within Athena: (Zebra->Objy) -> Athena reconstruction -> (Objy) -> Simple analysis
  • 3) Geant4 robustness test: Generator -> (Objy) -> Geant4 -> (Objy)
• “1 line” for “fast” simulation: Generator -> (Objy) -> Atlfast -> (Objy)
• Continuity test: everything for the full chain from the same release (3.0.2)
  • we learnt a lot
  • we underestimated the implications of that statement
  • completed in June 2002
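To make the “full chain” of line 1 concrete, the sketch below models it as an ordered list of processing stages with the persistency format used at each hand-over. Only the chain itself is taken from the slide; the stage tuples and the run_chain helper are hypothetical illustrations, not ATLAS production code.

# Illustrative sketch of the DC0 "line 1" full chain (not ATLAS code):
# each tuple is (stage, input format, output format); the check enforces
# that consecutive stages agree on the persistency format.

FULL_CHAIN = [
    ("Generator",              None,    "Objy"),
    ("Geant3 simulation",      "Objy",  "Zebra"),
    ("Zebra->Objy conversion", "Zebra", "Objy"),
    ("Athena reconstruction",  "Objy",  "Objy"),
    ("Analysis",               "Objy",  None),
]

def run_chain(chain):
    """Walk the chain in order, checking format continuity between stages."""
    previous = None
    for stage, fmt_in, fmt_out in chain:
        assert fmt_in == previous, f"format mismatch before '{stage}'"
        print(f"{stage:25s} {fmt_in or '-':6s} -> {fmt_out or '-'}")
        previous = fmt_out

run_chain(FULL_CHAIN)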

  5. ATLAS Computing: DC1
• The ‘Phase 1’ (G3) simulation (Jul-Aug 2002) was a highly successful world-wide exercise from which we learned a lot, e.g. about software distribution, the importance of validation, etc.
• Grid tools were used in Scandinavia (‘NorduGrid’) for their full share of DC1, and in the USA for a significant fraction of theirs. Grid tools have also been used for an extensive ATLAS-EDG test involving 6 sites, aimed at repeating ~1% of the ‘European’ DC1 share.
• At the end of November 2002 we launched ‘Phase 2’, i.e. the “pile-up” exercise (at 2×10^33 and 10^34 cm⁻²s⁻¹), following ‘site validation’ (55 sites) and ‘physics validation’. The HLT community specifies the details of which samples are to be piled up. Most sites completed by mid-December; the last few jobs are running right now. (A toy illustration of the pile-up procedure follows below.)
  • About the same CPU needed as for Phase 1
  • 70 TB, 100 000 files
  • Additional countries/institutes joined in
• Large-scale GRID test since end November, in preparation for reconstruction
• Reconstruction February-March 2003, using Athena. The CPU needed is <10% of that for simulation, but 30 TB of data are collected at the 7 simulation production sites.
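The sketch below illustrates what the pile-up step does conceptually: each signal event is overlaid with a Poisson-distributed number of minimum-bias events whose mean depends on the luminosity. It is a toy illustration only, not the ATLAS pile-up machinery; the mean values are approximate ballpark figures for 25 ns bunch spacing and are not taken from this talk.

# Toy pile-up overlay (illustration only, not ATLAS code).
import numpy as np

# Approximate mean number of interactions per bunch crossing (assumed
# ballpark values for 25 ns spacing; not quoted on this slide).
MEAN_INTERACTIONS = {2e33: 4.6, 1e34: 23.0}

def pile_up(signal_event, minbias_pool, luminosity, rng):
    """Overlay a Poisson number of randomly chosen min-bias events on the signal."""
    n = rng.poisson(MEAN_INTERACTIONS[luminosity])
    picks = rng.integers(0, len(minbias_pool), size=n)
    return {"signal": signal_event, "pileup": [minbias_pool[i] for i in picks]}

rng = np.random.default_rng(42)
minbias_pool = [f"minbias_{i:04d}" for i in range(1000)]   # hypothetical event pool
event = pile_up("signal_0000", minbias_pool, 1e34, rng)
print(f"{len(event['pileup'])} min-bias events overlaid on the signal event")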

  6. ATLAS DC1 Phase 1: July-August 2002
• 3200 CPUs, 110 kSI95, 71 000 CPU-days
• 39 institutes in 18 countries: Australia, Austria, Canada, CERN, Czech Republic, France, Germany, Israel, Italy, Japan, Nordic, Russia, Spain, Taiwan, UK, USA
• grid tools used at 11 sites
• 5×10^7 events generated, 1×10^7 events simulated, 3×10^7 single particles
• 30 TB, 35 000 files
(Some back-of-the-envelope cross-checks of these figures are given below.)
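The numbers above can be cross-checked with a few lines of arithmetic; all inputs below are the figures quoted on this slide, nothing else is assumed.

# Back-of-the-envelope cross-checks of the DC1 Phase 1 figures above.
cpu_days = 71_000    # total CPU time used
n_cpus   = 3_200     # CPUs involved
data_tb  = 30        # total output volume in TB
n_files  = 35_000    # number of output files

print(f"average busy time per CPU: {cpu_days / n_cpus:.0f} days over Jul-Aug")  # ~22 days
print(f"average file size:         {data_tb * 1e6 / n_files:.0f} MB")           # ~860 MB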

  7. ATLAS Computing: DC1 WGs & people (under the responsibility of the Data Challenge Coordinator, G. Poulard)
• A-Wp1: Event Generator (I. Hinchliffe + 8 physicists)
• A-Wp2: Geant3 Simulation (P. Nevski)
• A-Wp3: Geant4 Simulation (A. Dell'Acqua)
• A-Wp4: Pile-up (M. Wielers)
  • "Atlsim" framework (P. Nevski)
  • "Athena" framework (P. Calafiura)
• A-Wp5: Detector response (not active for DC1)
• A-Wp6: Data Conversion (RD Schaffer + DataBase group)
  • additional people were active for DC0
  • + people involved in the AthenaRoot I/O conversion
• A-Wp7: Event Filtering (M. Wielers)
• A-Wp8: Reconstruction (D. Rousseau)
• A-Wp9: Analysis (F. Gianotti)
• A-Wp10: Data Management (D. Malon)
• A-Wp11: Tools
  • Bookkeeping & cataloguing (S. Albrand, L. Goossens + 7 other physicists/engineers)
  • Production WG: L. Goossens, P. Nevski, S. Vaniachine
  • + Virtual Data catalog (S. Vaniachine, P. Nevski)
  • + Grid Tools providers (NorduGrid & US)
  • Organisation & Documentation WG: A. Nairz, N. Benekos + AMI and Magda people (in close connection with the bookkeeping & cataloguing WG)
• A-Wp12: Teams
  • "Site" validation (J-F. Laporte): all local managers from collaborating institutes
  • Physics validation (J-F. Laporte, F. Gianotti + representatives of the HLT and Physics WGs)
  • Production WG: P. Nevski, S. O'Neale, L. Goossens, Y. Smirnov, S. Vaniachine + local production managers (39 sites for DC1/1 and 56 sites for DC1/2) + ATLAS-Grid people
• A-Wp13: Tier centres (A. Putzer); WG: responsibles of production centres + contact person in each country
• A-Wp14: Fast simulation (P. Sherwood); WG: E. Richter-Was, J. Couchman
• The success of DC1 is due to the effort and commitment of many world-wide sites, actively organized by A. Putzer

  8. ATLAS Computing: DC1
• Currently we are preparing (validating) the Athena-based reconstruction step: Software Release 6 (end of January). The aim is to launch wide-scale reconstruction as soon as possible after Release 6, possibly with wide use of some GRID tools. [The actual reconstruction, which will probably be (re-)done on various sub-samples over the first few months of next year, is not strictly part of DC1.]
• Note that our present scheduling of software releases is driven entirely by HLT (High Level Trigger) requirements and schedule. For example, when Release 5 slipped in Fall 2002 by ~1 month compared to the original schedule, we issued two intermediate releases (adding ‘ByteStream’ [raw data format] capability) to minimise the effects of the delay on the HLT schedule.

  9. ATLAS Computing: DC1/HLT/EDM
• In fact, one of the most important benefits of DC1 has been the much enhanced collaboration between the HLT and ‘off-line’ communities, most prominently in the development of the raw-data part of the Event Data Model (‘ByteStream’, Raw Data Objects, etc.). A schematic sketch of this raw-data flow is given below.
• We have not yet focussed on the reconstruction part of the Event Data Model to the same extent, but an assessment of what we have today, and a (re-)design where appropriate, is ongoing.
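To illustrate the raw-data part of the Event Data Model mentioned above: the ‘ByteStream’ is the serialized raw event, and per-detector converters unpack its fragments into Raw Data Objects that the reconstruction consumes. The sketch below is a schematic toy, not the ATLAS EDM classes; the class and field names are hypothetical.

# Schematic toy of ByteStream -> Raw Data Object decoding (not ATLAS code).
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class RawDataObject:          # hypothetical stand-in for a detector RDO
    detector: str
    channel: int
    value: int

def decode_bytestream(event: Dict[str, bytes]) -> List[RawDataObject]:
    """Unpack each detector fragment of the byte stream (toy: one byte per channel)."""
    rdos = []
    for detector, fragment in event.items():
        rdos.extend(RawDataObject(detector, ch, val) for ch, val in enumerate(fragment))
    return rdos

# Example event with two made-up detector fragments
bytestream = {"Pixel": bytes([12, 0, 7]), "LAr": bytes([200, 199])}
for rdo in decode_bytestream(bytestream):
    print(rdo)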

  10. DC2-3-4-…
• DC2: Q4/2003 – Q2/2004
• Goals:
  • Full deployment of the Event Data Model & Detector Description
  • Geant4 becomes the main simulation engine
  • Pile-up in Athena
  • Test the calibration and alignment procedures
  • Use LCG common software
  • Use GRID middleware widely
  • Perform large-scale physics analysis
  • Further tests of the computing model
• Scale: as for DC1, ~10^7 fully simulated events (with pile-up too)
• DC3, DC4, ...:
  • yearly increase in scale and scope
  • increasing use of the Grid
  • testing rate capability
  • testing the physics analysis strategy

  11. ATLAS and GRID
• ATLAS has already used the GRID for producing DC1 simulations
• Production was distributed over 39 sites; the GRID was used for ~5% of the total amount of data, by:
  • NorduGrid (8 sites), who produced all their data using the GRID
  • the US Grid Testbed (Arlington, LBNL, Oklahoma), where the GRID was used for ~10% of their DC1 share (10% = 30k hours)
  • EU-DataGrid, which re-ran 350 DC1 jobs (~10k hours) at some Tier-1 prototype sites: CERN, CNAF (Italy), Lyon, RAL, NIKHEF and Karlsruhe (CrossGrid site); this last production was done in the first half of September and was made possible by the work of the ATLAS-EDG task force

  12. ATLAS GRID plans for the near future
• In preparation for the reconstruction phase (spring 2003) we performed further Grid tests in Nov/Dec:
  • Extend the EDG to more ATLAS sites, not only in Europe.
  • Test a basic implementation of a worldwide Grid.
  • Test the inter-operability between the different Grid flavors.
• Inter-operation = a job is submitted in region A; the job is run in region B if the input data are in B; the produced data are stored; and the job log is made available to the submitter (a toy illustration follows below).
• The EU project DataTag has a Work Package devoted specifically to inter-operation, in collaboration with the US iVDGL project: the results of the work of these projects are expected to be taken up by LCG (GLUE framework).
• ATLAS has collaborated with DataTag-iVDGL on inter-operability demonstrations in November-December 2002.
• The DC1 data will be reconstructed (using Athena) in early 2003: the scope and the way of using Grids for distributed reconstruction will depend on the results of the tests started in Nov/December and still on-going.
• ATLAS is fully committed to LCG and to its Grid middleware selection process: our “early tester” role has been recognized as very useful for EDG, and we are confident that it will be the same for LCG products.
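The inter-operation scenario defined above can be pictured with a small toy broker: a job submitted in one region is dispatched to the region that holds its input data, the output stays there, and the log comes back to the submitter. This is an illustration of the concept only; the file names and the catalogue are made up, and no EDG/iVDGL/GLUE interface is implied.

# Toy illustration of Grid inter-operation (concept only, not a real middleware API).

# Hypothetical replica catalogue: which region holds which input file.
REPLICA_CATALOG = {
    "dc1.simul.0001.zebra": "NorduGrid",
    "dc1.simul.0002.zebra": "US",
}

def submit(job, input_file, submitter_region):
    """Run the job where its input data live; return (output location, log for submitter)."""
    run_region = REPLICA_CATALOG.get(input_file, submitter_region)
    output_location = f"{job}.out stored in {run_region}"
    log = f"[{run_region}] ran {job} on {input_file}; log returned to {submitter_region}"
    return output_location, log

output, log = submit("recon_0002", "dc1.simul.0002.zebra", submitter_region="EU")
print(output)
print(log)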

  13. ATLAS Long-Term GRID Planning
• Worldwide GRID tests are essential to define in detail the ATLAS distributed Computing Model.
• The principles of cost and resource sharing are described in a paper, were presented at the last ATLAS week (October 2002), and were endorsed by the ATLAS Collaboration Board: PRINCIPLES OF COST SHARING FOR THE ATLAS OFFLINE COMPUTING RESOURCES, prepared by R. Jones, N. McCubbin, M. Nordberg, L. Perini, G. Poulard, and A. Putzer.
• The main implementation of cost sharing is foreseen through in-kind contributions of resources in regional centres, made available for the common ATLAS computing infrastructure.

  14. ATLAS Computing: Geant4 evaluation and integration programme
• ATLAS has invested, and is investing, substantial effort in the evaluation of G4, in close collaboration with the G4 team itself
• Involves essentially all ATLAS sub-detectors
• Provides a reference against which any future simulation will have to be compared
• Provides (sufficiently well-) tested code that should, in principle, integrate with no difficulty into a complete detector simulation suite
• Striving for:
  • minimal inter-detector coupling
  • minimal coupling between framework and user code
• With this approach we are finding no problems in interfacing the different detectors (a schematic sketch of the decoupling idea follows below)
• Further integration issues (framework, detector clashes, memory, performance) are being checked.
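The decoupling goal stated above can be sketched as a small plugin interface: the framework drives every sub-detector through one common abstraction, so detectors neither depend on each other nor on framework internals. The interface and class names below are illustrative only, not the actual ATLAS/Geant4 integration code.

# Illustrative plugin interface for detector/framework decoupling (not ATLAS code).
from abc import ABC, abstractmethod

class DetectorPlugin(ABC):
    """Common interface that each sub-detector implements independently."""
    @abstractmethod
    def build_geometry(self) -> None: ...

class PixelPlugin(DetectorPlugin):
    def build_geometry(self) -> None:
        print("building Pixel geometry")

class TileCalPlugin(DetectorPlugin):
    def build_geometry(self) -> None:
        print("building TileCal geometry")

def build_full_detector(plugins):
    """The framework sees only the common interface, never detector internals."""
    for plugin in plugins:
        plugin.build_geometry()

build_full_detector([PixelPlugin(), TileCalPlugin()])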

  15. Example: Geant4 Electron Response in ATLAS Calorimetry
[Figure: ΔE_rec (MC-Data) [%] comparisons of GEANT4, GEANT3 and data; panels: FCal Electron Response, EMB Electron Energy Resolution]
• Overall signal characteristics:
  • Geant4 reproduces the average electron signal as a function of the incident energy in all ATLAS calorimeters very well (testbeam-setup or analysis-induced non-linearities typically within ±1%)…
  • …but the average signal can be smaller than in G3 and data (1-3% for a 20-700 μm range cut in the HEC);
  • signal fluctuations in the EMB are very well simulated;
  • electromagnetic FCal: high-energy limit of the resolution function ~5% in G4, ~4% in data and G3;
  • TileCal: stochastic term 22% GeV^1/2 in G4/G3, 26% GeV^1/2 in data; high-energy limit very comparable (the resolution parameterization is recalled below).
(thanks to P. Loch)
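For reference, the “stochastic term” and the “high-energy limit” quoted above refer to the usual calorimeter energy-resolution parameterization; the exact form used in the analysis is not given on the slide, so the expression below is the conventional one:

\frac{\sigma_E}{E} \;=\; \frac{a}{\sqrt{E\,[\mathrm{GeV}]}} \;\oplus\; b,
\qquad \text{e.g. TileCal in G4/G3: } a \approx 22\%\,\sqrt{\mathrm{GeV}}
\;\Rightarrow\; \frac{a}{\sqrt{E}} \approx 2.2\% \text{ at } E = 100~\mathrm{GeV},

where a is the stochastic term and b the constant term that dominates the high-energy limit.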

  16. Conclusions on ATLAS Geant4 physics validation
• Geant4 can simulate the relevant features of muon, electron and pion signals in the various ATLAS detectors, often better than Geant3;
• the remaining discrepancies, especially for hadrons, are being addressed, and progress can be expected in the near future;
• ATLAS has a huge amount of the right testbeam data for the calorimeters, inner detector modules and muon detectors to evaluate the Geant4 physics models in detail;
• feedback loops to the Geant4 team have been established for most systems for quite some time; communication is not a problem.

  17. G4 simulation of the full ATLAS detector
• DC0 (end 2001): robustness test with the complete Muons, simplified InDet and Calorimeters
  • 10^5 events, no crash!
• Now basically all detectors available
  • Some parts of the detectors (dead material, toroids) are not there yet and are being worked on
• Combined simulation starting now
  • Full geometry usable early February
• Beta version of the full simulation program to be ready end of January, to be tested in realistic production.

  18. ATLAS Computing: Interactions with the LCG Project
• The LCG project is completely central to ATLAS computing. We are committed to it and, in our planning, we rely on it:
  • Participation in RTAGs; ATLAS has provided the convenors for two major RTAGs (Persistency and Simulation);
  • Commitment of ATLAS effort to the POOL (‘persistency’) project:
    • the POOL project is the ATLAS data persistency project!
  • LCG products, and the release and deployment of the first LCG GRID infrastructure (‘LCG-1’), are now in our baseline planning:
    • LCG-1 must be used for our DC2 production at the end of 2003 – early 2004.

  19. ATLAS Computing organization (1999-2002)
[Organization chart: Computing Oversight Board; National Computing Board; Computing Steering Group; Physics and Technical Groups; Event Filter; QA group; Architecture team; simulation, reconstruction and database coordinators per detector system]

  20. Key ATLAS Computing bodies
• Computing Oversight Board (COB): ATLAS Spokesperson and Deputy, Computing Coordinator, Physics Coordinator, T-DAQ Project Leader. Role: oversight, not executive. Meets ~monthly.
• Computing Steering Group (CSG): membership is the first row and first column of the Detector/Task Matrix, plus the Data Challenge Coordinator, Software Controller, Chief Architect, NCB Chair and GRID Coordinator. The top executive body for ATLAS computing. Meets ~monthly.
• National Computing Board (NCB): representatives of all regions and/or funding agencies, with the GRID Coordinator and ATLAS Management ex officio. Responsible for all issues which bear on national resources, notably the provision of resources for World-Wide Computing. Meets every two/three months.

  21. ATLAS Detector/Task matrix (CSG members)

  22. Other ATLAS key post-holders
• Computing Steering Group:
  • Chief Architect: D. Quarrie (LBNL)
  • Physics Coordinator: F. Gianotti (CERN)
  • Planning Officer: T. Wenaus (BNL/CERN)
  • NCB Chair: A. Putzer (Heidelberg)
  • GRID Coordinator: L. Perini (Milan)
  • Data Challenge Coordinator: G. Poulard (CERN)
  • Software ‘Controller’: J-F. Laporte (Saclay)
• Software Infrastructure Team:
  • Software Librarians: S. O’Neale (Birmingham), A. Undrus (BNL)
  • Release Coordinator (rotating): D. Barberis (Genoa)
  • Release tools: Ch. Arnault (Orsay), J. Fulachier (Grenoble)
  • Quality Assurance: S. Albrand (Grenoble), P. Sherwood (UCL)
• LCG ATLAS representatives:
  • POB (Project Oversight Board): T. Åkesson (Deputy Spokesperson), J. Huth (USA), P. Eerola (Nordic Cluster), H. Sakamoto (Japan)
  • SC2 (Software & Computing Committee): N. McCubbin (Computing Coordinator) and D. Froidevaux
  • PEB (Project Execution Board): G. Poulard (Data Challenge Coordinator)
  • GDB (Grid Deployment Board): N. McCubbin (Computing Coordinator), G. Poulard (Data Challenge Coordinator), L. Perini (Grid Coordinator, Deputy)

  23. Proposed new computing organization (DRAFT FOR DISCUSSION)

  24. Main positions in the proposed new computing organization
• Computing Coordinator:
  • Leads and coordinates the development of ATLAS computing in all its aspects: software, infrastructure, planning, resources.
  • Coordinates development activities with the TDAQ Project Leader(s), the Physics Coordinator and the Technical Coordinator through the Executive Board and the appropriate boards (COB and TTCC).
  • Represents ATLAS computing in the LCG management structure (SC2 and other committees) and at LHC level (LHCC and LHC-4).
  • Chairs the Computing Management Board.
• Software Project Leader:
  • Leads the development of ATLAS software, as the Chief Architect of the Software Project.
  • Is a member of the ATLAS Executive Board and COB.
  • Participates in the LCG Architects Forum and other LCG activities.
  • Chairs the Software Project Management Board and the Architecture Team.

  25. Main boards in the proposed new computing organization (1)
• Computing Management Board (CMB):
  • Computing Coordinator (chair)
  • Software Project Leader
  • TDAQ Liaison
  • Physics Coordinator
  • NCB Chair
  • GRID & Operations Coordinator
  • Planning & Resources Coordinator
• Responsibilities: coordinate and manage computing activities. Set priorities and take executive decisions.
• Meetings: bi-weekly.

  26. Main boards in the proposed new computing organization (2)
• Software Project Management Board (SPMB):
  • Software Project Leader (chair)
  • Computing Coordinator (ex officio)
  • Simulation Coordinator
  • Reconstruction, HLT Algorithms & Analysis Tools Coordinator(s)
  • Core Services Coordinator
  • Software Infrastructure Team Coordinator
  • LCG Applications Liaison
  • Calibration/Alignment Coordinator
  • Sub-detector Software Coordinators
• Responsibilities: coordinate the coherent development of software (both infrastructure and applications).
• Meetings: bi-weekly.

  27. Development plan (1)
• early 2003:
  • completion of the first development cycle of the OO/C++ software:
    • Framework
    • Fast Simulation
    • Event Data Model
    • Geometry
    • Reconstruction
  • implementation of the complete simulation in Geant4 and integration of Geant4 with Athena
• reminder: the first cycle of OO development had to prove that the “new s/w can do at least as well as the old one”, and was based on a “translation” of algorithms and data structures from Fortran to C++

  28. Development plan (2)
• 2003 – 2005:
  • Second cycle of OO software development (a proper design of several components is needed):
    • Event Data Model and Geometry:
      • coherent design across all detectors and data types
      • optimization of data access in memory and on disk
    • Integrated development of alignment/calibration procedures
    • Development and integration of the Conditions Data Base
    • Simulation:
      • optimization of Geant4 (geometry and physics)
      • optimization of the detector response
    • On-line/off-line integration: Trigger and Event Filter software
    • Reconstruction: development of a global strategy, based on modular interchangeable components

  29. Major Milestones

  30. Major Milestones (legend: green = done, gray = original date, blue = current date)

  31. Perspectives
• This plan of action is realistic and can succeed if:
  • there are sufficient Human Resources;
  • there is a “critical mass” of people working together in a few key institutions, first of all at CERN;
  • there is general consensus on where we are heading, and by which means (not always true in the past).
