ATLAS computing in Russia
A.Minaenko, Institute for High Energy Physics, Protvino
JWGC meeting, 10/03/08
ATLAS RuTier-2 tasks
• The Russian Tier-2 (RuTier-2) computing facility is planned to supply computing resources to all four LHC experiments, including ATLAS. It is a distributed computing centre currently comprising the computing farms of six institutions: ITEP, KI, SINP (all Moscow), IHEP (Protvino), JINR (Dubna), PNPI (St. Petersburg)
• The main RuTier-2 task is to provide facilities for physics analysis using AOD, DPD and user-derived data formats such as ROOT trees
• The full current AOD and 30% of the previous AOD version should be available
• Development of reconstruction algorithms, which requires subsets of ESD and RAW data, should also be possible
• All data used for analysis should be stored on disk servers (SE); unique data (user and group DPD), as well as the previous AOD/DPD version, should also be saved to tape
• The second important task is the production and storage of MC simulated data
• The planned RuTier-2 resources should be sufficient to fulfil these goals
ATLAS RuTier-2 resource evolution
• The table above was included in Russia's pledge to LCG and illustrates our current understanding of the resources needed. It may be corrected in the future as we understand our needs better
• Not taken into account: AOD increase due to inclusive streaming, increased MC event rate (30% instead of 20%), a possible increase of the AOD event size (100 KB assumed), an increase of the total DPD size (0.5 of AOD assumed)
Current RuTier-2 resources for all experiments
• Red entries will become available in 1-2 months
• ATLAS request for 2008 = 780 kSI2k, 280 TB
Normalized CPU time (hour*kSI2k)
RuTier-2 for ATLAS in 2007
• ATLAS share: 21%
• ATLAS: 846 kh*kSI2k
Site contributions in ATLAS in 2007
ATLAS RuTier-2 in the SARA cloud
• The RuTier-2 sites are associated with the ATLAS Tier-1 SARA
• Five sites (IHEP, ITEP, JINR, SINP, PNPI) are now included in the TiersOfAtlas list, and FTS channels are configured for them
• Four sites (IHEP, ITEP, JINR, PNPI) successfully participated in the 2007 data transfer functional tests (next slide). This is a coherent data transfer test Tier-0 → Tiers-1 → Tiers-2 for all clouds, using existing SW to generate and replicate data and to monitor the data flow
• Another 2007 ATLAS activity was the replication of produced MC AOD from Tiers-1 to Tiers-2 according to the ATLAS computing model, done using FTS and the subscription mechanism. The RuTier-2 sites (except ITEP) did not participate in this activity because of a severe lack of free disk space
• Four sites (IHEP (15%), ITEP (20%), JINR (100%), PNPI (20%)) participated in the replication of M4 data; the percentage of the data requested for replication is shown. Only JINR obtained all the data; the other sites were limited by their free disk space
• During one week of M4 exercises (Aug-Sep 07), about two million real muon events were detected, written to disk and tape, and reconstructed at the ATLAS Tier-0. The reconstructed data (ESD) were then exported in quasi-real time to the Tiers-1 and their associated Tiers-2. The whole chain worked as it would during real LHC data taking; this was the first successful experience of this sort for ATLAS
• Two slides (10, 11) illustrate the M4 exercises; the second shows results for the SARA cloud: practically all subscribed data were successfully transferred
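The disk-space limitation described above can be illustrated with a toy model of the subscription mechanism. This is not the real DQ2/FTS API; the function, file names and the 2 GB file size are invented for illustration only:

```python
# Toy model (hypothetical, not the real DQ2 API) of a dataset
# subscription: a site pulls the files it is missing until its free
# disk space runs out, which is why some sites ended up holding
# incomplete replicas while JINR, with enough space, got everything.

def fulfil_subscription(dataset_files, local_files, free_space_gb,
                        file_size_gb=2.0):
    """Return (transferred, complete): copy missing files while space
    remains; the replica is left incomplete when the disk fills up."""
    transferred = []
    for f in dataset_files:
        if f in local_files:
            continue                     # replica already present
        if free_space_gb < file_size_gb:
            return transferred, False    # disk full: incomplete replica
        transferred.append(f)
        free_space_gb -= file_size_gb
    return transferred, True             # complete replica

dataset = [f"M4.ESD._{i:04d}" for i in range(10)]
moved, complete = fulfil_subscription(dataset, set(), free_space_gb=12.0)
print(len(moved), complete)  # 6 False: only 6 of 10 files fit in 12 GB
```

In this sketch the outcome depends only on free space, mirroring the slide's observation that sites requesting 15-20% of the data were cut off by their disks rather than by the transfer machinery.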
Activities: Functional Tests
• 10 Tier-1s and 46 Tier-2s participated
• Test periods: Sep-Nov 06 and Sep-Oct 07
• New DQ2 SW releases during the tests: 0.2.12, 0.3 (Jun 2007) and 0.4 (Oct 2007)
M4 Data Replication Activity
• Summary for all Tier-1 sites and all Tier-2 sites (including IHEP, ITEP, JINR, PNPI)
• Shown: datasets subscribed, complete replicas, incomplete replicas
M4 Data Replication Activity Summary for the SARA Cloud (ESD data only)
• Transfer status: IHEP: 1 trouble file (0.2%); JINR: 1 trouble file (0.3%); ITEP: no troubles; PNPI: no troubles
M5 Data Replication Activity Summary
• IHEP, ITEP, JINR, PNPI participated
• Delay in replication < 24 h
• Shown: total subscriptions and completed transfers
Russian contribution to the central ATLAS sw/computing
• The Russian contribution to the ATLAS M&O budget of Category A amounted to 0.5 FTE this year. Two of our colleagues (I.Kachaev, V.Kabachenko) were involved in central ATLAS activities at CERN concerning core SW maintenance. They fulfilled a number of tasks:
• Support of the atlas-support@cern.ch list, i.e. managing user quotas, scratch space distribution, and user requests/questions concerning AFS space, access rights, etc.
• Support of the atlas-sw-cvsmanagers@cern.ch list, i.e. managing the central ATLAS CVS
• Official ATLAS SW release builds: releases 13.0.20, 13.0.26, 13.0.28 and 13.0.30 have been built, and 13.0.40 is under construction
• Corresponding documentation updates: release pages, librarian documentation
• ATLAS AFS management
• A lot of scripts have been written to support release builds, release copy and move, a command-line interface to TagCollector, CVS tag search and comparison in TagCollector, etc.
Russian contribution to the central ATLAS sw/computing
• Two of our colleagues (A.Zaytsev, S.Pirogov) visited CERN (4+4 months) to contribute to the activity of the ATLAS Distributed Data Management (DDM) group. Their tasks included the corresponding SW development as well as participation in central ATLAS DDM operations, such as support of the data transfer functional tests, the M4 exercises, etc. Special attention was given to the SARA cloud, to which the Russian sites are attached
• During the visits the following main tasks were fulfilled:
• Development of the LFC/LRC Test Suite and applying it to measuring the performance of the updated version of the production LFC server and a new GSI-enabled LRC testbed
• Extending the functionality of, and documenting, the DDM Data Transfer Request Web Interface
• Installing and configuring a complete PanDA server and a new implementation of the PanDA Scheduler Server (Autopilot) at CERN, and assisting the Lyon Tier-1 site in doing the same
• Contributing to the recent DDM/DQ2 Functional Tests activity (Aug 2007), developing tools for statistical analysis of the results and applying them to the data gathered during the tests
• All the results were reported at ATLAS internal meetings and at the computing conference CHEP2007
• Part of the activity (0.3 FTE) was accounted as the Russian contribution to the ATLAS M&O Category A budget (Central Operations part)
Challenges in 2008
• FDR-1: 10 hrs of data taking @ 200 Hz, a few days in a row
• CCRC-1: 4 weeks of operation of the full Computing Model, all 4 LHC experiments simultaneously
• Sub-detector runs
• M6: first week of March
• FDR-2 Simulation Production: 100M events in 90 days plus merging, using the new release
• CCRC-2: like CCRC-1, but for the whole month of May
• FDR-2: like FDR-1, but at higher luminosity; timing uncertain now
• M7?
Planned ATLAS activity in 2008
ATLAS Production Tiers (Feb 08, Full Dress Rehearsal) status
• 10 Tier-1s and 56 “Tier-2s”
• Metrics for T1 success: 100% of data transferred (from CERN, from Tier-1s and to Tier-2s)
• Metrics for T2/T3 success: 95+% of data transferred (transfers within the cloud)
• Metrics for cloud success: 75% of sites participated in the test and 75% passed it
• Legend: done / partly failed / no test
CCRC08-1 results at RuTier-2
• Activity Summary ('2008-02-24 08:50' to '2008-03-01 12:50')
Structure of ATLAS data used for physics analysis
• The streaming of ATLAS data is under discussion now, and no final decision has been taken yet
• Streaming is based on the trigger decision, and the assignment of a given event to a stream cannot change over time (it does not depend on offline procedures)
• There will be 4-7 RAW/ESD physics streams
• There will be one or a few AOD streams per ESD stream, with about 10 final AOD streams
• There are two possible types of streaming:
• Inclusive streaming: one and the same event can be assigned to several streams if it has the corresponding trigger types
• Exclusive streaming: a given event can be assigned to only one stream; if its signatures permit assigning it to more than one stream, it goes to a special overlap stream
• Inclusive streaming is currently considered preferable
• A given DPD is intended for a given type (or types) of analysis and can collect events from different streams. A DPD contains only the events needed for a given analysis, and only the needed part of the event information
• Physics analysis will be carried out using AOD streams and (mainly) different DPDs, including user-created formats (such as ROOT trees)
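The difference between the two streaming policies can be sketched in a few lines. The stream names and trigger signatures below are invented for illustration; the real ATLAS trigger menu and stream definitions are far richer:

```python
# Minimal sketch of inclusive vs exclusive streaming, using made-up
# stream names and trigger signatures (not the real ATLAS menu).

STREAMS = {"egamma": {"e25i", "g20"}, "muons": {"mu20"}, "jets": {"j160"}}

def assign_inclusive(fired):
    """Inclusive streaming: the event goes to every stream whose
    trigger signatures it fired, so it may be stored more than once."""
    return {s for s, trig in STREAMS.items() if fired & trig}

def assign_exclusive(fired):
    """Exclusive streaming: one destination per event; events matching
    more than one stream go to a dedicated overlap stream."""
    matched = assign_inclusive(fired)
    return {"overlap"} if len(matched) > 1 else matched

# An event firing both an electron and a muon trigger:
print(sorted(assign_inclusive({"e25i", "mu20"})))  # ['egamma', 'muons']
print(sorted(assign_exclusive({"e25i", "mu20"})))  # ['overlap']
```

The sketch makes the trade-off visible: inclusive streaming duplicates overlap events across streams (the AOD size increase mentioned earlier), while exclusive streaming avoids duplication at the cost of an extra overlap stream that analyses must remember to read.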
Possible scenarios of data distribution and analysis in RuTier-2
• Scenario A: a given AOD stream (or DPD) is kept entirely at a given Tier-2 site:
• advantage: easy to do from the technical point of view using the present ATLAS DDM and analysis tools
• disadvantage: very hard to ensure a uniform CPU load. At some sites (those with “popular” data) the CPUs will be overloaded, while at others CPUs will sit idle
• Scenario B: each AOD stream (or large DPD) is split between all the sites:
• advantage: uniform CPU load
• disadvantage: i) possible difficulties with subscriptions providing automated splitting of the data (?); ii) will analysis grid sub-jobs be able to find the sites holding the needed data (?)
• From the functionality point of view scenario B is preferable, but the question is whether the existing ATLAS tools permit realizing it (the present answer: yes, but this must be tested in practice)
• AOD and DPD are to be distributed between the participating sites proportionally to their CPU (kSI2k)
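The CPU-proportional split in scenario B can be sketched as a simple greedy assignment. The site names, kSI2k figures and file names below are illustrative, not the actual pledges or the real DDM splitting logic:

```python
# Sketch of scenario B: splitting the files of one AOD stream across
# sites proportionally to their CPU capacity in kSI2k. Site capacities
# and file names are hypothetical, for illustration only.

def split_by_cpu(files, site_cpu):
    """Assign each file to a site, keeping each site's share of files
    proportional to its share of the total kSI2k (greedy fill)."""
    total = sum(site_cpu.values())
    quota = {s: cpu / total * len(files) for s, cpu in site_cpu.items()}
    assigned = {s: [] for s in site_cpu}
    for f in files:
        # give the next file to the site with the largest remaining quota
        s = max(quota, key=lambda k: quota[k] - len(assigned[k]))
        assigned[s].append(f)
    return assigned

sites = {"JINR": 400, "IHEP": 200, "ITEP": 200, "PNPI": 100, "SINP": 100}
files = [f"AOD._{i:05d}.pool.root" for i in range(100)]
plan = split_by_cpu(files, sites)
print({s: len(fs) for s, fs in plan.items()})
# {'JINR': 40, 'IHEP': 20, 'ITEP': 20, 'PNPI': 10, 'SINP': 10}
```

A balanced file count per kSI2k is exactly what makes the CPU load uniform; the open questions from the slide (subscription-driven splitting, and sub-jobs locating the right site) concern whether the grid tools can maintain such a mapping automatically, not the arithmetic itself.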