
The ATLAS Computing Model


Presentation Transcript


  1. The ATLAS Computing Model Roger Jones, Lancaster University ACAT07, Amsterdam, 26/4/07

  2. LHC data • LHC will produce 40 million collisions per second per experiment • After filtering, ~100 collisions per second will be of interest • A Megabyte of data digitised for each collision = a recording rate of 0.1 Gigabytes/sec • 10^10 collisions recorded each year = 10 Petabytes/year of data
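For concreteness, a quick back-of-the-envelope check of the figures on this slide; a minimal Python sketch, with variable names of our own choosing:

    # Back-of-the-envelope check of the numbers quoted above (illustrative only).
    event_size_bytes = 1e6      # ~1 Megabyte digitised per collision
    rate_hz = 100               # ~100 interesting collisions per second after filtering
    events_per_year = 1e10      # collisions recorded each year (slide figure)

    recording_rate_gb_s = rate_hz * event_size_bytes / 1e9
    yearly_volume_pb = events_per_year * event_size_bytes / 1e15

    print(f"recording rate ~ {recording_rate_gb_s} GB/s")   # ~ 0.1 GB/s
    print(f"yearly volume  ~ {yearly_volume_pb} PB/year")   # ~ 10 PB/year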

  3. Computing for ATLAS [diagram of the physics programme it serves: Higgs, Standard Model, SUSY, Top quark, B Physics, Exotics, Heavy Ions]

  4. ATLAS Requirements (start-up 2008 and 2010) • Note the high ratio of disk to CPU in the Tier 2s • Not yet realised • May require adjustments

  5. The Grid Model with Tiers: The LHC Computing Facility [diagram: CERN (Tier 0) feeding national Tier 1 centres (e.g. Brookhaven in the USA, ASCC Taipei, and centres in France, the UK, Italy, Germany and NL), each serving Tier 2s (e.g. the UK federations LondonGrid, SouthGrid, NorthGrid and ScotGrid, plus universities and labs) down to physics-department desktops]

  6. Roles at CERN • Tier-0: • Prompt first-pass processing on express/calibration & physics streams with old calibrations - calibration, monitoring • Calibration tasks on prompt data • 24-48 hours later, process full physics data streams with reasonable calibrations • Implies large data movement from T0 → T1s • CERN Analysis Facility • Access to ESD and RAW/calibration data on demand • Essential for early calibration • Detector optimisation/algorithmic development

  7. Roles Away from CERN • Tier-1: • Reprocess 1-2 months after arrival with better calibrations • Reprocess all resident RAW at year end with improved calibration and software • Implies large data movement T1 ↔ T1 and T1 → T2 • Tier-2: ~30 Tier 2 centres distributed worldwide • Monte Carlo simulation, producing ESD, AOD → Tier 1 centres • On-demand user physics analysis of shared datasets • Limited access to ESD and RAW data sets • Simulation (some at Tier 1s in early years) • Implies ESD, AOD → Tier 1 centres • Tier-3: centres distributed worldwide • Physics analysis • Data private and local - summary datasets

  8. Evolution • There is a growing Tier 2 capacity shortfall with time • We need to be careful to avoid wasted event copies

  9. General Comments • The Tier 1s and Tier 2s are collective - if the data is on disk, you (@ a T2) or your group (@ a T1) can run on it • For any substantial data access, jobs go to the data • Users initially thought the data goes to the job! This cannot be sustained • Better for the network, better for job efficiency • Data for Tier 3s should be pulled from Tier 2s using ATLAS tools • Tier 3s need to ensure adequate networking • We need to monitor (and potentially control) traffic

  10. Analysis computing model The analysis model is broken into two components • @ Tier 1: Scheduled central production of augmented AOD, tuples & TAG collections from ESD • Derived files moved to other T1s and to T2s • @ Tier 2: On-demand user analysis of augmented AOD streams, tuples, new selections etc., plus individual user simulation and CPU-bound tasks matching the official MC production • Modest job traffic between T2s • Tier 2 files are not private, but may be for small sub-groups in physics/detector groups • Limited individual space, copy to Tier 3s

  11. Group Analysis • Group analysis will produce • Deep copies of subsets • Dataset definitions • TAG selections • Characterised by access to the full ESD and sometimes RAW • This is resource intensive • Must be a scheduled activity • Can back-navigate from AOD to ESD at the same site • Can harvest small samples of ESD (and some RAW) to be sent to Tier 2s • Must be agreed by physics and detector groups • Big Trains etc. • Efficiency and scheduling gains versus free-form access: some form of co-ordination is needed • If analyses are blocked into a ‘big train’: • Each wagon (group) has a wagon master (production manager) • Must ensure it will not derail the train • The train must run often enough (every ~2 weeks) • Trains can also harvest ESD and RAW samples for Tier 2s (but we should try to anticipate and place these subsets)
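To illustrate the ‘big train’ co-ordination pattern described above, a purely illustrative Python sketch (the class and field names are ours, not part of any ATLAS tool): several group selections ride one scheduled pass over the ESD instead of each group scanning it separately.

    # Illustrative sketch only: group analyses blocked into a "big train" that makes
    # a single scheduled pass over the ESD. All names here are hypothetical.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Wagon:
        group: str                        # physics/detector group owning this selection
        master: str                       # "wagon master" (production manager) responsible for it
        select: Callable[[dict], bool]    # event selection run during the pass

    def run_train(esd_events, wagons):
        """Single pass over the ESD; every wagon sees every event."""
        outputs = {w.group: [] for w in wagons}
        for event in esd_events:
            for w in wagons:
                if w.select(event):
                    outputs[w.group].append(event["id"])
        return outputs

    # Example: two wagons share one scheduled pass instead of two separate ESD scans.
    wagons = [
        Wagon("higgs", "alice", lambda e: e["n_leptons"] >= 2),
        Wagon("top",   "bob",   lambda e: e["n_jets"] >= 4),
    ]
    esd = [{"id": i, "n_leptons": i % 3, "n_jets": i % 6} for i in range(10)]
    print(run_train(esd, wagons))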

  12. Group Analysis @ Tier 1 • Profile of per-user resource assumed for all working groups • 1000 passes through 1/10th of the ESD sample (most will be on sub-streams) or 100 passes through the full ESD - the same total ESD volume read either way

  13. T1 Data • Tier 1 cloud (10 sites of very different size) contains: • 10% of RAW on disk, the rest on tape • 2 full copies of the current ESD on disk and 1 copy of the previous • A full AOD/TAG at each Tier 1 • A full set of group DPD • Access is scheduled, through groups
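To make the scale concrete, a minimal sketch of what this placement policy implies for disk in the Tier-1 cloud; the per-event ESD and AOD sizes below are our assumptions for illustration, not numbers from the talk (slide 2 only quotes ~1 MB of RAW per collision):

    # Illustrative only: rough disk volume implied by the Tier-1 placement policy above.
    events_per_year = 1e10                            # from slide 2
    size_mb = {"RAW": 1.0, "ESD": 0.5, "AOD": 0.1}    # assumed MB/event per format

    disk_pb = (
        0.10 * events_per_year * size_mb["RAW"]       # 10% of RAW on disk
        + 3.0 * events_per_year * size_mb["ESD"]      # 2 copies of current ESD + 1 previous
        + 10  * events_per_year * size_mb["AOD"]      # a full AOD at each of the 10 Tier 1s
    ) / 1e9                                           # MB -> PB

    print(f"Tier-1 cloud disk (RAW+ESD+AOD only) ~ {disk_pb:.0f} PB")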

  14. On-demand Analysis • Restricted to Tier 2s and the CAF • Can specialise some Tier 2s for some groups • ALL Tier 2s are for ATLAS-wide usage • Most ATLAS Tier 2 data should be ‘placed’, with a lifetime of ~ months • The job must go to the data • Tier 2 bandwidth is vastly lower, job efficiency higher • Back-navigation requires AOD/ESD to be co-located • Role- and group-based quotas are essential • Quotas to be determined per group, not per user • User files can be garbage-collected - effectively ~SCR$MONTH • Data selection • Over small samples with the Tier-2 file-based TAG and the AMI dataset selector • TAG queries over larger samples by batch job to the database TAG at Tier-0 and maybe other sites - see later • What data? • Group-derived formats • Subsets of ESD and RAW • Pre-selected or selected via a Big Train run by the working group • No back-navigation between sites, so formats should be co-located

  15. T2 Disk • Tier 2 cloud (~30 sites of very, very different size) contains: • Some ESD and RAW: • In 2007: 10% of RAW and 30% of ESD in the Tier 2 cloud • In 2008: 30% of RAW and 150% of ESD in the Tier 2 cloud • In 2009 and after: 10% of RAW and 30% of ESD in the Tier 2 cloud • Additional access to ESD and RAW in the CAF: 1/18 of RAW and 10% of ESD • 10 copies of the full AOD on disk • A full set of official group DPD • Lots of small group DPD and user data • Access is ‘on demand’

  16. Real-world Comments • We cannot keep all RAW data on disk, and • We cannot sustain random access to RAW on tape • Modification for early running: • We have some flexibility to increase RAW and ESD on disk temporarily in all Tiers • The fraction also decreases with year of data taking • The disk RAW data is to be pre-selected as far as possible • ~50% of RAW and ESD at Tier 2s must also be pre-selected • Any additional data needed later, above ~20 Gbytes/day, needs to be requested, not grabbed • ESD can be delivered in a few hours • RAW on tape may take ~a week, but can be prioritised • All RAW from tape must be requested

  17. User Analysis @ Tier 2 • Profile of per-user resource assumed with time for 700 active users • Assume that in 2007/2008 much of the work is done through groups (to get data in shape for other work) • 25 passes through the user sample

  18. Optimised Access • RAW, ESD and AOD will be streamed to optimise access • The selection of and direct access to individual events is via a TAG database • The TAG is a keyed list of variables/event • The overhead of file opens is acceptable in many scenarios • Works very well with pre-streamed data • Two roles • Direct access to an event in a file via a pointer • Data collection definition function • Two formats, file and database • We now believe large queries require the full database • Multi-TB relational database; at least one, but the number to be determined from tests • Restricted to Tier 0 and a few other sites • Does not support complex ‘physics’ queries • The file-based TAG allows direct access to events in files (pointers) • Ordinary Tier 2s hold the file-based primary TAG corresponding to locally-held datasets • Supports ‘physics’ queries
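As a sketch of what a file-based TAG looks like in practice, a small Python example; the field names and GUID are hypothetical, not the actual ATLAS TAG schema:

    # Illustrative sketch of a file-based TAG: a keyed list of selection variables per
    # event, each carrying a pointer (file/entry) back to the full event record.
    tag = [
        {"run": 5200, "event": 17, "n_muons": 2, "met_gev": 45.0,
         "pointer": {"file_guid": "A1B2-...", "entry": 17}},
        {"run": 5200, "event": 42, "n_muons": 0, "met_gev": 110.0,
         "pointer": {"file_guid": "A1B2-...", "entry": 42}},
    ]

    # A 'physics' query over the file-based TAG held at a Tier 2 ...
    selected = [t for t in tag if t["n_muons"] >= 2 and t["met_gev"] > 30.0]

    # ... yields a collection of pointers: the analysis job then opens only the
    # files (and entries) it needs, rather than scanning the whole AOD/ESD.
    for t in selected:
        print(t["run"], t["event"], "->", t["pointer"]["file_guid"], t["pointer"]["entry"])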

  19. Are You Local? • The Grid must avoid central points of congestion • Present coherent local interfaces, but reduce global state • This also applies to contacts with sites • ‘Users’ / ‘experiments’ must work with sites directly • This requires real effort from the local experiment group • We should do human communication with the fewest ‘hops’, not up and down a chain • NB: also applies to ‘strategic’ discussions and resource planning

  20. …Ask what you can do for the Grid! • LHC computing is a team sport • But which one? • A horrendous cliché, but the key to our success! • We are all working towards the same goal… • Thanks to Dave Newbold!

  21. Service versus Development No more clever developments for the next 18 months! • Focus now must be integration, deployment, testing, documentation • The excitement will come from the physics! But also many ‘big unsolved problems’ for later: • How can we store data more efficiently? • How can we compute more efficiently? • How should we use virtualisation? • How do we use really high-speed comms?

  22. Summary • We have come a long way • Grids are not an option, they are a necessity • Scheduled production is largely solved • Chaotic analysis, data management & serving many users are the mountains we are climbing now • Users are important to getting everything working • ‘No Pain, No Gain!’

  23. GANGA – Single front-end, Multiple back-ends • Three interfaces: • GUI, Python Command Line Interface (c.f. pAthena), Python scripts • The GUI aids job preparation • Job splitting and merging work • Plug-ins allow the definition of applications • e.g. Athena and Gaudi, AthenaMC, generic executable, ROOT • And back-ends • Currently: Fork, LSF, PBS, Condor, LCG/gLite, DIAL, DIRAC & PANDA
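As a flavour of the Python interface, a minimal sketch of the kind of session GANGA supports; the exact class names and options shown are indicative rather than taken from the talk:

    # Indicative sketch of a GANGA session (run inside the ganga Python shell, where
    # the GPI classes below are pre-loaded); class names and options are illustrative.
    j = Job()
    j.name = 'aod-analysis-test'
    j.application = Executable(exe='/bin/echo', args=['hello grid'])  # an Athena plug-in could go here instead
    j.backend = LCG()        # swap for e.g. Local() or LSF() without changing the rest
    j.submit()

    # The same job definition can be sent to a different back-end just by replacing
    # j.backend - the "single front-end, multiple back-ends" idea of this slide.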

  24. Common Project: GANGA • GANGA is a user interface to the Grid for the ATLAS and LHCb experiments • It configures and submits applications to the Grid • Tailored for programs built on the experiments' Gaudi/Athena software framework, but easily adapted for others (e.g. the BaBar experiment) • Typical applications are private simulation and reconstruction jobs, analysis packages for event selection, and physics analysis of distributed data
