This document provides an overview of ALICE's computing operations in 2012, including data taking, job profile, resource use, analysis trains, data access, and efficiency improvements. It also outlines future plans for grid upgrades and the upcoming Long Shutdown 1 in 2013-2014.
ALICE Computing : 2012 operation & future plans Rencontre LCG-France, SUBATECH Nantes 18-20 September 2012
A quick glimpse of 2012 • Standard data taking year for ALICE • p-p – emphasis on rare triggers, high pT (calorimeter) • Pilot p-A run (few million events) • Long p-A run in February 2013 (still counts as ‘2012’) • Bulk of analysis on 2011 Pb-Pb - the largest single-period data sample
2012 - RAW • Standard treatment – 2 copies of RAW data • One at T0, one replica at T1s, shared in proportion to each T1's fraction of mass storage capacity • ~1 PB of RAW until now
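The proportional placement of the T1 replica can be illustrated with a minimal sketch. The site names and capacities below are hypothetical placeholders, not ALICE figures; only the ~1 PB RAW volume comes from the slide.

```python
def split_raw_replica(raw_tb, t1_capacity_tb):
    """Share the T1 replica of RAW data in proportion to each site's
    mass storage capacity (both given in TB)."""
    total = sum(t1_capacity_tb.values())
    return {site: raw_tb * cap / total for site, cap in t1_capacity_tb.items()}

# Hypothetical T1 capacities in TB (illustration only, not real site numbers).
capacities = {"T1_A": 2000, "T1_B": 1000, "T1_C": 1000}

# ~1 PB of RAW, expressed as 1000 TB.
shares = split_raw_replica(1000, capacities)
for site, tb in shares.items():
    print(f"{site}: {tb:.0f} TB")
```

With these made-up capacities, the site holding half of the total mass storage receives half of the replica.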
2012 – job profile • Average 28K jobs in parallel • Increases as capacities become available
2012 – site contribution Wall time – 50/50 T0/1 to T2s
2012 – French sites Wall time – 25/75 T1 to T2s
Other important parameters • Storage (always insufficient…) • 2 PB of disk, 45% at T1 • The balance is not as equal as CPU • Network – extremely well provisioned • T2 connectivity will further improve with LHCONE
More details on workload • Organized activities, including trains • Chaotic analysis
Resource use - tasks • Last year's goal – increase the fraction of organized analysis • Tool – analysis trains • Long-term goal, takes a substantial amount of coordination and user education • The resource use distribution • 10% RAW reconstruction (constant) • 16% train analysis (5% at the beginning of 2012) • 23% chaotic analysis (36% at the beginning of 2012) • 51% Monte-Carlo productions (49% at the beginning of 2012)
The Analysis Trains • Pooling together many user analysis tasks (wagons) in a single set of Grid jobs (the train) • Managed through a web interface by a Physics Working Group conductor (ALICE has 8 PWGs) • Provides a configuration and test platform (functionality, memory, efficiency) and a submission/monitoring interface • Speed – a few days to go through a complete period (PBs of data!) [Diagram labels: MonALISA, Web interface, LPM, AliROOT Analysis Framework, AliEn Grid jobs]
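The benefit of pooling wagons into one train is that each event is read once and shared by all tasks. The sketch below illustrates that idea in plain Python; `run_train`, the event fields and the wagon names are hypothetical and are not the AliROOT analysis framework API.

```python
def run_train(events, wagons):
    """Read every event once and hand it to each wagon (user task) in turn."""
    results = {name: [] for name in wagons}
    reads = 0
    for event in events:          # one read per event, shared by all wagons
        reads += 1
        for name, task in wagons.items():
            results[name].append(task(event))
    return results, reads

# Hypothetical toy events and wagons, for illustration only.
events = [
    {"n_tracks": 3, "pt_max": 1.2},
    {"n_tracks": 7, "pt_max": 4.5},
    {"n_tracks": 1, "pt_max": 0.4},
]
wagons = {
    "multiplicity": lambda ev: ev["n_tracks"],
    "high_pt": lambda ev: ev["pt_max"] > 2.0,
}

results, reads = run_train(events, wagons)
print(reads, results)
```

Running the two wagons as separate jobs would read the three events twice; in the train the number of reads stays at three regardless of how many wagons are attached.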
Data access in analysis • The chaotic and, to some extent, the organized analysis are I/O bound (they need efficient use of disk/network resources) • Average 8 GB/s, peak 20 GB/s • Total data read since 1 April: 120 PB
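As a rough consistency check of the figures above, the average rate can be integrated over the period. This sketch assumes that 8 GB/s is the average read rate over the whole interval, that the interval ends at the date of the meeting (18 September 2012), and that decimal units apply (1 PB = 10^6 GB); none of these is stated explicitly on the slide.

```python
from datetime import date

GB_PER_PB = 1_000_000  # decimal units, an assumption

def total_read_pb(avg_rate_gb_s, start, end):
    """Total volume read (PB) at a constant average rate between two dates."""
    seconds = (end - start).days * 86_400
    return avg_rate_gb_s * seconds / GB_PER_PB

# End date assumed to be the first day of the meeting.
pb = total_read_pb(8, date(2012, 4, 1), date(2012, 9, 18))
print(f"{pb:.1f} PB")
```

Under these assumptions the estimate is about 117.5 PB, in line with the quoted 120 PB.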
CPU efficiency • Stable (but ‘low’), some improvement with time as the share of trains increases over chaotic analysis
CPU efficiency for organized tasks • MC - high, RAW ~OK, trains – still needs improvement
Analysis efficiency • Processing phases per event • Reading event data from disk – sequential • De-serializing the event object hierarchy – sequential • Processing the event – parallelizable • Cleaning the event structures – sequential • Writing the output – sequential but parallelizable • Merging the outputs – sequential but parallelizable [Diagram: per-event timeline (Event #0, #1, #2, …) with phases t_read, t_ds, t_proc, t_cl, t_write, t_merge] A. Gheata – improving analysis efficiency
Analysis efficiency (2) • The efficiency of the analysis job: • job_eff = (t_ds + t_proc + t_cl) / t_total • analysis_eff = t_proc / t_total • Time/event for the different phases depends on many factors • t_read ~ IOPS * event_size / read_throughput – to be minimized • Minimize event size, keep read throughput under control • t_ds + t_cl ~ event_size * n_branches – to be minimized • Minimize event size and complexity • t_proc = Σ_wagons T_i – to be maximized • Maximize number of wagons and useful processing • t_write = output_size / write_throughput – to be minimized A. Gheata – improving analysis efficiency
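The two efficiency definitions can be tried out numerically. The sketch below assumes that the total time is the sum of the five per-event phases (read, de-serialize, process, clean, write) with merging left out, which the slide does not spell out; the timings are arbitrary illustrative units.

```python
def efficiencies(t_read, t_ds, wagon_times, t_cl, t_write):
    """Return (job_eff, analysis_eff) for one event.

    Assumption: t_total is the sum of the five per-event phases;
    the merging phase is not included.
    """
    t_proc = sum(wagon_times)  # t_proc is the sum of the wagon times T_i
    t_total = t_read + t_ds + t_proc + t_cl + t_write
    job_eff = (t_ds + t_proc + t_cl) / t_total
    analysis_eff = t_proc / t_total
    return job_eff, analysis_eff

# Arbitrary illustrative timings: same I/O overhead, one wagon vs five.
one_wagon = efficiencies(t_read=4, t_ds=2, wagon_times=[2], t_cl=1, t_write=1)
five_wagons = efficiencies(t_read=4, t_ds=2, wagon_times=[2] * 5, t_cl=1, t_write=1)

print(f"1 wagon : job_eff={one_wagon[0]:.2f}, analysis_eff={one_wagon[1]:.2f}")
print(f"5 wagons: job_eff={five_wagons[0]:.2f}, analysis_eff={five_wagons[1]:.2f}")
```

With fixed read, de-serialization, cleanup and write costs, adding wagons raises both ratios, which is the argument for maximizing the number of wagons per train.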
Grid upgrades • New AliEn version (v.2-20) – ready for deployment • Lighter catalogue structure • Presently at 500 M LFNs, 2.5x as many PFNs (replicas) • Growing at 10 million new entries per week • Extreme job brokering • Jobs are no longer pre-split with a pre-determined input data set • Potentially one job could process all input data (of the set) at a given site • The data locality principle remains (for now) • The site/central services upgrade needs some downtime, after the end of data taking in Feb. 2013
File Brokering • Current schema: submit 4 jobs with their input files assigned at submission (File 1 to File 5) • Broker per file: submit 3 empty subjobs • When a job starts, analyze as much as possible (in the example: one subjob takes Files 1, 2, 4, 5, another takes File 3) • If nothing is left, just exit. From P. Saiz – AliEn development
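The per-file brokering idea can be sketched as follows. This is an illustration only, not AliEn code: `broker_per_file`, the site names and the file-to-site placement are assumptions chosen so that the outcome matches the example on the slide.

```python
def broker_per_file(file_replicas, subjob_sites):
    """Assign files to subjobs at start time, respecting data locality.

    file_replicas: dict mapping file name -> set of sites holding a replica.
    subjob_sites:  sites where the (initially empty) subjobs start, in order.
    Returns the per-subjob assignments and any files left unprocessed.
    """
    remaining = dict(file_replicas)
    assignments = []
    for site in subjob_sites:
        # A starting subjob takes every still-unprocessed file local to its site.
        local = [f for f, sites in remaining.items() if site in sites]
        for f in local:
            del remaining[f]
        # An empty list means nothing is left for this subjob: it just exits.
        assignments.append((site, local))
    return assignments, sorted(remaining)

# Hypothetical replica placement (the slide does not give the sites).
replicas = {
    "File1": {"SiteA"}, "File2": {"SiteA"}, "File3": {"SiteB"},
    "File4": {"SiteA"}, "File5": {"SiteA"},
}
assignments, left = broker_per_file(replicas, ["SiteA", "SiteB", "SiteA"])
for site, files in assignments:
    print(site, files if files else "nothing left, exit")
```

With this placement the first subjob processes Files 1, 2, 4 and 5, the second processes File 3, and the third finds nothing left and exits.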
Short development roadmap • Data management: • Popularity service • SE layout (EOS-like) • GUIDless catalogue • Job Processing: • Job Merging • Error classification • Multicore and multiagent • Remote access optimization • Combine AF/classical grid CE – interactive Grid
General remarks on the future • 2013-2014 – Long Shutdown 1 • No revolution is (ever) planned, however… • All LHC experiments have submitted LoIs for the LS3 (HL-LHC) upgrades in 2022 • The computing needs are massively larger than today (data rates and volumes, CPU needs) – 10-30x of today – the factors are not yet finalized • Massive online DAQ and HLT event filtering farms, 2x the size of what a T1 is today • No clear ideas how this will be achieved – technologically and financially • Moore’s and Kryder’s laws will not ‘cover’ the needs
General remarks on the future • The present Grid profited from ~10 years of planning and development (on par with the detectors) • And it delivered from day 1, continues to this day • The future planning and development of Grid/Cloud/<Insert name here> should start now – years of experience will help, but not enough • Parallel programming cannot be done by physicists… there are other hurdles too
General remarks on the future • Big improvement is expected from the frameworks and code • Undoubtedly a common effort and professional help will be necessary • Parallelism is a no-brainer, given the technological trends • Big parts of the code must be re-engineered and re-written • Every experiment has a panel which is charged with the design of the ‘new’ software • Crystal balls have been ordered
Summary – back to today • 2012 is so far a standard data taking/processing/analysis year for ALICE – much excitement is expected in February with the p-A data • The operation is smooth and is helped a lot by the mature Grid around the world • The French T1/T2s are part of this structure, with remarkably stable performance and well balanced components – CPU, storage, networks • … and of course solid expert support at all levels – a big **thank you** for this!
Summary – cont. • The (near) future developments are focused on analysis tasks and tools • Emphasis on data containers and process synchronization • Whole node is a promising path and will naturally help the multicore development • Progressive introduction of new features, improvements • The Grid must run continuously also during the LS1 shutdown • Resources (disk) are scarce • More efficient use – fewer replicas, WAN access