Analysis Trains

Costin Grigoras, Jan Fiete Grosse-Oetringhaus. ALICE Offline Week, 04.10.12.


Presentation Transcript


  1. Analysis Trains
     Costin Grigoras, Jan Fiete Grosse-Oetringhaus
     ALICE Offline Week, 04.10.12

  2. LEGO Trains
     • 42 trains configured (37 active)
       • 5 CF, 4 GA, 1 PP, 8 JE, 5 DQ, 11 HF, 8 LF
     • Submitted trains this year
       • 213 CF, 35 DQ, 24 GA, 124 HF, 173 JE, 114 LF, 3 PP
     • 1-5 train operators / train
     • Operator mailing list: alice-analysis-train-operators@cern.ch
     • TWiki page: https://twiki.cern.ch/twiki/bin/viewauth/ALICE/AnalysisTrains
     Since 01.02.12: on average 2400 jobs at any given time
     Analysis Trains - Jan Fiete Grosse-Oetringhaus

  3. Running Statistics
     [Plot; legend: alidaq, aliprod, alitrain, SUM]

  4. Time until trains finish
     • Time between train submission and submission of the final merging job
     • Average below 2 days (good!) but with quite some spread
     [Plots: per train; average per month]
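
The finishing-time metric defined on this slide can be sketched in a few lines. This is only an illustration, assuming each train is available as a pair of timestamps (train submission, submission of the final merging job); the function name `monthly_turnaround` and the sample timestamps are hypothetical and not taken from the train system.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def monthly_turnaround(trains):
    """Group trains by submission month and return, per month,
    (average, spread) of the finishing time in days.

    `trains` is an iterable of (submitted, final_merge_submitted)
    datetime pairs; this input layout is an assumption.
    """
    by_month = defaultdict(list)
    for submitted, final_merge in trains:
        days = (final_merge - submitted).total_seconds() / 86400.0
        by_month[submitted.strftime("%Y-%m")].append(days)
    # The population standard deviation stands in for the "spread".
    return {m: (mean(d), pstdev(d)) for m, d in sorted(by_month.items())}


# Made-up timestamps, for illustration only.
sample = [
    (datetime(2012, 9, 3, 10), datetime(2012, 9, 4, 10)),   # 1 day
    (datetime(2012, 9, 10, 8), datetime(2012, 9, 13, 8)),   # 3 days
    (datetime(2012, 10, 1, 12), datetime(2012, 10, 2, 0)),  # 0.5 days
]
stats = monthly_turnaround(sample)
```

Reporting a spread next to the average matters here because a monthly mean below 2 days can still hide individual trains that take much longer.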

  5. AliEn Upgrade
     • This Monday's partial upgrade to v2-20 had a few side effects
       • General interruption from 10.00 to midnight; during this period Costin & Pablo worked continuously on fixing the situation
       • Jobs (merging jobs in particular) submitted during that time failed and had to be retried later → Mistake: LPM should have been disabled for the upgrade
       • New status FAILED, which is not considered a final state → led to some delay for merging jobs, fixed today (the parallel failure of CERN EOS makes submission very slow)
       • Bug in SE selection, some jobs go to FAILED → being fixed by Pablo at present
     • I propose that planned upgrades are evaluated specifically with respect to the analysis trains, and that a plan is made for recovering from failures that occur during the upgrade period

  6. Planned Improvements

  7. Improve Merging
     • Merging
       • Dedicated CE/SE for merging (at CERN) → being investigated
       • Merging job submission to be sped up (at the moment it depends on the number of waiting analysis jobs)
     • Job Splitting
       • Investigate the new AliEn option to select the input files once the job has started → increases the number of files per job (less merging, more files for event mixing)

  8. Train Statistics
     • Add consumed CPU and wall time, in total and per job, to the run view
     [Example display: 2.2y CPU total, 3.2y wall total, 3.2h CPU / job, 4.2h wall / job, 4.7 files / job]
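
The totals and per-job figures proposed for the run view are simple aggregates. Below is a minimal sketch, assuming per-job CPU time, wall time, and input file counts are available as records; `JobRecord`, `summarize`, and the sample values are hypothetical and do not reflect the actual train bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical per-job accounting record (not an AliEn type)."""
    cpu_hours: float
    wall_hours: float
    n_files: int


def summarize(jobs):
    """Return totals, per-job averages, and CPU/wall efficiency for a run."""
    n = len(jobs)
    if n == 0:
        raise ValueError("no jobs to summarize")
    cpu_total = sum(j.cpu_hours for j in jobs)
    wall_total = sum(j.wall_hours for j in jobs)
    files_total = sum(j.n_files for j in jobs)
    return {
        "cpu_total_h": cpu_total,
        "wall_total_h": wall_total,
        "cpu_per_job_h": cpu_total / n,
        "wall_per_job_h": wall_total / n,
        "files_per_job": files_total / n,
        # Fraction of wall time spent on CPU; 0.0 if no wall time was recorded.
        "cpu_efficiency": cpu_total / wall_total if wall_total > 0 else 0.0,
    }


# Made-up records, for illustration only.
summary = summarize([JobRecord(3.0, 4.0, 5), JobRecord(1.0, 2.0, 3)])
```

The CPU/wall ratio is an extra derived quantity not listed on the slide; it comes for free once both totals are shown.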

  9. Dataset Selection
     • Allow users to indicate on the interface which datasets they would like to run on
       • Operator marks datasets as "active" (similar to wagons)
       • User selects the desired datasets among those
     [Interface: Desired datasets: LHC10h_AOD086, LHC11h_AOD095, …]
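
The two-step selection described here (the operator activates datasets, the user picks among the active ones) could be validated roughly as follows. This is a sketch under assumptions: the function `select_datasets` and its behaviour on invalid requests are hypothetical, and only the two dataset names come from the slide.

```python
def select_datasets(active, requested):
    """Return the user's requested datasets, provided each one has been
    marked active by the operator.

    Raises ValueError for any dataset that is not active. Duplicates in
    `requested` are dropped while preserving the order of first appearance.
    """
    active_set = set(active)
    not_active = [d for d in requested if d not in active_set]
    if not_active:
        raise ValueError(f"datasets not marked active: {not_active}")
    return list(dict.fromkeys(requested))


# The operator has activated two datasets; the user asks for one of them.
active = ["LHC10h_AOD086", "LHC11h_AOD095"]
chosen = select_datasets(active, ["LHC11h_AOD095", "LHC11h_AOD095"])
```

Rejecting inactive datasets outright, rather than silently dropping them, keeps the user informed that part of the request will not run.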

  10. Merging Test
     • Also test the merging per wagon
     [Interface: Merging test: OK / Failed]

  11. Further Ideas
     • Number of wagons
       • Enabling/disabling by lists (of wagon numbers / names?)
       • Saving / loading of train configurations
       • Groups of wagons
     • Ordering of wagons

  12. Demo …some new features…

  13. Summary
     • The LEGO train system has become very popular
     • The average finishing time of a train is 2 days, but with quite some spread
     • We have lots of improvement requests and ideas
     • We lack manpower (there is only Costin and me, both with many other tasks, too), which sometimes leads to long response times
