
TESTING THE CDF DISTRIBUTED COMPUTING FRAMEWORK


Presentation Transcript


Authors: V. Bartsch (3), M. Burgon-Lyon (1), A. Baranowski (2), S. Belforte (4), G. Garzoglio (2), R. Herber (2), R. Illingworth (2), R. Kennedy (2), U. Kerzel (5), A. Kreymer (2), M. Leslie (3), L. Loebel-Carpenter (2), A. Lyon (2), W. Merrit (2), F. Ratnikov (6), R. St. Denis (1), A. Sill (7), S. Stonjek (2, 1), I. Terekhov (2), J. Trumbo (2), S. Veseli (2), S. White (2)

Affiliations: (1) University of Glasgow, (2) Fermi National Accelerator Laboratory, (3) University of Oxford, (4) Istituto Nazionale di Fisica Nucleare, (5) Universität Karlsruhe, (6) Rutgers University, (7) Texas Tech University

Introduction:
• SAM and CAF
  • definitions and technical information: see the poster on Deployment of SAM for the CDF experiment
  • up to now, about 1 PB of data has been delivered with SAM to clients at the SAM stations
  • CAF used for various analysis and Monte Carlo production jobs on a daily basis

[Figure: Number of jobs on the CAF]
[Figure: Total CDF Files To User, 2002 to 2004; scale mark at 1000 TB]

Stress tests on the standard CAF at Fermilab:

[Figure: Total amount of data read daily by CDF on the CAF]

• Stress tests:
  • create the usual user load
  • 50 SAM projects on the CAF and move 20 TBytes per day
  • split jobs into several segments
  • run several parts of the job in parallel on different CPUs
• Issues discovered and solved:
  • submitting more than 100 SAM projects at one time
    • problems with the project master
  • large number of files (order of 10 000 files)
    • optimizer problems: it checks the location of files when the next file of the dataset is requested or after a network outage, and blocks other requests
    • could slow down all SAM stations
    • problem is already solved, but the number of files requested should still be kept small

User friendliness:
• recovery of partly failed user projects is possible with the sam recovery dataset command
  • depends on the correct (or not at all) release of the file

Conclusion:
• test phase successful; deployment of SAM for the CAF system foreseen this autumn
• limits are beyond requirements; a new round of testing is needed to probe the limits

[Figure: Data consumed per day by the central SAM station during the tests]

Decentralized CAF:

[Figure: Comparison of the total amount of CPU and disk space of the CAF and the DCAF systems; panels: Disk, CPU; dates: July 04, Dec 04; label: FNAL]

• Goal: in 2005, 50% of the computing outside Fermilab
  • distributed computing
  • use of DCAF (Decentralized CDF Analysis Farm)
  • SAM station environment has to be common to all stations, and adaptations to the environment have to be made
• Disk growth slower than CPU growth
  • need a fast network, or use DCAFs for Monte Carlo production
• software working on the same cluster as the LCG software is desirable, to spread CDF jobs over more clusters
  • use of JIM, see the poster on JIM deployment for the CDF experiment
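
For a sense of scale, the stress-test target quoted in the transcript (50 SAM projects on the CAF moving 20 TBytes per day) can be converted into an average sustained transfer rate. The sketch below is only back-of-the-envelope arithmetic, not part of the poster: it assumes decimal units (1 TByte = 10^12 bytes) and an even split of the load across projects, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_mb_per_s(tbytes_per_day: float) -> float:
    """Average rate in MB/s needed to move the given daily volume.

    Assumes decimal units: 1 TByte = 1e12 bytes, 1 MB = 1e6 bytes.
    """
    bytes_per_day = tbytes_per_day * 1e12
    return bytes_per_day / SECONDS_PER_DAY / 1e6


def per_project_rate_mb_per_s(tbytes_per_day: float, n_projects: int) -> float:
    """Average rate per project, assuming the load is shared evenly."""
    return sustained_rate_mb_per_s(tbytes_per_day) / n_projects


if __name__ == "__main__":
    total = sustained_rate_mb_per_s(20)
    per_project = per_project_rate_mb_per_s(20, 50)
    print(f"aggregate: {total:.1f} MB/s, per project: {per_project:.2f} MB/s")
```

Under these assumptions the target corresponds to an aggregate of roughly 230 MB/s, or under 5 MB/s per project on average; real load would be burstier than this average suggests.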
