At the April 2001 meeting of the Transatlantic Networking Committee, Boaz Klima presented the status of DØ Monte Carlo production. Key progress includes full detector simulation, improved data processing, and the introduction of offsite Monte Carlo Production Centers (MCPCs) aimed at generating large event samples efficiently. The talk covered current CPU capacity, data storage strategy, and future projections for bandwidth and MC event generation. These collaborative efforts are expected to enhance analysis and research within the DØ experiment at Fermilab.
DØ Monte Carlo (and Data) Production
Boaz Klima, Fermilab
Transatlantic Networking Committee Meeting, Apr. 18, 2001
DØ MC Products
• DØGSTAR – full, detailed detector simulation (output: hits and generator info)
• DØSIM – digitization, raw data, trigger info (output: digi hits, trig/raw chunks)
• DØRECO – reconstruction (output: tracks, objects, particles); also used for data
• DØRECO_ANALYZE – algorithms' analysis package (output: ROOT-tuple with particle info)
• DØTRIGSIM – L1/L2/L3 trigger simulation (output: ROOT-tuple with trigger info)
DØ MC Products (cont.)
• "Average" CPU time/event (750 MHz PC) and event size at L ~ 1E32 (2.5 additional min-bias events); both depend on event type and generator:
• DØGSTAR – ~1 min and ~1–2 MB
• DØSIM – ~0.5 min and ~0.6–2 MB (~0.3 MB without digi) (combines more detail and lower cutoffs [cutneu] with shower evolution as fast as the Run 1 shower library [fast!])
• DØRECO – ~1 min and ~1 MB (depends on what is written out)
• DØRECO_ANALYZE – ~5 sec and ~10 kB
• DØTRIGSIM – ~15 sec and ~10 kB
• Entire chain – ~2–3 min (a rough throughput sketch follows below)
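A minimal sketch, assuming only the per-stage times quoted above, of how the ~2–3 min full-chain figure translates into per-CPU daily throughput; the stage names and numbers come from the slide, everything else is illustrative arithmetic.

```python
# Rough per-CPU throughput estimate for the full MC chain (750 MHz PC),
# using only the stage times quoted on the slide above.
stage_minutes = {
    "DØGSTAR": 1.0,
    "DØSIM": 0.5,
    "DØRECO": 1.0,
    "DØRECO_ANALYZE": 5 / 60,
    "DØTRIGSIM": 15 / 60,
}

chain_minutes = sum(stage_minutes.values())          # ~2.8 min/event
events_per_cpu_per_day = 24 * 60 / chain_minutes     # ~500 events/day

print(f"Full chain: ~{chain_minutes:.1f} min/event")
print(f"One CPU:    ~{events_per_cpu_per_day:.0f} fully simulated events/day")
# Consistent with the ~500-1000 events/day per CPU quoted later in the talk.
```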
DØ Data Storage User & Admin. Interface (API and GUI) Station F Project Project Consumer/ Producer Station A Station E Project Consumer/ Producer Project Master DB and Information Servers Consumer/ Producer Mass Storage System Project Global Optimizer Consumer/ Producer Station D Station B Consumer/ Producer Station C • Data Interface via SAM/ENSTORE model Boaz Klima
DØ Data Processing
• Run 2 began on Mar. 1, 2001!
• Will we have enough CPU power at Fermilab to keep up with data processing? (currently 200 processors; a rough capacity check follows below)
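A rough capacity check, assuming the ~1 min/event DØRECO time from the earlier slide also applies to real data on the 750 MHz farm nodes; the logging rates in the loop are hypothetical parameters, not figures from this talk.

```python
# Rough reconstruction-capacity check for the 200-processor Fermilab farm.
# Uses the ~1 min/event DØRECO time quoted earlier (750 MHz PC); the example
# logging rates below are illustrative only.
reco_minutes_per_event = 1.0
processors = 200

capacity_events_per_day = processors * 24 * 60 / reco_minutes_per_event
print(f"Farm capacity: ~{capacity_events_per_day:,.0f} events/day")

def processors_needed(rate_hz, minutes_per_event=reco_minutes_per_event):
    """CPUs needed to keep up with a given average event-logging rate."""
    return rate_hz * 60 * minutes_per_event

for rate in (5, 10, 20):   # hypothetical average rates in events/s
    print(f"{rate} Hz logging rate -> ~{processors_needed(rate):.0f} CPUs needed")
```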
DØ Data Samples
• Data are stored in formats of different sizes:
• TMB – for basic analysis and event selection
• DST – contains enough info for almost all analyses
• STA – mainly for debugging purposes
DØ Monte Carlo Production
• Plan to generate ALL MC events off-site:
• Currently 1 CPU can fully simulate ~500–1000 events/day
• Current DØ Data "Grid": ~500 CPUs (*not completely DØ)
• Generate 50–100M events/year (a quick consistency check follows below)
• Some farms will be upgraded substantially this year or next
• Current total bandwidth to Fermilab: ~50–100 Mb/s
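A quick back-of-the-envelope check that the per-CPU rate and Grid size quoted above are consistent with the 50–100M events/year target; all inputs are the numbers from this slide.

```python
# Consistency check of the off-site MC production numbers quoted above.
events_per_cpu_per_day = (500, 1000)   # fully simulated events/day per CPU
grid_cpus = 500                        # current DØ data "Grid"
days_per_year = 365

low = events_per_cpu_per_day[0] * grid_cpus * days_per_year
high = events_per_cpu_per_day[1] * grid_cpus * days_per_year
print(f"Yearly capacity: {low/1e6:.0f}M - {high/1e6:.0f}M events")
# -> roughly 90M - 180M events/year, comfortably covering the 50-100M target
#    (before accounting for duty cycle and non-DØ use of shared farms).
```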
DØ Data "Grid" – Locations
[Map of grid/MCPC sites, including Tata (?), Prague, Rio (?), and Boston]
MC Production Centers (MCPCs)
• Massive MC generation offsite – a new philosophy at DØ
• Requires a control system and careful/smart monitoring
• MCPCs are a GREAT deal for DØ:
• Excellent for DØ physics (closer to the e+e- model of data/MC)
• Money is spent by, and the computers stay at, the home institution (not DØ)
• Local manpower for software development & maintenance
• For some institutions this fits well within a grand plan (LHC, Grid)
• Current CPU power for MC: ~10,000× that at the start of Run 1
• ~100× more MC data than in Run 1 (with full GEANT!)
• Thanks to dedicated people at every active MCPC
DØ MC Production to-date
• Generated ~4M MC events for three phases of the MC Challenge and processed them through all stages
• Testing programs and infrastructure, and exploring physics
• MCPCs process events through the entire chain of programs and ship the RECO output files to Fermilab
Production – under discussion
• Storage and mode of operation
• What to store and where (currently RECO output @ Fermilab)
• Balance between different needs and the budget (tapes are expensive!)
• Transferring MC data (ftp, FedEx, by hand, …)
• Control and monitoring system
• Operating large vs. medium-size MCPCs
• Interaction with SAM
• Should MC Production Centers be ready to process data? (databases, …)
• …
Conclusions
• DØ's complete MC production uses offsite farms
• World-wide "Grid" (2–4 continents)
• Currently 6 active MCPCs; soon adding 2–3 new MCPCs
• CPU power in Mar. '01: ~10,000× that at the start of Run 1
• Generating 50–100 million fully simulated events per year
• Ship them to Fermilab or store at remote sites?
• Generation so far meets demand; extremely useful for all parties
• Significant upgrades expected in 2001/2
• Capable of generating as many events as data taken at DØ!!!
Summary for TAN
• DØ Monte Carlo production is up and working now
• World-wide "Grid" (2–4 continents)
• Currently 6 active MCPCs; 2–3 new ones will be added soon
• Current total bandwidth to Fermilab: ~50–100 Mb/s
• Shipping MC data back and forth is essential → total bandwidth ~200 Mb/s (2001)
• Significant upgrades expected in 2001/2/3 → total bandwidth ~400 Mb/s (2002)
• Real-data processing at remote farms + reprocessing (?) → total bandwidth ~800 Mb/s (2002)
• Future? Current guesstimate: total bandwidth ~1200/1600/3200/4000 Mb/s (2003/4/5/6)
• Is TA bandwidth available? TA institutions in the U.S.?
(A rough estimate of the rate needed just for MC RECO output follows below.)
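To put these bandwidth figures in context, here is a rough estimate of the average rate needed just to ship the yearly MC RECO output back to Fermilab, using the ~1 MB/event RECO size and 50–100M events/year quoted earlier; this is illustrative arithmetic, not a number from the talk.

```python
# Rough average bandwidth needed to ship yearly MC RECO output to Fermilab.
# Uses the ~1 MB/event RECO size and 50-100M events/year quoted earlier;
# the result is illustrative, not a figure from the talk.
events_per_year = (50e6, 100e6)
reco_mb_per_event = 1.0                  # ~1 MB RECO output per event
seconds_per_year = 365 * 86400

for n in events_per_year:
    avg_mbit_per_s = n * reco_mb_per_event * 8 / seconds_per_year
    print(f"{n/1e6:.0f}M events/year -> ~{avg_mbit_per_s:.0f} Mb/s average")
# -> roughly 13-25 Mb/s sustained, before peaks, retransfers, real-data
#    processing, or shipping larger (pre-RECO) formats.
```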