BESIII Computing Requirements and Data Management Strategies for High Energy Physics
The BESIII experiment handles a peak data rate of 3000 Hz, generating approximately 1 x 10^10 events annually, with a total data volume quoted as about 2 x 640 TB. Processing this volume requires CPU power, storage systems, and high-speed network access. With four reconstructions per year, the software framework must be designed to minimize data size and speed up access. Tape reading/writing speed, network bottlenecks, and the management of large-scale storage and PC farms are the main challenges for data handling in the experiment.
Presentation Transcript
BESIII computing 王贻芳
Peak data volume/year • Peak data rate: 3000 Hz • Events/year: 1 x 10^10 • Total BESIII data: about 2*640 TB
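As a rough cross-check of these figures, the sketch below computes how long the detector would have to run at the peak rate to collect the quoted event count, and the raw data volume that count implies. It assumes the 3000 Hz peak rate is sustained (the slides do not say so) and borrows the 12 KB per-event size from the online recording estimate on the tape slide; the variable names are illustrative.

```python
# Back-of-envelope check of the yearly event count.
# Assumption (not from the slides): the 3000 Hz peak rate is sustained.
PEAK_RATE_HZ = 3000
EVENTS_PER_YEAR = 1e10
RAW_EVENT_KB = 12  # per-event size used in the online recording estimate

live_seconds = EVENTS_PER_YEAR / PEAK_RATE_HZ
live_days = live_seconds / 86_400  # 86,400 seconds in a day

# Raw volume in decimal units: 1 TB = 1e9 KB.
raw_volume_tb = EVENTS_PER_YEAR * RAW_EVENT_KB / 1e9

print(f"Implied running time at peak rate: {live_days:.1f} days")
print(f"Raw data volume at 12 KB/event: {raw_volume_tb:.0f} TB")
```

Under these assumptions the event count corresponds to a little under 39 days of running at peak rate and about 120 TB of raw data per year.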
BESIII computing needs • CPU power • Storage • Network • System software
CPU power needed for data reconstruction and simulation • Four reconstructions/year • Equivalent to a farm of 200 P4 1.6 GHz CPUs • Analysis needs another farm of 100 P4 1.6 GHz CPUs • May be underestimated
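To give a feel for what a 200-CPU farm means per event, the sketch below estimates the CPU time available for each reconstructed event. It is only an upper bound under assumptions that are not in the slides: the farm runs around the clock all year and spends all of its time on reconstruction, with no share for simulation. The event count is taken from the data-volume slide.

```python
# Rough CPU-time budget per event for the reconstruction farm.
# Assumptions (not from the slides): the farm runs around the clock all year
# and spends all of its time on reconstruction (no simulation share).
FARM_CPUS = 200
RECONSTRUCTIONS_PER_YEAR = 4
EVENTS_PER_YEAR = 1e10  # from the data-volume slide
SECONDS_PER_YEAR = 365 * 86_400

events_to_process = RECONSTRUCTIONS_PER_YEAR * EVENTS_PER_YEAR
cpu_seconds_available = FARM_CPUS * SECONDS_PER_YEAR
budget_s_per_event = cpu_seconds_available / events_to_process

print(f"CPU budget: {budget_s_per_event:.3f} s per event")
```

Any downtime or simulation load shrinks this budget, which is consistent with the slide's remark that the farm size may be underestimated.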
Data type and storage media • Store 3 reconstructions • Virtual storage library • Fast and automatic access
Tape reading/writing speed • Online data recording (writing): 3000*12KB = 36 MBytes/s • Reconstruction (reading/writing): 3000*24KB*5*2 = 760 MBytes/s • MC simulation: 3000*24KB*2 = 152 MBytes/s • It is almost impossible! We should design our software framework carefully to minimize the data size!
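The tape estimates are all of the form event rate times event size times extra factors. The sketch below evaluates the three products exactly as they are written, in decimal units (1 MB = 1000 KB). The helper name `throughput_mb_per_s` is illustrative, and the meaning of the factors 5 and 2 is not spelled out in the slides, so they are passed through without interpretation. Evaluated this way, the reconstruction and simulation products come to 720 and 144 MBytes/s, slightly below the 760 and 152 quoted on the slide; the quoted figures may include inputs that are not shown.

```python
# Tape throughput estimates, evaluated exactly as the products are written.
EVENT_RATE_HZ = 3000
KB_PER_MB = 1000  # decimal units, consistent with 3000*12KB = 36 MBytes/s


def throughput_mb_per_s(event_kb, *factors):
    """Return event rate * event size * extra factors, in MBytes/s."""
    total_kb_per_s = EVENT_RATE_HZ * event_kb
    for factor in factors:
        total_kb_per_s *= factor
    return total_kb_per_s / KB_PER_MB


online = throughput_mb_per_s(12)                # 3000*12KB
reconstruction = throughput_mb_per_s(24, 5, 2)  # 3000*24KB*5*2
mc_simulation = throughput_mb_per_s(24, 2)      # 3000*24KB*2

for name, value in [("online", online),
                    ("reconstruction", reconstruction),
                    ("MC simulation", mc_simulation)]:
    print(f"{name}: {value:.0f} MBytes/s")
```

Either way, reconstruction needs roughly twenty times the online recording rate, which is why the slide calls for a smaller event size.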
Disk read/write access speed • Data reconstruction: 3000*2KB*5 = 30 MB/s • MC simulation: 3000*2KB*5 = 30 MB/s • Analysis: 40*40 Mb/s = 200 MB/s, main building to computer center
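The analysis figure mixes units: the product is in megabits, the result in megabytes. The sketch below reproduces the three disk estimates under the assumption that "Mb" means megabits per second and that the first factor of 40 is the number of concurrent users (the network slide mentions more than 40 users); neither reading is stated explicitly in the slides.

```python
# Disk access estimates, with the megabit-to-megabyte conversion made explicit.
EVENT_RATE_HZ = 3000
KB_PER_MB = 1000   # decimal units
BITS_PER_BYTE = 8

# Reconstruction and MC simulation: 3000 * 2 KB * 5
reco_mb_per_s = EVENT_RATE_HZ * 2 * 5 / KB_PER_MB

# Analysis: assumed to be 40 users, each reading at 40 megabits per second.
USERS = 40
MBIT_PER_USER = 40
analysis_mbit_per_s = USERS * MBIT_PER_USER
analysis_mb_per_s = analysis_mbit_per_s / BITS_PER_BYTE

print(f"reconstruction / MC simulation: {reco_mb_per_s:.0f} MB/s each")
print(f"analysis: {analysis_mbit_per_s} Mb/s = {analysis_mb_per_s:.0f} MB/s")
```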
Network needs • From online farm to computer center: 3000*12KB = 36 MB/s; a dedicated Gbps network line for safety and stability • Offline farm network: a bottleneck • User analysis: > 40 users access disk files
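To see how much margin a dedicated line leaves for the online transfer, the sketch below converts the 36 MB/s requirement to megabits per second and compares it with the link capacity. It assumes "Gbps" means a nominal 1 Gb/s link and ignores protocol overhead; both are simplifications not taken from the slides.

```python
# Utilization of a dedicated link by the online-farm transfer.
# Assumptions (not from the slides): nominal 1 Gb/s capacity, no overhead.
EVENT_RATE_HZ = 3000
RAW_EVENT_KB = 12
BITS_PER_BYTE = 8
LINK_CAPACITY_MBIT = 1000  # 1 Gb/s in decimal units

required_mb_per_s = EVENT_RATE_HZ * RAW_EVENT_KB / 1000
required_mbit_per_s = required_mb_per_s * BITS_PER_BYTE
utilization = required_mbit_per_s / LINK_CAPACITY_MBIT

print(f"required: {required_mb_per_s:.0f} MB/s = {required_mbit_per_s:.0f} Mb/s")
print(f"link utilization: {utilization:.1%}")
```

Under these assumptions the online stream uses under a third of the link, leaving headroom for bursts and retransmission.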
Software system • Mainly based on free software • CERN library based • Following latest development of HEP software
Main bottlenecks • Tape reading/writing: use more CPU to reduce data size • Network: within PC farm, main building to CC; DST in main building? • Large-scale storage library management • Large-scale PC farm: stability, scalability, management