
Computing for CDF run II (2000 and beyond) (a work in progress !) SUMMARY


Presentation Transcript


1. Computing for CDF Run II (2000 and beyond) (a work in progress!) — SUMMARY
• Software engineering:
  • Object Oriented migration
  • Languages
• Data:
  • size
  • storage/handling
• Analysis in Pisa:
  • data size
  • computing model
  • hardware needs
• Network:
  • analysis needs
  • on-line monitoring needs
• Commercial software
A "keep your feet on the ground" approach
Stefano Belforte - INFN Pisa

2. Software Engineering I
• CDF will evolve to an OO approach, with large emphasis on capitalizing on past code and experience:
  • mixed-language environment
  • C++ is the basic choice for the Run II offline
  • FORTRAN supported as well (expect it will die out sometime)
  • JAVA in the online only, so far (as well as C)
  • wrap old code in C++ and run it in the new environment (see the sketch below)
  • data access still allowed via BANKS; OO is encouraged but not required
• Software engineering tools left to the experts so far: keep the design simple, concentrate on physics. A lot of time was wasted on class charts built on thin air; the real work is starting now.
• The learning curve on the new paradigm has just started. 90% of the new code is in C++, but the ability to run it together with the old Fortran code, to check new algorithms and build a complete package right away, was vital.
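To make the "wrap old code in C++" point concrete, here is a minimal sketch, not actual CDF code: the routine name `trkfit_` and its arguments are invented. Most Unix f77 compilers export `SUBROUTINE TRKFIT` as the symbol `trkfit_` and pass every argument by reference, so a thin `extern "C"` wrapper lets new C++ code call the legacy routine unchanged (the compiled Fortran object file is still needed at link time, and the exact name-mangling convention is compiler dependent).

```cpp
// Hypothetical legacy FORTRAN routine: SUBROUTINE TRKFIT(NHITS, HITS, PT).
// A typical Unix f77 compiler exports it as "trkfit_" with all arguments
// passed by reference; the exact convention is compiler dependent.
extern "C" void trkfit_(const int* nhits, const float* hits, float* pt);

#include <vector>

// Thin C++ wrapper: new OO code calls this and never sees the FORTRAN
// calling convention.  The FORTRAN object file is needed at link time.
float fitTrackPt(const std::vector<float>& hits)
{
    int   nhits = static_cast<int>(hits.size());
    float pt    = 0.0f;
    trkfit_(&nhits, hits.empty() ? 0 : &hits[0], &pt);  // old code does the work
    return pt;
}
```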

3. Software Engineering II
• Simulation, reconstruction and analysis code run as user modules embedded in a general framework that provides:
  • data I/O, user interfaces, dynamic module invocation, database communication, etc.
• The framework (A_C++) is developed together with BaBar, by professionals.
• User modules are written by physicists, with OO/C++ experts providing assistance; some of the more complex modules may be written by experts, but most are written by "everybody", from old-timers (Fortran IV addicts) to young students: physics competence first!
• It is a very large project; the startup was difficult, and it is one major concern of the collaboration. Now you know as much as I do. (A schematic module sketch follows.)
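The following is a schematic illustration of the module-in-framework pattern described above, not the real A_C++ interface; every class and method name here is invented. The point it shows is the division of labour: the framework owns the event loop, I/O and module invocation, while a physicist-written module only implements a few hooks.

```cpp
// Schematic only -- NOT the actual A_C++ API.  The framework drives the
// event loop; user modules plug in through a small, fixed interface.
#include <iostream>
#include <vector>

struct Event { int run, number; };            // stand-in for the event record

class Module {                                // the contract a user module fulfils
public:
    virtual ~Module() {}
    virtual void beginJob() {}
    virtual void event(const Event& e) = 0;   // the physics goes here
    virtual void endJob()   {}
};

class EventCounter : public Module {          // a trivial "user module"
    int seen_ = 0;
public:
    void event(const Event&) override { ++seen_; }
    void endJob() override { std::cout << "processed " << seen_ << " events\n"; }
};

// The "framework": reads events and hands each one to every registered module.
void runFramework(const std::vector<Module*>& modules, int nEvents)
{
    for (Module* m : modules) m->beginJob();
    for (int i = 0; i < nEvents; ++i) {
        Event e{1, i};                        // in real life: read from the input stream
        for (Module* m : modules) m->event(e);
    }
    for (Module* m : modules) m->endJob();
}

int main()
{
    EventCounter counter;
    std::vector<Module*> modules{&counter};
    runFramework(modules, 100);
}
```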

4. Compilers and Platforms
• If it were only the new code, it would be a breeze. But CDF wants to carry on the good old stuff.
• Lesson from the past: stay as standard as possible, only ANSI-compliant C++ compilers. KAI (a commercial one) is the only one supported so far.
• Platform support is needs-driven:
  • SGI: simply the machines we have at FNAL
  • Linux: what we will use for Production and Level 3
  • Digital Unix: just because we (Pisa and many other universities) asked for it, and we (Pisa) are working on it (with Padova)
  • Solaris: talked about as inevitable, but still has to come
  • AIX: likely disappearing (no one wants it)
• The party line is to do things so that platform migration will be easy. All in all, we already survived VMS → ACP → AMDAHL → UNIX!
• The "usual" FNAL story: everything is manpower limited.

5. Code Distribution
• Code management via CVS
• Release management via Soft_Rel_Tool (shared with BaBar)
• So far FNAL only supports distribution as "packaged products" via the laboratory's own ups/upd methods
• No AFS
• Code distributed as tar files; may need local rebuilding
• CDF will need local system manager help for this!

6. Data Storage
• Not an easy decision. The past was YBOS sequential files… they would not scale easily to the Run II data size, but they worked → keep them for the initial debugging of detector and code, and as a fallback. Then choose among:
  • Objectivity, the home-grown LWDB, old-fashioned sequential files, ROOT.
• We just had:
  • a 1-year debate, a 3-month technical review, a 2-day workshop; the recommendation came out 5 days ago, and we still await a definitive management ruling.
• Recommendation: ROOT… but compatible with YBOS sequential files, and leave the door open for Objectivity in the future. (A minimal ROOT example follows.)
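As an illustration of what the ROOT recommendation buys, here is a minimal sketch (to be compiled and linked against the ROOT libraries; the branch names are invented, not the CDF event model) that writes events into a compressed TTree. Reading it back for analysis is equally short.

```cpp
// Minimal ROOT I/O sketch -- illustrative branch names, not the CDF data model.
#include "TFile.h"
#include "TTree.h"

int main()
{
    TFile file("pad.root", "RECREATE");      // ROOT files are compressed by default
    TTree tree("pad", "toy PAD events");

    int   nTracks = 0;
    float ptSum   = 0;
    tree.Branch("nTracks", &nTracks, "nTracks/I");
    tree.Branch("ptSum",   &ptSum,   "ptSum/F");

    for (int i = 0; i < 1000; ++i) {         // stand-in for the event loop
        nTracks = 10 + i % 5;
        ptSum   = 2.5f * nTracks;
        tree.Fill();                         // one entry per event
    }

    tree.Write();
    file.Close();
    return 0;
}
```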

7. Data Size
• Data logging: 75 Hz for 2 years (2 inverse fb)
• Raw event size: 250 KB
• Adding analysis data and data overlap: 520 KB
• Reduced analysis data set (PAD) = 60 KB/event
  • hope to get ~30 KB with ROOT compression
• Overall data set for Run II: 1 PB = 1000 TB
• Overall PAD size: 160 TB
• FNAL data storage (all data are at FNAL, at least!):
  • 30 TB disk (25 TB PADs + 10 TB robot cache)
  • 400 TB tape robot (a guess so far)
  • rest on shelf tapes
• 200 physicists doing analysis at FNAL, about 20 at Pisa; still, simple scaling doesn't work (we can't have 3 TB of disk + a 40 TB robot in Pisa, and the data cannot be partitioned among physicists). (A back-of-envelope check follows.)
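A back-of-envelope check of the figures above. Only the slide's numbers are used as input; the implied live time and logging bandwidth below are simply what those numbers work out to, not official estimates: 1 PB at 520 KB/event is about 2 × 10^9 events, i.e. roughly 300 days of live running at 75 Hz, with a peak logging rate near 40 MB/sec.

```cpp
// Back-of-envelope check of the Run II data-size figures quoted above.
// Event sizes and the 1 PB total come from the slide; the live time and
// logging bandwidth are just what those figures imply.
#include <cstdio>

int main()
{
    const double logRate    = 75.0;     // Hz
    const double eventSize  = 520e3;    // bytes/event, raw + analysis data
    const double totalBytes = 1e15;     // 1 PB for the full Run II data set

    const double nEvents  = totalBytes / eventSize;        // ~1.9e9 events
    const double liveDays = nEvents / logRate / 86400.0;   // ~300 days of live time
    const double logMBps  = logRate * eventSize / 1e6;     // ~39 MB/sec peak

    std::printf("events      : %.2e\n",        nEvents);
    std::printf("live time   : %.0f days\n",   liveDays);
    std::printf("logging rate: %.0f MB/sec\n", logMBps);
    return 0;
}
```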

8. Data Analysis in Pisa I
• Scenario 1: copy here the PADs we are interested in:
  • W/Z: 2 million events = 60 GB (gets top as well!)
  • J/psi: 80 million events = 2.4 TB
  • B → ππ: 1 Hz for 1 year = 30 million events = 1 TB
  • maybe less inclusive, maybe less compression: 3 TB total
  • need: a 3 TB robot, O(500 GB) of disk — about 3 times the na48farm; buy it in 2000+… same cost?
  • What about SUSY, jets…?
• Scenario 2: keep the PADs at FNAL, copy n-tuples here:
  • wild guess: will need 10 GB per physicist; adding code, common stuff etc., 300 GB of disk, plus HSM for backups etc. — about the size of the present na48farm
  • BUT: n-tuple refresh from FNAL! One set a week means 10 GB × 20 users = 200 GB/week = 500 KB/sec FNAL-to-Pisa (see the estimate below)
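The Scenario 2 bandwidth figure works out as follows. The assumption that the weekly refresh must fit into about five working days is mine; spread over a full seven-day week the requirement drops to roughly 330 KB/sec.

```cpp
// Rough network-load estimate behind Scenario 2: 10 GB of n-tuples per
// physicist, refreshed once a week, for 20 users.  The 5-working-day
// window is an assumption; over 7 days the rate drops to ~330 KB/sec.
#include <cstdio>

int main()
{
    const double perUser = 10e9;          // bytes refreshed per user per week
    const int    users   = 20;
    const double window  = 5 * 86400.0;   // assume transfers fit in 5 days

    const double weekly    = perUser * users;   // 200 GB/week
    const double sustained = weekly / window;   // bytes/sec

    std::printf("weekly volume : %.0f GB\n",     weekly / 1e9);
    std::printf("sustained rate: %.0f KB/sec\n", sustained / 1e3);
    return 0;
}
```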

9. Data Analysis in Pisa II
• Scenario 3: keep everything at FNAL.
  • just use X-terminals in Pisa and run everything through the network
  • need a line at least as good as my home 28.8K modem = 4 KB/sec/user = 80 KB/sec to FNAL
  • past attempts always failed; even now file transfer to/from FNAL is O(10 KB/sec) (e.g. Netscape), and interactive shells are already almost impossible to use.
• Scenario 4: keep a local copy of only the currently used n-tuples.
  • like Scenario 2 (need local power comparable to the na48farm), but limit the refresh rate to 1 GB/week/user (I can already hear people screaming for more!)
  • still need 20 GB/week = 50 KB/sec of guaranteed bandwidth.

10. Data Analysis in Pisa: The Final Answer
• We will have to try; we can't pick the right approach before the collaboration has finalized the data handling and distribution tools, and the analysis topics have been pinpointed.
• We will try everything; user pressure will drive the choice.
• Needs will be dominated by physics output (maybe we find SUSY in 3-lepton samples and everybody looks at this small data set…).
• We will exploit local computing as much as possible to reduce the network load (the likely bottleneck, as it always has been).
• We will still need to access the FNAL PADs to produce the data sets to copy to Pisa. If the network is no good we will use tapes (expensive though!). But we desperately need guaranteed bandwidth for interactive work.
• If we cannot log in at FNAL, there is no way to do most analyses here; we could only use "dead" data sets: no express data, no hot topics, just late sidelines… the good old way: take the plane and go to FNAL.

11. Analysis in Pisa: My personal bottom line
• The CDF group in Pisa will put together a local computing facility which will not cost more than what we have spent e.g. for Run I analysis or the na48farm: one central server with 3 TB of robotic tapes and O(500) GB of disk, plus O(10) top-of-the-line PCs running Linux with 10 GB of local disk each.
• We will be able to do really competitive analysis on hot topics only if INFN can provide a really good network for interactive response.
  • 10 min to download a few-Mbyte PostScript document: OK
  • 10 sec to move the cursor in a remote editor: UNACCEPTABLE

12. Pisa: Mainframe or Cluster?
• Need a mainframe for the robot, file server etc.
• A PC farm supported via Linux, to be used as needed.
• Will mostly depend on hardware costs at the time of purchase (2 years from now or more).
• Past experience with scattered hardware taught that it must be avoided:
  • hardware on a desk is no good: hot, noisy, gets damaged
  • hardware bought a bit at a time is a mess
  • distributed computing is a maintenance nightmare
• Anyhow, we will probably have both: efficiency and ease of management call for a central facility (at present also cost-effective), but users will have PCs on their desks anyhow.

13. Italy: global center or everyone by himself?
• The data are just too much; we cannot bring to Italy everything we need and cut the line with FNAL.
• Thus we need a good link (see above) from each INFN site to FNAL anyhow.
• Given good access to FNAL resources, for remote job entry at least, the local requirements "per user" are small; sites with fewer than 5 physicists working full time on analysis at any given time may get by with a few PC-like workstations.
• At present we see neither a need nor an advantage in concentrating the computing of several INFN sites in just one.
• For example, in the past different INFN sites picked different analysis topics (very sensible); then it is better to keep everybody's data near the user.
• A global center may, all in all, increase the network needs; I would like to see the lesser requirements satisfied first!

14. Network needs
• Offline was covered before: we need guaranteed interactive work; it will probably be OK if heavy graphics is slow. Large data sets will be transferred via tapes.
• Online: a new front.
  • 1. Silicon Vertex Tracker: a crucial part of the Level 2 trigger, it needs a lot of online monitoring, care and debugging; the experts will be in Pisa.
  • 2. Internal Silicon Layers: our largest contribution to the new detector; we want to keep ownership after installation, and it needs on-line assistance from Pisa as well.
  • 3. Remote Control Room: if we can do shifts from Pisa, we save on travel.
• Item 3 is about saving money; items 1 and 2 are about getting proper credit and physics rewards from the present large efforts: keep ownership of our detectors!
• NEED: 2 or 3 X-terminals that work as extensions of the FNAL LAN.

15. Extending the FNAL LAN
• Needed for online monitoring and debugging, and for the remote control room.
• We do not need the full 1 MB/sec of Ethernet; tests are needed, but probably 200 KB/sec is good enough.
• We do not need it at all times (a problem at FNAL, a phone call to the expert in Pisa, get the line for a few hours).
• It has to be a guaranteed connection, something to rely on; if it fails a few times, the collaboration will request the expert on site.
• Can "you" do it? Can I have a dedicated slice of the net kept aside for just my terminal?
• Do we have to buy 64K ISDN by the minute, as for videoconferencing?
• A lot of work to do, but we need agreement on the direction to move.

16. Commercial Software
• Back to the beginning: CDF is a dinosaur slowly evolving from the Run I environment, no jumping.
• DI3000 in Run I: a bad experience.
• Present status (the way I read it): commercial software is being introduced, but slowly, and most of the collaboration (rightfully) pulls the brakes; no special enthusiasm.
• Software packages in use:
  • general:
    • ROOT
    • cvs
    • ZOOM (FNAL C++ Cernlib replacement)
    • KAI C++ compiler
    • mSQL (proposed)
  • just for professionals:
    • memory analysers (Sniff++ etc.)
    • OO development tools

17. Conclusion
• The picture is only now starting to clear up.
  • If this worries you, at least I am not alone.
• I cannot show a really neat plan; still…
• We will not need much computing hardware before the data arrive; we are confident we can handle that pretty much as in the last run, but we need local help for system management and code distribution.
• We have higher goals this time:
  • competitive, top-of-the-line physics
  • keep ownership of the detectors we build
• For this we have new needs:
  • network!
  • network!!
  • network!!!
