This presentation by Marek Kowal and Krzysztof Wrona covers the current status, known issues, and future plans for the ZARAH systems. It details the hardware, including the zenith PC farms and SGI machines, and lists the supported operating systems. On the data side, the policy is to keep the latest data on discs, and additional storage will be needed soon. It also discusses problems with the existing tpfs file system, the planned migration to the SSF staging facility, and the longer-term Disc Cache project.
ZARAH Status
Marek Kowal, Krzysztof Wrona
OFFLINE Parallel Session, Toronto, 26.06.2000
Plan of talk
• Hardware
• OS support
• Data status
• Known issues
• Future plans / developments
• Summary
Hardware
• 49 PCs (zenith PC farms): PPro 200, PII 350, PIII 450
• 3 SGI multiprocessor machines (44 processors in total)
• 2 SGI fileservers (4 TB of disc space)
• Networking: 100 Mb/s to the nodes, 1 Gb/s main infrastructure, 800 Mb/s HIPPI between the SGIs
OS support
• Currently we support the following architectures on the zarah/zenith farms:
  • i386-suse63-linux (zenith analysis farm)
  • i386-suse-linux (zenith reconstruction farm)
  • mips3-sgi-irix6.2 (zarah cluster)
• We have at least one too many architectures (a dispatch sketch follows below)
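A minimal sketch of how a common script might dispatch on these architecture strings, assuming Python is available on all three platforms; the mapping table and helper name are illustrative, not the actual site configuration:

```python
import platform

# Hypothetical mapping from (OS, CPU) as reported by the platform module
# to the site architecture strings listed above; illustration only.
SUPPORTED = {
    ("Linux", "i686"): "i386-suse63-linux",   # zenith analysis farm
    ("Linux", "i586"): "i386-suse-linux",     # zenith reconstruction farm
    ("IRIX",  "mips"): "mips3-sgi-irix6.2",   # zarah cluster
}

def zarah_arch():
    """Return this host's ZARAH architecture string, or None if unsupported."""
    return SUPPORTED.get((platform.system(), platform.machine()))

print(zarah_arch() or "unsupported host")
```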
Data status
• Policy: keep the latest MDSTs on discs
• 1994-1999: ALL data on discs
• 2000: up to 13.06
• Currently 300 GB of free disc space, enough for 1.5-2 months (see the estimate below)
• Additional disc space is needed
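A quick back-of-envelope check of the "1.5-2 months" figure; the implied fill rate of roughly 150-200 GB/month is an inference from the numbers on this slide, not a measured value:

```python
# 300 GB free; the assumed fill rates bracket the "1.5-2 months" estimate.
free_gb = 300
for rate_gb_per_month in (150, 200):
    print(f"{rate_gb_per_month} GB/month -> "
          f"{free_gb / rate_gb_per_month:.1f} months of headroom")
```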
Known issues
• We had to upgrade the analysis farms because of I/O problems with the tpfs filesystem
• tpfs apparently does not scale well enough for our purposes, and it does not work with IRIX 6.5
• Some vital software is still available on zarah only (mkzed, newrzed, ZES)
• LSF 3.0 does not really like SuSE 6.3
• /jobspool on the PC farm is too small (a free-space check is sketched below)
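For the /jobspool problem, a minimal free-space check of the kind a farm monitor could run; the paths and the 1 GB threshold are assumptions for illustration:

```python
import shutil

THRESHOLD = 1 * 1024**3  # warn below 1 GB free (illustrative value)

for path in ("/jobspool", "/tmp"):
    usage = shutil.disk_usage(path)  # raises OSError if the path is missing
    if usage.free < THRESHOLD:
        print(f"{path}: only {usage.free / 1024**2:.0f} MB free")
```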
Future plans
• Migrate completely from zarah1:
  • mkzed, mknewzed, http, monitor queue
  • ZES
• Additional disc space (2 TB, possibly RAID)
• New PC farm for reconstruction under way: first experiences with dual-processor Pentiums
Developments
• Replace tpfs by SSF (Simple Staging Facility)
• Disc Cache project
• MC naming convention
SSF
• Smaller, hence more stable
• Scalable: handles lots of client nodes
• Multipool (a client-side sketch follows below)
• Better error recovery
• Replaces /tpfs
• Open problem with the zresolve algorithm
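To make "multipool" and "better error recovery" concrete, here is a hedged sketch of staging a file into one of several disc pools with retries; the pool paths, retry policy, and copy mechanism are all assumptions, not the actual SSF interface:

```python
import random
import shutil
import time
from pathlib import Path

# Hypothetical pool layout; the real SSF configuration is not shown here.
POOLS = [Path("/ssf/pool1"), Path("/ssf/pool2"), Path("/ssf/pool3")]

def stage_in(source: Path, retries: int = 3) -> Path:
    """Copy a tape-backed file into one of the disc pools,
    backing off on transient errors and falling back to other pools."""
    last_err = None
    for pool in random.sample(POOLS, len(POOLS)):  # spread load over pools
        target = pool / source.name
        for attempt in range(retries):
            try:
                shutil.copy2(source, target)
                return target                      # staged successfully
            except OSError as err:
                last_err = err
                time.sleep(2 ** attempt)           # exponential back-off
    raise RuntimeError(f"staging failed on all pools: {last_err}")
```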
Disc Cache project
• Once finished, it will be the official DESY staging facility (supported by DESY and Fermilab)
• Replaces the tpfs/SSF systems completely
• ZARAH takes an active part in DC development
• Deadline still not set: prepare to live with SSF for a longer time...
Summary: ΔT = Tf − T0
• PC farm in full production: a big success
• Upgraded OS on the analysis farm
• Data kept up to date with reconstruction (with some breaks...)
• PCs can be "stolen" from the analysis farm for the reconstruction farm when needed
• SSF well advanced; DC status still unknown