ZARAH Status
Marek Kowal, Krzysztof Wrona
OFFLINE Parallel Session
Toronto, 26.06.2000
Plan of talk
• Hardware
• OS support
• Data status
• Known issues
• Future plans / developments
• Summary
Hardware
• 49 PCs (zenith PC farms): PPro 200, PII 350, PIII 450
• 3 SGI multiprocessor machines (44 processors in total)
• 2 SGI fileservers (4 TB of disc space)
• Networking: 100 Mb/s to the nodes, 1 Gb/s main infrastructure, 800 Mb/s HIPPI between the SGIs
OS support
• Currently we support the following architectures on the zarah/zenith farms:
  • i386-suse63-linux (zenith analysis farm)
  • i386-suse-linux (zenith reconstruction farm)
  • mips3-sgi-irix6.2 (zarah cluster)
• We have at least one too many architectures
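The farm-to-architecture mapping above can be written down as a small lookup table, for example to let a job script pick the right binaries. This is only an illustrative sketch: the names `ARCH_BY_FARM` and `split_arch` are hypothetical, and reading the three fields of each identifier as CPU, vendor or distribution, and OS is an assumption, not something the slide states.

```python
# Architecture identifiers exactly as listed on the slide, keyed by farm.
ARCH_BY_FARM = {
    "zenith analysis farm": "i386-suse63-linux",
    "zenith reconstruction farm": "i386-suse-linux",
    "zarah cluster": "mips3-sgi-irix6.2",
}


def split_arch(arch: str) -> tuple[str, str, str]:
    """Split an identifier into its three dash-separated fields.

    Assumption: the fields are (cpu, vendor-or-distribution, os).
    """
    cpu, vendor, os_name = arch.split("-", 2)
    return cpu, vendor, os_name


# Count the distinct architectures that have to be maintained.
distinct_archs = sorted(set(ARCH_BY_FARM.values()))
print(f"{len(distinct_archs)} architectures to support:")
for arch in distinct_archs:
    print("  ", arch, "->", split_arch(arch))
```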
Data status
• Policy - keep the latest MDSTs on disc
• 1994-1999 - all data on disc
• 2000 - up to 13.06
• Currently 300 GB of free disc space
• This is enough for 1.5-2 months
• Additional disc space is needed
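The free-space estimate above implies a fill rate, which can be read off with a short calculation. The sketch below uses only the 300 GB and 1.5-2 month figures from the slide; the function and variable names are illustrative.

```python
def implied_fill_rate(free_gb: float, months: float) -> float:
    """Return the rate (GB/month) at which free_gb is used up in `months`."""
    if months <= 0:
        raise ValueError("months must be positive")
    return free_gb / months


FREE_GB = 300.0                      # free disc space quoted on the slide
LOW_MONTHS, HIGH_MONTHS = 1.5, 2.0   # quoted lifetime of that space

fast_rate = implied_fill_rate(FREE_GB, LOW_MONTHS)   # space lasts only 1.5 months
slow_rate = implied_fill_rate(FREE_GB, HIGH_MONTHS)  # space lasts 2 months

print(f"Implied fill rate: {slow_rate:.0f}-{fast_rate:.0f} GB/month")
```

In other words, the quoted lifetime corresponds to roughly 150-200 GB of new data per month.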
Known issues
• We had to upgrade the analysis farms because of IO problems with the tpfs filesystem
• tpfs apparently does not scale well enough for our purposes and does not work with IRIX 6.5
• Some vital software is still available on zarah only (mkzed, newrzed, ZES)
• LSF 3.0 does not really like SUSE 6.3
• /jobspool on the PC farm is too small
Future plans
• Migrate completely from zarah1
  • mkzed, mknewzed, http, monitor queue
  • ZES
• Additional disc space (2 TB, possibly RAID)
• New PC farm for reconstruction under way - first experiences with dual-processor Pentiums
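For a rough feel of what the planned 2 TB would buy, the calculation below projects how long it would last. It is a sketch under two assumptions that the slides do not state: that data keeps arriving at the 150-200 GB/month rate implied by the Data status slide (300 GB lasting 1.5-2 months), and that 1 TB is counted as 1000 GB.

```python
def months_of_headroom(capacity_gb: float, rate_gb_per_month: float) -> float:
    """Return how many months capacity_gb lasts at a constant fill rate."""
    if rate_gb_per_month <= 0:
        raise ValueError("rate must be positive")
    return capacity_gb / rate_gb_per_month


EXTRA_GB = 2 * 1000.0   # planned 2 TB, assuming 1 TB = 1000 GB
SLOW_RATE = 300.0 / 2.0   # GB/month if 300 GB lasts 2 months
FAST_RATE = 300.0 / 1.5   # GB/month if 300 GB lasts 1.5 months

shortest = months_of_headroom(EXTRA_GB, FAST_RATE)
longest = months_of_headroom(EXTRA_GB, SLOW_RATE)

print(f"2 TB would last about {shortest:.1f}-{longest:.1f} months")
```

Under these assumptions the extra space covers roughly 10 to 13 months of data taking; a change in the data rate would shift that range accordingly.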
Developments
• Replace tpfs with SSF (Simple Staging Facility)
• DiscCache project
• MC naming convention
SSF
• smaller - more stable
• scalable - lots of client nodes
• multipool
• better error recovery
• replaces /tpfs
• problem with zresolve algorithm
Disc Cache project
• Once finished, it will be an official DESY staging facility (supported by DESY and Fermilab)
• Replaces the tpfs/SSF systems completely
• ZARAH takes an active part in DC development
• Deadline still not set - prepare to live with SSF for a longer time...
Summary. ΔT = Tf - T0
• PC farm in full production, a big success
• Upgraded the OS on the analysis farm
• Data kept up to date with reconstruction (with some breaks...)
• PCs can be “stolen” from the analysis farm for the reconstruction farm when needed
• SSF well advanced, DC status still unknown