BaBar Tier A @ CC-IN2P3
An example of a data access model in a Tier 1
Jean-Yves Nief, CC-IN2P3, Lyon
LCG Phase 2 Planning Meeting - Friday July 30th, 2004
Overview of BaBar @ CC-IN2P3 (I)
• CC-IN2P3: mirror site of SLAC for BaBar since November 2001:
  • real data.
  • simulation data. (total = 220 TB)
• Provides the infrastructure needed by end users to analyze these data.
• Open to all BaBar physicists.
Overview of BaBar @ CC-IN2P3 (II)
• 2 types of data available:
  • Objectivity format (commercial OO database): being phased out.
  • ROOT format (ROOT I/O; Xrootd developed @ SLAC).
• Hardware:
  • 200 GB tapes (type: 9940).
  • 20 tape drives (r/w rate = 20 MB/s).
  • 20 Sun servers.
  • 30 TB of disks, used as a permanent cache (ratio disk/tape = 15%; effectively ~30% if rarely accessed data are ignored).
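As a back-of-the-envelope check of the quoted disk/tape ratios (a sketch using only the figures on this slide, nothing measured):

```python
# Figures taken from the slide above.
tape_total_tb = 220   # total data held on 200 GB tapes
disk_cache_tb = 30    # disk cache sitting in front of the tapes

ratio = disk_cache_tb / tape_total_tb
print(f"disk/tape ratio: {ratio:.0%}")        # ~14%, quoted as ~15%

# If the effective ratio is ~30% once rarely accessed data are ignored,
# the implied "hot" dataset is roughly:
hot_tb = disk_cache_tb / 0.30
print(f"implied hot dataset: {hot_tb:.0f} TB")  # ~100 TB
```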
BaBar usage @ CC-IN2P3
• 2002 - 2004: ~20% of the available CPU (out of ~1000 CPUs).
• Up to 450-500 user jobs running in parallel.
• "Distant access" to the Objectivity and ROOT files from the batch workers (BW):
  • random access to the files: only the objects needed by the client are transferred to the BW (~kB per request).
  • hundreds of connections per server.
  • thousands of requests per second.
Data access model
• (1) The client asks a master server (master daemon: Xrootd / Objy) where T1.root is; (2) the master picks a data server and redirects the client ((1) + (2): dynamic load balancing).
• (3) The client contacts the chosen data server (slave daemon: Xrootd / Objy).
• (4) + (5) Dynamic staging: if the file is not on the server's disks, it is staged in from HPSS.
• (6) The client reads the data with random access.
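The six steps above can be sketched as follows. This is an illustrative simulation, not actual Xrootd code; all class and method names are assumptions made for the example.

```python
class DataServer:
    """One of the Sun servers with local disks in front of HPSS."""

    def __init__(self, name):
        self.name = name
        self.disk_cache = set()   # files currently on local disk
        self.load = 0             # open client connections

    def stage_from_mss(self, path):
        # (4) + (5): dynamic staging -- copy the file from tape (HPSS)
        # into the local disk cache before serving it.
        self.disk_cache.add(path)

    def read(self, path, offset, nbytes):
        # (6): random access -- only the requested byte range crosses
        # the network, not the whole file.
        if path not in self.disk_cache:
            self.stage_from_mss(path)
        self.load += 1
        return f"{nbytes} bytes of {path} @ {offset}"


class MasterServer:
    """Redirector: knows the data servers but serves no data itself."""

    def __init__(self, servers):
        self.servers = servers

    def locate(self, path):
        # (1) + (2): dynamic load balancing -- prefer a server that
        # already caches the file, otherwise the least-loaded one.
        cached = [s for s in self.servers if path in s.disk_cache]
        pool = cached or self.servers
        return min(pool, key=lambda s: s.load)


# (1) client asks the master where T1.root lives; (2) master redirects;
# (3) client contacts the chosen data server; (4)-(6) as above.
master = MasterServer([DataServer("srv1"), DataServer("srv2")])
server = master.locate("T1.root")
print(server.read("T1.root", offset=1024, nbytes=4096))
# -> 4096 bytes of T1.root @ 1024
```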
Dynamic staging
• Average file size: 500 MB.
• Average staging time: 120 s.
• When the system was overloaded (before the dynamic load balancing era): 10-15 min delays (with only 200 jobs).
• Up to 10k files staged from tape to disk cache per day (150k staging requests per month!).
• Up to 4 TB moved from tape to disk cache per day.
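A quick sanity check of these staging figures (a sketch based only on the numbers quoted on this slide and the drive rate from the hardware overview):

```python
file_mb = 500      # average file size (this slide)
drive_mb_s = 20    # 9940 tape drive r/w rate (hardware slide)

# Pure transfer time for one average file off one drive:
transfer_s = file_mb / drive_mb_s
print(f"pure transfer: {transfer_s:.0f} s")
# 25 s -- so most of the 120 s average staging time is presumably
# tape mount, seek and queueing, not the transfer itself.

# 10k staged files/day at 500 MB each would be about:
daily_tb = 10_000 * file_mb / 1_000_000
print(f"10k files/day ~ {daily_tb:.0f} TB/day")
# ~5 TB/day, the same order as the observed 4 TB/day peak.
```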
Dynamic load balancing
• Up and running since December 2003 for Objectivity (previously a file could only be staged on one given server).
• No more delayed jobs (even with 450 jobs in parallel).
• More efficient management of the disk cache (entire disk space seen as a single file system).
• Fault tolerance in case of server crashes.
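The last two bullets can be sketched as follows: the per-server disk caches behave as one file system, and a crashed server is simply skipped so a file can be re-staged elsewhere. Names are illustrative assumptions, not Xrootd internals.

```python
servers = {
    "srv1": {"files": {"A.root"}, "alive": True},
    "srv2": {"files": {"B.root"}, "alive": True},
}

def global_namespace(servers):
    # Entire disk space seen as a single file system:
    # the union of all live servers' caches.
    files = set()
    for s in servers.values():
        if s["alive"]:
            files |= s["files"]
    return files

def locate(servers, path):
    # Fault tolerance: a dead server is never picked; if no live copy
    # exists, return None so the file is re-staged on a live server
    # instead of the job being delayed.
    for name, s in servers.items():
        if s["alive"] and path in s["files"]:
            return name
    return None

print(sorted(global_namespace(servers)))   # ['A.root', 'B.root']
servers["srv2"]["alive"] = False           # simulate a server crash
print(locate(servers, "B.root"))           # None -> re-stage elsewhere
```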
Pros …
• Mass Storage System (MSS) usage completely transparent to the end user.
• No cache space management by the user.
• Extremely fault tolerant (server crashes, maintenance work).
• Highly scalable, and the entire disk space is used efficiently.
• On the admin side: you can choose your favourite MSS and your favourite staging protocol (SLAC: pftp, Lyon: RFIO, …).
… and cons
• The entire machinery relies on many different components (especially an MSS).
• Under very high client demand, response time can be really slow. This also depends on:
  • the number of data sets available.
  • a good data structure.
Data structure: the fear factor
• A well-performing data access model also depends on the data structure.
• Deep copies vs "pointer" files (containing only pointers to other files)?
What about other experiments?
• Xrootd is well adapted to user jobs using ROOT to analyze a large dataset:
  • being included in the official version of ROOT.
  • already set up in Lyon and being used or tested by other groups: D0, EUSO and INDRA.
  • transparent access to files stored in HPSS.
  • no need to manage the disk space.
Summary
• Storage and data access is the main challenge.
• A good disk/tape ratio is hard to find: it depends on many factors (users, number of tape drives, etc.).
• Xrootd provides many interesting features for distant data access, and is extremely robust (a great achievement for a distributed system).