BaBar Tier A @ CC-IN2P3: an example of a data access model in a Tier 1
Jean-Yves Nief, CC-IN2P3, Lyon

Presentation Transcript


1. BaBar Tier A @ CC-IN2P3
An example of a data access model in a Tier 1
Jean-Yves Nief, CC-IN2P3, Lyon
LCG Phase 2 Planning Meeting - Friday July 30th, 2004

2. Overview of BaBar @ CC-IN2P3 (I)
• CC-IN2P3: mirror site of SLAC for BaBar since November 2001:
  • real data.
  • simulation data.
  (total = 220 TB)
• Provides the infrastructure needed by the end users to analyze these data.
• Open to all BaBar physicists.

3. Overview of BaBar @ CC-IN2P3 (II)
• 2 types of data available:
  • Objectivity format (commercial OO database): being phased out.
  • ROOT format (ROOT I/O; Xrootd developed @ SLAC).
• Hardware:
  • 200 GB tapes (type: 9940).
  • 20 tape drives (r/w rate = 20 MB/s).
  • 20 Sun servers.
  • 30 TB of disks used as a permanent cache (ratio disk/tape = 15%; effectively ~30% when rarely accessed data are ignored).
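As a quick consistency check (my own arithmetic, not from the slides), the two quoted ratios follow directly from the numbers above; the ~30% figure implies an "actively accessed" dataset of roughly 100 TB.

```cpp
// Back-of-the-envelope check of the disk/tape ratios quoted on this slide.
#include <cstdio>

int main()
{
   const double disk_tb  = 30.0;    // disk cache
   const double total_tb = 220.0;   // real + simulation data on tape

   // ~14%, i.e. the quoted "15%" when rounding
   printf("disk/tape ratio           : %.0f%%\n", 100.0 * disk_tb / total_tb);

   // If only frequently accessed data is counted, a ~30% ratio implies an
   // active dataset of roughly this size:
   printf("active data at ~30%% ratio : ~%.0f TB\n", disk_tb / 0.30);   // ~100 TB
   return 0;
}
```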

4. BaBar usage @ CC-IN2P3
• 2002 - 2004: ~20% of the available CPU (out of a total of ~1000 CPUs).
• Up to 450-500 user jobs running in parallel.
• "Remote access" to the Objectivity and ROOT files from the batch worker (BW):
  • random access to the files: only the objects needed by the client are transferred to the BW (~kB per request) (see the sketch below).
  • hundreds of connections per server.
  • thousands of requests per second.
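For illustration (not part of the original slides), a minimal ROOT/C++ sketch of this access pattern: the client opens a file through an xrootd URL and enables only the branch it needs, so only a few kB per request travel to the batch worker. The redirector host, file path, tree and branch names are hypothetical placeholders.

```cpp
#include "TFile.h"
#include "TTree.h"

void read_remote()
{
   // TFile::Open() dispatches on the URL scheme; "root://" selects the
   // xrootd client, which fetches data in small chunks on demand.
   TFile *f = TFile::Open("root://xrootd-redirector.example//store/T1.root");
   if (!f || f->IsZombie()) return;

   TTree *t = nullptr;
   f->GetObject("events", t);          // hypothetical tree name
   if (!t) { f->Close(); return; }

   Float_t mass = 0;
   t->SetBranchStatus("*", 0);         // disable everything ...
   t->SetBranchStatus("mass", 1);      // ... except the branch we need
   t->SetBranchAddress("mass", &mass);

   // Only the baskets of the "mass" branch cross the network (~kB per read),
   // not the whole ~500 MB file.
   for (Long64_t i = 0; i < t->GetEntries(); ++i) t->GetEntry(i);

   f->Close();
}
```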

5. Data access model
[Diagram: the client asks the master daemon (Xrootd / Objy) on the master servers "T1.root ?", and is redirected to a slave daemon (Xrootd / Objy) on one of the data servers, whose disks form a cache in front of HPSS.]
• (1) + (2): dynamic load balancing.
• (4) + (5): dynamic staging.
• (6): random access to the data.
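The following C++ sketch is an assumption of mine about the flow in the diagram, not the actual xrootd/Objectivity code: the master picks a server that already holds the file (or the least loaded one), the chosen server stages the file from HPSS if needed, and the client then reads it randomly.

```cpp
#include <set>
#include <string>
#include <vector>

struct DataServer {
   std::string           host;
   double                load = 0;     // current load metric reported to the master
   std::set<std::string> cache;        // files already present on the disk cache

   bool has_on_disk(const std::string &f) const { return cache.count(f) > 0; }
   void stage_from_hpss(const std::string &f)   { cache.insert(f); }  // (4)+(5)
};

// (1)+(2) dynamic load balancing: the master daemon answers "T1.root ?" by
// picking a server that already holds the file, otherwise the least loaded one.
DataServer *select_server(std::vector<DataServer> &servers, const std::string &file)
{
   DataServer *best = nullptr;
   for (auto &s : servers) {
      if (s.has_on_disk(file)) return &s;          // cache hit: use it directly
      if (!best || s.load < best->load) best = &s;
   }
   return best;
}

// (3) the client is redirected to the chosen server; (4)+(5) that server
// stages the file from HPSS into its disk cache if needed; (6) the client
// then performs random access against the staged copy.
DataServer *open_file(std::vector<DataServer> &servers, const std::string &file)
{
   DataServer *s = select_server(servers, file);   // e.g. "T1.root"
   if (s && !s->has_on_disk(file)) s->stage_from_hpss(file);
   return s;
}
```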

6. Dynamic staging
• Average file size: 500 MB.
• Average staging time: 120 s.
• When the system was overloaded (before the dynamic load balancing era): 10-15 min delays, with only 200 jobs.
• Up to 10k files from tape to disk cache per day (150k staging requests per month!).
• Maximum of 4 TB from tape to disk cache per day.
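A back-of-the-envelope check (my own arithmetic, using the hardware numbers from slide 3) of how these staging figures fit together:

```cpp
#include <cstdio>

int main()
{
   const double file_mb   = 500.0;    // average file size (MB)
   const double stage_sec = 120.0;    // average staging time per file (s)
   const int    drives    = 20;       // 9940 tape drives
   const double files_day = 10000.0;  // observed peak: files staged per day

   // Effective per-file staging rate, including mount/seek overhead:
   printf("effective rate  : %.1f MB/s (vs 20 MB/s nominal drive rate)\n",
          file_mb / stage_sec);                                   // ~4.2 MB/s

   // Ceiling if all drives staged 500 MB files back-to-back all day:
   const double max_tb_day = drives * (86400.0 / stage_sec) * file_mb / 1e6;
   printf("staging ceiling : %.1f TB/day\n", max_tb_day);         // ~7.2 TB/day

   // The 10k files/day peak at the average file size is of the same order
   // as the quoted 4 TB/day maximum:
   printf("10k files/day   : %.1f TB/day\n", files_day * file_mb / 1e6);  // ~5 TB/day
   return 0;
}
```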

7. Dynamic load balancing
• Up and running since December 2003 for Objectivity (before, a given file could only be staged on one given server).
• No more delayed jobs (even with 450 jobs in parallel).
• More efficient management of the disk cache (the entire disk space is seen as a single file system).
• Fault tolerance in case of server crashes.

8. Pros …
• Mass Storage System (MSS) usage completely transparent for the end user.
• No cache space management by the user.
• Extremely fault tolerant (server crashes or maintenance work).
• Highly scalable + entire disk space efficiently used.
• On the admin side: you can choose your favourite MSS and your favourite staging protocol (SLAC: pftp, Lyon: RFIO, …) (see the sketch below).
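A minimal sketch of the idea behind "choose your favourite staging protocol": the data server only needs a site-configured command that copies a file from the MSS into the disk cache. This is an assumption about the mechanism, not the actual xrootd configuration, and the "rfcp" command line and paths are purely illustrative.

```cpp
#include <cstdlib>
#include <string>

// Stage one file from the MSS to the local disk cache by invoking a
// site-chosen copy command (e.g. an RFIO copy at Lyon, pftp at SLAC).
int stage_file(const std::string &stage_cmd,    // e.g. "rfcp" (illustrative)
               const std::string &mss_path,     // file in HPSS
               const std::string &cache_path)   // destination on the disk cache
{
   const std::string cmd = stage_cmd + " " + mss_path + " " + cache_path;
   return std::system(cmd.c_str());             // 0 on success
}

// Hypothetical usage:
//   stage_file("rfcp", "/hpss/in2p3.fr/babar/T1.root", "/cache/babar/T1.root");
```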

9. … and cons
• The entire machinery relies on many different components (especially a MSS).
• In case of very high demand on the client side, the response time can be really slow. But it also depends on:
  • the number of data sets available.
  • a good data structure.

10. Data structure: the fear factor
• The performance of the data access model also depends on the data structure.
• Deep copies vs "pointer" files (files containing only pointers to objects in other files)? (See the sketch below.)
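An illustrative ROOT/C++ sketch of the two options (not from the slides; the file, tree and branch names are hypothetical): a deep copy rewrites the selected events into a self-contained file, while a "pointer"-style selection only records entry numbers, so every later read still goes back to the original files, which can mean scattered random access and extra tape staging.

```cpp
#include "TDirectory.h"
#include "TEventList.h"
#include "TFile.h"
#include "TTree.h"

void deep_copy_vs_pointers()
{
   TFile *in = TFile::Open("T1.root");          // hypothetical input file
   TTree *events = nullptr;
   in->GetObject("events", events);             // hypothetical tree name

   TFile out("skim.root", "RECREATE");

   // Option 1: deep copy. Selected events are rewritten into the new file;
   // later reads touch only "skim.root".
   TTree *skim = events->CopyTree("mass > 5.2");
   skim->Write();

   // Option 2: "pointer" file. Only a list of entry numbers is stored; later
   // reads still have to open and seek into the original file(s).
   events->Draw(">>sel", "mass > 5.2");
   TEventList *sel = (TEventList *)gDirectory->Get("sel");
   sel->Write("sel_pointers");

   out.Close();
   in->Close();
}
```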

11. What about other experiments?
• Xrootd is well adapted to user jobs that use ROOT to analyse a large dataset.
• It is being included in the official version of ROOT.
• Already set up in Lyon and being used or tested by other groups: D0, EUSO and INDRA.
• Transparent access to files stored in HPSS.
• No need to manage the disk space.

12. Summary
• Storage and data access is the main challenge.
• A good disk/tape ratio is hard to find: it depends on many factors (users, number of tape drives, etc.).
• Xrootd provides lots of interesting features for remote data access.
• Extremely robust (a great achievement for a distributed system).
