GFAL and LCG data management Jean-Philippe Baud CERN/IT/GD HEPiX 2004-05-23
Agenda • LCG Data Management goals • Common interface • Current status • Current developments • Medium term developments • Conclusion
LCG Data Management goals • Meet requirements of Data Challenges • Common interface • Reliability • Performance
Common interfaces • Why? • Different grids: LCG, Grid3, NorduGrid • Different Storage Elements • Possibly different File Catalogs • Solutions • Storage Resource Manager (SRM) • Grid File Access Library (GFAL) • Replication and Registration Service (RRS)
Storage Resource Manager • Goal: agree on a single API for multiple storage systems • Collaboration between CERN, FNAL, JLAB, LBNL and EDG • SRM is a Web Service • Offers storage resource allocation and scheduling • SRMs DO NOT perform file transfer • SRMs DO invoke a file transfer service if needed (GridFTP) • Types of storage resource managers • Disk Resource Manager (DRM) • Hierarchical Resource Manager (HRM) • SRM is being discussed at GGF and proposed as a standard
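The split described above, where the SRM allocates and schedules storage but leaves data movement to a transfer service such as GridFTP, can be illustrated with a toy model. This is only a sketch: `ToySRM`, its `get` method, the round-robin server choice and the host names are hypothetical and do not reflect the real SRM web-service interface.

```python
from urllib.parse import urlparse


class ToySRM:
    """Hypothetical stand-in for a storage resource manager (not the real SRM API)."""

    def __init__(self, transfer_hosts):
        # Assumed layout: protocol name -> list of transfer servers offering it.
        self.transfer_hosts = transfer_hosts
        self._next = 0

    def get(self, surl, client_protocols):
        """Return a transfer URL for `surl`. No data is moved here."""
        path = urlparse(surl).path
        for proto in client_protocols:
            hosts = self.transfer_hosts.get(proto)
            if hosts:
                # Round-robin over the servers: a crude form of scheduling.
                host = hosts[self._next % len(hosts)]
                self._next += 1
                return f"{proto}://{host}{path}"
        raise ValueError("no transfer protocol in common with the client")


srm = ToySRM({"gsiftp": ["ftp1.example.org", "ftp2.example.org"]})
turl = srm.get("srm://se.example.org/data/file1", ["rfio", "gsiftp"])
```

The client would then hand `turl` to a separate transfer tool; the manager itself never touches the file contents.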
Grid File Access Library (1) • Goals • Provide a POSIX I/O interface to heterogeneous Mass Storage Systems in a Grid environment • A job using GFAL should be able to run anywhere on the Grid without knowing about the services accessed or the data access protocols supported
Grid File Access Library (2) • Services contacted • Replica Catalogs • Storage Resource Managers • Mass Storage Systems through diverse File Access protocols like FILE, RFIO, DCAP, (ROOT I/O) • Information Services: MDS
Grid File Access Library (3) • [Architecture diagram] • Callers: Physics Application and POOL, through POSIX I/O or the VFS • Inside GFAL: Local File I/O, rfio I/O, dCap I/O and root I/O (each offering open(), read(), etc.), Replica Catalog Client, SRM Client, Information Services Client • Wide Area Access: rfio Service, dCap Service, Root I/O Service, RC Services, SRM Service, MDS
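The layering described in these slides (one POSIX-like API on top, one client per access protocol underneath) can be sketched as a dispatcher keyed on the URL scheme. This is an illustration under stated assumptions, not GFAL's code: `ToyAccessLayer` and `LocalFileBackend` are hypothetical names, and only the local `file` protocol is implemented here.

```python
import os
from urllib.parse import urlparse


class LocalFileBackend:
    """Backend for file: URLs, implemented with ordinary OS calls."""

    def open(self, path, flags):
        return os.open(path, flags)

    def read(self, fd, size):
        return os.read(fd, size)

    def close(self, fd):
        os.close(fd)


class ToyAccessLayer:
    """Hypothetical dispatcher: one POSIX-like API, one backend per protocol."""

    def __init__(self):
        # Other protocols (rfio, dcap, ...) would be registered the same way.
        self.backends = {"file": LocalFileBackend()}
        self.handles = {}  # handle -> (backend, backend-specific descriptor)
        self.next_handle = 0

    def open(self, url, flags=os.O_RDONLY):
        parts = urlparse(url)
        backend = self.backends.get(parts.scheme)
        if backend is None:
            raise NotImplementedError(f"no backend for protocol {parts.scheme!r}")
        fd = backend.open(parts.path, flags)
        handle = self.next_handle
        self.next_handle += 1
        self.handles[handle] = (backend, fd)
        return handle

    def read(self, handle, size):
        backend, fd = self.handles[handle]
        return backend.read(fd, size)

    def close(self, handle):
        backend, fd = self.handles.pop(handle)
        backend.close(fd)
```

The caller only ever sees `open`, `read` and `close`; which protocol client does the work is decided once, at open time, which is what lets a job run unchanged against different storage back ends.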
GFAL File System • GFALFS is now based on the FUSE (Filesystem in USErspace) file system developed by Miklos Szeredi • Uses: • VFS interface • Communication with a daemon in user space (via a character device) • The metadata operations are handled by the daemon, while the I/O (read/write/seek) is done directly in the kernel to avoid context switches and buffer copies • Requires installation of a kernel module fuse.o and of the daemon gfalfs • The file system mount can be done by the user
GFAL support • The GFAL library is very modular and small (~2500 lines of C): the support effort would be minimal unless new protocols or new catalogs have to be supported • Test suite available • GFAL file system: • Kernel module: 2000 lines (FUSE original) + 800 lines (GFAL specific for I/O optimization) • Daemon: 1600 lines (FUSE unmodified) + 350 lines GFAL specific (separate file) • Utilities like mount: 600 lines (FUSE + 5 lines mod)
Replication and Registration Service • Copy and register files • Multiple SEs and multiple Catalogs • Different types of SE • Different types of RC • Different transfer protocols • Optimization, handling of failures • Meeting at LBNL in September 2003 with participants from CERN, FNAL, Globus, JLAB and LBNL • Refined proposal by LBNL being discussed
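The "copy and register" step with failure handling can be sketched as follows. This is a minimal illustration, not the RRS proposal itself: `copy_and_register`, `ToyCatalog` and `CatalogError` are hypothetical names, a local file copy stands in for a grid transfer, and the retry and rollback policy is an assumption.

```python
import os
import shutil
import uuid


class CatalogError(Exception):
    """Raised by a catalog that cannot record a replica (hypothetical)."""


class ToyCatalog:
    """In-memory stand-in for a replica catalog."""

    def __init__(self):
        self.guids = {}     # logical file name -> GUID
        self.replicas = {}  # GUID -> list of replica locations

    def register(self, lfn, location):
        guid = self.guids.setdefault(lfn, str(uuid.uuid4()))
        self.replicas.setdefault(guid, []).append(location)
        return guid


def copy_and_register(src, dest, catalog, lfn, retries=2):
    """Copy src to dest, then register dest; undo the copy if registration fails."""
    last_error = None
    for _ in range(retries + 1):
        try:
            shutil.copyfile(src, dest)
            break
        except OSError as exc:
            last_error = exc  # transfer failed: try again
    else:
        raise last_error      # every attempt failed
    try:
        return catalog.register(lfn, dest)
    except CatalogError:
        os.remove(dest)       # avoid leaving an unregistered replica behind
        raise
```

Treating the two steps as one unit keeps the storage and the catalog consistent: a replica either exists and is registered, or neither.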
Current status (1) • SRM • SRM 1.1 interfaced to CASTOR (CERN), dCache (DESY/FNAL), HPSS (HRM at LBNL) • SRM 1.1 interface to EDG-SE being developed (RAL) • SRM 2.1 being implemented at LBNL, FNAL, JLAB • SRM “basic” being discussed at GGF • LCG currently sees SRM as the best way to balance load between GridFTP servers. This is used at FNAL.
Current status (2) • EDG Replica Catalog • 2.2.7 (improvements for POOL) being tested • Server works with Oracle (being tested with MySQL) • EDG Replica Manager • 1.6.2 in production (works with classical SE and SRM) • 1.7.2 on LCG certification testbed (support for EDG-SE) • Stability and error reporting being improved
Current status (3) • Disk Pool Manager • CASTOR, dCache and HRM were considered for deployment at sites without MSS. • dCache is the product that we are going to ship with LCG2, but this does not prevent sites that have another DPM or MSS from using it. • dCache is still being tested in the LCG certification testbed
CASTOR • This solution was tried first because of local expertise • Functionality OK • Solution dropped by CERN IT management for lack of manpower to provide support worldwide
HRM/DRM (Berkeley) • This system has been used in production for more than a year to transfer data between Berkeley and Brookhaven for the STAR experiment • The licensing and support were unclear • However VDT will probably distribute this software • IN2P3 (Lyon) is investigating whether this solution could provide an SRM interface to their HPSS system
dCache (DESY/FNAL) • Joint project between DESY and FNAL • DESY developed the core part of dCache while FNAL developed the Grid interfaces (GridFTP and SRM) and monitoring tools • dCache is used in production at DESY and FNAL, but also at some Tier centers for CMS • IN2P3 is also investigating whether dCache could be used as a frontend to their HPSS system
Current status (4) • Grid File Access Library • Offers a POSIX I/O API and generic routines to interface to the EDG RC, SRM 1.1, MDS • A library, lcg_util, built on top of GFAL offers a C API and a CLI for Replica Management functions. They are callable from C++ physics programs and are faster than the current Java implementation. • A File System based on FUSE and GFAL is being tested (both at CERN and FNAL)
LCG-2 SE (April release) • Mass Storage access to tape • SRM interfaces exist for CASTOR, Enstore/dCache, HPSS • SRM SEs available at CERN, FNAL, INFN, PIC • Classic SEs (GridFTP, no SRM) deployed everywhere else • GFAL is included in LCG-2; it has been tested against CASTOR SRM and rfio as well as against Enstore/dCache SRM and Classic SEs.
Test suites • Test suites have been written and run against classic SE, CASTOR and dCache for: • SRM • GFAL library and lcg_util • The new version (better performance) of the GFAL File System is being extensively tested against CASTOR and the tests against dCache have started • Latest versions (> 1.6.2) of the Replica Manager support both the classical SEs and the SRM SEs
File Catalogs in LCG-2 • Problems were seen during Data Challenges • performance of Java CLI tools • performance problems due to lack of bulk operations • no major stability problems • JOINs between Replica Catalog and Metadata Catalog are expensive • worked with users and other middleware to reduce these joins (often unnecessary)
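Why the lack of bulk operations hurts can be seen by counting client-server round trips. The sketch below is a toy model only: `CountingCatalog` and its methods are hypothetical, and a counter stands in for network latency.

```python
class CountingCatalog:
    """Toy catalog that counts how many requests it receives."""

    def __init__(self, entries):
        self.entries = entries  # logical file name -> replica location
        self.round_trips = 0

    def lookup(self, lfn):
        # One request per file.
        self.round_trips += 1
        return self.entries.get(lfn)

    def lookup_bulk(self, lfns):
        # One request for the whole list.
        self.round_trips += 1
        return {lfn: self.entries.get(lfn) for lfn in lfns}


names = [f"/grid/demo/file{i}" for i in range(100)]
catalog = CountingCatalog({n: f"srm://se.example.org{n}" for n in names})

for name in names:
    catalog.lookup(name)
single_trips = catalog.round_trips   # one round trip per file

catalog.round_trips = 0
catalog.lookup_bulk(names)
bulk_trips = catalog.round_trips     # one round trip in total
```

With per-request latency dominating, the cost of resolving many files grows with the number of round trips rather than with the amount of data returned.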
Proposal for next Catalogs • Build on current catalogs, and satisfy medium term needs from the DCs • Replica Catalog • like current LRC, but not "local" • we never had "local" ones anyway, since RLI was not deployed • no user defined attributes in catalog -> no JOINs • File Catalog • store Logical File Names • impose a hierarchical structure, and provide "directory-level" operations • user defined metadata on GUID (like in current RMC)
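One way to picture the proposed split is as three small tables: a file catalog mapping hierarchical logical file names to GUIDs, user metadata keyed on GUID, and a replica catalog holding only GUID-to-location pairs. The schema and function names below are assumptions made for illustration, using SQLite in place of the Oracle or MySQL back ends discussed in the talk.

```python
import sqlite3


def make_catalogs():
    """Create the three hypothetical tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE file_catalog (lfn TEXT PRIMARY KEY, guid TEXT NOT NULL);
        CREATE TABLE file_metadata (guid TEXT, key TEXT, value TEXT,
                                    PRIMARY KEY (guid, key));
        CREATE TABLE replica_catalog (guid TEXT, surl TEXT,
                                      PRIMARY KEY (guid, surl));
    """)
    return conn


def add_file(conn, lfn, guid, surls):
    conn.execute("INSERT INTO file_catalog VALUES (?, ?)", (lfn, guid))
    conn.executemany("INSERT INTO replica_catalog VALUES (?, ?)",
                     [(guid, s) for s in surls])


def set_metadata(conn, guid, key, value):
    conn.execute("INSERT OR REPLACE INTO file_metadata VALUES (?, ?, ?)",
                 (guid, key, value))


def list_directory(conn, directory):
    """Directory-level operation: every LFN below `directory`."""
    prefix = directory.rstrip("/") + "/"
    rows = conn.execute(
        "SELECT lfn FROM file_catalog WHERE substr(lfn, 1, ?) = ? ORDER BY lfn",
        (len(prefix), prefix))
    return [r[0] for r in rows]


def replicas_of(conn, guid):
    """Replica lookup touches one table only: no JOIN with metadata."""
    rows = conn.execute(
        "SELECT surl FROM replica_catalog WHERE guid = ? ORDER BY surl", (guid,))
    return [r[0] for r in rows]
```

Because the replica table carries no user-defined attributes, the common "where are the copies of this file" query never needs to join against metadata.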
Replication of Catalogs • need to remove single point of failure and load • during one Saturday of CMS DC, Catalogs accounted for 9% of all external traffic at CERN. • RLI (distributed indexes) was never tested or deployed • RLI does not solve the distributed metadata query problem (it only indexes GUIDs) • IT/DB tested Oracle based replication with CMS during Data Challenge • Proposed to build on this work, and use replicated, not distributed catalogs • small number of sites (~4 - 10) • New design (Replica Catalog and File Catalog) should reduce replication conflicts • need to design the conflict resolution policy: "last updated" might be good enough
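A "last updated" conflict resolution policy can be sketched in a few lines. This is an illustration only: the function name, the `(value, timestamp)` record layout and the tie-breaking rule are assumptions, and the sketch presumes that site clocks are synchronized well enough for timestamps to be comparable.

```python
def merge_last_update_wins(local, remote):
    """Merge two catalog copies; each maps key -> (value, timestamp).

    For a key present on both sides, the entry with the newer timestamp
    is kept. On a tie the local entry is kept (an arbitrary choice).
    """
    merged = dict(local)
    for key, (value, stamp) in remote.items():
        if key not in merged or stamp > merged[key][1]:
            merged[key] = (value, stamp)
    return merged


site_a = {"guid-1": ("srm://se1.example.org/f1", 100),
          "guid-2": ("srm://se1.example.org/f2", 250)}
site_b = {"guid-1": ("srm://se2.example.org/f1", 180),
          "guid-3": ("srm://se2.example.org/f3", 90)}
merged = merge_last_update_wins(site_a, site_b)
```

The policy is simple and always converges, but it silently discards the older of two concurrent updates, which is why reducing the number of conflicts in the first place matters.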
Questions (1) • Is this a good time to introduce security? • authenticated transactions would help with problem analysis • How many sites should have replicated catalogs? • Sites require Oracle (not a large problem, most Tier1's have it and license is not a problem) • replication conflicts rise with more sites. • It depends on outbound TCP issues from worker nodes (but a proxy could be used).
Questions (2) • What about MySQL as a backend? • Oracle/MySQL interaction being investigated by IT/DB and others under a "Distributed Database Architecture" proposal • replication between the two is possible • Likely to use MySQL at Tier-2s and Tier-1s without Oracle • Need to investigate which minimum version of MySQL we require • probably will be MySQL 5.x, when it is stable
Current developments • Bulk operations in EDG RC (LCG certification testbed) • Integration of GFAL with ROOT • Classes TGfal and TGfalFile • Support of ROOT I/O in GFAL • Interface GFAL and lcg_util to EDG-SE
Medium term developments • Reshuffling of Replica Catalogs for performance • Replicated Catalogs instead of Distributed Catalogs • File Collections? • SRM 2.1 • Replication/Registration Service (Arie Shoshani) • Integration of POOL with GFAL to reduce dependencies (using TGfal class)
Important features of SRM 2.1 for LCG (compared to SRM 1.1) • Global space reservation • Directory operations • Better definition of statuses and error codes
Conclusion • In the past 12 months • Common interfaces have been designed, implemented and deployed (SRM and GFAL) • The reliability of the Data Management tools has been improved quite considerably • We are still improving the performance of those tools