This data management overview discusses the usage of SEs, file placement, FTS transfers, job brokering, current system challenges, the future of the replica catalog, and access, transfer and file-staging protocols. It outlines the adoption of gfal2 and FTS3, the need for compatible URLs, and the transition to the DIRAC File Catalog. It also covers dynamic data caching, data popularity metrics, and the use of http/webdav for data location and access.
Brief overview of current DM
• Replica catalog: LFC
  • LFN -> list of SEs
• SEs are defined in the DIRAC Configuration System
  • For each protocol: end-point, SAPath, [space token, WSUrl]
  • Currently only SRM and rfio are used
• File placement according to the Computing Model
  • FTS transfers from the original SE (asynchronous)
• Disk replicas and archives are completely split:
  • Only T0D1 and T1D0, no T1D1 SEs any longer
• Production jobs:
  • Input files are downloaded to the WN (max 10 GB) using gsiftp
• User jobs:
  • Protocol access from the SE (on the LAN)
• Output upload:
  • From the WN to a (local) SE using gsiftp; the upload policy is defined in the job
• Job splitting and brokering:
  • According to LFC information
  • If a file is unavailable, the job is rescheduled
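As a rough illustration of the flow just described, here is a minimal Python sketch of the input-access policy: resolve an LFN to its SEs, download for production jobs (with the 10 GB cap), protocol access for user jobs. The `SE_CONFIG` dictionary, `physical_url` helper and `replica_catalog` mapping are hypothetical simplifications of the DIRAC Configuration System and the LFC client, not actual DIRAC interfaces.

```python
# Hypothetical sketch of the current DM input policy (not actual DIRAC code).

MAX_DOWNLOAD_BYTES = 10 * 1024**3  # 10 GB cap on downloads to the WN

# Simplified stand-in for the SE definitions in the DIRAC Configuration System:
# per protocol an end-point and SAPath (plus, for SRM, space token and WSUrl).
SE_CONFIG = {
    "CERN-DST": {
        "gsiftp": {"endpoint": "gsiftp://lxfsrk.cern.ch", "sapath": "/castor/cern.ch/grid"},
        "rfio":   {"endpoint": "rfio://castorlhcb.cern.ch", "sapath": "/castor/cern.ch/grid"},
    },
}

def physical_url(se, protocol, lfn):
    """Build a tURL from the CS-style description above (illustrative only)."""
    cfg = SE_CONFIG[se][protocol]
    return cfg["endpoint"] + cfg["sapath"] + lfn

def resolve_input(lfn, size_bytes, job_type, replica_catalog):
    """Production jobs download the file; user jobs access it by protocol on the LAN."""
    replicas = replica_catalog.get(lfn, [])   # stand-in for the LFC lookup: LFN -> SEs
    if not replicas:
        return ("reschedule", None)           # no available replica: job is rescheduled
    se = replicas[0]                          # in reality: pick an SE local to the site
    if job_type == "production":
        if size_bytes > MAX_DOWNLOAD_BYTES:
            return ("reschedule", None)
        return ("download", physical_url(se, "gsiftp", lfn))
    return ("protocol", physical_url(se, "rfio", lfn))

catalog = {"/lhcb/data/file.dst": ["CERN-DST"]}   # stand-in LFC content
print(resolve_input("/lhcb/data/file.dst", 2 * 1024**3, "production", catalog))
```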
Caveats with the current system
• Inconsistencies between the FC, the SE catalog and actual storage
  • Some files are temporarily unavailable (server down)
  • Some files are lost (unrecoverable disk or tape)
• Consequences:
  • Wrong brokering of jobs: they cannot access their files
    • Except with the download policy, if another replica is on disk/cache
  • SE overload (busy, or not enough movers)
    • Behaves as if the files were unavailable
    • Jobs are rescheduled
Future of the replica catalog
• We probably still need one
  • Job brokering: we don't want to transfer files all over the place (even with caches)
  • DM accounting: we want to know what data is where, and how much
• But…
  • It should not need to be as highly accurate as it is now
  • Files should be allowed to be unavailable without the job failing
• Considering the DIRAC File Catalog
  • Mostly replica location (as the LFC is used today)
  • Built-in space usage accounting per directory and SE
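For context, a replica lookup through the DIRAC File Catalog client might look like the sketch below. It assumes a configured DIRAC installation; the `FileCatalog` class and `getReplicas` call follow the generic DIRAC catalog interface, but details vary between DIRAC versions, and the LFN is an invented example.

```python
# Sketch of a replica lookup through the DIRAC File Catalog client.
from DIRAC.Core.Base import Script
Script.parseCommandLine()  # initialises the DIRAC configuration

from DIRAC.Resources.Catalog.FileCatalog import FileCatalog

fc = FileCatalog()
lfn = "/lhcb/data/2012/DST/00012345/0000/00012345_00000001_1.dst"

result = fc.getReplicas(lfn)   # LFN -> {SE: PFN}, as the LFC is used today
if result["OK"]:
    for se, pfn in result["Value"]["Successful"].get(lfn, {}).items():
        print("%s -> %s" % (se, pfn))
```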
Access and transfer protocols
• Welcome gfal2 and FTS3!
  • Hopefully transparent protocol usage for transfers
  • However, transfer requests should be expressed with compatible URLs
• Access to T1D0 data
  • 99% for reconstruction or re-stripping, i.e. download
  • Read once, therefore still requires a sizeable staging pool
  • Unnecessary to copy to T0D1 before copying to the WN
• xroot vs http/webdav
  • No strong feelings
  • What matters is a unique URL, redirection and WAN access
  • However, why not use (almost) standard protocols?
  • The CVMFS experience is very positive, so why not http for data?
  • Of course better if all SEs provide the same protocol
    • http/webdav for EOS and Castor?
  • We are willing to look at the http ecosystem
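A third-party copy through gfal2 could look like the following sketch, assuming the gfal2 Python bindings are installed. Both SURLs are invented examples; the point is that the source and destination URLs must use schemes the two end-points both understand, while gfal2 negotiates the actual transfer protocol.

```python
# Minimal gfal2-python sketch of a protocol-agnostic copy between two SEs.
import gfal2

ctx = gfal2.creat_context()        # note: "creat", as spelled in the gfal2 API

params = ctx.transfer_parameters()
params.overwrite = True            # replace an existing destination file
params.checksum_check = True       # verify source/destination checksums

src = "srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/data/file.dst"
dst = "srm://gridka-dcache.gridka.de/pnfs/gridka.de/lhcb/data/file.dst"

ctx.filecopy(params, src, dst)     # gfal2 picks the concrete protocol
```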
Other DM functionality
• File staging from tape
  • Currently provided by SRM
  • Keep SRM for T1D0 handling
  • Limited usage: bringOnline
  • Not used for getting a tURL
• Space tokens
  • Can easily be replaced by different endpoints
  • Preferred to using the namespace!
• Storage usage
  • Also provided by SRM
  • Is there a replacement?
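A bringOnline request via gfal2 (rather than a bare SRM client) might look like this sketch; the gfal2 Python bindings are assumed, the SURL and pin/timeout values are invented, and the exact `bring_online` signature may differ between gfal2 versions.

```python
# Sketch of tape staging (bringOnline) through gfal2-python.
import time
import gfal2

ctx = gfal2.creat_context()
surl = "srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/raw/run1234.raw"

# Request staging: pin for 1 hour, 5 min operation timeout,
# asynchronous mode returns a request token to poll later.
status, token = ctx.bring_online(surl, 3600, 300, True)

while status == 0:                 # 0: request queued, file not yet on disk
    time.sleep(60)
    status = ctx.bring_online_poll(surl, token)
```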
Next steps
• Re-implement the DIRAC DM functionality with gfal2
• Exploit the new features of FTS3
• Migrate to the DIRAC File Catalog
  • In parallel with the LFC
• Investigate http/webdav for file location and access
  • First, use it for healing
    • Still brokering with a replica catalog
  • Then for job brokering (replacing the replica catalog)?
    • Scalability?
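A replication request through the FTS3 REST "easy" Python bindings might look like the sketch below; it assumes the fts3-rest client is installed and a valid grid proxy is in place, and the endpoint and SURLs are invented examples.

```python
# Sketch of an asynchronous replication request via the FTS3 REST bindings.
import fts3.rest.client.easy as fts3

context = fts3.Context("https://fts3.cern.ch:8446")

transfer = fts3.new_transfer(
    "srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/data/file.dst",
    "srm://gridka-dcache.gridka.de/pnfs/gridka.de/lhcb/data/file.dst",
)
job = fts3.new_job([transfer], verify_checksum=True, overwrite=False)

job_id = fts3.submit(context, job)           # returns immediately (asynchronous)
print(fts3.get_job_status(context, job_id))  # poll the transfer state later
```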
What else?
• Dynamic data caching
  • Not yet clear how best to use this without replicating everything everywhere
  • When do caches expire?
  • Job brokering? We don't want to hold jobs while a dataset is replicated
• Data popularity
  • Information collection is in place
  • Can it be used for automatic replication/deletion?
  • Or is it better used as a hint for Data Managers?
  • What metric should be used?
    • What if 10 files out of a 100 TB dataset are used for tests, but nobody is interested in the rest?
    • Fraction of the dataset used, or absolute number of accesses?
    • Very few analysis passes run over the full dataset
    • Many iterative uses of the same subset
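To make the metric question concrete, this toy Python sketch computes both candidates, fraction of the dataset used and absolute number of accesses, for an invented set of per-file access counts; it shows how the two can point in opposite directions.

```python
# Toy sketch of the two candidate popularity metrics (invented data structure).
def popularity_metrics(access_counts, n_files_in_dataset):
    """access_counts: {lfn: number of accesses} for the files actually read."""
    total_accesses = sum(access_counts.values())
    fraction_used = len(access_counts) / float(n_files_in_dataset)
    return fraction_used, total_accesses

# 10 files of a large dataset read heavily for tests: the absolute access
# count is high, but the fraction of the dataset used stays tiny, so the
# two metrics suggest opposite replication decisions.
counts = {"/lhcb/data/test_%02d.dst" % i: 500 for i in range(10)}
print(popularity_metrics(counts, n_files_in_dataset=100000))
```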