Enhancing dCache Configuration for Efficient Multi-Experiment Support in ATLAS and LHCb

This presentation describes a deadlock in the single dCache instance at GridKa that served ATLAS, LHCb, and other experiments, and the resulting move of ATLAS to a dCache instance of its own. It outlines the performance bottlenecks in SRM and PNFS and explains why separating virtual organizations (VOs) simplifies configuration and keeps VOs from disturbing each other. It then details the separation procedure: pool migration, installation of new ATLAS administration nodes, a complete downtime of the original dCache instance, and the start of the new ATLAS instance.


Presentation Transcript


  1. "The square root of serving multiple experiments is a single dCache"
     D.Ressmann, S.Halstenberg, C.Jung, A.Trunov, J.van Wezel

     [Figure: one GridKa dCache instance before the separation. Labels: CMS, ATLAS, ALICE, LHCb, compass, cdf, auger, belle, ops, dteam, hepcg, wisent, textgrid, c3grid, kerndgrid, ingrid, medigrid, astrogrid, "9 other D-Grid VOs".]

     [Figure: the instances after the separation. Labels: GridKa dCache, ATLAS dCache, D-Grid dCache, with the VO names listed above distributed among them.]

     Deadlock situation
     • One dteam file was located on an ATLAS pool
     • Its PNFS id was not known to dCache
     • The pool tried flushing the file to tape
     • PnfsManager did not know the file
     • PnfsManager ran into a deadlock

     Motivation
     • Performance problems: SRM and PNFS are bottlenecks
     • Further single points of failure: PoolManager, PinManager
     • Easier to manage due to less complex configuration
     • VOs do not disturb each other
     • Configuration can be fine-tuned for the requirements of the separated VO

     Procedure
     • Pool migration (hosts only serving ATLAS pools)
     • Installation of new ATLAS administration nodes
     • Complete downtime of original dCache instance
     • Dump of SRM and PNFS DB
     • Remove ATLAS configuration:
         • Switch off all ATLAS pools
         • PoolManager
         • Pin- and SpaceManager entries
         • PNFS DBs
     • Restart original dCache instance
     • Reload DB dumps in new ATLAS instance
     • Remove all non-ATLAS entries from DBs
     • Change dCacheSetup on all ATLAS pools
     • Start ATLAS instance

     ATLAS nodes
     • Head node
     • PNFS daemon, PnfsManager, postgres DB
     • SRM, utilityDomain, postgres DB
     • gPlazma
     • Billing DB
     • dCap door
     • 4 gridftp doors
     • 23 pool nodes

     Further information today:
     • 037 - Improving data distribution on disk pools for dCache
     • 070 - dCache administration at the GridKa Tier-1-center, ready for data taking
     • 069 - Reference installation for the German grid Initiative D-Grid

     Forschungszentrum Karlsruhe GmbH
     Steinbuch Centre for Computing
     Hermann-von-Helmholtz-Platz 1
     D-76344 Eggenstein-Leopoldshafen
     Doris.Ressmann@iwr.fzk.de
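
The deadlock described in the transcript started with a file on an ATLAS pool whose PNFS id was not known to the namespace. A consistency check of that kind can be sketched as a plain set comparison. This is only an illustration and not the authors' tooling: it assumes the pool contents and the namespace ids have already been exported to ordinary Python collections, and the function name, pool names, and ids below are all hypothetical. No dCache command or API is used.

```python
def find_orphaned_replicas(pool_inventory, namespace_ids):
    """Return {pool name: sorted ids} for replicas the namespace does not know.

    pool_inventory: dict mapping a pool name to the ids stored on that pool
                    (assumed to be exported beforehand; hypothetical format).
    namespace_ids:  iterable of all ids known to the namespace.
    """
    known = set(namespace_ids)
    orphans = {}
    for pool, ids in pool_inventory.items():
        # Ids present on the pool but missing from the namespace.
        missing = sorted(set(ids) - known)
        if missing:
            orphans[pool] = missing
    return orphans


# Hypothetical example data: one replica on atlas_pool_01 is unknown.
pool_inventory = {
    "atlas_pool_01": ["0001A", "0002B", "00FFD"],
    "atlas_pool_02": ["0003C"],
}
namespace_ids = ["0001A", "0002B", "0003C"]

orphans = find_orphaned_replicas(pool_inventory, namespace_ids)
print(orphans)
```

Running such a check before a pool tries to flush files to tape would flag the stray replica early, which is the situation the separated instances are meant to make less likely.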
