Grid Middleware & TOOLS session summary

Ian Bird, CERN; Rob Gardner, University of Chicago

Presentation Transcript


  1. Grid Middleware & TOOLS session summary – Ian Bird, CERN; Rob Gardner, University of Chicago

  2. Introduction
     • 82 abstracts submitted
     • 36 oral presentations (7 sessions), 44 posters, 2 withdrawn
     • Categories cover a broad range:
       • Experiment experiences
       • Data Management
       • Workload Management
       • Monitoring, Information, Accounting
       • Security & Authorization
       • Fabric & Deployment

  3. Experiment experiences

  4. Grid reliability – Pablo Saiz

  5. Grid efficiency during CMS data challenges – Oliver Gutsche

  6. D0 – reprocessing on OSG – Amber Boehnlein. Common theme: making sites reliable requires debugging sites/systems one by one

  7. Job agents – pilot jobs. Monitoring the AliEn grid environment – Pablo Saiz

  8. Data management

  9. SRM v2.2 – Flavia Donno. An 18-month effort to agree on, build, test, and deploy the new version

  10. dCache – one of several MSS systems
     • Patrick Fuhrmann – overview of dCache developments
     • Gerd Behrmann – distributed instance for NDGF

  11. LCG data management tools: LFC, DPM, FTS – Markus Schulz

  12. Examples of services that consider deployment & management issues

  13. CORAL – distributed database access – Dirk Duellmann

  14. Workload management

  15. Pilot jobs?

  16. Pilot jobs – and variants: Such a good idea – everyone wants one …

  17. Stuart Paterson – optimizations in DIRAC. Marianne Bargiotti – integrity checking in DIRAC

  18. Pilots can move intelligence into the job. Paul Nilsson – Panda experience

  19. gLite WMS developments – Marco Cecchi

  20. Igor Sfiligoi – comparison of WMS. CHEP'07, Victoria

  21. Monitoring, information, etc.

  22. Experiment dashboards – Julia Andreeva. Monitoring from the VO/user perspective

  23. GridICE – monitoring – Guido Cuscela. Permits different views of running jobs

  24. James Casey – advances in monitoring of grid services

  25. Stephen Burke – 6 years' experience with the GLUE schema. Martin Flechl – details on integration of information systems

  26. Security, authorization, etc

  27. David Groep – glExec: supporting pilot jobs

  28. Fabric & Deployment

  29. Greig Cowan – using DPM over the WAN

  30. Addressing failover for core operations services – Alfredo Pagano. Various strategies

  31. Platform LSF – Robert Stober. Integrating heterogeneous clusters

  32. Observations
     • Solutions exist for most needs now
     • Certainly not all perfect yet
     • Experiment layer relatively deep
     • Plethora of workload management systems
     • Not so many for data management …
     • Service management issues starting to be addressed by some services (DPM, LFC, FTS, GridSite, CORAL)
     • But in general little thought on how site managers should manage services
     • Interoperability / interoperation

  33. Observations
     • Workload management
     • Everyone wants pilot (aka glidein) jobs (and everyone has written a system to submit them)
     • Commonality – to reach a reliable service, experiments need to systematically debug the sites being used:
     • D0, CMS, dashboards, …
     • Sophisticated systems to monitor, debug, recover
     • DIRAC, dashboards, grid service monitoring, etc.
     • To improve reliability and help debug the system