Production Grids: differences, similarities, convergence. Oxana Smirnova, Lund University/CERN, 3rd NGN meeting, Vilnius, March 17, 2005
Outline • Multitude of Grid projects • Production Grids versus others • Cross-disciplinary Grids • Application-specific Grids • Convergence issues
Where are we standing – tools • A few projects in the US and the EU develop basic (core) Grid technologies and free tools • Examples: Globus, Unicore, Avaki/Legion, Condor • Objectives: R&D, fundamental IT research, proof-of-concept • Drawbacks: • No motivation to provide a ready-to-use product • Often no possibility to finalize or support the product • Little or no feedback solicited from potential users; unclear use cases and requirements • Result: no product works out of the box • Each core toolkit lacks one or another essential service • A note on commercial usage: no solution is mature enough • Big problem: lack of production-level test-beds
Where are we standing – users • An enormous number of projects and initiatives worldwide aim at creating Grid-like infrastructures • Motivation: the need to optimize distributed resource usage • Idea: an amassed quantity of computing resources should be transformed into a qualitatively new phenomenon • Starting point: large distributed resources • Problem: no ready-made Grid solution is provided by the core development projects • Approaches: • Investigate the possibilities, raise Grid awareness • Make the best out of existing tools • Build one's own Grid solution
Taxonomy attempt (non-exclusive) • Geopolitical Grids • Local (universities, towns) • National (Italy, Sweden, Estonia, South Korea…) • Regional (Nordic, Baltic, South-East Europe, Transcaucasian…) • International (EU, EU-US) • Application-specific Grids • Collaborative (accelerator experiments) • Intra-disciplinary (medicine, high energy physics) • Inter-disciplinary • Resource-specific Grids • High Performance Computing • Commodity clusters/farms/pools
How many Grid projects are there? • No authoritative list exists, and it is almost impossible to keep track, but… • An enthusiast at www.gridcomputing.com tries to maintain such a list
What constitutes a production Grid? • As of today, very few projects can afford to pursue a production Grid system • Currently, the project must be capable of developing high-level Grid software • At some point, core software has to be modified or introduced anew • The nature and amount of the newly developed software vary with the project's scope and objectives • The resulting Grid system must offer high-quality services, most notably: • Security • Reliability • Stability • Flexibility • Scalability • Portability
What is NOT a production Grid? • A prototype system with a limited number of users and limited functionality • A test setup with well-planned activities • A demonstration setup running a few strictly defined applications • A network of resources with only core services installed • A project developing standalone components • As of today, production Grids are based on commodity Linux PC clusters; practically no notable HPC Grid runs in production mode • Worth mentioning: TeraGrid (http://www.teragrid.org)
Multidisciplinary production Grids • Disclaimer: up to now, the High Energy Physics community is by far the largest Grid user • Major projects: • Enabling Grids for E-SciencE (EGEE) – EU and partners (incl. USA, Asia, Russia), prototyping phase • Grid3/Open Science Grid (OSG) – USA, in transition from Grid3 to OSG • NorduGrid – Nordic states, stable • GRID.IT – Italy, stable
EGEE • The largest Grid project, funded by the EU FP6 • http://eu-egee.org • Time span: April 2004 to April 2006, with a planned 2-year extension • Partners: 70+ institutes worldwide • Major activities: • Raise Grid awareness • Provide resources and operational support via Grid technologies for scientists • Maintain and improve Grid software • A follow-up to the European DataGrid project (EDG), inheriting large parts of its Grid solutions • Middleware: gLite, based on Globus and Condor • Based on the resources contributed to the LHC Computing Grid (LCG) • The first release of the gLite middleware is to come out this month • Widely expected to become the largest production Grid
Grid3 • [Figure: ATLAS DC2 and CMS DC04 running on Grid3] • Grid3: originally provided infrastructure and a simple Grid-like solution for High Energy Physics computing in the USA • http://www.ivdgl.org/grid2003/ • A collaboration of several research centers, active 2003-2004 • Uses Globus and Condor, plus a few of its own developments • Proved able to provide reliable services to other applications
Open Science Grid • Continuation and extension of the Grid3 achievements • http://www.opensciencegrid.org/ • A consortium that aims at creating a national US Grid infrastructure • Focus on general services, operations, and end-to-end performance • Takes over from Grid3 in Spring 2005 (now)
NorduGrid • A collaboration of Nordic researchers, developing its own Grid middleware solution (ARC) since 2001 • http://www.nordugrid.org • A Grid based on ARC-enabled sites • Driven (so far) mostly by the needs and resources of the LHC experiments • Dozens of other applications • Assistance in Grid deployment outside the Nordic area • See other talks today for details…
GRID.IT • National Grid infrastructure in Italy • http://www.grid.it/ • Funded for 2003-2005 • Also triggered by HEP community needs, but expanding to many other applications • Like EGEE, heavily based on the EDG Grid middleware • Carries out specific developments, most notably portals and monitoring tools • Contributes to EGEE
Application-specific Grids • At the moment, the most notable application-specific Grids are developed in the HEP community • Reviewed projects: • LHC Computing Grid (LCG), stable • AliEn, stable • DIRAC, stable • Some others: • SAM-Grid, stable • HealthGrid, involving various sub-projects at different stages
LHC Computing Grid (LCG) • Provides Grid infrastructure for the experiments at the LHC accelerator in Geneva • http://cern.ch/lcg/ • Major activities: • Fabric • Grid deployment and operations • Common applications • Distributed data analysis • Originally chose EDG as the base middleware • Applies some modifications; uses only selected services • Took over EDG middleware support • Later agreed to share some operational responsibilities and middleware with EGEE • CERN is in fact the coordinator of EGEE • EGEE's gLite middleware is expected to inherit many EDG solutions • Since late 2003, LCG has been in production mode, used by the LHC experiments • 100+ sites, 7000+ processors
AliEn • Distributed computing environment developed for the ALICE experiment at the LHC • http://alien.cern.ch • Stands out for developing its own innovative solutions • Based on available products (Globus, OpenSSL…) • Might not satisfy all the Grid criteria, but provides all the functionality the application needs • Provides a set of services • Resource Brokers, Package Manager, site services… • Offers a user-friendly distributed file and metadata catalog (sketched below) • A user-friendly interface • Was considered a prototype for EGEE and will continue contributing ideas
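The catalog idea can be pictured with a minimal sketch: a registry of logical file names, each mapped to its physical replicas and to free-form metadata that can be queried. All class and method names below are hypothetical illustrations, not the actual AliEn interfaces.

```python
# Minimal sketch of a distributed file/metadata catalog
# (hypothetical names; not the actual AliEn interface).
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    lfn: str                                      # logical file name
    replicas: list = field(default_factory=list)  # physical locations of the data
    metadata: dict = field(default_factory=dict)  # free-form key/value metadata


class FileCatalog:
    """Maps logical names to physical replicas plus searchable metadata."""

    def __init__(self):
        self._entries = {}

    def register(self, lfn, replica_url, **metadata):
        entry = self._entries.setdefault(lfn, CatalogEntry(lfn))
        entry.replicas.append(replica_url)
        entry.metadata.update(metadata)

    def locate(self, lfn):
        """Return all physical replicas of a logical file."""
        return self._entries[lfn].replicas

    def find(self, **criteria):
        """Metadata query: return LFNs whose metadata match all given criteria."""
        return [lfn for lfn, e in self._entries.items()
                if all(e.metadata.get(k) == v for k, v in criteria.items())]


# Usage sketch
catalog = FileCatalog()
catalog.register("/alice/sim/run123/hits.root",
                 "srm://se.example.org/data/hits.root", run=123, type="sim")
print(catalog.locate("/alice/sim/run123/hits.root"))
print(catalog.find(type="sim"))
```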
DIRAC • The distributed production and analysis system for yet another LHC experiment, LHCb • http://dirac.cern.ch/ • Develops its own service-oriented computational Grid package • Composed of a set of lightweight services and a network of distributed agents that deliver workload to computing resources • Deployed on 20 "DIRAC" and 40 "LCG" sites • 90% job efficiency on DIRAC sites, 60% on LCG • Implements an innovative agent-based job "pull" technology, very efficiently saturating all available resources (see the sketch below) • Makes use of Globus (GridFTP, authorisation) and Condor (matchmaking) • Can use AliEn's File Catalog
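A minimal sketch of the "pull" idea, assuming a central task queue that site agents poll whenever they have spare capacity; nothing is ever pushed to a site, so idle resources simply stop asking for work. The names are illustrative, not the real DIRAC services.

```python
# Schematic sketch of an agent-based job "pull" scheduler
# (illustrative only; not the actual DIRAC implementation).
import queue
import threading
import time

task_queue = queue.Queue()  # stands in for a central lightweight task-queue service


def site_agent(site_name):
    """An agent running at a site: pulls a job only when it is ready for one."""
    while True:
        try:
            job = task_queue.get(timeout=1)  # pull work; nothing is ever pushed
        except queue.Empty:
            return  # queue drained, agent retires
        print(f"{site_name}: running {job}")
        time.sleep(0.1)  # stand-in for the actual job execution
        task_queue.task_done()


# Fill the queue with waiting jobs, then start one agent per site.
for i in range(10):
    task_queue.put(f"job-{i}")

agents = [threading.Thread(target=site_agent, args=(site,))
          for site in ("site-A", "site-B", "site-C")]
for a in agents:
    a.start()
task_queue.join()  # returns once every job has been pulled and completed
```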
Will they converge? • Despite many initial similarities, today's Grid solutions are diverging • Absence of standards • Absence of well-established, reliable components and services • A multitude of projects with different priorities • Current attempts at interoperability: application-level solutions and extra "adapter" layers (see the sketch below) • Users are alarmed at the prospect of having many incompatible Grids • This contradicts the idea of harnessing distributed resources in an optimal manner • Very recently, middleware developers started discussing possibilities of service-level interoperability • It is still difficult to predict what will come out of it • Hopefully, common standards will emerge – otherwise the computing Grid will indeed resemble the power grid, with its adapters and converters
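One way to picture the "adapter layer" approach: the application codes against a single job-submission interface, and a thin adapter per middleware translates it into that middleware's native job description. The classes and methods below are hypothetical, shown only to illustrate the pattern.

```python
# Hypothetical sketch of an application-level "adapter" layer hiding
# incompatible Grid middlewares behind one interface (all names invented).
from abc import ABC, abstractmethod


class GridAdapter(ABC):
    """Common job-submission interface the application codes against."""

    @abstractmethod
    def submit(self, job_description: dict) -> str:
        """Submit a job; return a middleware-specific job identifier."""


class LCGAdapter(GridAdapter):
    def submit(self, job_description):
        # a real adapter would translate the description to JDL
        # and invoke the LCG submission tools
        return "lcg-" + job_description["name"]


class ARCAdapter(GridAdapter):
    def submit(self, job_description):
        # a real adapter would translate the description to xRSL
        # and invoke the ARC client
        return "arc-" + job_description["name"]


def run_everywhere(adapters, job_description):
    """The application makes one call; each adapter hides one Grid flavour."""
    return [adapter.submit(job_description) for adapter in adapters]


print(run_everywhere([LCGAdapter(), ARCAdapter()], {"name": "analysis-42"}))
```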