
Introduction to Grid

Introduction to Grid. Eddie.Aronovich@cs.tau.ac.il. Acknowledgements. Presentation is based on slides from: Roberto Barbera, University of Catania and INFN (EGEE Tutorial Roma, 02.11.2005) Mike Mineter, Concepts of grid computing


Presentation Transcript


  1. Introduction to Grid Eddie.Aronovich@cs.tau.ac.il

  2. Acknowledgements • Presentation is based on slides from: • Roberto Barbera, University of Catania and INFN (EGEE Tutorial Roma, 02.11.2005) • Mike Mineter, Concepts of grid computing • Fabrizio Gagliardi, EGEE Project Director, CERN, Geneva, Switzerland (Naregi Symposium 2005 – Tokyo) • Fabrizio Gagliardi, EGEE Project Director, CERN, Geneva, Switzerland (APAC, 27 September 2005) • Guy Warner, NeSC Training Team (An Induction to EGEE for GOSC and the NGS NeSC, 8th December 2004 ) • Service Oriented Architecture & Grid Computing by Marc Brooks, The MITRE Corporation EGEE tutorial, Seoul Biomed Grid Induction (Tel-Aviv Univ), Feb 2006

  3. Security & Intellectual Property (I) • The existing EGEE grid middleware is distributed under an Open Source License developed by EU DataGrid • No restriction on usage (scientific or commercial) beyond acknowledgement • Same approach for new middleware • Application software maintains its own licensing scheme • Sites must obtain appropriate licenses before installation

  4. EGEE project in 1K words https://goc.grid-support.ac.uk/gridsite/monitoring/

  5. What more do we need? • Processing power • Storage • Security-aware integrative infrastructure • Community-aware environment • Or what we may call…

  6. e-Science • What is e-Science? Collaborative science that is made possible by the sharing across the Internet of resources (data, instruments, computation, people’s expertise...) • Often very compute intensive • Often very data intensive (both creating new data and accessing very large data collections) – data deluges from new technologies • Crosses organisational boundaries • Examples…

  7. A good example: Particle Physics • Large amount of data produced in a few places: CERN, FNAL, KEK… • Large worldwide organized collaborations (e.g. the LHC experiments at CERN) of computer-savvy scientists • Computing and data management resources distributed world-wide, owned and managed by many different entities • Large Hadron Collider (LHC) at CERN in Geneva, Switzerland: • One of the most powerful instruments ever built to investigate matter • Photo labels: Mont Blanc (4810 m), Downtown Geneva

  8. Orders of magnitude… • 10-15 Petabytes • ~20,000,000 CD-ROMs • 10 times the Eiffel Tower • ~3000 m

  9. Grids and e-Infrastructure • “Campus grids”: internal to an institute / university: • “High throughput” – harvesting compute time • Not really ‘a grid’ unless crossing administrative domains • Can become a resource on a grid • Example: Condor • http://www.nesc.ac.uk/esi/events/556/ • Grids: cross administrative boundaries • National scale: in IL, IAG • Regional efforts: in China, EUMedGrid, CrossGrid, SeeGrid • International scale: in Europe, EGEE

  10. e-Infrastructure • implementation blocks • From a talk by Mario Campolargo, Brussels, 30 May 2005

  11. What is Service Oriented Architecture (SOA)? • An SOA application is a composition of services • A “service” is the atomic unit of an SOA • Services encapsulate a business process • Service Providers register themselves • Service use involves: Find, Bind, Execute • The best-known instance is Web Services • Diagram: the Service Provider registers with the Service Registry; the Service Consumer finds a provider in the registry, then binds to it and executes the service

  12. SOA Actors • Service Provider • Provides a stateless, location-transparent business service • Service Registry • Allows service consumers to locate service providers that meet required criteria • Service Consumer • Uses service providers to complete business processes • Diagram: as on slide 11
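
The three SOA actors and the Register / Find / Bind, Execute steps can be sketched in a few lines of Python. This is only an illustration: the `ServiceRegistry` class, the `reverse_sequence` service and the capability name `"reverse"` are hypothetical and do not come from any Web Services toolkit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ServiceRegistry:
    """Hypothetical in-memory registry: capability name -> providers."""
    _providers: Dict[str, List[Callable[[str], str]]] = field(default_factory=dict)

    def register(self, capability: str, provider: Callable[[str], str]) -> None:
        # Register: a provider advertises that it offers `capability`.
        self._providers.setdefault(capability, []).append(provider)

    def find(self, capability: str) -> Callable[[str], str]:
        # Find: a consumer asks for a provider that meets its criteria.
        providers = self._providers.get(capability, [])
        if not providers:
            raise LookupError(f"no provider registered for {capability!r}")
        return providers[0]


def reverse_sequence(sequence: str) -> str:
    """A stateless service: the result depends only on the input."""
    return sequence[::-1]


registry = ServiceRegistry()
registry.register("reverse", reverse_sequence)  # provider side: Register
service = registry.find("reverse")              # consumer side: Find
result = service("ACGT")                        # consumer side: Bind, Execute
print(result)
```

The consumer never names `reverse_sequence` directly; it only knows the capability it needs, which is the loose coupling and late binding that the SOA slides refer to.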

  13. SOA Benefits • Business Benefits • Focus on Business Domain solutions • Leverage Existing Infrastructure • Agility • Technical Benefits • Loose Coupling • Autonomous Service • Location Transparency • Late Binding • Diagram: as on slide 11

  14. SOA/Web Services Related Standards Source: http://roadmap.cbdiforum.com/reports/protocols/

  15. Contents • “The Grid” vision • What is “a grid” ? • Drivers of grid computing • Implementation samples • Grid Status & Standards • The basis: authentication, authorisation, security • So, What can it do ?

  16. The Grid Metaphor • Diagram labels: Mobile Access; Supercomputer, PC-Cluster; Workstation; Data-storage, Sensors, Experiments; Visualising; Internet, networks; GRID MIDDLEWARE

  17. The grid vision • The grid vision is of “Virtual computing” (+ information services to locate computation, storage resources) • Compare: The web: “virtual documents” (+ search engine to locate them) • MOTIVATION: collaboration through sharing resources (and expertise) to expand horizons of • Research • Commerce – engineering, … “the knowledge economy” • Public service – health, environment,…

  18. Contents • “The Grid” vision • What is “a grid” ? • Drivers of grid computing • Implementation samples • Grid Status & Standards • The basis: authentication, authorisation, security • So, What can it do ?

  19. “A grid” • The initial vision: “The Grid” • The present reality: Many “grids” • Each grid is an infrastructure enabling one or more “virtual organisations” to share computing resources • What’s a VO? • People in different organisations seeking to cooperate and share resources across their organisational boundaries • Why establish a Grid? • Share data • Pool computers • Collaborate • Diagram labels: Institute A, Institute B, Institute C, Institute D, VO

  20. The Single Computer • The Operating System enables easy use of • Input devices • Processor • Disks • Display • Any other attached devices • Layer diagram: Application Software / Operating System / Disks, Processor, Memory, …

  21. Resources on a Local Area Network • User just perceives “shared resources”, with no regard to location in the organisation: • Authenticated by username / password • Authorised to use own files, … • Layer diagram: Application Software / Middleware for sharing computers, servers, printers, … / Operating System on each computer / Resources connected by a LAN

  22. Resources on a grid • Layer diagram: Application Software / Interface between app. and grid / Grid Middleware: “collective services” / Grid Middleware on each resource / Operating System on each resource / Resources connected by internet

  23. A grid • Grid middleware runs on each shared resource • Data storage • (Usually) batch jobs on pools of processors • Users join VOs • Virtual organisation negotiates with sites to agree access to resources • Distributed services (both people and middleware) enable the grid • Diagram label: INTERNET

  24. What characterises a grid? • Co-ordinated resource sharing • No centralised point of control • Different administrative domains • Standard, open, general-purpose protocols and interfaces • NOT specific to an application • EGEE, NGS support multiple VOs • Delivering non-trivial qualities of service • Co-ordinated to deliver combined services, greater than sum of the individual components • http://www.gridtoday.com/02/0722/100136.html

  25. The components of a Grid • Resources • networking, computers, storage, data, instruments, … • Grid Middleware • the “operating system of the grid” • Operations infrastructure • Run enabling services (people + software) • Virtual Organization management • Procedures for gaining access to resources

  26. Key concepts • Virtual organisation: people and resources collaborating across administrative and organisational boundaries • Single sign-on • I connect to one machine – some sort of “digital credential” is passed on to any other resource I use, basis of: • Authentication: How do I identify myself to a resource without username/password for each resource I use? • Authorisation: what can I do? Determined by • My membership of VO • VO negotiations with resource providers • Grid middleware runs on each resource • User just perceives “shared resources” with no concern for location or owning organisation
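
The authorisation rule on this slide (VO membership combined with what the VO negotiated with each resource provider) can be illustrated with a small Python sketch. Everything named here is an assumption made for illustration: the `Credential` class, the `SITE_AGREEMENTS` table, the host names and the subject string are hypothetical, and real grid credentials are certificates rather than plain records.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Credential:
    """Stand-in for the digital credential passed along after single sign-on."""
    subject: str  # who the user is (authentication)
    vo: str       # the virtual organisation the user belongs to


# Hypothetical outcome of VO negotiations: resource -> VOs it agreed to serve.
SITE_AGREEMENTS: Dict[str, Set[str]] = {
    "ce.site-a.example": {"biomed", "atlas"},
    "se.site-b.example": {"biomed"},
}


def is_authorised(credential: Credential, resource: str) -> bool:
    """Authorisation = VO membership + what that VO negotiated with the site."""
    return credential.vo in SITE_AGREEMENTS.get(resource, set())


user = Credential(subject="/C=IL/O=Example/CN=Some User", vo="atlas")
print(is_authorised(user, "ce.site-a.example"))
print(is_authorised(user, "se.site-b.example"))
```

The user presents the same credential to both resources; the answers differ only because the two sites reached different agreements with the user's VO.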

  27. Contents • “The Grid” vision • What is “a grid” ? • Drivers of grid computing • Implementation samples • Grid Status & Standards • The basis: authentication, authorisation, security • So, What can it do ?

  28. Large Hadron Collider at CERN • Data Challenge: • 10 Petabytes/year of data! • 20 million CDs each year! • Simulation, reconstruction, analysis: • LHC data handling requires computing power equivalent to ~100,000 of today's fastest PC processors! • Operational challenges • Reliable and scalable through project lifetime of decades • Photo labels: Mont Blanc (4810 m), Downtown Geneva
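
As a rough sanity check on the figures above, the following Python lines work out what "10 Petabytes/year" and "20 million CDs each year" imply. The decimal definition of a petabyte and the 365-day year are assumptions of this sketch, not values taken from the slides.

```python
PETABYTE = 10**15                   # assumption: decimal petabyte, in bytes
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year

data_per_year = 10 * PETABYTE       # "10 Petabytes/year" from the slide
cds_per_year = 20_000_000           # "20 million CDs each year" from the slide

# Capacity per CD implied by the two slide figures, in megabytes.
implied_cd_capacity_mb = data_per_year / cds_per_year / 10**6

# Average rate needed to record that much data in one year, in MB/s.
sustained_rate_mb_s = data_per_year / SECONDS_PER_YEAR / 10**6

print(f"implied capacity per CD: {implied_cd_capacity_mb:.0f} MB")
print(f"average data rate: {sustained_rate_mb_s:.0f} MB/s")
```

The implied 500 MB per disc is somewhat below the nominal capacity of a CD-ROM, so the CD comparison is best read as a round-number illustration rather than an exact count.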

  29. BLAST gridification • Diagram labels: UI, Input file (Seq1, Seq2, … Seqn), Computing element (two shown), BLAST, DB, RESULT

  30. DAME: Grid based tools and Infer-structure for Aero-Engine Diagnosis and Prognosis • “A significant factor in the success of the Rolls-Royce campaign to power the Boeing 7E7 with the Trent 1000 was the emphasis on the new aftermarket support service for the engines provided via DS&S. Boeing personnel were shown DAME as an example of the new ways of gathering and processing the large amounts of data that could be retrieved from an advanced aircraft such as the 7E7, and they were very impressed” (DS&S, 2004) • Companies: Rolls-Royce, DS&S, Cybula • Universities: York, Leeds, Sheffield, Oxford • Diagram labels: Engine flight data, London Airport, New York Airport, Airline office, Grid, Diagnostics Centre, Maintenance Centre, American data center, European data center, Engine Model, Case Based Reasoning, XTO

  31. Major components • “User interface” • Resource Broker • Information Service • Replica Catalogue • Authorisation & Authentication • Logging & Book-keeping • Computing Element • Storage Element • Diagram flow labels: Input “sandbox”, DataSets info, Output “sandbox”, SE & CE info, Job Submit Event, Job Query, Publish, Job Status, Input “sandbox” + Broker Info

  32. Diagram labels: UI • RB node: Network Server, Workload Manager, Job Controller • Replica Location Server • Information Service • Computing Element and Storage Element, with their characteristics & status

  33. Job Status: submitted • UI: allows users to access the functionalities of the WMS (via command line, GUI, C++ and Java APIs) • Diagram: as on slide 32, with the Job Controller labelled CondorG and with CE and SE characteristics & status

  34. Job Description Language (JDL) to specify job characteristics and requirements • Job Status: submitted • Diagram: as on slide 33 • Command: edg-job-submit myjob.jdl • Contents of myjob.jdl:
      JobType = "Normal";
      Executable = "$(CMS)/exe/sum.exe";
      InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
      OutputSandbox = {"sim.err", "test.out", "sim.log"};
      Requirements = other.GlueHostOperatingSystemName == "linux" &&
                     other.GlueHostOperatingSystemRelease == "Red Hat 7.3" &&
                     other.GlueCEPolicyMaxCPUTime > 10000;
      Rank = other.GlueCEStateFreeCPUs;
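
The JDL file on this slide follows a simple pattern: one `name = value;` statement per attribute, with strings in double quotes, lists in braces, and expressions such as `Rank` written without quotes. A minimal Python sketch that renders a dictionary in that pattern is shown below. The `render_jdl` function and the `Expr` marker class are hypothetical helpers written for this example; they are not part of the EDG tools and cover only the constructs visible on the slide.

```python
from typing import Dict, List, Union


class Expr(str):
    """Marks a value to be written verbatim (an expression, not a string)."""


Value = Union[str, List[str]]


def render_jdl(attributes: Dict[str, Value]) -> str:
    """Render attributes as 'name = value;' lines in the style of the slide."""
    lines = []
    for name, value in attributes.items():
        if isinstance(value, Expr):
            rendered = str(value)  # expressions are not quoted
        elif isinstance(value, list):
            rendered = "{" + ", ".join(f'"{item}"' for item in value) + "}"
        else:
            rendered = f'"{value}"'  # plain strings are double-quoted
        lines.append(f"{name} = {rendered};")
    return "\n".join(lines)


job = {
    "JobType": "Normal",
    "Executable": "$(CMS)/exe/sum.exe",
    "OutputSandbox": ["sim.err", "test.out", "sim.log"],
    "Rank": Expr("other.GlueCEStateFreeCPUs"),
}
jdl_text = render_jdl(job)
print(jdl_text)
```

Writing the output to a file and passing that file to `edg-job-submit` would correspond to the command shown on the slide; the sketch itself only produces the text.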

  35. Job Status: submitted, waiting • NS: network daemon responsible for accepting incoming requests • Diagram: as on slide 33, with these additions: Job, Input Sandbox files, RB storage

  36. Job submission • Job Status: submitted, waiting • WM: acts to satisfy the request • Diagram: as on slide 33, with RB storage and the Job shown

  37. Job submission • Job Status: submitted, waiting • Where must this job be executed? • Diagram: as on slide 36, with a Match-Maker/Broker added

  38. Job submission • Job Status: submitted, waiting • Matchmaker: responsible for finding the “best” CE for a job • Diagram: as on slide 37
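
One way to picture what the Matchmaker does is as a filter followed by a ranking: discard the CEs that fail the job's Requirements, then take the one with the highest Rank. The Python sketch below assumes that reading. The `ComputingElement` fields loosely mirror the Glue attributes used in the JDL on slide 34, while the CE names and numbers are invented for illustration; the real matchmaking algorithm is not described in these slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ComputingElement:
    name: str
    os_name: str       # cf. GlueHostOperatingSystemName
    max_cpu_time: int  # cf. GlueCEPolicyMaxCPUTime
    free_cpus: int     # cf. GlueCEStateFreeCPUs


def requirements(ce: ComputingElement) -> bool:
    """Boolean expression that a CE must satisfy to be a candidate."""
    return ce.os_name == "linux" and ce.max_cpu_time > 10000


def rank(ce: ComputingElement) -> int:
    """Numeric preference among candidates: more free CPUs is better."""
    return ce.free_cpus


def match(ces: List[ComputingElement]) -> Optional[ComputingElement]:
    """Return the best-ranked CE that meets the requirements, if any."""
    candidates = [ce for ce in ces if requirements(ce)]
    return max(candidates, key=rank, default=None)


grid = [
    ComputingElement("ce-a", "linux", 20000, 4),
    ComputingElement("ce-b", "linux", 5000, 64),   # fails the CPU-time limit
    ComputingElement("ce-c", "linux", 50000, 12),
]
best = match(grid)
print(best.name if best else "no matching CE")
```

Note that `ce-b` has the most free CPUs but is never considered, because ranking applies only to CEs that already satisfy the requirements.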

  39. Job submission • Job Status: submitted, waiting • Where (on which SEs) are the needed data? • What is the status of the Grid? • Diagram: as on slide 37

  40. Job submission • Job Status: submitted, waiting • CE choice • Diagram: as on slide 37

  41. Job submission • Job Status: submitted, waiting • Job Adapter: responsible for the final “touches” to the job before performing submission (e.g. creation of wrapper script, PFN, etc.) • Diagram: as on slide 36, with a Job Adapter added

  42. Job submission • Job Status: submitted, waiting, ready • Job Controller: responsible for the actual job management operations (done via CondorG) • Diagram: as on slide 36

  43. Job submission • Job Status: submitted, waiting, ready, scheduled • Diagram: as on slide 36, with the Job now at the Computing Element

  44. “Compute element” – reminder! • Grid gate node: Globus gatekeeper, gridmapfile, Logging, Info system • Local resource management system: Condor / PBS / LSF master • Homogeneous set of worker nodes • Other diagram labels: Job request, I.S., Logging
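
The gridmapfile on the grid gate node is what links a grid identity to a local account. The Python sketch below parses such a file under the assumption that each entry is a quoted certificate subject followed by a local account name. The sample subjects and account names are invented, and the function `parse_gridmap` is a hypothetical helper, not part of Globus.

```python
import shlex
from typing import Dict


def parse_gridmap(text: str) -> Dict[str, str]:
    """Map certificate subject -> local account, one entry per line."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        # shlex keeps the quoted subject together and drops '#' comments.
        fields = shlex.split(line, comments=True)
        if len(fields) >= 2:
            subject_dn, local_account = fields[0], fields[1]
            mapping[subject_dn] = local_account
    return mapping


SAMPLE = '''
# certificate subject                      local account
"/C=IL/O=Example Grid/CN=Alice Example" alice
"/C=CH/O=Example Grid/CN=Bob Example"   griduser02
'''

accounts = parse_gridmap(SAMPLE)
print(accounts.get("/C=IL/O=Example Grid/CN=Alice Example"))
```

A job request whose subject is not found in the mapping has no local account to run under, which is one place where the authorisation decision becomes concrete on a site.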

  45. Job submission • Job Status: submitted, waiting, ready, scheduled, running • Input Sandbox files • “Grid enabled” data transfers/accesses • Diagram: as on slide 43

  46. Job submission • Job Status: submitted, waiting, ready, scheduled, running, done • Output Sandbox files • Diagram: as on slide 36

  47. Job submission • Job Status: submitted, waiting, ready, scheduled, running, done • Command: edg-job-get-output <dg-job-id> • Diagram: as on slide 36

  48. Job submission • Job Status: submitted, waiting, ready, scheduled, running, done, cleared • Output Sandbox files • Diagram: as on slide 36
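
The job statuses accumulated over the preceding slides form a simple linear sequence, which the following Python sketch models as a small state machine. Only the seven states named on the slides are included; failure or cancellation states are deliberately left out, and the `advance` helper is a hypothetical name chosen for this example.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submitted"
    WAITING = "waiting"
    READY = "ready"
    SCHEDULED = "scheduled"
    RUNNING = "running"
    DONE = "done"
    CLEARED = "cleared"


# Enum members iterate in definition order, which is the order on the slides.
ORDER = list(JobState)


def advance(state: JobState) -> JobState:
    """Return the state that follows `state` on the successful path."""
    index = ORDER.index(state)
    if index == len(ORDER) - 1:
        raise ValueError(f"{state.value!r} is the final state")
    return ORDER[index + 1]


# Walk a job through its whole successful lifecycle.
state = JobState.SUBMITTED
history = [state.value]
while state is not JobState.CLEARED:
    state = advance(state)
    history.append(state.value)
print(" -> ".join(history))
```

In the slides, the last transition (done to cleared) appears after the user retrieves the output with `edg-job-get-output`.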

  49. Job monitoring • Commands: edg-job-status <dg-job-id>, edg-job-get-logging-info <dg-job-id> • LB (Logging & Bookkeeping): receives and stores job events; processes corresponding job status • LM (Log Monitor): parses the CondorG log file (where CondorG logs info about jobs) and notifies LB • Diagram labels: UI, RB node, Network Server, Workload Manager, Job Contr. - CondorG, Log of job events, Job status, Computing Element
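
The division of labour between the Log Monitor and Logging & Bookkeeping can be sketched as follows: one function reads a log and forwards each event, and one object stores events per job and answers status queries. The two-column log format used here (`job-id event`) is invented purely for illustration and is not the real CondorG log format; the class and method names are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class LoggingAndBookkeeping:
    """Stores job events and derives the current status of each job."""

    def __init__(self) -> None:
        self._events: Dict[str, List[str]] = defaultdict(list)

    def notify(self, job_id: str, event: str) -> None:
        self._events[job_id].append(event)

    def status(self, job_id: str) -> Optional[str]:
        # The current status is taken to be the most recent event.
        events = self._events.get(job_id)
        return events[-1] if events else None

    def logging_info(self, job_id: str) -> List[str]:
        # Full event history, in the order the events were received.
        return list(self._events.get(job_id, []))


def monitor_log(log_text: str, lb: LoggingAndBookkeeping) -> None:
    """Parse a (hypothetical) job log and notify LB of each event."""
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            job_id, event = parts
            lb.notify(job_id, event)


SAMPLE_LOG = """\
job-1 scheduled
job-2 scheduled
job-1 running
"""

lb = LoggingAndBookkeeping()
monitor_log(SAMPLE_LOG, lb)
print(lb.status("job-1"))
print(lb.logging_info("job-1"))
```

In this picture, `status` plays the role of `edg-job-status` and `logging_info` the role of `edg-job-get-logging-info`, each keyed by the job identifier.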

  50. Contents • “The Grid” vision • What is “a grid” ? • Drivers of grid computing • Implementation samples • Grid Status & Standards • The basis: authentication, authorisation, security • So, What can it do ?
