Grid Computing and Middleware

Shawn Malhotra, Monday, February 5th, 2007

Presentation Transcript


  1. Grid Computing and Middleware Shawn Malhotra Monday, February 5th, 2007

  2. Overview • Background and definition • Importance of middleware • Globus Toolkit • Sample Applications

  3. What is Grid Computing? • Computing model that leverages the power of many networked resources • Not just CPUs • Storage devices, special equipment (e.g. a telescope) • Share resources across administrative domains • Requires security features • Different from traditional cluster computing • Programmer sees a single ‘virtual computer’ • Web ↔ Information as Grid ↔ Computing Power

  4. Why is Grid Computing Important? • Helps solve computationally expensive problems • Flexible enough to handle many small problems • Share costly resources amongst institutions • Federally funded research labs / academic institutions • Make resources available to anybody • Cost barrier is lowered • ‘Pay as you go’ type service • Increases overall bandwidth

  5. Motivation for Middleware • Need robust, efficient ways to pool resources • Previous ‘ad-hoc’ methods not sufficient • Need for standardization! • Distributed Computing System (DCS) • Developed at the University of California at Irvine • Early 1970s • Focus on CPU management • Poor security solution • Abandoned in the 1980s

  6. Globus Toolkit • Broader scope, more complete solution • CPU Management • Storage Management • Monitoring Services • More details to come … • Most popular grid computing framework • Implements several standards • OGSA, WSRF, SOAP, WSDL

  7. Globus Toolkit - Overview • Facilitates grid application development • Open, extensible, flexible, high abstraction

  8. Job Submission • GRAM interface • Grid Resource Allocation and Management • Specify resource requirements and flow • Uniform way to submit remote jobs • Translate request for local resources • Offers a variety of features • Retrieve job status • Send job signals (kill, start, restart) • Uses Web services interface
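
The idea of a uniform submission interface that translates requests for local resources can be sketched as below. This is only an illustration under stated assumptions: `JobRequest`, `ToyGateway`, the two translator functions and the state names are hypothetical and are not the GRAM API or its Web services interface.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class JobRequest:
    """Hypothetical, site-independent description of a job."""
    executable: str
    cpus: int = 1
    memory_mb: int = 512


def to_site_a(req):
    # Invented local format: a command-line style string.
    return f"run {req.executable} --cores {req.cpus} --mem {req.memory_mb}M"


def to_site_b(req):
    # Invented local format: a dictionary of settings.
    return {"cmd": req.executable, "slots": req.cpus, "ram_mb": req.memory_mb}


class ToyGateway:
    """Hypothetical front end: one interface, whatever the local system is."""

    # Signal name -> resulting job state (names chosen for illustration).
    TRANSITIONS = {"kill": "KILLED", "start": "RUNNING", "restart": "RUNNING"}

    def __init__(self, translators):
        self.translators = translators  # site name -> translator function
        self.jobs = {}
        self._ids = count(1)

    def submit(self, site, request):
        """Translate the request for the chosen site and record the job."""
        job_id = next(self._ids)
        self.jobs[job_id] = {
            "site": site,
            "local": self.translators[site](request),
            "state": "RUNNING",
        }
        return job_id

    def status(self, job_id):
        """Retrieve the current job state."""
        return self.jobs[job_id]["state"]

    def signal(self, job_id, name):
        """Send a job signal (kill, start, restart)."""
        self.jobs[job_id]["state"] = self.TRANSITIONS[name]


gateway = ToyGateway({"a": to_site_a, "b": to_site_b})
job_id = gateway.submit("a", JobRequest("sim", cpus=4))
gateway.signal(job_id, "kill")
```

The caller only ever sees `submit`, `status` and `signal`; the per-site translation stays hidden behind the gateway, which is the point the slide makes.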

  9. Job Scheduling • What happens after the job is submitted? • Submitted to a scheduler • Queues jobs and decides where/when to run • Requirement matching, priority systems, etc. • Abstracts resources from user • Pool heterogeneous resources together • Can have multiple layers of scheduling • Local schedulers vs. Metaschedulers
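
A minimal sketch of requirement matching combined with a priority queue, assuming that a lower priority number means a more urgent job and that each resource runs one job at a time. The `Resource` and `Job` types and the sample values are invented for illustration; real grid schedulers are considerably more elaborate.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    cpus: int
    memory_gb: int


@dataclass(frozen=True)
class Job:
    name: str
    priority: int   # lower number = more urgent (assumed convention)
    cpus: int
    memory_gb: int


def schedule(jobs, resources):
    """Place each job, in priority order, on the first free resource that fits."""
    # The index breaks ties so that Job objects are never compared directly.
    queue = [(job.priority, i, job) for i, job in enumerate(jobs)]
    heapq.heapify(queue)
    free = list(resources)
    placement = {}
    while queue:
        _, _, job = heapq.heappop(queue)
        for res in free:
            if res.cpus >= job.cpus and res.memory_gb >= job.memory_gb:
                placement[job.name] = res.name
                free.remove(res)
                break
        else:
            placement[job.name] = None  # nothing fits: the job stays queued
    return placement


resources = [Resource("desktop", 2, 4), Resource("cluster-node", 16, 64)]
jobs = [
    Job("render", priority=2, cpus=1, memory_gb=2),
    Job("simulate", priority=1, cpus=8, memory_gb=32),
    Job("big", priority=3, cpus=32, memory_gb=128),
]
placement = schedule(jobs, resources)
```

Here `simulate` is considered first and lands on `cluster-node`, `render` takes `desktop`, and `big` stays unplaced because no pooled resource meets its requirements.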

  10. Security • Access to resources must be controlled • Grid Security Infrastructure (GSI) • Provides basic security constructs • Certificate-based PKI system • Supports single sign-on over the grid • Supports delegation • Access control left to individual services • Infrastructure provides necessary info and control • Uses Web services interface

  11. Other Provided Modules • Data management • Facilitates file transfer, access to data stores • Monitoring and discovery • APIs to get status, subscribe to content • Important since ‘grid’ is never down, only components • Collaboration tools • Facilitates person-to-person collaboration • Build web portals for chat, e-mail, etc.

  12. Example Applications • What can you build with such a toolkit? • Applications range from the depths of the sea to the stars above! • LOOKING → deep sea research • Condor → batch computing infrastructure • BIRN → medical resource pooling • LEAD → meteorological data • NVO → virtual observatory

  13. Condor • Workload management system • Queuing, scheduling, prioritization, monitoring • Pool desktops into batch system • Use when idle, auto-detect when busy again • ClassAd mechanism • Novel way to match resources with requests • Flocking • Seamless combination of multiple networks • http://www.cs.wisc.edu/condor
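
Matching resources with requests can be illustrated with a toy two-sided matchmaker: each side advertises its attributes together with a requirement on the other side, and a match needs both requirements to hold. This is a sketch of the general idea only. The dictionaries, attribute names and lambdas below are invented and are not Condor's ad language.

```python
def matches(ad_a, ad_b):
    """Two ads match only if each one's requirements accept the other ad."""
    return bool(ad_a["requirements"](ad_b) and ad_b["requirements"](ad_a))


# Hypothetical machine ad: a desktop that accepts jobs from known owners.
machine_ad = {
    "type": "machine",
    "os": "LINUX",
    "memory_mb": 2048,
    "idle": True,
    "requirements": lambda job: job["owner"] in {"alice", "bob"},
}

# Hypothetical job ad: needs an idle Linux machine with enough memory.
job_ad = {
    "type": "job",
    "owner": "alice",
    "requirements": lambda m: (
        m["os"] == "LINUX" and m["memory_mb"] >= 1024 and m["idle"]
    ),
}

# The same desktop once its user is back at the keyboard.
busy_machine_ad = dict(machine_ad, idle=False)

idle_match = matches(job_ad, machine_ad)
busy_match = matches(job_ad, busy_machine_ad)
```

With these sample ads `idle_match` is true and `busy_match` is false, mirroring the "use when idle, auto-detect when busy again" behaviour on the slide.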

  14. LOOKING • Make tools / data related to oceanography available to all researchers • ‘20,000 Terabits Beneath the Sea’ • Presented at iGrid2005 • Real-time high definition deep sea video • Monitor active underwater volcanoes • http://lookingtosea.ucsd.edu/

  15. BIRN • Resource pooling • Tools for research and diagnoses • Collaboration • Common user interface • Better hypothesis testing • Use a distributed patient population • http://www.nbirn.net/

  16. LEAD • Sharing meteorological resources • Algorithm Development and Mining (ADaM) • Works on observational data • Provides analysis tools • ARPS Data Assimilation System (ADAS) • Provides visualization tools • Earth Science Markup Language (ESML) • Uniform way of expressing data • Data Access Systems • Allow uniform access to distributed data • https://portal.leadproject.org/gridsphere/gridsphere

  17. NVO • Expose the vast amount of astronomical data for all to use • Telescopes will produce 7 petabytes per year by 2012 • Standardized way of expressing data • VOTable • Creation of tools to produce required data • ConeSearch • Make accessing data like using real tools • http://www.us-vo.org/
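
A cone search selects catalogue entries that lie within a given angular radius of a position on the sky. The sketch below shows that selection on a small in-memory list, assuming coordinates in degrees of right ascension and declination. The catalogue rows and the function names are made up for illustration, and no real observatory service is called.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form), inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding just above the valid range.
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, h))))


def cone_search(catalog, ra, dec, radius_deg):
    """Return the rows whose position lies within radius_deg of (ra, dec)."""
    return [
        row
        for row in catalog
        if angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= radius_deg
    ]


# Invented sample catalogue, positions in degrees.
catalog = [
    {"name": "A", "ra": 10.0, "dec": 20.0},
    {"name": "B", "ra": 10.1, "dec": 20.05},
    {"name": "C", "ra": 50.0, "dec": -5.0},
]

nearby = [row["name"] for row in cone_search(catalog, 10.0, 20.0, 0.5)]
```

For this sample data `nearby` contains A and B, since C is tens of degrees away from the search position.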

  18. The WISDOM Project • Analyze potential anti-malaria drugs • Focus lab tests on promising compounds • Uses up to 5000 computers in 27 countries • Simulate drug interaction with malaria protein • Test 80,000 drugs per hour, 140 million in total • Shows the power of collaboration • Many computers borrowed from particle physics simulator in the UK – GridPP • Shared spare capacity http://grid.globalwatchonline.com/epicentric_portal/site/GRID/
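
The figures on this slide allow a quick back-of-the-envelope check. The calculation below assumes the quoted rate of 80,000 compounds per hour is sustained throughout and that it is achieved with all 5000 computers contributing equally. Neither assumption is stated on the slide, so the results are rough orders of magnitude only.

```python
TOTAL_COMPOUNDS = 140_000_000   # "140 million in total"
RATE_PER_HOUR = 80_000          # "80,000 drugs per hour"
COMPUTERS = 5_000               # "up to 5000 computers"

# Time for the whole screening run at the quoted grid-wide rate.
hours = TOTAL_COMPOUNDS / RATE_PER_HOUR
days = hours / 24

# Implied throughput of one computer, assuming an equal share of the work.
per_computer_per_hour = RATE_PER_HOUR / COMPUTERS

# Time the same run would take on a single such computer.
single_machine_years = TOTAL_COMPOUNDS / per_computer_per_hour / 24 / 365

print(f"grid: {hours:.0f} hours, about {days:.0f} days")
print(f"one computer: about {single_machine_years:.0f} years")
```

Under these assumptions the grid finishes in 1750 hours, roughly 73 days, whereas a single computer would need on the order of a thousand years.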

  19. Grid Computing – The Future • Currently the domain of ‘Big Science’ • Make it more mainstream for ‘Little Science’ • Technology is not the barrier • Evolution of the standards • Continued enhancement of the toolkit • Better front-end design • Promote peer-to-peer collaboration • Security is still a challenge

  20. Summary • Grid computing is a powerful collaborative computing model • Grid computing requires efficient, fully featured middleware to thrive • Grid computing enables research and development that is not possible in isolation

  21. References • Globus site • http://www.globus.org/ • Wikipedia • http://en.wikipedia.org/wiki/Grid_computing • Grid Café • http://gridcafe.web.cern.ch/gridcafe/

  22. The Need for Grid Solutions • Grids are essential to sustain Moore’s Law as physical limitations will eventually limit what individual computing stations can achieve • It will become less necessary as individual resources become more powerful since technology grows faster than the complexity of our research

  23. The Corporate Barrier • True grid computing will never be embraced by corporations due to security issues and sensitivity of data. This will limit the scope and power of the technology • Much like Web 2.0 has caused a shift in corporate presence on the internet, a ‘Grid 2.0’ will eventually force corporations to embrace this technology

  24. Grid Middleware • Middleware designed to manage a grid will eventually merge with software designed to handle multiple CPUs on one motherboard to form a common solution. • Grid computing is far too different from multi-CPU processing to ever offer a common solution.

  25. Expanding User Base • Development of a good middleware solution that abstracts most details of the grid will bring grid computing to ‘Little Science’ and eventually individual users. • The complexity of grid computing and lack of demand will prevent grid computing from ever becoming part of the mainstream.
