Towards a High Performance Extensible Grid Architecture

Klaus Krauter, Muthucumaru Maheswaran ({krauter, mahes}@cs.umanitoba.ca), Computer Science Department, University of Manitoba, Winnipeg, Manitoba.

Presentation Transcript


  1. Towards a High Performance Extensible Grid Architecture. Klaus Krauter, Muthucumaru Maheswaran ({krauter, mahes}@cs.umanitoba.ca), Computer Science Department, University of Manitoba, Winnipeg, Manitoba

  2. Outline
     • Grid Computing Issues
       • Network computing environment
       • Scalability, Extensibility, and Adaptability
       • Quality of Service
     • Grid Models
       • Resource Management Techniques
       • Application Execution Models
     • Grid Architecture
     • Example Applications
       • Compiling, Numerical Processing, Grid Aware Application
     • Related Work

  3. Grid Computing Issues

  4. Network Computing Environment
     • Heterogeneous Nodes
       • Autonomous administration domains with different resource management policies
       • Servers, network devices, workstations, PDAs, etc.
     • Connected by Communication Links
       • Support differentiated service levels
     • Use native operating system services
       • Does not replace existing scheduling and resource control mechanisms
       • Native operating system is a Grid device driver

  5. Scalability
     • Target Size
       • Hundreds to millions of nodes
       • Different platforms for different scale Grids
     • Global resource management protocols
       • Fixed format messages
       • Ability to locally tune protocol performance parameters to match local infrastructure and administrative policy
     • Local policies for resource management
       • Scheduling, Quality of Service, Tolerance to faults

  6. Extensibility and Adaptability
     • Extensible resource protocol content
       • Fixed message framework with structured extensibility (XML like)
     • Extensible resource management protocol processing
       • Message content extensions are processed by extension modules
       • Modules are dynamically loaded and register content identifiers
     • Variability
       • Multiple different implementations of the resource protocols
     • Adaptability
       • Nodes and resources enter and leave the grid continuously
       • Fault tolerance by resource replication
       • Operate in an actively hostile environment
       • Try to survive Byzantine failures
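
The extension-module idea on this slide can be illustrated with a minimal sketch. It assumes a message carries a list of (content identifier, payload) pairs on top of the fixed framework; the class name `ExtensionRegistry`, the identifier `qos.cpu`, and the choice to skip unknown identifiers are all hypothetical and not taken from the presentation. Dynamic loading of modules is left out.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical handler type: takes the raw extension payload, returns a result string.
Handler = Callable[[bytes], str]


class ExtensionRegistry:
    """Maps content identifiers to the extension modules that process them."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, content_id: str, handler: Handler) -> None:
        # An extension module calls this once to claim a content identifier.
        self._handlers[content_id] = handler

    def process(self, extensions: List[Tuple[str, bytes]]) -> List[str]:
        results = []
        for content_id, payload in extensions:
            handler = self._handlers.get(content_id)
            if handler is None:
                # Assumption: unknown extensions are skipped, so nodes without
                # a given module can still handle the rest of the message.
                continue
            results.append(handler(payload))
        return results


registry = ExtensionRegistry()
registry.register("qos.cpu", lambda p: f"cpu share requested: {p.decode()}")

message_extensions = [("qos.cpu", b"25%"), ("vendor.unknown", b"\x00\x01")]
results = registry.process(message_extensions)
```

Skipping unrecognized identifiers is one possible way to let multiple different implementations of the protocol coexist; a real design might instead reject or forward such content.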

  7. Quality of Service
     • Not restricted to end-to-end network
       • Processor, memory, I/O also need to support QoS specifications
     • Co-allocation and Co-reservation
       • Allocation and scheduling need to take into account QoS given to other jobs already in the Grid
     • Providing Service Level Agreements
       • Aggregate performance levels or on a per job basis?
       • Site autonomy and resource control restrict the ability to provide guarantees
     • Applications should be able to negotiate QoS with the Grid
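
As a rough illustration of allocation that accounts for QoS already given to other jobs, the sketch below checks a new request against processor, memory, and I/O capacity on a single node. The types `QosSpec` and `NodeCapacity`, the units, and the simple additive model are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QosSpec:
    cpu_share: float   # fraction of the processor, 0.0 to 1.0 (assumed unit)
    memory_mb: int     # megabytes of memory
    io_mbps: float     # I/O bandwidth in megabits per second


@dataclass(frozen=True)
class NodeCapacity:
    cpu_share: float
    memory_mb: int
    io_mbps: float


def can_admit(capacity: NodeCapacity, admitted: List[QosSpec], request: QosSpec) -> bool:
    """Admit the request only if every resource still fits after counting
    the QoS already promised to jobs running on this node."""
    used_cpu = sum(job.cpu_share for job in admitted)
    used_mem = sum(job.memory_mb for job in admitted)
    used_io = sum(job.io_mbps for job in admitted)
    return (
        used_cpu + request.cpu_share <= capacity.cpu_share
        and used_mem + request.memory_mb <= capacity.memory_mb
        and used_io + request.io_mbps <= capacity.io_mbps
    )


node = NodeCapacity(cpu_share=1.0, memory_mb=4096, io_mbps=100.0)
running = [QosSpec(cpu_share=0.5, memory_mb=2048, io_mbps=40.0)]

fits = can_admit(node, running, QosSpec(cpu_share=0.4, memory_mb=1024, io_mbps=30.0))
too_big = can_admit(node, running, QosSpec(cpu_share=0.6, memory_mb=1024, io_mbps=30.0))
```

A negotiating application could, under the same assumptions, retry with a smaller `QosSpec` when `can_admit` returns `False`.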

  8. Grid Models

  9. Resource Management Techniques
     • Super Scheduler
       • Hierarchy of cooperating schedulers
       • Issues: Co-allocation
     • Market Based
       • Auctioning for resources
       • Issues: Price management and co-allocation
     • Resource Discovery
       • Resource attribute and status in a distributed database
       • Centralized, Agent based, or Hybrid
       • Issues: devise highly distributed, scalable, fault tolerant schemes
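
To make the market-based technique concrete, here is a small sketch of auctioning a single resource. The slide does not say which auction format is intended; a first-price sealed-bid auction with a reserve price is assumed here purely for illustration, and the function name `run_auction` and the job names are hypothetical.

```python
from typing import Dict, Optional, Tuple


def run_auction(bids: Dict[str, float], reserve_price: float) -> Optional[Tuple[str, float]]:
    """First-price sealed-bid auction for one resource.

    Returns (winning job, price paid), or None if no bid reaches the
    resource owner's reserve price.
    """
    eligible = {job: bid for job, bid in bids.items() if bid >= reserve_price}
    if not eligible:
        return None
    # Highest bid wins; on a tie, the bid inserted first wins.
    winner = max(eligible, key=lambda job: eligible[job])
    return winner, eligible[winner]


outcome = run_auction({"job-a": 3.0, "job-b": 5.5, "job-c": 1.0}, reserve_price=2.0)
no_sale = run_auction({"job-a": 1.0}, reserve_price=2.0)
```

The sketch handles one resource at a time, which is exactly where the co-allocation issue named on the slide shows up: a job that needs several resources at once may win some auctions and lose others.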

  10. Application Execution Models
      • Legacy application
        • Native OS resource and scheduling, implicit QoS
        • Use external resource description language
        • Modify native OS and service libraries and infer resource requirements and QoS
        • Recompile with Grid aware compiler that inserts specialized Grid code
      • Grid Aware application
        • Use specialized Grid API
        • First “applications” will be compilers, service libraries (MPI, PVM), Grid workbenches and monitoring tools

  11. Grid Aware Applications

  12. Non-Grid Aware Applications

  13. Grid Architecture

  14. Design Approach
      • Layered
        • Grid Kernel
        • Grid Core Services
        • Grid toolkits, workbenches, and user interfaces
      • Fully distributed peer-to-peer model
        • No centralized information servers
        • Implementations free to use specialized servers
      • Minimal configuration
        • Use Service Location Protocol like service

  15. Grid Kernel Architectural Principles
      • Functions that use the services are aware of the distributed environment
      • No guarantees made about reliability of nodes or links
      • Operate on all types of heterogeneous nodes using minimal resources
      • Services will be implemented using native OS with minimal changes to trusted computing base
      • Provide uniform extensible API and services across all nodes
      • Provide resource management mechanisms but do not implement resource management policies

  16. Grid Architecture

  17. Grid Layers and Core Services

  18. Grid Example Applications

  19. Applications
      • Compiling
        • Ensure similar compiler and libraries are used on all nodes
        • Compute how long to transfer and compile
        • Perform deadline scheduling
      • Legacy Numerical Processing
        • Dynamic linking of Grid code, variable QoS for job steps
        • Describe network QoS requirements or infer dynamically
        • Much further research required
      • Collaborative Research Workbench
        • Negotiate video bandwidth required
        • Query if a simulation can be run and completed quickly, or schedule it later
        • Different GUI depending on resources nearby to a researcher
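
The compiling example (estimate transfer and compile time, then schedule against a deadline) might be sketched as follows. The cost model, the `Node` fields, and the node names are assumptions for illustration; the presentation does not give a formula.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    bandwidth_mbps: float       # link speed to the node, megabits per second
    compile_rate_kloc_s: float  # assumed compile speed, thousand lines per second


def estimated_seconds(node: Node, source_mb: float, kloc: float) -> float:
    """Assumed cost model: time to ship the sources plus time to compile them."""
    transfer = source_mb * 8 / node.bandwidth_mbps
    compile_time = kloc / node.compile_rate_kloc_s
    return transfer + compile_time


def pick_node(nodes: List[Node], source_mb: float, kloc: float,
              deadline_s: float) -> Optional[Node]:
    """Return the fastest node that meets the deadline, or None if none can."""
    feasible = [n for n in nodes if estimated_seconds(n, source_mb, kloc) <= deadline_s]
    return min(feasible, key=lambda n: estimated_seconds(n, source_mb, kloc), default=None)


nodes = [
    Node("fast-cpu-slow-link", bandwidth_mbps=10.0, compile_rate_kloc_s=5.0),
    Node("slow-cpu-fast-link", bandwidth_mbps=100.0, compile_rate_kloc_s=1.0),
]

chosen = pick_node(nodes, source_mb=50.0, kloc=100.0, deadline_s=90.0)
missed = pick_node(nodes, source_mb=50.0, kloc=100.0, deadline_s=30.0)
```

With these made-up numbers the first node needs 40 s to receive the sources and 20 s to compile, while the second needs 4 s and 100 s, so only the first meets a 90 s deadline and neither meets 30 s.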

  20. Related Work

  21. Related Work
      • Application Enabling Systems
        • Provide tools to allow applications to access globally distributed resources in an integrated fashion
        • ATLAS, Globe, Globus/GUSTO, Legion, ParaWeb, Symera
      • User Access Systems
        • Provide end users of the Grid transparent access to geographically distributed systems in a location independent manner
        • CCS, MOL, NetSolve, PUNCH

  22. Questions?
