This tutorial teaches how to couple resources across multiple sites to run large-scale applications using Globus. Learn to execute existing parallel codes seamlessly with MPICH-G, a grid-enabled MPI, and leverage enhanced communication methods like Nexus. Topics covered include resource allocation, scheduling, co-allocation, dynamic resource discovery, and the programming models used in grid computing. Discover practical applications such as gas dynamics simulations and climate modeling, while understanding security protocols and resource management strategies to optimize performance across heterogeneous computing environments.
Globus Grid Tutorial, Part 2: Running Programs Across Multiple Resources
Goals of this Tutorial • Learn how to couple resources at multiple sites and use them for a single application • Required by very large applications • Also by applications that need a heterogeneous mix of resources • Learn how to run existing parallel codes under Globus • By using MPICH-G, a grid-enabled MPI • Other application examples include • SF-Express, climate models, etc.
Gas Dynamics (PPM) • Tightly coupled CFD problem • Needs large computational power • Mask latency by overlapping communication and computation • Move data a brick at a time • Size bricks to CPU and network
[Figure: task diagram with an Archiver, a Task Manager, two Brick Managers, and three Brick Updaters. Credit: Woodward, U Minn.]
Problems • How do we start a program running across multiple machines? • Co-allocation and scheduling • Different schedulers and security systems • What programming model should be used? • Can we run existing applications?
Globus Advantages • Resource management architecture provides co-allocation tools • Can mix communication methods • Nexus multimethod communication • MPI, sockets, etc. • Uniform access to local services • Security, resource management, etc. • Architecture promotes building high-level programming tools • E.g., MPICH-G, a grid-enabled MPI
Programming Tools: Approaches • “Hand coded” applications, combining existing tools with Globus calls • Use sockets, MPI, threads, SHM, etc. • Globus security and resource management still provide added value • Grid-enabled libraries • Manage both communication and resource management • Provide uniform programming environment across resources
MPICH-G • A complete implementation of the Message Passing Interface (MPI) • Passes the MPICH regression tests without change • MPI is the de facto standard for message-passing parallel programs • Enables existing MPI programs to run within a grid environment without change • Documentation • http://www.globus.org/mpi
Running a Program • Goal: Run a Message Passing Interface (MPI) program on multiple computers • MPICH-G uses Globus for authentication, resource allocation, executable staging, output redirection, etc.
    % mpirun -np 4 my_app
Running an MPICH-G Program • Create a file named “machines” • A list of Globus managers and counts
    sp2.sdsc.edu-loadleveler 4
    neptune.cacr.caltech.edu-lsf 4
    jupiter.isi.edu-fork
    % mpirun -np 12 my_app
• Creates a total of 12 tasks allocated in a round-robin fashion with “count” tasks per allocation request: (sp2 4) (neptune 4) (jupiter 1) (sp2 3)
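The round-robin allocation on this slide can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not MPICH-G's actual code: the function name `allocate_round_robin` is hypothetical, and the count of 1 for jupiter is an assumption inferred from the "(jupiter 1)" entry in the slide's result, since the machines file shows no count for that host.

```python
from typing import List, Tuple


def allocate_round_robin(machines: List[Tuple[str, int]],
                         total_tasks: int) -> List[Tuple[str, int]]:
    """Split total_tasks into allocation requests, cycling over the machines.

    Each request asks one manager for at most its "count" tasks; the list is
    revisited from the top until every task has been placed.
    """
    if total_tasks < 0:
        raise ValueError("total_tasks must be non-negative")
    if total_tasks > 0 and not any(count > 0 for _, count in machines):
        raise ValueError("no machine has a positive count")

    requests: List[Tuple[str, int]] = []
    remaining = total_tasks
    while remaining > 0:
        for name, count in machines:
            if remaining == 0:
                break
            granted = min(count, remaining)
            if granted > 0:
                requests.append((name, granted))
                remaining -= granted
    return requests


# Machines file from the slide; jupiter's count of 1 is an assumption.
machines = [
    ("sp2.sdsc.edu-loadleveler", 4),
    ("neptune.cacr.caltech.edu-lsf", 4),
    ("jupiter.isi.edu-fork", 1),
]

for name, granted in allocate_round_robin(machines, 12):
    print(name, granted)
```

With 12 tasks, the first pass places 4 + 4 + 1 = 9 tasks, and the remaining 3 go back to sp2, matching the slide's (sp2 4) (neptune 4) (jupiter 1) (sp2 3).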
How MPICH-G Works • mpirun: • Locates complete Globus resource manager information for the specified resources (MDS) • Creates a resource specification request • Calls globusrun to execute the program • Uses Nexus for communication • Delivers enhanced performance by using multiple communication protocols
Starting Multiple Jobs • The globusrun command: • Submits multiple simultaneous job requests • Stages executables (GASS) • Waits for termination (GRAM/DUROC) • Forwards stdout/stderr (GASS) • Convenient wrapper around several Globus services: • DUROC, GASS, GRAM, GSI, MDS
Globus Resource Managers • Every resource is controlled by a resource manager called a GRAM • Interfaces to the local resource management system, e.g., LoadLeveler, NQE, LSF • Every resource manager has a unique distinguished name, or DN • A DN is a sequence of attribute-value pairs: /C=US/O=Globus/O=USC/OU=ISI/CN=jupiter.isi.edu-fork • The MDS stores information about each resource manager
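To make the "sequence of attribute-value pairs" concrete, here is a minimal Python sketch that splits the slash-separated DN shown on the slide. The helper `parse_dn` is hypothetical and handles only this simple form (no escaped slashes or equals signs); it is not part of any Globus tool.

```python
from typing import List, Tuple


def parse_dn(dn: str) -> List[Tuple[str, str]]:
    """Split a slash-separated DN into ordered (attribute, value) pairs.

    A list is returned rather than a dict because an attribute such as "O"
    may appear more than once and the order of the pairs is significant.
    """
    if not dn.startswith("/"):
        raise ValueError("expected a DN starting with '/'")
    pairs: List[Tuple[str, str]] = []
    for component in dn[1:].split("/"):
        attribute, separator, value = component.partition("=")
        if not separator or not attribute:
            raise ValueError(f"malformed component: {component!r}")
        pairs.append((attribute, value))
    return pairs


dn = "/C=US/O=Globus/O=USC/OU=ISI/CN=jupiter.isi.edu-fork"
for attribute, value in parse_dn(dn):
    print(attribute, value)
```

The final CN component carries the manager name used in the machines file (jupiter.isi.edu-fork).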
Limitations of Simple mpirun • Limitations of “machines” file • Executable staging only for homogeneous sets of machines • For heterogeneous sets, executables must be placed in the same location on every machine • More general MPICH-G startup is possible • Dynamic discovery of resources • Specify name of the executable at each site • Specify location of executables and data files • Currently achieved by passing RSL string
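As a rough illustration of naming a different executable at each site, the Python sketch below assembles a multi-request string in the general shape of Globus RSL. Several things here are assumptions rather than content from the slides: the attribute names `resourceManagerContact`, `count`, and `executable` and the `+` / `&` operators follow RSL as commonly documented and should be checked against the RSL reference for the installed Globus version; the executable paths are invented; the manager names from the machines file stand in for real contact strings; and `build_multirequest` is a hypothetical helper.

```python
from typing import Dict, List, Union

Subjob = Dict[str, Union[str, int]]


def build_multirequest(subjobs: List[Subjob]) -> str:
    """Build one multi-request string with a separate clause per site.

    String values are double-quoted and integer values are written bare.
    The attribute names are whatever keys the caller supplies.
    """
    clauses = []
    for subjob in subjobs:
        relations = []
        for attribute, value in subjob.items():
            if isinstance(value, int):
                relations.append(f"({attribute}={value})")
            else:
                relations.append(f'({attribute}="{value}")')
        clauses.append("(&" + "".join(relations) + ")")
    return "+" + "".join(clauses)


# Hypothetical per-site executables for a heterogeneous run.
subjobs: List[Subjob] = [
    {"resourceManagerContact": "sp2.sdsc.edu-loadleveler",
     "count": 4, "executable": "/home/user/bin/my_app.sp2"},
    {"resourceManagerContact": "neptune.cacr.caltech.edu-lsf",
     "count": 4, "executable": "/home/user/bin/my_app.hp"},
    {"resourceManagerContact": "jupiter.isi.edu-fork",
     "count": 1, "executable": "/home/user/bin/my_app.sgi"},
]

print(build_multirequest(subjobs))
```

Each `(&...)` clause describes one subjob, so the executable location can differ per site, which is the case the plain machines file cannot express.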
Exercise 2: Introduction to MPI • Use mpirun to run an MPI program
    % mpirun -np 2 program
• Use globus-rcp to copy files remotely
    % globus-rcp filename host:filename
Globus Components in Action
[Figure: component diagram labeled mpirun, globusrun, DUROC, three GRAMs (fork, LSF, LoadLeveler), processes P1 and P2 on each resource, and Nexus]
Summary • Using multiple resources located in multiple domains is a basic grid operation • Globus supports this operation via core services and high-level tools • Standard MPI programming environment provides a convenient way of building grid applications • Must be careful about configuration and latency