
Process Management Working Group Process Management “Meatball”




  1. Process Management Working Group: Process Management “Meatball”. Dallas, November 28, 2001

  2. Subcomponents
     • Process/Job management
        • What it includes and what it doesn’t include
        • Current status of interface definition
        • Demo
     • Monitoring
        • Examples
        • Relationship to process management
     • Checkpointing
        • Is this a component?
        • Relationship to process management

  3. One Meatball
     [Diagram of the system software components. Labels recoverable from the figure: Access Control / Security Manager (interacts with all components); Meta Scheduler; Meta Monitor; Meta Manager; Node Configuration & Build Manager; System Monitor; Accounting; Scheduler; Resource Allocation Management; Process Manager; Queue Manager; User DB; Data Migration; High Performance Communication & I/O; Usage Reports; User Utilities; Checkpoint/Restart; File System; Application Environment.]

  4. Process Manager Responsibilities
     • Starts processes (and therefore knows hosts and pids)
     • Delivers arguments, environment, limits (between fork and exec)
     • Starts other processes that need to know pids
        • Monitoring (e.g. Paradyn)
        • Debugging (e.g. TotalView)
        • Other (e.g. Myrinet monitor)
     • Kills jobs
     • Signals processes
        • May be part of checkpointing
     • Reports on job start/termination
     • Provides return codes (job/process)
     • Handles stdio as directed
     • Services the application runtime layer
        • Implements PMI (put/get/barrier/spawn, others as discovered)
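The "between fork and exec" step above can be illustrated with a minimal Python sketch. It is not the working group's process manager; `start_process` and `wait_for` are hypothetical helper names, and the sketch assumes a POSIX system. It shows the child applying resource limits and an environment before `exec`, while the parent learns the pid and later collects the return code.

```python
import os
import resource
import sys


def start_process(argv, env, limits):
    """Fork, apply limits in the child, then exec argv with env.

    Hypothetical helper: returns the child's pid to the caller,
    which plays the role of the process manager.
    """
    pid = os.fork()
    if pid == 0:
        # Child: this is the window between fork and exec in which
        # limits and environment for the new program are put in place.
        try:
            for which, (soft, hard) in limits.items():
                resource.setrlimit(which, (soft, hard))
            os.execvpe(argv[0], argv, env)
        finally:
            # Reached only if setrlimit or exec failed.
            os._exit(127)
    return pid


def wait_for(pid):
    """Block until the child exits; return its exit code,
    or the negated signal number if it was killed by a signal."""
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    # The child exits with the code it finds in its environment.
    code = "import os, sys; sys.exit(int(os.environ['DEMO_RC']))"
    child_env = dict(os.environ, DEMO_RC="3")
    child = start_process([sys.executable, "-c", code], child_env,
                          {resource.RLIMIT_CORE: (0, 0)})
    print("pid", child, "exit code", wait_for(child))
```

Because the parent holds the pid, it is also the natural place to send signals (for example with `os.kill`) or to hand the pid to a monitor or debugger.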

  5. P.M. Non-Responsibilities
     • Policy
     • Real-time resource usage monitoring

  6. Process Manager Component Interface to Other Components
     • Defined (i.e., a proposed XML schema exists)
        • Start-job
        • Start-job response
        • Kill-job
        • Kill-job response
     • To do
        • Suspend-job, resume-job
        • Signal-job in general
        • Asynchronous notifications
           • Job started
           • Job terminated
           • Others
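As an illustration of the request/response pairs listed above, here is a small Python sketch using the standard-library XML parser. The slides do not show the proposed schema, so every element and attribute name here (`start-job`, `executable`, `arg`, `host`, `id`, `status`) is an assumption made for the example, as are the helper names `build_start_job` and `handle_request`.

```python
import xml.etree.ElementTree as ET


def build_start_job(job_id, executable, args, hosts):
    """Build a start-job request. Element names are illustrative only."""
    root = ET.Element("start-job", {"id": job_id})
    ET.SubElement(root, "executable").text = executable
    for arg in args:
        ET.SubElement(root, "arg").text = arg
    for host in hosts:
        ET.SubElement(root, "host").text = host
    return ET.tostring(root, encoding="unicode")


def handle_request(xml_text):
    """Parse one request document and return the matching response.

    A real process manager would start or kill processes here; this
    sketch only maps each request type to its response type.
    """
    request = ET.fromstring(xml_text)
    job_id = request.get("id", "")
    if request.tag == "start-job":
        response = ET.Element("start-job-response",
                              {"id": job_id, "status": "started"})
    elif request.tag == "kill-job":
        response = ET.Element("kill-job-response",
                              {"id": job_id, "status": "killed"})
    else:
        # Requests from the "to do" list are not handled in this sketch.
        response = ET.Element("error", {"reason": "unsupported: " + request.tag})
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    message = build_start_job("42", "/bin/hostname", ["-f"], ["n1", "n2"])
    print(message)
    print(handle_request(message))
```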

  7. The Process Manager Interface to Application Libraries
     • A Prototype: PMI (formerly known as BNR)
        • Used by application libraries (e.g. MPI implementations, UPC implementations, common runtime systems for multiple languages and libraries)
        • Provided by process managers
        • Simple and general
           • Find out rank and size
           • Put and get into keyval space
           • Barrier
           • Spawn
        • Currently used by MPICH, provided by MPD
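The put/get/barrier pattern above can be sketched with a toy model. This is not the actual PMI interface used by MPICH; `ToyPMI`, `run_job`, and `exchange_addresses` are hypothetical names, threads stand in for the processes of a job, and spawn is left out. The sketch shows the typical use: each rank publishes its contact information, waits at a barrier, then reads a peer's entry.

```python
import threading


class ToyPMI:
    """Per-rank handle onto a job-wide keyval space (illustrative only)."""

    def __init__(self, rank, size, kvs, lock, barrier):
        self.rank = rank
        self.size = size
        self._kvs = kvs
        self._lock = lock
        self._barrier = barrier

    def put(self, key, value):
        with self._lock:
            self._kvs[key] = value

    def get(self, key):
        with self._lock:
            return self._kvs[key]

    def barrier(self):
        # Returns only after every rank in the job has called barrier().
        self._barrier.wait()


def run_job(size, body):
    """Run body(pmi) once per rank, using threads in place of processes."""
    kvs, lock = {}, threading.Lock()
    barrier = threading.Barrier(size)
    results = [None] * size

    def worker(rank):
        results[rank] = body(ToyPMI(rank, size, kvs, lock, barrier))

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def exchange_addresses(pmi):
    """Publish this rank's (made-up) address, then look up the right neighbour."""
    pmi.put("addr-%d" % pmi.rank, "host%d:5000" % pmi.rank)
    pmi.barrier()  # after this, every rank's put is visible
    right = (pmi.rank + 1) % pmi.size
    return pmi.get("addr-%d" % right)


if __name__ == "__main__":
    print(run_job(4, exchange_addresses))
```

The barrier is what makes the `get` safe: without it a rank could look up a key its neighbour has not yet written.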

  8. The Chiba City Testbed
     • Dedicated to scalability research in computer science rather than to applications
     • Currently 256 dual-processor nodes
     • Designed to promote experimentation with system software
     • SciDAC projects can get accounts:
        • Web form at http://www-accounts.mcs.anl.gov
        • Specify SCIDAC as Project Group
        • Specify closest Argonne SciDAC person as contact (Rusty or Narayan for SSS)
     • Future plans
        • 1000 nodes, 8000 virtual nodes
           • VMware
           • User-mode Linux

  9. A Demo
     • Start Service Directory component
     • Start Process Manager component
        • It registers itself with Service Directory
     • Start Proto-scheduler component
        • It queries Service Directory for access location (host, port) of process manager
        • It sends job-start requests from hard-coded queue to process manager
     • Process manager runs parallel jobs
     • All components communicate using XML
        • Use XML schema for process-manager requests, responses
        • Prototypes written in Python with built-in XML parser
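The register-then-query flow of the demo can be sketched in a few lines of Python. This is an in-memory stand-in, not the demo's code: `ServiceDirectory`, `run_demo`, the component name `"process-manager"`, the port number, and the shape of the request strings are all assumptions made for illustration.

```python
class ServiceDirectory:
    """In-memory stand-in for the Service Directory component."""

    def __init__(self):
        self._locations = {}

    def register(self, component, host, port):
        self._locations[component] = (host, port)

    def lookup(self, component):
        # Raises KeyError if the component never registered.
        return self._locations[component]


def run_demo():
    directory = ServiceDirectory()

    # The process manager registers its access location on startup.
    directory.register("process-manager", "localhost", 8001)

    # The proto-scheduler asks the directory where the process manager is...
    host, port = directory.lookup("process-manager")

    # ...and would send one job-start request per entry of its hard-coded
    # queue to that location. Here the requests are only collected.
    hard_coded_queue = ["job-1", "job-2"]
    return [(host, port, '<start-job id="%s"/>' % job) for job in hard_coded_queue]


if __name__ == "__main__":
    for request in run_demo():
        print(request)
```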

  10. A Modest Proposal
     • Multiple Wire Protocols are allowed.
        • Components declare a WP associated with a port when they register with the service directory. (They can register multiple ports.)
        • Other components learn the WP associated with a port when they find out the port.
     • The default protocol is the “basic” protocol.
        • TCP
        • A message consists of a complete XML document
        • After sending, the sender does shutdown on the socket, providing EOF to the receiver to signal the end of the message, but leaving the socket half-open to receive the response.
     • All components are required to support at least the basic protocol.
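The "basic" protocol described above maps directly onto standard socket calls. The following Python sketch is one possible reading of it, not the working group's implementation; the function names, the loopback address, and the `<ack .../>` reply are assumptions. The sender writes one complete XML document, calls `shutdown(SHUT_WR)` so the receiver sees EOF, and then reads the response on the still-open other half of the connection.

```python
import socket
import threading


def recv_until_eof(sock):
    """Read until the peer shuts down its sending side (recv returns b'')."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks)


def send_request(host, port, xml_doc):
    """Send one XML document and return the response document."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(xml_doc.encode("utf-8"))
        # Half-close: EOF marks the end of the message for the receiver,
        # while this side can still read the response.
        sock.shutdown(socket.SHUT_WR)
        return recv_until_eof(sock).decode("utf-8")


def serve_once(listener, handler):
    """Accept one connection, read one message, send one response."""
    conn, _ = listener.accept()
    with conn:
        request = recv_until_eof(conn).decode("utf-8")
        conn.sendall(handler(request).encode("utf-8"))
    # Closing the connection gives the client its EOF.


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(
        target=serve_once,
        args=(listener, lambda req: "<ack bytes='%d'/>" % len(req)))
    server.start()
    reply = send_request("127.0.0.1", port, "<start-job id='1'/>")
    server.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

Because EOF delimits the message, neither side needs a length prefix or a terminator, at the cost of one connection per request/response pair.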

  11. Advantages
     • Something easy to start with
     • No “framing problem”
     • No other software required
     • Does not preclude other protocols, which include security, streaming, etc.
     • Can be used to bootstrap switches of protocol.
