
Software Tools for Dynamic Resource Management


Presentation Transcript


  1. Software Tools for Dynamic Resource Management
     Irina V. Shoshmina, Dmitry Yu. Malashonok, Sergay Yu. Romanov
     Institute of High-Performance Computing and Information Systems
     www.csa.ru
     {irena,mal,serrom}@csa.ru

  2. State of the art
     Resources: CONVEX(es) Parsytec CC/16 Parsytec CCid Parsytec Power Mouse System SPP1600 SGI OCTANE Workstations SunUltra 450 Paritet (Intel cluster)
     www.csa.ru/CSA
     Scientific problems: hydroaerodynamics plasma nuclear physics medicine biology chemistry astronomy

  3. Difficulties
     • shortage of resources for the scientific problems being solved
     • unsatisfactory management of tasks (the majority of tasks are parallel)

  4. Shortage of resources
     Integrate the computational resources of several scientific centres.
     Advantages of integration
     • increase access to, and use of, computational resources
     • promote integration of the scientific community
     • widen the range of scientific and technical problems that can be solved

  5. Management of tasks
     Tools for optimising task distribution across computational nodes
     • Codine
     • SunGridEngine
     • PBS
     • Condor
     Disadvantages of the tools
     • weak support for migration of parallel tasks
     • unsatisfactory load balancing
     • dependence on versions of PVM and MPI

  6. Main goals of the project
     • use computing resources more efficiently
     • improve the quality of service for users
     Main tasks
     • migration of parallel tasks
     • optimisation of distributed resource management
     • integration of the resources of several scientific centres

  7. Dynamite
     Software developed by the University of Amsterdam in the Esprit project 23499
     Dynamite advantages
     • migration and checkpointing of PVM tasks
     • automatic work-load balancing of PVM tasks (on a cluster of workstations)
     • migration of dynamically linked tasks
     • migration of communication end points
     • reallocation of tasks

  8. Dynamite disadvantages
     • dependence on PVM versions
     • no migration of MPI tasks
     • no satisfactory monitoring system
     • no advanced scheduling system
     • no modules for global distribution

  9. Main steps of the project
     • Migration of MPI and PVM tasks
     • Checkpointing of parallel tasks
     • Monitoring
     • Resource management
     • Support for additional architectures

  10. Two-level system
     [Diagram: one global level above two local levels]

  11. Migration of PVM and MPI tasks
     Main problems of migration
     • migration of PVM tasks
     • migration of MPI tasks
     • independence from versions and implementations of PVM and MPI
     • addition of architectures
     • files
     • sockets
     • kernel-supported threads, etc.

  12. Checkpointing of parallel tasks
     • trace the progress of parallel tasks
     • migrate parallel tasks at two levels:
     • migrate a single process of a parallel task (local level)
     • migrate a parallel task as a whole (global level)
     • handle exceptional situations

  13. Checkpointing of parallel tasks
     [Diagram: one global level above three local levels]
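
The idea behind checkpointing a task so that it can be stopped and resumed elsewhere can be sketched at application level. This is only an illustration and not how Dynamite or the project implements it: the slides describe checkpointing of whole PVM/MPI processes, while the sketch below saves a small state dictionary explicitly. The names `save_checkpoint`, `load_checkpoint`, and `run`, and the `stop_after` parameter that simulates an interruption, are hypothetical.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write the state to a temporary file, then rename it over the old checkpoint."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)  # readers see either the old or the new checkpoint


def load_checkpoint(path, default):
    """Return the saved state, or the default if no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


def run(path, total_steps, stop_after=None):
    """Sum 0..total_steps-1, checkpointing after every step.

    stop_after (hypothetical) simulates the task being interrupted,
    for example because it is about to be migrated.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    done = 0
    while state["step"] < total_steps:
        if stop_after is not None and done >= stop_after:
            return None  # interrupted; progress is preserved in the checkpoint
        state["acc"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
        done += 1
    return state["acc"]
```

Calling `run` a second time with the same checkpoint path continues from the last saved step instead of starting over, which is the property that migration relies on.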

  14. Monitoring
     Parameters of
     • computational resources (processor load, memory, network)
     • tasks and queues
     • users
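
As a minimal sketch of one monitored parameter from the list above, the following samples processor load on the local node using only the Python standard library. It assumes a Unix-like system (`os.getloadavg` is not available elsewhere). The `NodeSample` record, the function names, and the overload threshold of 1.0 are assumptions made for illustration; memory, network, queue, and user parameters are not covered.

```python
import os
import time
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One monitoring record for a node (hypothetical structure)."""
    timestamp: float     # seconds since the epoch
    load_per_cpu: float  # 1-minute load average divided by the CPU count


def sample_node() -> NodeSample:
    """Take one processor-load sample on the local node (Unix only)."""
    load_1min, _, _ = os.getloadavg()  # raises OSError where unsupported
    cpus = os.cpu_count() or 1         # cpu_count() may return None
    return NodeSample(time.time(), load_1min / cpus)


def is_overloaded(sample: NodeSample, threshold: float = 1.0) -> bool:
    """Flag a node whose load per processor exceeds the threshold."""
    return sample.load_per_cpu > threshold
```

A monitoring daemon could call `sample_node()` periodically and forward the records to whatever component makes scheduling decisions.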

  15. Resource management
     • distribution of tasks and queues at the current moment
     • long-term scheduling
     • dynamic load balancing at the global and local levels
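
Dynamic load balancing can be illustrated with a simple greedy heuristic: repeatedly move one task from the busiest node to the idlest one while that narrows the gap between them. This is a sketch under stated assumptions, not the project's scheduler: task costs are assumed to be known numbers, migration is assumed to be free, and the function name `plan_migrations` and the `tolerance` parameter are hypothetical.

```python
from typing import Dict, List, Tuple


def plan_migrations(
    nodes: Dict[str, List[float]], tolerance: float = 0.1
) -> List[Tuple[float, str, str]]:
    """Return a list of (task_cost, source_node, target_node) moves."""
    loads = {name: list(tasks) for name, tasks in nodes.items()}  # work on a copy
    moves = []
    while True:
        busiest = max(loads, key=lambda n: sum(loads[n]))
        idlest = min(loads, key=lambda n: sum(loads[n]))
        gap = sum(loads[busiest]) - sum(loads[idlest])
        if gap <= tolerance:
            break  # loads are close enough
        # Only a task with 0 < cost < gap narrows the gap when moved.
        candidates = [t for t in loads[busiest] if 0 < t < gap]
        if not candidates:
            break
        # Moving cost t leaves a gap of |gap - 2t|; pick the task that minimises it.
        task = min(candidates, key=lambda t: abs(gap - 2 * t))
        loads[busiest].remove(task)
        loads[idlest].append(task)
        moves.append((task, busiest, idlest))
    return moves
```

For example, `plan_migrations({"a": [4.0, 3.0, 2.0], "b": [1.0], "c": []})` proposes moving the task of cost 4.0 from `a` to `c` and then the task of cost 2.0 from `a` to `b`, leaving loads of 3, 3, and 4. The same routine could in principle be applied at the local level (processes across nodes) and at the global level (tasks across centres), given suitable cost estimates.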

  16. Integration with Globus
     [Diagram: Globus global environment above six local levels]
