
Evaluation of Emerging Metacomputing Systems: Insights and Challenges

This report evaluates current metacomputing systems in the context of the ERDC MSRC, focusing on design, installation, maintenance, usability, and performance. It includes general and specific summaries related to production environments but does not constitute a comprehensive overview of systems utilized in large-scale applications. The report assesses Globus and Legion, addressing human factors, installation processes, security, and resource management while testing various LU solvers and MPI performance. Conclusions recommend infrastructure improvements and further testing against real-world metacomputing applications.

Presentation Transcript


  1. MetaComputing: An Evaluation of Emerging Systems • David Cronk • Brett Ellis • Graham Fagg (the PI)

  2. Goals • To review current metacomputing systems for the ERDC MSRC • Review was to be relative to an MSRC environment • To cover: • design, installation, maintenance, support, usability, performance...

  3. Goals • Two summaries • General for everybody • Specific to the ‘production’ environment at ERDC

  4. What the report is not • A perfect review of metacomputing systems being used to perform Grand Challenge applications at SuperComputing XYZ • We were limited to the resources here, i.e. we did not test: • batch queue systems • multi-site, multi-machine ‘Meta-jobs’

  5. The Report • Overview of Globus and Legion • Human factors • Installation and maintenance • OPRs, LDAP, MDA and CAs… • Or when do you know that Legion works? • You want to run Globus without us? • Ease of Use • accounts, logging in, compiling and running MPI jobs… • Assistance and support

  6. The Report • Site Autonomy • Security issues • Resource management • Grid files and local MayIs • System Functionality • The file system • GASS versus context space • Programming Language support • Fault tolerance • when to reinstall

  7. The Report • Performance • We tested a number of LU solvers, including one from ScaLAPACK, as well as basic MPI performance • Did not test HPF over a meta-MPI… nor F90
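For readers unfamiliar with the kind of kernel being timed, here is a minimal sketch of an LU solve with partial pivoting and a simple timing harness. It is a serial, pure-Python illustration only, not the ScaLAPACK solver or any code used in the report. The function names `lu_solve` and `time_lu`, the test matrix, and the 2n³/3 operation count used for the rate estimate are all assumptions made for this example.

```python
import time


def lu_solve(a, b):
    """Solve a x = b using LU factorization with partial pivoting."""
    n = len(a)
    lu = [row[:] for row in a]  # work on copies so the inputs are untouched
    x = b[:]
    for k in range(n):
        # Choose the row with the largest entry in column k as the pivot.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            x[k], x[p] = x[p], x[k]
        # Eliminate below the pivot, storing multipliers in the lower part.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            m = lu[i][k]
            for j in range(k + 1, n):
                lu[i][j] -= m * lu[k][j]
    # Forward substitution with the unit lower-triangular factor.
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    # Back substitution with the upper-triangular factor.
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def time_lu(n):
    """Time one solve of a hypothetical diagonally dominant n x n system."""
    a = [[1.0 / (i + j + 1) + (n if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    b = [1.0] * n
    start = time.perf_counter()
    lu_solve(a, b)
    elapsed = time.perf_counter() - start
    flops = 2.0 * n ** 3 / 3.0  # leading-order cost of the factorization
    return elapsed, flops / elapsed


if __name__ == "__main__":
    seconds, rate = time_lu(100)
    print(f"n=100: {seconds:.4f} s, roughly {rate:.3e} flop/s")
```

A distributed solver spreads the same factorization across processes, so its measured rate also reflects the message-passing layer underneath, which is presumably why basic MPI performance was measured alongside the solvers.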

  8. The Report • Performance • Not as expected … and currently under review, i.e. Legion will be faster

  9. Summary • Two possible answers • If certain infrastructure, such as Meta-Queuing and single log-in (as in krb5), already exists at the MSRCs, then Globus and Legion only buy us a global file system • Just run AFS? • Need to run multiple MPI jobs? • Use MPI_Connect, PACX, … • MPI_Connect was used for the joint CGWAVE SC98 challenge using machines at ASC & ERDC

  10. Summary OR: • Cannot declare a verdict until after we: • run tests against real-world Meta-Applications that need multiple MPPs at multiple sites on multiple batch queue systems, needing data from different repositories… • Grand challenge again.

  11. Status • The different project teams are reviewing the current document, so that our mistakes can be corrected and they can defend their systems • Next? Doing the Meta-Application stuff, hopefully…
