
Nanco: a large HPC cluster for RBNI (Russell Berrie Nanotechnology Institute)

Anne Weill-Zrahia, Technion Computer Center, October 2008



Presentation Transcript


  1. Nanco: a large HPC cluster for RBNI (Russell Berrie Nanotechnology Institute) Anne Weill-Zrahia, Technion Computer Center, October 2008

  2. Resources needed for applications arising from Nanotechnology • Large memory – Tbytes • High floating-point computing speed – Tflops • High data throughput – state of the art …

  3. SMP architecture [diagram: four processors (P) sharing a single Memory]

  4. Cluster architecture [diagram: nodes, each pairing a Processor with its own Memory, connected by an interconnection network]

  5. Why not a cluster • A single SMP system is easier to purchase/maintain • Ease of programming on SMP systems

  6. Why a cluster • Scalability • Total available physical RAM • Reduced cost • But …

  7. Having an application which exploits the parallel capabilities requires studying the application or applications which will run on the cluster

  8. Things to include in design

  9. Our choices

  10. Other requirements • Space, power, cooling constraints, strength of floors • Software configuration: • Operating system • Compilers & application development tools • Load balancing and job scheduling • System management tools

  11. Configuration [diagram: 64 nodes (node1 … node64), each with processors (P) and memory (M), connected through an InfiniBand switch]

  12. Before finalizing our choice … One should check, on a similar system: • Single-processor peak performance • InfiniBand interconnect performance • SMP behaviour • Non-commercial parallel applications behaviour

  13. Parallel applications issues • Execution time • Parallel speedup Sp = T1/Tp • Scalability
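The speedup formula on this slide is easy to tabulate. Below is a minimal sketch in Python; the timing values are hypothetical and only illustrate how Sp = T1/Tp and the derived efficiency Ep = Sp/p behave as processor count grows.

```python
# Hypothetical per-run wall-clock times (seconds) for p processors;
# these numbers are illustrative, not measurements from Nanco.
timings = {1: 800.0, 2: 420.0, 4: 230.0, 8: 130.0}

def speedup(t1, tp):
    """Parallel speedup S_p = T_1 / T_p."""
    return t1 / tp

def efficiency(t1, tp, p):
    """Parallel efficiency E_p = S_p / p (1.0 = perfect scaling)."""
    return speedup(t1, tp) / p

for p in sorted(timings):
    sp = speedup(timings[1], timings[p])
    ep = efficiency(timings[1], timings[p], p)
    print(f"p={p}: S_p={sp:.2f}, E_p={ep:.2f}")
```

Efficiency falling with p is the "scalability" item on the slide: a code scales well when Ep stays close to 1 as processors are added.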

  14. Benchmark design • Must give a good estimate of performance of your application • Acceptance test – should match all its components

  15. Comparison of performance

  16. Execution time of Monte-Carlo parallel code (MPI)

  17. What did work • Running MPI code interactively • Running a serial job through the queue • Compiling C code with MPI
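Slides 17-18 distinguish interactive MPI runs from runs submitted through the queue. The slides do not name the batch system; as a hedged sketch only, a submission script for a Sun Grid Engine-style scheduler (plausible given the Sun comparison later in the deck) might look like the following, where the job name, parallel-environment name, and executable are all hypothetical:

```
#!/bin/bash
# Hypothetical SGE-style submission script -- scheduler, PE name ("mpi"),
# and executable ("monte_carlo") are assumptions, not taken from the slides.
#$ -N mc_mpi          # job name
#$ -pe mpi 8          # request 8 slots in an MPI parallel environment
#$ -cwd               # run from the submission directory
mpirun -np $NSLOTS ./monte_carlo
```

The distinction matters for the acceptance test: a code that runs under `mpirun` interactively can still fail under the queue if the scheduler's MPI integration is misconfigured, which is exactly the failure mode slide 18 reports.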

  18. What did not work • Compiling F90 or C++ code with MPI • Running MPI code through the queue • Queues do not do accounting per CPU

  19. Parallel performance results • Theoretical peak: 2.1 Tflops • Nanco performance on HPL: 0.58 Tflops
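The two figures on this slide imply an HPL efficiency (sustained over peak), which is the usual way to compare an acceptance-test result against the theoretical maximum. A one-line computation using the slide's numbers:

```python
# Figures taken from the slide.
theoretical_peak_tflops = 2.1
hpl_tflops = 0.58

# HPL efficiency = sustained performance / theoretical peak.
hpl_efficiency = hpl_tflops / theoretical_peak_tflops
print(f"HPL efficiency: {hpl_efficiency:.1%}")
```

This comes out below 30%, which the later slides on compiler choice, MPI implementation, and disk access help explain: several components of the stack still had tuning headroom at acceptance time.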

  20. Comparison with Sun Benchmark

  21. Execution time – comparison of compilers

  22. Performance with different optimizations

  23. Conclusions from acceptance tests • New gcc (gcc4) is faster than Pathscale for some applications • MPI collective communication functions are differently implemented in various MPI versions • Disk access times are crucial - use attached storage when possible

  24. Scheduling decisions • Assessing priorities between user groups • Assessing parallel efficiency of different job types (MPI, serial, OpenMP) / commercial software and designing special queues for them • Avoiding starvation by giving weight to the urgency parameter

  25. Observations during production mode • Assessing users' understanding of the machine – support in writing scripts and efficient parallelization • Lack of visualization tools – writing a script to show current usage of the cluster
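The slide mentions writing a script to show current cluster usage. A minimal sketch of that idea is below; the input format ("node used total" per line) is hypothetical, since a real script would parse the actual scheduler's node-status output rather than this made-up form.

```python
# Hypothetical node-status lines: "name used_slots total_slots".
# A production script would parse real scheduler output instead.
def cluster_utilization(status_lines):
    """Return the fraction of CPU slots in use across all nodes."""
    used = total = 0
    for line in status_lines:
        _, u, t = line.split()
        used += int(u)
        total += int(t)
    return used / total

sample = ["node1 4 4", "node2 2 4", "node3 0 4"]
print(f"utilization: {cluster_utilization(sample):.0%}")  # prints "utilization: 50%"
```

Aggregating per-node counts into one number is enough for the at-a-glance view the slide describes; the charts on the next slides (utilization over time, jobs by type) are refinements of the same data.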

  26. Utilization of cluster

  27. Utilization of Nanco, Sep 08

  28. Nanco jobs by type

  29. Conclusion • Correct benchmark design is crucial to test the capabilities of the proposed architecture • Acceptance tests allow one to negotiate with vendors and give insights on future choices • Only after several weeks of running the cluster at full capacity can we make informed decisions on its management
