VolpexMPI: Performance Evaluation of VolpexMPI over InfiniBand


Presentation Transcript


  1. VolpexMPI: Performance Evaluation of VolpexMPI over InfiniBand Stephen Herbein Mentors: Jaspal Subhlok & Edgar Gabriel

  2. Volpex: Parallel Execution on Volatile Nodes • Fault tolerance: why? • Node failures on machines with thousands of processors (large clusters) • Node and communication failures in distributed environments (volunteer environments) • Volpex Project Goals: • Execution on failure-prone platforms • Key problem: high failure rates AND communicating parallel programs

  3. VolpexMPI • MPI library for execution of parallel applications on volatile nodes • Key features: • Controlled redundancy: each MPI process can have multiple replicas • Receiver-based direct communication between processes • Distributed sender logging to support slow processes (sketched below)
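The slide gives these features only at a high level; a minimal C sketch of receiver-driven messaging backed by a sender-side log might look like the following. All names in it (send_log_entry, log_append, serve_request, LOG_CAPACITY) are hypothetical, not taken from the VolpexMPI sources.

/* Hypothetical sketch of receiver-driven messaging with a bounded
 * sender-side log; illustrative only, not the VolpexMPI internals. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int    dest;     /* logical MPI rank of the receiver        */
    int    tag;      /* MPI tag                                  */
    long   seq;      /* per-(dest, tag) sequence number          */
    size_t len;      /* message size in bytes                    */
    void  *payload;  /* copy of the message body                 */
} send_log_entry;

#define LOG_CAPACITY 1024
static send_log_entry log_buf[LOG_CAPACITY];
static long log_head = 0;

/* Sender side: append an outgoing message to the local log so that any
 * replica of the receiver can pull it later by (dest, tag, seq). */
static void log_append(int dest, int tag, long seq, const void *buf, size_t len)
{
    send_log_entry *e = &log_buf[log_head++ % LOG_CAPACITY];
    free(e->payload);                       /* recycle the oldest slot */
    e->dest = dest; e->tag = tag; e->seq = seq; e->len = len;
    e->payload = malloc(len);
    memcpy(e->payload, buf, len);
}

/* Sender side: answer a pull request from a slow or restarted receiver
 * replica; returns NULL if the entry has already been overwritten. */
static const send_log_entry *serve_request(int dest, int tag, long seq)
{
    for (long i = 0; i < LOG_CAPACITY; i++)
        if (log_buf[i].payload && log_buf[i].dest == dest &&
            log_buf[i].tag == tag && log_buf[i].seq == seq)
            return &log_buf[i];
    return NULL;
}

int main(void)
{
    int msg = 42;
    log_append(1, 0, 7, &msg, sizeof msg);
    const send_log_entry *e = serve_request(1, 0, 7);
    printf("replay for seq 7: %s\n", e ? "found" : "already overwritten");
    return 0;
}

The point of the log is that a slow or restarted receiver replica can re-request an older message directly from the sender instead of forcing senders to block on the slowest replica.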

  4. Managing Replicated MPI processes • Only one replica of each process needs to stay alive for the program to execute successfully (see the fallback sketch below)
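The mechanism behind that claim is not shown on the slide; a minimal sketch of replica fallback on the receive path, with try_pull_from() standing in for the real transport call, could look like this:

/* Hypothetical sketch: the receiver only needs one live replica of each
 * logical sender, so it tries the replicas in turn until one answers. */
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the real transport call: pull a message from one physical
 * replica; return 0 on success, nonzero if it is unreachable.  Here it
 * simply pretends that replica 0 has failed. */
static int try_pull_from(int replica_id, void *buf, size_t len)
{
    (void)buf; (void)len;
    return (replica_id == 0) ? -1 : 0;
}

/* Succeeds as long as at least one replica of the logical rank is alive. */
static int recv_from_logical_rank(const int replicas[], int nreplicas,
                                  void *buf, size_t len)
{
    for (int i = 0; i < nreplicas; i++)
        if (try_pull_from(replicas[i], buf, len) == 0)
            return 0;
    return -1;   /* every replica of this rank has failed */
}

int main(void)
{
    int replicas[] = { 0, 1 };   /* physical ids of one logical rank */
    char buf[4];
    int rc = recv_from_logical_rank(replicas, 2, buf, sizeof buf);
    printf("receive %s\n", rc == 0 ? "succeeded" : "failed");
    return 0;
}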

  5. Bandwidth comparison • 4-byte latency over Gigabit Ethernet: • Open MPI v1.4.1: ~50 µs • VolpexMPI: ~1.8 ms
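The slide reports only the results; a 4-byte ping-pong microbenchmark of the kind typically used to measure such latencies looks roughly like the following (illustrative, not necessarily the exact benchmark behind these numbers):

/* Standard 4-byte ping-pong latency microbenchmark for two MPI processes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int iters = 10000;
    char buf[4] = { 0 };
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, 4, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, 4, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, 4, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, 4, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)   /* one-way latency is half of the round-trip time */
        printf("4-byte latency: %.2f us\n", (t1 - t0) / iters / 2 * 1e6);

    MPI_Finalize();
    return 0;
}

The same source can be compiled against Open MPI or VolpexMPI and run over the interconnect of interest, which is the usual way to compare two MPI libraries on the same hardware.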

  6. NAS Parallel Benchmarks • VolpexMPI execution times are comparable to the reference Open MPI execution times

  7. Overhead of redundancy and processor failures • Performance impact of executing with replicas (left side) • Performance impact of processor failures (right side) • Both experiments run with 16 processes

  8. Use in High Performance Clusters • Not limited to volunteer computing • Tested on a small cluster using Ethernet • Not yet tested on a large-scale cluster with high-performance communication such as InfiniBand • Goal: evaluate and validate the use of VolpexMPI on high-performance clusters • Specifically, clusters that use InfiniBand

  9. What is InfiniBand? • High-speed fiber interconnect • Associated protocols designed to remove the overhead associated with Ethernet and IP • Leads to higher bandwidth, lower latency, and lower CPU usage • Widespread use in HPC • Most-used interconnect in the TOP500 (42%)

  10. How to Run VolpexMPI over InfiniBand • Two ways to use InfiniBand from the existing socket layer: • IPoIB • Sockets Direct Protocol (SDP) • IPoIB: high bandwidth, but high latency • SDP: higher bandwidth and low latency; bypasses the TCP stack (see the sketch below)
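Since VolpexMPI communicates through plain sockets (the summary slide mentions its underlying socket library), moving from TCP to SDP mainly means changing the address family when a socket is created. Below is a minimal sketch, assuming OFED's AF_INET_SDP constant (commonly 27, worth checking against the local headers) and an example peer address and port; unmodified binaries can alternatively be redirected to SDP transparently with LD_PRELOAD=libsdp.so.

/* Minimal sketch: open an SDP socket instead of a TCP socket.  Only the
 * address family in socket() changes; connect(), send() and recv() are
 * otherwise used exactly as with TCP over IPoIB. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#ifndef AF_INET_SDP
#define AF_INET_SDP 27   /* assumption: OFED's SDP address family */
#endif

int main(void)
{
    int fd = socket(AF_INET_SDP, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket(AF_INET_SDP)"); return 1; }

    struct sockaddr_in peer;
    memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;        /* IPv4 addressing; some SDP stacks
                                         also accept AF_INET_SDP here     */
    peer.sin_port   = htons(5000);                        /* example port */
    inet_pton(AF_INET, "192.168.1.10", &peer.sin_addr);   /* example peer */

    if (connect(fd, (struct sockaddr *)&peer, sizeof peer) < 0)
        perror("connect over SDP");
    else
        printf("connected over SDP\n");

    close(fd);
    return 0;
}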

  11. Current Measurements

  12. Summary • Status • Currently implementing SDP in the underlying socket library of VolpexMPI • Challenges • Parallel programs are notoriously hard to debug • Limited prior experience with network and socket programming • Goals • Re-run bandwidth and latency tests using SDP • Re-run the NAS benchmarks using SDP • Evaluate and validate the use of VolpexMPI on high-performance clusters
