Formal Verification Methods Applied to High Performance Computing Software and Hardware

Ganesh Gopalakrishnan, Mike Kirby, Robert Palmer, Yu Yang, Salman Pervez, Michael DeLisi, Geof Sawaya, Xiaofang Chen, Subodh Sharma, Igor Melatti, Sarvani Vakkalanka
School of Computing / SCI Institute, University of Utah, Salt Lake City, UT 84112

[Figure: an example MPI program and the MPI binary compiled from it. Rank 0 sends every other rank its rank number; each rank prints the value it holds.]

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
  int myid;
  int numprocs;
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);
  if (myid == 0) {
    int i;
    for (i = 1; i < numprocs; ++i) {
      MPI_Send(&i, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
    }
    printf("%d Value: %d\n", myid, myid);
  } else {
    int val;
    MPI_Status s;
    MPI_Recv(&val, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &s);
    printf("%d Value: %d\n", myid, val);
  }
  MPI_Finalize();
  return 0;
}

Introduction

In the process of writing or optimizing High Performance Computing software, mostly written using MPI these days, designers can inadvertently introduce errors due to: (i) the richness of MPI, (ii) the associated complexity of the MPI semantics, and (iii) the well-known fact that concurrent programming is tricky.
Traditional debugging methods (such as software testing and running incomplete execution analyzers) are insufficiently incisive. The Utah Gauss group is researching several pragmatic adaptations of Formal Methods, and is developing methodological and tool support that: enhances designer understanding of MPI and of parallel and distributed program semantics; helps designers develop more efficient MPI programs by reducing communication overheads; and helps detect bugs such as deadlocks and other invariant violations early. Preliminary results include: a formal semantics of (a growing number of) MPI primitives; a program analysis framework based on the Microsoft Phoenix compiler that can analyze and help debug MPI programs; case studies involving tricky uses of MPI one-sided communication; and an in-situ model checker that can run MPI programs directly under model-checking control.

Conclusions

As the stakes grow higher in HPC (expensive clusters, the shortness of active professional lives), more incisive specification and verification techniques are essential. HPC is a rapidly growing area, directly tied to growth in compute capabilities (e.g., threads and multicores) and to simulation aspirations heading toward Petaflop computing. Informal specification techniques and ad hoc validation techniques cannot serve as the foundation for building debugging tools in such a critical area. On the other hand, the use of non-scalable formal verification methods is also ineffectual. The Utah Gauss group has expertise in both formal methods and HPC. It collaborates with experts (e.g., at Argonne National Laboratory) and is examining how best to make an impact using formal methods in HPC. Our experience has been that in such a fast-moving area, agile de facto standards such as MPI provide ample opportunities to pursue the formal route. We recently used model checking to formally verify a tricky byte-range locking algorithm that uses MPI one-sided communication (one of three EuroPVM/MPI '06 distinguished papers).
Our formalization of MPI has already revealed several imprecise specifications in the standard. Our long-term goal is the demonstration, and ultimate incorporation, of these ideas into the Microsoft Compute Cluster software and into other open-source releases of MPI. Formal methods for handling multithreading in MPI, as well as formal test generation for MPI libraries, are also planned.

[Figure: abstraction / refinement / model checking contrasted with ad-hoc testing.]

What is Model Checking?

Model Checking is a class of techniques for finding bugs in concurrent software and hardware. It has been repeatedly shown that ad-hoc testing methods miss serious bugs in real systems. Finite-state machine models of such systems have been able to find these classes of bugs very efficiently, by employing a variety of techniques: creating tractable verification models of these systems through model extraction and simplification; coarse sampling of real executions; non-deterministic over-approximations; search techniques that do not interleave independent actions (e.g., local variable updates) of concurrent processes; techniques that capitalize on symmetry; and symbolic representations of state spaces. Used in conjunction with static analysis, run-time monitoring, and property-driven testing, model checking continues to have an immense impact in the hardware and software industries.
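To make the idea concrete, here is a small, hypothetical Python sketch of explicit-state model checking (ours, purely illustrative, not part of the Gauss toolchain): two processes with nondeterministic choices share a variable y, and a depth-first search over every interleaving and every choice finds schedules that violate an assertion which any single test run could easily miss. The names mk_t1, mk_t2, and explore are illustrative assumptions.

```python
# Each process is a list of atomic steps; a step maps the state to a list
# of successor states (several successors model a nondeterministic choice,
# like Promela's "if :: ... :: ... fi").
def mk_t1():
    return [
        lambda s: [{**s, "x1": 1}],
        lambda s: [{**s, "x1": 0}, {**s, "x1": 2}],   # if :: x = 0 :: x = 2 fi
        lambda s: [{**s, "y": s["x1"]}],               # y = x
    ]

def mk_t2():
    return [
        lambda s: [{**s, "x2": 2}],
        lambda s: [{**s, "y": s["x2"] + 1}, {**s, "y": 0}],  # if-fi on y
        lambda s: [{**s, "violated": s["violated"] or s["y"] != 0}],  # assert(y == 0)
    ]

def explore(procs):
    """Exhaustive DFS over every interleaving and nondeterministic choice;
    returns how many distinct reached states violate the assertion."""
    init = ({"x1": 0, "x2": 0, "y": 0, "violated": False}, (0,) * len(procs))
    stack, seen, violations = [init], set(), 0
    while stack:
        state, pcs = stack.pop()
        key = (tuple(sorted(state.items())), pcs)
        if key in seen:             # hash visited states to avoid re-exploration
            continue
        seen.add(key)
        if state["violated"]:
            violations += 1
            continue
        for i, proc in enumerate(procs):
            if pcs[i] < len(proc):  # process i still has a step to run
                for nxt in proc[pcs[i]](state):
                    stack.append((nxt, tuple(pc + 1 if j == i else pc
                                             for j, pc in enumerate(pcs))))
    return violations

print(explore([mk_t1(), mk_t2()]) > 0)   # some interleaving violates the assertion
```

A lone run under one fixed schedule may never hit the bad interleaving; the exhaustive search is what guarantees it is found.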
MPI library model (in Promela):

proctype MPI_Send(chan out, int c)  { out!c; }
proctype MPI_Bsend(chan out, int c) { out!c; }
proctype MPI_Isend(chan out, int c) { out!c; }
typedef MPI_Status {
  int MPI_SOURCE;
  int MPI_TAG;
  int MPI_ERROR;
}
...

Program model (extracted by the compiler and model generator):

int y;
active proctype T1() {
  int x;
  x = 1;
  if
  :: x = 0;
  :: x = 2;
  fi;
  y = x;
}
active proctype T2() {
  int x;
  x = 2;
  if
  :: y = x + 1;
  :: y = 0;
  fi;
  assert(y == 0);
}

[Figure: the Gauss tool architecture. A model generator combines the program model, the MPI library model, and an environment model; a model checker, error simulator, abstractor, refinement loop, and result analyzer cooperate, with the verification work distributed across many MC clients.]

Case Study 1: Byte-Range Locking using MPI One-Sided Communication (EuroPVM/MPI 2006)

Materials and methods

We begin with a formal and executable specification of MPI communication semantics expressed in TLA+ (Lamport, MSR). Using Phoenix, we extract a control skeleton of the MPI program, with its constituent MPI calls represented using the descriptions in our formal semantics. We are developing many abstraction methods that can reduce the fraction of the state space that must be visited to verify properties of these models. Using TLC (a model checker for TLA+ from MSR), we apply finite-state model checking methods that can detect deadlocks and assertion violations. Many of these assertions are inserted automatically (e.g., is every MPI_Isend followed by an MPI_Wait or MPI_Test?). An in-situ model checker (under construction) arranges to interrupt control flow before any MPI call is attempted and transfers control to a scheduler. In effect, each MPI process is forced to ask this scheduler for permission before making each MPI call. The scheduler first permits one interleaved execution to manifest; thereafter, it runs enough execution-interleaving variants to exhaust the execution space up to a certain depth bound.
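The in-situ scheduler idea can be sketched as follows. This is a hypothetical, much-simplified Python model, not the actual tool: each "MPI process" is a list of send/recv actions, sends are buffered, and a recursive scheduler enumerates every interleaving up to a depth bound, reporting schedules in which some unfinished process can never proceed. The check function and the action encoding are our own illustrative assumptions.

```python
# Each process is a list of actions: ("send", dst) buffers a message for
# dst and continues; ("recv", src) blocks until a message from src is
# queued.  The scheduler plays the in-situ model checker's role: it
# enumerates every interleaving up to `depth` steps and records schedules
# in which some process remains blocked forever (a deadlock).
def check(procs, depth=50):
    deadlocks = []

    def step(pcs, queues, trace):
        if len(trace) >= depth:
            return
        enabled = []
        for i, prog in enumerate(procs):
            if pcs[i] >= len(prog):
                continue                      # process i has finished
            op, peer = prog[pcs[i]]
            if op == "send" or (op == "recv" and queues[i].get(peer, 0) > 0):
                enabled.append(i)
        if not enabled:
            if any(pcs[i] < len(p) for i, p in enumerate(procs)):
                deadlocks.append(trace)       # someone is stuck: deadlock
            return                            # otherwise: all ranks finished
        for i in enabled:                     # branch over every schedule choice
            op, peer = procs[i][pcs[i]]
            nq = {k: dict(v) for k, v in queues.items()}
            if op == "send":
                nq[peer][i] = nq[peer].get(i, 0) + 1
            else:
                nq[i][peer] -= 1
            npcs = tuple(pc + 1 if j == i else pc for j, pc in enumerate(pcs))
            step(npcs, nq, trace + [i])

    step(tuple(0 for _ in procs), {i: {} for i in range(len(procs))}, [])
    return deadlocks

# Head-to-head receives: each rank waits for the other before sending.
bad = [[("recv", 1), ("send", 1)], [("recv", 0), ("send", 0)]]
# Sends first: no schedule can deadlock.
good = [[("send", 1), ("recv", 1)], [("send", 0), ("recv", 0)]]
print(len(check(bad)) > 0, len(check(good)) == 0)   # -> True True
```

The same enumeration, driven against real MPI calls via an interposed scheduler rather than a toy state machine, is what the in-situ model checker described above automates.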
Abstraction procedures (under development) are expected to generate simpler programs to verify that contain the same classes of errors. Ways to parallelize in-situ model checking are also under investigation.

[Figure: the lock-acquire and lock-release protocols, and the deadlock scenario. The window holds one (flag, start, end) triple per process, initially (0, -1, -1) everywhere; after the race, each of P0 and P1 has posted a conflicting byte range (e.g., 0 3 5), each deduces a conflict in stage 2, and each blocks on a receive: DEADLOCK.]

lock_release(start, end) {
  val[0] = 0;  /* flag */
  val[1] = -1;
  val[2] = -1;
  lock_win
  place val in win
  get values of other processes from win
  unlock_win
  for all i, if (Pi conflicts with my range)
    MPI_Send(Pi);
}

Literature cited

Robert Palmer, Ganesh Gopalakrishnan, and Robert M. Kirby, "The Communication Semantics of the Message Passing Interface," Technical Report UUCS-06-012, School of Computing, University of Utah, 2006.

Robert Palmer, Steve Barrus, Yu Yang, Ganesh Gopalakrishnan, and Robert M. Kirby, "Gauss: A framework for verifying scientific computing software," Workshop on Software Model Checking, 2005. Electronic Notes in Theoretical Computer Science (ENTCS), No. 953.

Salman Pervez, Ganesh Gopalakrishnan, Robert M. Kirby, Rajeev Thakur, and William Gropp, "Formal verification of programs that use MPI one-sided communication," Recent Advances in Parallel Virtual Machine and Message Passing Interface (EuroPVM/MPI), LNCS 4192, pages 30-39, 2006. (One of three outstanding papers.)

Acknowledgments

Supported in part by Microsoft Corporation under the Microsoft HPC Institutes program, NSF CNS 0509379, and Semiconductor Research Corporation Award SRC TJ-1318. We also acknowledge our collaborations with Argonne National Laboratory (R. Thakur and W. Gropp).
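The byte-range locking deadlock pictured on the poster can be illustrated with a deliberately simplified, hypothetical Python model of the lock-acquire protocol. The real algorithm uses MPI one-sided puts and gets inside lock/unlock window epochs; here the window is a plain dictionary and explicit snapshots stand in for the gets. The names post, decide, and fresh_window are our own, not from the paper.

```python
def overlaps(r1, r2):
    (s1, e1), (s2, e2) = r1, r2
    return s1 <= e2 and s2 <= e1

# One (flag, start, end) triple per process, as in the window figure.
def fresh_window():
    return {0: (0, -1, -1), 1: (0, -1, -1)}

def post(win, p, start, end):
    """Lock-acquire, step 1: put my request triple into the window."""
    win[p] = (1, start, end)

def decide(win, snap, p, start, end):
    """Lock-acquire, step 2, acting on my snapshot of the window: if a
    rival's flag is set over a conflicting range, clear my own flag and
    block on MPI_Recv, trusting the lock holder's release to MPI_Send
    me a wakeup."""
    if any(q != p and f == 1 and overlaps((s, e), (start, end))
           for q, (f, s, e) in snap.items()):
        win[p] = (0, start, end)
        return "blocked"
    return "acquired"

# Serialized order: P0 finishes its acquire before P1 starts, so P0 wins.
win = fresh_window()
post(win, 0, 3, 5)
ok = decide(win, dict(win), 0, 3, 5)

# Racy order found by the model checker: both puts land before either
# process examines the window, so each snapshot shows the other's flag
# set; both back off and block, yet nobody holds the lock, so no wakeup
# is ever sent.
win = fresh_window()
post(win, 0, 3, 5)
post(win, 1, 3, 5)
snap0, snap1 = dict(win), dict(win)
print(ok, decide(win, snap0, 0, 3, 5), decide(win, snap1, 1, 3, 5))
# -> acquired blocked blocked
```

The "blocked blocked" outcome is exactly the stage-2 state in the figure: both processes deduce a conflict and block on receive, which the model checker reports as a deadlock.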
Case Study 2: Formal Verification of MPI and Thread Programs

For further information

Please see our project website, http://www.cs.utah.edu/formal_verification. Email contacts: {ganesh, kirby} @ cs.utah.edu. School of Computing and SCI Institute, University of Utah.