
Sequential Debugging of Parallel Message Passing Programs Using Millipede

This paper discusses the importance of debugging parallel programs and introduces Millipede, a multi-level interactive parallel debugger. Examples, implementation details, and the benefits of using sequential tools for debugging parallel code are presented.


Presentation Transcript


  1. Sequential Debugging of Parallel Message Passing Programs Using Millipede
     Jan Bækgaard Pedersen and Alan Wagner
     Department of Computer Science, The University of British Columbia, Vancouver, BC, Canada
     Parallel Computation Lab, University of British Columbia

  2. Overview
     • Importance of debugging
     • Sequential debugging
     • Using sequential tools to debug parallel programs
     • Millipede
     • Examples
     • Implementation
     • Conclusion

  3. How Important Is It To Debug?
     • As much time is spent on debugging as on writing the code [Pancake]
     • 35-90% of parallel programmers still use only print statements for debugging [Pancake]
     • Some reasons why tools are not widely used:
       • Lack of focus
       • Information overload
       • Having to learn new tools with a GUI

  4. Sequential Debugging
     [Diagram: Sequential Program → Sequential Debugging Tool, covering pointer errors, variable inspection, breakpoints, memory leaks, and stack traces]

  5. Sequential Tools In Parallel Environments
     • Parallel debugging comprises message debugging, debugging straight-line code, protocol debugging, and visualization
     • Extract the "debugging straight-line code" task and use a sequential tool to debug the sequential code
     • Exploit existing sequential tools:
       • Well known
       • Trusted
       • Larger selection

  6. Debugging Parallel Programs
     • Use a tool that is tailored to the specific debugging task:
       • A sequential tool to debug sequential code
       • Other tools to debug:
         • Message passing errors
         • Deadlocks
         • Message content
         • Protocol errors
         • Overall performance
     • Solution: Millipede, the Multi-Level Interactive Parallel Debugger

  7. Millipede
     • Communication Visualization Module: graphical view of the message passing / protocol
     • Deadlock Detection & Correction Module: detects and analyzes deadlocks and reports the cause and a fix
     • Comm. Protocol Verification Module: online verification of the communication protocol while the program is running
     • Message Debugging Module: inspect, control, and change the contents of messages
     • Sequential Debugging Module: debugging of the sequential code of the parallel program

  8. The Sequential Debugging Module
     • How?
       • Recompile the program with the -DMILLIPEDE flag
       • Set the environment variable MILLIPEDE_RCM
         • Millipede collects message information and stores it in log files for each process
       • Run the program the normal way
         • Messages are sent and recorded
       • Set the environment variable MILLIPEDE_REM
       • Debug any of the processes using any sequential tool
         • Millipede intercepts all message passing calls and supplies the process with messages from the log file
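The two environment variables above act as a switch between recording and replaying. A minimal sketch of how such a switch might be read at startup follows. The enum, the function name `select_mode`, and the rule that replay wins when both variables are set are assumptions for illustration; the slides name only the variables themselves.

```c
#include <stdlib.h>

/* Hypothetical mode names; the slides only name the two environment variables. */
typedef enum { MODE_NORMAL, MODE_RECORD, MODE_REPLAY } millipede_mode;

/* Decide the mode from the environment. Only the presence of a variable
   matters, matching "setenv MILLIPEDE_RCM" with no value in the slides.
   Assumption: if both are set, replay takes precedence. */
millipede_mode select_mode(void)
{
    if (getenv("MILLIPEDE_REM") != NULL)
        return MODE_REPLAY;
    if (getenv("MILLIPEDE_RCM") != NULL)
        return MODE_RECORD;
    return MODE_NORMAL;
}
```

A wrapper library could call a function like this once and then route every message passing call accordingly.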

  9. Example 1
     Example code:
         pvm_upkint(&nproc, 1, 1);
         pvm_upkint(&n, 1, 1);
         .
         e = n % nproc;
         .
     If nproc = 0: division by zero, and the program crashes / disappears.
     With many processes running and one or more of them crashing, it is hard to work out why they crashed.

  10. Example 1 (continued)
      Debugging using any sequential debugger:
          (1) gcc -g -DMILLIPEDE -o pgm pgm.c -lpvm3
          (2) setenv MILLIPEDE_RCM
          (3) pgm
          (4) unsetenv MILLIPEDE_RCM; setenv MILLIPEDE_REM
          (5) gdb pgm
          Replay filename: MILLIPEDE_RPF-pgm-262152
          .
          (gdb) step
          45 e = n % nproc
          Program received signal SIGFPE, Arithmetic exception
          in ()
          (gdb)
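The crash in Example 1 comes from evaluating n % nproc after nproc was unpacked as zero. Once the replayed process has pinpointed the line, one possible guard is to validate the unpacked value before using it. This is a sketch of a fix, not something shown in the slides, and `checked_remainder` is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: compute n % nproc only when nproc is positive.
   Returns 0 and stores the remainder in *e on success.
   Returns -1 and leaves *e untouched when nproc cannot be used. */
int checked_remainder(int n, int nproc, int *e)
{
    if (nproc <= 0) {
        fprintf(stderr, "invalid nproc %d: refusing to compute n %% nproc\n", nproc);
        return -1;
    }
    *e = n % nproc;
    return 0;
}
```

Reporting the bad value instead of letting the process die on SIGFPE makes the failure visible even without a debugger attached.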

  11. Example 2
      Example code:
          x = calloc(node,sizeof(double));
          y = calloc(node,sizeof(double));
          .
          for (i=1; i<=nodes; i++)
              x[i] = …..;
      Memory error (write past the end of the array):
      • Segmentation fault
      • Wrong result
      Some of the processes might crash, while others might compute a result.

  12. Example 2 (continued)
      Using Purify:
          ABW: Array Bounds Write. This is occurring while in:
              main [wave_slave.c:57]
                  for (i=1; i<=nodes;i++)
                      x[i] = ……;
          Writing 8 bytes to 0xdc630 in the heap.
          Address 0xdc630 is 1 byte past the end of malloc'd block at 0xdc5a8 of 136 bytes.
          This block was allocated from:
              main [wave_slave.c:50]
                  x = calloc(node,sizeof(double))
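The Purify report can be checked by hand. A 136-byte block of 8-byte doubles holds 17 elements with valid indices 0 to 16, and 0xdc630 - 0xdc5a8 = 0x88 = 136, so the offending write is to x[17], the first element past the block. That is what a loop running from 1 to nodes inclusive produces on its last iteration. A minimal sketch of a zero-based version follows. The function name `make_values` and the placeholder value stored in each element are assumptions, since the slides elide the real expression.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate `nodes` doubles and fill every element.
   The caller is responsible for calling free() on the result. */
double *make_values(size_t nodes)
{
    double *x = calloc(nodes, sizeof(double));
    if (x == NULL)
        return NULL;
    /* Valid indices are 0 .. nodes-1; the loop in the slides ran 1 .. nodes. */
    for (size_t i = 0; i < nodes; i++)
        x[i] = (double)i;  /* placeholder value, not the slides' expression */
    return x;
}
```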

  13. Implementation
      • All pvm_xxx calls are replaced by Millipede versions, which in turn call the real PVM functions (renamed _PVM_xxx)
      • In RCM mode, a call such as pvm_send(…) or pvm_receive(…) writes the log and calls PVM, e.g. _PVM_send(…)
      • In REM mode, the call reads the log
      [Diagram: Application Program → Millipede → PVM, with Millipede also connected to the log file]
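The wrapping scheme on this slide can be sketched without PVM itself. In the sketch below every name is hypothetical: `mp_recv_int` stands for a Millipede-style replacement of a receive call, `demo_real_recv` stands in for the renamed real function (the slides' _PVM_xxx), and the one-integer-per-line log format is an assumption, not Millipede's actual format.

```c
#include <stdio.h>

/* Hypothetical names throughout; the slides show only pvm_xxx and _PVM_xxx. */
typedef enum { MP_RECORD, MP_REPLAY } mp_mode;

/* Stand-in for a renamed real receive call. Here it simply pretends
   that a message carrying the value 4 has arrived. */
int demo_real_recv(int *value)
{
    *value = 4;
    return 0;
}

/* Wrapper the application would call instead of the real receive.
   Record mode: call the real function, then append the value to the log.
   Replay mode: skip communication and read the next value from the log.
   Returns 0 on success and -1 on failure. */
int mp_recv_int(mp_mode mode, FILE *log, int (*real_recv)(int *), int *value)
{
    if (mode == MP_REPLAY)
        return fscanf(log, "%d", value) == 1 ? 0 : -1;

    int rc = real_recv(value);
    if (rc == 0 && fprintf(log, "%d\n", *value) < 0)
        return -1;
    return rc;
}
```

Because replay never touches the network, the extracted process can run alone under a sequential debugger and still see the same incoming messages it saw in the recorded run.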

  14. Conclusion
      • Millipede allows extraction of any sequential process from a parallel system
      • Millipede enables the programmer to use any sequential tool for debugging / performance tuning on the extracted process
      • Millipede supports multi-level debugging:
        • Message debugging
        • Deadlock detection / correction
        • Protocol verification
