
Presentation Transcript


  1. CEPBA-Tools experiences with MRNet and Dyninst. Judit Gimenez, German Llort, Harald Servat (judit@cepba.upc.edu)

  2. Outline: CEPBA-Tools environment; OpenMP instrumentation using Dyninst; tracing control through MRNet; our wish list

  3. Where we live: Traceland … aiming at detailed analysis and flexibility in the tools

  4. Importance of details. Variance is important: along time and across processors. Highly non-linear systems: microscopic effects are important and may have a large macroscopic impact.

  5. CEPBA-Tools. [Tool-chain diagram: instrumentation packages (MPtrace, OMPItrace, MPIDtrace, GPFStrace, AIXtrace, LTTtrace, TraceDriver; also Java/WAS, GT4 and JIS sources) feed converters (trace2trace, aixtrace2prv, LTT2prv, GPFS2prv) that produce Paraver traces (.prv + .pcf) and Dimemas traces (.trf); the Nanos Compiler is part of the chain. Data display tools: Paraver and Paramedir, driven by .cfg files. Example Paramedir output: Aaa miss ratio 0.8, Bbb IPC 0.5, Ccc efficiency 0.4, Ddd bandwidth 520.]

  6. CEPBA-Tools challenge: what can we say about an unknown application/system, in a short time and without looking at the source code?

  7. OpenMP instrumentation
  - OMPtrace: instrumentation of OpenMP, giving insight into both the application and the runtime scheduling
  - Based on, depending on the platform:
    - DiTools (SGI/IRIX): only calls to dynamic libraries
    - DPCL (IBM/AIX): functions and calls referenced within the binary
    - Dyninst (Itanium): functions and calls referenced within the binary
    - LD_PRELOAD (some Linux): only calls to dynamic libraries
  - An "evolution" through the available platforms, except for Itanium (a NASA-AMES request)
  A minimal interposition sketch follows below.
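For the LD_PRELOAD back end, interception works by symbol interposition: a preloaded shared object defines the same symbol as the runtime library, records an event, and forwards to the real implementation located with dlsym(RTLD_NEXT, ...). A minimal sketch, with omp_set_num_threads as an example wrapped call and fprintf standing in for the real event record:

    // wrap.cpp -- build: g++ -shared -fPIC wrap.cpp -o libwrap.so -ldl
    // run:              LD_PRELOAD=./libwrap.so ./app
    #include <dlfcn.h>
    #include <cstdio>

    extern "C" void omp_set_num_threads(int n) {
        using fn_t = void (*)(int);
        // resolve the runtime's real symbol once
        static fn_t real = (fn_t)dlsym(RTLD_NEXT, "omp_set_num_threads");
        std::fprintf(stderr, "[trace] omp_set_num_threads(%d)\n", n); // probe
        if (real) real(n);  // forward to the actual OpenMP runtime
    }

Only calls that cross into a dynamic library are visible this way, which is exactly the "only calls to dynamic libraries" limitation noted for DiTools and LD_PRELOAD in the list above.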

  8. OpenMP compilation and Run Time

  Source program:
      A() {
        !$omp parallel do
        do I=1,N
          loop body
        enddo
      }
      Call A

  Compiler generated:
      A() {
        kmpc_fork_call
      }
      Call A

      _A_LN_par_regionID {
        do I=start,end
          loop body
        enddo
      }

  libomp:
      Idle() { }
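A toy, runnable sketch of this transformation (kmpc_fork_call_toy is a stand-in for libomp's real __kmpc_fork_call, whose actual signature is richer; A_outlined plays the role of the mangled _A_LN_par_regionID):

    #include <thread>
    #include <vector>

    using Microtask = void (*)(int tid, int nthreads, int n);

    // toy fork/join: run the outlined region on a team and wait for it
    static void kmpc_fork_call_toy(Microtask task, int nthreads, int n) {
        std::vector<std::thread> team;
        for (int t = 0; t < nthreads; ++t)
            team.emplace_back(task, t, nthreads, n);
        for (auto &th : team) th.join();
    }

    // compiler-outlined body of the parallel loop in A()
    static void A_outlined(int tid, int nthreads, int n) {
        int chunk = (n + nthreads - 1) / nthreads;  // static schedule
        int start = 1 + tid * chunk;
        int end = (start + chunk - 1 < n) ? start + chunk - 1 : n;
        for (int i = start; i <= end; ++i) { /* loop body */ }
    }

    void A(int n) {
        // the "!$omp parallel do" inside A() becomes a single fork call
        kmpc_fork_call_toy(&A_outlined, 4, n);
    }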

  9. OpenMP instrumentation points. [Diagram: the main thread runs A(), whose kmpc_fork_call spawns the outlined _A_LN_par_regionID on every thread. Probes 1-6 mark entry and exit of A, of the fork call, and of the outlined parallel function; probe 7 marks the fork/join events on the worker threads. On the timeline each probe emits paired records with hardware counter deltas: OMP_PAR,1 / USR_FCT,idA / PAR_FCT,A_LN_par_regionID (HWCi, Delta at each point) and the matching PAR_FCT,0 / USR_FCT,0 / OMP_PAR,0 on exit.] A hedged Dyninst sketch of planting such a probe follows.
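On Itanium these probes are planted with Dyninst. A minimal mutator sketch, assuming a trace_event probe function is already present in the mutatee (that name, the outlined function's name, and the event id are illustrative; error handling omitted):

    #include "BPatch.h"
    #include <vector>

    int main(int argc, char *argv[]) {
        BPatch bpatch;
        // create the application process under Dyninst control
        BPatch_process *proc =
            bpatch.processCreate(argv[1], (const char **)(argv + 1));
        BPatch_image *image = proc->getImage();

        // locate the compiler-outlined parallel function and the probe
        std::vector<BPatch_function *> target, probe;
        image->findFunction("_A_LN_par_regionID", target);  // illustrative
        image->findFunction("trace_event", probe);          // illustrative

        // build the call trace_event(4) and plant it at function entry
        std::vector<BPatch_snippet *> args;
        BPatch_constExpr eventId(4);
        args.push_back(&eventId);
        BPatch_funcCallExpr callProbe(*probe[0], args);
        proc->insertSnippet(callProbe, *target[0]->findPoint(BPatch_entry));

        // run the instrumented program to completion
        proc->continueExecution();
        while (!proc->isTerminated())
            bpatch.waitForStatusChange();
        return 0;
    }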

  10. Instrumentation @ CEPBA-Tools
  - The issue: sufficient information / sufficiently detailed, yet usable by the presentation tool
  - The environment evolution (1991-2007):
    - from a few processes to 10,000
    - instrumenting hours of execution
    - including more and more information: hardware counters, call stack, network counters, system resource usage, MPI collective internals...
    - ... from traces of a few MB to hundreds of GB

  11. Scalability of tracing. Techniques for achieving scalability:
  - User-specified on/off
  - Limited file size (stop when reached, or a circular buffer; see the sketch after this list)
  - Only computing bursts + counters + statistics
  - Library summarization (software counters for MPI_Iprobe / MPI_Test)
  - trace2trace utilities, partial views
  ... towards an autonomic tracing library
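A minimal sketch of the circular-buffer option above, assuming an illustrative event record (not MPItrace's actual format): the newest records overwrite the oldest, so memory stays bounded while the most recent window remains available for selective flushing.

    #include <cstdint>
    #include <cstddef>
    #include <vector>

    struct Event {                // illustrative record layout
        uint64_t timestamp;
        uint32_t type;            // e.g. OMP_PAR, USR_FCT, PAR_FCT
        uint64_t value;
    };

    class CircularTraceBuffer {
        std::vector<Event> buf_;
        size_t head_ = 0;         // next slot to write
        size_t count_ = 0;        // valid events (<= capacity)
    public:
        explicit CircularTraceBuffer(size_t capacity) : buf_(capacity) {}

        void record(const Event &e) {
            buf_[head_] = e;                      // overwrite oldest when full
            head_ = (head_ + 1) % buf_.size();
            if (count_ < buf_.size()) ++count_;
        }

        // visit events oldest-to-newest, e.g. to flush a selected subset
        template <class Fn> void for_each(Fn fn) const {
            size_t start = (head_ + buf_.size() - count_) % buf_.size();
            for (size_t i = 0; i < count_; ++i)
                fn(buf_[(start + i) % buf_.size()]);
        }
    };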

  12. MPItrace + MRNet. [Architecture diagram: the user drives an MRNet front-end from the login node, connected through the MRNet tree to the MPItrace back-ends attached to the application processes.]

  13. First target with MRNet
  - A real problem scenario on MareNostrum: some large runs occasionally show very large, degraded collectives; instrumenting the full run, including details of the collectives implementation, would produce a huge trace
  - Solution: MPItrace + MRNet; control which information is flushed to disk; discard all details except those related to the large collectives
  A front-end sketch of this control path follows below.
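The front-end side of such a setup, as a hedged MRNet sketch (the topology file, the back-end path, and the single-round protocol are illustrative, not MPItrace's actual code):

    #include "mrnet/MRNet.h"
    using namespace MRN;

    int main() {
        // launch the MRNet tree from the login node (paths illustrative)
        const char *be_argv[] = { nullptr };
        Network *net = Network::CreateNetworkFE("topology.txt",
                                                "./mpitrace_be", be_argv);

        // one broadcast stream to all back-ends; upward values are summed
        Stream *st = net->new_Stream(net->get_BroadcastCommunicator(),
                                     TFILTER_SUM, SFILTER_WAITFORALL);

        int limit_ms = 35;                 // flush collectives >= 35 ms
        st->send(FirstApplicationTag, "%d", limit_ms);
        st->flush();

        int tag;
        PacketPtr pkt;
        st->recv(&tag, pkt);               // aggregated reply from the tree
        int flushed_total;
        pkt->unpack("%d", &flushed_total);

        delete net;                        // tear down the tree
        return 0;
    }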

  14. Implementation
  - Instrumenting on a circular buffer
  - Periodically, the MRNet front-end requests information on the duration of the collectives
  - The "spy" thread stops the main thread, analyzes the tracing buffer, collects information on the collectives, and sends details on the range and duration [diagram: buffer entries i … i+n with durations 10 … 300]
  - The root sends back a selection mask [diagram: entries i … i+m marked 0 … 1]
  - The "spy" thread flushes the selected data to disk and resumes the application
  The back-end side of one such round is sketched below.
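A hedged sketch of the back-end's side of one round, as run by the spy thread (stopping/resuming the main thread and the buffer scan are elided; the helpers named in comments are hypothetical, and the real protocol exchanges a summary and a mask rather than a single count):

    #include "mrnet/MRNet.h"
    using namespace MRN;

    void spy_thread_iteration(Network *net) {
        int tag;
        PacketPtr pkt;
        Stream *st;
        net->recv(&tag, pkt, &st);     // front-end request arrives
        int limit_ms;
        pkt->unpack("%d", &limit_ms);

        // stop_main_thread();         // hypothetical: pause the application
        // ...scan the circular buffer, summarize collective durations,
        // send the range/duration details, receive the selection mask,
        // and flush only the selected events to disk...
        int local_flushed = 0;         // events this back-end wrote out
        st->send(tag, "%d", local_flushed);
        st->flush();
        // resume_main_thread();       // hypothetical
    }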

  15. First traces – CPMD. [Trace screenshots: the full trace is 245 MB with >15,500 collectives; applying LIMIT >= 35 ms yields <1 MB with <85 collectives, and 25 MB with <85 collectives when details are kept.]

  16. First traces – MRNet front-end analysis

  17. Next steps for MPItrace + MRNet
  - Analysis of MRNet: evaluate the impact of topology / mapping
  - Library control: maximum information, minimum data
  - Automatic switching driven by on-line analysis: tracing level, type of data (counter set, instrumentation points), on/off
  - Clustering, periodicity detection

  18. Our wish list
  Dyninst:
  - Support for MPI+OpenMP instrumentation
  - Availability for PowerPC
  MRNet:
  - Automatically compute the best topology based on the available resources, perhaps considering user preferences about mapping, dispersion degree (fan-out)...
  - Improved MRNet integration with MPI applications
