
Data Flow Pattern Analysis of Scientific Applications

Data Flow Pattern Analysis of Scientific Applications. Michael Frumkin, Parallel Systems & Applications, Intel Corporation, May 6, 2005. Outline: Why Data Flow Pattern Analysis?; CFD Applications; The NAS Parallel Benchmarks; The NAS Grid Benchmarks; Trace File Analysis; Conclusions.


Presentation Transcript


  1. Data Flow Pattern Analysis of Scientific Applications • Michael Frumkin • Parallel Systems & Applications, Intel Corporation • May 6, 2005

  2. Outline • Why Data Flow Pattern Analysis? • CFD Applications • The NAS Parallel Benchmarks • The NAS Grid Benchmarks • Trace File Analysis • Conclusions

  3. Why Data Flow Pattern Analysis? • Scientific applications • model few natural processes • new effects are added infrequently • their influence on the existing data flows is insignificant • Knowledge of data flow in a program helps with • program understanding • program optimization, parallelization, multithreading • building an application performance model

  4. Design of Scientific Applications • Time represented as an outer loop • Iterations over time step • Space is represented by structured/unstructured grids • Important for understanding data locality • Data access patterns • Spatial parallelism • Physics is represented by an operator at each grid point • Data flow • Operator level of parallelism/dependence

  5. CFD Data Flow Patterns • Solve the Navier-Stokes equation $K(u_{i+1}) = L u_i$ • $u$ is a five-dimensional vector • $K$ is a non-linear operator • Solver • RHS computation

  6. ADI Pattern • ADI method $K \sim K_x \cdot K_y \cdot K_z$ • Multilevel parallelism • [Figure: two panels, "ADI method" and "Multipartition", each showing the x-solve, y-solve and z-solve sweeps]
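
The factorization on this slide can be illustrated with a minimal sketch in which every directional factor is inverted by independent line solves along one axis. The scalar tridiagonal system and its coefficients (-1, 4, -1) are assumptions chosen for brevity, as are the names `thomas`, `solve_along` and `adi_step`; the slides do not give the actual operators.

```python
from itertools import product


def thomas(a, b, c, d):
    """Solve a tridiagonal system with constant sub-, main and super-diagonals a, b, c."""
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c / b
    dp[0] = d[0] / b
    for m in range(1, n):                      # forward elimination
        denom = b - a * cp[m - 1]
        cp[m] = c / denom
        dp[m] = (d[m] - a * dp[m - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for m in range(n - 2, -1, -1):             # back substitution
        x[m] = dp[m] - cp[m] * x[m + 1]
    return x


def solve_along(u, axis, a=-1.0, b=4.0, c=-1.0):
    """Run an independent tridiagonal solve on every grid line parallel to `axis`."""
    n = len(u)
    out = [[row[:] for row in plane] for plane in u]
    for p, q in product(range(n), repeat=2):   # the n*n lines do not depend on each other
        def idx(m):
            pos = [p, q]
            pos.insert(axis, m)                # place the running index on the swept axis
            return pos
        line = [u[i][j][k] for i, j, k in map(idx, range(n))]
        for m, value in enumerate(thomas(a, b, c, line)):
            i, j, k = idx(m)
            out[i][j][k] = value
    return out


def adi_step(rhs):
    """x-solve, then y-solve, then z-solve."""
    u = rhs
    for axis in (0, 1, 2):
        u = solve_along(u, axis)
    return u


# Right-hand side built so that the exact solution is 1 everywhere.
w = [3.0, 2.0, 3.0]
rhs = [[[w[i] * w[j] * w[k] for k in range(3)] for j in range(3)] for i in range(3)]
u = adi_step(rhs)
```

Within one sweep the data dependence runs only along the swept axis, so the lines of a sweep can be distributed freely; the dependence direction changes between sweeps, which is what makes the data layout a design question.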

  7. BT Communication

  8. Explicit Operators • Stencil operators (explicit methods) • At each point of a 3-dimensional mesh apply: seven-point 27-point
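
As a rough illustration of the two stencils named on this slide, the sketch below enumerates the neighbor offsets of a seven-point and a 27-point stencil and applies them to the interior of a small cubic mesh. Unit weights, the cubic mesh, and the names `stencil_offsets` and `apply_stencil` are assumptions made for the example.

```python
from itertools import product


def stencil_offsets(points):
    """Return the (di, dj, dk) offsets of a 7-point or 27-point stencil."""
    all_offsets = list(product((-1, 0, 1), repeat=3))
    if points == 27:
        return all_offsets                     # the full 3 x 3 x 3 neighborhood
    if points == 7:
        # the center plus its six face neighbors
        return [d for d in all_offsets if sum(abs(c) for c in d) <= 1]
    raise ValueError("points must be 7 or 27")


def apply_stencil(u, offsets):
    """Sum the stencil neighbors at every interior point; boundary values are copied."""
    n = len(u)
    v = [[row[:] for row in plane] for plane in u]
    for i, j, k in product(range(1, n - 1), repeat=3):
        v[i][j][k] = sum(u[i + di][j + dj][k + dk] for di, dj, dk in offsets)
    return v


n = 4
ones = [[[1.0] * n for _ in range(n)] for _ in range(n)]
seven = apply_stencil(ones, stencil_offsets(7))
twenty_seven = apply_stencil(ones, stencil_offsets(27))
```

Each output point reads only the previous iterate, so all interior points can be updated independently; the stencil shape determines which neighbor values have to be exchanged across partition boundaries.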

  9. Lower-Upper Triangular • Dependence matrices: $\begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ • Two-dimensional pipeline • Hyperplane algorithm
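
A minimal sketch of the hyperplane idea, assuming the lower sweep makes each point (i, j, k) depend on (i-1, j, k), (i, j-1, k) and (i, j, k-1): all points with the same value of i + j + k are then mutually independent and can be processed together. The update rule (summing the three predecessors) is a placeholder, and the function names are hypothetical.

```python
from itertools import product


def hyperplanes(n):
    """Group the points of an n x n x n grid by hyperplane index h = i + j + k."""
    planes = [[] for _ in range(3 * (n - 1) + 1)]
    for i, j, k in product(range(n), repeat=3):
        planes[i + j + k].append((i, j, k))
    return planes


def lower_sweep(n):
    """Visit hyperplanes in increasing h; each point combines its three lower neighbors."""
    val = {}
    for plane in hyperplanes(n):
        for i, j, k in plane:                  # points of one plane are independent
            if (i, j, k) == (0, 0, 0):
                val[(i, j, k)] = 1
                continue
            preds = [(i - 1, j, k), (i, j - 1, k), (i, j, k - 1)]
            # predecessors lie on plane h - 1; out-of-grid neighbors contribute 0
            val[(i, j, k)] = sum(val.get(p, 0) for p in preds)
    return val


values = lower_sweep(3)
```

With this placeholder rule each value counts the dependence paths from the corner, which gives a quick check that every predecessor was available when it was needed.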

  10. LU Communication

  11. Multigrid V-Cycle • [Figure: V-cycle diagram with stages labeled Projection, Smoothing, and Interpolation & Smoothing]
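
The stage order of a V-cycle can be sketched as a recursion over grid levels. The ordering below (projection on the way down, smoothing on the coarsest grid, interpolation and smoothing on the way up) is the textbook V-cycle and is an assumption here, since the slide figure survives only as labels; `v_cycle` is a hypothetical name and no numerical work is performed.

```python
def v_cycle(level, coarsest=0, trace=None):
    """Record the stages of one V-cycle starting at `level` (finest grid)."""
    trace = [] if trace is None else trace
    if level == coarsest:
        trace.append(("Smoothing", level))     # bottom of the V
        return trace
    trace.append(("Projection", level))        # restrict to the next coarser grid
    v_cycle(level - 1, coarsest, trace)
    trace.append(("Interpolation & Smoothing", level))
    return trace


stages = v_cycle(4)
names = [name for name, _ in stages]
```

With five levels the trace contains four projections, one smoothing step and four interpolation-and-smoothing steps, the same label counts that appear in the slide figure.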

  12. MG Communication

  13. Data Flow Analysis • BT x_solve (serial) call graph • Loop nest: do k=1,ksize; do j=1,jsize; do i=1,isize

  14. Nest Data Flow Graph • [Figure: graph with nodes do_45, do_134, do_330] • Each arc represents an affinity relation

  15. NAS Parallel Benchmarks (www.nas.nasa.gov/Software/NPB) • Application Benchmarks • CFD: BT, SP, LU • Data Intensive: DC, DT, BTIO • Computational Chemistry: UA • Kernel Benchmarks: FT, CG, MG, IS • Verification • Performance Model • FORTRAN, C, HPF, Java* • Serial, MPI, OpenMP, Java* Threads * Other names and brands may be claimed as the property of others.

  16. NPB Performance on Altix* ** • ** Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing.

  17. Basic Data Flow Patterns • Shuffles: Sorting, FFT, Routing • Gather/Scatter: Conjugate Gradient, MD and FE codes, Sparse matrices • Transpose: FFT, Sorting • Tree: Parallel prefix, Reduction, Sorting
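
To make the tree pattern concrete, here is a minimal sketch of a pairwise reduction that combines neighbors round by round, so that n values are reduced in about log2(n) rounds. The function name `tree_reduce` is hypothetical, and the sketch runs sequentially; it only mirrors the data flow a parallel reduction would follow.

```python
from operator import add


def tree_reduce(values, op):
    """Reduce a non-empty sequence pairwise; return (result, number of rounds)."""
    level = list(values)
    rounds = 0
    while len(level) > 1:
        # all pairs of one round are independent of each other
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])              # an unpaired value moves up unchanged
        level = nxt
        rounds += 1
    return level[0], rounds


total, rounds = tree_reduce(range(1, 9), add)
```

The number of rounds, rather than the number of operations, is what bounds the communication depth of this pattern.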

  18. HPC Challenge Benchmarks (icl.cs.utk.edu/hpcc) • HPL* • DGEMM* • STREAM* • PTRANS* • FFTE* • RandomAccess* • Effective Bandwidth b_eff*

  19. Programming With Directed Graphs (implemented in DT of NPB and in NGB) • Arc • Arc* newArc(Node *tail, Node *head) • AttachArc(DGraph *dg) • deleArc(Arc *ar) • Node • newNode(char *name) • Node* AttachNode(DGraph *dg) • deleteNode(Node *nd) • DGraph • DGraph* newDGraph(char *name) • writeGraph(DGraph *dg, char* fname) • DGraph * readGraph(char* fname)

  20. Directed Graphs Around • Parse trees • File Systems • Application task graphs • Device Schematics • Visualization and layout tools • VCG tool • Edge tool • Tom Sawyer Software • Commercial tools

  21. Task Graphs are rapidly growing • Cart3D* • Performs CFD analysis on complex geometries • Uses six executables • Intersect*: intersects geometry • Cubes*: produces Cartesian meshes • Reorder*: reorders meshes • Mgprep*: coarsens mesh • flowCart*: convergence acceleration • Clic*: analyzes the flow • Executables communicate via files • Returns relevant forces • Lift, Drag, Side Force

  22. The NAS Grid Benchmarks • Reflect task level programming paradigm • Contain four patterns • Embarrassingly Distributed (ED) • Helical Chain (HC) • Visualization Pipeline (VP) • Mixed Bag (MB) • [Figure: task graphs of the four patterns, each running from a Launch node to a Report node, with tasks labeled BT, SP, LU, MG and FT]
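
Two of the four patterns can be sketched as small task graphs. The shapes below are read off the labels in the slide figure (independent SP tasks for ED, a repeating BT, SP, LU chain for HC) and should be taken as an assumption; the dictionary representation and all function names are hypothetical and unrelated to the NGB sources.

```python
def embarrassingly_distributed(width):
    """Launch fans out to `width` independent SP tasks, which all feed Report."""
    graph = {"Launch": [], "Report": []}
    for t in range(width):
        task = f"SP.{t}"
        graph["Launch"].append(task)
        graph[task] = ["Report"]
    return graph


def helical_chain(length):
    """Launch, then BT, SP, LU repeated in a single chain, then Report."""
    names = [("BT", "SP", "LU")[t % 3] + f".{t}" for t in range(length)]
    chain = ["Launch"] + names + ["Report"]
    graph = {node: [] for node in chain}
    for tail, head in zip(chain, chain[1:]):
        graph[tail].append(head)
    return graph


def critical_path(graph, node="Launch"):
    """Number of nodes on the longest path that starts at `node`."""
    return 1 + max((critical_path(graph, nxt) for nxt in graph[node]), default=0)


ed = embarrassingly_distributed(9)
hc = helical_chain(9)
```

The two graphs hold the same number of tasks but sit at opposite extremes: ED has a critical path of three nodes regardless of width, while the critical path of HC grows with every task added.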

  23. Data Dependent Patterns (Automatic Trace Analysis Using OLAP) • Intermittent patterns • Useful for application performance tuning • Visualization is important • Exploits the human eye's ability to detect patterns • Automatic Pattern Mining • OLAP approach • MPI communication patterns
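
An OLAP-style roll-up over a communication trace might look like the sketch below, which aggregates message sizes along different dimensions of the same records. The record layout (source rank, destination rank, MPI call, bytes), the sample data and the name `roll_up` are all made up for illustration; the slides do not describe the trace format or the tools used.

```python
from collections import defaultdict

# Hypothetical trace records: (source rank, destination rank, MPI call, bytes)
trace = [
    (0, 1, "MPI_Send", 4096),
    (1, 2, "MPI_Send", 4096),
    (2, 3, "MPI_Send", 4096),
    (3, 0, "MPI_Send", 4096),
    (0, 1, "MPI_Isend", 1024),
    (2, 3, "MPI_Isend", 1024),
]


def roll_up(records, key):
    """Sum the bytes of all records that share the same key (one OLAP dimension)."""
    totals = defaultdict(int)
    for rec in records:
        totals[key(rec)] += rec[3]
    return dict(totals)


by_pair = roll_up(trace, lambda r: (r[0], r[1]))   # communication matrix entries
by_call = roll_up(trace, lambda r: r[2])           # volume per MPI call
by_source = roll_up(trace, lambda r: r[0])         # volume per sending rank
```

In this sample the `by_pair` view shows a ring (each rank sends to its successor), the kind of structure that is easy to miss in a raw event list.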

  24. Conclusions Data Flow in Applications • Application Parallelization • Application Understanding • Application Mapping • Application Performance
