Designing and Evaluating Parallel Programs


Presentation Transcript


  1. Designing and Evaluating Parallel Programs Anda Iamnitchi Federated Distributed Systems Fall 2006 Textbook (online): Designing and Building Parallel Programs by Ian Foster (http://www-unix.mcs.anl.gov/dbpp/)

  2. Parallel Machines

  3. Flynn's Taxonomy First proposed by Michael J. Flynn in 1966: • SISD: single instruction, single data • MISD: multiple instruction, single data • SIMD: single instruction, multiple data • MIMD: multiple instruction, multiple data

  4. A Parallel Programming Model: Tasks and Channels • Task operations: • send msg • receive msg • create task • terminate • In practice: • Message passing: MPI • Data parallelism • Shared memory
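The task operations above map directly onto message-passing primitives. A minimal sketch in C with MPI, the library named on the slide: each MPI process plays the role of a task, and a matched send/receive pair stands in for a channel (the payload and ranks are illustrative).

```c
/* Tasks-and-channels sketch in MPI: process startup covers "create task",
   MPI_Send/MPI_Recv cover "send msg"/"receive msg", MPI_Finalize covers
   "terminate". Compile with mpicc; run with: mpirun -n 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                        /* illustrative payload */
        /* "send msg": task 0 sends over an implicit channel to task 1 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* "receive msg": task 1 blocks until the message arrives */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("task 1 received %d\n", value);
    }

    MPI_Finalize();                        /* "terminate" */
    return 0;
}
```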

  5. Parallel Algorithms Examples: Finite Difference
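Foster's finite-difference example repeatedly applies a local stencil to a grid; in the parallel version each task owns a block of points and exchanges boundary values with its two neighbors. A serial sketch of one sweep of a 1-D three-point stencil, where the particular weights and fixed boundaries are assumptions for illustration:

```c
/* One sweep of a 1-D three-point stencil: each interior point is updated
   from its left neighbor, itself, and its right neighbor. The weights
   (1, 2, 1)/4 and the fixed boundary values are illustrative assumptions. */
#include <stddef.h>

void fd_step(const double *x, double *x_new, size_t n) {
    for (size_t i = 1; i + 1 < n; i++)
        x_new[i] = (x[i - 1] + 2.0 * x[i] + x[i + 1]) / 4.0;
    x_new[0]     = x[0];       /* boundaries held fixed */
    x_new[n - 1] = x[n - 1];
}
```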

  6. Parallel Algorithms Examples: Pairwise Interactions Molecular dynamics: total force f_i acting on atom X_i
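Here the total force on atom i is the sum of pair contributions from every other atom, f_i = sum over j != i of F(X_i, X_j). A sketch of the O(N^2) serial loop, with force() standing in as a hypothetical pair-interaction function; a real molecular-dynamics code would use a physical potential and 3-D positions:

```c
/* Pairwise interactions: accumulate the total force f[i] on each atom i
   from every other atom j. force() is a hypothetical pair-force function;
   positions are 1-D here purely to keep the sketch short. */
double force(double xi, double xj);   /* assumed pair interaction */

void total_forces(const double *x, double *f, int n) {
    for (int i = 0; i < n; i++) {
        f[i] = 0.0;
        for (int j = 0; j < n; j++)
            if (j != i)
                f[i] += force(x[i], x[j]);   /* O(N^2) pairs overall */
    }
}
```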

  7. Parallel Algorithms Examples: Search

  8. Parallel Algorithms Examples: Parameter Study

  9. Parallel Program Design

  10. Partitioning • Domain decomposition • Functional decomposition

  11. Partitioning Design Checklist • Does your partition define at least an order of magnitude more tasks than there are processors in your target computer? • Does your partition avoid redundant computation and storage requirements? • Are tasks of comparable size? • Does the number of tasks scale with problem size? Ideally, an increase in problem size should increase the number of tasks rather than the size of individual tasks. • Have you identified several alternative partitions? You can maximize flexibility in subsequent design stages by considering alternatives now. Remember to investigate both domain and functional decompositions.

  12. Communication • Local • Global • Asynchronous • Unstructured and dynamic

  13. Communication Design Checklist • Do all tasks perform about the same number of communication operations? • Does each task communicate only with a small number of neighbors? • Are communication operations able to proceed concurrently? • Is the computation associated with different tasks able to proceed concurrently?

  14. Agglomeration

  15. Increasing Granularity

  16. Replicating Computation • Example: sum N values and store the result on all nodes. • array: 2(N-1) steps • tree: 2 log N steps • ring instead of array: (N-1) steps? (sketched below)
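A sketch of the ring variant asked about above, in MPI: each task forwards a value to its right neighbor for N-1 steps, accumulating what arrives from the left, so every task holds the full sum after N-1 communication steps. The local values are illustrative.

```c
/* Ring summation: after size-1 send/receive steps, every task has seen
   every other task's value, so each holds the complete sum. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local = (double)rank;           /* illustrative local value */
    double sum = local, incoming, outgoing = local;
    int left  = (rank - 1 + size) % size;
    int right = (rank + 1) % size;

    for (int step = 0; step < size - 1; step++) {
        /* pass the current value right, receive from the left, accumulate */
        MPI_Sendrecv(&outgoing, 1, MPI_DOUBLE, right, 0,
                     &incoming, 1, MPI_DOUBLE, left, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        sum += incoming;
        outgoing = incoming;               /* forward what we just received */
    }
    printf("task %d: total = %f\n", rank, sum);
    MPI_Finalize();
    return 0;
}
```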

  17. Replicating Computation

  18. Agglomeration Design Checklist • Has agglomeration reduced communication costs by increasing locality? • If agglomeration has replicated computation, have you verified that the benefits of this replication outweigh its costs, for a range of problem sizes and processor counts? • If agglomeration replicates data, have you verified that this does not compromise the scalability of your algorithm by restricting the range of problem sizes or processor counts that it can address? • Has agglomeration yielded tasks with similar computation and communication costs? • Does the number of tasks still scale with problem size? • If agglomeration eliminated opportunities for concurrent execution, have you verified that there is sufficient concurrency for current and future target computers? • Can the number of tasks be reduced still further, without introducing load imbalances, increasing software engineering costs, or reducing scalability? • If you are parallelizing an existing sequential program, have you considered the cost of the modifications required to the sequential code?

  19. Mapping

  20. Recursive Bisection
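Recursive bisection maps tasks to processors by repeatedly halving the domain and assigning each half to half of the processor set. A minimal 1-D sketch of that divide-and-conquer structure, under the assumption that task coordinates are pre-sorted so the median is the middle index; real mesh partitioners bisect along alternating dimensions, and all names here are illustrative.

```c
/* Recursive bisection sketch: split a (pre-sorted) range of tasks at its
   median and assign each half to half of the processors, recursing until
   one processor owns each range. */
void bisect(int lo, int hi, int proc_lo, int proc_hi, int *owner) {
    if (proc_lo == proc_hi) {         /* one processor left: it owns the range */
        for (int i = lo; i < hi; i++)
            owner[i] = proc_lo;
        return;
    }
    int mid  = lo + (hi - lo) / 2;                 /* median split of the work */
    int pmid = proc_lo + (proc_hi - proc_lo) / 2;  /* split the processor set */
    bisect(lo, mid, proc_lo, pmid, owner);
    bisect(mid, hi, pmid + 1, proc_hi, owner);
}
```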

  21. Task Scheduling

  22. Mapping Design Checklist • If considering an SPMD design for a complex problem, have you also considered an algorithm based on dynamic task creation and deletion? • If considering a design based on dynamic task creation and deletion, have you also considered an SPMD algorithm? • If using a centralized load-balancing scheme, have you verified that the manager will not become a bottleneck? • If using a dynamic load-balancing scheme, have you evaluated the relative costs of different strategies? • If using probabilistic or cyclic methods, do you have a large enough number of tasks to ensure reasonable load balance? Typically, at least ten times as many tasks as processors are required.
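To make the centralized scheme in the checklist concrete, a hedged manager/worker sketch in MPI: the manager hands out task indices on demand, which is exactly the single point of contact that can become a bottleneck as worker counts grow. Tags, the task representation, and the task count are assumptions.

```c
/* Centralized task scheduling (manager/worker): rank 0 hands out task
   indices as workers ask for them; a STOP tag ends each worker. The task
   payload (an int index) and tag values are illustrative assumptions. */
#include <mpi.h>

#define NTASKS   100   /* assumed number of tasks */
#define TAG_WORK 1
#define TAG_STOP 2

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                           /* manager */
        int next = 0, stopped = 0;
        while (stopped < size - 1) {
            int req;
            MPI_Status st;
            MPI_Recv(&req, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                     MPI_COMM_WORLD, &st);     /* a worker asks for work */
            if (next < NTASKS) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK,
                         MPI_COMM_WORLD);
                next++;
            } else {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_STOP,
                         MPI_COMM_WORLD);
                stopped++;
            }
        }
    } else {                                   /* worker */
        for (;;) {
            int req = 0, task;
            MPI_Status st;
            MPI_Send(&req, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
            MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP)
                break;
            /* ... perform task `task` here ... */
        }
    }
    MPI_Finalize();
    return 0;
}
```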

  23. Case Study: Atmosphere Model

  24. Approaches to Performance Evaluation • Amdahl's Law • Developing models: • Execution time: • Computation time • Communication time • Idle time • Efficiency and Speedup • Scalability analysis: • With fixed problem size • With scaled problem size
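A small sketch tying the listed quantities together, assuming the usual definitions: Amdahl speedup S(p) = 1 / (f + (1 - f)/p) for serial fraction f, with efficiency E = S/p. The 5% serial fraction below is an illustrative assumption.

```c
/* Amdahl's Law: speedup on p processors is bounded by the serial fraction
   f of the program. Efficiency is speedup divided by processor count. */
#include <stdio.h>

double amdahl_speedup(double serial_fraction, int p) {
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p);
}

int main(void) {
    double f = 0.05;                       /* assumed 5% serial fraction */
    for (int p = 1; p <= 1024; p *= 4) {
        double s = amdahl_speedup(f, p);
        printf("p=%4d  speedup=%6.2f  efficiency=%.2f\n", p, s, s / p);
    }
    return 0;
}
```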
