The PFunc Implementation of NAS Parallel Benchmarks.

Presenter: Shashi Kumar Nanjaiah. Advisor: Dr. Chung E Wang. Department of Computer Science, California State University, Sacramento.


Presentation Transcript


  1. The PFunc Implementation of NAS Parallel Benchmarks. Presenter: Shashi Kumar Nanjaiah Advisor: Dr. Chung E Wang Department of Computer Science California State University, Sacramento

  2. Overview. The goal of this project is to demonstrate that task parallelism in PFunc can effectively parallelize industry-standard benchmark kernels and applications on shared-memory machines.
     • Introduce PFunc, a new tool for task parallelism.
     • New features and extensions.
     • Fibonacci example.
     • Introduce NAS parallel benchmarks.
     • Briefly explain the 7 benchmarks.

  3. Background: PFunc, a new tool for task parallelism.
     • Extends the existing task-parallel feature set.
     • Cilk, Threading Building Blocks, Fortran M, etc.
     • Portable.
     • Linux, OS X, AIX and Windows.
     • Customizable.
     • Generic and generative programming techniques.
     • No runtime penalty.
     • C and C++ APIs.
     • Released under Eclipse Public License v1.0.
     • http://coin-or.org/projects/PFunc.xml

  4. Example: Parallelizing Fibonacci numbers.

         typedef struct { int n; int fib_n; } fib_t;

         /* Computes fib(n) for the fib_t passed through arg. */
         void fibonacci (void* arg) {
           fib_t* fib_arg = (fib_t*) arg;
           if (0 == fib_arg->n || 1 == fib_arg->n) {
             /* Base case: fib(0) = 0, fib(1) = 1. */
             fib_arg->fib_n = fib_arg->n;
           } else {
             pfunc_cilk_task_t fib_task;
             fib_t fib_n_1 = {(fib_arg->n) - 1, 0};
             fib_t fib_n_2 = {(fib_arg->n) - 2, 0};
             pfunc_cilk_task_init (&fib_task);
             /* Spawn fib(n-1) as a task; compute fib(n-2) in this thread. */
             pfunc_cilk_spawn_c (fib_task,   /* Handle to the task */
                                 NULL,       /* Attribute -- use default */
                                 NULL,       /* Group -- use default */
                                 fibonacci,  /* Function to execute */
                                 &fib_n_1);  /* Argument */
             fibonacci (&fib_n_2);
             pfunc_cilk_wait (fib_task);
             pfunc_cilk_task_clear (&fib_task);
             fib_arg->fib_n = fib_n_1.fib_n + fib_n_2.fib_n;
           }
         }
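For checking results, the same recursion can be written without any task library. The sketch below is not part of the presentation and makes no PFunc calls; the name fibonacci_serial is hypothetical. It keeps the fib_t layout and the void* argument convention from the slide, so its output can be compared directly against the spawned version.

```c
/* Same argument record as on the slide: input n, output fib_n. */
typedef struct { int n; int fib_n; } fib_t;

/* Sequential baseline (hypothetical name): identical recursion,
 * but both subproblems are computed in the calling thread. */
void fibonacci_serial (void* arg) {
  fib_t* fib_arg = (fib_t*) arg;
  if (0 == fib_arg->n || 1 == fib_arg->n) {
    fib_arg->fib_n = fib_arg->n;
  } else {
    fib_t fib_n_1 = {(fib_arg->n) - 1, 0};
    fib_t fib_n_2 = {(fib_arg->n) - 2, 0};
    fibonacci_serial (&fib_n_1);  /* where the task version spawns */
    fibonacci_serial (&fib_n_2);
    fib_arg->fib_n = fib_n_1.fib_n + fib_n_2.fib_n;
  }
}
```

To use it, fill a fib_t with the desired n, pass its address to fibonacci_serial, and read the result from fib_n.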

  5. Fibonacci: task creation overhead. Fibonacci number 37 (2^36 ≈ 69 billion tasks).
     • 2x faster than TBB.
     • Only 2x slower than Cilk.
     • But provides more flexibility.
     • Fibonacci exhibits worst-case behavior.
     • Library-based rather than a custom compiler.

  6. NAS Parallel Benchmarks.
     • NAS stands for NASA Advanced Supercomputing.
     • Help evaluate the performance of parallel tools and machines.
     • Consist of 5 kernels and 3 pseudo-applications.
     • Taken mostly from Computational Fluid Dynamics (CFD).
     • Originally written in Fortran, but C versions are available.
     • http://www.nas.nasa.gov/Resources/Software/npb.html
     • NPB OpenMP-C v2.3.
     • Base code taken from the Omni group's implementation.

  7. NAS Parallel Benchmarks.

  8. Conclusion.
     • Modify the data-parallel NPB OpenMP-C version into a task-parallel version.
     • Compare against the original NPB OpenMP-C version.
     • For problem sizes in classes A, B and C.
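As a rough illustration of turning a data-parallel loop into a task-parallel one, the sketch below splits a reduction over an array into independent range tasks. It is not taken from the NPB or PFunc sources: range_task_t, sum_range, chunked_sum and MAX_TASKS are hypothetical names, and the tasks are run one after another in a plain loop. In a task library, each call to sum_range would be a spawn, followed by a wait before the partial sums are combined.

```c
#include <stddef.h>

#define MAX_TASKS 64  /* assumed upper bound on the number of tasks */

/* One task: sum data[begin..end) into partial. */
typedef struct {
  const double* data;
  size_t begin;
  size_t end;
  double partial;
} range_task_t;

/* Task body, using the void* argument convention of the Fibonacci slide. */
static void sum_range (void* arg) {
  range_task_t* task = (range_task_t*) arg;
  double sum = 0.0;
  for (size_t i = task->begin; i < task->end; ++i) {
    sum += task->data[i];
  }
  task->partial = sum;
}

/* Splits the loop over [0, n) into at most ntasks contiguous chunks. */
double chunked_sum (const double* data, size_t n, size_t ntasks) {
  range_task_t tasks[MAX_TASKS];
  if (ntasks == 0) ntasks = 1;
  if (ntasks > MAX_TASKS) ntasks = MAX_TASKS;
  size_t chunk = (n + ntasks - 1) / ntasks;  /* ceiling division */

  for (size_t t = 0; t < ntasks; ++t) {
    size_t begin = t * chunk;
    if (begin > n) begin = n;
    size_t end = begin + chunk;
    if (end > n) end = n;
    tasks[t].data = data;
    tasks[t].begin = begin;
    tasks[t].end = end;
    tasks[t].partial = 0.0;
    sum_range (&tasks[t]);  /* a task library would spawn here */
  }

  /* After all tasks finish, combine the partial results. */
  double total = 0.0;
  for (size_t t = 0; t < ntasks; ++t) {
    total += tasks[t].partial;
  }
  return total;
}
```

Each task writes only to its own partial field, so the task bodies share no mutable state and the final combine step is the only point that needs all of them to have finished.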
