1 / 41

PETSc

PETSc. Material adapted from a tutorial by:. Satish Balay Bill Gropp Lois Curfman McInnes Barry Smith www-fp.mcs.anl.gov/petsc/docs/tutorials/index.htm Mathematics and Computer Science Division Argonne National Laboratory http://www.mcs.anl.gov/petsc. PETSc Philosophy.

halona
Télécharger la présentation

PETSc

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. PETSc Material adapted from a tutorial by: Satish Balay Bill Gropp Lois Curfman McInnes Barry Smith www-fp.mcs.anl.gov/petsc/docs/tutorials/index.htm Mathematics and Computer Science Division Argonne National Laboratory http://www.mcs.anl.gov/petsc

  2. PETSc Philosophy • Writing hand-parallelized (message-passing or distributed shared memory) application codes from scratch is extremely difficult and time consuming. • Scalable parallelizing compilers for real application codes are very far in the future. • We can ease the development of parallel application codes by developing general-purpose, parallel numerical PDE libraries.

  3. PETSc Concepts in the Tutorial • How to specify the mathematics of the problem • Data objects • vectors, matrices • How to solve the problem • Solvers • linear, nonlinear, and timestepping (ODE) solvers • Parallel computing complications • Parallel data layout • structured and unstructured meshes

  4. Tutorial Outline • Data layout and ghost values • structured mesh problems • unstructured mesh problems • partitioning and coloring • Putting it all together • a complete example • Debugging and error handling • Profiling and performance tuning • -log_summary • stages and preloading • user-defined events • Extensibility issues • Getting started • sample results • programming paradigm • Data objects • vectors (e.g., field variables) • matrices (e.g., sparse Jacobians) • Viewers • object information • visualization • Solvers • linear • nonlinear • timestepping (and ODEs)

  5. 4 3 advanced developer 2 1 beginner intermediate Tutorial Approach From the perspective of an application programmer: • Beginner • basic functionality, intended for use by most programmers • Intermediate • selecting options, performance evaluation and tuning • Advanced • user-defined customization of algorithms and data structures • Developer • advanced customizations, intended primarily for use by library developers

  6. Incremental Application Improvement • Beginner • Get the application “up and walking” • Intermediate • Experiment with options • Determine opportunities for improvement • Advanced • Extend algorithms and/or data structures as needed • Developer • Consider interface and efficiency issues for integration and interoperability of multiple toolkits

  7. The PETSc Programming Model • Goals • Portable, runs everywhere • Performance • Scalable parallelism • Approach • Distributed memory, “shared-nothing” • Requires only a compiler (single node or processor) • Access to data on remote machines through MPI • Can still exploit “compiler discovered” parallelism on each node (e.g., SMP) • Hide within parallel objects the details of the communication • User orchestrates communication at a higher abstract level than message passing tutorial introduction

  8. Collectivity • MPI communicators (MPI_Comm) specify collectivity (processors involved in a computation) • All PETSc creation routines for solver and data objects are collective with respect to a communicator, e.g., VecCreate(MPI_Comm comm, int m, int M, Vec *x) • Some operations are collective, while others are not, e.g., • collective: VecNorm( ) • not collective: VecGetLocalSize() • If a sequence of collective routines is used, they must be called in the same order on each processor tutorial introduction

  9. PETSc PDE Application Codes ODE Integrators Visualization Nonlinear Solvers, Unconstrained Minimization Interface Linear Solvers Preconditioners + Krylov Methods Object-Oriented Matrices, Vectors, Indices Grid Management Profiling Interface Computation and Communication Kernels MPI, MPI-IO, BLAS, LAPACK PDE Application Codes tutorial introduction

  10. Nonlinear Solvers Time Steppers Newton-based Methods Other Euler Backward Euler Pseudo Time Stepping Other Line Search Trust Region Krylov Subspace Methods GMRES CG CGS Bi-CG-STAB TFQMR Richardson Chebychev Other Preconditioners Additive Schwartz Block Jacobi Jacobi ILU ICC LU (Sequential only) Others Matrices Compressed Sparse Row (AIJ) Blocked Compressed Sparse Row (BAIJ) Block Diagonal (BDIAG) Dense Other Vectors Index Sets Indices Block Indices Stride Other PETSc Numerical Components tutorial introduction

  11. Flow of Control for PDE Solution Main Routine Timestepping Solvers (TS) Nonlinear Solvers (SNES) Linear Solvers (SLES) PETSc PC KSP Application Initialization Function Evaluation Jacobian Evaluation Post- Processing User code PETSc code tutorial introduction

  12. True Portability • Tightly coupled systems • Cray T3D/T3E • SGI/Origin • IBM SP • Convex Exemplar • Loosely coupled systems, e.g., networks of workstations • Sun OS, Solaris 2.5 • IBM AIX • DEC Alpha • HP • Linux • Freebsd • Windows NT/95 tutorial introduction

  13. Data Objects • Vectors (Vec) • focus: field data arising in nonlinear PDEs • Matrices (Mat) • focus: linear operators arising in nonlinear PDEs (i.e., Jacobians) • Object creation • Object assembly • Setting options • Viewing • User-defined customizations beginner beginner intermediate intermediate advanced tutorial outline: data objects

  14. Vectors proc 0 • Fundamental objects for storing field solutions, right-hand sides, etc. • VecCreateMPI(...,Vec *) • MPI_Comm - processors that share the vector • number of elements local to this processor • total number of elements • Each process locally owns a subvector of contiguously numbered global indices proc 1 proc 2 proc 3 proc 4 data objects: vectors beginner

  15. Vectors: Example #include "vec.h" int main(int argc,char **argv) { Vec x,y,w; /* vectors */ Vec *z; /* array of vectors */ double norm,v,v1,v2;int n = 20,ierr; Scalar one = 1.0,two = 2.0,three = 3.0,dots[3],dot; PetscInitialize(&argc,&argv,(char*)0,help); ierr = OptionsGetInt(PETSC_NULL,"n",&n,PETSC_NULL); CHKERRA(ierr); ierr =VecCreate(PETSC_COMM_WORLD,PETSC_DECIDE,n,&x); CHKERRA(ierr); ierr = VecSetFromOptions(x);CHKERRA(ierr); ierr = VecDuplicate(x,&y);CHKERRA(ierr); ierr = VecDuplicate(x,&w);CHKERRA(ierr); ierr = VecSet(&one,x);CHKERRA(ierr); ierr = VecSet(&two,y);CHKERRA(ierr); ierr = VecSet(&one,z[0]);CHKERRA(ierr); ierr = VecSet(&two,z[1]);CHKERRA(ierr); ierr = VecSet(&three,z[2]);CHKERRA(ierr); ierr = VecDot(x,x,&dot);CHKERRA(ierr); ierr = VecMDot(3,x,z,dots);CHKERRA(ierr);

  16. Vectors: Example ierr = PetscPrintf(PETSC_COMM_WORLD,"Vector length %d\n",(int)dot); CHKERRA(ierr); ierr = VecScale(&two,x);CHKERRA(ierr); ierr = VecNorm(x,NORM_2,&norm);CHKERRA(ierr); v = norm-2.0*sqrt((double)n); if (v > -1.e-10 && v < 1.e-10) v = 0.0; ierr = PetscPrintf(PETSC_COMM_WORLD,"VecScale %g\n",v); CHKERRA(ierr); ierr = VecDestroy(x);CHKERRA(ierr); ierr = VecDestroy(y);CHKERRA(ierr); ierr = VecDestroy(w);CHKERRA(ierr); ierr = VecDestroyVecs(z,3);CHKERRA(ierr); PetscFinalize(); return 0; }

  17. Vector Assembly • VecSetValues(Vec,…) • number of entries to insert/add • indices of entries • values to add • mode: [INSERT_VALUES,ADD_VALUES] • VecAssemblyBegin(Vec) • VecAssemblyEnd(Vec) data objects: vectors beginner

  18. Parallel Matrix and Vector Assembly • Processors may generate any entries in vectors and matrices • Entries need not be generated on the processor on which they ultimately will be stored • PETSc automatically moves data during the assembly process if necessary data objects: vectors and matrices beginner

  19. Selected Vector Operations

  20. Sparse Matrices • Fundamental objects for storing linear operators (e.g., Jacobians) • MatCreateMPIAIJ(…,Mat *) • MPI_Comm - processors that share the matrix • number of local rows and columns • number of global rows and columns • optional storage pre-allocation information data objects: matrices beginner

  21. proc 1 proc 2 } proc 3 proc 3: locally owned rows proc 4 Parallel Matrix Distribution • Each process locally owns a submatrix of contiguously numbered global rows. MatGetOwnershipRange(Mat A, int *rstart, int *rend) • rstart: first locally owned row of global matrix • rend -1: last locally owned row of global matrix proc 0 data objects: matrices beginner

  22. Matrix Assembly • MatSetValues(Mat,…) • number of rows to insert/add • indices of rows and columns • number of columns to insert/add • values to add • mode: [INSERT_VALUES,ADD_VALUES] • MatAssemblyBegin(Mat) • MatAssemblyEnd(Mat) data objects: matrices beginner

  23. Viewers • Printing information about solver and data objects • Visualization of field and matrix data • Binary output of vector and matrix data beginner beginner intermediate tutorial outline: viewers

  24. Viewer Concepts • Information about PETSc objects • runtime choices for solvers, nonzero info for matrices, etc. • Data for later use in restarts or external tools • vector fields, matrix contents • various formats (ASCII, binary) • Visualization • simple x-window graphics • vector fields • matrix sparsity structure beginner viewers

  25. Viewing Vector Fields Solution components, using runtime option -snes_vecmonitor • VecView(Vec x,Viewer v); • Default viewers • ASCII (sequential): VIEWER_STDOUT_SELF • ASCII (parallel): VIEWER_STDOUT_WORLD • X-windows: VIEWER_DRAW_WORLD • Default ASCII formats • VIEWER_FORMAT_ASCII_DEFAULT • VIEWER_FORMAT_ASCII_MATLAB • VIEWER_FORMAT_ASCII_COMMON • VIEWER_FORMAT_ASCII_INFO • etc. velocity: v velocity: u temperature: T vorticity: z beginner viewers

  26. Viewing Matrix Data • MatView(Mat A, Viewer v); • Runtime options available after matrix assembly • -mat_view_info • info about matrix assembly • -mat_view_draw • sparsity structure • -mat_view • data in ASCII • etc. beginner viewers

  27. Linear Solvers Goal: Support the solution of linear systems, Ax=b, particularly for sparse, parallel problems arising within PDE-based models User provides: • Code to evaluate A, b solvers:linear beginner

  28. Linear PDE Solution Main Routine PETSc Linear Solvers (SLES) Solve Ax = b PC KSP Application Initialization Evaluation of A and b Post- Processing User code PETSc code solvers:linear beginner

  29. Linear Solvers (SLES) SLES: Scalable Linear Equations Solvers • Application code interface • Choosing the solver • Setting algorithmic options • Viewing the solver • Determining and monitoring convergence • Providing a different preconditioner matrix • Matrix-free solvers • User-defined customizations beginner beginner intermediate intermediate intermediate intermediate advanced tutorial outline: solvers: linear advanced

  30. Conjugate Gradient GMRES CG-Squared Bi-CG-stab Transpose-free QMR etc. Block Jacobi Overlapping Additive Schwarz ICC, ILU via BlockSolve95 ILU(k), LU (sequential only) etc. Linear Solvers in PETSc 2.0 Preconditioners (PC) KrylovMethods (KSP) solvers:linear beginner

  31. Basic Linear Solver Code (C/C++) SLES sles; /* linear solver context */ Mat A; /* matrix */ Vec x, b; /* solution, RHS vectors */ int n, its; /* problem dimension, number of iterations */ MatCreate(MPI_COMM_WORLD,n,n,&A); /* assemble matrix */ VecCreate(MPI_COMM_WORLD,n,&x); VecDuplicate(x,&b); /* assemble RHS vector */ SLESCreate(MPI_COMM_WORLD,&sles); SLESSetOperators(sles,A,A,DIFFERENT_NONZERO_PATTERN); SLESSetFromOptions(sles); SLESSolve(sles,b,x,&its); solvers:linear beginner

  32. Basic Linear Solver Code (Fortran) SLES sles Mat A Vec x, b integer n, its, ierr call MatCreate(MPI_COMM_WORLD,n,n,A,ierr) call VecCreate(MPI_COMM_WORLD,n,x,ierr) call VecDuplicate(x,b,ierr) call SLESCreate(MPI_COMM_WORLD,sles,ierr) call SLESSetOperators(sles,A,A,DIFFERENT_NONZERO_PATTERN,ierr) call SLESSetFromOptions(sles,ierr) call SLESSolve(sles,b,x,its,ierr) C then assemble matrix and right-hand-side vector solvers:linear beginner

  33. Customization Options • Procedural Interface • Provides a great deal of control on a usage-by-usage basis inside a single code • Gives full flexibility inside an application • Command Line Interface • Applies same rule to all queries via a database • Enables the user to have complete control at runtime, with no extra coding solvers:linear intermediate

  34. Setting Solver Options within Code • SLESGetKSP(SLES sles,KSP *ksp) • KSPSetType(KSP ksp,KSPType type) • KSPSetTolerances(KSP ksp,double rtol,double atol,double dtol, int maxits) • ... • SLESGetPC(SLES sles,PC *pc) • PCSetType(PC pc,PCType) • PCASMSetOverlap(PC pc,int overlap) • ... solvers:linear intermediate

  35. 2 1 beginner intermediate Setting Solver Options at Runtime • -ksp_type [cg,gmres,bcgs,tfqmr,…] • -pc_type [lu,ilu,jacobi,sor,asm,…] • -ksp_max_it <max_iters> • -ksp_gmres_restart <restart> • -pc_asm_overlap <overlap> • -pc_asm_type [basic,restrict,interpolate,none] • etc ... 1 2 solvers:linear

  36. SLES: Review of Basic Usage • SLESCreate( ) - Create SLES context • SLESSetOperators( ) - Set linear operators • SLESSetFromOptions( ) - Set runtime solver options for [SLES,KSP,PC] • SLESSolve( ) - Run linear solver • SLESView( ) - View solver options actually used at runtime (alternative: -sles_view) • SLESDestroy( ) - Destroy solver solvers:linear beginner

  37. 2 1 beginner intermediate SLES: Review of Selected Preconditioner Options 1 2 And many more options... solvers: linear: preconditioners

  38. 2 1 beginner intermediate SLES: Review of Selected Krylov Method Options 1 2 And many more options... solvers: linear: Krylov methods

  39. SLES: Runtime Script Example solvers:linear intermediate

  40. Viewing SLES Runtime Options solvers:linear intermediate

  41. 2 1 3 beginner intermediate advanced SLES: Example Programs Location:www.mcs.anl.gov/petsc/src/sles/examples/tutorials/ • ex1.c, ex1f.F - basic uniprocessor codes • ex2.c, ex2f.F - basic parallel codes • ex11.c - using complex numbers • ex4.c - using different linear system and preconditioner matrices • ex9.c - repeatedly solving different linear systems • ex15.c - setting a user-defined preconditioner 1 2 3 • And many more examples ... solvers:linear

More Related