html5-img
1 / 37

Solving Large Sparse Linear Systems

Solving Large Sparse Linear Systems. Ruth Shaw Dept. of CS & Appl. Stats. UNB Saint John. Outline. Solving Linear Systems Iterative Methods Jacobi Method PDE Example Designing the Parallel Algorithm. Solving Linear Systems. Ax = b. Iterative Methods. Jacobi Gauss-Seidel

baina
Télécharger la présentation

Solving Large Sparse Linear Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Solving Large Sparse Linear Systems Ruth Shaw Dept. of CS & Appl. Stats. UNB Saint John

  2. Outline • Solving Linear Systems • Iterative Methods • Jacobi Method • PDE Example • Designing the Parallel Algorithm

  3. Solving Linear Systems Ax = b

  4. Iterative Methods • Jacobi • Gauss-Seidel • successive over-relaxation (SOR) • conjugate gradient method • biconjugate gradient method • generalized minimal residual method

  5. Jacobi Method Convert Ax = b into the form (L + D + U)x = b Dx = (-L - U)x + b x = D-1(-L - U)x + D-1b x = Cx + d Generate a sequence of approximations x (k+1) = Cx (k) + d

  6. Jacobi Method Consider a small 3 x 3 system of equations: 2x1 – x2 + x3 = -1 x1 + 2x2 – x3 = 6 x1 – x2 + 2x3 = -3 This is converted to: x1 = [-1 – (x2 - x3)] / 2 x2 = [ 6 – (- x1 + x3)] / 2 x3 = [-3 – (x1 + x2)] / 2

  7. Jacobi Method Successive iterates are generated as follows: x1 (k+1) = [-1 – (x2 (k)- x3 (k))] / 2 x2(k+1) = [ 6 – (- x1 (k) + x3 (k))] / 2 x3(k+1) = [-3 – (x1 (k) + x2 (k))] / 2 The initial guess (k = 0) to the solution vector is usually taken to be zero

  8. Jacobi Method Approximation to solution after 13 iterations: x1 = 1.0002, x2 = 2.0001, x3 = -0.9997 Exact Solution: x1 = 1, x2 = 2, x3 = -1 Max Error: 3.0 x 10-4

  9. PDE Example • Finite-Difference Solution of a PDE • typically involves solving a linear system with a large banded coefficient matrix • based on subdividing the domain of the problem with a mesh of discrete points … derivatives are replaced with difference quotients

  10. PDE Example The steady-state temperature distribution in a plate, 0 ≤ x≤ 1, 0 ≤ y ≤ 1, can be described by the Poisson equation: uxx + uyy = f(x, y) If there are no heat sources (f(x, y) = 0) the equation is know as Laplace’s equation.

  11. PDE Example Consider the two-dimension Laplace equation uxx + uyy = 0, 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 with boundary conditions u(0, y) = y2, u(1, y) = 1, 0 < y < 1 u(x, 0) = x2, u(x, 1) = 1, 0 < x < 1

  12. PDE Example We define a mesh in the (x, y) plane: xi = ih, i = 0, 1, …, n, h = ∆x = 1/n yi = jk, j = 0, 1, …, m, k = ∆y = 1/m Denote the value of u(x, y) at the point (xi, yi) as uij

  13. PDE Example Replacing the second derivatives in the Laplace equation uxx + uyy = 0 by centered differences gives a system of algebraic equations for the function values at the mesh points (1/k2)(ui,j+1 – 2uij + ui,j-1) + (1/h2)(ui+1,j – 2uij + ui-1,j) = 0

  14. PDE Example Setting ∆x = ∆ y gives (ui,j+1 – 2uij + ui,j-1) + (ui+1,j – 2uij + ui-1,j) = 0 Which results in ui,j+1 – 4uij + ui,j-1 + ui+1,j + ui-1,j = 0

  15. PDE Example Setting n =3 gives the following system of equations: u12 – 4u11 + u10 + u21 + u01 = 0 u13 – 4u12 + u11 + u22 + u02 = 0 : u10 u01

  16. PDE Example The values outside the mesh represent the boundaries of the plate

  17. PDE Example Denote the boundary values by bk , k = 1,…,4n and the unknown mesh values by ui , i = 1,…,n*n

  18. PDE Example Moving the boundary values to the right of each equation gives us the following linear system: 4u1 – u2 – u4 = b1 + b12 - u1 + 4u2 – u3 – u5 = b2 – u2 + 4u3 – u6 = b3 + b4 - u1 + 4u4 – u5 – u7 = b11 – u2 – u4 + 4u5 – u6 – u8 = 0 - u3 – u5 + 4u6 – u9 = b5 - u4 + 4u7 – u8 = b9 + b10 - u5 – u7 + 4u8 – u9 = b8 - u6 – u8 4u9 = b6 + b7

  19. PDE Example Assigning values to the boundaries then gives us the following graph

  20. PDE Example Which now represents the following system: 4u1 – u2 – u4 = 0.3 - u1 + 4u2 – u3 – u5 = 0.25 – u2 + 4u3 – u6 = 1.9 - u1 + 4u4 – u5 – u7 = 0.5 – u2 – u4 + 4u5 – u6 – u8 = 0 - u3 – u5 + 4u6 – u9 = 1.0 - u4 + 4u7 – u8 = 1.8 - u5 – u7 + 4u8 – u9 = 1.0 - u6 – u8 + 4u9 = 2.0

  21. PDE Example For the Jacobi method, this system is converted to the following set of equations: u1 = [0.3 – (u2 + u4)] / 4 u2 = [0.25 – (u1 + u3 + u5)] / 4 : u9 = [2.0 – (u6 + u8)] / 4

  22. PDE Example The Jacobi iteration scheme is then given by: u1(k+1) = [0.3 – (u2(k) + u4(k) )] / 4 u2(k+1) = [0.25 – (u1(k) + u3(k) + u5(k))] / 4 : u9(k+1) = [2.0 – (u6(k) + u8(k))] / 4

  23. PDE Example After 38 iterations Max error 8.9 x 10-7

  24. Designing the Parallel Algorithm There are essentially two approaches to designing a parallel program: data-parallel approach domain decomposition … divide the data into pieces control-parallel approach functional decomposition … divide the computation into pieces

  25. Designing the Parallel Algorithm After deciding on the decomposition of the algorithm, the communication pattern needs to be determined => minimize processor communication

  26. Parallel Jacobi Recall our PDE example for n = 3: u1(k+1) = [0.3 – (u2(k) + u4(k) )] / 4 u2(k+1) = [0.25 – (u1(k) + u3(k) + u5(k))] / 4 u3(k+1) = [1.9 – (u2(k) + u6(k))] / 4 u4(k+1) = [0.5 – (u1(k) + u5(k) + u7(k))] / 4 u5(k+1) = [0 – (u2(k) + u4(k) + u6(k) + u8(k))] / 4 u6(k+1) = [1.0 – (u3(k) + u5(k) + u9(k))] / 4 u7(k+1) = [1.8 – (u4(k) + u8(k))] / 4 u8(k+1) = [1.0 – (u5(k) + u7(k) + u9(k))] / 4 u9(k+1) = [2.0 – (u6(k) + u8(k))] / 4

  27. Parallel Jacobi Assume p processes where n ≥ p … An obvious approach is to have each process calculate the values in a subvector of the solution vector => for simplicity, assume n/p is even

  28. Parallel Jacobi Assume p = 3: P0: u1(k+1) = [0.3 – (u2(k) + u4(k) )] / 4 u2(k+1) = [0.25 – (u1(k) + u3(k) + u5(k))] / 4 u3(k+1) = [1.9 – (u2(k) + u6(k))] / 4 P1: u4(k+1) = [0.5 – (u1(k) + u5(k) + u7(k))] / 4 u5(k+1) = [0 – (u2(k) + u4(k) + u6(k) + u8(k))] / 4 u6(k+1) = [1.0 – (u3(k) + u5(k) + u9(k))] / 4 P2: u7(k+1) = [1.8 – (u4(k) + u8(k))] / 4 u8(k+1) = [1.0 – (u5(k) + u7(k) + u9(k))] / 4 u9(k+1) = [2.0 – (u6(k) + u8(k))] / 4

  29. Parallel Jacobi Partition coefficient matrix A using block-row chunkA = (n*n) / p call MPI_SCATTER(A, chunkA, MPI_REAL, A_local, chunkA, MPI_REAL, 0, PI_COMM_WORLD, ierr) 4 -1 0 -1 0 0 0 0 0 -1 4 -1 0 -1 0 0 0 0 0 -1 4 0 0 -1 0 0 0 -1 0 0 4 -1 0 -1 0 0 0 -1 0 -1 4 -1 0 -1 0 0 0 -1 0 -1 4 0 0 -1 0 0 0 -1 0 0 4 -1 0 0 0 0 0 -1 0 -1 4 -1 0 0 0 0 0 -1 0 -1 4

  30. Parallel Jacobi Scatter the RHS vector b chunkb = n/p call MPI_SCATTER(b, chunkb, MPI_REAL, b_local, chunkb, MPI_REAL, 0, MPI_COMM_WORLD, ierr) 0.3 0.25 1.9 0.5 0 1.0 1.8 1.0 2.0

  31. Parallel Jacobi Each processor calculates n/p elements of the solution vector P0: x_local(1), x_local(2), x_local(3) P1: x_local(4), x_local(5), x_local(6) P2: x_local(7), x_local(8), x_local(9)

  32. Parallel Jacobi After each iteration use MPI_Allgather to give all processes a copy of the updated solution vector call MPI_ALLGATHER(x_local, chunkb, MPI_REAL, x_new, chunkb, MPI_REAL, MPI_COMM_WORLD, ierr) P0 P2 P1

  33. Parallel Jacobi Determine if the norm of the solution vector has reached the set tolerance option 1: all processes compute global norm option 2: each process computes local norm and then use MPI_Allreduce to give all processes a copy of the global norm

  34. Parallel Jacobi Speedup = serial time / parallel time

  35. Parallel Jacobi Speedup = serial time / parallel time

  36. Parallel Jacobi

  37. References Iterative Methods for Solving Linear Systems http://www.quantlet.com/mdstat/scripts/csa/html/node38.html Applied Numerical Analysis Using MATLAB, 2nd Ed., by Laurene Fausett, Pearson Prentic-Hall, 2008 Parallel programming in C with MPI and OpenMP by Michael Quinn, McGraw-Hill, 2004 Parallel Programming with MPI by Peter Pacheco, Morgan Kaufmann, 1997

More Related