This project focuses on parallelizing the backsolve algorithm for upper triangular matrices, solving equations efficiently. It covers the challenges, solution techniques, and implementation using parallel processing with MPI for improved performance.
PARALLELIZATION OF MULTIPLE BACKSOLVES Project #2 James Stanley April 25, 2002
PARALLELIZATION OF MULTIPLE BACKSOLVES Project #2
• Introduction (Backsolve)
• Challenge
• Example (m = 5)
• Problem Description
• Solution Technique
• Parallel Implementation
• Results
Introduction (Backsolve) If R is an upper triangular matrix, a backsolve is the solution of the equation Rx = b, where b is a vector of length m. The formulas for the solution are:
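Introduction (Backsolve) cont.
$m$th equation: $r_{mm} x_m = b_m \;\Rightarrow\; x_m = \dfrac{b_m}{r_{mm}}$
$(m-1)$th equation: $r_{m-1,m-1} x_{m-1} + r_{m-1,m} x_m = b_{m-1} \;\Rightarrow\; x_{m-1} = \dfrac{b_{m-1} - r_{m-1,m} x_m}{r_{m-1,m-1}}$, with $x_m$ taken from the $m$th equation.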
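Introduction (Backsolve) cont.
$i$th equation: $r_{ii} x_i + r_{i,i+1} x_{i+1} + \dots + r_{im} x_m = b_i$, for $0 < i \le m-1$ $\;\Rightarrow\; x_i = \dfrac{b_i - \sum_{j=i+1}^{m} r_{ij} x_j}{r_{ii}}$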
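The recurrence above can be tried out on a small system. The following is a minimal sketch, assuming a dense list-of-rows matrix and 0-based indices; the function name `backsolve` and the test values are illustrative and do not come from the slides.

```python
def backsolve(R, b):
    """Solve R x = b for an upper triangular R (list of rows), last unknown first."""
    m = len(b)
    x = [0.0] * m
    for i in range(m - 1, -1, -1):
        if R[i][i] == 0:
            raise ZeroDivisionError(f"zero diagonal entry in row {i}")
        # Subtract the terms r_ij * x_j that are already known (j > i).
        s = sum(R[i][j] * x[j] for j in range(i + 1, m))
        x[i] = (b[i] - s) / R[i][i]
    return x


# Illustrative 3 x 3 system (values chosen so the arithmetic is exact).
R = [[2.0, 1.0, 1.0],
     [0.0, 4.0, 2.0],
     [0.0, 0.0, 5.0]]
b = [7.0, 16.0, 10.0]
print(backsolve(R, b))  # [1.0, 3.0, 2.0]
```

Each unknown costs one division plus one multiply-add per entry to its right in the row, so a single backsolve takes on the order of m²/2 operations.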
Challenge Storage: To avoid storing zeros, store the n(n+1)/2 nonzero elements of R1 in a 1-dimensional array by rows, and the m(m+1)/2 nonzero elements of R2 in a 1-dimensional array by rows.
Example Suppose m = 5, then or,
Example cont. Solving for the xi’s provides in memory
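One way to work with the row-wise packed storage described in the Challenge slide is sketched below for m = 5. The index formula, the 0-based convention, and the names `packed_index`, `pack_upper`, and `backsolve_packed` are assumptions made for illustration; the slides do not show the indexing that was actually used.

```python
def packed_index(i, j, m):
    """Position of r_ij (0-based, i <= j) in the row-wise packed array."""
    # Rows 0..i-1 hold m, m-1, ..., m-i+1 entries: i*m - i*(i-1)/2 in total.
    return i * m - i * (i - 1) // 2 + (j - i)


def pack_upper(R):
    """Copy the upper triangle of a dense square matrix into a 1-D list by rows."""
    m = len(R)
    return [R[i][j] for i in range(m) for j in range(i, m)]


def backsolve_packed(rp, b):
    """Solve R x = b where rp holds the upper triangle of R packed by rows."""
    m = len(b)
    x = [0.0] * m
    for i in range(m - 1, -1, -1):
        s = sum(rp[packed_index(i, j, m)] * x[j] for j in range(i + 1, m))
        x[i] = (b[i] - s) / rp[packed_index(i, i, m)]
    return x


m = 5
# Illustrative matrix: 2 on the diagonal, 1 above it, 0 below it.
R = [[(2.0 if i == j else 1.0) if j >= i else 0.0 for j in range(m)]
     for i in range(m)]
rp = pack_upper(R)
print(len(rp))                  # 15, i.e. m(m+1)/2 stored entries
print(packed_index(4, 4, m))    # 14, the last position in the array
b = [sum(row) for row in R]     # row sums, so the solution is all ones
print(backsolve_packed(rp, b))  # [1.0, 1.0, 1.0, 1.0, 1.0]
```

Because each row is contiguous in the packed array, the inner sum of the backsolve walks through memory sequentially.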
Problem Description Given the right-hand-side matrix $H$, solve for the unknown $n \times m$ matrix $Y$: $R_1 Y R_2^T = H$, where $R_1$ is a square upper triangular matrix of order $n \times n$ and $R_2$ is a square upper triangular matrix of order $m \times m$.
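Solution Technique
• Let $Z = R_1 Y$.
• Then $R_1 Y R_2^T = Z R_2^T$, so $Z R_2^T = H$.
• Transposing gives (1) $R_2 Z^T = H^T$, and the definition of $Z$ gives (2) $R_1 Y = Z$.
Parallel Solution of (1) and (2):
(1) $R_2 Z^T = H^T = (h_1, h_2, \dots, h_n)$
(2) $R_1 Y = Z = (z_1, z_2, \dots, z_m)$
First, solve (1) for the $m \times n$ matrix $Z^T$. Second, take the transpose of $Z^T$ to get $Z$. Third, solve (2) for the $n \times m$ solution matrix $Y$. The columns on each right-hand side are independent, so each system can be solved column by column.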
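The three steps can be checked serially on a small case before any parallel code is written. This is a sketch under stated assumptions: dense list-of-rows matrices, illustrative values with n = 2 and m = 3, and hypothetical helper names (`solve_columns`, `solve_two_sided`, `matmul`) that do not appear in the slides.

```python
def backsolve(R, b):
    """Solve R x = b for upper triangular R, last unknown first."""
    m = len(b)
    x = [0.0] * m
    for i in range(m - 1, -1, -1):
        s = sum(R[i][j] * x[j] for j in range(i + 1, m))
        x[i] = (b[i] - s) / R[i][i]
    return x


def transpose(A):
    return [list(col) for col in zip(*A)]


def solve_columns(R, B):
    """Solve R X = B one column of B at a time; the columns are independent."""
    cols = [backsolve(R, col) for col in transpose(B)]
    return transpose(cols)


def solve_two_sided(R1, R2, H):
    """Solve R1 Y R2^T = H for the n x m matrix Y via systems (1) and (2)."""
    Zt = solve_columns(R2, transpose(H))  # (1) R2 Z^T = H^T, Z^T is m x n
    Z = transpose(Zt)                     # Z is n x m
    return solve_columns(R1, Z)           # (2) R1 Y = Z


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


# Illustrative data: build H from a known Y, then recover Y.
R1 = [[2.0, 1.0],
      [0.0, 4.0]]                          # n = 2
R2 = [[1.0, 2.0, 0.0],
      [0.0, 2.0, 1.0],
      [0.0, 0.0, 4.0]]                     # m = 3
Y_true = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
H = matmul(matmul(R1, Y_true), transpose(R2))
print(solve_two_sided(R1, R2, H))  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

Step (1) consists of n independent backsolves with $R_2$ and step (2) of m independent backsolves with $R_1$, which is what makes the work easy to divide among processes.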
Parallel Implementation
• Generate $H^T$, $R_2$, and $R_1$ on process 0 using a random number generator. (computation)
• Move all of $R_2$ to all processes with MPI_Bcast. If there are p processes, make sure n/p and m/p are integers. (communication)
• Send n/p of the rows of $H$ to each process with MPI_Scatter. (communication)
• On each process, solve the n/p local systems for the local part of $Z$. (computation)
• Send all of the local parts of $Z$ to process 0 with MPI_Gather. (communication)
• Perform the transpose of $Z$ on process 0. (computation)
• Send m/p of the rows of $Z^T$ to each process with MPI_Scatter. (communication)
• On each process, solve the m/p local systems for the local part of $Y$. (computation)
• Send all of the m/p local columns of $Y$ to process 0 and print the solution on process 0. (communication)
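The data movement in the steps above can be traced without MPI. The sketch below is a serial simulation, not the project's MPI code: list slicing stands in for MPI_Scatter, list concatenation stands in for MPI_Gather, and each loop iteration plays the role of one process. The names `split_blocks`, `gather_blocks`, and `parallel_pattern` are hypothetical, and the sketch assumes $R_1$ is available on every process for the second solve, which the slides do not state.

```python
def backsolve(R, b):
    """Solve R x = b for upper triangular R, last unknown first."""
    m = len(b)
    x = [0.0] * m
    for i in range(m - 1, -1, -1):
        s = sum(R[i][j] * x[j] for j in range(i + 1, m))
        x[i] = (b[i] - s) / R[i][i]
    return x


def split_blocks(items, p):
    """Stand-in for MPI_Scatter: cut a list into p equal contiguous blocks."""
    if len(items) % p != 0:
        raise ValueError("n/p and m/p must be integers")
    size = len(items) // p
    return [items[k * size:(k + 1) * size] for k in range(p)]


def gather_blocks(blocks):
    """Stand-in for MPI_Gather: concatenate the blocks in rank order."""
    return [item for block in blocks for item in block]


def parallel_pattern(R1, R2, H, p):
    # Scatter n/p rows of H; R2 is assumed to be on every rank already.
    local_h = split_blocks(H, p)
    # Each rank solves R2 z = h for its rows h; each result is a row of Z.
    local_z = [[backsolve(R2, h) for h in block] for block in local_h]
    Z = gather_blocks(local_z)               # n x m on "process 0"
    Zt = [list(col) for col in zip(*Z)]      # rows of Z^T = columns of Z
    # Scatter m/p rows of Z^T; assumes R1 is also available on every rank.
    local_zt = split_blocks(Zt, p)
    local_y = [[backsolve(R1, z) for z in block] for block in local_zt]
    Yt = gather_blocks(local_y)              # columns of Y, in order
    return [list(row) for row in zip(*Yt)]   # Y as an n x m matrix


# Illustrative data with n = m = 2 and p = 2 simulated processes.
R1 = [[2.0, 1.0], [0.0, 4.0]]
R2 = [[1.0, 2.0], [0.0, 2.0]]
H = [[21.0, 16.0], [44.0, 32.0]]             # equals R1 Y R2^T for Y below
print(parallel_pattern(R1, R2, H, 2))        # [[1.0, 2.0], [3.0, 4.0]]
```

In a real MPI program the two comprehensions over blocks would run concurrently, one block per rank, and only the scatter, gather, and broadcast steps would involve communication.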