Use of adaptive low-rank approximation in sparse direct solver for 3D Helmholtz problem



  1. Use of adaptive low-rank approximation in sparse direct solver for 3D Helmholtz problem. IPGG SB RAS, Novosibirsk, Russia. Sergey Solovyev

  2. Agenda • Statement of the problem, key points • Proposed algorithm • Testing results

  3. Statement of the problem, key points

  4. Statement of the problem, key points
• Acoustic problem: scalar Helmholtz equation
• Induction logging problem: equation
• Elasticity problem
• 3D, many right-hand sides
• Nested dissection reordering
The solver:
• Gauss elimination algorithm
• Low-rank/HSS approximation
• Adaptive Cross Approximation
• Intel MKL BLAS/LAPACK
• Iterative refinement

  5. Proposed algorithm

  6. Algorithm of multifrontal HSS solver
Input: initial SLAE AX = B
Key points:
• Adaptive Cross Approximation (ACA) approach
• Adaptive balancing between robust and low-rank arithmetic
Steps:
1) Permute columns/rows of matrix A.
2) Perform the decomposition using the standard LU algorithm, applying low-rank approximation to the factors during the factorization process.
3) Solve the triangular systems (inversion of the LU factors).
4) Perform iterative refinement if needed.
Output: solution X.
References:
Chandrasekaran S., Gu M., Li X. S., Xia J. Some fast algorithms for hierarchically semiseparable matrices.
Xia J. Efficient structured multifrontal factorization for general large sparse matrices.
Rjasanow S. Adaptive cross approximation of dense matrices.
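A minimal sketch of this solve flow (an illustration, not the authors' code): SciPy's exact sparse LU stands in for the nested-dissection permutation and the low-rank/HSS factorization of steps 1-2, the triangular solves of step 3 are done by the factor object, and the iterative refinement of step 4 is written out. The function name, the tolerance eps, the refinement limit and the random complex symmetric test system are all illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_with_refinement(A, B, eps=1e-6, max_refine=5):
    # Steps 1-2: permute and factorize. splu (COLAMD ordering + exact LU) stands in
    # for nested dissection + LU with low-rank compression of the factors.
    lu = spla.splu(A.tocsc())
    # Step 3: solve the triangular systems for all right-hand sides at once.
    X = lu.solve(B)
    # Step 4: iterative refinement compensates for the approximation error
    # that compressed factors would introduce.
    for _ in range(max_refine):
        R = B - A @ X
        if np.linalg.norm(R) <= eps * np.linalg.norm(B):
            break
        X += lu.solve(R)
    return X

# Usage on a small random complex symmetric system with several right-hand sides.
n = 200
A = sp.random(n, n, density=0.02, format="csr") * (1.0 + 0.3j) + 4.0 * sp.eye(n)
A = (A + A.T) * 0.5                  # complex symmetric (non-Hermitian), as in the talk
B = np.random.rand(n, 3) + 1j * np.random.rand(n, 3)
X = solve_with_refinement(A, B)
print(np.linalg.norm(B - A @ X) / np.linalg.norm(B))
```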

  7. Nested dissection technique for decreasing fill-in in LU factors
[Figure: nested dissection of the grid and the resulting block pattern of the LU factors.]
• Number of non-zeros in … and … is …
• Number of non-zeros in LU is not so dramatic, but still large for big n
• Factorization of the diagonal blocks can be done in parallel
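To make the reordering concrete, here is a small self-contained sketch of nested-dissection ordering for an N × N grid (an illustration of the idea, not the solver's code): the longer grid direction is split recursively, the two halves are numbered first and the separator line last, so fill-in in the LU factors stays confined to the separator blocks. The grid size and the 2-point recursion cut-off are arbitrary assumptions.

```python
def nd_order(rows, cols, N):
    """Return row-major grid indices in nested-dissection elimination order."""
    if len(rows) <= 2 and len(cols) <= 2:            # small block: keep natural order
        return [r * N + c for r in rows for c in cols]
    if len(cols) >= len(rows):                       # split along the longer side
        m = len(cols) // 2
        first = nd_order(rows, cols[:m], N)          # left subdomain
        second = nd_order(rows, cols[m + 1:], N)     # right subdomain
        separator = [r * N + cols[m] for r in rows]  # vertical separator line
    else:
        m = len(rows) // 2
        first = nd_order(rows[:m], cols, N)          # top subdomain
        second = nd_order(rows[m + 1:], cols, N)     # bottom subdomain
        separator = [rows[m] * N + c for c in cols]  # horizontal separator line
    return first + second + separator                # separator is eliminated last

N = 7
perm = nd_order(list(range(N)), list(range(N)), N)
assert sorted(perm) == list(range(N * N))            # it is a valid permutation
print(perm)
```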

  8. Rank properties of LU-factors
[Figure: block structure of the LU factors, with sparse blocks and dense blocks marked.]

  9. Rank properties of LU-factors
[Figure: same block structure, sparse and dense blocks marked.]
• Diagonal blocks can be compressed by the HSS technique
• Off-diagonal blocks can be approximated by low-rank matrices

  10. Low-rank approximation, internal accuracy
[Figure: an m × n dense block approximated by the product of an m × k and a k × n factor (low-rank approximation); the truncation tolerance is the internal accuracy.]
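A compact illustration of what "low-rank approximation with internal accuracy" means, here done with a truncated SVD; the talk itself uses Adaptive Cross Approximation, which builds a similar factorization without computing the full SVD. The matrix sizes, the smooth test kernel and eps are illustrative assumptions.

```python
import numpy as np

def low_rank(A, eps=1e-4):
    """Return U_k, V_k^T with the smallest k such that ||A - U_k V_k^T||_2 <= eps * ||A||_2."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = int(np.searchsorted(-s, -eps * s[0])) or 1   # first index with s[k] <= eps * s[0]
    return U[:, :k] * s[:k], Vt[:k, :]               # U_k is m x k, V_k^T is k x n

# Off-diagonal blocks of the Helmholtz factors have fast rank decay;
# a smooth kernel evaluated on two separated intervals mimics that behaviour.
x, y = np.linspace(0, 1, 300), np.linspace(2, 3, 200)
A = 1.0 / (np.abs(x[:, None] - y[None, :]) + 1.0)    # 300 x 200 dense block
Uk, Vkt = low_rank(A, eps=1e-6)
print(Uk.shape[1],
      np.linalg.norm(A - Uk @ Vkt, 2) / np.linalg.norm(A, 2))
```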

  11. Structured multifrontal factorization
[Figure: elimination tree with nodes 1–15 on levels k … k+3; nodes below the switching level are factorized with robust LU, nodes at the switching level and above use LU with low-rank approximation. Pic. 1: L pattern in robust arithmetic. Pic. 2: L pattern for multifrontal factorization.]
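A rough sketch of the switching-level rule (my reading of the slide, not the authors' code): fronts at tree levels below the switching level keep robust dense LU, while larger fronts at the switching level and above have their coupling block compressed to low rank, here via truncated SVD driven by the internal accuracy eps. The two-block split, the level numbering and the test matrix are assumptions.

```python
import numpy as np
import scipy.linalg as sla

def factor_front(F, level, switch_level, eps=1e-4):
    """Factorize one frontal matrix F, choosing the arithmetic by tree level."""
    if level < switch_level:                     # below switching level: robust LU
        return ("dense", sla.lu_factor(F))
    m = F.shape[0] // 2                          # pivot block / coupling block split
    piv = sla.lu_factor(F[:m, :m])               # robust LU of the pivot block
    U, s, Vt = np.linalg.svd(F[m:, :m], full_matrices=False)
    k = max(1, int(np.sum(s > eps * s[0])))      # rank set by the internal accuracy
    return ("compressed", piv, (U[:, :k] * s[:k], Vt[:k, :]))

# Usage: a low-level front stays dense, a high-level front gets a compressed coupling.
F = np.random.rand(40, 40) + 40.0 * np.eye(40)
print(factor_front(F, level=1, switch_level=3)[0])   # 'dense'
print(factor_front(F, level=5, switch_level=3)[0])   # 'compressed'
```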

  12. Statement of the acoustic problem
Helmholtz equation:
• 3D computational domain
• Parallelepipedal grid
• Second-order approximation on a 7-point stencil
• Perfectly Matched Layer (PML) boundary condition
Resulting system AX = B:
• Complex symmetric matrix A (non-Hermitian)
• Many right-hand sides B
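A minimal assembly sketch of this discrete operator (a simplified assumption, not the production code): the second-order 7-point stencil for -Δu - (ω/v)² u on a cubic grid with constant velocity, using Dirichlet boundaries in place of the PML layers, which are omitted here for brevity. The default frequency and velocity match the test setup on slide 14; the 1000 m edge length is an interpretation of the domain size given there.

```python
import numpy as np
import scipy.sparse as sp

def helmholtz_7pt(n, freq=12.0, v=1500.0, length=1000.0):
    """7-point second-order discretization of -Laplace(u) - (omega/v)^2 u on a cube."""
    h = length / (n + 1)                           # grid step, Dirichlet boundary
    k2 = (2.0 * np.pi * freq / v) ** 2             # squared wavenumber (omega/v)^2
    d2 = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2   # 1D -d^2/dx^2
    I = sp.eye(n)
    lap = (sp.kron(sp.kron(d2, I), I)
           + sp.kron(sp.kron(I, d2), I)
           + sp.kron(sp.kron(I, I), d2))           # 3D Laplacian on the 7-point stencil
    return (lap - k2 * sp.eye(n**3)).tocsr()       # real symmetric here; PML would make it complex

A = helmholtz_7pt(20)                              # 20^3 = 8000 unknowns
print(A.shape, A.nnz)
```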

  13. Results of testing

  14. Testing complex Helmholtz, n × n × n grid
Test description:
• Computational domain: cube 1000 m^3; V = 1500 m/s, frequency = 12 Hz
• Right-hand side: delta function
• OpenMP switched off
• Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00 GHz, 512 GB RAM
Checked characteristics (n = 141):
• Residual b - Ax
• Memory usage
• Performance
Also investigated: performance and memory consumption for n = 100, …, 300

  15. Validation testing results. Correlation between the relative residual and the internal accuracy (iterative refinement switched off).

  16. Memory testing results
[Figure: correlation between memory storage and internal accuracy.]
• HSS: memory gain up to 2.5 times
• PARDISO: number of non-zeros in the L factors nnz(L) = 3.5*10^9

  17. Performance testing results
• HSS: with iterative refinement, accuracy is equal; factorization time up to 7 times faster; total time up to 2.5 times faster
• PARDISO: factorization time 9045 s, solve time 500 s (100 right-hand sides)

  18. Performance and memory consumption
Test description:
• eps = …
• OpenMP switched off
• n is increased from 100 to 300
[Table 2: trend of increasing memory and factorization time on the cube domain.]

  19. Conclusion
• The first version of the solver supports SLAEs with symmetric matrices in both real and complex arithmetic.
• The effectiveness of the algorithm was demonstrated.
• There are prospects for further development.

  20. Road map
Next development steps:
• HSS threaded (OpenMP) version: Q3 2014
• First hybrid (MPI + OpenMP) version of the solver: Q4 2014; comparison with MUMPS and PARDISO
• Adapt the solver to general sparse matrices (METIS optimization)

  21. Acknowledgments
The research described was partially supported by:
• U.S. Civilian Research & Development Foundation (CRDF) grant RUE1-30034-NO-13
• Russian Foundation for Basic Research (RFBR) grants 14-01-31340, 14-05-31222, 14-05-00049

  22. Thank you for your attention!
