A Large-Grained Parallel Algorithm for Nonlinear Eigenvalue Problems Using Complex Contour Integration

Takeshi Amako, Yusaku Yamamoto and Shao-Liang Zhang, Dept. of Computational Science & Engineering, Nagoya University, Japan.


Presentation Transcript


  1. A Large-Grained Parallel Algorithm for Nonlinear Eigenvalue Problems Using Complex Contour Integration Takeshi Amako, Yusaku Yamamoto and Shao-Liang Zhang Dept. of Computational Science & Engineering Nagoya University, Japan

  2. Outline of the talk • Introduction • The nonlinear eigenvalue problem • Existing algorithms • Our objective • The algorithm • Formulation as a nonlinear equation • Application of Kravanja et al.’s method • Detecting and removing spurious eigenvalues • Numerical results • Accuracy of the computed eigenvalues • Parallel performance • Conclusion

  3. Introduction • The nonlinear eigenvalue problem • Given $A(z) \in \mathbb{C}^{n \times n}$, where z is a complex parameter, • find $z_1 \in \mathbb{C}$ such that $A(z_1) x = 0$ has a nonzero solution $x = x_1$. • $z_1$ and $x_1$ are called an eigenvalue and the corresponding eigenvector, respectively. • Examples • $A(z) = A - zB + z^2 C$ : quadratic eigenvalue problem • $A(z) = A - zB + e^z C$ : general nonlinear eigenvalue problem • Applications • Electronic structure calculation • Nonlinear elasticity • Theoretical fluid dynamics

  4. Existing algorithms • Multivariate Newton’s method and its variants • Locally quadratic convergence • Requires good initial estimates for both $z_1$ and $x_1$, which are difficult to obtain. • Nonlinear Arnoldi methods • Nonlinear Jacobi-Davidson methods • Efficient for large sparse matrices • Not suitable for finding all eigenvalues within a specified region of the complex plane

  5. Our objective • Let • Γ: closed Jordan curve on the complex plane, • $A(z) \in \mathbb{C}^{n \times n}$: analytic function of z in Γ. • We propose an algorithm that • can find all the eigenvalues within Γ, and • has large-grain parallelism. • Assumption: In the following, we mainly consider the case where Γ is a circle centered at the origin with radius r. [Figure: the circle Γ of radius r centered at the origin O of the complex plane; axes Re z and Im z.] • Related work: Sakurai et al. proposed an algorithm for linear generalized eigenvalue problems.

  6. Our approach • The basic idea • Let f(z) = det(A(z)). • Then f(z) is an analytic function of z in Γ and the eigenvalues of A(z) are characterized as the zeros of f(z). • Use Kravanja’s method (Kravanja et al., 1999) to find the zeros of an analytic function.

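  7. Finding zeros of f(z) • Let • $z_1, z_2, \dots, z_m$ : zeros of f(z) in Γ, and • $n_1, n_2, \dots, n_m$ : their multiplicities. • Then f(z) can be written as $f(z) = \prod_{k=1}^{m} (z - z_k)^{n_k} \times g(z)$, where the product is analytic in Γ and g(z) is analytic and nonzero in Γ. • Define the complex moments by $\mu_p = \frac{1}{2\pi i} \oint_{\Gamma} z^p \, \frac{f'(z)}{f(z)} \, dz$ (p = 0, 1, 2, ...). • Then $\mu_p = \sum_{k=1}^{m} n_k z_k^p$, where the left-hand side is computable and the $z_k$ on the right-hand side are unknown.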
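The moment identity on this slide can be checked numerically on a scalar function. The following is a minimal sketch, assuming the standard definition $\mu_p = \frac{1}{2\pi i}\oint_\Gamma z^p f'(z)/f(z)\,dz$ on a circle of radius r centered at the origin; the function name `complex_moments` and the polynomial test function are hypothetical and not taken from the talk.

```python
import cmath

def complex_moments(f, df, r, K, count):
    """Approximate mu_0 .. mu_{count-1} on the circle |z| = r
    with the K-point trapezoidal rule."""
    mu = [0j] * count
    for j in range(K):
        w = r * cmath.exp(2j * cmath.pi * j / K)  # sample point on the circle
        ratio = df(w) / f(w)                      # logarithmic derivative f'/f
        for p in range(count):
            # dz = i*z*dtheta turns z^p dz/(2*pi*i) into z^(p+1) dtheta/(2*pi)
            mu[p] += w ** (p + 1) * ratio / K
    return mu

# Hypothetical test function: a double zero at 0.5 and a simple zero at -0.3i
# inside the unit circle, plus a simple zero at 2 outside it.
f = lambda z: (z - 0.5) ** 2 * (z + 0.3j) * (z - 2.0)
df = lambda z: (2 * (z - 0.5) * (z + 0.3j) * (z - 2.0)
                + (z - 0.5) ** 2 * (z - 2.0)
                + (z - 0.5) ** 2 * (z + 0.3j))

mu = complex_moments(f, df, r=1.0, K=64, count=3)
# Right-hand side of the identity: sum of n_k * z_k^p over zeros inside the circle.
expected = [2 * 0.5 ** p + (-0.3j) ** p for p in range(3)]
```

Only the zeros inside the circle contribute to `expected`; the zero at 2 affects the quadrature error but not the moments themselves.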

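  8. Finding zeros of f(z) (cont'd) • To extract information on $\{z_k\}$ from $\{\mu_p\}$, define the following matrices: $H_m = [\mu_{i+j-2}]_{i,j=1}^{m}$ and $H_m^{<} = [\mu_{i+j-1}]_{i,j=1}^{m}$ (Hankel matrices), $V_m = [z_j^{\,i-1}]_{i,j=1}^{m}$ (Vandermonde matrix), $D_m = \mathrm{diag}(n_1, \dots, n_m)$ and $\Lambda_m = \mathrm{diag}(z_1, \dots, z_m)$. • Then it is easy to see that $H_m = V_m D_m V_m^{T}$ and $H_m^{<} = V_m D_m \Lambda_m V_m^{T}$.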

  9. Finding zeros of f(z) (cont'd) • Noting that $V_m$ and $D_m$ are nonsingular, we have the following equivalence relation: λ is an eigenvalue of $H_m^{<} - \lambda H_m$ ⇔ λ is an eigenvalue of $\Lambda_m - \lambda I$ ⇔ ∃k, $\lambda = z_k$. • That is, we can find the zeros of f(z) in Γ by • computing the complex moments $\mu_0, \mu_1, \dots, \mu_{2m-1}$, • constructing $H_m$ and $H_m^{<}$, and • computing the eigenvalues of $H_m^{<} - \lambda H_m$.
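The three steps above can be sketched as follows. This assumes the usual Hankel layout, with entry (i, j) of $H_m$ equal to $\mu_{i+j-2}$ and entry (i, j) of $H_m^{<}$ equal to $\mu_{i+j-1}$, and it assumes m is known exactly so that $H_m$ is nonsingular. The helper name `zeros_from_moments` and the sample zeros are hypothetical.

```python
import numpy as np

def zeros_from_moments(mu, m):
    """Recover z_1..z_m from the moments mu_0..mu_{2m-1} via the
    pencil H_shift - lambda * H (0-based: H[i, j] = mu[i + j])."""
    mu = np.asarray(mu, dtype=complex)
    H = np.array([[mu[i + j] for j in range(m)] for i in range(m)])
    H_shift = np.array([[mu[i + j + 1] for j in range(m)] for i in range(m)])
    # H is nonsingular when the m zeros are distinct, so the pencil
    # eigenvalues are the eigenvalues of H^{-1} H_shift.
    return np.linalg.eigvals(np.linalg.solve(H, H_shift))

# Hypothetical zeros and multiplicities, used only to build exact moments.
z_true = np.array([0.5, -0.3j, 0.2 + 0.4j])
n_true = np.array([2, 1, 1])
m = 3
mu = [np.sum(n_true * z_true ** p) for p in range(2 * m)]

z = zeros_from_moments(mu, m)
```

Note that the multiplicities enter the moments but not the recovered values: the pencil returns each distinct zero once.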

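  10. Application to the nonlinear eigenvalue problem • In our case, f(z) = det(A(z)) and $\frac{f'(z)}{f(z)} = \mathrm{Tr}\!\left(A(z)^{-1} \frac{dA(z)}{dz}\right)$. • By applying the trapezoidal rule with K points, we have $\mu_p \approx \frac{1}{K} \sum_{j=0}^{K-1} \omega_j^{\,p+1} \, \mathrm{Tr}\!\left(A(\omega_j)^{-1} \frac{dA}{dz}(\omega_j)\right)$, where $\omega_j = r \exp\!\left(\frac{2\pi i j}{K}\right)$ (j = 0, 1, ..., K-1). [Figure: the circle Γ in the complex plane, centered at the origin O; axes Re z and Im z.]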
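For a matrix-valued A(z), the logarithmic derivative of the determinant can be evaluated as a trace, by Jacobi's formula $f'(z)/f(z) = \mathrm{Tr}(A(z)^{-1} A'(z))$. The sketch below applies the K-point trapezoidal rule on a circle of radius r; the function name `matrix_moments` and the small linear test case are hypothetical, and a linear solve stands in for the explicit inverse mentioned in the talk.

```python
import numpy as np

def matrix_moments(A, dA, r, K, count):
    """Approximate mu_0 .. mu_{count-1} for f(z) = det(A(z)) on |z| = r."""
    mu = np.zeros(count, dtype=complex)
    powers = np.arange(count) + 1
    for j in range(K):
        w = r * np.exp(2j * np.pi * j / K)           # sample point on the circle
        t = np.trace(np.linalg.solve(A(w), dA(w)))   # Tr(A(w)^{-1} A'(w))
        mu += w ** powers * t / K
    return mu

# Hypothetical linear test case A(z) = A0 - z*I, whose eigenvalues are those
# of A0: 0.2 and -0.4 lie inside the unit circle, 3.0 lies outside.
A0 = np.array([[0.2, 1.0, 0.0],
               [0.0, -0.4, 1.0],
               [0.0, 0.0, 3.0]])
I3 = np.eye(3)
A = lambda z: A0 - z * I3
dA = lambda z: -I3

mu = matrix_moments(A, dA, r=1.0, K=64, count=4)
expected = np.array([0.2 ** p + (-0.4) ** p for p in range(4)])
```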

  11. The algorithm [Algorithm listing, annotated: the computationally intensive part has large-grain parallelism.]
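The steps described on the preceding slides can be combined into one routine. The sketch below is an illustration under stated assumptions, not the authors' listing: a Python thread pool stands in for the C and MPI implementation used in the talk, the name `contour_eigenvalues` is hypothetical, and the Hankel size M is set equal to the true number of eigenvalues inside the circle so that the Hankel matrix is nonsingular. The per-sample-point trace evaluations are mutually independent, which is where the large-grain parallelism comes from.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def contour_eigenvalues(A, dA, r, K, M, workers=4):
    """Eigenvalue candidates of A(z) inside |z| < r from 2M complex moments."""
    omega = r * np.exp(2j * np.pi * np.arange(K) / K)  # K sample points

    def trace_at(w):
        # One independent task per sample point: the expensive part.
        return np.trace(np.linalg.solve(A(w), dA(w)))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        t = np.array(list(pool.map(trace_at, omega)))

    # mu_p ~ (1/K) * sum_j omega_j^(p+1) * t_j for p = 0 .. 2M-1
    mu = np.array([np.sum(omega ** (p + 1) * t) / K for p in range(2 * M)])

    H = np.array([[mu[i + j] for j in range(M)] for i in range(M)])
    H_shift = np.array([[mu[i + j + 1] for j in range(M)] for i in range(M)])
    return np.linalg.eigvals(np.linalg.solve(H, H_shift))

# Hypothetical 3x3 instance of the form A - z*I + eps*B(z), with B(z) an
# antidiagonal matrix whose antidiagonal entries are e^z.
eps = 0.01
A0 = np.diag([0.2, -0.4, 3.0]).astype(complex)
J = np.fliplr(np.eye(3))
I3 = np.eye(3)
A = lambda z: A0 - z * I3 + eps * np.exp(z) * J
dA = lambda z: -I3 + eps * np.exp(z) * J

z = contour_eigenvalues(A, dA, r=1.0, K=64, M=2)
# A(z_k) should be numerically singular at each computed eigenvalue.
sigma_min = [np.linalg.svd(A(zk), compute_uv=False)[-1] for zk in z]
```

In this small instance two eigenvalues lie inside the unit circle, near 0.2 and -0.4, so M = 2 matches m. When M exceeds m the Hankel matrix becomes nearly singular and a generalized eigenvalue solver would be the safer choice.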

  12. Detecting and removing spurious eigenvalues • Usually, we do not know m, the number of eigenvalues of A(z) in Γ, in advance and use some estimate M instead. • When M > m, the eigenvalues of $H_m^{<} - \lambda H_m$ include spurious solutions that do not correspond to any eigenvalue of A(z). • To detect them, we compute the corresponding eigenvector by inverse iteration and evaluate the relative residual, defined by: relative residual = [formula not preserved]. • Of course, this quantity can also be used to check the accuracy of the computed eigenvalues.
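A residual-based filter of this kind can be sketched as follows. Two assumptions are made here and are not taken from the talk: the relative residual is taken to be $\|A(z)x\|_2 / (\|A(z)\|_2 \|x\|_2)$, and the minimizing vector x is obtained from a singular value decomposition rather than by inverse iteration, so the residual equals the ratio of the smallest to the largest singular value of A(z). The function names and the tolerance are hypothetical.

```python
import numpy as np

def relative_residual(Az):
    """Smallest value of ||Az x|| / (||Az|| ||x||) over nonzero x,
    which equals sigma_min / sigma_max (SVD used instead of inverse iteration)."""
    s = np.linalg.svd(Az, compute_uv=False)  # singular values, descending
    return s[-1] / s[0]

def filter_spurious(A, candidates, tol=1e-8):
    """Keep only candidates at which A(z) is numerically singular."""
    return [z for z in candidates if relative_residual(A(z)) < tol]

# Hypothetical linear example: eigenvalues 0.2, -0.4 and 3.0.
A0 = np.array([[0.2, 0.5, 0.0],
               [0.0, -0.4, 0.5],
               [0.0, 0.0, 3.0]])
A = lambda z: A0 - z * np.eye(3)

candidates = [0.2, -0.4, 0.55]   # 0.55 plays the role of a spurious value
kept = filter_spurious(A, candidates)
residuals = [relative_residual(A(z)) for z in candidates]
```

The gap between the residuals of true and spurious candidates is many orders of magnitude in this example, so the exact tolerance is not critical.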

  13. Numerical results • Test problem • $A(z) = A - zI + \varepsilon B(z)$, where • A : real random nonsymmetric matrix • B(z) : antidiagonal matrix with antidiagonal elements $e^z$ • ε : parameter to specify the strength of nonlinearity • Parameters • n = 500, 1000, 2000 • ε = 0, $10^{-4}$, $10^{-3}$, $10^{-2}$, $10^{-1}$ • Computational environment • Fujitsu HPC2500 (SPARC 64IV), 1-16 processors • Program written in C with MPI • LAPACK routines were used to compute $(A(z))^{-1}$ and the eigenvalues of $H_m^{<} - \lambda H_m$.
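A test problem of this form can be set up as below. This is a sketch under assumptions: the talk does not state how the random matrix is distributed or scaled, so the standard normal entries divided by sqrt(n) are a guess, and the name `make_test_problem` is hypothetical. The derivative dA/dz is returned as well because the moment computation needs it.

```python
import numpy as np

def make_test_problem(n, eps, seed=0):
    """Return (A0, A, dA) with A(z) = A0 - z*I + eps*B(z), where B(z) is
    antidiagonal with entries e^z. The scaling of A0 is an assumption."""
    rng = np.random.default_rng(seed)
    A0 = rng.standard_normal((n, n)) / np.sqrt(n)  # real, random, nonsymmetric
    J = np.fliplr(np.eye(n))                        # ones on the antidiagonal
    I = np.eye(n)
    A = lambda z: A0 - z * I + eps * np.exp(z) * J
    dA = lambda z: -I + eps * np.exp(z) * J         # derivative with respect to z
    return A0, A, dA

def relative_sigma_min(Az):
    s = np.linalg.svd(Az, compute_uv=False)
    return s[-1] / s[0]

# With eps = 0 the problem reduces to the ordinary eigenvalue problem of A0,
# so A(z) must be numerically singular at every eigenvalue of A0.
A0, A, dA = make_test_problem(n=6, eps=0.0)
lam = np.linalg.eigvals(A0)
rel_res = [relative_sigma_min(A(l)) for l in lam]
```

The ε = 0 case gives a convenient correctness check, since the reference eigenvalues are available from a standard dense solver.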

  14. Accuracy of the computed eigenvalues • Parameters • n = 500 and ε = 0.1 • r = 0.85, K = 128 and M = 11. • There are 7 eigenvalues in Γ. • Results • Our algorithm succeeded in locating all the eigenvalues in Γ. • The relative residuals were all under $10^{-10}$. • Similar results were obtained for the other cases. [Figure: eigenvalues in the complex plane; axes Re z and Im z.]

  15. Effect of K and M on the accuracy • Effect of the number of sample points K • Usually K = 128 gives sufficient accuracy. • Effect of the Hankel matrix size M • It is better to take M slightly larger than the number of eigenvalues within Γ (7 in this case). • This is to mitigate the perturbation from eigenvalues outside Γ. [Figures: residuals as a function of K; residuals as a function of M.]

  16. Detecting and removing spurious eigenvalues • Parameters • n = 1000 and ε = 0.01 • r = 0.7, K = 128 and M = 10. • There are 9 eigenvalues in Γ. • Eigenvalues of $H_m^{<} - \lambda H_m$ • 10 eigenvalues were found within Γ. • For 9 of the eigenvalues, the residual was less than $10^{-11}$. • For one eigenvalue, the residual was $10^{-2}$; this is the spurious eigenvalue. [Figure: computed eigenvalues in the complex plane, with the spurious eigenvalue marked; axes Re z and Im z.]

  17. Parallel performance • Performance on Fujitsu HPC2500 • Matrix size: n = 500, 1000, 2000 • Number of processors: P = 1, 2, 4, 8, 16 • Almost linear speedup was obtained in all cases due to large-grain parallelism. [Figure: execution time (sec) versus number of processors.]

  18. Parallel performance (cont'd) • Performance in a Grid environment • Matrix size: n = 1000 • Machine: Intel Xeon Cluster • Master-worker type parallelization using OmniRPC (GridRPC) • Good scalability was obtained for up to 14 processors. [Figure: execution time (0:00:00 to 2:00:00) and speedup (0 to 16) versus number of processors (2 to 14).]

  19. Summary of this study • We proposed a new algorithm for the nonlinear eigenvalue problem based on complex contour integration. • Our algorithm can find all the eigenvalues within a closed curve on the complex plane. Moreover, it has large-grain parallelism and is expected to show excellent parallel performance. • These advantages have been confirmed by numerical experiments.

  20. Future work • Performance evaluation on large-scale grid environments. • Application to practical problems. • Computation of scaling exponent in theoretical fluid dynamics • Development of an efficient algorithm for computing