
Gaussian Elimination

By Yequn Zhang, Yu Zhang


Presentation Transcript


  1. Gaussian Elimination By Yequn Zhang, Yu Zhang

  2. Contents • Introduction • Problem Analysis • Proposed Algorithm • Evaluation

  3. Contents • Introduction • Problem Analysis • Proposed Algorithm • Evaluation

  4. Gaussian Elimination • Forward Elimination • Back Substitution
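The two phases named on this slide can be sketched on the CPU as follows. This is a minimal reference version, not the authors' GPU implementation; the function name `gaussian_elimination` is hypothetical, and partial pivoting is included because the later slides list "Find Max Row" and "Swap" steps.

```python
def gaussian_elimination(a, b):
    """Solve a x = b: forward elimination with partial pivoting, then back substitution."""
    n = len(a)
    # Work on an augmented copy [a | b] so the caller's data is left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    # Forward elimination: reduce the matrix to upper-triangular form.
    for k in range(n):
        # Find the row with the largest absolute entry in column k.
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]  # swap the pivot row into place
        for i in range(k + 1, n):
            coeff = m[i][k] / m[k][k]  # elimination coefficient for row i
            for j in range(k, n + 1):
                m[i][j] -= coeff * m[k][j]

    # Back substitution: solve for the unknowns from the last row upwards.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x
```

For example, `gaussian_elimination([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])` solves a small 3x3 system whose exact solution is x = (2, 3, -1).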

  5. Contents • Introduction • Problem Analysis • Proposed Algorithm • Evaluation

  6. Problem Analysis • The data size processed by the kernels changes continuously • It is difficult to find a block size that avoids divergence • Block-based approach • Assign part of the computation to the CPU, leaving the irregularity to the CPU • Manually make the data size change in steps of the block size • The number of blocks per grid is then easy to set
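One way to read the last two bullets: if the region handed to the GPU shrinks by exactly one block size per step, the grid size is always a whole number of blocks. The sketch below illustrates that arithmetic only. The function name `block_schedule` is hypothetical, the block size of 64 is taken from the back-substitution slide, and the sketch assumes the matrix dimension is a multiple of the block size.

```python
def block_schedule(n, block_size=64):
    """List (panel_start, trailing_columns, blocks_per_grid) for each block step.

    Assumption of this sketch: n is a multiple of block_size, so the trailing
    region always divides evenly into blocks and no thread is left idle.
    """
    if n % block_size != 0:
        raise ValueError("this sketch assumes n is a multiple of block_size")
    schedule = []
    for start in range(0, n, block_size):
        # Columns to the right of the current block-wide panel.
        trailing = n - (start + block_size)
        schedule.append((start, trailing, trailing // block_size))
    return schedule
```

With `n = 256` the trailing region goes 192, 128, 64, 0 columns, i.e. 3, 2, 1, 0 blocks per grid. The final step has nothing left for the GPU, which is consistent with the last step being handled on the CPU.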

  7. Contents • Introduction • Problem Analysis • Proposed Algorithm • Evaluation

  8. Forward Elimination • A block-based approach • Try to avoid divergence • Try to use the GPU • Try to be fine-grained

  9. K1: Find Max Row

  10. Now start to eliminate the block of data on the CPU • CPU: Swap

  11. Calculate coefficients

  12. Elimination on CPU

  13. K1: Calculate Coefficients

  14. K2: Elimination on CPU

  15. K3: Swap on GPU

  16. K4: Elimination on GPU

  17. K5: Elimination on GPU

  18. Intra-block loop

  19. Inter-block loop

  20. Last inter-block loop processed on CPU
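The loop structure suggested by slides 9 to 20 can be sketched as a blocked forward elimination. This is a sequential Python illustration of one plausible arrangement, not the authors' kernels. The name `blocked_forward_elimination` is hypothetical, and the split into "panel work" and "deferred trailing update" is an assumption about how the CPU and GPU parts divide.

```python
def blocked_forward_elimination(m, block_size):
    """In-place blocked forward elimination of an augmented matrix m (n rows, n + 1 columns).

    Multipliers are kept below the diagonal instead of zeros, so the update of
    the columns to the right of each panel can be applied after the panel is done.
    """
    n = len(m)
    width = len(m[0])
    for start in range(0, n, block_size):              # inter-block loop
        end = min(start + block_size, n)
        for k in range(start, end):                    # intra-block loop
            # Find max row: largest absolute entry in column k at or below the diagonal.
            p = max(range(k, n), key=lambda r: abs(m[r][k]))
            if m[p][k] == 0.0:
                raise ValueError("matrix is singular")
            m[k], m[p] = m[p], m[k]                    # swap whole rows
            for i in range(k + 1, n):
                coeff = m[i][k] / m[k][k]              # calculate coefficient
                m[i][k] = coeff                        # store it for the deferred update
                for j in range(k + 1, end):            # eliminate inside the panel only
                    m[i][j] -= coeff * m[k][j]
        # Deferred update of the columns right of the panel (regular, block-sized work;
        # the part a GPU kernel would most naturally take over in this sketch).
        for k in range(start, end):
            for i in range(k + 1, n):
                coeff = m[i][k]
                for j in range(end, width):
                    m[i][j] -= coeff * m[k][j]
    return m
```

After the call, the upper triangle and the last column of `m` hold the triangular system for back substitution; the entries below the diagonal hold the stored multipliers and should be ignored.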

  21. Back Substitution • Launch a kernel when the number of coefficients per row exceeds four block sizes (64*4=256) • Fine-grained, similar to forward elimination: part on the CPU and part on the GPU
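The threshold rule on this slide can be sketched as a simple dispatch inside the back-substitution loop. The code below runs entirely on the CPU: `chunked_dot` is a hypothetical stand-in for the GPU kernel, and `back_substitute`, `BLOCK_SIZE`, and `KERNEL_THRESHOLD` are illustrative names. Only the values 64 and 256 come from the slide.

```python
BLOCK_SIZE = 64
KERNEL_THRESHOLD = 4 * BLOCK_SIZE  # 256 coefficients per row, as on the slide


def chunked_dot(row, x, lo, hi, chunk=BLOCK_SIZE):
    """Stand-in for a GPU kernel: sum row[j] * x[j] for lo <= j < hi in block-sized chunks."""
    partials = [
        sum(row[j] * x[j] for j in range(s, min(s + chunk, hi)))
        for s in range(lo, hi, chunk)
    ]
    return sum(partials)


def back_substitute(u, rhs, threshold=KERNEL_THRESHOLD):
    """Solve the upper-triangular system u x = rhs; return (x, rows sent to the kernel path)."""
    n = len(u)
    x = [0.0] * n
    kernel_rows = 0
    for i in range(n - 1, -1, -1):
        count = n - 1 - i  # coefficients multiplying already-solved unknowns in row i
        if count > threshold:
            s = chunked_dot(u[i], x, i + 1, n)  # long row: "launch the kernel"
            kernel_rows += 1
        else:
            s = sum(u[i][j] * x[j] for j in range(i + 1, n))  # short row: stay on the CPU
        x[i] = (rhs[i] - s) / u[i][i]
    return x, kernel_rows
```

Under these assumptions, a 300-unknown system sends only its top 43 rows down the kernel path; the remaining rows are short enough to stay on the CPU.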

  22. Contents • Introduction • Problem Analysis • Proposed Algorithm • Evaluation

  23. Block size effect

  24. The contribution of swap and find max row • Is it necessary to implement every part on GPU?

  25. Performance breakdown • Contribution of each part to the total performance, including the kernels as well as the CPU part

  26. Speedup

  27. Questions?
