Accelerated Linear Algebra Libraries
Accelerated Linear Algebra Libraries James Wynne III NCCS User Assistance
Accelerated Linear Algebra Libraries • Collections of functions that perform mathematical operations on matrices • The library designers have rewritten the standard LAPACK functions to use GPU accelerators, which speeds up execution on large matrices
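To make "standard LAPACK function" concrete: SGESV, the routine used in the examples in this presentation, solves A * X = B in single precision by LU factorization with partial pivoting. The sketch below is a plain CPU reference written for illustration only. The name `ref_sgesv` is hypothetical, and it is not how LAPACK or MAGMA implement the routine internally; it only mirrors the calling convention (column-major storage, leading dimensions, 1-based pivots, an info-style return value).

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical CPU reference that mirrors the SGESV calling convention.
 * Solves A * X = B in place.
 *   a    : n-by-n matrix, column-major, leading dimension lda
 *   b    : n-by-nrhs right-hand sides, column-major, leading dimension ldb
 *   ipiv : on exit, 1-based pivot row chosen at each step
 * Returns 0 on success, or k > 0 if the k-th pivot is exactly zero. */
int ref_sgesv(int n, int nrhs, float *a, int lda, int *ipiv, float *b, int ldb)
{
    for (int k = 0; k < n; ++k) {
        /* Partial pivoting: find the largest magnitude in column k. */
        int p = k;
        for (int i = k + 1; i < n; ++i)
            if (fabsf(a[i + (size_t)k * lda]) > fabsf(a[p + (size_t)k * lda]))
                p = i;
        ipiv[k] = p + 1;
        if (a[p + (size_t)k * lda] == 0.0f)
            return k + 1; /* singular: stop without computing a solution */

        if (p != k) {
            /* Swap rows k and p of A and of B. */
            for (int j = 0; j < n; ++j) {
                float t = a[k + (size_t)j * lda];
                a[k + (size_t)j * lda] = a[p + (size_t)j * lda];
                a[p + (size_t)j * lda] = t;
            }
            for (int j = 0; j < nrhs; ++j) {
                float t = b[k + (size_t)j * ldb];
                b[k + (size_t)j * ldb] = b[p + (size_t)j * ldb];
                b[p + (size_t)j * ldb] = t;
            }
        }

        /* Eliminate below the pivot, updating the right-hand sides too. */
        for (int i = k + 1; i < n; ++i) {
            float f = a[i + (size_t)k * lda] / a[k + (size_t)k * lda];
            a[i + (size_t)k * lda] = f; /* keep the multiplier (L factor) */
            for (int j = k + 1; j < n; ++j)
                a[i + (size_t)j * lda] -= f * a[k + (size_t)j * lda];
            for (int j = 0; j < nrhs; ++j)
                b[i + (size_t)j * ldb] -= f * b[k + (size_t)j * ldb];
        }
    }

    /* Back substitution with the upper-triangular factor U. */
    for (int j = 0; j < nrhs; ++j) {
        for (int i = n - 1; i >= 0; --i) {
            float s = b[i + (size_t)j * ldb];
            for (int c = i + 1; c < n; ++c)
                s -= a[i + (size_t)c * lda] * b[c + (size_t)j * ldb];
            b[i + (size_t)j * ldb] = s / a[i + (size_t)i * lda];
        }
    }
    return 0;
}
```

Calling `ref_sgesv(3, 1, a, 3, ipiv, b, 3)` has the same argument order as the SGESV calls shown in the slides, with the info value returned instead of written through a pointer.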
Accelerated Linear Algebra Libraries MAGMA Host Interface
MAGMA - Host • MAGMA stands for Matrix Algebra on GPU and Multicore Architectures • Developed by the Innovative Computing Laboratory at the University of Tennessee • Host interface allows easy porting from CPU libraries (like LAPACK) to MAGMA’s accelerated library • Automatically manages data allocation and transfer between CPU (Host) and GPU (Device)
MAGMA - Host • Fortran: • To call MAGMA functions from Fortran, an interface block must be written for each MAGMA function that is called. These interfaces are defined in the file magma.f90 • Example:
    module magma
      Interface
        Integer Function magma_sgesv(n,nrhs,…) &
            BIND(C,name="magma_sgesv")
          use iso_c_binding
          implicit none
          integer(c_int), value :: n
          integer(c_int), value :: nrhs
          …
        end Function
      end Interface
    end module
MAGMA - Host • Pseudo-code for a simple SGESV operation in MAGMA:
    Program SGESV
      !Include the module that hosts your interface
      use magma
      use iso_c_binding
      !Define your arrays and variables
      Real(C_FLOAT) :: A(3,3), b(3)
      Integer(C_INT) :: piv(3), ok, status
      !Fill your `A` and `b` arrays, then call MAGMA_SGESV
      status = magma_sgesv(3,1,A,3,piv,b,3,ok)
      !Loop through and write(*,*) the contents of array `b`
    end Program
MAGMA - Host • Before compiling, make sure the MAGMA module, the CUDA toolkit, and the GNU programming environment are loaded • magma.f90: Contains the interface block module • sgesv.f90: Contains the Fortran source code • To compile:
    $ module swap PrgEnv-pgi PrgEnv-gnu
    $ module load cudatoolkit magma
    $ ftn magma.f90 -lcuda -lmagma -lmagmablas sgesv.f90
MAGMA - Host • C: • C source code needs no interface block, unlike Fortran • Simply #include <magma.h> in your code • When declaring variables to use with MAGMA functions, use magma_int_t instead of C’s int type. Matrices for MAGMA’s SGESV are of type float
MAGMA - Host • Example code for C:
    #include <magma.h>
    #include <stdio.h>
    int main() {
      //Define arrays and variables for MAGMA
      float b[3], A[3][3];
      magma_int_t m = 3, n = 1, piv[3], info;
      //Fill matrices A and b (column-major: A[j][i] holds row i, column j)
      //and call magma_sgesv()
      magma_sgesv(m, n, &A[0][0], m, piv, b, m, &info);
      //Loop through and print out returned array b
      return 0;
    }
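One detail that is easy to miss when filling A from C: LAPACK-style routines, MAGMA's included, read matrices in column-major (Fortran) order, while a C two-dimensional array is row-major. Passing a row-major array unchanged hands the solver the transpose. The sketch below shows one way to repack the data; `to_col_major` is a hypothetical helper name, not part of MAGMA.

```c
#include <stddef.h>

/* Hypothetical helper: copy a row-major rows-by-cols matrix into a
 * column-major buffer with leading dimension ld (ld >= rows).
 * Element (i, j) ends up at col_major[i + j * ld]. */
void to_col_major(int rows, int cols, const float *row_major,
                  float *col_major, int ld)
{
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            col_major[i + (size_t)j * ld] = row_major[(size_t)i * cols + j];
}
```

For a square system an alternative is to fill the C array already transposed, so that `A[j][i]` holds row i, column j, which avoids the extra copy.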
MAGMA - Host • Before compiling, make sure the MAGMA module, the CUDA toolkit, and the GNU programming environment are loaded • To compile:
    $ module swap PrgEnv-pgi PrgEnv-gnu
    $ module load cudatoolkit magma
    $ cc -lcuda -lmagma -lmagmablas sgesv.c
Accelerated Linear Algebra Libraries MAGMA Device Interface
MAGMA - Device • MAGMA Device interface allows direct control over how the GPU (device) is managed • Memory allocation and transfer • Keeping matrices on the device
MAGMA - Device • Fortran: • To run MAGMA device functions from Fortran, an Interface block needs to be written for each MAGMA function that’s being called. This is not required in C source code • Device functions suffixed with _gpu • CUDA Fortran used to manage memory on the device • All interface blocks need to be defined in a module (module magma) • If module exists in a separate file, file extension must be .cuf, just like the source file
MAGMA - Device • Example:
    module magma
      Interface
        Integer Function magma_sgesv_gpu(n,nrhs,dA…) &
            BIND(C,name="magma_sgesv_gpu")
          use iso_c_binding
          use cudafor
          implicit none
          integer(c_int), value :: n
          integer(c_int), value :: nrhs
          real(c_float), device :: dA(*)
          …
        end Function
      end Interface
      …
MAGMA - Device • The MAGMA initialize function is also needed (defined in the same interface block module):
      …
      Interface
        Integer Function magma_init() &
            BIND(C,name="magma_init")
          use iso_c_binding
          implicit none
        end Function
      end Interface
    end module
MAGMA - Device • Pseudo-code for a simple SGESV operation in MAGMA:
    Program SGESV
      !Include the module that hosts your interface
      use magma
      use cudafor
      use iso_c_binding
      !Define your arrays and variables
      Real(C_FLOAT) :: A(3,3), b(3)
      !Device copies of `A` and `b`
      Real(C_FLOAT), device :: dA(3,3)
      Real(C_FLOAT), device :: dB(3)
      Integer(C_INT) :: piv(3), ok, status
MAGMA - Device • Pseudo-code for a simple SGESV operation in MAGMA (continued):
      !Fill your `A` and `b` arrays, then initialize MAGMA
      status = magma_init()
      !Copy filled host arrays `A` and `b` to `dA` and `dB`
      !using CUDA Fortran
      dA = A
      dB = b
      !Call the device function
      status = magma_sgesv_gpu(3,1,dA,3,piv,dB,3,ok)
      !Copy results back to CPU
      b = dB
      !Loop through and write(*,*) the contents of array `b`
    end Program
MAGMA - Device • Before compiling, make sure the MAGMA module, the CUDA toolkit, and the PGI programming environment are loaded • magma.cuf: Contains the module of interface blocks • sgesv.cuf: Contains the Fortran source • To compile:
    $ module load cudatoolkit magma
    $ ftn magma.cuf -lcuda -lmagma -lmagmablas sgesv.cuf
MAGMA - Device • C: • C device source code needs no interface block, unlike Fortran • Simply #include <magma.h> in your code • When declaring variables to use with MAGMA functions, use magma_int_t instead of C’s int type. Matrices for MAGMA’s SGESV are of type float • Before running any MAGMA Device code, magma_init() must be called.
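MAGMA - Device • C: • To interact with the device (allocate matrices, transfer data, etc.) use the built-in MAGMA functions. The s prefix selects single precision, matching the float matrices used with SGESV; the d-prefixed variants are for double • Allocate on the device: magma_smalloc() • Copy matrix to device: magma_ssetmatrix() • Copy matrix to host: magma_sgetmatrix()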
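The matrix copy calls take a separate leading dimension for the source and for the destination, so the two buffers need not have the same column stride. The host-only sketch below imitates that indexing with memcpy so it can be run without a GPU. Both `copy_matrix_ld` and `round_up` are hypothetical names, not MAGMA functions, and padding the destination stride to a multiple of 32 is an assumption chosen here only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical host-only stand-in for a strided matrix copy: copies an
 * m-by-n column-major matrix from src (leading dimension ld_src) to dst
 * (leading dimension ld_dst). Padding rows in dst are left untouched. */
void copy_matrix_ld(int m, int n, const float *src, int ld_src,
                    float *dst, int ld_dst)
{
    for (int j = 0; j < n; ++j)
        memcpy(dst + (size_t)j * ld_dst,
               src + (size_t)j * ld_src,
               (size_t)m * sizeof(float));
}

/* Round x up to the next multiple of `multiple` (assumed > 0). */
int round_up(int x, int multiple)
{
    return ((x + multiple - 1) / multiple) * multiple;
}
```

With m = 3 and a padded stride of `round_up(3, 32)`, each column of the destination starts 32 floats after the previous one, and only the first 3 entries of each column carry data.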
MAGMA - Device • Example code for C:
    #include <magma.h>
    #include <stdio.h>
    int main() {
      //Define arrays and variables for MAGMA
      float b[3], A[3][3];
      float *A_d, *b_d; //Device pointers
      magma_int_t m = 3, n = 1, piv[3], info;
      //Initialize MAGMA before any other MAGMA call
      magma_init();
      //Fill matrices A and b and allocate device matrices
      magma_smalloc(&A_d, m*m);
      magma_smalloc(&b_d, m);
MAGMA - Device • Example code for C (continued):
      //Transfer matrices to device
      magma_ssetmatrix(m, m, &A[0][0], m, A_d, m);
      magma_ssetmatrix(m, n, b, m, b_d, m);
      //Call the device sgesv function
      magma_sgesv_gpu(m, n, A_d, m, piv, b_d, m, &info);
      //Copy back computed matrix
      magma_sgetmatrix(m, n, b_d, m, b, m);
      //Loop through and print out returned array b
      //Release device memory
      magma_free(A_d);
      magma_free(b_d);
      return 0;
    }
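The last argument of the solver call receives a status code that the example never inspects. Under the LAPACK convention for SGESV, 0 means success, a negative value -i means the i-th argument was illegal, and a positive value i means the pivot U(i,i) was exactly zero, so no solution was computed. The sketch below assumes the MAGMA routine follows that same convention, which is worth confirming against the MAGMA documentation; `classify_info` and `describe_info` are hypothetical helper names.

```c
#include <stdio.h>

typedef enum { SOLVE_OK, SOLVE_BAD_ARGUMENT, SOLVE_SINGULAR } solve_status;

/* Classify an info value using the LAPACK SGESV convention
 * (assumed here to carry over to the MAGMA routines). */
solve_status classify_info(int info)
{
    if (info == 0)
        return SOLVE_OK;
    return info < 0 ? SOLVE_BAD_ARGUMENT : SOLVE_SINGULAR;
}

/* Write a short human-readable description of info into buf. */
void describe_info(int info, char *buf, size_t len)
{
    switch (classify_info(info)) {
    case SOLVE_OK:
        snprintf(buf, len, "solve succeeded");
        break;
    case SOLVE_BAD_ARGUMENT:
        snprintf(buf, len, "argument %d had an illegal value", -info);
        break;
    case SOLVE_SINGULAR:
        snprintf(buf, len, "U(%d,%d) is exactly zero; no solution computed",
                 info, info);
        break;
    }
}
```

In the example program this would be called right after the solver returns, before the result array is copied back and printed.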
MAGMA - Device • Before compiling, make sure the MAGMA module, the CUDA toolkit, and the GNU programming environment are loaded • To compile:
    $ module swap PrgEnv-pgi PrgEnv-gnu
    $ module load cudatoolkit magma
    $ cc -lcuda -lmagma -lmagmablas sgesv_gpu.c