
High Level OpenCL Implementation



Presentation Transcript


  1. High Level OpenCL Implementation By: Matthew Royle Supervisor: Prof. Shaun Bangay

  2. Introduction • Multi-core CPUs • Sequential algorithms to parallel algorithms • GPUs used for more than just graphics • Use of GPGPUs (General-Purpose Graphics Processing Units)

  3. Introduction Cont… • Parallel programming languages for specific architectures, namely NVIDIA’s CUDA • Lack of a multi-platform open language • The OpenCL (Open Computing Language) standard • Heterogeneous parallel programming

  4. Problem Statement • Parallel nature of GPUs • No existing OpenCL implementation • Implement OpenCL using existing technologies • High level translator • Use parallel frameworks

  5. Parallel Nature of GPUs

  6. Rationale and Motivation • GPU most likely form of implementation • NVIDIA and AMD plan to include OpenCL • Future Apple iPhones • Lack of implementation on CPU architecture

  7. Project Aims • Select a parallel processing framework • Create a high level translator • Create valid tests • Run created tests

  8. Proposed Implementation Method - OpenCL
       __kernel int add_vect();                         // create computation unit
       cl_cmd_queue cmd_queue = CreateCommandQueue();   // create computation queue
       clEnqueueTask(kernel, i);                        // enqueue task and execute

  9. Proposed Implementation Method – C
       cl_cmd_queue CreateCommandQueue() { return cmd_queue; }
       void clEnqueueTask(kernel, i) { cmd_queue[i] = kernel; }
       #pragma omp parallel for
       for (int k = 0; k < cmd_queue.length; k++)
           Execute(cmd_queue[k]);
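The slide's pseudocode can be turned into compilable C along the following lines. This is only a sketch under stated assumptions: the fixed capacity `MAX_TASKS`, the `task_fn` kernel signature, the `cmd_queue_t` struct, and the helper names `create_command_queue`, `enqueue_task`, `run_queue`, and `square_slot` are all hypothetical stand-ins and are neither part of the presentation nor of the real OpenCL API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 64                 /* assumed fixed queue capacity */

typedef void (*task_fn)(int slot);   /* hypothetical kernel signature */

typedef struct {
    task_fn tasks[MAX_TASKS];
    int     length;                  /* number of slots in use */
} cmd_queue_t;

/* Return an empty queue (stands in for the slide's CreateCommandQueue). */
static cmd_queue_t create_command_queue(void) {
    cmd_queue_t q = { {NULL}, 0 };
    return q;
}

/* Store a kernel in slot i (stands in for the slide's clEnqueueTask). */
static void enqueue_task(cmd_queue_t *q, task_fn kernel, int i) {
    assert(i >= 0 && i < MAX_TASKS);
    q->tasks[i] = kernel;
    if (i >= q->length) q->length = i + 1;
}

/* Run every queued kernel; OpenMP shares the iterations among threads. */
static void run_queue(const cmd_queue_t *q) {
    #pragma omp parallel for
    for (int k = 0; k < q->length; k++) {
        if (q->tasks[k] != NULL) q->tasks[k](k);
    }
}

/* Example kernel: each slot writes only its own element, so no locking. */
static int results[MAX_TASKS];
static void square_slot(int slot) { results[slot] = slot * slot; }
```

With GCC the loop runs in parallel when the file is compiled with `-fopenmp`; without that flag the pragma is ignored and the queue is drained sequentially, which gives the same results here because the example kernels do not share data.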

  10. Possible Tests • John Conway’s Game Of Life • Fractal Flame algorithm
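As an illustration of why Conway's Game of Life suits a parallel test, one generation can be computed with an OpenMP loop over rows, since every output cell depends only on the previous generation. The sketch below is an assumption about how such a test might look, not the presenter's code: the 5x5 grid size, the flat array layout, the dead-border rule, and the names `live_neighbours` and `life_step` are all hypothetical.

```c
#include <assert.h>

#define ROWS 5   /* assumed small fixed grid */
#define COLS 5

/* Count live neighbours of (r, c); cells outside the grid count as dead. */
static int live_neighbours(const unsigned char *g, int r, int c) {
    int n = 0;
    for (int dr = -1; dr <= 1; dr++) {
        for (int dc = -1; dc <= 1; dc++) {
            int rr = r + dr, cc = c + dc;
            if ((dr || dc) && rr >= 0 && rr < ROWS && cc >= 0 && cc < COLS)
                n += g[rr * COLS + cc];
        }
    }
    return n;
}

/* Compute one generation from `in` into `out` (must be separate buffers). */
static void life_step(const unsigned char *in, unsigned char *out) {
    assert(in != out);
    /* Rows are independent because `in` is read-only during the step. */
    #pragma omp parallel for
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            int n = live_neighbours(in, r, c);
            /* Birth on exactly 3 neighbours; survival on 2 or 3. */
            out[r * COLS + c] = (n == 3) || (in[r * COLS + c] && n == 2);
        }
    }
}
```

Writing into a second buffer rather than updating in place is what makes the rows safe to process concurrently; a known oscillator such as the three-cell "blinker" gives a simple correctness check, because it must return to its starting state after two steps.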

  11. Tools • OpenMP (Open Multi-Processing) framework • Parallel Processing Framework • Available with the GNU Compiler Collection • Free! • OpenCL header files

  12. OpenCL Header File Sample
       /* scalar types */
       typedef int8_t   cl_char;
       typedef uint8_t  cl_uchar;
       typedef int16_t  cl_short  __attribute__((aligned(2)));
       typedef uint16_t cl_ushort __attribute__((aligned(2)));
       typedef int32_t  cl_int    __attribute__((aligned(4)));
       typedef uint32_t cl_uint   __attribute__((aligned(4)));
       typedef int64_t  cl_long   __attribute__((aligned(8)));
       typedef uint64_t cl_ulong  __attribute__((aligned(8)));
       typedef uint16_t cl_half   __attribute__((aligned(2)));
       typedef float    cl_float  __attribute__((aligned(4)));
       typedef double   cl_double __attribute__((aligned(8)));

  13. OpenMP example code
       //hello.c
       #include <omp.h>
       #include <stdio.h>
       int main() {
           #pragma omp parallel num_threads(10)
           printf("Hello from thread %d, nthreads %d\n",
                  omp_get_thread_num(), omp_get_num_threads());
       }

  14. OpenMP example code output

  15. Possible Extensions • Improve performance • Evaluation of OpenCL on various architectures • Heterogeneous execution

  16. Key Points • Lack of multi-platform open language • OpenCL standard • Most implementations for GPU • Implementation for CPU • High Level Translator • Use OpenMP framework
