
OpenCL ch. 2~4

OpenCL ch. 2~4. Jongeun Lee. Fall 2013. Preparation. download & build instructions https://code.google.com/p/opencl-book-samples/wiki/Installation source code download svn checkout http://opencl-book-samples.googlecode.com/svn/trunk/ opencl-book-samples-read-only build instructions.


Presentation Transcript


  1. OpenCL ch. 2~4 Jongeun Lee Fall 2013

  2. Preparation • download & build instructions • https://code.google.com/p/opencl-book-samples/wiki/Installation • source code download • svn checkout http://opencl-book-samples.googlecode.com/svn/trunk/ opencl-book-samples-read-only • build instructions

  3. Code overview

  4. Code overview • note • some shortcuts • using only one platform, one device, etc. • error codes • cl… functions that return a cl_xxx object: error code is written through the last argument • other functions: error code is the return value

  5. Platforms, Devices • platform • profile • full profile, embedded profile • capabilities of particular OpenCL version supported • how to query…

  6. Contexts • Context is container for • associated devices • memory objects (buffers, images) • command queues (interface) • Context can be created • with multiple devices • from a single platform • may use multiple contexts… • no auto data sharing • updated as program progresses

  7. OpenCL C • new features • new types • address space qualifiers • built-in functions

  8. New Features • vector data types • easy to use (supports operators similar to scalars’) • increase portability • address space qualifiers • additions for parallelism • work-items, work-groups, sync, etc. • images, samplers • a set of built-in functions

  9. Scalar types • int: 32-bit (fixed) • double: optional • half: 16-bit floating point • requires explicit conversion to/from float • vload_half (reads a half, returns float), vstore_half (stores a float as half) • size_t, ptrdiff_t, intptr_t, uintptr_t • size_t: result of sizeof • ptrdiff_t: result of subtracting two pointers • size_t, ptrdiff_t: 32-bit if CL_DEVICE_ADDRESS_BITS reported by clGetDeviceInfo is 32, and 64-bit if it is 64 • intptr_t, uintptr_t: signed/unsigned integer that can hold a void pointer
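The half-to-float conversion mentioned on this slide can be modelled on the host in plain C. The sketch below is not OpenCL code: half_bits_to_float is a hypothetical helper name, and it assumes the host uses 32-bit IEEE 754 floats. It only illustrates what reading one half value as a float involves.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Host-side model (hypothetical helper, not an OpenCL API): expand the bit
   pattern of a 16-bit half (1 sign, 5 exponent, 10 mantissa bits) to float. */
float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h >> 15) << 31;
    uint32_t exponent = (h >> 10) & 0x1Fu;
    uint32_t mantissa = h & 0x3FFu;
    uint32_t bits;
    float result;

    /* Assumption: float is a 32-bit IEEE 754 type on this host. */
    assert(sizeof(float) == sizeof(uint32_t));

    if (exponent == 0) {
        /* Zero or subnormal half: the value is mantissa * 2^-24. */
        result = (float)mantissa * (1.0f / 16777216.0f);
        return sign ? -result : result;
    }
    if (exponent == 31)
        bits = sign | 0x7F800000u | (mantissa << 13);   /* infinity or NaN */
    else
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13); /* rebias 15 to 127 */

    memcpy(&result, &bits, sizeof result);
    return result;
}
```

For example, the half bit pattern 0x3C00 maps to 1.0f and 0xC000 maps to -2.0f.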

  10. Vector types • {scalartype}N, where N = 2, 3, 4, 8, 16 • float8, long4, etc. • aligned to a power of 2 bytes • char3: has 4-byte size, aligned to 4-byte boundary • int3: has 16-byte size, aligned to 16-byte boundary
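The size rule above (a 3-component vector takes the space of a 4-component one) can be written down as a small host-side calculation. This is a plain C sketch, not OpenCL code; vector_size_bytes is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size in bytes of an OpenCL vector type {scalartype}N.
   The alignment equals this size. N = 3 is padded to 4 components. */
size_t vector_size_bytes(size_t scalar_size, unsigned n)
{
    assert(n == 2 || n == 3 || n == 4 || n == 8 || n == 16);
    unsigned slots = (n == 3) ? 4u : n;
    return scalar_size * slots;
}
```

With a 1-byte char and a 4-byte int, this gives 4 bytes for char3 and 16 bytes for int3, matching the two examples on the slide.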

  11. API types • types with no corresponding host API type • size_t • ptrdiff_t • intptr_t • uintptr_t

  12. Vector literals • examples • valid forms

  13. more examples • WRONG (compile error) • vector literal cannot be used as l-value

  14. Vector components • using .xy, .xyz, .xyzw

  15. Vector components • using numeric indexes: .s or .S followed by one hex digit (0-9, A-F, a-f) • example: float8 f; f.s7 • cannot intermix numeric indexes with .xyzw • using .lo/.hi, .odd/.even
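The .lo/.hi and .even/.odd selectors can be pictured with a plain C model of a float8. This is a host-side sketch, not OpenCL code: float4_model, float8_model and the f8_* functions are hypothetical names, and the array index plays the role of the numeric component index (f.s[7] stands for f.s7).

```c
#include <assert.h>

typedef struct { float s[4]; } float4_model;  /* stands in for float4 */
typedef struct { float s[8]; } float8_model;  /* stands in for float8 */

/* Pick four components of f, starting at `start` and stepping by `stride`. */
static float4_model select4(float8_model f, int start, int stride)
{
    float4_model r;
    assert(start >= 0 && stride > 0 && start + 3 * stride < 8);
    for (int i = 0; i < 4; ++i)
        r.s[i] = f.s[start + i * stride];
    return r;
}

float4_model f8_lo(float8_model f)   { return select4(f, 0, 1); } /* s0 s1 s2 s3 */
float4_model f8_hi(float8_model f)   { return select4(f, 4, 1); } /* s4 s5 s6 s7 */
float4_model f8_even(float8_model f) { return select4(f, 0, 2); } /* s0 s2 s4 s6 */
float4_model f8_odd(float8_model f)  { return select4(f, 1, 2); } /* s1 s3 s5 s7 */
```

In real OpenCL C these are written directly as f.lo, f.hi, f.even and f.odd on a float8 f.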

  16. Other data types • image & sampler types have no array or pointer types • no image & sampler in a struct

  17. Derived types • Generally, supported • struct type cannot contain pointer if it (or its pointer) is used as an argument to kernel • okay as a variable inside kernel

  18. Implicit type conversion • no surprise here (same as C99) • not allowed between built-in vector types

  19. Usual arithmetic conversion • C99 defines this only for scalars: int a; short b; int c = a + b; • vectors • compile error if the operands include more than one vector type • if there is only a single vector type and all the other operands are scalar, the scalars are converted* to the vector's element type and widened to vectors • *compile error if any scalar has greater rank than the vector element
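The scalar-plus-vector rule can be sketched as a host-side model in plain C. This is not OpenCL code: the names are hypothetical, and the rank order in the enum is a simplified reading of the specification (integer types by precision, unsigned above signed at the same width, floating-point types above all integer types), so treat it as an assumption rather than a reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified conversion ranks, lowest first (an assumption, see lead-in). */
typedef enum {
    RANK_CHAR, RANK_UCHAR, RANK_SHORT, RANK_USHORT,
    RANK_INT, RANK_UINT, RANK_LONG, RANK_ULONG,
    RANK_HALF, RANK_FLOAT, RANK_DOUBLE
} scalar_rank;

/* A scalar operand may be mixed with a vector only if its rank does not
   exceed the rank of the vector's element type. */
bool scalar_operand_allowed(scalar_rank scalar, scalar_rank vector_element)
{
    return scalar <= vector_element;
}

typedef struct { int s[4]; } int4_model;  /* stands in for int4 */

/* Models `int4 v; short b; v + b`: b is converted to int, duplicated into
   all four components, then added component-wise. */
int4_model int4_add_short(int4_model v, short b)
{
    assert(scalar_operand_allowed(RANK_SHORT, RANK_INT));
    int4_model r;
    int widened = (int)b;
    for (int i = 0; i < 4; ++i)
        r.s[i] = v.s[i] + widened;
    return r;
}
```

Under this model, adding a long scalar to an int4 is rejected because the scalar outranks the element type, which corresponds to the compile error noted on the slide.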

  20. WRONG

  21. Explicit conversion • illegal between vector types • scalar to vector is okay

  22. There is a way… • motivation • solution

  23. in general • example

  24. Reinterpreting data • examples • can do some neat things
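The slide's own examples are not in the transcript. As a stand-in, here is a host-side sketch in plain C of what reinterpreting a float's bits as an unsigned integer means (in OpenCL C this is what the as_uint and as_float built-ins do). The *_model names and fabs_by_bits are hypothetical, and the code assumes 32-bit IEEE 754 floats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer.
   No numeric conversion takes place, only a copy of the bit pattern. */
uint32_t as_uint_model(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);  /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* The reverse direction: treat a 32-bit pattern as a float. */
float as_float_model(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* One small trick this enables: absolute value by clearing the sign bit. */
float fabs_by_bits(float f)
{
    return as_float_model(as_uint_model(f) & 0x7FFFFFFFu);
}
```

Compare this with a value conversion: (uint32_t)1.0f is 1, while as_uint_model(1.0f) is the bit pattern 0x3F800000.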

  25. Vector operators • note • va < vb • va && vb
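For vector operands, relational and logical operators work component-wise and produce -1 (all bits set) for true and 0 for false, whereas the scalar forms produce 1 for true. The plain C model below illustrates this on the host. It is not OpenCL code, and the type and function names are hypothetical.

```c
#include <assert.h>

typedef struct { float s[4]; } float4_model;  /* stands in for float4 */
typedef struct { int s[4]; } int4_model;      /* stands in for int4 */

/* Models `va < vb` for float4 operands: the result is an int4 mask. */
int4_model float4_less(float4_model a, float4_model b)
{
    int4_model r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = (a.s[i] < b.s[i]) ? -1 : 0;
    return r;
}

/* Models `va && vb` for int4 operands: component-wise, -1 or 0. */
int4_model int4_logical_and(int4_model a, int4_model b)
{
    int4_model r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = (a.s[i] && b.s[i]) ? -1 : 0;
    return r;
}

/* Why all-bits-set is convenient: the mask can pick components with
   bitwise operations, taking a where the mask is -1 and b where it is 0. */
int4_model int4_pick(int4_model mask, int4_model a, int4_model b)
{
    int4_model r;
    for (int i = 0; i < 4; ++i) {
        assert(mask.s[i] == 0 || mask.s[i] == -1);
        r.s[i] = (a.s[i] & mask.s[i]) | (b.s[i] & ~mask.s[i]);
    }
    return r;
}
```

A typical use is a branch-free component-wise minimum: build the mask with the comparison, then pick from the two inputs.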

  26. Function qualifiers • kernel (or __kernel) void … • a kernel’s return type must be void • kernel functions that declare variables in the local address space cannot be called from another kernel function

  27. better way

  28. Address space qualifiers • global, local, constant, private (or the __-prefixed forms __global, __local, __constant, __private): disjoint • argument, local variable of a function: private • program-scope variable, static variable inside a function: global*/constant • pointer argument to kernel: must point to global, local, or constant • ERROR if NOT specified • no address space for a non-kernel function’s return value (ERROR if specified)

  29. Global address space • for memory objects (buffers and images) • buffer can be declared as a pointer to scalar/vector/struct • “global” should not be specified for images • pointer to global address space: allowed • as argument to kernel/non-kernel function • as variable declared inside function • variable declared inside a function cannot be allocated in global

  30. Constant address space • for variables allocated in global that are accessed read-only • accessible to all work-items • string literals are in constant • images cannot be allocated in constant • pointer to constant address space: allowed • as argument to kernel/non-kernel functions • as variable declared inside functions • variable in the outermost scope of kernel can be declared in constant (with proper initialization)

  31. In OpenCL 1.0: “All program scope variables must be declared in the constant address space.” In OpenCL 2.0: program scope variables may be global or constant; global by default

  32. Local address space • for variables shared (r/w) by all work-items of a work-group • not across work-groups • allocated in local memory • local memory: similar to scratchpad memory • pointer to local address space: allowed • as argument to kernel/non-kernel function • as variable declared inside function • variables declared inside a kernel can be allocated in local • must be declared at kernel function scope • cannot be initialized

  33. Private address space • variables declared inside kernel without address space qualifier • all variables declared inside non-kernel • all function arguments

  34. Casting between addr spaces? • it’s not allowed • example

  35. Access & type qualifiers • access qualifiers: read_only, write_only • specified with image arguments • NO read_write • image reads are cached in texture cache • but image writes do not update texture cache • type qualifiers: const, volatile, restrict • cannot be used with image types
