
Multi-Core Development


Presentation Transcript


  1. Multi-Core Development Kyle Anderson

  2. Overview • History • Pollack’s Law • Moore’s Law • CPU • GPU • OpenCL • CUDA • Parallelism

  3. History • First 4-bit microprocessor – 1971 • 60,000 instructions per second • 2,300 transistors • First 8-bit microprocessor – 1974 • 290,000 instructions per second • 4,500 transistors • Altair 8800 • First 32-bit microprocessor – 1985 • 275,000 transistors

  4. History • First Pentium processor released – 1993 • 66 MHz • Pentium 4 released – 2000 • 1.5 GHz • 42,000,000 transistors • Approached 4 GHz, 2000–2005 • Core 2 Duo released – 2006 • 291,000,000 transistors

  5. History

  6. Pollack’s Law • Processor performance grows roughly with the square root of die area

  7. Pollack’s Law

  8. Moore’s Law • “The number of transistors incorporated in a chip will approximately double every 24 months.” – Gordon Moore, Intel co-founder • Achieved through ever-smaller transistors

  9. Moore’s Law

  10. CPU • Optimized for sequential execution • Fully functioning, general-purpose cores • 16 cores maximum currently • Hyper-threading • Low latency

  11. GPU • Higher latency • Thousands of cores • Simple calculations • Used for research

  12. OpenCL • Runs on a multitude of devices • Run-time compilation ensures the most up-to-date features on a device • Work-items execute in lock-step

  13. OpenCL Data Structures • Host • Device • Compute Units • Work-Group • Work-Item • Command Queue • Kernel • Context
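How these structures relate is easier to see in the host-side setup sequence. A pseudocode outline (the function names are the real OpenCL C API; argument lists are elided, and `"vec_add"` is a hypothetical kernel name):

```
platform = clGetPlatformIDs(...)            // discover an OpenCL platform
device   = clGetDeviceIDs(platform, ...)    // pick a device (CPU, GPU, ...)
context  = clCreateContext(device, ...)     // context groups devices + memory
queue    = clCreateCommandQueue(context, device, ...)   // commands go here
program  = clCreateProgramWithSource(context, src, ...) // run-time compile
           clBuildProgram(program, ...)
kernel   = clCreateKernel(program, "vec_add", ...)
           clSetKernelArg(kernel, 0, ...)
           clEnqueueNDRangeKernel(queue, kernel, ...)   // launch work-items,
                                                        // grouped into
                                                        // work-groups on the
                                                        // device's compute units
```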

  14. OpenCL Types of Memory • Global • Constant • Local • Private

  15. OpenCL

  16. OpenCL Example

  17. OpenCL Example

  18. OpenCL Example

  19. CUDA • NVIDIA’s proprietary API for their GPUs • Stands for “Compute Unified Device Architecture” • Compiles directly for the hardware • Used by Adobe, Autodesk, National Instruments, Microsoft and Wolfram Mathematica • Often faster than OpenCL because it compiles directly for the hardware and focuses on a single architecture

  20. CUDA Indexing

  21. CUDA Example

  22. CUDA Example

  23. CUDA Example

  24. CUDA Function Call cudaMemcpy( dev_a, a, N * sizeof(int), cudaMemcpyHostToDevice ); cudaMemcpy( dev_b, b, N * sizeof(int), cudaMemcpyHostToDevice ); add<<<N,1>>>( dev_a, dev_b, dev_c );
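For context, a device-side kernel consistent with that launch (a sketch, not recovered from the original slides): `add<<<N,1>>>` starts N blocks of one thread each, so `blockIdx.x` alone selects the element.

```cuda
// CUDA C kernel matching add<<<N,1>>>(dev_a, dev_b, dev_c):
// N blocks of 1 thread each, so blockIdx.x indexes the arrays.
__global__ void add(const int *a, const int *b, int *c) {
    int i = blockIdx.x;
    c[i] = a[i] + b[i];
}
```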

  25. Types of Parallelism • SISD • SIMD • MISD • MIMD • Instruction parallelism • Task parallelism • Data parallelism

  26. SISD • Stands for Single Instruction, Single Data • Does not use multiple cores

  27. SIMD • Stands for “Single Instruction, Multiple Data” • Can process multiple data streams concurrently

  28. MISD • Stands for “Multiple Instruction, Single Data” • Risky because several instructions are processing the same data

  29. MIMD • Stands for “Multiple Instruction, Multiple Data” • Each core processes its own instruction stream on its own data

  30. Instruction Parallelism • Independent (mutually exclusive) instructions can execute at once • MIMD and MISD often use this • The unit of parallelism is the individual instruction, or operation • Not done programmatically; exploited by the hardware and the compiler

  31. Task Parallelism • Dividing up of main tasks or controls • Runs multiple threads concurrently

  32. Data Parallelism • Used by SIMD and MIMD • A single list of instructions works concurrently on several data sets
