
Hardware programming



Presentation Transcript


  1. Hardware programming
     • Hardware and programming models
     • SIMD
     • MIMD
     • Dataflow
     • Transformation grammars
     Hardware programming languages August 20, 2014

  2. SIMD hardware

  3. SIMD array processor

  4. SIMD instructions

  5. Drawing a rectangle using one proc/pixel
     • Each array processor has an id (IDx, IDy)
     • Drawing (x1, y1) (x2, y2)
     • Use R0, R1 as condition registers
     • SUB R0 ← IDx – x1
     • SUB R1 ← x2 – IDx
     • AND R0 ← R0 & R1
     • SHR R0 ← R0 >> 31
     • LD if R0 then pixel ← 1
     • LD if !R0 then pixel ← 0
     • Similar operations for triangles, circles, etc.
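The one-processor-per-pixel idea can be sketched in Python, with an ordinary double loop standing in for the array processors. This is a sketch under stated assumptions, not a transcription of the slide: the names `sign_bit` and `draw_rect` are hypothetical, the sign bits are combined with OR (so the flag is set when a processor lies outside the range), and the y range is tested in addition to the x range.

```python
def sign_bit(v: int) -> int:
    """Return 1 if v is negative, else 0 (the role of `>> 31` on a 32-bit register)."""
    return 1 if v < 0 else 0


def draw_rect(width, height, x1, y1, x2, y2):
    """Set pixel to 1 inside the rectangle (x1, y1)-(x2, y2), inclusive, else 0."""
    pixels = [[0] * width for _ in range(height)]
    # On SIMD hardware every (idx, idy) processor would run this body at once.
    for idy in range(height):
        for idx in range(width):
            # Each difference is negative exactly when the processor is out of range.
            outside = (sign_bit(idx - x1) | sign_bit(x2 - idx)
                       | sign_bit(idy - y1) | sign_bit(y2 - idy))
            pixels[idy][idx] = 1 - outside  # no branch: the flag selects the value
    return pixels
```

For example, `draw_rect(4, 3, 1, 0, 2, 1)` fills columns 1 to 2 of rows 0 to 1 and leaves the remaining pixels at 0.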

  6. High-pass filter
     • Subtract adjacent pixels
     • LD R0 ← pixel
     • LD Nleft ← R0
     • SUB R0 ← R0 – Nright
     • LD pixel ← R0
     • Other kernel applications are similar
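A minimal Python sketch of the neighbor subtraction on a single row of pixels. It assumes that writing to Nleft sends a value to the left neighbor, so that each processor then reads its right neighbor's value from Nright, and that the row wraps around at the edge. Both are assumptions, and `high_pass_row` is a hypothetical name.

```python
def high_pass_row(pixels):
    """Replace each pixel with (pixel - right neighbor), wrapping at the edge."""
    n = len(pixels)
    r0 = list(pixels)                                  # LD R0 <- pixel
    # Every processor sends R0 to its left neighbor, so processor i
    # receives the value that belonged to processor i + 1.
    from_right = [r0[(i + 1) % n] for i in range(n)]   # LD Nleft <- R0
    return [r0[i] - from_right[i] for i in range(n)]   # SUB R0 <- R0 - Nright
```

Flat regions map to 0 and edges map to nonzero values, for instance `high_pass_row([5, 5, 9, 9])` gives `[0, -4, 0, 4]`.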

  7. Matrix multiplication
     • Assume the matrix elements are in R1, R2 and the result should be in R0
     • O(N) for square matrices of dimension N x N
     • Wraparound neighbors
     • LD R0 ← #0
     • LD Nleft ← R1
     • LD Ntop ← R2
     • --cpu only-- for(i = 0; i != N; i++) {
     • LD R3 ← Nbot
     • LD Ntop ← R3
     • LD R4 ← Nright
     • LD Nleft ← R4
     • ADD R0 += R3 * R4
     • }
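The shift-and-accumulate pattern on a wraparound mesh can be sketched with Cannon's algorithm, which is a swapped-in technique here: it adds an initial skew of the rows and columns that the slide does not show. The function name `mesh_matmul` and the register-like variable names are illustrative only.

```python
def mesh_matmul(a, b):
    """Multiply two N x N matrices with N shift-and-accumulate steps on a torus."""
    n = len(a)
    # Initial skew (assumption, from Cannon's algorithm): row i of A is rotated
    # left by i and column j of B is rotated up by j.
    r1 = [[a[i][(j + i) % n] for j in range(n)] for i in range(n)]
    r2 = [[b[(i + j) % n][j] for j in range(n)] for i in range(n)]
    r0 = [[0] * n for _ in range(n)]
    for _ in range(n):  # the loop runs on the CPU; its body is one SIMD step
        for i in range(n):
            for j in range(n):
                r0[i][j] += r1[i][j] * r2[i][j]
        # Every processor passes its A element left and its B element up.
        r1 = [[r1[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        r2 = [[r2[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return r0
```

With N x N processors each step is constant time, which is where the O(N) bound for the whole product comes from.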

  8. SIMD source languages
     • Major changes
     • Some data is SIMD array data stored on the array processors
     • The CPU alternates between CPU operations and SIMD operations
     • The array processors do not branch, so both branches of conditionals are always executed
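Executing both branches of a conditional amounts to computing both results on every processor and then selecting one per element. A small Python sketch of that idea follows; `simd_select` and `abs_diff` are hypothetical names chosen for illustration.

```python
def simd_select(cond, then_vals, else_vals):
    """Per-lane select: keep the 'then' value where cond is true, else the 'else' value."""
    return [t if c else e for c, t, e in zip(cond, then_vals, else_vals)]


def abs_diff(a, b):
    """Element-wise |a - b| written the way a branch-free SIMD array would run it."""
    cond = [x >= y for x, y in zip(a, b)]
    then_branch = [x - y for x, y in zip(a, b)]  # computed on every lane
    else_branch = [y - x for x, y in zip(a, b)]  # also computed on every lane
    return simd_select(cond, then_branch, else_branch)
```

The cost of a conditional is therefore the sum of both branches, not the cost of the branch taken.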

  9. SIMD language examples
     • Matrix multiplication
     • void mmult(simd int[N][N] m1, m2, m3) {
     • for(i : [0..N])
     • m1[i][i] = m2[-][i] * m3[i][-];
     • }
     • Drawing
     • void rect(int x1, int y1, int x2, int y2) {
     • for(i : [0..N])
     • pixel[i][i] ← (x1 <= i <= x2) && (y1 <= i <= y2)
     • }

  10. Programming language history
     • Late 70s – early 90s
     • Tremendous effort to parallelize FORTRAN
     • DO 10 I=1, 100
     • DO 10 J=1, 100
     • 10 M(I, J) += A(I, J) * B(J, I)
     • This was really hard, and eventually ineffective
     • Much code was rewritten in C
     • SIMD programs require explicit arrays
     • But most current code is written in C…
     • Define a new language, or parallelize C?

  11. Vector processors

  12. Vector instructions
     • Vector arrays v[0..N]
     • ADD for i in [0..N] do v1[i] ← v2[i] + v3[i]
     • MUL for i in [0..N] do v1[i] ← v2[i] * v3[i]
     • ACC for i in [0..N] do v1 += v2[i] * v3[i]
     • Conditional operations
     • ADDcc for i in [0..N] do
     • if v4[i] then v1[i] ← v2[i] + v3[i]
     • Scatter (some ~1985 Toshiba machines)
     • ADD for i in [0..N] do
     • v1[v4[i]] ← v2[i] + v3[i]
     • Gather (same)
     • ADD for i in [0..N] do
     • v1[i] ← v2[v4[i]] + v3[i]
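The semantics of these instructions can be written out as plain Python functions. The function names are hypothetical, and ACC is modeled here with a scalar accumulator, following the `v1 +=` form on the slide.

```python
def vadd(v2, v3):
    """ADD: v1[i] <- v2[i] + v3[i]"""
    return [x + y for x, y in zip(v2, v3)]


def vmul(v2, v3):
    """MUL: v1[i] <- v2[i] * v3[i]"""
    return [x * y for x, y in zip(v2, v3)]


def vacc(acc, v2, v3):
    """ACC: acc += v2[i] * v3[i], summed over all i"""
    return acc + sum(x * y for x, y in zip(v2, v3))


def vadd_cc(v1, v2, v3, v4):
    """ADDcc: only lanes where v4[i] is true are overwritten."""
    return [x + y if m else old for old, x, y, m in zip(v1, v2, v3, v4)]


def vadd_scatter(v1, v2, v3, v4):
    """Scatter: v1[v4[i]] <- v2[i] + v3[i]"""
    out = list(v1)
    for i, target in enumerate(v4):
        out[target] = v2[i] + v3[i]
    return out


def vadd_gather(v2, v3, v4):
    """Gather: v1[i] <- v2[v4[i]] + v3[i]"""
    return [v2[src] + y for src, y in zip(v4, v3)]
```

Scatter takes the index vector on the destination side and gather takes it on the source side; the arithmetic is otherwise the same element-wise add.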

  13. Vector programming languages
     • As before, the basic data type is a vector of numbers (floats or ints)
     • Matrix multiply in a serial language
     • Inner product method O(n^3)
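The slide's own listing did not survive in this transcript, so the following is a generic sketch of the inner product method in a serial language, not the original code. The name `matmul_inner` is hypothetical.

```python
def matmul_inner(a, b):
    """Serial N x N matrix multiply; each c[i][j] is one inner (dot) product."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0
            for k in range(n):  # innermost loop: row i of a dotted with column j of b
                acc += a[i][k] * b[k][j]
            c[i][j] = acc
    return c
```

Three nested loops of length n give the O(n^3) operation count.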

  14. MMult with middle products
     • Exchange the loop nesting

  15. Vectorizing the middle product
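The slide content is missing from the transcript, so this sketch assumes one common reading of the middle product form: the `k` loop sits in the middle of the nest and the innermost loop becomes a single vector operation on a whole row. The names `axpy` and `matmul_middle` are hypothetical.

```python
def axpy(alpha, x, y):
    """Vector operation alpha * x + y, standing in for one vector instruction."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def matmul_middle(a, b):
    """N x N matrix multiply with loop order i, k, (vectorized j)."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            # Row i of c accumulates a[i][k] times row k of b in one vector op.
            c[i] = axpy(a[i][k], b[k], c[i])
    return c
```

Both vector operands are rows, so they are contiguous in a row-major layout, which is one reason this ordering tends to vectorize well.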

  16. MMult with outer products
     • Move the “k” loop to the outside

  17. Vectorizing the outer product
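With the `k` loop outermost, each iteration adds the outer product of column k of the first matrix and row k of the second to the result. The sketch below is an assumption about what the missing slide showed; `matmul_outer` is a hypothetical name.

```python
def matmul_outer(a, b):
    """N x N matrix multiply as a sum of N outer products (k loop outermost)."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for k in range(n):
        col = [a[i][k] for i in range(n)]  # column k of a
        row = b[k]                         # row k of b
        for i in range(n):
            # One vector operation per row: c[i] += col[i] * row
            c[i] = [cij + col[i] * rj for cij, rj in zip(c[i], row)]
    return c
```

Each `k` step updates every element of the result, so the whole N x N update could be issued as one matrix-wide vector operation on suitable hardware.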

  18. Vector processing
     • The language is a language of vectors (arrays, matrices, multi-dimensional matrices, etc.)
     • Most current code is not written with vectors (but it could be)
     • Different vector organizations will give different performance on different hardware
     • Vector ops take (highly optimized) linear time
     • Vector programming is a form of SIMD programming
     • Can add conditionals (if c[0..N] then a[0..N] + b[0..N])

  19. Dataflow machines
     • Suppose you have a system with:
     • A programmable hardware device
     • Some finite amount of memory
     • Multiple special-purpose processors
     • What do you have?
     • An nVIDIA GPU…
     • A processor-in-memory…
     • An FPGA…
     • Good:
     • Easy to build (current tech gives 1000s of processors)
     • Extreme speed
     • Bad:
     • How do you program this thing?

  20. Dataflow model

  21. Dataflow vs von Neumann

  22. System-on-a-chip
     • Programmable FPGAs
     • Small finite storage
     • Master CPU
     • I/O
     • Large amount of interconnect

  23. Extracting dataflow from serial programs
     • Build the dataflow graph (for example, CS134b register allocation)
     • This method doesn’t really work
     • Poor parallelism
     • Memory allocation is a problem
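Building a dataflow graph from straight-line serial code can be sketched as follows. The sketch assumes a hypothetical three-address form `(dest, op, src1, src2)` and tracks only true (read-after-write) dependencies; it ignores memory, control flow, and name reuse hazards, which is part of why the approach runs into the problems listed above.

```python
def build_dataflow_graph(stmts):
    """Map each statement index to the indices of the statements that produce its inputs."""
    last_def = {}  # variable name -> index of the statement that last wrote it
    deps = {}
    for idx, (dest, _op, src1, src2) in enumerate(stmts):
        deps[idx] = {last_def[s] for s in (src1, src2) if s in last_def}
        last_def[dest] = idx
    return deps


program = [
    ("t1", "add", "a", "b"),
    ("t2", "mul", "c", "d"),
    ("t3", "sub", "t1", "t2"),
]
graph = build_dataflow_graph(program)
```

Statements with no path between them in the graph (here the first two) could run in parallel; in typical serial code such independent groups are small.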

  24. Writing dataflow programs
     • Draw “circuit” diagrams
     • Use discrete-event-simulation models
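A "circuit" style dataflow program can be modeled as nodes that fire once all of their inputs carry a value. The following is a minimal sketch of that firing rule, not a real dataflow runtime; `run_dataflow` and the node dictionary layout are assumptions made for illustration.

```python
def run_dataflow(nodes, inputs):
    """Fire each node once all of its inputs hold a token.

    nodes:  {name: (function, [names of input nodes or external inputs])}
    inputs: {name: value} for the external inputs
    """
    tokens = dict(inputs)
    pending = dict(nodes)
    while pending:
        ready = [name for name, (_fn, srcs) in pending.items()
                 if all(s in tokens for s in srcs)]
        if not ready:
            raise ValueError("no node can fire: missing input or cyclic graph")
        for name in ready:  # every ready node could fire in parallel
            fn, srcs = pending.pop(name)
            tokens[name] = fn(*(tokens[s] for s in srcs))
    return tokens


circuit = {
    "sum": (lambda x, y: x + y, ["a", "b"]),
    "prod": (lambda x, y: x * y, ["sum", "c"]),
}
result = run_dataflow(circuit, {"a": 1, "b": 2, "c": 4})
```

Execution order comes from data availability alone: `prod` cannot fire until `sum` has produced its token, and no program counter is involved.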

  25. Dataflow programs

  26. PL and hardware
     • Hardware-specific programming languages
     • Can be fast, often not portable
     • Software usually has to be rewritten when platform changes
     • A lot of C programs are still rewritten for new platforms
     • Hardware
     • Extreme hardware design is limited by the software community
     • How does nVIDIA get away with it?

  27. Transformation grammars

  28. Transformation grammars
