Midterm 2 review

elenorj
Presentation Transcript


  1. Midterm 2 review: Chapters 2 - 4

  2. Instruction Set Architecture
     • Interface between the hardware and software
     • Easy to program with and efficient to implement
     • Regularity, simple tradeoffs, constant operands
     • Constant instruction size, small instruction set
     • What operations to include?
     • What types of operands to include?
     • What memory addressing modes to include?
     • Should be optimized for the targeted application

  3. Operations
     • Computation
       • Add, sub, mult, div, shift, …
     • Control flow
       • Branch, jump, jal, …

  4. Operands
     • Data type: int, float, immediate, …
     • Internal storage
       • Stack, accumulator: old style
       • Register-memory: smaller code size, but a variable number of cycles per instruction, and it is harder to encode both a memory address and a register in one instruction
       • Register-register: larger code size, but a relatively constant number of cycles per instruction
     • List on page 98

  5. Memory Address Modes
     • Register
     • Immediate
     • Displacement
     • …
     • A list of examples on page 104
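As a rough illustration of the three modes named on this slide, the sketch below shows where each mode gets its operand from. The `operand` function and the dict-based register file and memory are hypothetical stand-ins chosen for this example, not part of any real simulator.

```python
# Hypothetical mini-model: registers and memory are plain dicts.
def operand(mode, regs, mem, reg=None, imm=None):
    """Return the operand value selected by the given addressing mode."""
    if mode == "register":        # operand is the register contents
        return regs[reg]
    if mode == "immediate":       # operand is the constant in the instruction
        return imm
    if mode == "displacement":    # operand is memory[register + constant]
        return mem[regs[reg] + imm]
    raise ValueError(f"unknown addressing mode: {mode}")

regs = {"R1": 100}
mem = {108: 42}

print(operand("register", regs, mem, reg="R1"))             # 100
print(operand("immediate", regs, mem, imm=8))               # 8
print(operand("displacement", regs, mem, reg="R1", imm=8))  # 42
```

The displacement case corresponds to an operand written as 8(R1) in the assembly used later in this deck.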

  6. Branch prediction
     • 2-bit branch predictor
     • Correlating branch predictor
       • (m,n) predictor
       • Uses the outcomes of the last m branches to pick one of 2^m n-bit predictors
     • Tournament predictor
     • Branch Target Buffers
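The first two predictors on this slide can be sketched in a few lines. This is a minimal model under stated assumptions: it tracks a single branch (no table indexed by branch address), counters start at 0, and the class and function names are made up for this example.

```python
class TwoBitPredictor:
    """One 2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""
    def __init__(self):
        self.state = 0

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)


class CorrelatingPredictor:
    """(m, n) predictor: the last m outcomes select one of 2**m n-bit counters."""
    def __init__(self, m, n):
        self.mask = (1 << m) - 1
        self.max_count = (1 << n) - 1
        self.history = 0                    # last m outcomes, newest in bit 0
        self.counters = [0] * (1 << m)

    def predict(self):
        return self.counters[self.history] > self.max_count // 2

    def update(self, taken):
        c = self.counters[self.history]
        self.counters[self.history] = min(self.max_count, c + 1) if taken else max(0, c - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def mispredictions(predictor, outcomes):
    """Run the predictor over a sequence of actual outcomes and count misses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


loop_branch = ([True] * 9 + [False]) * 3   # taken 9 times, then falls through
alternating = [True, False] * 4            # taken, not taken, taken, ...

print(mispredictions(TwoBitPredictor(), loop_branch))             # 5
print(mispredictions(TwoBitPredictor(), alternating))             # 4
print(mispredictions(CorrelatingPredictor(1, 2), alternating))    # 2
```

On the alternating pattern the 2-bit counter keeps missing every taken branch, while the (1,2) predictor stops missing once both of its counters have warmed up, because the previous outcome tells it which counter to consult.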

  7. Explore ILP
     • Dynamic
       • Tomasulo + branch prediction => speculation
       • More run-time information available for optimization, but a smaller instruction window
       • More hardware, better executable compatibility
     • Static
       • Bigger window, but less information
       • Simple hardware, complex compiler

  8. Example
     Loop: L.D    F0, 0(R1)
           Add.D  F0, F0, F2
           S.D    0(R1), F0
           L.D    F0, 0(R2)
           Mult.D F0, F0, F2
           S.D    0(R2), F0
           SUBI   R1, R1, 8
           SUBI   R2, R2, 8
           BNEZ   R2, Loop
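To make the behavior of this loop easier to follow, here is a small Python interpretation of it. It assumes that memory can be modeled as a dict from byte address to double and that R1 and R2 start at the last element of two separate arrays; `run_loop` and its parameter names are hypothetical.

```python
def run_loop(mem, r1, r2, f2):
    """Interpret the example loop; mem maps a byte address to a double."""
    while True:
        mem[r1] = mem[r1] + f2   # L.D F0,0(R1); Add.D F0,F0,F2; S.D 0(R1),F0
        mem[r2] = mem[r2] * f2   # L.D F0,0(R2); Mult.D F0,F0,F2; S.D 0(R2),F0
        r1 -= 8                  # SUBI R1, R1, 8
        r2 -= 8                  # SUBI R2, R2, 8
        if r2 == 0:              # BNEZ R2, Loop falls through when R2 is zero
            return mem

# Two-element arrays: one ending at address 1016, one ending at address 16.
mem = {1008: 1.0, 1016: 2.0, 8: 3.0, 16: 4.0}
print(run_loop(mem, r1=1016, r2=16, f2=2.0))
# {1008: 3.0, 1016: 4.0, 8: 6.0, 16: 8.0}
```

Each iteration adds F2 to one element of the first array and multiplies one element of the second array by F2. The two halves are independent except for their shared use of F0.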

  9. Speculative Dynamic Machine specification
     • Issue rate of 1
     • One broadcast per cycle on the CDB
     • Branch takes 1 cycle
     • Load takes 1 cycle
     • Integer ALU takes 1 cycle
     • Float add takes 2 cycles
     • Float multiply takes 3 cycles
     • These cycle counts do not include the write to the CDB

  10. Cycle 0 [tables: Reorder buffer, Reservation table, FP register status; code: loop from slide 8]

  11. Cycle 1 [tables: Reorder buffer, Reservation table, FP register status; code: loop from slide 8]

  12. Cycle 2 [tables: Reorder buffer, Reservation table, FP register status; code: loop from slide 8]

  13. Cycle 3 [tables: Reorder buffer, Reservation table, FP register status; code: loop from slide 8]

  14. Cycle 4 [tables: Reorder buffer, Reservation table, FP register status; code: loop from slide 8]

  15. Cycle n [tables: Reorder buffer, Reservation table, FP register status; code: loop from slide 8]

  16. Cycle n+1 [tables: Reorder buffer, Reservation table, FP register status; code: loop from slide 8]

  17. Cycle n+2 [tables: Reorder buffer, Reservation table, FP register status; code: loop from slide 8]

  18. Cycle n+3 [tables: Reorder buffer, Reservation table, FP register status; code: loop from slide 8]

  19. VLIW example
      Loop: L.D    F0, 0(R1)
            Add.D  F0, F0, F2
            S.D    0(R1), F0
            L.D    F0, 0(R2)
            Mult.D F0, F0, F2
            S.D    0(R2), F0
            SUBI   R1, R1, 8
            SUBI   R2, R2, 8
            BNEZ   R2, Loop
      • Static machine specification
        • One delay slot for any true data dependence between floating-point operations
        • One branch delay slot

  20. Register rename
      Before:
      Loop: L.D    F0, 0(R1)
            Add.D  F0, F0, F2
            S.D    0(R1), F0
            L.D    F0, 0(R2)
            Mult.D F0, F0, F2
            S.D    0(R2), F0
            SUBI   R1, R1, 8
            SUBI   R2, R2, 8
            BNEZ   R2, Loop
      After:
      Loop: L.D    F0, 0(R1)
            Add.D  F0, F0, F2
            S.D    0(R1), F0
            L.D    F1, 0(R2)
            Mult.D F1, F1, F2
            S.D    0(R2), F1
            SUBI   R1, R1, 8
            SUBI   R2, R2, 8
            BNEZ   R2, Loop
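The effect of this renaming can be checked mechanically. The sketch below encodes the six floating-point instructions as (text, destination, sources) tuples, which is a hypothetical encoding chosen for this example, and counts the name dependences (WAR and WAW) that link the Add.D chain to the Mult.D chain before and after F0 is renamed to F1 in the second chain.

```python
# Each instruction: (text, destination register or None, source registers).
ORIGINAL = [
    ("L.D F0, 0(R1)",     "F0", ["R1"]),
    ("Add.D F0, F0, F2",  "F0", ["F0", "F2"]),
    ("S.D 0(R1), F0",     None, ["R1", "F0"]),
    ("L.D F0, 0(R2)",     "F0", ["R2"]),
    ("Mult.D F0, F0, F2", "F0", ["F0", "F2"]),
    ("S.D 0(R2), F0",     None, ["R2", "F0"]),
]

def rename(prog, old, new, start, stop):
    """Replace register `old` with `new` in instructions start..stop-1."""
    out = []
    for i, (text, dest, srcs) in enumerate(prog):
        if start <= i < stop:
            text = text.replace(old, new)
            dest = new if dest == old else dest
            srcs = [new if s == old else s for s in srcs]
        out.append((text, dest, srcs))
    return out

def name_dependences(prog):
    """List (earlier, later, kind, register) for every WAW and WAR pair."""
    found = []
    for j, (_, dest_j, _) in enumerate(prog):
        if dest_j is None:
            continue
        for i in range(j):
            _, dest_i, srcs_i = prog[i]
            if dest_i == dest_j:
                found.append((i, j, "WAW", dest_j))
            if dest_j in srcs_i:
                found.append((i, j, "WAR", dest_j))
    return found

def crossing(deps, split=3):
    """Keep only dependences from the first chain into the second chain."""
    return [d for d in deps if d[0] < split <= d[1]]

RENAMED = rename(ORIGINAL, "F0", "F1", 3, 6)
print(len(crossing(name_dependences(ORIGINAL))))  # 8
print(len(crossing(name_dependences(RENAMED))))   # 0
```

With the name dependences gone, only the true dependences inside each chain remain, so the two chains are free to be interleaved.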

  21. Instruction reorder
      Before:
      Loop: L.D    F0, 0(R1)
            Add.D  F0, F0, F2
            S.D    0(R1), F0
            L.D    F1, 0(R2)
            Mult.D F1, F1, F2
            S.D    0(R2), F1
            SUBI   R1, R1, 8
            SUBI   R2, R2, 8
            BNEZ   R2, Loop
      After:
      Loop: L.D    F0, 0(R1)
            L.D    F1, 0(R2)
            Add.D  F0, F0, F2
            Mult.D F1, F1, F2
            S.D    0(R1), F0
            S.D    0(R2), F1
            SUBI   R2, R2, 8
            BNEZ   R2, Loop
            SUBI   R1, R1, 8
      • The loop can be unrolled to increase reordering freedom
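One way to see what the reordering buys is to count how many true dependences on floating-point registers have their producer and consumer in adjacent slots. The sketch below assumes that the "one delay slot" rule from the static machine specification applies to any true dependence carried in an F register, including load-to-use. That reading, the tuple encoding, and the name `fp_raw_violations` are all assumptions made for this example.

```python
# Each instruction: (text, destination register or None, source registers).
BEFORE = [
    ("L.D F0, 0(R1)",     "F0", ["R1"]),
    ("Add.D F0, F0, F2",  "F0", ["F0", "F2"]),
    ("S.D 0(R1), F0",     None, ["R1", "F0"]),
    ("L.D F1, 0(R2)",     "F1", ["R2"]),
    ("Mult.D F1, F1, F2", "F1", ["F1", "F2"]),
    ("S.D 0(R2), F1",     None, ["R2", "F1"]),
    ("SUBI R1, R1, 8",    "R1", ["R1"]),
    ("SUBI R2, R2, 8",    "R2", ["R2"]),
    ("BNEZ R2, Loop",     None, ["R2"]),
]

AFTER = [
    ("L.D F0, 0(R1)",     "F0", ["R1"]),
    ("L.D F1, 0(R2)",     "F1", ["R2"]),
    ("Add.D F0, F0, F2",  "F0", ["F0", "F2"]),
    ("Mult.D F1, F1, F2", "F1", ["F1", "F2"]),
    ("S.D 0(R1), F0",     None, ["R1", "F0"]),
    ("S.D 0(R2), F1",     None, ["R2", "F1"]),
    ("SUBI R2, R2, 8",    "R2", ["R2"]),
    ("BNEZ R2, Loop",     None, ["R2"]),
    ("SUBI R1, R1, 8",    "R1", ["R1"]),   # fills the branch delay slot
]

def fp_raw_violations(prog, min_gap=2):
    """Return (producer, consumer, register) for FP true dependences
    whose two instructions are fewer than min_gap slots apart."""
    last_writer = {}
    violations = []
    for pos, (_, dest, srcs) in enumerate(prog):
        for reg in srcs:
            if reg.startswith("F") and reg in last_writer:
                if pos - last_writer[reg] < min_gap:
                    violations.append((last_writer[reg], pos, reg))
        if dest is not None:
            last_writer[dest] = pos
    return violations

print(len(fp_raw_violations(BEFORE)))  # 4
print(len(fp_raw_violations(AFTER)))   # 0
```

In the reordered version every floating-point consumer sits two slots after its producer, so under the assumed rule no stall or no-op is needed inside the loop body.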

  22. Software pipeline
      Code for one iteration:
            L.D    F0, 0(R1)
            L.D    F1, 0(R2)
            Add.D  F0, F0, F2
            Mult.D F1, F1, F2
            S.D    0(R1), F0
            S.D    0(R2), F1
            SUBI   R2, R2, 8
            SUBI   R1, R1, 8
            BNEZ   R2, Loop
      • 8 copies
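The general idea of software pipelining can be sketched in Python for the load / add / store chain alone. This is an illustrative model, not the schedule from the slide: it assumes a three-stage split (load, compute, store), ignores the multiply chain, and uses hypothetical function names. Each pass through the kernel stores the result for element i, computes element i+1, and loads element i+2, so the three stages in one pass belong to three different original iterations.

```python
def plain_loop(a, c):
    """Reference version: load, add, and store one element per iteration."""
    return [x + c for x in a]

def software_pipelined(a, c):
    """Same result, with the stages of neighboring iterations overlapped."""
    n = len(a)
    if n < 3:                      # too short to fill the pipeline
        return plain_loop(a, c)
    out = [None] * n

    # Prologue: start iterations 0 and 1.
    loaded = a[0]                  # load    (iteration 0)
    computed = loaded + c          # compute (iteration 0)
    loaded = a[1]                  # load    (iteration 1)

    # Kernel: one stage from each of three different iterations.
    for i in range(n - 2):
        out[i] = computed          # store   (iteration i)
        computed = loaded + c      # compute (iteration i + 1)
        loaded = a[i + 2]          # load    (iteration i + 2)

    # Epilogue: drain the last two iterations.
    out[n - 2] = computed
    out[n - 1] = loaded + c
    return out

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(software_pipelined(data, 10.0))  # [11.0, 12.0, 13.0, 14.0, 15.0]
```

Inside the kernel no statement depends on a value produced in the same pass, which is the property that lets a static scheduler pack those operations together without delay slots between them.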

  23. Midterm details
      • Take home
      • Available online on Monday 12/9, in the morning
      • Due 12/16 at 11:59 pm
      • 3 questions
