1 / 40

Designing a Pipelined CPU

Read 4.7 for next class. Designing a Pipelined CPU. Peer Instruction Lecture Materials for Computer Architecture by Dr. Leo Porter is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Review -- Single Cycle CPU. Review -- Multiple Cycle CPU. Ifetch.

Télécharger la présentation

Designing a Pipelined CPU

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Read 4.7 for next class Designing a Pipelined CPU Peer Instruction Lecture Materials for Computer ArchitecturebyDr. Leo Porteris licensed under aCreative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

  2. Review -- Single Cycle CPU

  3. Review -- Multiple Cycle CPU

  4. Ifetch Reg/Dec Exec Mem Wr Ifetch Reg/Dec Exec Mem Wr Ifetch Reg/Dec Exec Wr Review -- Instruction Latencies • Single-Cycle CPU Load • Multiple Cycle CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Load Add

  5. Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Instruction Latencies and Throughput • Single-Cycle CPU • Multiple Cycle CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 • Pipelined CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8

  6. Instruction Latencies and Throughput Which of the statements below is true about a pipelined processor?

  7. Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Instruction Latencies and Throughput • Single-Cycle CPU • Multiple Cycle CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 • Pipelined CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8

  8. Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Instruction Latencies and Throughput • Single-Cycle CPU • Multiple Cycle CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 • Pipelined CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8

  9. Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Instruction Latencies and Throughput • Single-Cycle CPU • Multiple Cycle CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 • Pipelined CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8

  10. Pipelining Advantages PI throughput Vs. latency • Higher maximum throughput • Higher utilization of CPU resources • But, more complicated datapath, more complex control(?)

  11. Pipelining Throughput PI Matching Peek Throughput 1. Longest instruction 2. Average instruction 3. Cycle time

  12. A Pipelined Datapath IF: Instruction fetch ID: Instruction decode and register fetch EX: Execution and effective address calculation MEM: Memory access WB: Write back

  13. Idea for a Pipelined Datapath

  14. IM Reg DM Reg ALU IM Reg DM Reg ALU IM Reg DM Reg ALU IM Reg DM Reg ALU IM Reg DM Reg ALU Execution in a Pipelined Datapath CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9 IF ID EX MEM WB lw IF ID EX MEM WB lw lw lw lw

  15. IM Reg DM Reg ALU IM Reg DM Reg ALU IM Reg DM Reg ALU IM Reg DM Reg ALU IM Reg DM Reg ALU Execution in a Pipelined Datapath CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9 Make sure to point Out Steady State CPI = 1 IF ID EX MEM WB lw IF ID EX MEM WB lw lw lw lw steady state

  16. Pipeline Stages Should we force every instruction to go through all 5 stages? Can we break it up like we did for multi-cycle, with R-type taking 4 cycles instead of 5?

  17. IM Reg DM Reg ALU IM Reg Reg ALU Mixed Instructions in the Pipeline CC1 CC2 CC3 CC4 CC5 CC6 lw add

  18. IM Reg DM Reg ALU Pipeline Principles • All instructions that share a pipeline must have the same stages in the same order. • therefore, add does nothing during Mem stage • sw does nothing during WB stage • All intermediate values must be latched each cycle. • There is no functional block reuse IF ID EX MEM WB

  19. Pipelined Datapath Instruction Fetch Instruction Decode/ Register Fetch Execute/ Address Calculation Memory Access Write Back registers!

  20. The Pipeline in Execution add $10, $1, $2 Instruction Decode/ Register Fetch Execute/ Address Calculation Memory Access Write Back Draw active datapath through example

  21. The Pipeline in Execution lw $12, 1000($4) add $10, $1, $2 Execute/ Address Calculation Memory Access Write Back

  22. The Pipeline in Execution sub $15, $4, $1 lw $12, 1000($4) add $10, $1, $2 Memory Access Write Back

  23. The Pipeline in Execution Instruction Fetch sub $15, $4, $1 lw $12, 1000($4) add $10, $1, $2 Write Back

  24. The Pipeline in Execution Instruction Fetch Instruction Decode/ Register Fetch sub $15, $4, $1 lw $12, 1000($4) add $10, $1, $2

  25. The Pipeline in Execution Instruction Fetch Instruction Decode/ Register Fetch sub $15, $4, $1 lw $12, 1000($4) add $10, $1, $2 Write Register is wrong Did someone spot the error??

  26. The Pipeline in Execution Instruction Fetch Instruction Decode/ Register Fetch Execute/ Address Calculation sub $15, $4, $1 lw $12, 1000($4)

  27. The Pipeline, with controls But….

  28. Pipeline Control Control Logic X. Combinational Logic Y. FSM or Microprogram

  29. control instruction IF/ID ID/EX EX/MEM MEM/WB Pipelined Control

  30. Pipelined Control Signals

  31. The Pipeline with Control Logic

  32. Is it really this simple?

  33. IM Reg DM Reg ALU IM Reg DM Reg ALU IM Reg DM Reg ALU IM Reg DM Reg ALU ALU CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 sub $2, $1, $3 and $12, $2, $5 or $13, $6, $2 add $14, $2, $2 IM Reg DM sw $15, 100($2) • What just happened here which is problematic (BEST ANSWER)? • The register file is trying to read and write the same register • The ALU and data memory are both active in the same cycle • A value is used before it is produced • Both A and B • Both A and C Neg edge!

  34. IM Reg DM Reg ALU IM Reg DM Reg ALU IM Reg DM Reg ALU IM Reg DM Reg ALU ALU Data Hazards R2 Available • When a result is needed in the pipeline before it is available, a “data hazard” occurs. CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 sub $2, $1, $3 and $12, $2, $5 R2 Needed or $13, $6, $2 add $14, $2, $2 Use red and black lines! IM Reg DM sw $15, 100($2)

  35. Hazards continued • What happens when... add $3, $10, $11 lw $8, 1000($3) sub $11, $8, $7 Draw dependencies

  36. The Pipeline in Execution add $3, $10, $11 Execute/ Address Calculation Memory Access Write Back lw $8, 1000($3)

  37. The Pipeline in Execution sub $11, $8, $7 lw $8, 1000($3) add $3, $10, $11 Memory Access Write Back

  38. The Pipeline in Execution add $10, $1, $2 sub $11, $8, $7 lw $8, 1000($3) • add $3, $10, $11 Write Back

  39. Pipelining Key Points • ET = IC * CPI * CT • We achieve high throughput without reducing instruction latency. • Pipelining exploits a special kind of parallelism (parallelism between functionality required in different cycles). • Pipelining uses combinational logic to generate (and registers to propagate) control signals. • Pipelining creates potential hazards.

  40. Summary comparison (MIPS designs) CPI CT Single Cycle Multi-cycle Pipeline

More Related