
Chapter 4:

22540 - Computer Arch. & Org. (2): Instruction-Level Parallelism

Presentation Transcript


  1. Chapter 4: 22540 - Computer Arch. & Org. (2) Instruction-Level Parallelism

  2. CPU Operation Review • Format: Opcode │ Operands [Figure: datapath with PC, Instruction Memory, Register File (Sel A, Sel B, Sel C; Data A, Data B, Data C), Sign Extend (Offset, Addr, Immediate), Shift Left 2, two Adders, ALU, Data Memory and multiplexers]

  3. CPU Operation Review • ADD R3, R1, R2 • Format: 0 │ Rs │ Rt │ Rd │ Function [Figure: datapath as on slide 2]

  4. CPU Operation Review • ADD R2, R1, +5 • Format: Opcode │ Rs │ Rt │ Immediate [Figure: datapath as on slide 2]

  5. CPU Operation Review • LD R2, M[R1 + 5] • Format: Opcode │ Rs │ Rt │ Address [Figure: datapath as on slide 2]

  6. CPU Operation Review • ST M[R1 - 4], R2 • Format: Opcode │ Rs │ Rt │ Address [Figure: datapath as on slide 2]

  7. CPU Operation Review • JMP +3 • Format: Opcode │ Rs │ 0 │ Offset [Figure: datapath as on slide 2]

  8. CPU Operation Review • JE R1, R2, +3 • Format: Opcode │ Rs │ Rt │ Offset [Figure: datapath as on slide 2]
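The six instruction forms reviewed above can be sketched as a tiny interpreter. This is only an illustration under stated assumptions, not the datapath from the slides: the tuple encoding, the `ADDI` mnemonic for the immediate form of ADD, the dictionary-based register file and memory, and a PC that counts instructions (with offsets relative to the next instruction) are all hypothetical choices.

```python
# Minimal interpreter for the six instruction forms in the CPU review.
# Assumptions (not from the slides): registers and memory are dicts,
# the PC counts instructions, and jump offsets are relative to PC + 1.

def step(pc, regs, mem, instr):
    """Execute one instruction tuple and return the next PC."""
    op = instr[0]
    next_pc = pc + 1
    if op == "ADD":                      # ADD Rd, Rs, Rt
        _, rd, rs, rt = instr
        regs[rd] = regs[rs] + regs[rt]
    elif op == "ADDI":                   # ADD Rt, Rs, +imm
        _, rt, rs, imm = instr
        regs[rt] = regs[rs] + imm
    elif op == "LD":                     # LD Rt, M[Rs + offset]
        _, rt, rs, offset = instr
        regs[rt] = mem[regs[rs] + offset]
    elif op == "ST":                     # ST M[Rs + offset], Rt
        _, rs, offset, rt = instr
        mem[regs[rs] + offset] = regs[rt]
    elif op == "JMP":                    # JMP +offset (unconditional)
        _, offset = instr
        next_pc += offset
    elif op == "JE":                     # JE Rs, Rt, +offset (taken if equal)
        _, rs, rt, offset = instr
        if regs[rs] == regs[rt]:
            next_pc += offset
    else:
        raise ValueError(f"unknown opcode: {op}")
    return next_pc


def run(program, regs, mem):
    """Run until the PC leaves the program."""
    pc = 0
    while 0 <= pc < len(program):
        pc = step(pc, regs, mem, program[pc])
    return regs, mem


program = [
    ("LD", "R2", "R1", 5),       # R2 = M[R1 + 5]
    ("ADD", "R3", "R1", "R2"),   # R3 = R1 + R2
    ("JE", "R1", "R1", 1),       # always taken: skips the next instruction
    ("ADDI", "R3", "R3", 100),   # skipped
    ("ST", "R1", -4, "R3"),      # M[R1 - 4] = R3
]
regs, mem = run(program, {"R1": 8, "R2": 0, "R3": 0}, {13: 42})
```

Each branch of `step` corresponds to one of the instruction formats on slides 3 to 8; the real datapath does the same work with multiplexers selecting between register, immediate and PC-relative paths.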

  9. Pipelining • Non-Pipelined Process • Stages: Fetch Instr., Get Operands, Execute, Store Result • Instr. 1 occupies all four stages [Figure: PC, Instr. Mem., Register File, ALU, Data Mem., with delays τALU, τMem, τMem, τReg]

  10. Pipelining • Non-Pipelined Process • Stages: Fetch Instr., Get Operands, Execute, Store Result • Instr. 2 occupies all four stages [Figure: PC, Instr. Mem., Register File, ALU, Data Mem., with delays τALU, τMem, τMem, τReg]

  11. Pipelining • Non-Pipelined Process • Clock Period = • CPI (Clocks per Instruction) = [Figure: PC, Instr. Mem., Register File, ALU, Data Mem., with delays τALU, τMem, τMem, τReg]

  12. Pipelining • Pipelined Process • Stages: Fetch Instr., Get Operands, Execute, Store Result • Instr. 1 to Instr. 5 overlap, each in a different stage [Figure: PC, Instr. Mem., IR, Register File, X, Y, ALU, Result, Data Mem., with delays τALU, τMem, τMem, τReg]

  13. Pipelining • Pipelined Process [Figure: PC, Instr. Mem., IR, Register File, X, Y, ALU, Result, Data Mem.]

  14. Pipelining • Pipelined Process • Clock Period = • CPI = [Figure: PC, Instr. Mem., IR, Register File, X, Y, ALU, Result, Data Mem., with delays τALU, τMem, τMem, τReg]
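The values for "Clock Period" and "CPI" did not survive in the transcript. The sketch below uses the common textbook treatment rather than the slides' own figures: it assumes the non-pipelined design runs every stage inside one long clock (period = sum of the stage delays, CPI = 1), and that the pipelined design is clocked by its slowest stage and needs a few extra cycles to fill. The stage names and picosecond delays are hypothetical.

```python
# Hypothetical stage delays in picoseconds (not taken from the slides).
DELAYS_PS = {
    "fetch": 200,         # instruction memory
    "get_operands": 100,  # register file
    "execute": 200,       # ALU
    "store_result": 200,  # data memory / register write
}


def non_pipelined(delays):
    """Assumed single long clock covering every stage: returns (period, CPI)."""
    return sum(delays.values()), 1.0


def pipelined(delays, n_instructions):
    """Clock set by the slowest stage; CPI includes the pipeline fill time."""
    period = max(delays.values())
    cycles = len(delays) + n_instructions - 1  # fill once, then one per cycle
    return period, cycles / n_instructions


def speedup(delays, n_instructions):
    """Ratio of total execution times, non-pipelined over pipelined."""
    p0, cpi0 = non_pipelined(delays)
    p1, cpi1 = pipelined(delays, n_instructions)
    return (p0 * cpi0) / (p1 * cpi1)
```

With these numbers the speedup approaches 700 / 200 = 3.5 for long instruction streams, below the stage count of 4 because the stages are not perfectly balanced.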

  15. Pipelining Hazards • Structural Hazards: the hardware can't support the instruction combination at a certain time. Example: [Figure: pipelined datapath]

  16. Pipelining Hazards • Data Hazards: one instruction has to wait for another to complete. Example: [Figure: pipelined datapath]

  17. Pipelining Hazards • Data Hazards: one instruction has to wait for another to complete. [Figure: pipelined datapath]

  18. Pipelining Hazards • Data Hazards: one instruction has to wait for another to complete. • Forwarding: ADD R3, R1, R2 [Figure: pipelined datapath]
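A data hazard and the effect of forwarding can be sketched with a simple in-order scheduler. The model is an assumption, not the slides' example: a four-stage pipeline (fetch, get operands, execute, store result) in which, without forwarding, a consumer must start at least 3 cycles after its producer so that its operand read follows the producer's register write, and with forwarding it can start 1 cycle after. Load-use delays are ignored, and the `(dest, sources)` encoding is hypothetical.

```python
def schedule(program, forwarding):
    """Return the start cycle of each instruction in an in-order pipeline.

    program: list of (dest_register_or_None, tuple_of_source_registers).
    """
    distance = 1 if forwarding else 3   # assumed producer-to-consumer gap
    start = []
    last_writer = {}                    # register -> index of latest writer
    for i, (dest, sources) in enumerate(program):
        earliest = start[-1] + 1 if start else 0   # in-order, one per cycle
        for reg in sources:
            if reg in last_writer:                  # read-after-write
                earliest = max(earliest, start[last_writer[reg]] + distance)
        start.append(earliest)
        if dest is not None:
            last_writer[dest] = i
    return start


def stall_cycles(program, forwarding):
    """Bubbles inserted compared with issuing one instruction every cycle."""
    start = schedule(program, forwarding)
    return start[-1] - (len(program) - 1) if start else 0


program = [
    ("R3", ("R1", "R2")),   # ADD R3, R1, R2
    ("R4", ("R3", "R1")),   # ADD R4, R3, R1  (needs R3 from the line above)
    ("R5", ("R1", "R2")),   # ADD R5, R1, R2  (independent)
]
```

Under these assumptions the dependent ADD costs two bubbles without forwarding and none with it, because the ALU result is handed directly to the next instruction's execute stage instead of going through the register file.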

  19. Pipelining Hazards • Control Hazards: a decision depends on the result of an unfinished instruction. Example: [Figure: pipelined datapath]

  20. Pipelining Hazards • Control Hazards: a decision depends on the result of an unfinished instruction. • Stall • Predict • Delayed Branch [Figure: pipelined datapath]
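The three ways of handling a control hazard can be compared with the usual effective-CPI estimate. This is a rough sketch: the formula is the standard textbook one rather than something stated on the slides, and the branch fraction, penalty and rates in the usage lines are hypothetical.

```python
def effective_cpi(base_cpi, branch_fraction, penalty_cycles, paid_fraction=1.0):
    """Average CPI once branch penalties are included.

    paid_fraction is the share of branches that actually pay the penalty:
      - stall:          1.0 (every branch waits for its outcome)
      - predict:        the misprediction rate
      - delayed branch: the share of delay slots the compiler could not fill
    """
    return base_cpi + branch_fraction * paid_fraction * penalty_cycles


# Hypothetical workload: 20% branches, 2-cycle penalty.
cpi_stall = effective_cpi(1.0, 0.20, 2)
cpi_predict = effective_cpi(1.0, 0.20, 2, paid_fraction=0.10)
cpi_delayed = effective_cpi(1.0, 0.20, 2, paid_fraction=0.30)
```

The point of the comparison is that prediction and delayed branches do not remove the penalty; they reduce how often it is paid.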

  21. Multiple Issue • Multiple instructions start execution in a single clock cycle • CPI < 1, i.e. IPC > 1 • Static / Dynamic • Speculation Example:

  22. Static Multiple Issue • Compiler Assisted • Issue Packet: the set of instructions issued in a given clock cycle; in effect, one large instruction with multiple operations, i.e. a Very Long Instruction Word (VLIW)

  23. Single-Issue Datapath • Stages: IF, ID, EX, ST [Figure: PC, Instr. Mem., Register File (Sel A, Sel B, Sel C; Data A, Data B, Data C), Sign Extend, Shift Left 2, ALU, Data Mem. (ADDR, Data), Exception Address, adders and multiplexers]

  24. Two-Issue Datapath • Stages: IF, ID, EX, ST [Figure: PC, Instr. Mem., Register File with two sets of ports (Sel A1, Sel B1, Sel C1, Data A1, Data B1, Data C1; Sel A2, Sel B2, Sel C2, Data A2, Data B2, Data C2), two Sign Extend units, ALU, an extra adder for ADDR, Data Mem., Exception Address, multiplexers]

  25. Two-Issue Datapath Example • Two 32-bit instructions: ALU/JMP and LD/ST • NOP Replacement Example:
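The pairing of one ALU/JMP instruction with one LD/ST instruction, with a NOP in any slot that cannot be used, can be sketched as a greedy in-order packer. This is an illustration under assumptions rather than the slides' example: the `(op, dest, sources)` encoding, the opcode sets, and the pairing rules (adjacent instructions only, no pairing after a branch, no read-after-write or write-after-write inside a packet) are hypothetical simplifications of what a real compiler would do.

```python
ALU_OPS = {"ADD", "SUB", "JMP", "JE"}   # first slot: ALU or jump
MEM_OPS = {"LD", "ST"}                  # second slot: load or store
BRANCHES = {"JMP", "JE"}
NOP = ("NOP", None, ())


def slot_of(op):
    """Return which half of the issue packet an opcode belongs to."""
    if op in ALU_OPS:
        return "alu"
    if op in MEM_OPS:
        return "mem"
    raise ValueError(f"unknown opcode: {op}")


def independent(first, second):
    """True if `second` may issue in the same packet as the earlier `first`."""
    _, dest1, _ = first
    _, dest2, sources2 = second
    if first[0] in BRANCHES:                       # keep code after a branch apart
        return False
    if dest1 is not None and dest1 in sources2:    # read-after-write
        return False
    if dest1 is not None and dest1 == dest2:       # write-after-write
        return False
    return True


def pack(program):
    """Greedily pair adjacent instructions; returns (alu_slot, mem_slot) packets."""
    packets = []
    i = 0
    while i < len(program):
        first = program[i]
        packet = {slot_of(first[0]): first}
        if i + 1 < len(program):
            second = program[i + 1]
            if slot_of(second[0]) not in packet and independent(first, second):
                packet[slot_of(second[0])] = second
                i += 1
        i += 1
        packets.append((packet.get("alu", NOP), packet.get("mem", NOP)))
    return packets


program = [
    ("LD", "R2", ("R1",)),         # R2 = M[R1]
    ("ADD", "R5", ("R4", "R4")),   # independent of the load: shares its packet
    ("ADD", "R3", ("R2", "R5")),   # needs both results: next packet
    ("ST", None, ("R1", "R3")),    # needs R3: cannot share with the ADD above
]
packets = pack(program)
```

Here four instructions take three packets, two of which carry a NOP. Reordering the code so that independent ALU and memory instructions sit next to each other is what reduces the NOP count.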

  26. Single-Issue Datapath Example • Stages: IF, ID, EX, ST [Figure: single-issue datapath with PC, Instr. Mem., Register File, Sign Extend, ALU, Data Mem., Exception Address]

  27. Single-Issue Datapath Example Original Code:

  28. Single-Issue Datapath Example Optimized Code:

  29. Two-Issue Datapath Example • Stages: IF, ID, EX, ST [Figure: two-issue datapath with PC, Instr. Mem., dual-ported Register File, two Sign Extend units, ALU, address adder, Data Mem., Exception Address]

  30. Two-Issue Datapath Example

  31. Dynamic Multiple Issue • Compiler Assisted (to move dependencies apart) • Hardware Decided • 0, 1 or more instructions issued in a given clock cycle. Superscalar Processors. • Compiled code runs correctly independent of the issue rate or pipeline structure.

  32. Dynamic Pipeline Scheduling • Extension to Dynamic Multiple Issue • Hardware Decided • Choose which instruction to execute in a given clock cycle. • Compiled code runs correctly independent of the issue rate or pipeline structure Example:

  33. Dynamic Pipeline Scheduling • Instruction Fetch, Decode & Issue Unit • Multiple Functional Units • Commit Unit [Figure: Instruction Fetch & Decode, four Reservation Stations, two Integer Functional Units, two Floating Point Functional Units, Commit Unit]

  34. Dynamic Pipeline Scheduling • Out-of-Order (O-o-O) Execution An operand may be in a register, reorder buffer or yet to be produced by a functional unit. • In-Order Issue • In-Order Commit
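In-order commit on top of out-of-order execution can be sketched by computing when each instruction is allowed to retire. This is a simplified model under assumptions not found on the slides: instructions are numbered in program order, at most one instruction commits per cycle, and an instruction may commit no earlier than the cycle after it finishes executing.

```python
def commit_cycles(finish_cycles):
    """Return the commit cycle of each instruction, in program order.

    finish_cycles[i] is the cycle in which instruction i finishes executing;
    these may be out of order, the commits never are.
    """
    commits = []
    previous = 0
    for finished in finish_cycles:
        # Wait for the instruction itself and for every older commit.
        previous = max(previous + 1, finished + 1)
        commits.append(previous)
    return commits


# Hypothetical finish times: instruction 1 completes long before instruction 0.
finish = [5, 2, 3, 9]
commits = commit_cycles(finish)
```

Instruction 1 finishes in cycle 2 but has to wait in the reorder buffer until instruction 0 has committed, which is what keeps the architectural state consistent if an older instruction raises an exception or a prediction turns out wrong.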

  35. Speculative Execution • Hardware-Based • Branch Predictions • Load Addresses • In-Order Commit • Assures correctness in case of wrong prediction

  36. Out-of-Order Scheduling Scoreboard • Scoreboarding (CDC 6600) Pipeline: • IF • IS • RD • EX • WB [Figure: Scoreboard, Register File, two Floating Point Multiply units, Floating Point Divide, Floating Point Add, Integer Unit]

  37. Out-of-Order Scheduling Scoreboard • Scoreboarding (CDC 6600) Pipeline: • IF • IS • RD • EX • WB Instruction Issue: • If the functional unit is available. • If no other active instruction has the same destination register.

  38. Out-of-Order Scheduling Scoreboard • Scoreboarding (CDC 6600) Pipeline: • IF • IS • RD • EX • WB Read Operands: • If no previously issued, still-active instruction has one of this instruction's operands as its destination.

  39. Out-of-Order Scheduling Scoreboard • Scoreboarding (CDC 6600) Pipeline: • IF • IS • RD • EX • WB Write Back Results: • Stall an instruction whose destination register is still waiting to be read by an earlier instruction.
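The three scoreboard conditions above (issue, read operands, write back) can be written as small predicates. This is a sketch under assumptions, not the CDC 6600 bookkeeping: the `Instr` record, the unit names such as `fdiv` and `fmul1`, and the F-register names are hypothetical, each functional unit is modelled as a single named resource, and `active` means issued but not yet written back.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    unit: str                     # functional unit the instruction needs
    dest: str                     # destination register
    sources: tuple                # source registers
    operands_read: bool = False   # set once the RD stage has completed


def can_issue(new, active):
    """IS: the unit is free and no active instruction has the same destination."""
    return all(a.unit != new.unit and a.dest != new.dest for a in active)


def can_read_operands(instr, earlier_active):
    """RD: no earlier active instruction will still write one of the sources."""
    return all(e.dest not in instr.sources for e in earlier_active)


def can_write_back(instr, earlier_active):
    """WB: no earlier active instruction still has to read the destination."""
    return all(e.operands_read or instr.dest not in e.sources
               for e in earlier_active)


div = Instr("fdiv", "F0", ("F2", "F4"))
add = Instr("fadd", "F10", ("F0", "F8"))   # reads F0, which div will write
mul = Instr("fmul1", "F8", ("F6", "F6"))   # writes F8, which add still has to read
```

In this example `add` can issue but must wait in RD until `div` has written F0, and `mul` may execute ahead of `add` but cannot write F8 until `add` has read its operands.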

  40-50. Out-of-Order Scheduling Example [Figures: example shown step by step across eleven slides]
