1 / 55

EEM 486 : Computer Architecture Lecture 4 Designing a Multicycle Processor

EEM 486 : Computer Architecture Lecture 4 Designing a Multicycle Processor. Processor. Input. Control. Memory. Datapath. Output. The Big Picture. Designing a Multiple Clock Cycle Datapath. OPcode. Control Logic / Store (PLA, ROM). Decode. microinstruction. Conditions.

kameko-mack
Télécharger la présentation

EEM 486 : Computer Architecture Lecture 4 Designing a Multicycle Processor

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. EEM 486: Computer ArchitectureLecture 4Designing a Multicycle Processor

  2. Processor Input Control Memory Datapath Output The Big Picture • Designing a MultipleClock CycleDatapath

  3. OPcode Control Logic / Store (PLA, ROM) Decode microinstruction Conditions Instruction Control Points Datapath Single-CycleProcessor In our single-cycle processor, each instruction is realized by exactly one control command or microinstruction

  4. Main Control op ALU control fun ALUSrc Equal ExtOp MemWr MemWr RegWr MemRd RegDst nPC_sel ALUctr Reg. Wrt ALU Register Fetch Ext Mem Access PC Instruction Fetch Next PC Data Mem Abstract View ofSingle Cycle-Processor

  5. What’s Wrong with CPI=1 Processor? Arithmetic & Logical PC Inst Memory Reg File ALU setup • Long Cycle Time • All instructions take as much time as the slowest • Real memory is not as nice as our idealized memory • Cannot always get the job done in one (short) cycle mux mux Load PC Inst Memory Reg File ALU Data Mem setup mux mux Critical Path Store Inst Memory PC Reg File ALU Data Mem mux Branch PC Inst Memory Reg File cmp mux

  6. Storage Array selected word line storage cell address bit line address decoder sense amps mem. bus proc. bus memory L2 Cache Cache Processor 1 time-period 20 - 50 time-periods 2-3 time-periods Memory Access Time • Physics => fast memories are small (large memories are slow) • => Use a hierarchy of memories

  7. Multicycle Approach • Break up the instructions into steps: • Let each step take one “smaller” clock cycle - Balancethe amount of work to be done - Restrict each cycle to use only one major functional unit Major functional units: Memory, Register File, and ALU • Let different instructions take different numbers of cycles • Use a functional unit more than once within execution of one instruction (Less hardware) • A single memory unit for both instructions and data • A single ALU, rather than an ALU and two adders • At the end of a cycle • store values for use in later cycles • introduce additional “internal” registers

  8. Equal Partitioning the CPI=1 Datapath • Add registers between smallest steps MemWr MemWr RegWr MemRd RegDst nPC_sel ALUSrc ExtOp ALUctr Reg. File Exec Operand Fetch Instruction Fetch Mem Access PC Next PC Data Mem Write back Memory access Execution Instruction fetch Decode and Operand fetch

  9. Recall: Step-by-step Processor Design Step 1: ISA => Logical Register Transfers Step 2: Components of the Datapath Step 3: RTL + Components => Datapath Step 4: Datapath + Logical RTs => Physical RTs Step 5: Physical RTs => Control

  10. Step 4: R-type (add, sub, . . .) inst Logical Register Transfers ADDU R[rd]<–R[rs] + R[rt]; PC <– PC + 4 Step 1. Instruction Fetch IR ← MEM[PC], PC ← PC + 4 Step 2. Instruction Decode and Register Fetch A ← R[rs], B ← R[rt] Step 3. Execution ALUOut ← A op B Step 4. Write-back R[rd] ← ALUOut

  11. MemRead=1 nPCWrite=1 IRWrite=1 Address PC Memory Instruction register MemData ALU Write Data 4 ALUctr=Add R-type - Fetch

  12. RegWrite=0 nPCWrite=0 MemRead=0 IRWrite=0 PC Address Instruction [25-21] Rs Read data 1 Memory A MemData Instruction [20-16] Rt Registers Write data B Read data 2 4 Instruction [15-11] Rw Write data Instruction register ALU ALUctr=x R-type – Decode/Register Fetch

  13. ALUSrcA=1 RegWrite=0 nPCWrite=0 MemRead=0 IRWrite=0 PC Rs Address Instruction [25-21] Read data 1 0 Memory A 1 Rt MemData Instruction [20-16] Registers ALU ALU Out Write data Read data 2 0 B Rw Instruction [15-11] 1 Write data 4 Instruction register ALUctr= Func ALUSrcB=0 R-type - Execution

  14. ALUSrcA=x RegWrite=1 nPCWrite=0 MemRead=0 IRWrite=0 PC Address Instruction [25-21] Rs A B Read data 1 Memory MemData Instruction [20-16] Rt ALU Out Registers Write data Read data 2 Instruction [15-11] Rw 4 Write data Instruction register ALU 0 0 1 1 ALUctr=x ALUSrcB=x R-type – Write Back

  15. Step 4: Logical immed inst Logical Register Transfers ORI R[rt] <– R[rs] OR ZExt(Im16);PC <– PC + 4 Step 1. Instruction Fetch IR ← MEM[PC], PC ← PC + 4 Step 2. Instruction Decode and Register Fetch A ← R[rs] Step 3. Execution ALUOut ← A OR ZExt(Im16) Step 4. Write-back R[rt] ← ALUOut

  16. Logical immediate - Execution ALUSrcA=1 RegWrite=0 nPCWrite=0 MemRead=0 IRWrite=0 Address PC Instruction [25-21] Rs 0 Read data 1 Memory A 1 MemData Instruction [20-16] Rt ALU Out ALU Registers Write data Inst [15-11] 0 B Read data 2 Instruction [15-0] Rw 1 4 Write data Instruction register 2 Zero extend 16 32 ALUctr=Or ALUSrcB=2

  17. RegDst=0 ALUSrcA=x RegWrite=1 nPCWrite=0 MemRead=0 IRWrite=0 Address PC Instruction [25-21] Rs 0 Read data 1 Memory A 1 MemData Instruction [20-16] Rt ALU Out ALU Registers Write data 0 B Read data 2 0 Inst [15-11] Instruction [15-0] Rw 1 1 4 Write data Instruction register 2 Zero extend 16 32 ALUctr=x ALUSrcB=x Logical immediate – Write Back

  18. Step 4 : Load inst Logical Register Transfers LW R[rt] <– MEM[R[rs] + SExt(Im16)];PC <– PC + 4 Step 1. Instruction Fetch IR ← MEM[PC], PC ← PC + 4 Step 2. Instruction Decode and Register Fetch A ← R[rs] Step 3. Memory address computation ALUOut ← A + SExt(Im16) Step 4. Memory access MDR ← Memory[ALUOut] Step 5. Load completion R[rt] ← MDR

  19. Load: Address Calculation RegDst=x ALUSrcA=1 RegWrite=0 nPCWrite=0 MemRead=0 IRWrite=0 Address PC Instruction [25-21] Rs 0 Read data 1 Memory A 1 MemData Instruction [20-16] Rt ALU Out Registers Write data ALU 0 B Read data 2 0 Inst [15-11] Instruction [15-0] Rw 1 1 4 Write data Instruction register 2 Zero/ Sign extend 16 32 ALUctr=Add ALUSrcB=2 ExtOp=Sign

  20. 0 1 0 1 Load: Memory Read RegDst=x ALUSrcA=x MemRead=1 RegWrite=0 nPCWrite=0 IRWrite=0 Instruction [31-26] Address 0 PC Instruction [25-21] Rs Memory Read data 1 1 A MemData Instruction [20-16] Rt ALU Out ALU Registers Write data 0 B Read data 2 Inst [15-11] Instruction [15-0] Rw 1 4 Write data Instruction register 2 MDR Extender 16 32 ALUctr=x ALUSrcB=x ExtOp=x IorD=1

  21. RegDst=0 ALUSrcA RegWrite=1 nPCWrite=0 MemRead=0 IRWrite=0 Address PC Instruction [25-21] Rs 0 0 Memory Read data 1 A 1 1 MemData Instruction [20-16] Rt ALU ALU Out Registers Write data 0 B Read data 2 0 Inst [15-11] Instruction [15-0] Rw 1 1 4 Write data Instruction register 2 0 1 MDR Extender 16 32 ALUctr=x ALUSrcB=x ExtOp=x MemtoReg=1 IorD=x Load: Write Back

  22. Step 4 : Store inst Logical Register Transfers SW MEM[R[rs] + SExt(Im16)] <– R[rt];PC <– PC + 4 Step 1. Instruction Fetch IR ← MEM[PC], PC ← PC + 4 Step 2. Instruction Decode and Register Fetch A ← R[rs], B ← R[rt] Step 3. Memory address computation ALUOut ← A + SExt(Im16) Step 4. Memory access Memory[ALUOut] ← B

  23. RegDst=x ALUSrcA=1 RegWrite=0 nPCWrite=0 MemRead=0 IRWrite=0 Address PC Instruction [25-21] Rs 0 0 Memory Read data 1 A 1 1 MemData Instruction [20-16] Rt ALU Out Registers Write data ALU 0 B Read data 2 0 Inst [15-11] Instruction [15-0] Rw 1 1 4 Write data Instruction register 2 0 1 MDR Extender 16 32 ALUctr=Add ALUSrcB=2 ExtOp=Sign MemtoReg=x IorD=x Store: Address Calculation

  24. RegDst=x ALUSrcA=x RegWrite=1 MemWrite=1 MemRead=0 nPCWrite=0 IRWrite=0 Address 0 PC Instruction [25-21] Rs 0 Memory Read data 1 1 A 1 MemData Instruction [20-16] Rt ALU ALU Out Registers Write data B 0 Read data 2 0 0 Inst [15-11] Instruction [15-0] Rw 1 1 1 4 Write data Instruction register 2 Extender MDR 16 32 IorD=1 ALUctr=x ALUSrcB=x MemtoReg=x ExtOp=x Store: Memory Write

  25. Step 4 : Branch inst Logical Register Transfers BEQ if R[rs] == R[rt]then PC <= PC + 4+SExt(Im16) || 00 else PC <= PC + 4 Step 1. Instruction Fetch IR ← MEM[PC], PC ← PC + 4 Step 2. Instruction Decode and Register Fetch A ← R[rs], B ← R[rt] ALUOut ← PC +SExt(Im16) || 00 Step 3. Branch completion If A = B, PC ← ALUOut

  26. RegDst=0 ALUSrcA=0 RegWrite=0 MemWrite=0 MemRead=0 IRWrite=0 Instruction [31-26] Address PC Instruction [25-21] 0 Rs 0 Memory Read data 1 1 A 1 MemData Instruction [20-16] Rt ALU Out Registers Write data ALU 0 B 0 0 Read data 2 Inst [15-11] Instruction [15-0] Rw 1 1 1 4 Write data Instruction register 2 3 Shift left 2 MDR Extender 16 32 IorD=x ALUctr=Add ExtOp=Sign ALUSrcB=3 MemtoReg=x Branch – Address Calculation

  27. PCWriteCond=1 PCWrite=0 PCSource=1 RegDst=x ALUSrcA=1 RegWrite=0 MemWrite=0 MemRead=0 IRWrite=0 1 0 Instruction [31-26] 0 Address PC Instruction [25-21] Rs 0 1 Memory Read data 1 A 1 MemData Instruction [20-16] Zero Rt ALU Out Registers Write data 0 0 ALU 0 B Read data 2 1 1 Inst [15-11] Instruction [15-0] Rw 1 4 Write data Instruction register 2 3 Shift left 2 MDR Extender 16 32 IorD=0 ALUctr=Sub ExtOp=x ALUSrcB=0 MemtoReg=x Branch:Execution

  28. PCWriteCond ALUOp PCWrite PCSource IorD ALUSrcB MemRead ALUSrcA Control MemWrite RegWrite ExtOp Op [5-0] MemtoReg RegDst IRWrite Instruction [31-26] 0 1 0 Address PC Instruction [25-21] 0 1 1 Rs Memory Read data 1 A MemData Instruction [20-16] Zero Rt ALU ALU Out 0 0 Registers Write data 0 B 1 1 Read data 2 Inst [15-11] Instruction [15-0] Rw 1 4 Write data Instruction register 2 3 Shift left 2 ALU Control MDR Extender 16 32 Instruction [ 5-0] Multicycle Processor

  29. Summary of Instruction Steps

  30. Simple Questions • How many cycles will it take to execute this code? lw $t2, 0($t3) lw $t3, 4($t3) beq $t2, $t3, Label #assume not add $t5, $t2, $t3 sw $t5, 8($t3)Label: ... • What is going on during the 8th cycle of execution? • In what cycle does the actual addition of $t2 and $t3 takes place?

  31. inputs (conditions) Next State Logic Control State Output Logic outputs (control points) Finite State Machine (FSM) Controller • State specifies control points for Register Transfer • Transfer occurs upon exiting state (same falling edge)

  32. P C W r i t e P C W r i t e C o n d I o r D M e m R e a d M e m W r i t e I R W r i t e C o n t r o l l o g i c M e m t o R e g P C S o u r c e A L U O p O u t p u t s A L U S r c B A L U S r c A R e g W r i t e R e g D s t N S 3 N S 2 N S 1 I n p u t s N S 0 5 4 3 2 1 0 p p p p p p 3 2 1 0 O O O O O O S S S S I n s t r u c t i o n r e g i s t e r S t a t e r e g i s t e r o p c o d e f i e l d FSM for Control

  33. IR <= MEM[PC] PC <= PC + 4 instruction fetch decode / operand fetch A <= R[rs], B <= R[rt] S <= PC + SX || 00 R-type ORi BEQ PC <= Next(PC,Equal) execute S <= A fun B S <= A or ZX S <= A + SX LW SW M<=MEM[S] MEM[S] <= B memory R[rd] <= S R[rt] <= S write-back R[rt] <= M Step 4  Control Specification

  34. Step 5  (datapath + state diagram control) • Translate RTs into control points • Assign states • Then go build the controller

  35. Instruction fetch Instruction decode / register fetch PCSource= 0 PCWrite IorD= 0 MemRead IRWrite ALUSrcA= 0 ALUSrcB= 01 ALUOp= 000 IorD= 1 MemRead IorD= 1 MemWrite RegDst= 0 RegWrite MemtoReg= 1 RegDst= 0 RegWrite MemtoReg= 0 ALUSrcA= 0 ALUSrcB= 11 ALUOp= 000 ExtOp= 1 RegDst= 1 RegWrite MemtoReg= 0 ALUSrcA= 1 ALUSrcB= 00 ALUOp= 100 Branch LW / SW ORi R-type ALUSrcA= 1 ALUSrcB= 00 ALUOp= 001 PCSource= 1 PCWriteCond ALUSrcA= 1 ALUSrcB= 10 ALUOp= 000 ExtOp= 1 ALUSrcA= 1 ALUSrcB= 10 ALUOp= 010 ExtOp= 0 SW LW Mapping RTs to Control Points

  36. IR <= MEM[PC] PC <= PC + 4 0000 A <= R[rs], B <= R[rt] S <= PC + SX || 00 0001 R-type ORi BEQ PC <= Next(PC,Equal) S <= A + SX S <= A fun B S <= A or ZX 1000 0110 0100 0011 LW SW M<=MEM[S] MEM[S] <= B 1001 1011 R[rd] <= S R[rt] <= S R[rt] <= M 0101 0111 1010 Assigning States

  37. Instruction fetch Instruction decode / register fetch PCSource= 0 PCWrite IorD= 0 MemRead IRWrite ALUSrcA= 0 ALUSrcB= 01 ALUOp= 000 IorD= 1 MemWrite ALUSrcA= 0 ALUSrcB= 11 ALUOp= 000 ExtOp= 1 ALUSrcA= 1 ALUSrcB= 00 ALUOp= 100 RegDst= 1 RegWrite MemtoReg= 0 RegDst= 0 RegWrite MemtoReg= 0 IorD= 1 MemRead RegDst= 0 RegWrite MemtoReg= 1 1 0 Branch LW / SW ORi R-type 3 ALUSrcA= 1 ALUSrcB= 00 ALUOp= 001 PCSource= 1 PCWriteCond ALUSrcA= 1 ALUSrcB= 10 ALUOp= 000 ExtOp= 1 ALUSrcA= 1 ALUSrcB= 10 ALUOp= 010 ExtOp= 0 4 6 8 SW LW 11 9 5 7 10 Control Logic – Datapath Control Outputs

  38. Instruction fetch Instruction decode / register fetch PCSource= 0 PCWrite IorD= 0 MemRead IRWrite ALUSrcA= 0 ALUSrcB= 01 ALUOp= 000 IorD= 1 MemWrite ALUSrcA= 0 ALUSrcB= 11 ALUOp= 000 ExtOp= 1 ALUSrcA= 1 ALUSrcB= 00 ALUOp= 100 RegDst= 1 RegWrite MemtoReg= 0 RegDst= 0 RegWrite MemtoReg= 0 IorD= 1 MemRead RegDst= 0 RegWrite MemtoReg= 1 1 0 Branch LW / SW ORi R-type 3 ALUSrcA= 1 ALUSrcB= 00 ALUOp= 001 PCSource= 1 PCWriteCond ALUSrcA= 1 ALUSrcB= 10 ALUOp= 000 ExtOp= 1 ALUSrcA= 1 ALUSrcB= 10 ALUOp= 010 ExtOp= 0 4 6 8 SW LW 11 9 5 7 10 Control Logic – Next State Function

  39. PLA Implementation

  40. Performance Evaluation • What is the average CPI? • State diagram gives CPI for each instruction type • Workload gives frequency of each type Type CPIi for type Frequency CPIi x freqIi Arith/Logic 4 40% 1.6 Load 5 30% 1.5 Store 4 10% 0.4 branch 3 20% 0.6 Average CPI:4.1

  41. Another Implementation Style

  42. Address Select Logic

  43. I n s t r u c t i o n d e c o d e / I n s t r u c t i o n f e t c h r e g i s t e r f e t c h 0 M e m R e a d 1 A L U S r c A = 0 I o r D = 0 A L U S r c A = 0 I R W r i t e A L U S r c B = 1 1 S t a r t A L U S r c B = 0 1 A L U O p = 0 0 A L U O p = 0 0 P C W r i t e ) P C S o u r c e = 0 0 ' ) e Q ) p y ' t E - J R ' B = ' p = = O ( ) p ' p W M e m o r y a d d r e s s S O ' O = B r a n c h ( J u m p p ( O ( c o m p u t a t i o n r o E x e c u t i o n c o m p l e t i o n ) ' c o m p l e t i o n W L ' = p O ( 2 6 8 9 A L U S r c A = 1 A L U S r c A = 1 A L U S r c B = 0 0 A L U S r c A = 1 P C W r i t e A L U S r c B = 1 0 A L U O p = 0 1 A L U S r c B = 0 0 P C S o u r c e = 1 0 A L U O p = 0 0 P C W r i t e C o n d A L U O p = 1 0 P C S o u r c e = 0 1 ( ) O ' p W = L ' ' S = W ' ) p M e m o r y M e m o r y O ( a c c e s s a c c e s s R - t y p e c o m p l e t i o n 3 5 7 R e g D s t = 1 M e m R e a d M e m W r i t e R e g W r i t e I o r D = 1 I o r D = 1 M e m t o R e g = 0 W r i t e - b a c k s t e p 4 R e g D s t = 0 R e g W r i t e M e m t o R e g = 1 Address Select Logic

  44. I n s t r u c t i o n d e c o d e / I n s t r u c t i o n f e t c h r e g i s t e r f e t c h 0 M e m R e a d 1 A L U S r c A = 0 I o r D = 0 A L U S r c A = 0 I R W r i t e A L U S r c B = 1 1 S t a r t A L U S r c B = 0 1 A L U O p = 0 0 A L U O p = 0 0 P C W r i t e ) P C S o u r c e = 0 0 ' ) e Q ) p y ' t E - J R ' B = ' p = = O ( ) p ' p W M e m o r y a d d r e s s S O ' O = B r a n c h ( J u m p p ( O ( c o m p u t a t i o n r o E x e c u t i o n c o m p l e t i o n ) ' c o m p l e t i o n W L ' = p O ( 2 6 8 9 A L U S r c A = 1 A L U S r c A = 1 A L U S r c B = 0 0 A L U S r c A = 1 P C W r i t e A L U S r c B = 1 0 A L U O p = 0 1 A L U S r c B = 0 0 P C S o u r c e = 1 0 A L U O p = 0 0 P C W r i t e C o n d A L U O p = 1 0 P C S o u r c e = 0 1 ( ) O ' p W = L ' ' S = W ' ) p M e m o r y M e m o r y O ( a c c e s s a c c e s s R - t y p e c o m p l e t i o n 3 5 7 R e g D s t = 1 M e m R e a d M e m W r i t e R e g W r i t e I o r D = 1 I o r D = 1 M e m t o R e g = 0 W r i t e - b a c k s t e p 4 R e g D s t = 0 R e g W r i t e M e m t o R e g = 1 Dispatch ROMs Dispatch ROM 1 Op Opcode name Value 000000 R-format 0110 jmp 000010 1001 beq 000100 1000 lw 100011 0010 sw 101011 0010 Dispatch ROM 2 Op Opcode name Value lw 100011 0011 sw 101011 0101

  45. Microprogramming Microprogramming: Designing the control as a program that implements the machine instructions in terms of microinstructions

  46. Microinstruction ??? User program plus Data this can change! Main Memory ADD SUB AND . . . one of these is mapped into one of these DATA execution unit AND microsequence e.g., Fetch Calc Operand Addr Fetch Operand(s) Calculate Save Answer(s) CPU control memory

  47. Microprogramming a Multicycle Processor 1) Choose datapath and sequencer architecture 2) Assign states and sequence of each (multicycle)instruction (i.e., define the controller FSM) 3) Choose microinstruction format (minimum bits to describe all allowable functions of sequencer and datapath) 4) Map instructionsinto microinstruction sequences

  48. Designing a Microinstruction Set 1) Start with list of control signals 2) Group signals together that make sense: called “fields” 3) Place fields in some logical order (e.g., ALU operation & ALU operands first andmicroinstruction sequencing last) 4) Create a symbolic legend for the microinstruction format, showing name of field values and how they set the control signals 5) To minimize the width, encode operations that will never be used at the same time

  49. PCWriteCond ALUOp PCWrite PCSource IorD ALUSrcB MemRead ALUSrcA Control MemWrite RegWrite ExtOp Op [5-0] MemtoReg RegDst IRWrite Instruction [31-26] 0 1 0 Address PC Instruction [25-21] 0 1 1 Rs Memory Read data 1 A MemData Instruction [20-16] Zero Rt ALU ALU Out 0 0 Registers Write data 0 B 1 1 Read data 2 Inst [15-11] Instruction [15-0] Rw 1 4 Write data Instruction register 2 3 Shift left 2 ALU Control MDR Extender 16 32 Instruction [ 5-0] Multicycle Processor

  50. Microinstruction fields

More Related