
Chapter 4

CprE 381 Computer Organization and Assembly Level Programming, Fall 2013. Chapter 4. The Processor. Zhao Zhang Iowa State University Revised from original slides provided by MKP. Week 8 Overview. CPU design overview Datapath and Control Control Unit ALU Control Unit. Announcements.


Presentation Transcript


  1. CprE 381 Computer Organization and Assembly Level Programming, Fall 2013 Chapter 4 The Processor Zhao Zhang Iowa State University Revised from original slides provided by MKP

  2. Week 8 Overview • CPU design overview • Datapath and Control • Control Unit • ALU Control Unit Chapter 1 — Computer Abstractions and Technology — 2

  3. Announcements • Mini-project B starts in week 9 • Mini-projects B and C will be revised • The grading scale will be discussed by Friday (week 8)

  4. Introduction §4.1 Introduction • CPU performance factors • Instruction count • Determined by ISA and compiler • CPI and Cycle time • Determined by CPU hardware • We will examine two MIPS implementations • A simplified, single-cycle version • A more realistic, pipelined version Chapter 4 — The Processor — 4

  5. Nine-Instruction MIPS • We will first use a MIPS subset of nine instructions, then extend the subset • It is enough to illustrate most aspects of CPU design, particularly datapath and control design • Memory reference: LW and SW • Arithmetic/logic: ADD, SUB, AND, OR, SLT • Branch: BEQ, BNE

  6. Instruction Execution • PC → instruction memory: fetch instruction • Register numbers → register file: read registers • Then, depending on instruction class: • Execute: use ALU to calculate • Arithmetic result • Memory address for load/store • Branch target address • Memory access: access data memory for load/store • Register writeback: write data back to registers • PC update (for all): PC ← target address or PC + 4

  7. CPU Overview • A sketchy view • Next sequential PC = PC + 4 • Branch target = (PC + 4) + offset • An instruction may change: • PC (all instructions) • Some register (arithmetic/logic, load) • Some memory word/halfword/byte (store)

  8. Multiplexers • Can’t just join wires together • Use multiplexers • What would happen if you just joined signals in VHDL?

  9. Control • Control signals: mux select, read/write enable, ALU opcode, etc.

  10. Logic Design Basics (§4.2 Logic Design Conventions) • Combinational elements • Operate on data • Output is a function of input • State (sequential) elements • Store information • Output is a function of internal state and input

  11. Combinational Elements • AND gate: Y = A & B • Adder: Y = A + B • Arithmetic/Logic Unit: Y = F(A, B) • Multiplexer: Y = S ? I1 : I0
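The four combinational elements on this slide can be modeled as pure functions. The following is a minimal Python sketch, not the course's VHDL; the function names, the 32-bit width, and the string operation names passed to `alu` are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit data width, results wrap like a 32-bit bus


def and_gate(a: int, b: int) -> int:
    """Y = A & B"""
    return a & b


def adder(a: int, b: int) -> int:
    """Y = A + B, truncated to 32 bits."""
    return (a + b) & MASK32


def mux2(s: int, i0: int, i1: int) -> int:
    """Y = S ? I1 : I0"""
    return i1 if s else i0


def alu(f: str, a: int, b: int) -> int:
    """Y = F(A, B); f is a hypothetical operation name, not a real ALU code."""
    if f == "and":
        return a & b
    if f == "or":
        return a | b
    if f == "add":
        return (a + b) & MASK32
    if f == "sub":
        return (a - b) & MASK32
    raise ValueError(f"unsupported ALU function: {f}")
```

Each output depends only on the current inputs, which is what makes these elements combinational.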

  12. Sequential Elements • Register: stores data in a circuit • Uses a clock signal to determine when to update the stored value • Edge-triggered: update when Clk changes from 0 to 1 • Data output Q is stable for a clock cycle

  13. Sequential Elements • Register with write control • Only updates on clock edge when write control input is 1 • VHDL: rising_edge(Clk) and Write = '1' • Used when stored value is required later
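A register with write control can be sketched behaviorally as an object whose stored value changes only when a clock edge arrives while the write input is 1. This Python model is an illustration under assumed names (`Register`, `rising_edge`); it is not the VHDL used in the labs.

```python
class Register:
    """Edge-triggered register with a write-enable input (behavioral model)."""

    def __init__(self, value: int = 0):
        self.q = value  # stored value, visible on output Q

    def rising_edge(self, d: int, write: int = 1) -> int:
        """Model one rising clock edge: latch D only if write is 1."""
        if write:
            self.q = d
        return self.q


pc = Register(0x00400000)
pc.rising_edge(0x00400004, write=1)  # value is updated
pc.rising_edge(0x12345678, write=0)  # value is held
```

Between edges the output `q` does not change, matching the slide's point that Q is stable for a clock cycle.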

  14. Clocking Methodology • Combinational logic transforms data during clock cycles • Input from state elements • Output must stabilize within one cycle • Longest delay determines clock period • Output to state element at the next rising edge

  15. Clocking Methodology • Processor is a big state machine • Works like a Moore machine in non-I/O phase • Output is a function of the state • States include PC, all registers and memory contents

  16. Building a Datapath (§4.3 Building a Datapath) • Datapath elements • Elements that process data and addresses in the CPU • Registers, ALUs, muxes, memories, … • We will build a MIPS datapath incrementally • Refining the overview design

  17. Instruction Fetch • PC: 32-bit register • Adder: increment by 4 for next instruction • Datapath elements: PC register, instruction memory, 32-bit adder

  18. R-Format Instructions • Read two register operands • Perform arithmetic/logical operation • Write register result • Datapath elements: register file, ALU

  19. Load/Store Instructions • Read register operands • Calculate address using 16-bit offset • Use ALU, but sign-extend offset • Load: read memory and update register • Store: write register value to memory • Datapath elements: data memory, sign extender
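The address calculation for load/store can be written out as a small worked example: sign-extend the 16-bit offset to 32 bits, then add it to the base register value in the ALU. The helper names below are hypothetical, and the sketch is in Python rather than VHDL.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Replicate bit 15 into bits 31:16 (result is a 32-bit pattern)."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:
        return imm16 | 0xFFFF0000
    return imm16


def mem_address(base: int, imm16: int) -> int:
    """Effective address = base register + sign-extended offset (32-bit wrap)."""
    return (base + sign_extend16(imm16)) & MASK32
```

For instance, with a base of 0x1000, an offset field of 0xFFFC (that is, -4) gives address 0x0FFC, while an offset of 8 gives 0x1008.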

  20. Branch Instructions • Read register operands • Compare operands • Use ALU, subtract and check Zero output • Calculate target address • Sign-extend displacement • Shift left 2 places (word displacement) • Add to PC + 4 • Already calculated by instruction fetch
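The branch steps listed above can be sketched as follows: subtract the operands to get the Zero flag, sign-extend and shift the displacement, and add it to PC + 4. This is an illustrative Python model for BEQ only; the function names are assumptions, not part of the course code.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Replicate bit 15 into bits 31:16."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16


def branch_target(pc: int, imm16: int) -> int:
    """Target = (PC + 4) + (sign-extended displacement << 2)."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & MASK32


def next_pc_beq(pc: int, rs_val: int, rt_val: int, imm16: int) -> int:
    """BEQ: take the branch when the ALU subtraction yields Zero."""
    zero = ((rs_val - rt_val) & MASK32) == 0
    return branch_target(pc, imm16) if zero else (pc + 4) & MASK32
```

A displacement of -1 (0xFFFF) at PC = 0x00400000 targets 0x00400000 itself, since the shifted displacement of -4 cancels the +4.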

  21. Branch Instructions • Shift left 2: just re-routes wires • Sign extend: sign-bit wire replicated • New: shifter, second 32-bit adder

  22. Composing the Elements • First-cut datapath does an instruction in one clock cycle • Each datapath element can only do one function at a time • Hence, we need separate instruction and data memories • Use multiplexers where alternate data sources are used for different instructions

  23. R-Type/Load/Store Datapath

  24. Full Datapath

  25. Performance Issues • Longest delay determines clock period • Critical path: load instruction • Instruction memory → register file → ALU → data memory → register file • Not every instruction requires the same time
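To see why the load instruction sets the clock period, one can add up unit delays along each instruction's path. The delay values below are assumed, illustrative numbers and do not come from the slides; only the paths follow the datapath described here.

```python
# Assumed per-unit delays in picoseconds (illustrative values only).
DELAY_PS = {
    "instr_mem": 200,
    "reg_read": 100,
    "alu": 200,
    "data_mem": 200,
    "reg_write": 100,
}

# Units each instruction class passes through in the single-cycle datapath.
PATHS = {
    "lw": ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "sw": ["instr_mem", "reg_read", "alu", "data_mem"],
    "r_type": ["instr_mem", "reg_read", "alu", "reg_write"],
    "beq": ["instr_mem", "reg_read", "alu"],
}


def path_delay(instr: str) -> int:
    """Total delay of one instruction class, in ps."""
    return sum(DELAY_PS[unit] for unit in PATHS[instr])


def clock_period() -> int:
    """A single-cycle design must use the longest path as its period."""
    return max(path_delay(instr) for instr in PATHS)
```

Under these assumed delays the load path totals 800 ps, so every instruction, including a 500 ps branch, would be given an 800 ps cycle.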

  26. Performance Issues • Some instructions may take substantially longer, e.g. multiply/divide • Not feasible to vary the clock cycle for different instructions • Must use the worst-case delay as the clock cycle • Violates the design principle of making the common case fast • We will improve performance by pipelining

  27. ALU Control (§4.4 A Simple Implementation Scheme) • ALU used for • Load/Store: F = add • Branch: F = subtract • R-type: F depends on funct field

  28. ALU Control • Assume 2-bit ALUOp derived from opcode • Combinational logic derives ALU control
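The two-level decoding can be sketched as a lookup: ALUOp selects add or subtract directly, and only for R-type is the funct field consulted. The 4-bit ALU codes and funct values below follow the encoding in the Patterson and Hennessy textbook, which is assumed here because the slide's table is not in the transcript; the function name is hypothetical.

```python
# funct field (bits 5:0) -> assumed 4-bit ALU control code
FUNCT_TO_ALU = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # sub
    0b100100: 0b0000,  # and
    0b100101: 0b0001,  # or
    0b101010: 0b0111,  # slt
}


def alu_control(alu_op: int, funct: int) -> int:
    """Derive the ALU control code from 2-bit ALUOp and the funct field."""
    if alu_op == 0b00:      # load/store: add
        return 0b0010
    if alu_op == 0b01:      # branch: subtract
        return 0b0110
    if alu_op == 0b10:      # R-type: depends on funct
        return FUNCT_TO_ALU[funct & 0x3F]
    raise ValueError("ALUOp 11 is not used in this subset")
```

For load/store and branch the funct argument is ignored, which mirrors the don't-care entries a truth table for this logic would contain.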

  29. VHDL Notes • How to program the ALU control?
      -- Behavior style
      process (alu_op, funct)
      begin
        case alu_op is
          when "00" => alu_code <= "0010";  -- load/store: add
          when "01" => …
        end case;
      end process;

  30. The Main Control Unit • Control signals derived from instruction • R-type: opcode 0 [31:26], rs [25:21], rt [20:16], rd [15:11], shamt [10:6], funct [5:0] • Load/Store: opcode 35 or 43 [31:26], rs [25:21], rt [20:16], address [15:0] • Branch: opcode 4 [31:26], rs [25:21], rt [20:16], address [15:0] • opcode: bits 31:26 in every format • rs: always read • rt: read, except for load • rd (R-type) or rt (load): written • address: sign-extend and add
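The bit ranges above translate directly into shifts and masks. The following Python sketch extracts every field from a 32-bit instruction word; the function name and the dictionary keys are chosen for illustration.

```python
def decode_fields(instr: int) -> dict:
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,  # bits 31:26
        "rs": (instr >> 21) & 0x1F,      # bits 25:21
        "rt": (instr >> 16) & 0x1F,      # bits 20:16
        "rd": (instr >> 11) & 0x1F,      # bits 15:11 (R-type only)
        "shamt": (instr >> 6) & 0x1F,    # bits 10:6  (R-type only)
        "funct": instr & 0x3F,           # bits 5:0   (R-type only)
        "address": instr & 0xFFFF,       # bits 15:0  (load/store/branch)
    }


add_fields = decode_fields(0x012A4020)  # add $t0, $t1, $t2
lw_fields = decode_fields(0x8FA80004)   # lw  $t0, 4($sp)
```

The hardware extracts all fields in parallel regardless of format; the control signals then decide which of them are actually used.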

  31. Datapath With Control

  32. Summary of Control Signals • RegDst: Write to register rt or rd? • ALUSrc: Immediate to ALU? • MemtoReg: Write memory or ALU output? • RegWrite: Write to regfile at all? • MemRead: Read from Data Memory? • MemWrite: Write to the Data Memory? • Branch: Is it a branch instruction? • ALUOp[1:0]: ALU control field

  33. R-Type Instruction

  34. R-Type: Control Signals • RegDst = 1 (write to rd) • ALUSrc = 0 (no immediate) • MemtoReg = 0 (write data not from memory) • RegWrite = 1 (does write regfile) • MemRead = 0 (no memory read) • MemWrite = 0 (no memory write) • Branch = 0 (not a branch) • ALUOp[1:0] = 10 (R-type ALU op)

  35. Load Instruction

  36. Load: Control Signals • RegDst = 0 • ALUSrc = 1 • MemtoReg = 1 • RegWrite = 1 • MemRead = 1 • MemWrite = 0 • Branch = 0 • ALUOp[1:0] = 00

  37. Store: Control Signals • RegDst = X • ALUSrc = 1 • MemtoReg = X • RegWrite = 0 • MemRead = 0 • MemWrite = 1 • Branch = 0 • ALUOp[1:0] = 00

  38. Branch-on-Equal Instruction

  39. BEQ: Control Signals • RegDst = X • ALUSrc = 0 • MemtoReg = X • RegWrite = 0 • MemRead = 0 • MemWrite = 0 • Branch = 1 • ALUOp[1:0] = 01

  40. Control Signal Setting • What are the control signal values for each instruction or instruction type? • Note: “R-” means R-format
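One way to answer this question is to collect the per-instruction settings from the preceding slides into a single opcode-indexed table. The Python sketch below does that for R-format, LW, SW, and BEQ; the names `Control` and `main_control` are hypothetical, don't-care (X) entries are filled with 0 as an assumption, and unknown opcodes fall back to all zeros.

```python
from collections import namedtuple

Control = namedtuple(
    "Control",
    "reg_dst alu_src mem_to_reg reg_write mem_read mem_write branch alu_op",
)

# opcode -> control signals; X (don't care) is stored as 0 here by assumption
CONTROL_TABLE = {
    0:  Control(1, 0, 0, 1, 0, 0, 0, 0b10),  # R-format
    35: Control(0, 1, 1, 1, 1, 0, 0, 0b00),  # lw
    43: Control(0, 1, 0, 0, 0, 1, 0, 0b00),  # sw (RegDst, MemtoReg are X)
    4:  Control(0, 0, 0, 0, 0, 0, 1, 0b01),  # beq (RegDst, MemtoReg are X)
}

NOP = Control(0, 0, 0, 0, 0, 0, 0, 0b00)  # no state is written


def main_control(opcode: int) -> Control:
    """Look up the control signals for a 6-bit opcode."""
    return CONTROL_TABLE.get(opcode & 0x3F, NOP)
```

Defaulting unknown opcodes to all zeros means an unsupported instruction writes neither a register nor memory.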

  41. VHDL Notes • How to program the control?
      entity control is
        port (op_code    : in  m32_6bits;
              reg_dst    : out m32_1bit;
              alu_src    : out m32_1bit;
              mem_to_reg : out m32_1bit;
              reg_write  : out m32_1bit;
              mem_read   : out m32_1bit;
              mem_write  : out m32_1bit;
              branch     : out m32_1bit;
              alu_op     : out m32_2bits);
      end control;

  42. VHDL Notes
      architecture rom of control is
        subtype code_t is m32_vector(8 downto 0);
        type rom_t is array (0 to 63) of code_t;
        -- The ROM content for control signals
        signal rom : rom_t := (
          00 => "100100010",  -- R-type
          35 => "011110000",  -- LW
          …                   -- More for other instructions
          others => "000000000");
      begin
        (reg_dst, alu_src, mem_to_reg, reg_write, mem_read, mem_write,
         branch, alu_op(1), alu_op(0)) <= rom(to_integer(unsigned(op_code)));
      end rom;

  43. Implementing Jumps • Jump format: opcode 2 [31:26], address [25:0] • Jump uses word address • Update PC with concatenation of • Top 4 bits of old PC • 26-bit jump address • 00 • Need an extra control signal, Jump, decoded from opcode
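The concatenation described above can be checked with a short calculation. This Python sketch takes the top 4 bits from PC + 4, which is the usual MIPS convention and is assumed here (the slide says "old PC"; the two only differ at a 256 MB region boundary); the function name is hypothetical.

```python
def jump_target(pc: int, addr26: int) -> int:
    """Target = top 4 bits of (PC + 4) : 26-bit word address : 00."""
    upper = (pc + 4) & 0xF0000000       # bits 31:28
    word = (addr26 & 0x03FFFFFF) << 2   # bits 27:2, low two bits become 00
    return upper | word
```

For example, a jump at PC = 0x00400000 with address field 0x0100004 lands on 0x00400010.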

  44. Datapath With Jumps Added

  45. Grading Scale • Tentative grading scale • A: 90, A-: 87 • B+: 84, B: 80, B-: 75 • C+: 70, C: 65, C-: 60 • D: 50 • There will be a bonus in lab projects

  46. Mini-Project B, Tentative • Implement a single-cycle processor (SCP) in three parts • Part 1, SCPv1: Implement the nine-instruction ISA plus the J instruction • Part 2, SCPv2a: Support all the instructions needed to run bubble sort • Part 3, SCPv2b: Detailed modeling of data elements

  47. Mini-Project B • Bonus part, SCPv3: Support all integer instructions on the green sheet, due in the last lab • Some support files will be provided • High-level modeling of Register File, ALU, Adder, to be used in Parts 1 and 2 • Partial sample VHDL code will be provided

  48. Mini-Project B • The CPU composition must be strongly structural • Parts 1 and 2 may use behavior/dataflow modeling for data elements • Part 3 must use detailed modeling for data elements • Reuse your VHDL code from the labs

  49. Extend Single-Cycle MIPS • Consider the following instructions • addi: add immediate • sll: shift left logical by a constant • bne: branch if not equal • jal: jump and link • jr: jump register

  50. SCPv0: R-Format, LW/SW, BEQ
