1 / 24

Computer Architecture: A Constructive Approach Pipelined Implementations of SMIPS Joel Emer Computer Science & Artif

Computer Architecture: A Constructive Approach Pipelined Implementations of SMIPS Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology. Two-Cycle SMIPS: Analysis. Register File. Fetch. Execute. stage. PC. Execute. Decode. fr. +4.

tomai
Télécharger la présentation

Computer Architecture: A Constructive Approach Pipelined Implementations of SMIPS Joel Emer Computer Science & Artif

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Computer Architecture: A Constructive Approach Pipelined Implementations of SMIPS Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology http://csg.csail.mit.edu/6.S078

  2. Two-Cycle SMIPS: Analysis Register File Fetch Execute stage PC Execute Decode fr +4 In any given clock cycle, lot of unused hardware ! Data Memory Inst Memory Pipelined execution of instructions to increase the through put http://csg.csail.mit.edu/6.S078

  3. f0 f1 f2 Instruction pipelining Valid/Invalid • Much more complicated than arithmetic pipelines, e.g., IFFT • The entities in an instruction pipeline are not independent of each other • This causes pipeline stalls or requires other fancy tricks to avoid stalls x inQ outQ sReg1 sReg2 http://csg.csail.mit.edu/6.S078

  4. Hazards in instruction pipelining • Structural hazard: Two instructions in the pipeline may require the same resource at the same time, e.g., contention for memory • Data hazard: An instruction in the pipeline may affect the state of the machine (rf, dMem or pc) – the next instruction must be fully cognizant of this change • Control hazard: An instruction in the pipeline may determine the next instruction to be executed, e.g., branches Notice that none of these hazards are present in the IFFT pipeline. http://csg.csail.mit.edu/6.S078

  5. The power of computers comes from the fact that the instructions in a program are not independent of each other must deal with hazard http://csg.csail.mit.edu/6.S078

  6. Two-Cycle SMIPS Register File stage PC Execute Decode ir +4 Data Memory Inst Memory Register ir is to hold a fetched instruction and register stage is to remember which stage (fetch/execute) we are in http://csg.csail.mit.edu/6.S078

  7. ir: The instruction register • You may recall from our earlier discussion of pipelining (e.g., IFFT) that there is a possibility that the intermediate or pipelined registers do not contain any meaningful data • We can associate a (Valid/Invalid) bit to ir • Equivalently we can think of a pipeline register as a one-element FIFO http://csg.csail.mit.edu/6.S078

  8. Two-stage SMIPS (Harvard) Register File PC Execute Decode ir +4 Let us assume we keep fetching instructions from pc, pc+4, pc+8, … and correct it when control hazard is detected Data Memory Inst Memory http://csg.csail.mit.edu/6.S078

  9. Two-stage pipeline SMIPS (Harvard) – first attempt modulemkProc(Proc); Reg#(Addr) pc <- mkRegU; RFilerf <- mkRFile; Memory mem <- mkTwoPortedMemory; letiMem = mem.iport; letdMem = mem.dport; PipeReg#(TypeFetch2Execute) ir<- mkPipeReg; ruledoProc; letinst <- iMem(MemReq{op: Ld, addr: pc, data: ?}); ir.enq(TypeFetch2Execute{pc: pc, inst: inst}); letnext_pc = pc + 4; http://csg.csail.mit.edu/6.S078

  10. Two-stage pipeline SMIPS (Harvard) – first attempt if(ir.notEmpty) begin letirpc= ir.first.pc; letinst = ir.first.inst; letdInst = decode(inst); Data rVal1 = rf.rd1(dInst.rSrc1); Data rVal2 = rf.rd2(dInst.rSrc2); leteInst = exec(dInst, rVal1, rVal2, irpc); if(memType(eInst.iType)) eInst.data <- dMem(MemReq{ op: eInst.iType==Ld ? Ld : St, addr: eInst.addr, data: eInst.data}); if(regWriteType(eInst.iType)) rf.wr(eInst.rDst, eInst.data); if (eInst.brTaken) next_pc = eInst.addr; ir.deq; end pc <= next_pc; oops! http://csg.csail.mit.edu/6.S078

  11. Control Hazard Register File Epoch PC Execute Decode ir +4 Not only we must set the pc but throw away the fetched instruction in ir Data Memory Inst Memory A new Epoch register can keep track of whether the instruction in ir is from the current epoch http://csg.csail.mit.edu/6.S078

  12. Two-Stage SMIPS (unpipelined) - 1 modulemkProc(Proc); RFilerf <- mkRFile; Memory mem <- mkMemory; FIFO#(FBundle) fr <- mkFIFO; FIFOF#(Addr) nextPc <- mkFIFOF; let exec <- mkSMIPSExecutionCombinational; ruledoFetch; Addr pc = nextPc.first; nextPc.deq; let instResp <- mem.side(MemReq{op:Ld, addr:pc, data:?}); fr.enq(FBundle{predpc:pc, epoch:epoch, instResp:instResp}); endrule http://csg.csail.mit.edu/6.S078

  13. Two-Stage SMIPS (Unpipelined) - 2 ruledoDecodeExecuteWb; letpcPlus4 = fr.first.predpc + 4; letdecInst = decode(fr.first.instResp, pcPlus4); Data src1 = rf.rd1(decInst.op1); Data src2 = rf.rd2(decInst.op2); Instrinst = unpack(fr.first.instResp); letexecInst = exec.exec(decInst, src1, src2); if(execInst.itype==Ld || execInst.itype==St) begin execInst.data<- mem.side(MemReq{op:execInst.itype==Ld? Ld: St, addr:execInst.addr, data:execInst.data}); end nextPc.enq(exec.con? execInst.addr : pcPlus4); if((execInst.itype == Alu || execInst.itype == Ld) && execInst.rdst != 0) rf.wr(execInst.rdst, execInst.data); fr.deq; endrule http://csg.csail.mit.edu/6.S078

  14. Two-Stage SMIPS (Pipelined) - 1 modulemkProc(Proc); Reg#(Bool) fetchEpoch <- mkReg(False); Reg#(Bool) execEpoch <- mkReg(True); Reg#(Addr) fetchPc <- mkRegU; RFilerf <- mkRFile; Memory mem <- mkMemory; FIFO#(FBundle) fr <- mkFIFO; FIFOF#(Addr) nextPc <- mkFIFOF; let exec <- mkSMIPSExecutionCombinational; http://csg.csail.mit.edu/6.S078

  15. Two-Stage SMIPS (Pipelined) - 2 ruledoFetch; Addr pc = ?; Bool epoch = fetchEpoch; if(nextPc.notEmpty) begin pc = nextPc.first; epoch = !fetchEpoch; nextPc.deq; end else pc = fetchPc + 4; fetchPc <= pc; fetchEpoch <= epoch; letinstResp <- mem.side(MemReq{op:Ld, addr:pc, data:?}); fr.enq(FBundle{predpc:pc, epoch:epoch, instResp:instResp}); endrule http://csg.csail.mit.edu/6.S078

  16. Two-Stage SMIPS (Pipelined) - 3 ruledoDecodeExecuteWb; if(fr.first.epoch==execEpoch) begin letpcPlus4 = fr.first.predpc + 4; letdecInst = decode(fr.first.instResp, pcPlus4); Data src1 = rf.rd1(decInst.op1); Data src2 = rf.rd2(decInst.op2); Instrinst = unpack(fr.first.instResp); letexecInst = exec.exec(decInst, src1, src2); if(execInst.itype==Ld || execInst.itype==St) begin execInst.data <- mem.side(MemReq{op:execInst.itype==Ld ? Ld: St, addr:execInst.addr, data:execInst.data}); end if(execInst.cond) begin nextPc.enq(execInst.addr); execEpoch <= !execEpoch; end http://csg.csail.mit.edu/6.S078

  17. Two-Stage SMIPS (Pipelined) - 4 if((execInst.itype == Alu || execInst.itype == Ld) && execInst.rdst != 0) rf.wr(wbr.first.rdst, wbr.first.data); fr.deq; end else fr.deq; endrule http://csg.csail.mit.edu/6.S078

  18. 3-Stage SMIPS (Pipelined) - 1 ruledoDecodeExecuteWb; letwbStall = wbStallReg; if(wbRind.notEmpty) begin wbRind.deq; wbStall = False; end if(fr.first.epoch==execEpoch) begin letpcPlus4 = fr.first.predpc + 4; letdecInst = decode(fr.first.instResp, pcPlus4, cp0_statsEn, cp0_fromhost, cp0_tohost); Boolstall = wbStall; if(!stall) begin Data src1 = rf.rd1(decInst.op1); Data src2 = rf.rd2(decInst.op2); //execute Instrinst = unpack(fr.first.instResp); $display("E: Procinst: %d", inst); traceFull("Proc", "exec", inst); let execInst = exec.exec(decInst, src1, src2); if(execInst.itype==Ld || execInst.itype==St) begin execInst.data <- mem.side(MemReq{op:execInst.itype==Ld ? Ld : St, addr:execInst.addr, data:execInst.data}); end if (execInst.cond) begin nextPc.enq(execInst.addr); execEpoch <= !execEpoch; end if((execInst.itype == Alu || execInst.itype == Ld) && execInst.rdst != 0) begin wbr.enq(WBBundle{itype:execInst.itype, rdst:execInst.rdst, data:execInst.data}); wbStall = True; end fr.deq; if(decInst.wr_stats) cp0_statsEn <= unpack(truncate(src2)); if(decInst.wr_host) cp0_tohost <= src2; if(cp0_statsEn) num_inst <= num_inst + 1; end end else fr.deq; http://csg.csail.mit.edu/6.S078

  19. 3-Stage SMIPS (Pipelined) - 2 if(!stall) begin Data src1 = rf.rd1(decInst.op1); Data src2 = rf.rd2(decInst.op2); Instrinst = unpack(fr.first.instResp); letexecInst = exec.exec(decInst, src1, src2); if(execInst.itype==Ld || execInst.itype==St) begin execInst.data <- mem.side(MemReq{op:execInst.itype==Ld? Ld: St, addr:execInst.addr, data:execInst.data}); end if(execInst.cond) begin nextPc.enq(execInst.addr); execEpoch <= !execEpoch; end http://csg.csail.mit.edu/6.S078

  20. 3-Stage SMIPS (Pipelined) - 3 if((execInst.itype == Alu || execInst.itype == Ld) && execInst.rdst != 0) begin wbr.enq(WBBundle{itype:execInst.itype, rdst:execInst.rdst, data:execInst.data}); wbStall = True; end fr.deq; end end else fr.deq; wbStallReg <= wbStall; endrule http://csg.csail.mit.edu/6.S078

  21. Two-stage Pipelined SMIPS Princeton Architecture Register File Epoch PC Execute Decode ir +4 Just like the Harvard design except for an additional structural hazard when a memory-type instruction is in the execute phase Memory http://csg.csail.mit.edu/6.S078

  22. Pipelined SMIPS (Princeton) modulemkProc(Proc); Reg#(Addr) pc <- mkRegU; Reg#(Bool) epoch <- mkReg(True); RFilerf <- mkRFile; Memory mem <- mkOnePortedMemory; letuMem = mem.port; PipeReg#(FBundle) ir<- mkPipeReg; Wire#(Bool) dAcc<- mkDWire(False); ruledoProc; let next_pc = pc; if(!dAcc) begin let inst <- uMem(MemReq{op:Ld, addr:pc, data:?}); ir.enq(TypeFetch2Execute{pc:pc, epoch:epoch,inst: inst}); next_pc = pc + 4; end http://csg.csail.mit.edu/6.S078

  23. Pipelined SMIPS (Princeton) if(ir.notEmpty) begin letirpc= ir.first.pc; let inst = ir.first.inst; if(ir.first.epoch==epoch) begin letdInst = decode(inst); Data rVal1 = rf.rd1(dInst.rSrc1); Data rVal2 = rf.rd2(dInst.rSrc2); leteInst = exec(dInst, rVal1, rVal2, irpc); if(memType(eInst.iType)) begin eInst.data <- uMem(MemReq{ op: eInst.iType==Ld ? Ld : St, addr: eInst.addr, data: eInst.data}); dAcc= True; end http://csg.csail.mit.edu/6.S078

  24. Pipelined SMIPS (Princeton) if(regWriteType(eInst.iType)) rf.wr(eInst.rDst, eInst.data); if(eInst.brTaken) begin next_pc = eInst.addr; epoch = ! epoch; end end ir.deq; end pc <= next_pc; endrule endmodule next time -- Data Hazards http://csg.csail.mit.edu/6.S078

More Related