This module delves into the principles of speculation in computer architecture, specifically focusing on the unconditional early execution of predicted instructions. We explore the challenges posed by potential exceptions and side effects, as well as conservative approaches to speculation. Practical examples demonstrate speculative load instructions and methods for cooperative hardware and software management of exceptions. We also examine poison bits, register renaming, and sentinel methods that enhance the efficiency of speculation while maintaining program correctness.
Computer Architecture Principles
Dr. Mike Frank
CDA 5155, Summer 2003
Module #24: Speculation
What’s Speculation?
• Unconditional early execution of an instruction that is expected to be needed (based on predicted branch outcome), but that may not be.
• What makes this difficult?
  • Instruction may raise fatal (non-resumable) exceptions that shouldn’t have been raised.
  • Instruction may have side effects that affect the data-flow of later instructions that shouldn’t be affected.
• The conservative approach (never speculate under these conditions) is overly constraining.
A Simple Speculation Example
• C source: if (A==0) A=B; else A=A+4;
• A in 0(R3), B in 0(R2), R14 available

Original assembly:
    LD    R1,0(R3)
    BNEZ  R1,L1
    LD    R1,0(R2)
    J     L2
L1: DADDI R1,R1,#4
L2: SD    R1,0(R3)

With speculative load:
    LD    R1,0(R3)
    LD    R14,0(R2)
    BEQZ  R1,L3
    DADDI R14,R1,#4
L3: SD    R14,0(R3)

• Note that this simple transformation does not preserve exception behavior!
• Note that the then clause is now effectively unconditional. (Equivalent C code: T=B; if (A!=0) T=A+4; A=T;)
• Note the use of the extra register R14.
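To make the exception problem concrete, here is a minimal Python sketch of the two versions. It is an illustration only: memory is modelled as a dictionary, an unmapped address stands in for a fatal fault, and the names `MemoryFault`, `load`, `original`, `speculative`, and `run` are hypothetical helpers that do not come from the slides.

```python
class MemoryFault(Exception):
    """Stands in for a fatal (non-resumable) memory exception."""


def load(memory, address):
    # Loading from an unmapped address faults, as LD would on a bad pointer.
    if address not in memory:
        raise MemoryFault(f"invalid address {address:#x}")
    return memory[address]


def original(memory, addr_a, addr_b):
    # if (A == 0) A = B; else A = A + 4;
    a = load(memory, addr_a)
    if a == 0:
        a = load(memory, addr_b)  # B is read only on the "then" path
    else:
        a = a + 4
    memory[addr_a] = a


def speculative(memory, addr_a, addr_b):
    # T = B; if (A != 0) T = A + 4; A = T;
    a = load(memory, addr_a)
    t = load(memory, addr_b)      # hoisted load: runs on both paths
    if a != 0:
        t = a + 4
    memory[addr_a] = t


def run(version, memory, addr_a, addr_b):
    """Return the final value of A, or the string "fault" if a load faulted."""
    memory = dict(memory)  # work on a copy so the caller's input is untouched
    try:
        version(memory, addr_a, addr_b)
    except MemoryFault:
        return "fault"
    return memory[addr_a]


ADDR_A, ADDR_B, UNMAPPED = 0x100, 0x200, 0xDEAD

# B readable: both versions agree.
print(run(original, {ADDR_A: 0, ADDR_B: 7}, ADDR_A, ADDR_B))     # 7
print(run(speculative, {ADDR_A: 0, ADDR_B: 7}, ADDR_A, ADDR_B))  # 7
# B unreadable and A != 0: only the speculative version faults.
print(run(original, {ADDR_A: 5}, ADDR_A, UNMAPPED))              # 9
print(run(speculative, {ADDR_A: 5}, ADDR_A, UNMAPPED))           # fault
```

The last two calls show the "false positive": the original program never touches B when A is nonzero, yet the hoisted load faults anyway.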
Ambitious Speculation Methods
Here are some alternatives:
• Hardware and OS cooperatively ignore (or delay) exceptions for speculative instructions.
• Poison bits mark register values written by speculative instructions that generated exceptions.
• Results of speculative instructions are buffered (not committed) until the speculative branch prediction is confirmed. (Sentinel method.)
HW/SW-cooperation Method
• A way of coping with non-resumable exceptions in speculative instructions.
• Basic strategy: Simply ignore fatal errors in any speculative instructions.
  • Correct programs will never generate such errors anyway, so, no problem (no “false positives”).
  • But, incorrect programs may silently go haywire!
    • Treated as an unavoidable cost of optimization
    • a kind of imprecise exception handling
• In this case, if a program misbehaves in testing, one could always recompile it with strict exception handling (& no speculation) to track down the error.
Example: Speculative Load Inst.
• Previous example, with special “Speculative Load” (sLD) instruction:

sLD only:
    LD    R1,0(R3)
    sLD   R14,0(R2)
    BEQZ  R1,L3
    DADDI R14,R1,#4
L3: SD    R14,0(R3)
This version does not preserve exception behavior, but at least avoids false positives.

sLD with SPECCK:
    LD     R1,0(R3)
    sLD    R14,0(R2)
    BNEZ   R1,L1
    SPECCK 0(R2)
    J      L2
L1: DADDI  R14,R1,#4
L2: SD     R14,0(R3)
Using a separate “speculation check” (SPECCK) instruction to restore correct exception behavior.
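The difference between the two sLD versions can be sketched in Python. The semantics below are assumptions made for illustration: `spec_load` models sLD as returning an arbitrary value (0) when the access would fault, and `spec_check` models SPECCK as re-checking the same address and raising the suppressed fault. The slides do not spell out how the hardware implements either instruction, and all function names here are hypothetical.

```python
class MemoryFault(Exception):
    """Stands in for a fatal (non-resumable) memory exception."""


def spec_load(memory, address):
    # sLD (assumed semantics): a faulting access is ignored and the
    # destination receives an arbitrary value, modelled here as 0.
    return memory.get(address, 0)


def spec_check(memory, address):
    # SPECCK (assumed semantics): raise the fault that the matching
    # sLD suppressed, if that access was in fact invalid.
    if address not in memory:
        raise MemoryFault(f"invalid address {address:#x}")


def sld_only(memory, addr_a, addr_b):
    a = memory[addr_a]               # LD    R1,0(R3)
    t = spec_load(memory, addr_b)    # sLD   R14,0(R2)
    if a != 0:                       # BEQZ  R1,L3
        t = a + 4                    # DADDI R14,R1,#4
    memory[addr_a] = t               # L3: SD R14,0(R3)


def sld_with_specck(memory, addr_a, addr_b):
    a = memory[addr_a]               # LD    R1,0(R3)
    t = spec_load(memory, addr_b)    # sLD   R14,0(R2)
    if a == 0:                       # BNEZ R1,L1 falls through when A == 0
        spec_check(memory, addr_b)   # SPECCK 0(R2)
    else:
        t = a + 4                    # L1: DADDI R14,R1,#4
    memory[addr_a] = t               # L2: SD R14,0(R3)


def run(version, memory, addr_a, addr_b):
    """Return the final value of A, or the string "fault" if a fault was raised."""
    memory = dict(memory)
    try:
        version(memory, addr_a, addr_b)
    except MemoryFault:
        return "fault"
    return memory[addr_a]


ADDR_A, UNMAPPED = 0x100, 0xDEAD

# A != 0: B is not needed, and neither version reports a false positive.
print(run(sld_only, {ADDR_A: 5}, ADDR_A, UNMAPPED))         # 9
print(run(sld_with_specck, {ADDR_A: 5}, ADDR_A, UNMAPPED))  # 9
# A == 0: B is needed. Without SPECCK the bad value is stored silently.
print(run(sld_only, {ADDR_A: 0}, ADDR_A, UNMAPPED))         # 0
print(run(sld_with_specck, {ADDR_A: 0}, ADDR_A, UNMAPPED))  # fault
```

The third call is the "silently go haywire" case: an incorrect program stores a meaningless value instead of trapping.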
Poison Bits
• Speculative instructions are marked as such.
  • Like the “sLD” instruction we saw earlier.
• Each ISA register has an associated “poison bit.”
• When a speculative inst. generates a fatal exception, then, instead of invoking exception handling, the destination register is marked as “poison.”
• Poison is propagated through data dependencies of subsequent speculative instructions.
• If a non-speculative instruction ever uses a poisoned register, then that instruction generates a fatal exception which halts the program.
• All fatal exceptions whose results are actually used do eventually occur, but maybe a bit later than normal. (Still pretty easy to debug, though.)
Poison Bit Example
• C src: if (A==0) A=B+8; else A=A+4;

    LD     R1,0(R3)     ; load A non-speculatively
    sLD    R12,0(R2)    ; exception on 0(R2) may poison R12
    sDADDI R14,R12,#8   ; R14 inherits poison
    BEQZ   R1,L3        ; skip next line if A=0
    DADDI  R14,R1,#4    ; clears R14 poison bit
L3: SD     R14,0(R3)    ; exception happens here

• Note if accessing B causes an exception, it still happens (but late) only if “then” clause runs.
[Figure: poison bits for R12 and R14]
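The poison-bit rules can be traced with a small Python model of the example above. This is a sketch under stated assumptions: the class `PoisonMachine` and its method names are hypothetical, a poisoned register holds a placeholder value of 0, and an unmapped dictionary key stands in for a faulting address.

```python
class FatalException(Exception):
    """Stands in for a fatal (non-resumable) exception."""


class PoisonMachine:
    def __init__(self, memory):
        self.memory = dict(memory)
        self.regs = {}
        self.poison = set()  # names of registers whose poison bit is set

    def read(self, reg):
        # Non-speculative use of a register: a poisoned input is fatal.
        if reg in self.poison:
            raise FatalException(f"{reg} is poisoned")
        return self.regs[reg]

    def write(self, reg, value, poisoned=False):
        self.regs[reg] = value
        if poisoned:
            self.poison.add(reg)
        else:
            self.poison.discard(reg)

    def ld(self, dst, address):
        # Ordinary load: faults immediately.
        if address not in self.memory:
            raise FatalException(f"invalid address {address:#x}")
        self.write(dst, self.memory[address])

    def sld(self, dst, address):
        # Speculative load: a fault poisons dst instead of trapping.
        if address in self.memory:
            self.write(dst, self.memory[address])
        else:
            self.write(dst, 0, poisoned=True)

    def sdaddi(self, dst, src, imm):
        # Speculative add: poison propagates from source to destination.
        if src in self.poison:
            self.write(dst, 0, poisoned=True)
        else:
            self.write(dst, self.regs[src] + imm)

    def daddi(self, dst, src, imm):
        # Ordinary add: reading src non-speculatively, result is clean.
        self.write(dst, self.read(src) + imm)

    def sd(self, src, address):
        # Ordinary store: faults if the value being stored is poisoned.
        self.memory[address] = self.read(src)


def poison_example(memory, addr_a, addr_b):
    m = PoisonMachine(memory)
    m.ld("R1", addr_a)
    m.sld("R12", addr_b)
    m.sdaddi("R14", "R12", 8)
    if m.read("R1") != 0:        # BEQZ R1,L3 skips the DADDI when A == 0
        m.daddi("R14", "R1", 4)  # overwrites R14, clearing its poison bit
    m.sd("R14", addr_a)          # faults here if R14 is still poisoned
    return m.memory[addr_a]


print(poison_example({0x100: 0, 0x200: 3}, 0x100, 0x200))  # 11
print(poison_example({0x100: 5}, 0x100, 0xDEAD))           # 9
```

Calling `poison_example({0x100: 0}, 0x100, 0xDEAD)` raises `FatalException` at the final store, which matches the slide's note that the exception is delayed until the poisoned value is actually used.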
Speculative Insts. w. renaming
• Problem: What to do about data-flow when a speculative inst. writes a register that’s later used non-speculatively?
• Ordinary solution: Compiler does register renaming, writes speculative results to different (separately allocated) registers. (See sLD example)
  • Problem: Have to move values between normal & speculative registers, and can run out of registers!
• Alternative solution: (“Boosting”) Let the HW do the renaming & buffering of speculative results
  • Like in Tomasulo’s algorithm.
Sentinel Method
• Special “sentinel” instruction marks original location of an instruction moved speculatively.
• Write-back (& exception handling) of the speculative instruction is delayed until the corresponding sentinel is reached.
• Note writeback never occurs if sentinel not reached!
[Figure: code sequence showing LD, BEQ, and the sentinel]
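A rough Python sketch of the sentinel idea follows. It assumes a load hoisted above a branch, with the sentinel left on the fall-through path where the load originally sat. The buffer structure, the tag string, and the names `SentinelBuffer`, `speculate`, `sentinel`, and `hoisted_load` are all hypothetical and chosen only for illustration.

```python
class FatalException(Exception):
    """Stands in for a fatal (non-resumable) exception."""


class SentinelBuffer:
    def __init__(self):
        self.pending = {}  # tag -> (dest register, value, captured error)

    def speculate(self, tag, dest, compute):
        # Execute early, but hold back both the result and any exception.
        try:
            self.pending[tag] = (dest, compute(), None)
        except FatalException as error:
            self.pending[tag] = (dest, None, error)

    def sentinel(self, tag, regs):
        # Reached the instruction's original location: now write back,
        # or raise the exception that was deferred.
        dest, value, error = self.pending.pop(tag)
        if error is not None:
            raise error
        regs[dest] = value


def load(memory, address):
    if address not in memory:
        raise FatalException(f"invalid address {address:#x}")
    return memory[address]


def hoisted_load(memory, address, branch_taken):
    regs = {"R1": -1}                      # -1 marks the old value of R1
    buffer = SentinelBuffer()
    buffer.speculate("ld", "R1", lambda: load(memory, address))
    if not branch_taken:                   # fall-through path holds the sentinel
        buffer.sentinel("ld", regs)
    return regs["R1"]


print(hoisted_load({0x200: 7}, 0x200, branch_taken=False))  # 7
print(hoisted_load({0x200: 7}, 0x200, branch_taken=True))   # -1
print(hoisted_load({}, 0x200, branch_taken=True))           # -1
```

In the last call the load would have faulted, but the sentinel is never reached, so neither the write-back nor the exception happens. With `branch_taken=False` and an unmapped address, the `FatalException` is raised at the sentinel.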
Hardware-Based Speculation
• Combines 3 ideas:
  • Dynamic branch prediction chooses which instructions will be pre-executed.
  • Speculation executes conditional instructions early (before branch conditions are resolved).
  • Dynamic scheduling handles scheduling of different dynamic sequences of basic blocks encountered.
• Dataflow execution: Execute instructions as soon as their operands are available.
  • Like with Tomasulo’s algorithm
Advantages of HW-based spec.
• Dynamic speculation can disambiguate memory references, so a store can be moved before a load (if the locations addressed are different).
• Speculation works better if more accurate dynamic branch predictions can be used.
• Precise exception handling even for speculated instructions.
• No extra bookkeeping code (speculation bits, register renaming code) in the program.
• Code independent of implementation
Implementing HW-based spec.
• Separate the execution of speculative instructions (including dataflow between them) from the committing of results permanently to registers/memory (if speculations are correct).
• New structure called the reorder buffer holds results of instructions that have executed speculatively but cannot yet be committed.
• The reorder buffer represents non-programmer-visible temporary storage, like the reservation stations in Tomasulo’s algorithm.
Fields of Reorder Buffer Entries
• Instruction type field:
  • “Branch” (no dest.)
  • “Store” (dest.=memory)
  • “Register” (dest.=register).
• Destination field:
  • Register number (for loads & ALU ops)
  • Memory address (for stores)
• Value field:
  • Register or memory value to be stored permanently when instruction commits.
• Ready field: Instruction has completed
Steps of Execution in HWBS
• Issue (or dispatch):
  • Get next fetched instruction.
  • Issue if reservation station & reorder buffer not full.
  • Check ROB & registers for available operands
• Execute:
  • Monitor CDB for operands until ready, then execute
• Write result:
  • Write to CDB, reorder buffer, & reservation stations
• Commit:
  • When instruction is first in reorder buffer (& wasn’t mispredicted), commit value to register/memory.
  • Committing mispredicted branch flushes reorder buffer.
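The issue, write-result, and commit steps can be sketched with a toy reorder buffer in Python. This is a simplified model, not the full pipeline: reservation stations, the CDB, and operand tracking are left out, and the names `Entry`, `ReorderBuffer`, `issue`, `write_result`, and `commit` are hypothetical. The entry fields follow the four fields listed on the previous slide, plus an assumed `mispredicted` flag for branches.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Entry:
    kind: str                  # "branch", "store", or "register"
    dest: object = None        # register name or memory address
    value: object = None       # result waiting to be committed
    ready: bool = False        # True once the instruction has completed
    mispredicted: bool = False # assumed extra flag, used by branches only


class ReorderBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()  # program order, head at the left

    def issue(self, kind, dest=None):
        # Issue stalls (returns None) when the buffer is full.
        if len(self.entries) >= self.capacity:
            return None
        entry = Entry(kind, dest)
        self.entries.append(entry)
        return entry

    def write_result(self, entry, value=None, mispredicted=False):
        # Results may arrive out of order; they stay in the buffer.
        entry.value = value
        entry.mispredicted = mispredicted
        entry.ready = True

    def commit(self, regs, memory):
        """Commit the head entry if it is ready; return True if one committed."""
        if not self.entries or not self.entries[0].ready:
            return False
        head = self.entries.popleft()
        if head.kind == "register":
            regs[head.dest] = head.value
        elif head.kind == "store":
            memory[head.dest] = head.value
        elif head.kind == "branch" and head.mispredicted:
            self.entries.clear()  # flush everything issued after the branch
        return True


regs, memory = {}, {}
rob = ReorderBuffer(capacity=4)
load = rob.issue("register", "F0")
branch = rob.issue("branch")
mul = rob.issue("register", "F4")    # speculative: follows the branch
store = rob.issue("store", 0x40)     # speculative: follows the branch

rob.write_result(mul, 99.0)          # completes early, out of order
rob.write_result(store, 99.0)
print(rob.commit(regs, memory))      # False: head (the load) is not ready
rob.write_result(load, 1.5)
rob.write_result(branch, mispredicted=True)
rob.commit(regs, memory)             # commits the load
rob.commit(regs, memory)             # commits the branch and flushes
print(regs, memory)                  # {'F0': 1.5} {}
```

Because the multiply and the store never reach the head of the buffer before the flush, their results are never written to the registers or to memory.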
HWBS execution example (3rd ed., p. 229)
(I = issue, E = execute, W = write result, C = commit)
L.D   F6,34(R2)     IEWC
L.D   F2,45(R3)     IEWC
MUL.D F0,F2,F4      I EEEEEEEEEEWC
SUB.D F8,F6,F2      IEW C
DIV.D F10,F0,F6     I EEEEE…EWC
ADD.D F6,F8,F2      IEW C
Also go through figure 3.30 on p. 230… (40 cycles)
HWBS loop example
Iter.  Instruction            Timing
1      L.D    F0,0(R1)        IEWC
1      MUL.D  F4,F0,F2        I EE…EWC
1      S.D    F4,0(R1)        IE WC
1      DADDIU R1,R1,#-8
1      BNE    R1,R2,Loop
2      L.D    F0,0(R1)
2      MUL.D  F4,F0,F2
2      S.D    F4,0(R1)
2      DADDIU R1,R1,#-8
2      BNE    R1,R2,Loop
Explicit Register Renaming
• An alternative to reorder buffers for HWBS:
  • Have more physical registers than architectural (programmer-visible) registers.
  • Dynamically map destination ISA register to unused physical register when instruction is issued.
  • Also track which mapping corresponds to last committed instruction, to support restarts.
• Approach used in: PPC 603/604, MIPS R10000/12000, Alpha 21264, Pentium II/III/4
[Figure: ISA register map (R1, R2, …, F31) with LastIssued and LastCommitted mappings into the physical registers]
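The mapping scheme can be sketched in Python with two maps, one for the last issued instruction and one for the last committed instruction. This is a simplified illustration, not any particular processor's design: the class `Renamer`, its method names, the free-list policy, and the rule that a physical register is freed when a later writer of the same ISA register commits are all assumptions.

```python
class Renamer:
    def __init__(self, isa_regs, num_physical):
        # Initially ISA register i lives in physical register i.
        self.issued = {reg: i for i, reg in enumerate(isa_regs)}
        self.committed = dict(self.issued)
        self.free = list(range(len(isa_regs), num_physical))
        self.in_flight = []  # (ISA dest, physical reg), in program order

    def rename(self, dest, sources):
        """Map sources through the issue-time map and allocate a new dest.

        Returns (new physical dest, physical sources), or None to stall
        when no physical register is free.
        """
        physical_sources = [self.issued[s] for s in sources]
        if not self.free:
            return None
        new = self.free.pop(0)
        self.issued[dest] = new
        self.in_flight.append((dest, new))
        return new, physical_sources

    def commit(self):
        # The oldest in-flight instruction commits: its mapping becomes
        # architectural and the previous physical register is recycled.
        dest, new = self.in_flight.pop(0)
        old = self.committed[dest]
        self.committed[dest] = new
        self.free.append(old)

    def restart(self):
        # Misprediction or exception: discard uncommitted mappings and
        # fall back to the last committed map.
        for _, physical in self.in_flight:
            self.free.append(physical)
        self.in_flight.clear()
        self.issued = dict(self.committed)


renamer = Renamer(["R1", "R2"], num_physical=4)
print(renamer.rename("R1", ["R1", "R2"]))  # (2, [0, 1])
print(renamer.rename("R1", ["R1"]))        # (3, [2])
print(renamer.rename("R2", []))            # None: no free physical register
renamer.commit()                           # first R1 write becomes architectural
renamer.restart()                          # second R1 write is squashed
print(renamer.issued)                      # {'R1': 2, 'R2': 1}
```

The second rename reads physical register 2, the value produced by the first instruction, which shows how renaming preserves true dependences while removing name conflicts on R1.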