
Instruction Set Issues


trella

Presentation Transcript


  1. Instruction Set Issues • MIPS easy • Instructions are only committed at the MEM/WB transition • Other architectures are more difficult • Instructions may update state early • FP more difficult • Memory updating ops (e.g. string moves)

  2. Instruction Set Issues (cont.) • Difficult architectural features • “Odd” bits of state (e.g. condition codes) • May need saving/restoring on exceptions • Implicitly set condition codes • Complicate branch resolution • Explicit setting helps here (still a RAW hazard) • Multicycle operations • Widely differing execution times, lots of potential data hazards, etc.

  3. Instruction Set Issues • VAX suffers from many of these problems • Solution: pipeline the microcode • Intel 32-bit 80x86 processors since 1995 use a similar approach

  4. A.5. Handling Multicycle Operations • MIPS: FP operations • Long latency (EX repeated) • Several functional units • Structural hazards • Data hazards

  5. DLX: FP Design • Four functional units: • Integer ALU • as before • FP multiplier • also used for integer multiplication • FP adder • addition, subtraction and conversion • FP divider • also used for integer division

  6. MIPS Design with FP Units

  7. MIPS Multicycle Operations

  8. Hazards • Divides • Structural hazard • Multiple register writes possible in a cycle • Out-of-order completion • WAW hazards • Exception-handling complications • RAW hazards increase

  9. Potential RAW Hazards • Example (SPARC syntax):
       ldd [%fp-8], %f4
       fmuld %f4, %f6, %f0
       faddd %f0, %f8, %f2
       std %f2, [%fp-16]
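The dependence chain in this example can be traced with a small in-order issue model. This is only a sketch: the latency values below are assumptions chosen for illustration (cycles between a producer and a dependent consumer), not figures from the slides, and the model ignores structural hazards and the fact that a store needs its data later than an ALU operation does.

```python
# Assumed latencies: number of cycles that must separate a producer
# from a dependent consumer. These values are illustrative only.
LATENCY = {"ldd": 1, "fmuld": 6, "faddd": 3}

def schedule(program):
    """Return the issue cycle of each instruction in a single-issue,
    in-order pipeline that stalls on RAW hazards."""
    ready = {}        # register -> first cycle a consumer may issue
    issue_cycles = []
    next_issue = 0    # earliest cycle the next instruction could issue
    for op, dest, sources in program:
        cycle = max([next_issue] + [ready.get(r, 0) for r in sources])
        issue_cycles.append(cycle)
        if dest is not None:
            ready[dest] = cycle + LATENCY[op] + 1
        next_issue = cycle + 1
    return issue_cycles

# The four-instruction sequence from the slide: (op, destination, sources).
program = [
    ("ldd",   "%f4", ["%fp"]),
    ("fmuld", "%f0", ["%f4", "%f6"]),
    ("faddd", "%f2", ["%f0", "%f8"]),
    ("std",   None,  ["%f2", "%fp"]),
]

cycles = schedule(program)
# Stall cycles inserted in front of each instruction.
stalls = [cycles[0]] + [b - a - 1 for a, b in zip(cycles, cycles[1:])]
print(cycles)  # [0, 2, 9, 13]
print(stalls)  # [0, 1, 6, 3]
```

Under these assumed latencies every instruction after the load waits on its predecessor, which is why RAW stalls grow with longer FP latencies.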

  10. Multiple Writes • Up to four instructions may need to write in the same cycle • Solution • Track writes in ID • Stall at instruction issue (simpler: all stalls at one point) • Alternatively: • Stall at MEM or WB • Stall instruction with shorter latency (may free RAW hazards)
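Tracking writes in ID is often described as a shift register of write-port reservations. The following is a minimal sketch of that idea, assuming a single register-file write port; the class name `WritePortTracker`, its methods, and the cycle counts in the usage lines are hypothetical and chosen only for illustration.

```python
class WritePortTracker:
    """Shift register of write-port reservations (assumes one write port)."""

    def __init__(self, depth=32):
        # reserved[k] is True if the write port is taken k cycles from now.
        self.reserved = [False] * depth

    def tick(self):
        """Advance one clock cycle: every reservation moves one slot closer."""
        self.reserved = self.reserved[1:] + [False]

    def try_issue(self, cycles_to_wb):
        """Reserve the port for an instruction that writes back in
        cycles_to_wb cycles; return False if it must stall in ID."""
        if self.reserved[cycles_to_wb]:
            return False
        self.reserved[cycles_to_wb] = True
        return True

# Hypothetical timings: a multiply that writes back 9 cycles after issue,
# followed 3 cycles later by an add that would write back 6 cycles after issue.
tracker = WritePortTracker()
mul_issued = tracker.try_issue(9)        # True: port is free
for _ in range(3):
    tracker.tick()
add_first_try = tracker.try_issue(6)     # False: same write-back cycle
tracker.tick()                           # stall one cycle in ID
add_second_try = tracker.try_issue(6)    # True: write-back cycles now differ
```

Because the check happens before issue, the stall logic stays in one place, which is the simplification the slide refers to.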

  11. WAW Hazards • Example:
       faddd %f4, %f6, %f2
       …                       ! Integer op
       ldd [%fp-8], %f2

  12. WAW Hazards (cont.) • Rare • Compiler scheduling may result in unlikely instruction sequences, so must be caught • Solutions: • Stall issue of ldd • Prevent write by faddd
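The first solution, stalling the issue of the later instruction, amounts to a small calculation. The sketch below assumes the hardware knows how many cycles remain before each pending register write; the function name `waw_stall_cycles` and the cycle counts in the usage lines are hypothetical.

```python
def waw_stall_cycles(pending_writes, dest, cycles_to_write):
    """Return how many cycles to delay issue so that the new write to dest
    lands after any older pending write to the same register.

    pending_writes maps a register name to the number of cycles remaining
    before an in-flight instruction writes it.
    """
    pending = pending_writes.get(dest)
    if pending is None or cycles_to_write > pending:
        return 0  # no older write pending, or it finishes first anyway
    # After a delay of d cycles the older write is pending - d cycles away,
    # so the smallest d with cycles_to_write > pending - d is:
    return pending - cycles_to_write + 1

# Hypothetical timings: faddd will write %f2 in 3 cycles, while the ldd
# would write %f2 only 2 cycles after it issues.
stall_same_reg = waw_stall_cycles({"%f2": 3}, "%f2", 2)    # 2
stall_other_reg = waw_stall_cycles({"%f2": 3}, "%f4", 2)   # 0
stall_slow_write = waw_stall_cycles({"%f2": 3}, "%f2", 5)  # 0
```

The second solution on the slide, preventing the write by the faddd, would instead cancel the older pending write so that no stall is needed.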

  13. Maintaining Precise Exceptions • Out-of-order completion:
       fdivd %f2, %f4, %f0
       faddd %f10, %f8, %f10    ! completes long before fdivd
       fsubd %f12, %f14, %f12   ! completes long before fdivd
      • The sub may cause an exception after the add has completed, but before the div has • Exceptions are then no longer precise

  14. Maintaining Precise Exceptions • It may be very difficult to handle exceptions precisely • E.g. the add has destroyed one of its operands! • Four solutions: • Accept imprecise exceptions • But precise exceptions are needed for VM & IEEE FP • Allow switching between precise and imprecise modes

  15. Maintaining Precise Exceptions • Solutions (cont.) • Buffer results until earlier instructions complete • Buffers may grow very large, and extensive forwarding required • History files: restore original register values • Future files: store new register values • Software executes intervening instructions to get “up to date” before returning from exception
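A history file can be sketched as a log of overwritten register values that is replayed backwards when an exception is raised. This is a simplified illustration, not a description of any particular processor: the class `HistoryFile`, its methods, and the register values are hypothetical, and the sketch assumes WAW hazards are already prevented so that writes to one register occur in program order.

```python
class HistoryFile:
    """Log the old value of every register write so that writes made by
    instructions younger than a faulting instruction can be undone."""

    def __init__(self, registers):
        self.regs = registers
        self.log = []  # (sequence number, register, old value), in completion order

    def write(self, seq, reg, value):
        """Record the value being overwritten, then perform the write."""
        self.log.append((seq, reg, self.regs[reg]))
        self.regs[reg] = value

    def rollback(self, faulting_seq):
        """Undo, newest first, all writes by instructions after faulting_seq."""
        for seq, reg, old_value in reversed(self.log):
            if seq > faulting_seq:
                self.regs[reg] = old_value
        self.log = [entry for entry in self.log if entry[0] <= faulting_seq]

# Hypothetical values for the fdivd / faddd example: the add (seq 1)
# completes first and overwrites its own operand %f10, then the divide
# (seq 0) raises an exception.
regs = {"%f8": 2.0, "%f10": 1.0}
history = HistoryFile(regs)
history.write(1, "%f10", regs["%f10"] + regs["%f8"])  # faddd %f10, %f8, %f10
value_after_add = regs["%f10"]                         # 3.0
history.rollback(0)                                    # fdivd faults
value_after_rollback = regs["%f10"]                    # 1.0, operand restored
```

A future file takes the opposite approach: new values are held aside and copied into the architectural registers only once all earlier instructions have completed.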

  16. Maintaining Precise Exceptions • Solutions (cont.) • Hybrid scheme • Instructions are only issued when it is certain that preceding instructions will not cause an exception • May require stalling the pipeline

  17. Performance of the MIPS FP Pipeline • Structural Hazards (divide unit) • Very low: 0-2 cycles per FP operation • RAW hazards • Divide: 12-24 cycles, average 14.2 • Add: 0.7-2.3 cycles, average 1.7 • In general, about 0.5 × latency

  18. Overall MIPS FP Performance • Stalls per instruction • 0.65-1.21 cycles • Average: 0.87 • 82% from FP RAW hazards
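The share attributed to FP RAW hazards can be turned into cycles with a line of arithmetic. The sketch below uses the averages quoted on the slide; the ideal CPI of 1.0 is an assumption added here for illustration, not a figure from the slides.

```python
# Averages quoted on the slide.
avg_stalls_per_instruction = 0.87
fp_raw_share = 0.82

# Stall cycles per instruction caused by FP RAW hazards alone.
fp_raw_stalls = avg_stalls_per_instruction * fp_raw_share
print(round(fp_raw_stalls, 2))  # 0.71

# Assumption: an ideal CPI of 1.0 with no other sources of delay.
IDEAL_CPI = 1.0
effective_cpi = IDEAL_CPI + avg_stalls_per_instruction
print(round(effective_cpi, 2))  # 1.87
```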

  19. A.6. Putting It All Together: MIPS R4000 Pipeline • 64-bit instruction set • Eight stage pipeline • superpipelining • IF + IS: instruction fetch • RF: decode/register fetch • EX: execution • DF + DS + TC: data cache access • WB: write back

  20. MIPS R4000 Pipeline • Performance • Load delay: two cycles • Branch delay: three cycles • Delayed branch (one cycle) • Predict-not-taken strategy, with annulling • Increased forwarding requirements • Three stages between EX and WB now
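One way to read the three-cycle branch delay is as one architectural delay slot plus two cycles covered by the predict-not-taken strategy. The sketch below encodes that reading; the function name `branch_stall_cycles` is hypothetical, and the model ignores annulling, treating an unfilled delay slot as one wasted cycle.

```python
def branch_stall_cycles(taken, delay_slot_filled):
    """Cycles lost on one branch under the reading described above.

    A taken branch pays the two predicted-not-taken cycles; an untaken
    branch pays nothing beyond its delay slot. A delay slot that holds no
    useful instruction costs one further cycle.
    """
    idle_cycles = 2 if taken else 0
    wasted_slot = 0 if delay_slot_filled else 1
    return idle_cycles + wasted_slot

untaken_filled = branch_stall_cycles(taken=False, delay_slot_filled=True)   # 0
taken_filled = branch_stall_cycles(taken=True, delay_slot_filled=True)      # 2
taken_unfilled = branch_stall_cycles(taken=True, delay_slot_filled=False)   # 3
```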

  21. MIPS R4000 Pipeline • Floating Point • Three functional units • Divider, multiplier, adder • Shared components (8 sub-units) • Latency: 2–112 cycles • Initiation interval: 1–111 cycles • Complicated stall handling

  22. MIPS R4000 Pipeline • Performance: • CPI between 1.2 and 2.8 for SPEC92 benchmarks • Average: 2.0 • Integer: 1.54 • FP: 2.48 • Integer apps: mainly branch delays • FP apps: mainly FP data hazard stalls (RAW)
