Topic II Instruction-Set Architecture

tfarrow

Presentation Transcript


  1. Topic II: Instruction-Set Architecture
     • Introduction
     • A Case Study: The MIPS Instruction-Set Architecture
     \course\cpeg323-05F\Topic2-323.ppt

  2. Reading List
     • Slides: Topic2x
     • Henn & Patt: Chapter 2
     • Other papers as assigned in class or homework

  3. The Stored Memory Computer
     [Figure: block diagram with boxes labeled INPUT, OUTPUT, CONTROL (sequencer), DATAPATH (arithmetic), and MEMORY]
     Five parts of a computer:
     • Datapath (channels/changes bits)
     • Control (directs operations)
     • Memory (places to keep bits)
     • Input (get data from outside)
     • Output (send data to outside)

  4. Steps in Executing an Instruction
     Instruction Fetch: Fetch the next instruction from memory
     Instruction Decode: Examine the instruction to determine:
       • What operation is performed by the instruction (e.g., addition)
       • What operands are required, and where the result goes
     Operand Fetch: Fetch the operands
     Execution: Perform the operation on the operands
     Result Writeback: Write the result to the specified location
     Next Instruction: Determine where to get the next instruction

  5. What is Specified in an ISA?
     Instruction Decode: How are operations and operands specified?
     Operand Fetch: Where can operands be located? How many?
     Execution: What operations can be performed? What data types and sizes?
     Result Writeback: Where can results be written? How many?
     Next Instruction: How can we choose the next instruction?

  6. A Simple ISA: Memory-Memory
     • What operations can be performed? Basic arithmetic (for now)
     • What data types and sizes? 32-bit integers
     • Where can operands and results be located? Memory
     • How many operands and results? 2 operands, 1 result
     • How are operations and operands specified? OP DEST, SRC1, SRC2
     • How can we choose the next instruction? Next in sequence

  7. Memory Model
     Think of memory as a large array of N integers, referenced by index (Random Access Memory, or RAM). For instance, M[1] contains the value 3. We can read and write these locations. These are the only locations available to us. All “abstract” locations (such as variables in a C program) must be assigned locations in M.

     Address   Contents
     0         14
     1         3
     2         99
     ...       ...
     N - 1     0

  8. Simple Code Translation
     Given the C code
         A = B + C;
     we could decide that variable A uses location 100, B uses 48, and C uses 76. The code above then converts to the following “assembly” code:
         ADD M[100], M[48], M[76]
     How would we express
         A = (B + C) * (D + E);
     ?

  9. Using a Temporary Location
     Assume we put A in 100, B in 48, C in 76, D in 20, and E in 32. Now choose an unused memory location (e.g., 84) as a temporary.
         ADD M[100], M[48], M[76]     # A = B + C
         ADD M[84], M[20], M[32]      # temp = D + E
         MUL M[100], M[100], M[84]    # A = A * temp
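
The three-instruction sequence above can be checked with a small simulator. This is only a sketch: the tuple encoding `(op, dest, src1, src2)`, the `run` function, and the sample values stored for B, C, D, and E are assumptions made for illustration and do not come from the slides.

```python
import operator

# Hypothetical opcode table for the toy memory-memory ISA.
OPS = {"ADD": operator.add, "MUL": operator.mul}

def run(program, memory):
    """Execute (op, dest, src1, src2) tuples; every operand is a memory address."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory

# Addresses from the slide: A=100, B=48, C=76, D=20, E=32, temp=84.
# The stored values (2, 3, 4, 5) are made up for the example.
memory = {100: 0, 48: 2, 76: 3, 20: 4, 32: 5, 84: 0}
program = [
    ("ADD", 100, 48, 76),   # A = B + C
    ("ADD", 84, 20, 32),    # temp = D + E
    ("MUL", 100, 100, 84),  # A = A * temp
]
run(program, memory)
print(memory[100])  # (2 + 3) * (4 + 5) = 45
```

Every operand read and every result write in this model goes to memory, which is the cost the next slide complains about.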

  10. Problems with Memory-Memory ISAs
      • Main memory is much slower than arithmetic circuits
        • This was as true in 1950 as in 2003!
      • It takes a lot of room to specify memory addresses
      • Results are often used one or two instructions later
      Remember: make the common case fast!
      Solution: store temporary or intermediate results in fast memories near the arithmetic units.

  11. Accumulator Machines
      An “accumulator” machine keeps a single high-speed buffer (e.g., a set of D latches or flip-flops, one for each data bit) near the arithmetic logic. In the simplest kind, only one operand can be specified; the accumulator is implicit:
          “OP operand” means: acc. = acc. OP operand
      Example:
          LOAD M[48]      # Load B into acc.
          ADD M[76]       # Add C to acc. (now has B+C)
          STORE M[100]    # Write acc. to A

  12. Accumulator Machine Does A=(B+C)*(D+E)
          LOAD M[20]      # Load D into acc.
          ADD M[32]       # Add E to acc. (now has D+E)
          STORE M[100]    # Write acc. to A (A serves as the temporary)
          LOAD M[48]      # Load B into acc.
          ADD M[76]       # Add C to acc. (now has B+C)
          MUL M[100]      # Multiply acc. by A (which holds D+E)
          STORE M[100]    # Write (B+C) * (D+E) to A
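
A minimal sketch of the same program on a simulated accumulator machine follows. The `run_accumulator` function, the `(op, address)` encoding, and the sample data values are assumptions for illustration; only the opcodes and addresses are taken from the slide.

```python
def run_accumulator(program, memory):
    """Run (op, address) pairs; the accumulator is the implicit second operand."""
    acc = 0
    for op, addr in program:
        if op == "LOAD":
            acc = memory[addr]
        elif op == "ADD":
            acc = acc + memory[addr]
        elif op == "MUL":
            acc = acc * memory[addr]
        elif op == "STORE":
            memory[addr] = acc
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return acc

# A=100, B=48, C=76, D=20, E=32 as on the slide; the values are made up.
memory = {100: 0, 48: 2, 76: 3, 20: 4, 32: 5}
program = [
    ("LOAD", 20), ("ADD", 32), ("STORE", 100),  # A = D + E (used as a temporary)
    ("LOAD", 48), ("ADD", 76),                  # acc = B + C
    ("MUL", 100), ("STORE", 100),               # A = (B + C) * (D + E)
]
final_acc = run_accumulator(program, memory)
print(memory[100])  # 45
```

Each of the seven instructions still touches memory once, which illustrates why a single accumulator helps mainly with chains of dependent operations.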

  13. Shortcomings of Accumulator Machines
      • Still require storing lots of temporary and intermediate values in memory
      • The accumulator is only really beneficial for a chain (sequence) of calculations where the result of one is the input to the next.

  14. Still, Accumulator Machines Were Common in Early Computers
      • A simple design, and hence popular, especially for
        • Early computers
        • Early microprocessors (4004, 8008)
        • Low-end (cheap) models
      • Reason: accumulator logic was much more expensive than memory
        • Vacuum tubes vs. core memory
        • D flip-flops vs. DRAM
        • Precious space on the processor chip vs. off-chip DRAM

  15. Alternatives to Accumulator Machines
      If more hardware resources are available, put more fast storage locations alongside the accumulator:
      • Stack machines
      • Register machines
        • Special purpose
        • General purpose

  16. Stack Machines
      Idea: A pile of fast storage locations with a top and a bottom. An instruction can only get at the top value, or maybe the top two or three values. We can put new values on the top (“push”) or take them off the top (“pop”), but that’s it. We can’t get to locations underneath the top unless we remove everything above.

      Position        Contents
      top             14
      2nd from top    3
      3rd from top    99
      ...             ...
      bottom          0

  17. Stack Machine ISA
      Basic operations include:
      Load: get a value from memory and push it onto the stack
      Store: pop a value off the stack and put it into memory
      Arithmetic: pop 1 or 2 values off the stack; push the result onto the stack
      Dup: get the value at the top of the stack without removing it; push a new copy onto the stack (why is this useful?)

  18. Stack Machine Does A=(B+C)*(D+E)
      XXX stands for whatever the stack already holds at the start.

      Instruction     Stack afterwards (top first)
      (start)         XXX
      LOAD M[20]      (D) XXX
      LOAD M[32]      (E) (D) XXX
      ADD             (D+E) XXX
      LOAD M[48]      (B) (D+E) XXX
      (continued next slide)

  19. Stack Machine (cont.)

      Instruction     Stack afterwards (top first)
      LOAD M[76]      (C) (B) (D+E) XXX
      ADD             (B+C) (D+E) XXX
      MULT            ((B+C)*(D+E)) XXX
      STORE M[100]    XXX

      Note that the stack is now the same as when we began.
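
The stack trace on these two slides can be replayed with a short simulator. This is a sketch under assumptions: the `run_stack` function, the tuple encoding of instructions, the Python list used as the stack (top at the end), and the data values are all invented for illustration.

```python
def run_stack(program, memory, stack):
    """Run a toy stack program; the top of the stack is the end of the list."""
    for instr in program:
        op = instr[0]
        if op == "LOAD":       # push a value read from memory
            stack.append(memory[instr[1]])
        elif op == "STORE":    # pop the top value into memory
            memory[instr[1]] = stack.pop()
        elif op == "ADD":      # pop two values, push their sum
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MULT":     # pop two values, push their product
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "DUP":      # push a copy of the top value
            stack.append(stack[-1])
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack

# A=100, B=48, C=76, D=20, E=32 as on the slides; the values are made up.
memory = {100: 0, 48: 2, 76: 3, 20: 4, 32: 5}
stack = ["XXX"]  # placeholder for whatever was on the stack before
program = [
    ("LOAD", 20), ("LOAD", 32), ("ADD",),   # (D+E)
    ("LOAD", 48), ("LOAD", 76), ("ADD",),   # (B+C) on top of (D+E)
    ("MULT",), ("STORE", 100),              # A = (B+C)*(D+E)
]
run_stack(program, memory, stack)
print(memory[100], stack)  # 45 ['XXX']
```

No temporary memory location is needed here: the intermediate value (D+E) simply waits on the stack while (B+C) is computed.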

  20. Where Stack Machines Are Used
      • Some early computers
      • 8087 floating-point coprocessor for the 8086 (sort of…)
      • Java Virtual Machine (JVM)

  21. Register Machines
      Idea: Put more storage locations (“registers”) near the accumulator
      • Regs have names/numbers and can be used instead of memory
      • Accessed much faster than main memory
        • (1-2 CPU cycles vs. roughly 10s to 100 cycles)
      • Far fewer registers than memory locations
        • MIPS has 32 32-bit registers
        • Fewer regs, smaller addresses, fewer bits to name them
      • A scarce resource: use them carefully!

  22. Special- vs. General-Purpose Registers
      • A special-purpose register is used for specific purposes, and there may be limitations on which operations can use it
        • Easier on the HW design: put the reg right where it’s needed
        • More difficult for the compiler to use effectively
      • A general-purpose register can be used in any operation
        • Datapaths are more general, hence routing is more difficult

  23. Special-Purpose Registers: The Z-80 CPU
      • Seven 8-bit registers: A, B, C, D, E, H, L (BC, DE, HL can be pairs)
      • Three 16-bit registers: SP, IX, IY, plus PC (program counter)
      • Add, subtract, shift can only be done to A (8-bit accumulator)
      • Increment and decrement can be done to all regs and reg pairs
      • Can fetch from memory at address (HL) and put in any 8-bit reg
      • A fetch from address (BC) or (DE) can only go to A
      • Fetches from (BC), (HL) and (IX) take different numbers of cycles
      Anyone want to write a compiler for this?

  24. General Purpose Register (GPR) Machines
      The MIPS (and similar processors) has 32 General Purpose Registers (GPRs), each 32 bits long. All can be read or written, except register 0, which is always 0 and can’t be changed. Register access time is uniform.

      Register   Contents
      $0         0
      $1         3
      $2         99
      ...        ...
      $31        14

  25. GPR Machine Does A=(B+C)*(D+E)
          ADD $1, M[48], M[76]    # $1 = B + C
          ADD $2, M[20], M[32]    # $2 = D + E
          MUL M[100], $1, $2      # A = $1 * $2
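
The sketch below models the three-instruction GPR sequence, including the rule that register 0 always reads as zero. The `GPRMachine` class, the convention that strings such as `"$1"` name registers while integers are memory addresses, and the data values are assumptions for illustration. Like the slide, it lets arithmetic instructions name memory operands directly; it is not real MIPS.

```python
class GPRMachine:
    """Toy register machine: 32 registers, with $0 hard-wired to zero."""

    def __init__(self, memory):
        self.regs = [0] * 32
        self.memory = memory

    def read(self, operand):
        # Strings such as "$1" name a register; integers are memory addresses.
        if isinstance(operand, str):
            return self.regs[int(operand[1:])]
        return self.memory[operand]

    def write(self, operand, value):
        if isinstance(operand, str):
            number = int(operand[1:])
            if number != 0:  # writes to $0 are silently ignored
                self.regs[number] = value
        else:
            self.memory[operand] = value

    def run(self, program):
        ops = {"ADD": lambda a, b: a + b, "MUL": lambda a, b: a * b}
        for op, dest, src1, src2 in program:
            self.write(dest, ops[op](self.read(src1), self.read(src2)))

# A=100, B=48, C=76, D=20, E=32 as on the slide; the values are made up.
memory = {100: 0, 48: 2, 76: 3, 20: 4, 32: 5}
machine = GPRMachine(memory)
machine.run([
    ("ADD", "$1", 48, 76),     # $1 = B + C
    ("ADD", "$2", 20, 32),     # $2 = D + E
    ("MUL", 100, "$1", "$2"),  # A = $1 * $2
])
print(memory[100])  # 45
```

Both intermediate results stay in registers, so the only memory traffic is reading the four inputs and writing A.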

  26. Some Trends
      • From hardware technology: the number of registers that can be put on a chip has the potential to grow very fast (Moore’s Law?)
      • A very large register set will have a slow access time.
      • Instruction sets evolve slowly, so they are slow to accommodate changes in the number of registers

  27. Memory and Data Sizes
      So far, we’ve only talked about uniform data sizes. Actual data come in many different sizes:
      • Single bits: (“boolean” values, true or false)
      • Bytes (8 bits): Characters (ASCII), very small integers
      • Halfwords (16 bits): Characters (Unicode), short integers
      • Words (32 bits): Long integers, floating-point (FP) numbers
      • Double-words (64 bits): Very long integers, double-precision FP
      • Quad-words (128 bits): Quad-precision floating-point numbers

  28. Different Data Sizes
      How do we handle different data sizes?
      • Pick one size to be the unit stored in a single address
      • Store a larger datum in a set of contiguous memory locations
      • Store a smaller datum in one location; use shift & mask ops
      Today, almost all machines (including MIPS) are “byte-addressable”: each addressable location in memory holds 8 bits.

  29. MIPS Memory
      On a byte-addressable machine such as the MIPS, if we say a word (32 bits) is stored “at” address 80, we mean it occupies locations 80-83. (The next word would start at 84.)
      Normally, multi-byte loads and stores must be “aligned”: the address of an n-byte load/store must be a multiple of n. For instance, halfwords can only be stored at even addresses.
      MIPS allows non-aligned loads and stores using special instructions, but they may be slower. (Most processors don’t allow this at all!)
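
The alignment rule and the "occupies locations 80-83" statement reduce to simple arithmetic. The helper names `is_aligned` and `occupied_bytes` below are hypothetical, introduced only for this sketch.

```python
def is_aligned(address, size):
    """An n-byte access is aligned when its address is a multiple of n."""
    return address % size == 0

def occupied_bytes(address, size):
    """Byte addresses covered by a datum of `size` bytes stored at `address`."""
    return list(range(address, address + size))

print(occupied_bytes(80, 4))  # [80, 81, 82, 83]
print(is_aligned(80, 4))      # True: 80 is a multiple of 4
print(is_aligned(82, 4))      # False: a word cannot start at 82
print(is_aligned(82, 2))      # True: a halfword can start at any even address
```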

  30. Byte-Order (“Endianness”)
      • For a multi-byte datum, which part goes in which byte?
      • If $1 contains 1,000,000 (F4240H) and we store it into address 80:
        • On a “big-endian” machine, the “big” end goes into address 80
        • On a “little-endian” machine, it’s the other way around

      Address          … 79   80   81   82   83   84 …
      Big-endian              00   0F   42   40
      Little-endian           40   42   0F   00
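
The two byte layouts can be reproduced with Python's built-in `int.to_bytes`. This is a small illustration rather than anything MIPS-specific; the variable names are arbitrary.

```python
value = 1_000_000  # 0xF4240, the example from the slide

big = value.to_bytes(4, "big")        # most significant byte first
little = value.to_bytes(4, "little")  # least significant byte first

print(big.hex(" "))     # 00 0f 42 40
print(little.hex(" "))  # 40 42 0f 00

# Reading bytes back with the wrong byte order silently gives a different number.
misread = int.from_bytes(big, "little")
print(hex(misread))     # 0x40420f00
```

The last line shows the compatibility hazard when multi-byte data moves between machines of different endianness: the bytes are intact, but the value is not.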

  31. Big-Endian vs. Little-Endian
      • Big-endian machines: MIPS, Sparc, 68000
      • Little-endian machines: most Intel processors (including the 8086), Alpha, VAX
      • No real reason one is better than the other…
      • Compatibility problems arise when transferring multi-byte data between big-endian and little-endian machines. CAREFUL!
      [Read Appendix A-43 for more information.]

  32. Addressing Modes
      • An ISA’s addressing modes answer the question: “where can operands be located?”
      • We have two types of storage in the MIPS (and most other machines): registers and main memory.
      • We can go to either or both for operands. A single operand can come from either a register or a memory location, and addressing modes offer various ways of specifying this location.

  33. Simple Addressing Modes
      In these modes, a location or datum is given directly in the instruction:

  34. Indirect Addressing Modes
      One or more registers are used to produce a memory address:

  35. Advanced Addressing Modes
      Extra modes that support high-level language features or reduce the number of instructions needed for common memory accesses:

  36. Choices in Addressing Modes
      Anything goes: Any addressing mode may be used for any operand at any time
        • Easier to map high-level statements directly to instructions
        • Hard to design the processor, due to all the complexity
      Limited addressing: Only allow a few modes, and/or restrict some operands to certain modes
        • Harder for the compiler/programmer to follow all the rules
        • Code may be longer

  37. Frequency of Addressing Modes
      3 programs measured on VAX, which supports all kinds of modes:
      [Table: mode name vs. frequency of mode (%), with min., avg., and max. columns]

  38. Empirical Data on Addressing Modes
      • How big do the displacements need to be?
        • In a study of SPECint92 and SPECfp92, 99% of displacements fell within ±2^15
      • How big do the immediates (constants) need to be?
        • Studies show: 50%-60% fit within 8 bits
        • 75%-80% fit within 16 bits
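
A range of ±2^15 is what a 16-bit two's-complement field can hold, so these figures translate directly into field widths. The check below is a sketch; the function name `fits_signed` is hypothetical.

```python
def fits_signed(value, bits):
    """True if `value` is representable as a two's-complement integer of `bits` bits."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= value <= high

print(fits_signed(32767, 16))   # True:  2^15 - 1 is the largest 16-bit value
print(fits_signed(32768, 16))   # False: 2^15 itself needs 17 bits
print(fits_signed(-32768, 16))  # True:  -2^15 is the smallest 16-bit value
print(fits_signed(100, 8))      # True:  small constants fit in 8 bits
```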

  39. How Do We Represent Instructions?
      • We need some bits to tell what operation is performed (e.g., add, sub, mul, etc.); this is called the opcode.
      • We need some bits for each operand and result (3 total, in our case):
        • What type of addressing mode
        • Number of the register, memory address and/or immediate constant

  40. Variable-Length Instructions
      Since the VAX allows any mode for any operand, there could be an instruction with three 32-bit addresses (direct addressing), so that instruction needs more than 12 bytes. But registers need only a few bits to specify, so 12 bytes would be wasteful for an instruction using 3 registers only! We must use variable-length instructions. On the VAX, instructions can vary from 1 to 17 bytes!

  41. Fixed-Length Instructions
      If every instruction has the same number of bits (preferably a nice even number like 16 or 32), many components of the processor will be simpler. But we either waste some space or can’t support all the addressing modes!

  42. Loading Small Integers
      • All registers in MIPS are 32 bits
      • What if we load a byte or halfword into a reg?
      • Load the bits into the lowest 8 or 16 bits of the reg.
        Unsigned load: All upper bits set to 0
        Signed load: All upper bits set to the sign bit (MSB of the byte/halfword)
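
The two load flavors differ only in how the upper bits of the 32-bit register are filled. The sketch below models the resulting register contents as unsigned 32-bit integers; the function names `zero_extend` and `sign_extend` are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def zero_extend(value, bits):
    """Unsigned load: keep the low `bits` bits, upper bits become 0."""
    return value & ((1 << bits) - 1)

def sign_extend(value, bits):
    """Signed load: copy the MSB of the `bits`-wide datum into all upper bits."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    # XOR then subtract turns the datum into a signed Python int;
    # the mask converts it back to a 32-bit register image.
    return ((value ^ sign_bit) - sign_bit) & MASK32

print(hex(zero_extend(0xFF, 8)))     # 0xff
print(hex(sign_extend(0xFF, 8)))     # 0xffffffff  (byte 0xFF is -1)
print(hex(sign_extend(0x7F, 8)))     # 0x7f        (sign bit clear, nothing changes)
print(hex(sign_extend(0x8000, 16)))  # 0xffff8000  (halfword 0x8000 is -32768)
```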

  43. The RISC Approach
      In a Reduced Instruction Set Computer:
      • All instructions are the same size (32 bits on the MIPS)
      • Few addressing modes are supported (only the frequent ones)
      • Only a few instruction formats (makes decoding easier!)
      • Arithmetic instructions can only work on registers
      • Data in memory must be loaded into registers before processing
        • This is called a “load-store” architecture

  44. RISC Criteria [Colwell 85]
      • Single-cycle operation
      • Load/store machine
      • Hardwired control
      • Relatively few instructions and addressing modes
      • Fixed instruction format
      • More compile-time effort