Instruction Set Architecture

Presentation Transcript

  1. Instruction Set Architecture • Pradondet Nilagupta, Spring 2001 (original notes from Prof. Mike Schulte) • 204521 Digital System Architecture

  2. Overview of ISA (1/2) • Concentrate on the ISA • Introduce a wide variety of design alternatives for instruction set architectures • Focus on four topics • Classification of instruction set alternatives • Qualitative assessment of the advantages and disadvantages of the various approaches • Presentation and analysis of instruction set measurements that are largely independent of a specific instruction set

  3. Overview of ISA (2/2) • Address the issue of languages and compilers and their bearing on the ISA • Show how these ideas are reflected in the DLX instruction set, which is typical of recent instruction set architectures • Examine a wide variety of architectural measurements • Measurements depend on the programs measured and on the compilers used in making these measurements

  4. Hot Topics in Computer Architecture • 1950s and 1960s: • Computer Arithmetic • 1970s and 1980s: • Instruction Set Design • ISA Appropriate for Compilers • 1990s: • Design of CPU • Design of memory system • Design of I/O system • Multiprocessors • Instruction Set Extensions

  5. Instruction Set Architecture • Instruction set architecture is the structure of a computer that a machine language programmer must understand to write a correct (timing independent) program for that machine. • The instruction set architecture is also the machine description that a hardware designer must understand to design a correct implementation of the computer.

  6. Instruction Set Architecture • The instruction set architecture serves as the interface between software and hardware • [Figure: software / instruction set / hardware layers]

  7. Interface Design • A good interface: • Lasts through many implementations (portability, compatibility) • Is used in many different ways (generality) • Provides convenient functionality to higher levels • Permits an efficient implementation at lower levels

  8. What Are the Components of an ISA? (1/2) • Sometimes known as The Programmer’s Model of the machine • Storage cells • General and special purpose registers in the CPU • Many general purpose cells of same size in memory • Storage associated with I/O devices • The machine instruction set • The instruction set is the entire repertoire of machine operations • Makes use of storage cells, formats, and results of the fetch/execute cycle • i.e., register transfers

  9. What Are the Components of an ISA? (2/2) • The instruction format • Size and meaning of fields within the instruction • The nature of the fetch-execute cycle • Things that are done before the operation code is known

  10. Programmer’s Models of Various Machines [Figure: register sets, memory capacity, and instruction counts of four machines]
      • M6800 (introduced 1975): 6 special purpose registers (A, B, IX, SP, PC, Status); 2^16 bytes of main memory capacity; fewer than 100 instructions
      • I8086 (introduced 1979): data registers AX, BX, CX, DX; address and count registers SP, BP, SI, DI; memory segment registers CS, DS, SS, ES; IP and Status; 2^20 bytes of main memory capacity; more than 120 instructions
      • VAX11 (introduced 1981): 12 general purpose registers R0 to R11; AP, FP, SP, PC; PSW; 2^32 bytes of main memory capacity; more than 300 instructions
      • PPC601 (introduced 1993): 32 64-bit floating point registers; 32 32-bit general purpose registers; more than 50 32-bit special purpose registers; 2^52 bytes of main memory capacity; more than 250 instructions

  11. What Must an Instruction Specify? (1/2) • Which operation to perform • add r0, r1, r3 • Answer: the op code (add, load, branch, etc.) • Where to find the operand or operands • add r0, r1, r3 • In CPU registers, memory cells, I/O locations, or part of the instruction • Place to store the result • add r0, r1, r3 • Again a CPU register or memory cell

  12. What Must an Instruction Specify? (2/2) • Location of the next instruction • add r0, r1, r3 • br endloop • Almost always the memory cell pointed to by the program counter (PC) • Instruction format (encoding) • How is it decoded? • Sometimes there is no operand, or no result, or no next instruction. Can you think of examples?
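The four things an instruction specifies can be made concrete with a small record type and a one-instruction interpreter step. This is a minimal sketch, not part of the original slides: the `instr` and `cpu` types, the opcode names, the 8-register file, and the choice of the first operand of `add` as the destination are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_REGS = 8 };                 /* assumed register file size */
enum opcode { OP_ADD, OP_BR, OP_HALT };

/* Hypothetical instruction record: one field per thing an instruction specifies. */
struct instr {
    enum opcode op;    /* which operation to perform */
    uint8_t dst;       /* place to store the result (register number) */
    uint8_t src1;      /* where to find the first operand */
    uint8_t src2;      /* where to find the second operand */
    uint32_t target;   /* branch target, used only by OP_BR */
};

struct cpu {
    int32_t reg[NUM_REGS];
    uint32_t pc;       /* location of the next instruction */
};

/* Execute one instruction; returns 0 when the program should stop. */
static int step(struct cpu *c, const struct instr *prog, uint32_t len)
{
    assert(c->pc < len);
    const struct instr *i = &prog[c->pc];
    c->pc += 1;                        /* default: next instruction in sequence */
    switch (i->op) {
    case OP_ADD:
        assert(i->dst < NUM_REGS && i->src1 < NUM_REGS && i->src2 < NUM_REGS);
        c->reg[i->dst] = c->reg[i->src1] + c->reg[i->src2];
        return 1;
    case OP_BR:
        c->pc = i->target;             /* no operand, no result: only a new PC */
        return 1;
    case OP_HALT:
        return 0;                      /* no next instruction */
    }
    return 0;
}
```

Calling `step` in a loop until it returns 0 runs a program. The branch and halt cases are two answers to the slide's closing question: a branch has no operand or result, and a halt has no next instruction.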

  13. Instructions Can Be Divided into 3 Classes (1/2) • Data movement instructions • Move data from a memory location or register to another memory location or register without changing its form • Load: source is memory and destination is register • Store: source is register and destination is memory • Arithmetic and logic (ALU) instructions • Change the form of one or more operands to produce a result stored in another location • Add, Sub, Shift, etc.

  14. Instructions Can Be Divided into 3 Classes (2/2) • Branch instructions (control flow instructions) • Alter the normal flow of control from executing the next instruction in sequence • Br Loc, Brz Loc2: unconditional or conditional branches

  15. Examples of Data Movement Instructions (instruction: meaning, machine)
      • MOV A, B: move 16 bits from memory location A to location B (VAX11)
      • LDA A, Addr: load accumulator A with the byte at memory location Addr (M6800)
      • lwz R3, A: move 32-bit data from memory location A to register R3 (PPC601)
      • li $3, 455: load the 32-bit integer 455 into register $3 (MIPS R3000)
      • mov R4, dout: move 16-bit data from R4 to output port dout (DEC PDP11)
      • IN AL, KBD: load a byte from input port KBD to the accumulator (Intel Pentium)
      • LEA.L (A0), A2: load the address pointed to by A0 into A2 (M68000)

  16. Examples of ALU Instructions (instruction: meaning, machine)
      • MULF A, B, C: multiply the 32-bit floating point values at memory locations A and B, store at C (VAX11)
      • nabs r3, r1: store the negative absolute value of r1 in r3 (PPC601)
      • ori $2, $1, 255: store logical OR of reg $1 with 255 into reg $2 (MIPS R3000)
      • DEC R2: decrement the 16-bit value stored in reg R2 (DEC PDP11)
      • SHL AX, 4: shift the 16-bit value in reg AX left by 4 bit positions (Intel 8086)
      • Notice again the complete dissimilarity of both syntax and semantics.

  17. Examples of Branch Instructions (instruction: meaning, machine)
      • BLBS A, Tgt: branch to address Tgt if the least significant bit of memory location A is set (i.e. = 1) (VAX11)
      • bun r2: branch to location in R2 if result of previous floating point computation was Not a Number (NaN) (PPC601)
      • beq $2, $1, 32: branch to location (PC + 4 + 32) if contents of $1 and $2 are equal (MIPS R3000)
      • SOB R4, Loop: decrement R4 and branch to Loop if R4 ≠ 0 (DEC PDP11)
      • JCXZ Addr: jump to Addr if contents of register CX = 0 (Intel 8086)

  18. ISA Metrics • Orthogonality • No special registers, few special cases, all operand modes available with any data type or instruction type • Completeness • Support for a wide range of operations and target applications • Regularity • No overloading for the meanings of instruction fields • Streamlined • Resource needs easily determined • Ease of compilation (programming?), Ease of implementation, Scalability

  19. Instruction Set Design Issues (1/2) • Instruction set design issues include: • Where are operands stored? • registers, memory, stack, accumulator • How many explicit operands are there? • 0, 1, 2, or 3 • How is the operand location specified? • register, immediate, indirect, . . . • What type & size of operands are supported? • byte, int, float, double, string, vector. . .

  20. Instruction Set Design Issues (2/2) • What operations are supported? • add, sub, mul, move, compare . . . • How to encode them into instruction format? • Instructions should be multiples of Bytes.

  21. Evolution of Instruction Sets [Figure: family tree of instruction set styles]
      • Single Accumulator (EDSAC 1950)
      • Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
      • Separation of Programming Model from Implementation: High-level Language Based (B5000 1963); Concept of a Family (IBM 360 1964)
      • General Purpose Register Machines: Complex Instruction Sets (Vax, Intel 8086 1977-80); Load/Store Architecture (CDC 6600, Cray 1 1963-76)
      • RISC (Mips, Sparc, 88000, IBM RS6000, . . . 1987+)

  22. Evolution of Instruction Sets • Major advances in computer architecture are typically associated with landmark instruction set designs • Ex: Stack vs. GPR (System 360) • Design decisions must take into account: • technology • machine organization • programming languages • compiler technology • operating systems • The design decisions in turn influence these.

  23. Classifying ISAs
      • Accumulator (before 1960), 1 address: add A (acc <- acc + mem[A])
      • Stack (1960s to 1970s), 0 address: add (tos <- tos + next)
      • Memory-Memory (1970s to 1980s), 2 address: add A, B (mem[A] <- mem[A] + mem[B]); 3 address: add A, B, C (mem[A] <- mem[B] + mem[C])
      • Register-Memory (1970s to present), 2 address: add R1, A (R1 <- R1 + mem[A]); load R1, A (R1 <- mem[A])
      • Register-Register (Load/Store) (1960s to present), 3 address: add R1, R2, R3 (R1 <- R2 + R3); load R1, R2 (R1 <- mem[R2]); store R1, R2 (mem[R1] <- R2)

  24. Comparison of ISA Classes • Code Sequence for C = A+B • Memory efficiency? Instruction access? Data access?

  25. Comparison of ISA Classes
      • Stack: Push A; Push B; Add; Pop C
      • Accumulator: Load A; Add B; Store C
      • Register (register-memory): Load R1, A; Add R1, B; Store C, R1
      • Register (load/store): Load R1, A; Load R2, B; Add R3, R1, R2; Store C, R3
      • Memory efficiency? Instruction access? Data access?

  26. Ex. Expression Evaluation for 3-, 2-, 1-, and 0-Address Machines • Number of instructions & number of addresses both vary • Discuss as examples: size of code in each case
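One way to approach the code-size discussion is to count instruction bits for the C = A+B sequences from the comparison slide. The sketch below is an illustration only: it assumes 8-bit opcodes and 24-bit memory addresses, plus an assumed 5-bit register specifier, and it ignores padding to byte boundaries. All function and constant names are hypothetical.

```c
#include <assert.h>

/* Assumed field widths; only a sketch, real encodings differ. */
enum { OPCODE_BITS = 8, ADDR_BITS = 24, REG_BITS = 5 };

/* Bits in one instruction with the given number of memory-address
   fields and register fields. */
static int instr_bits(int mem_addrs, int regs)
{
    assert(mem_addrs >= 0 && regs >= 0);
    return OPCODE_BITS + mem_addrs * ADDR_BITS + regs * REG_BITS;
}

/* Stack: push A; push B; add; pop C */
static int stack_bits(void)
{
    return 3 * instr_bits(1, 0) + instr_bits(0, 0);
}

/* Accumulator: load A; add B; store C */
static int accumulator_bits(void)
{
    return 3 * instr_bits(1, 0);
}

/* Register-memory: load R1, A; add R1, B; store C, R1 */
static int reg_mem_bits(void)
{
    return 3 * instr_bits(1, 1);
}

/* Load/store: load R1, A; load R2, B; add R3, R1, R2; store C, R3 */
static int load_store_bits(void)
{
    return 3 * instr_bits(1, 1) + instr_bits(0, 3);
}
```

Under these assumptions the totals are 104 bits (stack), 96 bits (accumulator), 111 bits (register-memory), and 134 bits (load/store). The ranking is sensitive to the assumed widths and to the expression being evaluated, so this is a starting point for the discussion and not a general result.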

  27. Stack Architectures • Instruction set: add, sub, mult, div, . . . push A, pop A • Example: A*B - (A+C*B): push A; push B; mul; push A; push C; push B; mul; add; sub • [Figure: stack contents after each step, ending with the result on top]
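The push/mul/add/sub sequence above can be traced with a small simulator. This is a minimal sketch, not part of the original slides: the `stack_machine` type and the `sm_*` helper names are hypothetical, and it assumes `sub` computes (second from top) - (top), which is what makes the sequence yield A*B - (A+C*B).

```c
#include <assert.h>

#define STACK_MAX 16

struct stack_machine {
    int stack[STACK_MAX];
    int sp;                 /* number of values currently on the stack */
};

static void sm_push(struct stack_machine *m, int v)
{
    assert(m->sp < STACK_MAX);
    m->stack[m->sp++] = v;
}

static int sm_pop(struct stack_machine *m)
{
    assert(m->sp > 0);
    return m->stack[--m->sp];
}

/* 0-address ALU operations: both operands are implicit (top two entries). */
static void sm_mul(struct stack_machine *m)
{
    int top = sm_pop(m), next = sm_pop(m);
    sm_push(m, next * top);
}

static void sm_add(struct stack_machine *m)
{
    int top = sm_pop(m), next = sm_pop(m);
    sm_push(m, next + top);
}

static void sm_sub(struct stack_machine *m)
{
    int top = sm_pop(m), next = sm_pop(m);
    sm_push(m, next - top);  /* assumed operand order */
}

/* Evaluate A*B - (A+C*B) with the instruction sequence from the slide. */
static int sm_example(int a, int b, int c)
{
    struct stack_machine m = { {0}, 0 };
    sm_push(&m, a); sm_push(&m, b); sm_mul(&m);                   /* A*B */
    sm_push(&m, a); sm_push(&m, c); sm_push(&m, b); sm_mul(&m);   /* C*B */
    sm_add(&m);                                                   /* A + C*B */
    sm_sub(&m);                                                   /* A*B - (A + C*B) */
    return sm_pop(&m);
}
```

For example, `sm_example(2, 3, 4)` returns -8, which is 2*3 - (2 + 4*3). Note that A and B are each pushed twice, since a value consumed by an operation is gone from the stack.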

  28. The 0-Address, or Stack, Machine and Instruction Format [Figure: CPU with stack (TOS, SOS, etc.) and program counter, memory, and instruction formats]
      • push Op1 (TOS <- Op1): 8-bit opcode followed by a 24-bit address, Op1Addr
      • add (TOS <- TOS + SOS): 8-bit opcode only; the opcode says which operation, where to find the operands, and where to put the result (on the stack)
      • The 24-bit program counter says where to find the next instruction

  29. Stacks: Pros and Cons • Pros • Good code density (implicit top of stack) • Low hardware requirements • Easy to write a simple compiler for stack architectures • Cons • Stack becomes the bottleneck • Little ability for parallelism or pipelining • Data is not always at the top of the stack when needed, so additional instructions like TOP and SWAP are needed • Difficult to write an optimizing compiler for stack architectures

  30. Accumulator Architectures • Instruction set: add A, sub A, mult A, div A, . . . load A, store A • Example: A*B - (A+C*B): load B; mul C; add A; store D; load A; mul B; sub D • [Figure: accumulator contents after each step, ending with the result]
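The accumulator sequence above can be traced the same way, and counting data memory accesses along the way shows where the memory traffic comes from. This is a sketch under assumptions: the `acc_machine` type, the `am_*` names, and the four-word memory holding A, B, C, and the temporary D are hypothetical.

```c
#include <assert.h>

enum { AM_A, AM_B, AM_C, AM_D, AM_WORDS };   /* assumed memory layout */

struct acc_machine {
    int acc;                 /* the single accumulator */
    int mem[AM_WORDS];
    int accesses;            /* data memory reads and writes */
};

static int am_read(struct acc_machine *m, int addr)
{
    assert(addr >= 0 && addr < AM_WORDS);
    m->accesses++;
    return m->mem[addr];
}

static void am_write(struct acc_machine *m, int addr, int v)
{
    assert(addr >= 0 && addr < AM_WORDS);
    m->accesses++;
    m->mem[addr] = v;
}

/* 1-address instructions: the accumulator is the implicit second operand. */
static void am_load(struct acc_machine *m, int addr)  { m->acc = am_read(m, addr); }
static void am_store(struct acc_machine *m, int addr) { am_write(m, addr, m->acc); }
static void am_add(struct acc_machine *m, int addr)   { m->acc += am_read(m, addr); }
static void am_sub(struct acc_machine *m, int addr)   { m->acc -= am_read(m, addr); }
static void am_mul(struct acc_machine *m, int addr)   { m->acc *= am_read(m, addr); }

/* Evaluate A*B - (A+C*B) with the instruction sequence from the slide. */
static int am_example(struct acc_machine *m)
{
    am_load(m, AM_B);
    am_mul(m, AM_C);     /* B*C */
    am_add(m, AM_A);     /* A + B*C */
    am_store(m, AM_D);   /* spill to temporary D */
    am_load(m, AM_A);
    am_mul(m, AM_B);     /* A*B */
    am_sub(m, AM_D);     /* A*B - (A + B*C) */
    return m->acc;
}
```

With A = 2, B = 3, C = 4 the result is -8 after 7 data accesses, one per instruction. The store to D exists only because there is a single accumulator, so the intermediate value has to go through memory.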

  31. 1-Address Machine and Instruction Format [Figure: CPU with accumulator and program counter, memory, and instruction format] • add Op1 (Acc <- Acc + Op1): 8-bit opcode followed by a 24-bit address, Op1Addr • The opcode says which operation; Op1Addr says where to find operand 1; the accumulator is where to find operand 2 and where to put the result; the 24-bit program counter says where to find the next instruction • Special CPU register, the accumulator, supplies 1 operand and stores the result • One memory address is used for the other operand • Need instructions to load and store operands: LDA OpAddr, STA OpAddr

  32. Accumulators: Pros and Cons • Pros • Very low hardware requirements • Easy to design and understand • Cons • Accumulator becomes the bottleneck • Little ability for parallelism or pipelining • High memory traffic

  33. Memory-Memory Architectures
      • Instruction set (3 operands): add A, B, C; sub A, B, C; mul A, B, C
      • Instruction set (2 operands): add A, B; sub A, B; mul A, B
      • Example: A*B - (A+C*B)
      • 3 operands: mul D, A, B; mul E, C, B; add E, A, E; sub E, D, E
      • 2 operands: mov D, A; mul D, B; mov E, C; mul E, B; add E, A; sub E, D

  34. The 2-Address Machine and Instruction Format [Figure: CPU with program counter, memory, and instruction format] • add Op2, Op1 (Op2 <- Op2 + Op1): 8-bit opcode followed by two 24-bit addresses, Op2Addr and Op1Addr • The opcode says which operation; the two addresses say where to find the operands; Op2Addr also says where to put the result; the 24-bit program counter says where to find the next instruction • Result overwrites Operand 2 • Needs only 2 addresses in the instruction but gives less choice in placing data

  35. Memory-Memory: Pros and Cons • Pros • Requires fewer instructions (especially if 3 operands) • Easy to write compilers for (especially if 3 operands) • Cons • Very high memory traffic (especially if 3 operands) • Variable number of clocks per instruction • With two operands, more data movements are required
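The trade-off between instruction count and memory traffic can be checked on the 3-operand sequence for A*B - (A+C*B) from the Memory-Memory Architectures slide. The following sketch is illustrative only: the `mm_machine` type, the `mm_exec` helper, and the five-word memory layout are hypothetical, and every instruction is assumed to read two operands from memory and write one result back.

```c
#include <assert.h>

enum { MM_LOC_A, MM_LOC_B, MM_LOC_C, MM_LOC_D, MM_LOC_E, MM_WORDS };
enum mm_op { MM_ADD, MM_SUB, MM_MUL };

struct mm_machine {
    int mem[MM_WORDS];
    int accesses;            /* data memory reads and writes */
};

/* Execute "op dst, src1, src2": mem[dst] <- mem[src1] op mem[src2]. */
static void mm_exec(struct mm_machine *m, enum mm_op op,
                    int dst, int src1, int src2)
{
    assert(dst >= 0 && dst < MM_WORDS);
    assert(src1 >= 0 && src1 < MM_WORDS);
    assert(src2 >= 0 && src2 < MM_WORDS);
    int x = m->mem[src1];
    int y = m->mem[src2];
    int r = (op == MM_ADD) ? x + y : (op == MM_SUB) ? x - y : x * y;
    m->mem[dst] = r;
    m->accesses += 3;        /* two operand reads and one result write */
}

/* The 3-operand sequence from the slide; the result ends up in E. */
static int mm_example(struct mm_machine *m)
{
    mm_exec(m, MM_MUL, MM_LOC_D, MM_LOC_A, MM_LOC_B);   /* D = A*B */
    mm_exec(m, MM_MUL, MM_LOC_E, MM_LOC_C, MM_LOC_B);   /* E = C*B */
    mm_exec(m, MM_ADD, MM_LOC_E, MM_LOC_A, MM_LOC_E);   /* E = A + C*B */
    mm_exec(m, MM_SUB, MM_LOC_E, MM_LOC_D, MM_LOC_E);   /* E = A*B - (A + C*B) */
    return m->mem[MM_LOC_E];
}
```

With A = 2, B = 3, C = 4 this yields -8 in only 4 instructions, but with 12 data memory accesses, which is the "fewer instructions, very high memory traffic" combination listed above.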

  36. Register-Memory Architectures
      • Instruction set: add R1, A; sub R1, A; mul R1, B; load R1, A; store R1, A
      • Example: A*B - (A+C*B)
        load R1, A
        mul R1, B /* A*B */
        store R1, D
        load R2, C
        mul R2, B /* C*B */
        add R2, A /* A + CB */
        sub R2, D /* AB - (A + C*B) */

  37. Register-Memory: Pros and Cons • Pros • Some data can be accessed without loading first • Instruction format easy to encode • Good code density • Cons • Operands are not equivalent (poor orthogonality) • Variable number of clocks per instruction • May limit number of registers

  38. Load-Store Architectures
      • Instruction set: add R1, R2, R3; sub R1, R2, R3; mul R1, R2, R3; load R1, R4; store R1, R4
      • Example: A*B - (A+C*B)
        load R1, &A
        load R2, &B
        load R3, &C
        load R4, R1
        load R5, R2
        load R6, R3
        mul R7, R6, R5 /* C*B */
        add R8, R7, R4 /* A + C*B */
        mul R9, R4, R5 /* A*B */
        sub R10, R9, R8 /* A*B - (A+C*B) */
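The load-store example can be traced with a register-file simulator in which only loads touch data memory. This is a sketch under assumptions: the `ls_machine` type and `ls_*` names are hypothetical, R1 is set to the address of A before the first data load, and `load Rd, &X` is modeled as placing an address in a register without a data memory access.

```c
#include <assert.h>

enum { LS_ADDR_A, LS_ADDR_B, LS_ADDR_C, LS_WORDS };   /* assumed memory layout */
enum { LS_REGS = 16 };

struct ls_machine {
    int reg[LS_REGS];
    int mem[LS_WORDS];
    int loads;               /* data memory reads */
};

/* load Rd, &X : put the address of X into Rd (no data access). */
static void ls_load_addr(struct ls_machine *m, int rd, int addr)
{
    m->reg[rd] = addr;
}

/* load Rd, Ra : Rd <- mem[Ra] */
static void ls_load(struct ls_machine *m, int rd, int ra)
{
    assert(m->reg[ra] >= 0 && m->reg[ra] < LS_WORDS);
    m->reg[rd] = m->mem[m->reg[ra]];
    m->loads++;
}

/* ALU instructions use registers only: Rd <- Rs op Rt */
static void ls_mul(struct ls_machine *m, int rd, int rs, int rt) { m->reg[rd] = m->reg[rs] * m->reg[rt]; }
static void ls_add(struct ls_machine *m, int rd, int rs, int rt) { m->reg[rd] = m->reg[rs] + m->reg[rt]; }
static void ls_sub(struct ls_machine *m, int rd, int rs, int rt) { m->reg[rd] = m->reg[rs] - m->reg[rt]; }

/* Evaluate A*B - (A+C*B); the result ends up in R10. */
static int ls_example(struct ls_machine *m)
{
    ls_load_addr(m, 1, LS_ADDR_A);
    ls_load_addr(m, 2, LS_ADDR_B);
    ls_load_addr(m, 3, LS_ADDR_C);
    ls_load(m, 4, 1);          /* R4 = A */
    ls_load(m, 5, 2);          /* R5 = B */
    ls_load(m, 6, 3);          /* R6 = C */
    ls_mul(m, 7, 6, 5);        /* C*B */
    ls_add(m, 8, 7, 4);        /* A + C*B */
    ls_mul(m, 9, 4, 5);        /* A*B */
    ls_sub(m, 10, 9, 8);       /* A*B - (A + C*B) */
    return m->reg[10];
}
```

With A = 2, B = 3, C = 4 the result is -8 with only 3 data memory reads: each variable is loaded once and then reused from a register, at the cost of a longer instruction sequence.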

  39. The 3-Address Machine and Instruction Format [Figure: CPU with program counter, memory, and instruction format] • add Res, Op1, Op2 (Res <- Op2 + Op1): 8-bit opcode followed by three 24-bit addresses, ResAddr, Op1Addr, and Op2Addr • The opcode says which operation; ResAddr says where to put the result; Op1Addr and Op2Addr say where to find the operands; the 24-bit program counter says where to find the next instruction • Address of next instruction kept in a processor state register, the PC (except for explicit branches/jumps) • Rest of addresses in instruction • Discuss: savings in instruction word size

  40. Load-Store: Pros and Cons • Pros • Simple, fixed length instruction encoding • Instructions take similar number of cycles • Relatively easy to pipeline • Cons • Higher instruction count • Not all instructions need three operands • Dependent on good compiler

  41. Registers: Advantages and Disadvantages • Advantages • Faster than cache (no addressing mode or tags) • Deterministic (no misses) • Can replicate (multiple read ports) • Short identifier (typically 3 to 8 bits) • Reduce memory traffic • Disadvantages • Need to save and restore on procedure calls and context switch • Can’t take the address of a register (for pointers) • Fixed size (can’t store strings or structures efficiently) • Compiler must manage

  42. General Register Machine and Instruction Formats [Figure: CPU with registers and program counter, memory, and instruction formats] • load R8, Op1 (R8 <- Op1): opcode, register field R8, and memory address Op1Addr • add R2, R4, R6 (R2 <- R4 + R6): opcode and three register fields R2, R4, R6

  43. General Register Machine and Instruction Formats • It is the most common choice in today’s general-purpose computers • Which register is specified by small “address” (3 to 6 bits for 8 to 64 registers) • Load and store have one long & one short address: 1½ addresses • Arithmetic instruction has 3 “half” addresses

  44. Real Machines Are Not So Simple • Most real machines have a mixture of 3-, 2-, 1-, 0-, and 1½-address instructions • A distinction can be made on whether arithmetic instructions use data from memory • If ALU instructions only use registers for operands and result, machine type is load-store • Only load and store instructions reference memory • Other machines have a mix of register-memory and memory-memory instructions
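How short register "addresses" keep an arithmetic instruction compact can be seen by packing `add R2, R4, R6` into one word. The layout below is purely an assumption for illustration (8-bit opcode in the top byte, three 5-bit register fields, low 9 bits unused), as are the opcode value and the function names; it does not describe any real machine's encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of a register-register instruction in a 32-bit word:
   [31:24] opcode  [23:19] rd  [18:14] rs1  [13:9] rs2  [8:0] unused */
enum { REG_BITS = 5, REG_MASK = (1 << REG_BITS) - 1 };

static uint32_t encode_rrr(uint8_t opcode, unsigned rd, unsigned rs1, unsigned rs2)
{
    assert(rd <= REG_MASK && rs1 <= REG_MASK && rs2 <= REG_MASK);
    return ((uint32_t)opcode << 24) |
           ((uint32_t)rd << 19) |
           ((uint32_t)rs1 << 14) |
           ((uint32_t)rs2 << 9);
}

/* Field extractors: the decode side of the same layout. */
static unsigned field_opcode(uint32_t w) { return w >> 24; }
static unsigned field_rd(uint32_t w)     { return (w >> 19) & REG_MASK; }
static unsigned field_rs1(uint32_t w)    { return (w >> 14) & REG_MASK; }
static unsigned field_rs2(uint32_t w)    { return (w >> 9) & REG_MASK; }
```

With a hypothetical opcode value of 0x01 for `add`, `encode_rrr(0x01, 2, 4, 6)` gives 0x01110C00. Three register fields take 15 bits in total, less than a single 24-bit memory address, which is why three "half" addresses still fit comfortably in one fixed-length word.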

  45. Byte Ordering • Idea • Bytes in a long word are numbered 0 to 3 • Which is most (least) significant? • Can cause problems when exchanging binary data between machines • Big Endian: Byte 0 is most significant, 3 is least • IBM 360/370, Motorola 68K, Sparc • Little Endian: Byte 0 is least significant, 3 is most • Intel x86, VAX • Alpha • Chip can be configured to operate either way • DEC workstations are little endian • Cray T3E Alphas are big endian
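The data-exchange problem is usually handled by agreeing on one byte order and converting at the boundary. A minimal sketch, with hypothetical function names, of detecting the host's byte order at run time and converting a 32-bit word to big-endian order:

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);    /* inspect byte 0 of the word */
    return first_byte == 1;
}

/* Reverse the four bytes of a 32-bit word. */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24) |
           ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) |
           (x << 24);
}

/* Return a word whose in-memory bytes are in big-endian order,
   whatever the host's own byte order is. */
static uint32_t host_to_big_endian32(uint32_t x)
{
    return host_is_little_endian() ? swap32(x) : x;
}
```

After `host_to_big_endian32(0xf0f1f2f3)`, byte 0 of the stored word is 0xf0 on either kind of host, so the bytes can be written out and read back consistently by any machine that applies the reverse conversion.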

  46. Byte Ordering Example (1/2)
      union {
          unsigned char c[8];
          unsigned short s[4];
          unsigned int i[2];
          unsigned long l[1];
      } dw;
      [Figure: the 8 bytes c[0] to c[7] overlaid by s[0] to s[3], i[0] to i[1], and l[0]]

  47. Byte Ordering Example (2/2)
      int j;
      for (j = 0; j < 8; j++)
          dw.c[j] = 0xf0 + j;
      printf("Characters 0-7 == [0x%x,0x%x,0x%x,0x%x,0x%x,0x%x,0x%x,0x%x]\n",
             dw.c[0], dw.c[1], dw.c[2], dw.c[3], dw.c[4], dw.c[5], dw.c[6], dw.c[7]);
      printf("Shorts 0-3 == [0x%x,0x%x,0x%x,0x%x]\n",
             dw.s[0], dw.s[1], dw.s[2], dw.s[3]);
      printf("Ints 0-1 == [0x%x,0x%x]\n", dw.i[0], dw.i[1]);
      printf("Long 0 == [0x%lx]\n", dw.l[0]);

  48. Byte Ordering on Alpha • Little Endian • [Figure: bytes f0 f1 f2 f3 f4 f5 f6 f7 stored in c[0] to c[7]; within each of s[0] to s[3], i[0] to i[1], and l[0], the lowest-addressed byte is the LSB and the highest-addressed byte is the MSB] • Print Output on Alpha:

  49. Byte Ordering on x86 • Little Endian • [Figure: bytes f0 f1 f2 f3 f4 f5 f6 f7 stored in c[0] to c[7]; within each of s[0] to s[3], i[0] to i[1], and l[0], the lowest-addressed byte is the LSB and the highest-addressed byte is the MSB] • Print Output on Pentium:

  50. Byte Ordering on Sun • Big Endian • [Figure: bytes f0 f1 f2 f3 f4 f5 f6 f7 stored in c[0] to c[7]; within each of s[0] to s[3], i[0] to i[1], and l[0], the lowest-addressed byte is the MSB and the highest-addressed byte is the LSB] • Print Output on Sun:
      Characters 0-7 == [0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7]
      Shorts 0-3 == [0xf0f1,0xf2f3,0xf4f5,0xf6f7]
      Ints 0-1 == [0xf0f1f2f3,0xf4f5f6f7]
      Long 0 == [0xf0f1f2f3]
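The union-based example prints different values depending on the host. To see what each byte order makes of the same eight bytes without depending on the machine running the code, the bytes can be combined explicitly. This is a sketch with hypothetical helper names (`read_be`, `read_le`, `fill_example`), not the original example's method:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret n bytes (1 to 8) starting at p as a big-endian value. */
static uint64_t read_be(const unsigned char *p, size_t n)
{
    assert(p != NULL && n >= 1 && n <= 8);
    uint64_t v = 0;
    for (size_t k = 0; k < n; k++)
        v = (v << 8) | p[k];           /* byte 0 is most significant */
    return v;
}

/* Interpret n bytes (1 to 8) starting at p as a little-endian value. */
static uint64_t read_le(const unsigned char *p, size_t n)
{
    assert(p != NULL && n >= 1 && n <= 8);
    uint64_t v = 0;
    for (size_t k = n; k > 0; k--)
        v = (v << 8) | p[k - 1];       /* byte 0 is least significant */
    return v;
}

/* Fill buf with 0xf0 .. 0xf7, as the loop in the example does. */
static void fill_example(unsigned char buf[8])
{
    for (int j = 0; j < 8; j++)
        buf[j] = (unsigned char)(0xf0 + j);
}
```

After `fill_example(b)`, `read_be(b, 2)` is 0xf0f1 and `read_be(b + 4, 4)` is 0xf4f5f6f7, matching the Sun output above, while `read_le(b, 2)` is 0xf1f0 and `read_le(b, 4)` is 0xf3f2f1f0. The width of `long` is a separate, platform-dependent question; the Sun output shows a 4-byte `long`.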