SCSC 311 Information Systems: hardware and software
Objectives • CPU execution cycle • CPU instructions • Instruction format • CPU design: CISC vs. RISC • CPU registers • Enhancing CPU performance • The limitations of semiconductor-based microprocessors
Review: CPU Components • Control unit • Moves data and instructions between main memory and registers • Arithmetic logic unit (ALU) • Performs computation and comparison operations • Set of registers • Storage locations that hold inputs and outputs for the ALU
A complex chain of events occurs when the CPU executes a program.
Instructions and Instruction Sets • Instruction • The lowest-level command to the CPU • A bit string, logically divided into components (op code and operands) • Op code: the unique binary number that represents an instruction • Operands: the input values for the instruction (data or addresses) • An instruction set is the collection of instructions that a CPU can process • Varies among different CPUs • Three types of instructions • data movement • data transformation • sequence control
Three types of instructions • Data Movement Instructions • Copy data among registers, primary storage, secondary storage, and I/O devices • Load: data transfer from RAM to a register • Store: data transfer from a register to RAM • Data Transformation Instructions (details later) • Boolean operations (NOT, AND, OR, XOR) • Addition (ADD) • Bit manipulation (SHIFT) • Logical shift • Arithmetic shift
Three types of instructions • Sequence control instructions alter the flow of instruction execution • Branch instruction • Normally the control unit fetches the next sequential instruction from RAM at the end of each execution cycle. Q: How would the control unit accomplish a branch? • Ans: In a branch instruction, one operand contains the RAM address of the next instruction, which is loaded into the PC register • An unconditional branch always departs from the normal sequential execution sequence • A conditional branch departs from the normal sequential execution sequence only if a specific condition is met • The control unit checks a register that contains the result of a Boolean operation • The halt instruction suspends the flow of instruction execution in the current program. Q: What happens when an executing program halts?
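The effect of branch and halt instructions on the program counter can be sketched with a toy interpreter. This is only an illustration: the instruction names (`SET`, `ADD`, `BRANCH`, `BRANCH_IF_NONZERO`, `HALT`), the four-register machine, and the tuple encoding are all hypothetical and do not correspond to any real CPU.

```python
def run(program):
    """Execute a list of instruction tuples and return the final registers."""
    registers = [0] * 4   # hypothetical machine with four registers
    pc = 0                # program counter: index of the next instruction
    while True:
        instruction = program[pc]
        pc += 1           # default: continue with the next sequential instruction
        op = instruction[0]
        if op == "SET":
            _, reg, value = instruction
            registers[reg] = value
        elif op == "ADD":
            _, dst, a, b = instruction
            registers[dst] = registers[a] + registers[b]
        elif op == "BRANCH":
            # Unconditional branch: the operand replaces the PC contents.
            pc = instruction[1]
        elif op == "BRANCH_IF_NONZERO":
            # Conditional branch: the PC changes only if the tested register is nonzero.
            _, reg, target = instruction
            if registers[reg] != 0:
                pc = target
        elif op == "HALT":
            return registers
        else:
            raise ValueError(f"unknown op code: {op}")

# Sum 3 + 2 + 1 with a loop built from a conditional branch.
countdown = [
    ("SET", 0, 0),                # r0 = running sum
    ("SET", 1, 3),                # r1 = loop counter
    ("SET", 2, -1),               # r2 = constant -1
    ("ADD", 0, 0, 1),             # r0 = r0 + r1
    ("ADD", 1, 1, 2),             # r1 = r1 - 1
    ("BRANCH_IF_NONZERO", 1, 3),  # repeat from address 3 while r1 != 0
    ("HALT",),
]
print(run(countdown))
```

The only thing a branch does here is overwrite `pc`; every other instruction leaves the sequential increment in place.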
Six Data Transformation Instructions • The rules of NOT, AND, OR, XOR, ADD (self-study) • ADD • SHIFT (next slide)
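As a self-study aid, the Boolean rules can be checked with Python's bitwise operators applied to single bits. This is a minimal sketch; the function names `truth_table` and `not_bit` are made up for this example.

```python
def truth_table():
    """Return rows (a, b, a AND b, a OR b, a XOR b) for all one-bit inputs."""
    return [(a, b, a & b, a | b, a ^ b) for a in (0, 1) for b in (0, 1)]

def not_bit(a):
    """NOT on a single bit: flips 0 to 1 and 1 to 0."""
    return a ^ 1

print("a b AND OR XOR")
for row in truth_table():
    print(*row)
```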
SHIFT Instruction • Two types of SHIFT instruction • Logical SHIFT and arithmetic SHIFT • Logical SHIFT Q: What can a logical SHIFT instruction do?
Logical SHIFT Ans: A logical SHIFT instruction can extract a single bit from a bit string. • E.g. 1, extract the fourth bit and move it to the rightmost position • E.g. 2, extract and check the sign bit of a 2’s complement number
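Both examples can be sketched in Python, where `>>` on a non-negative integer behaves like a logical right shift. The sketch assumes an 8-bit word and counts bit positions from the right (position 1 is the rightmost bit); the slide does not state either convention. It also isolates the bit with an AND mask, which is one of several ways to do it.

```python
def extract_bit(value, position):
    """Return the bit at `position` (1 = rightmost) via a right shift and a mask."""
    return (value >> (position - 1)) & 1

def sign_bit(value, width=8):
    """Return the leftmost bit of a `width`-bit pattern (1 means negative in 2's complement)."""
    return (value >> (width - 1)) & 1

pattern = 0b01101000
print(extract_bit(pattern, 4))   # fourth bit from the right
print(sign_bit(0b10000001))      # sign bit of an 8-bit 2's complement pattern
```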
Arithmetic SHIFT Arithmetic SHIFT instructions perform multiplication or division by powers of two. • For unsigned binary numbers: e.g. 1: multiply by 2 (shift left one position), e.g. 2: divide by 4 (shift right two positions) • For 2’s complement numbers: the sign bit must be preserved
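A short sketch of both cases, assuming 8-bit words. The helper names `to_signed` and `arithmetic_shift_right` are hypothetical; the second relies on the fact that Python's `>>` on a negative integer copies the sign, which matches the "preserve the sign bit" rule.

```python
def to_signed(pattern, width=8):
    """Interpret a `width`-bit pattern as a 2's complement integer."""
    if pattern & (1 << (width - 1)):
        return pattern - (1 << width)
    return pattern

def arithmetic_shift_right(pattern, count, width=8):
    """Shift right while copying the sign bit into the vacated positions."""
    mask = (1 << width) - 1
    return (to_signed(pattern, width) >> count) & mask

# Unsigned: shifting left by one multiplies by 2, shifting right by two divides by 4.
print(0b00000101 << 1)   # 5 * 2
print(0b00010100 >> 2)   # 20 / 4

# 2's complement: -8 shifted right by two positions, sign preserved.
shifted = arithmetic_shift_right(0b11111000, 2)
print(format(shifted, "08b"), to_signed(shifted))
```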
Complex Processing Operations • Complex processing operations can be implemented by combining the primitive instructions • Examples … • Most CPUs provide a much larger instruction set • They directly support complex instructions, such as multiplication, division, … Q: Why include these complex instructions in the CPU instruction set?
Complex Processing Operations • Ans: Tradeoffs in CPU design • Tradeoff between CPU complexity and programming simplicity • Directly supporting complex instructions complicates circuitry but reduces programming complexity • Tradeoff between CPU complexity and program execution speed • Multi-step instruction sequences execute faster when implemented in hardware as a single instruction, which avoids the overhead of fetching and decoding each step • Note: additional instructions are required when new data types are added • E.g., if the CPU supports a double-precision floating point data type, a set of instructions is required for that data type
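To illustrate how a complex operation can be built from primitives, here is a sketch of unsigned multiplication using only ADD, SHIFT, and AND (the classic shift-and-add method). It is an illustration of the idea, not a description of how any particular CPU implements multiplication.

```python
def multiply(a, b):
    """Multiply two unsigned integers using only ADD, SHIFT, and AND."""
    product = 0
    while b:
        if b & 1:          # AND: is the lowest bit of the multiplier set?
            product += a   # ADD the current shifted multiplicand
        a <<= 1            # SHIFT left: multiplicand times 2
        b >>= 1            # SHIFT right: move to the next multiplier bit
    return product

print(multiply(6, 7))
```

A software loop like this costs several instructions per multiplier bit, which is the overhead a single hardware multiply instruction avoids.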
Instruction Format • Instruction format is a template • specifies the number of operands • specifies the position and length of the op code and operand(s) • Since instructions vary in the number and type of operands, CPUs support multiple instruction formats (in the next slide) • Instruction formats vary among CPUs: • op code size • meaning of specific op code • length and coding format of operand • Etc. ...
(1) A 20-bit instruction with register inputs and output: 8-bit op code, three 4-bit operands that store register numbers (2) A 32-bit load and store instruction: 8-bit op code, two 4-bit operands, one 16-bit operand
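The 20-bit format in (1) can be sketched as bit packing. The field order (op code in the highest bits, then the three register numbers) and the op code value `0x2A` are assumptions made for this example; the slide gives only the field widths.

```python
def encode(op_code, r1, r2, r3):
    """Pack an 8-bit op code and three 4-bit register numbers into 20 bits."""
    if not 0 <= op_code < 256:
        raise ValueError("op code must fit in 8 bits")
    if not all(0 <= r < 16 for r in (r1, r2, r3)):
        raise ValueError("register numbers must fit in 4 bits")
    return (op_code << 12) | (r1 << 8) | (r2 << 4) | r3

def decode(instruction):
    """Split a 20-bit instruction back into (op code, r1, r2, r3)."""
    return (
        (instruction >> 12) & 0xFF,  # top 8 bits: op code
        (instruction >> 8) & 0xF,    # next 4 bits: first operand
        (instruction >> 4) & 0xF,    # next 4 bits: second operand
        instruction & 0xF,           # low 4 bits: third operand
    )

word = encode(0x2A, 1, 2, 3)         # 0x2A is a made-up op code
print(format(word, "020b"), decode(word))
```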
Complex Instruction Set Computing (CISC) • CISC • In the early days, memory was expensive and slow • CPU designers provided complex instructions that do more work per instruction: each complex instruction requires less memory and execution time than the equivalent sequence of simple instructions • Features of CISC • Needs less memory for program storage and execution • Complex instructions usually have variable instruction length • A large instruction set complicates CPU design: CISC CPUs are complicated and hard to manufacture
Reduced Instruction Set Computing (RISC) • RISC is a relatively new philosophy of CPU design (1980s - ) • Omits some complex instructions from the instruction set • RISC CPUs do not combine data transformation and data movement in one instruction • RISC uses fixed-length instructions, short instruction length, and a large number of general-purpose registers • Features of RISC • Needs more memory for program storage and execution • Inefficient at executing complex operations, which require several simple instructions • A RISC CPU is simple and easy to manufacture
RISC vs. CISC • OSs are implemented based on CPU design • RISC chip: e.g. Hewlett-Packard's PA-RISC processor • Apple's Power Macintosh and some versions of Linux are implemented on RISC processors • CISC chip: e.g. Intel Pentium, Xeon (Itanium, often listed alongside them, uses Intel's EPIC design rather than CISC) • Windows OSs are implemented on CISC processors • Pros and cons of CISC and RISC, e.g. c = a * b Q1: Which CPU design is better? Q2: Why does Intel use a CISC design?
CPU Registers • Two primary roles of registers • general-purpose registers: hold data for currently executing program that is needed quickly or frequently • special-purpose registers: store information about currently executing program and status of CPU
General-Purpose Registers • General-purpose registers hold intermediate results and frequently needed data items • Like a scratch-pad for the CPU • Used only by the currently executing program • Implemented within the CPU, so that contents can be read or written quickly • Increasing the number of general-purpose registers usually decreases program execution time, up to a point Q: Why is that?
Special-Purpose Registers • CPU uses special-purpose registers to track processor and program status • Some special purpose registers • Instruction register • Instruction pointer (a.k.a. program counter) • Program status word (PSW) • each bit in PSW is called a flag • At the end of each execution cycle, control unit tests PSW flags to determine whether an error has occurred. • Examples of PSW bits …
Word Size • A word is a unit of data that contains a fixed number of bits. • The amount of data that a CPU processes at a time • 32-bit CPU, 64-bit CPU • Word size • matches the size of general-purpose registers • is a fundamental CPU design decision • Implications for system bus design and implementation of RAM • The bus width should be at least as large as the word size • RAM should be able to read / write a word at a time • Increasing word size usually increases CPU efficiency, up to a point Q: Why is that? E.g., doubling word size generally increases the number of CPU components by 2.5 to 3 times
Clock Rate • System clock • A digital circuit that generates timing pulses (ticks) and transmits the pulses to other components • All devices coordinate their activities with the system clock. • The clock rate: the frequency at which the system clock generates timing pulses, measured in MHz or GHz • The CPU cycle time: the inverse of the clock rate, measured in nanoseconds Q: A computer has a clock rate of 5 GHz; what is the CPU cycle time?
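The question can be worked out directly from the definition: cycle time is 1 divided by the clock rate. Since 1 GHz is 10^9 cycles per second and 1 ns is 10^-9 seconds, a rate in GHz inverts directly to a time in nanoseconds. The function name `cycle_time_ns` is just for illustration.

```python
def cycle_time_ns(clock_rate_ghz):
    """Return the CPU cycle time in nanoseconds for a clock rate given in GHz."""
    return 1.0 / clock_rate_ghz

print(cycle_time_ns(5))   # 5 GHz clock
```

For a 5 GHz clock this gives 0.2 ns per cycle.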
Clock Rate • The clock rate / cycle time is only one part of CPU performance measurement • The rate of actual / average instruction execution is measured in MIPS or MFLOPS • Simple instructions need one cycle • Complex instructions usually need multiple cycles • The CPU relies on slower devices (RAM or HD) for its supply of data • The performance of a computer system is decided not only by the CPU, but also by other devices (RAM, system bus … ) • Example … • Wait state: each clock cycle that the CPU spends waiting for a slower device Q: How to enhance the performance of a computer system?
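A rough sketch of how cycles per instruction and wait states reduce the instruction rate. It assumes a simplified model in which MIPS equals the clock rate in MHz divided by the average number of cycles each instruction occupies, wait states included; all the numbers are made up for illustration.

```python
def mips(clock_rate_mhz, cycles_per_instruction, wait_states_per_instruction=0.0):
    """Estimate millions of instructions per second under a simple averaging model."""
    total_cycles = cycles_per_instruction + wait_states_per_instruction
    return clock_rate_mhz / total_cycles

# Hypothetical 1000 MHz CPU averaging 2 cycles per instruction.
print(mips(1000, 2))      # no wait states
print(mips(1000, 2, 2))   # 2 extra wait-state cycles per instruction on average
```

Under this model the same clock rate delivers half the instruction rate once each instruction waits two extra cycles on a slower device.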
Pipelining • Basic observation: each step in the fetch and execution cycle is performed by a separate portion or stage of the CPU circuitry. • Fetch • Increment IP • Decode • Access ALU inputs • Execute arithmetic / comparison • Store ALU output • Pipelining • Organizing CPU circuitry to enable multiple instructions to be in different stages of execution at the same time. • Similar to an assembly line
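The theoretical gain from the six stages listed above can be estimated with a simple count. This sketch assumes an ideal pipeline: every stage takes exactly one clock cycle and no instruction ever stalls. The function names are hypothetical.

```python
def cycles_without_pipeline(instructions, stages=6):
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages

def cycles_with_pipeline(instructions, stages=6):
    """The first instruction fills the pipeline; after that one finishes per cycle."""
    return stages + instructions - 1

n = 100
print(cycles_without_pipeline(n), cycles_with_pipeline(n))
```

For long instruction streams the ideal speedup approaches the number of stages.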
Challenges in Pipelining It is difficult to fully realize the theoretical improvement of pipelining • It is hard to design a CPU in which every stage finishes in one clock cycle. • Some instructions in a program are not executed sequentially. • Conditional branch: the CPU does not know which branch to take until it evaluates the condition, so it has to break the pipeline
Branch prediction and speculative execution • Some solutions to the problem of conditional branches in pipelining • Early evaluation • The CPU looks several instructions ahead of the current one, examines whether there is a conditional branch instruction, and if so, tries to evaluate it earlier. • But this is not always possible • Branch prediction • The CPU guesses which branch to take based on past experience (the CPU may have executed this portion of code before) • Cannot guarantee taking the correct path • Simultaneous execution • The CPU executes both paths until the conditional branch is evaluated, then aborts the incorrect path • Requires redundant CPU stages and registers
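Guessing "based on past experience" can be sketched with a very simple predictor that assumes each branch will do whatever it did last time. This one-bit, last-outcome scheme is an assumption chosen for brevity; the slides do not say which prediction scheme any particular CPU uses, and the class and function names are hypothetical.

```python
class LastOutcomePredictor:
    """Predict that a branch repeats its most recent outcome."""

    def __init__(self):
        self.history = {}   # branch address -> last observed outcome

    def predict(self, branch_address):
        # With no history, guess "not taken".
        return self.history.get(branch_address, False)

    def update(self, branch_address, taken):
        self.history[branch_address] = taken

def accuracy(predictor, branch_address, outcomes):
    """Fraction of outcomes the predictor guessed correctly."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(branch_address) == taken:
            correct += 1
        predictor.update(branch_address, taken)
    return correct / len(outcomes)

# A loop-closing branch: taken nine times, then falls through once.
loop_branch = [True] * 9 + [False]
print(accuracy(LastOutcomePredictor(), 0x40, loop_branch))
```

The predictor misses only the first and last iterations of the loop, which shows both why prediction helps and why it cannot guarantee the correct path.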
Multiprocessing • Multiprocessing: a CPU architecture that duplicates CPUs or processor stages so that they can execute in parallel. • Some approaches to multiprocessing: • Duplicating circuitry for some or all processing stages within a single CPU (80’s), e.g. the Sun UltraSparc CPU duplicates ALUs and registers • Embedding multiple CPUs in a computer system with shared memory (90’s) • Placing multiple CPUs on the same chip with shared memory • Enables the CPUs to communicate at higher speed (2000 -)
The Physical CPU • The CPU is a complex system • Contains millions of switches, which perform basic processing functions • From early CPUs with hundreds of switches to modern CPUs with millions of switches • The physical implementation of CPUs • Switches and gates • Are the basic building blocks of computer processing circuits
Switches and Gates • Electronic switches • Control electrical current flow in a circuit • Implemented as transistors • A solid-state semiconductor device • Controls the flow of electric current • Gates • An interconnection of switches • A circuit that can perform a processing function on an individual binary electrical signal, or bit • The construction of switches and the properties of electricity determine the CPU’s speed and reliability.
Addition circuit: combines a half-adder and an array of full adders (The detailed layout is not required.)
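Although the detailed layout is not required, the logic of such a circuit can be sketched in a few lines: a half-adder handles the lowest bit and a chain of full adders handles the rest, passing the carry along. Bits are listed least significant first, which is a convention chosen for this example.

```python
def half_adder(a, b):
    """Add two bits; return (sum, carry)."""
    return a ^ b, a & b

def full_adder(a, b, carry_in):
    """Add two bits plus an incoming carry, built from two half-adders and an OR."""
    partial_sum, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial_sum, carry_in)
    return total, carry1 | carry2

def add_bits(x_bits, y_bits):
    """Add two equal-length bit lists (least significant bit first)."""
    total, carry = half_adder(x_bits[0], y_bits[0])   # lowest bit: no carry in
    result = [total]
    for a, b in zip(x_bits[1:], y_bits[1:]):
        total, carry = full_adder(a, b, carry)         # carry ripples upward
        result.append(total)
    return result, carry

# 5 + 3 on 4 bits (least significant bit first): 0101 + 0011
print(add_bits([1, 0, 1, 0], [1, 1, 0, 0]))
```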
Processor Fabrication • Performance and reliability of processors have increased with improvements in materials and fabrication techniques • Transistors and integrated circuits (ICs) • Microchips and microprocessors • First microprocessor (1971): 2,300 transistors • Current memory chip: 300 million transistors • Small circuit size, low-resistance materials, and heat dissipation ensure fast and reliable operation • Fabricated using expensive processes, such as the etching process (details are not required)
Current Technology Capabilities and Limitations Moore’s Law: transistor density on microchips doubles every 18-24 months with no increase in unit cost
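The doubling rule can be turned into a quick projection. This is a sketch of the arithmetic only, not a forecast: it starts from the 2,300-transistor figure given for the first microprocessor and assumes a fixed 24-month doubling period, the upper end of the range quoted above.

```python
def projected_transistors(initial_count, years, doubling_period_years=2.0):
    """Project a transistor count assuming one doubling per `doubling_period_years`."""
    return initial_count * 2 ** (years / doubling_period_years)

# Starting from 2,300 transistors, project 4 and 20 years ahead.
print(projected_transistors(2300, 4))
print(projected_transistors(2300, 20))
```

Ten doublings over 20 years multiply the count by 1,024, which is why small changes in the assumed doubling period change long-range projections so much.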
Current Technology Capabilities and Limitations • Rock’s Law • Arthur Rock made a short addendum to Moore’s Law • Cost of fabrication facilities for the latest chip generation doubles every four years • E.g., A fabrication facility using latest production processes costs at least $10 B. Q1: Does Rock’s Law mean CPUs are becoming more expensive? Q2: Would Moore’s Law always be true?
Future Trends • Semiconductors are approaching fundamental physical size limits • Further miniaturization will be more difficult to achieve • The nature of etching process • The limits of semiconducting materials • Some technologies may improve performance beyond semiconductor limitations (details are not required) • Optical processing • Hybrid optical-electrical processing • Quantum processing