
Machine Organization (Part 2)

Explore different CPU processing methods like CISC, RISC, multiprocessing, and pipelining. Learn the benefits, limitations, and comparisons between these methods to boost CPU efficiency.

wburt


Presentation Transcript


  1. Machine Organization (Part 2) CHAPTER 3

  2. CPU processing methods

  3. CPU processing methods • In this topic, you will be introduced to the different and interrelated CPU processing methods. • The common goal is to increase the performance of the CPU. Among the methods are: • CISC and RISC • Multiprocessing • Pipelining • Superscalar

  4. CPU Architecture • CISC – Complex Instruction Set Computer • RISC – Reduced Instruction Set Computer • CISC vs. RISC Comparisons

  5. CISC Architecture • Examples • Intel x86, IBM Z-Series Mainframes, older CPU architectures • Characteristics • Few general purpose registers • Many addressing modes • Large number of specialized, complex instructions • Instructions are of varying sizes

  6. Limitations of CISC Architecture • Complex instructions are infrequently used by programmers and compilers • Memory references, loads and stores, are slow and account for a significant fraction of all instructions • Procedure and function calls are a major bottleneck • Passing arguments • Storing and retrieving values in registers

  7. RISC (Reduced Instruction Set Computer) • Attempts to produce more CPU power by eliminating the major bottlenecks to instruction execution speed: • Reducing the number of data memory accesses by using registers more effectively. • Simplifying the instruction set.

  8. RISC (Reduced Instruction Set Computer) • Examples: • PowerPC, Sun SPARC, MIPS, ARM • Features: • Limited, simple instruction set • Fixed-length, fixed-format instruction words • Enables pipelining and parallel fetches and executions

  9. RISC (Reduced Instruction Set Computer) • Features (cont.): • Limited addressing modes – reduces complicated hardware • Register-oriented instruction set – reduces memory accesses • Large bank of registers – reduces memory accesses • Efficient procedure calls
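The value of fixed-length, fixed-format instruction words can be sketched in a few lines. The 32-bit layout below is hypothetical, not any real RISC ISA: bits [31:26] hold the opcode, [25:21] the destination register, [20:16] a source register, and [15:0] an immediate.

```python
def decode(word: int) -> dict:
    """Extract the fields of a fixed-format 32-bit instruction word."""
    return {
        "opcode": (word >> 26) & 0x3F,   # always the same 6 bits
        "rd":     (word >> 21) & 0x1F,   # destination register
        "rs1":    (word >> 16) & 0x1F,   # source register
        "imm":    word & 0xFFFF,         # 16-bit immediate
    }

# A made-up "addi r3, r1, 100" with hypothetical opcode 8.
word = (0x08 << 26) | (3 << 21) | (1 << 16) | 100
print(decode(word))
```

Because every field sits at a fixed bit position and every instruction is exactly 4 bytes, the decoder needs no lookup to find where the next instruction starts, which is what makes parallel fetch and pipelined decode straightforward.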

  10. CISC vs. RISC Processing

  11. CISC vs. RISC Performance Comparison • RISC  simpler instructions  more instructions per program  more instruction fetches • More fetches  more bus traffic and increased cache memory misses • More registers would improve CISC performance, but the complex instruction set leaves no chip space for them • Modern CISC and RISC architectures are becoming similar

  12. Multiprocessing • The use of more than one CPU to process instructions. • Reasons for using multiprocessing: • Increases the processing power of a system. • Enables parallel processing – programs can be divided into independent pieces and the different parts executed simultaneously on multiple processors.

  13. Multiprocessing

  14. Multiprocessing • Since the execution speed of a CPU is directly related to its clock speed, several CPUs running at much lower clock speeds can deliver equivalent processing power, reducing power consumption, heat, and stress within the various computer components. • Adding more CPUs is relatively inexpensive. • If one CPU encounters a problem, the other CPUs can continue executing instructions, improving overall reliability and availability.
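The idea of dividing a program into independent pieces and running them on multiple processors can be sketched with Python's standard multiprocessing module; the summation task here is just an illustrative stand-in for real work.

```python
from multiprocessing import Pool

def partial_sum(bounds):
    """Sum one independent slice of the range [lo, hi)."""
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    # Split 0..1_000_000 into 4 independent pieces, one per worker process.
    chunks = [(i, i + 250_000) for i in range(0, 1_000_000, 250_000)]
    with Pool(processes=4) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total)  # same answer as the purely sequential sum
```

Each chunk touches disjoint data, so the workers never coordinate; that independence is exactly what makes a problem a good fit for multiprocessing.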

  15. Multiprocessing • Two types: a) tightly coupled system b) loosely coupled system

  16. Tightly Coupled System • Identical access to programs, data, shared memory, I/O, etc. • Easily extends to multi-tasking and redundant program execution • Two ways to configure: • Master-slave multiprocessing • Symmetrical multiprocessing (SMP)

  17. Multiprocessing • Typical multiprocessing system configuration

  18. Multiprocessing - Master-slave Multiprocessing • Master CPU • Manages the system • Controls all resources and scheduling • Assigns tasks to slave CPUs
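The master-slave division of labor can be sketched with processes and a shared task queue. The names and the squaring "work" are illustrative, not from the slides: the master alone controls scheduling and assigns tasks; the slaves only execute what they are handed.

```python
from multiprocessing import Process, Queue

def slave(tasks: Queue, results: Queue):
    """A slave CPU stand-in: execute assigned tasks until told to stop."""
    while True:
        task = tasks.get()
        if task is None:              # master's shutdown signal
            break
        results.put(task * task)      # stand-in for real work

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    slaves = [Process(target=slave, args=(tasks, results)) for _ in range(2)]
    for s in slaves:
        s.start()
    for n in range(10):               # the master assigns all the work...
        tasks.put(n)
    for _ in slaves:                  # ...and tells each slave to stop
        tasks.put(None)
    for s in slaves:
        s.join()
    print(sum(results.get() for _ in range(10)))
```

Note how the pattern's weakness shows up even in the sketch: every task flows through the master's single queue, so the master is the bottleneck, and if the master process dies the slaves sit idle.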

  19. Multiprocessing - Master-slave Multiprocessing • Advantages • Simplicity • Protection of system and data • Disadvantages • Master CPU becomes a bottleneck • Reliability issues – if the master CPU fails, the entire system fails • Applications • Games, finance, economics, biology, physics

  20. Multiprocessing – Symmetrical Multiprocessing • Each CPU has equal access to resources • Each CPU determines what to run using a standard algorithm

  21. Multiprocessing – Symmetrical Multiprocessing • Disadvantages • Resource conflicts – memory, I/O, etc. • Complex implementation • Advantages • High reliability • Fault-tolerant support is straightforward • Balanced workload

  22. Pipelining • Pipelining is a technique used in advanced microprocessors where the microprocessor begins executing a second instruction before the first has been completed. • That is, several instructions are in the pipeline simultaneously, each at a different processing stage.

  23. Pipelining • Computer processors can handle millions of instructions each second. Once one instruction is processed, the next one in line is processed, and so on. • A pipeline allows multiple instructions to be processed at the same time. While one stage of an instruction is being processed, other instructions may be undergoing processing at a different stage. • Without a pipeline, each instruction would have to wait for the previous one to finish before it could even be fetched.

  24. Pipelining • With pipelining, the processor begins fetching the second instruction before it completes the execution of the first instruction. • This way, the processor does not have to wait for one instruction to complete before fetching the next. • In computing, a pipeline is a set of data processing elements connected in series, so that the output of one element is the input of the next one. • The elements of a pipeline are often executed in parallel or in time‐sliced fashion.

  25. Pipelining • The pipeline is divided into segments and each segment can execute its operation concurrently with the other segments. • When a segment completes an operation, it passes the result to the next segment in the pipeline and fetches the next operation from the preceding segment. • The final results of each instruction emerge at the end of the pipeline in rapid succession.
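The payoff described above can be put in numbers with the classic timing model, assuming an ideal k-stage pipeline with no stalls: the first instruction takes k cycles to fill the pipeline, and one instruction completes every cycle after that.

```python
def cycles_unpipelined(n_instructions: int, k_stages: int) -> int:
    # Each instruction runs start to finish before the next begins.
    return n_instructions * k_stages

def cycles_pipelined(n_instructions: int, k_stages: int) -> int:
    # Fill the pipeline once (k cycles), then complete 1 per cycle.
    return k_stages + (n_instructions - 1)

n, k = 100, 5
print(cycles_unpipelined(n, k))  # 500 cycles
print(cycles_pipelined(n, k))    # 104 cycles
```

For long instruction streams the speedup approaches k, the number of stages, which is why the results "emerge at the end of the pipeline in rapid succession".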

  26. Pipelining

  27. Pipelining • It is not useful to pipe different types of instructions through a single pipeline – different execution units are created based on general types of instructions: • Load/store unit • Integer arithmetic unit • Floating point arithmetic unit • Branch unit

  28. Pipelining • Pipeline hazards • Situations that prevent the next instruction in the instruction stream from executing during its designated clock cycle. • The instruction is said to be stalled. • Effect – stall following instructions too. No new instructions are fetched during the stall.

  29. Pipelining • Types of hazards: • Structural hazard • Control Hazard • Data hazard

  30. Pipelining • Structural hazard – an attempt to use the same hardware resource in two different ways at the same time. • E.g., two instructions both need the same unit for a multiplication and a division operation in the same cycle.

  31. Pipelining • Data hazard – an attempt to use data before it is ready. • E.g., an instruction depends on the result of a prior instruction still in the pipeline. • Control hazard – an attempt to make a decision before a condition has been evaluated. • E.g., branch instructions.

  32. Pipelining • How to overcome hazards? • Instruction reordering – separate dependent instructions so they are not executed one right after the other. • Prediction and superscalar processing.
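Instruction reordering can be illustrated with a toy stall model. Assume, for the sake of the sketch, a pipeline where an instruction that reads a register written by the immediately preceding instruction must stall for one cycle; the three-field instruction format (dest, src1, src2) is likewise hypothetical.

```python
def count_stalls(program):
    """Count one-cycle stalls under the simplified back-to-back model."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev[0] in cur[1:]:   # cur reads what prev just wrote
            stalls += 1
    return stalls

# r1 = r2 + r3 ; r4 = r1 + r5 (needs r1 immediately!) ; r6 = r7 + r8
dependent = [("r1", "r2", "r3"), ("r4", "r1", "r5"), ("r6", "r7", "r8")]
# Reordered: slide the independent instruction between the dependent pair.
reordered = [("r1", "r2", "r3"), ("r6", "r7", "r8"), ("r4", "r1", "r5")]

print(count_stalls(dependent))   # 1 stall
print(count_stalls(reordered))   # 0 stalls
```

The reordering changes nothing about what the program computes; it only gives the result of the first instruction time to become ready before it is consumed.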

  33. Superscalar • Superscalar processing means completing more than one instruction per clock cycle. • It is a standard feature in modern computer systems. • Superscalar processing can double the throughput or more. • Requirements: • Separate fetch and execute cycles as much as possible • Buffers for the fetch and decode phases • Parallel execution units
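The "double or more" claim can be made concrete by extending the ideal pipeline timing model to a machine that issues and completes `width` instructions per cycle; as with any such back-of-the-envelope model, it assumes no hazards or stalls.

```python
import math

def cycles(n_instructions: int, k_stages: int, width: int) -> int:
    """Ideal cycles for a k-stage pipeline issuing `width` instructions/cycle."""
    # Fill the pipeline once, then retire `width` instructions per cycle.
    return k_stages + math.ceil(n_instructions / width) - 1

n, k = 100, 5
print(cycles(n, k, width=1))   # 104 cycles: plain scalar pipeline
print(cycles(n, k, width=2))   # 54 cycles: 2-wide issue nearly halves the time
```

With width 1 the formula reduces to the ordinary pipelined case, and each doubling of issue width roughly doubles throughput, until real dependencies and resource conflicts get in the way.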

  34. Scalar VS Superscalar

  35. Superscalar Technical Issues • Out-of-order processing • Branch instruction processing • Conflict of resources

  36. Out of Order Processing • Hazard / dependency • a later instruction depends on the result of an earlier instruction. • Data dependency problem • the later instruction could complete ahead of the earlier one. • Implication • results are produced in the wrong order. • Solution • provide a reservation station within each execution unit to hold suspended instructions until their operands are ready.

  37. Out of Order Processing (cont.) • Another solution – search ahead for instructions without apparent dependencies to keep the execution units busy. • For instance, Intel x86 implementations can search 20–30 instructions ahead if necessary.

  38. Branch Instruction Processing • Flow / branch dependencies • conditional branch instructions. • The solution can be broken into two parts: • optimize correct branch selection • methods to prevent errors – instructions may be executed out of order, but they must be completed in the correct order.

  39. Branch Instruction Processing (cont.) • Speculative execution (prevents errors): • a separate bank of registers holds results from later instructions until the previous instructions are complete. • results are then transferred into the actual registers and memory locations.

  40. Branch Instruction Processing (cont.) • Optimization: • maintain more than two pipelines • predict the correct path based on program usage and performance – branch history table
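One common way to implement the branch history table mentioned above is a 2-bit saturating counter per branch; the sketch below shows a single such counter. States 0-1 predict not-taken and 2-3 predict taken, and each actual outcome nudges the counter toward that outcome, so one surprise does not flip the prediction.

```python
class TwoBitPredictor:
    """A single 2-bit saturating-counter branch predictor entry."""

    def __init__(self):
        self.counter = 2                       # start weakly "taken"

    def predict(self) -> bool:
        return self.counter >= 2               # 2 or 3 -> predict taken

    def update(self, taken: bool) -> None:
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)

# A typical loop branch: taken 9 times, then falls through once, repeated.
p = TwoBitPredictor()
outcomes = ([True] * 9 + [False]) * 3
correct = 0
for taken in outcomes:
    correct += (p.predict() == taken)
    p.update(taken)
print(f"{correct}/{len(outcomes)} correct")   # 27/30
```

The two-bit hysteresis is the point: the single not-taken at each loop exit costs one misprediction but does not disturb the prediction for the next trip through the loop.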

  41. Conflict of Resources • Conflict • between instructions that use the same registers • Solution • use a separate bank of physical registers to hold the results of speculative instructions until each instruction completes • Concepts • register renaming / logical registers / register alias tables. This allows two instructions to use the “same” register and execute simultaneously without holding up each other’s work.
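A toy sketch of register renaming with a register alias table (RAT): every write to a logical register is given a fresh physical register, and readers look up the current mapping. The names (r0…, p0…) and the three-field instruction format (dest, src1, src2) are illustrative assumptions.

```python
def rename(program, n_logical=8):
    """Rename logical registers r0..r{n-1} onto an unbounded physical pool."""
    rat = {f"r{i}": f"p{i}" for i in range(n_logical)}   # initial mapping
    next_phys = n_logical
    renamed = []
    for dest, src1, src2 in program:
        s1, s2 = rat[src1], rat[src2]    # read the CURRENT mappings first
        rat[dest] = f"p{next_phys}"      # fresh physical register for the write
        next_phys += 1
        renamed.append((rat[dest], s1, s2))
    return renamed

# Both instructions write r1, but hold unrelated values (a name conflict only).
program = [("r1", "r2", "r3"), ("r1", "r4", "r5")]
print(rename(program))
```

After renaming the two writes target different physical registers, so the "same"-register conflict disappears and the instructions are free to execute simultaneously.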
