Enhancing Processor Performance through Pipeline Architecture Techniques

Parallel architecture Technique

Pipelining Processor • Pipelining is a technique of decomposing a sequential process into sub-processes, with each sub-process executed in a dedicated segment that operates concurrently with all other segments. Any operation that can be decomposed into a sequence of sub-operations of about the same complexity can be implemented by a pipeline processor.

Linear Pipeline Processor • A linear pipeline processor is a cascade of processing stages that are linearly connected to perform a fixed function over a stream of data flowing from one end to the other. In modern computers, linear pipelines are applied to instruction execution, arithmetic computation, and memory access operations.

Asynchronous and Synchronous Models • A linear pipeline is constructed with k processing stages (segments). External inputs (operands) are fed into the pipeline at the first stage S1. The processed results are passed from stage Si to stage Si+1 for all i = 1, 2, ..., k-1. The final result emerges from the pipeline at the last stage Sk. Depending on how data flow along the pipeline is controlled, linear pipelines are modeled in two categories: asynchronous and synchronous.

Asynchronous Model • Data flow between adjacent stages in an asynchronous pipeline is controlled by a handshaking protocol. When stage Si is ready to transmit, it sends a ready signal to stage Si+1. After stage Si+1 receives the incoming data, it returns an acknowledge signal to Si.
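The ready/acknowledge exchange can be sketched in a few lines of Python. This is only an illustrative model, not a hardware description: the `Stage` class, the `transfer` function, and the rule that a busy receiver withholds its acknowledge are assumptions added here, not details given in the slides.

```python
class Stage:
    """Hypothetical pipeline stage with a one-item buffer."""

    def __init__(self, name, func):
        self.name = name      # label used in the signal log
        self.func = func      # operation this stage applies to incoming data
        self.data = None      # item currently held by the stage
        self.full = False     # True while the stage holds an item


def transfer(sender, receiver, log):
    """Perform one ready/acknowledge exchange; return True if data moved."""
    if not sender.full:
        return False                      # nothing to send, so no ready signal
    log.append(f"{sender.name} -> {receiver.name}: ready")
    if receiver.full:
        return False                      # assumed: busy receiver sends no acknowledge
    receiver.data = receiver.func(sender.data)
    receiver.full = True
    log.append(f"{receiver.name} -> {sender.name}: acknowledge")
    sender.data, sender.full = None, False  # sender is released by the acknowledge
    return True


# Usage: S1 holds the value 5 and hands it to S2, which doubles it.
s1 = Stage("S1", lambda x: x)
s2 = Stage("S2", lambda x: x * 2)
s1.data, s1.full = 5, True
log = []
moved = transfer(s1, s2, log)
```

Because each transfer waits for its own acknowledge, stages in this model need no common clock, which is the property that distinguishes the asynchronous model.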

Synchronous Model • The operands pass through all segments in a fixed sequence. Each segment consists of a combinational circuit Si that performs a suboperation over the data stream flowing through the pipe. Isolating registers R (latches) are used to interface between stages and hold the intermediate results between the stages. Upon the arrival of a clock pulse, all registers transfer data to the next stage simultaneously.
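A small simulation may help make the clocked behavior concrete. The sketch below assumes one register at the output of each stage and one new operand entering per clock pulse; the function name `run_synchronous_pipeline` and the three example stage functions are invented for illustration.

```python
_EMPTY = object()  # marks a register that holds no valid data


def run_synchronous_pipeline(stages, inputs):
    """Simulate a k-stage synchronous pipeline; return (results, clock cycles)."""
    if not stages:
        raise ValueError("pipeline needs at least one stage")
    registers = [_EMPTY] * len(stages)   # registers[i] latches the output of stage i+1
    pending = list(inputs)
    total = len(pending)
    results = []
    cycles = 0
    while len(results) < total:
        # One clock pulse: every stage reads the previous register at the same time.
        feed = pending.pop(0) if pending else _EMPTY
        sources = [feed] + registers[:-1]
        registers = [
            _EMPTY if src is _EMPTY else stage(src)
            for stage, src in zip(stages, sources)
        ]
        cycles += 1
        if registers[-1] is not _EMPTY:
            results.append(registers[-1])  # a finished result leaves the last stage
    return results, cycles


# Usage: three stages computing ((x + 1) * 2) - 3 on four operands.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
results, cycles = run_synchronous_pipeline(stages, [1, 2, 3, 4])
# results == [1, 3, 5, 7]; cycles == 6, i.e. k + n - 1 with k = 3 and n = 4
```

Under these assumptions, n operands pass through k stages in k + n - 1 clock pulses, compared with k * n pulses if each operand had to finish before the next one started.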

Memory Interleaving • An instruction pipeline may require the fetching of an instruction and an operand at the same time from two different segments. Interleaving is an advanced technique used by high-end motherboards/chipsets to improve memory performance. Memory interleaving increases bandwidth by allowing simultaneous access to more than one chunk of memory. This improves performance because the processor can transfer more information to/from memory in the same amount of time, and helps alleviate the processor-memory bottleneck that is a major limiting factor in overall performance.

Interleaving works by dividing the system memory into multiple blocks (modules). The most common numbers are two or four, called two-way or four-way interleaving, respectively. Each block of memory has its own address register AR and data register DR and is accessed using its own set of control lines, which are merged together on the memory bus. When a read or write to one block has begun, a read or write to another block can be overlapped with it. The more blocks there are, the more overlapping can be done. In an interleaved memory, different sets of addresses are assigned to different memory modules.
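The benefit of overlapping can be estimated with a rough timing model. Everything specific in this sketch is an assumption made for illustration: each module stays busy for `access_time` cycles per access, at most one access is issued per cycle, accesses are issued in order, and address a is assigned to module a mod N.

```python
def total_cycles(addresses, num_modules, access_time):
    """Cycles needed to complete all accesses under the assumed timing model."""
    free_at = [0] * num_modules   # cycle at which each module becomes free again
    issue = 0                     # earliest cycle the next access may be issued
    finish = 0
    for addr in addresses:
        module = addr % num_modules            # assumed address-to-module mapping
        start = max(issue, free_at[module])    # wait while the module is still busy
        free_at[module] = start + access_time
        finish = max(finish, free_at[module])
        issue = start + 1                      # the bus accepts one request per cycle
    return finish


# Usage: eight consecutive words, each access occupying a module for 4 cycles.
one_way = total_cycles(range(8), num_modules=1, access_time=4)   # 32 cycles
two_way = total_cycles(range(8), num_modules=2, access_time=4)   # 17 cycles
four_way = total_cycles(range(8), num_modules=4, access_time=4)  # 11 cycles
```

In this model, adding modules shortens the total time for sequential accesses because a new access can start while earlier ones are still in progress in other modules.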

Figure: Organization of a four-way interleaved memory.

Example • Assign addresses to an array of 1024 words to be stored in a four-way interleaved memory.

To store 1024 words in a single memory module, the address register AR of that module must be at least 10 bits wide (2^10 = 1024). With four-way memory interleaving (four memory modules), 256 words are stored in each module, so the address register of each module needs only 8 bits (2^8 = 256). The remaining two bits select the memory module (one of four). The 10-bit address of each of the 1024 words is therefore divided as follows:

• Address within module: x x x x x x x x (8 bits)
• Module select: X X (2 bits)
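The address split in this example can be checked with a short Python sketch. It assumes, following the layout above, that the two module-select bits are the rightmost (low-order) bits of the 10-bit address; the function name `split_address` is hypothetical.

```python
MODULE_BITS = 2   # selects one of four modules
OFFSET_BITS = 8   # address within a 256-word module


def split_address(addr):
    """Split a 10-bit word address into (module number, address within module)."""
    if not 0 <= addr < 1 << (MODULE_BITS + OFFSET_BITS):
        raise ValueError("address must fit in 10 bits")
    module = addr & ((1 << MODULE_BITS) - 1)   # low-order 2 bits (assumed placement)
    offset = addr >> MODULE_BITS               # remaining high-order 8 bits
    return module, offset


# Usage: consecutive addresses fall in consecutive modules.
first_five = [split_address(a) for a in range(5)]
# first_five == [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1)]
last = split_address(1023)
# last == (3, 255)
```

With this placement of the select bits, words at consecutive addresses are spread across the four modules, so sequential accesses can be overlapped.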
