Section IV: Digital System Organization CEG 360/560; EE 451/651 Digital System Design Dr. Travis Doom, Assistant Professor Department of Computer Science and Engineering Wright State University
Introduction • What is memory? • What is the difference between cache/processor memory and main/system memory? • What is a microoperation? • How are register-transfer-level operations controlled? • How is control organized and implemented in a microprocessor? • How does pipelining work? • What is the relationship between a C program and the instructions that are executed by the microprocessor? • What are the defining characteristics of modern RISC architectures? • Why is memory organized hierarchically?
Outline • Memory • SRAM • DRAM • Microprocessor Organization • Data paths • The Control Word • Pipelining • Control unit • Microprogrammed Control • Simple Computer Architecture • Computer Instructions • Instruction Set Architecture • Issues in Computer Design • CISC vs. RISC • Contemporary processors
Read/Write Memory (RWM / RAM) • RWM = RAM (Random Access Memory) • Highly structured like ROMs and PLDs • Can store and retrieve data at (relatively) the same speed • Static RAM (SRAM) retains data in latches (while powered) • Dynamic RAM (DRAM) stores data as capacitor charge; all capacitors must be recharged periodically. • Volatile Memory: Both Static and Dynamic RAM • Nonvolatile Memory: Data retained when power is lost = ROMs, NVRAM (w/battery), Flash Memory
Basic Structure of SRAM • A 2^n x b RAM has Address/Control/Data Out lines like a ROM (Reading) + Write Enable (WE) and Data In (DIN) (Writing) [Figure: 2^n x b RAM block with address inputs A0 to An-1, data inputs DIN0 to DINb-1, data outputs DOUT0 to DOUTb-1, and control inputs CS, OE, WE]
One Bit of SRAM • SEL and WR asserted → IN data stored in D-latch (Write) • SEL only asserted → D-latch output enabled (Read) • SEL not asserted → No operation [Figure: one-bit SRAM cell with inputs IN, /SEL, and /WR and output OUT, built around a D latch (D, C, Q)]
Structure of 8x4 SRAM • 3-to-8 Decoder selects one row of cells for Read or Write • If ADDR + CS + OE are asserted, DOUT (3-0) is enabled. (Read) • If ADDR + DIN + CS + WE are asserted, DIN → latches (Write)
[Figure: 8 x 4 SRAM. A 3-to-8 decoder on A2-A0 drives the SEL lines of rows 0 to 7, each row holding four one-bit cells (IN, OUT, SEL, WR). DIN3-DIN0 feed the four columns, /WE, /CS, and /OE control writing and output enabling, and the columns drive DOUT3-DOUT0.]
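The read and write rules for the 8 x 4 SRAM can be sketched in software. This is a minimal behavioural model, not a timing-accurate one: the class name `Sram8x4` and the method `cycle` are hypothetical, and the control signals are passed as active-high booleans for readability even though the pins are active-low.

```python
class Sram8x4:
    """Behavioural sketch of an 8-word x 4-bit SRAM (names are hypothetical)."""

    def __init__(self):
        self.rows = [0] * 8  # eight rows of four latches each

    def cycle(self, addr, din=0, cs=False, oe=False, we=False):
        """Apply one set of inputs; return DOUT, or None when outputs are Hi-Z."""
        if not cs:
            return None                      # chip not selected: no operation
        row = addr & 0b111                   # 3-to-8 decoder selects one row
        if we:
            self.rows[row] = din & 0b1111    # write: DIN stored in the latches
        # Outputs are driven only on a read. Disabling them during a write
        # is an assumption of this sketch.
        if oe and not we:
            return self.rows[row]
        return None


ram = Sram8x4()
ram.cycle(addr=5, din=0b1010, cs=True, we=True)   # write 1010 to row 5
value = ram.cycle(addr=5, cs=True, oe=True)       # read row 5 back
```

Returning `None` stands in for a high-impedance output, which is how several such chips could share one data bus.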
Composite 128K x 8 Memory • 2^16 = 64K, 2^17 = 128K, 2^18 = 256K • How many 64K x 8 RAMs do we need to create a 256K x 8 memory? [Figure: two 64K x 8 RAMs (DATA, ADRS, CS, R/W') share ADR 15-0, the 8 data lines, and Read/Write; a decoder enabled by Mem En uses ADR 16 to select one RAM or the other]
Composite 64K x 16 Memory [Figure: two 64K x 8 RAMs (DATA, ADRS, CS, R/W') share ADR 15-0, Read/Write, and CS; each RAM supplies 8 of the 16 data lines]
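The two composite memories differ only in whether chips are stacked in depth (more words) or in width (more bits per word). A small sketch of the chip count and of the address split, assuming the target size is an exact multiple of the 64K x 8 chip; the function names are hypothetical:

```python
CHIP_WORDS = 64 * 1024   # each RAM chip holds 64K words
CHIP_BITS = 8            # each word is 8 bits wide


def chips_needed(words, bits):
    """Number of 64K x 8 chips for a words x bits memory (exact multiples only)."""
    return (words // CHIP_WORDS) * (bits // CHIP_BITS)


def decode(address):
    """Split a composite address into (chip index, on-chip address)."""
    return address >> 16, address & 0xFFFF


print(chips_needed(128 * 1024, 8))    # depth expansion: chips selected by ADR 16
print(chips_needed(64 * 1024, 16))    # width expansion: chips share one CS
print(chips_needed(256 * 1024, 8))    # the question posed on the slide
print(decode(0x1ABCD))                # upper address bits pick the chip
```

For the 256K x 8 case the address has 18 bits, so the chip index returned by `decode` is a two-bit value that would feed a 2-to-4 decoder.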
Physical SRAM Array Should Be Square • Example: 16 x 1 SRAM → 4 x 4 Array [Figure: 4 x 4 array of one-bit cells (IN, OUT, SEL, WR) fed by DI. A 2-to-4 decoder on A1-A0 selects the row; A3-A2 drive a second 2-to-4 decoder, gated by /WE and /CS, and a 4-to-1 mux, enabled by /OE, that drives DO.]
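The square layout amounts to splitting the address into a row part and a column part. A minimal sketch of the 16 x 1 example follows; the assignment of A1-A0 to rows and A3-A2 to columns is read off the figure labels and should be treated as an assumption, as should the function names.

```python
def split_address(addr):
    """Split a 4-bit address into (row, column) for a 4 x 4 cell array."""
    row = addr & 0b11          # A1-A0 drive the row decoder
    col = (addr >> 2) & 0b11   # A3-A2 select the column
    return row, col


cells = [[0] * 4 for _ in range(4)]   # 4 rows x 4 columns of one-bit cells


def write_bit(addr, bit):
    row, col = split_address(addr)
    cells[row][col] = bit & 1          # only the selected column is written


def read_bit(addr):
    row, col = split_address(addr)
    return cells[row][col]             # the 4-to-1 mux picks one column


write_bit(0b1101, 1)
```

With a square array the two decoders are each the size of the square root of the word count, instead of one decoder with 16 outputs.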
SRAM Timing • During READ, outputs are combinational functions of ADDR, CS, OE (like ROM) • Inputs can freely change without problems (except for propagation delay from last input change to output) • During WRITE, data is stored in latches, NOT flip-flops. • Thus, Setup & Hold on Data IN are relative to the trailing edge of /WR • Address must be stable • for setup time before /WR asserted, and • for hold time after /WR deasserted • to prevent “spraying” data to multiple rows • /WR asserted when BOTH /CS and /WE asserted • /WR deasserted when EITHER /CS or /WE deasserted
READ Timing (SRAM) • Like a ROM! • Primary spec for SRAMs: tAA • When ADDR and /CS change together, DOUT is valid after max(tAA, tACS) [Figure: read timing diagram for ADDR, /CS, /OE, and DOUT, labelled with tAA, tACS, tOE, tOZ, and tOH]
WRITE Timing (SRAM) [Figure: write timing diagrams for ADDR, /CS, /WE, and DIN, showing a WE-controlled write and a CS-controlled write, labelled with tAS, tAH, tCSW, tWP, tDS, and tDH]
1-bit DRAM cell [Figure: one-bit DRAM cell connected to a word line and a bit line] 64K x 1 DRAM [Figure: 256 x 256 array with a row decoder, a row register, and a data mux/demux; the 8-bit ADDR input carries the row address and then the column address; control inputs /RAS, /CAS, /WE; data pins Din and Dout]
DRAM Timing • Refresh cycle - RAS Only • (RAS asserted) Entire row is latched in row-address register • (RAS deasserted) The data in the row-address register is rewritten • Read cycle - RAS + CAS • (RAS asserted) Entire row is latched in row-address register • (CAS asserted) Data in register is multiplexed to output • (RAS deasserted) Data in row-address register is rewritten • (CAS deasserted) Output is released • Write cycle - RAS + WE + CAS • (RAS asserted) Entire row is latched in row-address register • (WE asserted) Data_in is stable • (CAS asserted) Demultiplex Data_in to row-address register • (WE deasserted) Data_in is no longer stable • (RAS deasserted) Data in row-address register is rewritten • (CAS deasserted) Operation complete
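The three cycles share the same pattern: open a row, optionally touch one bit, and write the row back. A minimal behavioural sketch of a 64K x 1 DRAM organised as a 256 x 256 array follows; the class and method names are hypothetical, and signal timing is ignored.

```python
class Dram64Kx1:
    """Behavioural sketch of a 64K x 1 DRAM as a 256 x 256 array."""

    def __init__(self):
        self.array = [[0] * 256 for _ in range(256)]

    def _open(self, row):
        # RAS asserted: the whole row is copied into the row register.
        return list(self.array[row])

    def _close(self, row, latch):
        # RAS deasserted: the row register is rewritten into the array.
        self.array[row] = latch

    def refresh(self, row):
        self._close(row, self._open(row))

    def read(self, row, col):
        latch = self._open(row)
        dout = latch[col]          # CAS asserted: mux one bit to the output
        self._close(row, latch)
        return dout

    def write(self, row, col, din):
        latch = self._open(row)
        latch[col] = din & 1       # CAS asserted with WE: demux Data_in
        self._close(row, latch)


dram = Dram64Kx1()
dram.write(row=3, col=200, din=1)
for r in range(256):               # one RAS-only cycle per row
    dram.refresh(r)
bit = dram.read(row=3, col=200)
```

The model does not decay, so the refresh loop changes nothing here; in a real part it is what keeps the capacitor charge from leaking away.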
Modern DRAM Timing • Fast-Page Mode (One RAS, multiple CAS) • Multiple bits of a row can be written before rewrite • Complex control, but much faster • Extended Data Out (One RAS, multiple CAS) • Latches the column address so that the next address can be prepared while the output is read • Saves ~10 ns per read, an increase of 10-15% • Even more complex control! • SDRAM - Synchronous DRAM • Unlike normal DRAM, SDRAM is clocked! • Multiple signals and banks (row-address registers) allow “pipelined” operation
RAM Summary • SRAM: • Fast • Simple interface • Moderate bit density (4 gates → 4 to 6 transistors) • Moderate cost/bit • Used in small systems or very fast applications (cache memory) • DRAM (Dynamic RAM): • Moderate speed • Complex interface • High bit density (1-transistor cell) • Low cost/bit • Used in large memories: PCs, mainframes
Control Word • The selection variables for the datapath select the microoperation to be executed within the datapath for any given clock pulse. • Control Word: the combined values of the datapath control inputs in a specified order. • Control words consist of multiple fields, each of which represents part of the microoperation functionality. • e.g. Destination Address, Operand A Address, Operand B Address, Read Immediate/Constant, Function Select, Write Immediate/Constant, Read/Write
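A control word is just the field values concatenated in a fixed order, which a short sketch can make concrete. The field names below follow the slide, but the bit widths (and therefore the 16-bit total) are assumptions chosen for illustration; `pack` and `unpack` are hypothetical helpers.

```python
# Field names follow the slide; the bit widths are assumptions.
FIELDS = [
    ("dest_addr", 3),
    ("a_addr", 3),
    ("b_addr", 3),
    ("read_const", 1),
    ("func_sel", 4),
    ("write_const", 1),
    ("read_write", 1),
]


def pack(values):
    """Concatenate field values, first field in the most significant bits."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if value >= (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Recover the field values from a packed control word."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


cw = pack({"dest_addr": 1, "a_addr": 2, "b_addr": 3, "read_const": 0,
           "func_sel": 0b0010, "write_const": 0, "read_write": 1})
```

Each clock pulse, the control unit presents one such word to the datapath, and the datapath slices it back into fields exactly as `unpack` does.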
Pipelined Datapath • Datapath throughput can be increased by breaking up the datapath with registers and increasing the clock speed. • This is known as pipelining the datapath. • Throughput vs. Turnaround time • Throughput: The number of operations (units) per second (time period). • Turnaround time: The amount of time which elapses from the beginning of the operation’s execution (unit processing) to its completion. • Consider an assembly line (or fast-food drive-thru!) • Basic stages of a datapath microoperation: • Operand Fetch (OF) • Execute Function (EX) • Write Back Results (WB)
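Datapath Timing [Figure: the datapath divided into Stage One, Stage Two, and Stage Three by pipeline platforms (registers). Without pipelining: min. clock period = 12 ns (83.3 MHz). With pipelining: min. clock period = 5 ns (200 MHz).]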
Pipeline Execution Pattern • Not all of the pipelined units are necessarily active at all times • Filling and emptying the pipeline “wastes” time • Hazards exist!
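The cost of filling the pipeline can be worked out from the clock periods on the timing slide. This sketch assumes the 12 ns figure is the unpipelined datapath, the 5 ns figure is the three-stage pipelined one, and that no hazards stall the pipeline; the function names are hypothetical.

```python
STAGES = 3            # OF, EX, WB
T_UNPIPELINED = 12    # ns per microoperation without pipelining
T_STAGE = 5           # ns clock period with pipelining


def unpipelined_time(n):
    """Total time in ns for n microoperations, one per 12 ns clock."""
    return n * T_UNPIPELINED


def pipelined_time(n):
    """Total time in ns: STAGES clocks for the first result, one clock per later one."""
    return (STAGES + n - 1) * T_STAGE


for n in (1, 3, 100):
    print(n, unpipelined_time(n), pipelined_time(n))
```

A single microoperation takes longer in the pipeline (15 ns turnaround against 12 ns), but for long sequences the time per operation approaches 5 ns, which is the throughput gain.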
The Control Unit • The control unit generates the signals for sequencing the operations in the datapath • A sequential circuit with states that dictate the control signals for the system • Using status conditions and control inputs, the sequential control unit determines the next state in which additional microoperations are activated. • Hardwired Control • The control unit is implemented to provide a particular digital function • Microprogrammed Control • The control unit’s binary control values are stored as words in a control memory (usually ROM). • Each word in the control memory contains a microinstruction • A sequence of microinstructions constitutes a microprogram • Firmware!
Programmable Control Units • The binary information stored in a digital computer can be classified as either data or control information (machine language instructions) • Von Neumann - Stored Program Model • Non-programmable Control Unit • A control unit which is not responsible for obtaining instructions from memory; it determines the operations to be performed and the sequence of those operations based only upon its inputs and status bits • Programmable Control Unit: • A portion of the input to the system consists of a sequence of instructions. • Each instruction contains the information necessary for the control unit to determine a sequence of microoperations or which instruction to execute next • The address of the next instruction comes from a Program Counter (PC) • The PC can count (+1 instruction, normal operation) • The PC can load (branch instruction) • Control Units (Programmable or Non) can be single-cycle or multi-cycle • If any instruction requires more than one microoperation, the machine is multi-cycle. • How does this complicate the design?
Diagram of an Instruction Decoder • The instruction decoder takes the inputs to the control unit (in this case an “instruction”) and creates a corresponding control word for a datapath microoperation. • This might be used in a single-cycle computer
Diagram of a (Hardwired) Programmable Computer • This is a single-cycle computer • One instruction = One microoperation [Figure: hardwired programmable computer, including a Data Memory]
Diagram of a Microprogrammed Control Unit • The values of the control signals (and outputs) are determined by the contents of the Control Memory (a.k.a. the Control Store) • A portion of the contents of the Control Memory is used (along with the next set of inputs) to select the next microinstruction. • The “next-address” field maintains internal state
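The role of the next-address field can be illustrated with a toy control store. Everything in this sketch is hypothetical: the store contents, the signal names, and the rule that a `start` input moves the unit out of its idle word are made up for illustration and do not come from the slides.

```python
# Each control-store word: (control signals, next address).
# Contents are made up for illustration.
CONTROL_STORE = {
    0: ("IDLE", 0),            # loops on itself until start is asserted
    1: ("R0 <- CONST_B", 2),
    2: ("R1 <- DATA_IN", 3),
    3: ("R0 <- CONST_A", 4),
    4: ("WRITE R1", 0),        # last step returns to the idle word
}


def run(start_inputs):
    """Step the control address register (CAR) once per clock; return the signals issued."""
    car = 0
    trace = []
    for start in start_inputs:
        signals, next_addr = CONTROL_STORE[car]
        trace.append(signals)
        if car == 0 and start:
            car = 1                # an external input overrides the next address
        else:
            car = next_addr        # otherwise the stored field is the next state
    return trace


trace = run([False, True, False, False, False, False])
```

The only state in the unit is the CAR; the sequence of microinstructions is encoded entirely in the stored next-address values.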
A Simple Multi-cycle Computer This PC only increments… usually the PC must also be able to load
Diagram of a Pipelined Control Unit • Basic stages of instruction execution: • Instruction Fetch (IF) • Decode & Operand Fetch (DOF) • Execute Function (EX) • Write Back Results (WB) • More stages are possible! • Relate this to the “Machine-cycle”
A Simple Computer • Instruction Format: Opcode [15:12], Op A [11:8], Op B [7:4], Op C [3:0] • Example Instruction: x2021 • Opcode 2 Format: M[A] ← M[B] • Instruction: M[0] ← M[2]
RAM Contents
ADDR   DATA (Before)   DATA (After)
x0     x4              x9
x1     xF              xF
x2     x9              x9
…      ...             ...
[Figure: μ-processor connected to a RAM by a 4-bit Address, 4-bit Data In and Data Out lines, and Write Enable; the μ-processor also has a 16-bit Instruction input, Start, Reset, and Clock inputs, and a Ready output]
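The x2021 example can be traced in a few lines. This sketch decodes the four 4-bit fields and applies opcode 2 to a 16-word RAM; the function names are hypothetical, and RAM locations beyond x2 are not given on the slide, so they are filled with zero as a placeholder.

```python
def decode(instr):
    """Split a 16-bit instruction into (opcode, A, B, C), four bits each."""
    return ((instr >> 12) & 0xF, (instr >> 8) & 0xF,
            (instr >> 4) & 0xF, instr & 0xF)


def execute(instr, ram):
    opcode, a, b, c = decode(instr)
    if opcode == 2:            # opcode 2: M[A] <- M[B]
        ram[a] = ram[b]
    else:
        raise NotImplementedError("only opcode 2 is described on the slide")


# Locations x0-x2 come from the slide; the rest are placeholder zeros.
ram = [0x4, 0xF, 0x9] + [0] * 13
execute(0x2021, ram)
print([hex(w) for w in ram[:3]])
```

Op C is decoded but unused by this opcode, which matches the slide's format `M[A] ← M[B]`.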
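A Simple Microprocessor • Example Instruction: x2021 • Opcode 2 Format: M[A] ← M[B] • Instruction: M[0] ← M[2] • Microoperations: • R0 ← Constant B • Address Out ← R0, R1 ← Data In • R0 ← Constant A • Address Out ← R0, Data Out ← R1, Write Enable • Return to IDLE state (Sequencing Instruction) [Figure: μ-processor divided into a Control Unit and a Data Unit. The Control Unit takes the 16-bit Instruction, Start, Reset, Clock, and the VNCZ status bits, and sends a Control Word and a Constant to the Data Unit; the Data Unit drives Address Out and Data Out and receives Data In; other outputs are Write Enable and Ready.]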
A (not so) Simple Control Unit • R0 ← Constant B • Address Out ← R0, R1 ← Data In • R0 ← Constant A • Address Out ← R0, Data Out ← R1, Write Enable • CAR ← x00 (IDLE STATE) [Figure: control unit built around a CONTROL STORE (ROM) addressed by a control address register (CAR) with CLR, DATA, and LD/CNT' inputs. Each microinstruction holds Mode, Condition Code, Next Address, Operand Select, and Control Word fields. Logic blocks generate the CAR data and load signals from the Instruction, Start, and VNCZ inputs, select the Constant operand from the instruction, and generate Write Enable and Ready.]
A simple computer architecture • Generic Computer System. • Current architectures are performance driven, and vary widely. • Processor • Uniprocessor systems • ASIC (Application Specific Integrated Circuit) • Performs a specific task, not a general purpose processor (e.g. Voodoo) • I/O Device • Accesses data devices (e.g. Graphics Adapter, Disk Controller, et al.) [Figure: a Processor, two ASICs, Memory, and two I/O Devices connected by a Bus]
A simple microprocessor • Central Processing Unit • Control Unit, Integer Datapath (Load/Store, Integer ALU) • Floating Point Unit • Floating Point Datapath • Internal Cache • SRAM for Instruction Cache (i-cache) and Data Cache (d-cache) • Memory Management Unit • Controls communication with Main Memory and other I/O [Figure: block diagram showing the CPU, FPU, Internal Cache, MMU, and Bus]
Instruction Set Architecture • Microprocessors can only perform certain operations • Users determine which operations will be performed and in what order through the use of a program. • A program consists of a sequence of machine-executable instructions. • An instruction is a collection of bits that instructs the computer to perform a specific operation. • The set of instructions that a particular microprocessor can execute is its instruction set. • Instruction set architecture: A thorough description of the instruction set of a computer. • Users cannot easily produce meaningful programs using the instruction set directly. • Compilers convert programs specified in high-level languages into the instruction set equivalent (a machine language program).
Computer Instructions (1) • High-Level Language - C • A = B + C; • Memory-Transfer Equivalent • Mem[A] ← Mem[B] + Mem[C] • Mem[EA00] ← Mem[EA08] + Mem[EA10] • Machine-Level Equivalent
Assembly (human readable)   Machine   RTL
Load R2, B                  E2EA08    R2 ← M[xEA08]
Load R3, C                  E3EA10    R3 ← M[xEA10]
Add R2, R2, R3              0223      R2 ← R2 + R3
Store A, R2                 F2EA00    M[xEA00] ← R2
• The bits of a machine instruction are divided into fields • e.g.: E2EA08 • E: Operation “Load”; 2: Destination Address R2; EA08: Address Field • The operation field (opcode) defines the format for the instruction
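Because the opcode defines the format, a decoder has to look at the first field before it can interpret the rest. The sketch below disassembles the four encodings from the slide. The Load layout is stated on the slide; the Store and Add layouts are inferred from the single example of each, so treat them as assumptions, and `disassemble` is a hypothetical name.

```python
def disassemble(word_hex):
    """Turn one of the slide's hex encodings into a register-transfer string."""
    opcode = word_hex[0]
    if opcode == "E":                         # Load Rd, address
        return f"R{word_hex[1]} <- M[x{word_hex[2:]}]"
    if opcode == "F":                         # Store address, Rs (inferred layout)
        return f"M[x{word_hex[2:]}] <- R{word_hex[1]}"
    if opcode == "0":                         # Add Rd, Ra, Rb (inferred layout)
        return f"R{word_hex[1]} <- R{word_hex[2]} + R{word_hex[3]}"
    raise ValueError(f"unknown opcode {opcode}")


program = ["E2EA08", "E3EA10", "0223", "F2EA00"]
for word in program:
    print(word, disassemble(word))
```

Note that the memory-reference instructions are 24 bits long while the register instruction is 16 bits, which is one reason the opcode must be examined first.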
Computer Instructions (2) • There are three basic types of computer instructions • Register Instructions: operate on values stored in registers • Arithmetic, Shift, and Logic instructions • Move Instructions: move data between memory and registers • Load/Store instructions • Move/Copy portions of memory • Branch Instructions: select one of two possible next instructions to execute • Branch on condition, Unconditional branch (Jump) • Only one address is explicit; the other is implicit • e.g.: Beq R2, R3, A • If the contents of R2 = R3 then execute the instruction at location A next (explicit) • otherwise, execute the next instruction in the normal order (using the PC) (implicit)
Instructions Vs. Microoperations • What is the difference between a computer instruction and a hardware microoperation? • Computer instruction: an operation stored in binary in the computer’s memory • The control unit uses the address or addresses provided by the program counter (PC) to retrieve the next instruction from memory • The control unit then decodes the instruction fields to perform the required microoperations for the execution of the instruction. • Thus, in microprogrammed control, each computer instruction corresponds to a microprogram!
Instruction Formats • Different instructions have different types of instruction formats • Register, Implied, Immediate, Direct, Indirect, Relative, Indexed • Register: operands are hardware registers (e.g. Add R3, R2, R1) • Immediate: one operand is a constant (e.g. Add R2, 3) [Figure: two 16-bit instruction formats. Register: Opcode [15:9], Destination Register [8:6], Source Register A [5:3], Source Register B [2:0]. Immediate: Opcode [15:9], Destination Register [8:6], Source Register A [5:3], Operand B [2:0].]
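The register format can be packed and unpacked with shifts and masks. The field boundaries used here (opcode in bits 15-9, destination in 8-6, source A in 5-3, source B or operand in 2-0) are read off the bit labels in the format figure; the opcode value for Add is hypothetical, since the slides do not give one.

```python
def encode(opcode, dr, sa, sb):
    """Pack opcode[15:9], DR[8:6], SA[5:3], SB or operand[2:0] into 16 bits."""
    if not (0 <= opcode < 128 and all(0 <= f < 8 for f in (dr, sa, sb))):
        raise ValueError("field out of range")
    return (opcode << 9) | (dr << 6) | (sa << 3) | sb


def decode(word):
    """Return (opcode, DR, SA, SB) from a 16-bit instruction word."""
    return (word >> 9) & 0x7F, (word >> 6) & 0x7, (word >> 3) & 0x7, word & 0x7


ADD = 0b0000010   # hypothetical opcode value, not taken from the slides
word = encode(ADD, dr=3, sa=2, sb=1)   # Add R3, R2, R1
```

In the immediate format the same low three bits hold a constant instead of a register number, so only the opcode tells the decoder how to treat them.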
The Instruction-execution Cycle • Instruction Fetch (IF) stage • Get next instruction from the memory address referenced by the PC • Place the new instruction in the Instruction Register • Increment the PC to the next instruction address • Instruction decode (ID) stage • The instruction is recognized, or decoded • Determine the instruction format by examining the opcode • Operand fetch (OF) stage • Perform any calculations necessary to fetch the operand values • If necessary, fetch operands from memory to temporary registers • Execute operation (EX) stage • Execute the operation specified in the opcode • Branch instructions may update the PC • Writeback (WB) stage • Store the result of the operation in the destination determined by the instruction • Repeat the instruction-execution cycle (a.k.a. the machine cycle).
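The five stages can be followed in a toy interpreter. This is a sketch under loud assumptions: instructions are Python tuples rather than binary words, only load, add, and store exist, there are no branches, and the memory values are made up. It runs the `A = B + C` sequence from the earlier slide.

```python
def run(program, memory):
    """Run a toy program through the IF, ID, OF, EX, WB stages in order."""
    regs = [0] * 8
    pc = 0
    while pc < len(program):
        ir = program[pc]                 # IF: fetch the instruction at the PC
        pc += 1                          #     and increment the PC
        op, *fields = ir                 # ID: the opcode fixes the format
        if op == "load":                 # OF: fetch the operand values
            dest, operands = fields[0], [memory[fields[1]]]
        elif op == "add":
            dest, operands = fields[0], [regs[fields[1]], regs[fields[2]]]
        elif op == "store":
            dest, operands = fields[0], [regs[fields[1]]]
        else:
            raise ValueError(f"unknown opcode {op}")
        result = sum(operands)           # EX: a sum of one operand is a plain move
        if op == "store":                # WB: destination depends on the instruction
            memory[dest] = result
        else:
            regs[dest] = result
    return memory


memory = {0xEA00: 0, 0xEA08: 5, 0xEA10: 7}      # A, B, C; values are made up
program = [("load", 2, 0xEA08), ("load", 3, 0xEA10),
           ("add", 2, 2, 3), ("store", 0xEA00, 2)]
run(program, memory)
```

Here the stages run strictly one after another for each instruction; a pipelined control unit would overlap them across consecutive instructions.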