EECS150 - Digital Design Lecture 20 - Memory

EECS150 - Digital DesignLecture 20 - Memory April 4&9, 2002 John Wawrzynek EECS150 - Lec19-memory

Uses: data & program storage general purpose registers buffering table lookups CL implementation Whenever a large collection of state elements is required. Types: RAM - random access memory ROM - read only memory EPROM, FLASH - electrically programmable read only memeory Example RAM: Register file regid = register identifier sizeof(regid) = log2(# of reg) WE = write enable Memory Basics EECS150 - Lec19-memory

Functionally the regfile is equivalent to a 2-D array of flip-flops: Cell with write logic: Register File Internals How do we go from "regid" to "SEL"? EECS150 - Lec19-memory

Regid (address) Decoding EECS150 - Lec19-memory

Special circuit tricks are used for the cell array to improve storage density. (We will look at these later) RAM/ROM naming convention: examples: 32 X 8, "32 by 8" => 32 8-bit words 1M X 1, "1 meg by 1" => 1M 1-bit words Standard Internal Memory Organization EECS150 - Lec19-memory

Functional Equivalence: Of course, full tri-state buffers are not needed at each cell point. Single transistors are used to implement zero cells. Logic one’s are derived through precharging or bit-line pullup transistor. Read Only Memory (ROM) EECS150 - Lec19-memory

Controls physical aspect ratio In DRAM, allows reuse of chip address pins Column MUX in ROMs and RAMs: EECS150 - Lec19-memory

example 256 X 8 ROM using 256 X 4 parts: example: 1K X * ROM using 256 X 4 parts: each module has tri-state outputs: Cascading Memory Modules (or chips) EECS150 - Lec19-memory

Definitions • Bandwidth: Total amount of data accross out of a device or across an interface per unit time. (usually Bytes/sec) • Latency: A measure of the time from a request for a data transfer until the data is received. Memory Interfaces for Acessing Data • Asynchronous (unclocked): A change in the address results in data appearing • Synchronous (clocked): A change in address, followed by an edge on CLK results in data appearing. Somtimes, multiple request may be outstanding. • Volatile: Looses its state when the power goes off. EECS150 - Lec19-memory

Example Memory Components: • Volatile: • Random Access Memory (RAM): • DRAM "dynamic" • SRAM "static" • Non-volatile: • Read Only Memory (ROM): • Mask ROM "mask programmable" • EPROM "electrically programmable" • EEPROM "erasable electrically programmable" • FLASH memory - similar to EEPROM with programmer integrated on chip EECS150 - Lec19-memory

SRAM Cell Larger cell  lower density, higher cost/bit No refresh required Simple read  faster access Standard IC process  natural for integration with logic DRAM Cell Smaller cell  higher density, lower cost/bit Needs periodic refresh, and refresh after read Complex read  longer access time Special IC process  difficult to integrate with logic circuits Volatile Memory Comparison word line word line bit line bit line bit line EECS150 - Lec19-memory

SRAM(lower density, higher speed) used in CPU register file, on- and off-chip caches. DRAM(higher density, lower speed) used in main memory In Desktop Computer Systems: • Closing the GAP: Innovation targeted towards higher bandwidth for memory systems: • SDRAM - synchronous DRAM • RDRAM - Rambus DRAM • EDORAM - extended data out SRAM • Three-dimensional RAM • hyper-page mode DRAM video RAM • multibank DRAM EECS150 - Lec19-memory

EDO - extended data out (similar to fast-page mode) RAS cycle fetched rows of data from cell array blocks (long access time, around 100ns) Subsequent CAS cycles quickly access data from row buffers if within an address page (page is around 256 Bytes) SDRAM - synchronous DRAM clocked interface uses dual banks internally. Start access in one back then next, then receive data from first then second. DDR - Double data rate SDRAM Uses both rising (positive edge) and falling (negative) edge of clock for data transfer. (typical 100MHz clock with 200 MHz transfer). RDRAM - Rambus DRAM Entire data blocks are access and transferred out on a highspeed bus-like interface (500 MB/s, 1.6 GB/s) Tricky system level design. More expensive memory chips. Important DRAM Examples: EECS150 - Lec19-memory

Mask ROM Used with logic circuits for tables etc. Contents fixed at IC fab time (truly write once!) EPROM (erasable programmable) & FLASH requires special IC process (floating gate technology) writing is slower than RAM. EPROM uses special programming system to provide special voltages and timing. reading can be made fairly fast. rewriting is very slow. erasure is first required , EPROM - UV light exposure Non-volatile Memory Used to hold fixed code (ex. BIOS), tables of data (ex. FSM next state/output logic), slowly changing values (date/time on computer) EECS150 - Lec19-memory

Electrically erasable In system programmability and erasability (no special system or voltages needed) On-chip circuitry (FSM) to control erasure and programming (writing) Erasure happens in variable sized "sectors" in a flash (16K - 64K Bytes) FLASH Memory See: http://developer.intel.com/design/flash/ for product descriptions, etc. EECS150 - Lec19-memory

Memory blocks can be (and often are) used to implement combinational logic functions: Examples: LUTs in FPGAs 1Mbit x 8 EPROM can implement 8 independent functions each of log2(1M)=20 inputs. The decoder part of a memory block can be considered a “minterm generator”. The cell array part of a memory block can be considered an OR function over a subset of rows. Relationship between Memory and CL • The combination gives us a way to implement logic functions directly in sum of products form. • Several variations on this theme exist in a set of devices called Programmable logic devices (PLDs) EECS150 - Lec19-memory

A ROM as AND/OR Logic Device EECS150 - Lec19-memory

PLD Summary EECS150 - Lec19-memory

PLA Example EECS150 - Lec19-memory

PAL Example EECS150 - Lec19-memory

LUTs can double as small RAM blocks: 5-LUT is a 16x1 memory achieves 16x density advantage over using CLB flip-flops Newer FPGA families include additional on chip RAM blocks (usually dual ported) Called “block-rams” in Xilinx Virtex series Memory Blocks in FPGAs EECS150 - Lec19-memory

Memory Specification in Verilog • Memory modeled by an array of registers: reg[15:0] memword[0:1023]; // 1,024 registers of 16 bits each //Example Memory Block Specification //----------------------------- //Read and write operations of memory. //Memory size is 64 words of 4 bits each. module memory (Enable,ReadWrite,Address,DataIn,DataOut); input Enable,ReadWrite; input [3:0] DataIn; input [5:0] Address; output [3:0] DataOut; reg [3:0] DataOut; reg [3:0] Mem [0:63]; //64 x 4 memory always @ (Enable or ReadWrite) if (Enable) if (ReadWrite) DataOut = Mem[Address]; //Read else Mem[Address] = DataIn; //Write else DataOut = 4'bz; //High impedance state endmodule EECS150 - Lec19-memory

Error Correction Codes (ECC) • Memory systems generate errors (accidentally fliped-bits) • DRAMs store very little charge per bit • “Soft” errors occur occasionally when cells are struck by alpha particles or other environmental upsets. • Less frequently, “hard” errors can occur when chips permanently fail. • Where “perfect” memory is required • servers, spacecraft/military computers, … • Memories are protected against failures with ECCs • Extra bits are added to each data-word • extra bits are used to detect and/or correct faults in the memory system • in general, each possible data word value is mapped to a unique “code word”. A fault changes a valid code word to an invalid one - which can be detected. EECS150 - Lec19-memory

Each data value, before it is written to memory is “tagged” with an extra bit to force the stored word to have even parity: Each word, as it is read from memory is “checked” by finding its parity (including the parity bit). b7b6b5b4b3b2b1b0p b7b6b5b4b3b2b1b0p + + c Simple Error Detection Coding Parity Bit • A non-zero parity indicates an error occurred: • two errors (on different bits) is not detected (nor any even number of errors) • odd numbers of errors are detected. EECS150 - Lec19-memory

Use more parity bits to pinpoint bit(s) in error, so they can be corrected. Example: SEC on 4-bit data use 3 parity bits, with 4-data bits results in 7-bit code word 3 parity bits sufficient to identify any one of 7 code word bits overlap the assignment of parity bits so that a single error in the 7-bit work can be corrected Group parity bits so they correspond to subsets of the 7 bits: p1 protects bits 1,3,5,7 p2 protects bits 2,3,6,7 p3 protects bits 4,5,6,7 1 2 3 4 5 6 7 p1 p2 d1 p3 d2 d3 d4 Bit position number 001 = 110 011 = 310 101 = 510 111 = 710 010 = 210 011 = 310 110 = 610 111 = 710 100 = 410 101 = 510 110 = 610 111 = 710 p1 p2 p3 Hamming Error Correcting Code EECS150 - Lec19-memory

Example: c = c1c2c3= 101 error in 4,5,6, or 7 (by c3=1) error in 1,3,5, or 7 (by c1=1) no error in 2, 3, 6, or 7 (by c2=0) Therefore error must be in bit 5. Note the check bits point to 5 By our clever positioning and assignment of parity bits, the check bits always address the position of the error! c=000 indicates no error 1 2 3 4 5 6 7 p1 p2 d1 p3 d2 d3 d4 Note: parity bits occupy power-of-two bit positions in code-word. On writing parity bits are assigned to force even parity over their respective groups. On reading, check bits (c1,c2,c3) are generated by finding the parity of the group along with its parity bit. If an error occurred in a group, the corresponding check bit will be 1, if no error the check bit will be 0. Hamming Code Example EECS150 - Lec19-memory

Overhead involved in single error correction code: let p be the total number of parity bits and d the number of data bits in a p + d bit word. If p error correction bits are to point to the error bit (p + d cases) plus indicate that no error exists (1 case), we need: 2p >= p + d + 1, thus p >= log(p + d + 1) for large d, p approaches log(d) Adding on extra parity bit covering the entire word can provide double error detection 1 2 3 4 5 6 7 8 p1 p2 d1 p3 d2 d3 d4 p4 On reading the C bits are computed (as usual) plus the parity over the entire word, P: C=0 P=0, no error C!=0 P=1, correctable single error C!=0 P=0, a double error occurred C=0 P=1, an error occurred in p4 bit Hamming Error Correcting Code Typical modern codes in DRAM memory systems: 64-bit data blocks (8 bytes) with 72-bit code words (9 bytes). EECS150 - Lec19-memory

EECS150 - Digital Design Lecture 20 - Memory

EECS150 - Digital Design Lecture 20 - Memory

Presentation Transcript

EECS150 - Digital Design Lecture 5 - Field Programmable Gate Arrays (FPGAs)

EECS150 - Digital Design Lecture 2 - CMOS

Lecture 20: Virtual Memory

Lecture 20: Cache Hierarchies, Virtual Memory

EECS150 - Digital Design Lecture 3 - Field Programmable Gate Arrays FPGAs

Design Realization lecture 20

EECS150 - Digital Design Lecture 10 - Combinational Logic Circuits Part 1

Lecture 20: Main Memory

Lecture 11 Memory Design

Lecture 20: Main Memory

EECS150 - Digital Design Lecture 14 - Sequential Circuits I (State Elements)

EECS150 - Digital Design Lecture 9 - Finite State Machines 1

EECS150 - Digital Design Lecture 12 - Combinational Logic Circuits Part 3

Lecture 10: Memory Hierarchy Design

Lecture 20: OOO, Memory Hierarchy

Design Realization lecture 20

Design of Digital Circuits Lecture 20: SIMD Processors

EECS150 - Digital Design Lecture 2 - CMOS

Digital Design Lecture 13

EECS150 - Digital Design Lecture 9 - Finite State Machines 1