ITEC 352

ITEC 352 Lecture 25 Memory(2)

Review • RAM • Why it isn’t on the CPU • What it is made of • Building blocks to black boxes • How it is accessed • Problems with decoders • Row / Column solution

Objectives • ROM • Cache memory

Other elements • We will look briefly at one more memory element: ROM (read-only memory). • For the other memories, cache and virtual memory, we will look at the main concepts but not at the hardware details.

Read Only Memory • Good news: we do not need any flip-flops to build a ROM. Why not? • You can think of a ROM as a “hard-coded” memory, i.e., memory whose contents cannot be changed. • However, these days ROMs are being replaced by special chips that are loaded with software called firmware.

A ROM that Stores Four Four-Bit Words (the two four’s in the title are not typos!)
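A minimal sketch of the idea, assuming the two address bits select one of four fixed words the way a 2-to-4 decoder would. The stored bit patterns and the names `ROM_WORDS` and `rom_read` are hypothetical and not taken from the slide's figure.

```python
# Hypothetical contents: four 4-bit words, fixed when the ROM is "built".
# A tuple is used because it cannot be modified, mirroring read-only memory.
ROM_WORDS = (0b0101, 0b1011, 0b1110, 0b0000)


def rom_read(a1, a0):
    """Return the 4-bit word selected by address bits a1 (high) and a0 (low)."""
    if a1 not in (0, 1) or a0 not in (0, 1):
        raise ValueError("address bits must be 0 or 1")
    # The decoder turns the 2-bit address into one of four word lines.
    index = (a1 << 1) | a0
    return ROM_WORDS[index]


if __name__ == "__main__":
    for a1 in (0, 1):
        for a0 in (0, 1):
            print(f"address {a1}{a0} -> {rom_read(a1, a0):04b}")
```

No state is stored or updated anywhere in the sketch, which is why no flip-flops are needed: the output depends only on the current address.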

Cache • Cache is a small memory area that is faster to access than main memory (RAM). The speed is due to two reasons: • It is connected to the CPU by a faster bus, or it may be on the CPU itself, e.g., on-chip L1/L2 caches. • It stores data that have been most recently or most often used, hence lookups for such data are fast.

Cache Placement (hypothetical case; current CPUs are up to 1600 MHz) • Cache improves memory access to the data stored in the cache. Here you can see that the cache is accessed over a 400 MHz bus, whereas main memory is accessed over a 66 MHz bus. • MHz: megahertz, or million clock cycles per second. E.g., 400 MHz is 400 million clock cycles per second. Which is faster, 400 MHz or 66 MHz?
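To make the comparison concrete, here is a small sketch that converts a clock rate into the time taken by one clock cycle. The function name `cycle_time_ns` is an assumption for illustration. It only compares clock periods and ignores other factors such as bus width.

```python
def cycle_time_ns(frequency_mhz):
    """Return the duration of one clock cycle in nanoseconds.

    1 MHz is one million cycles per second, so one cycle at f MHz
    lasts 1000 / f nanoseconds.
    """
    return 1000.0 / frequency_mhz


if __name__ == "__main__":
    for mhz in (400, 66):
        print(f"{mhz} MHz bus: {cycle_time_ns(mhz):.2f} ns per cycle")
```

A higher clock rate means a shorter cycle, so the 400 MHz bus is the faster one.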

Speed of a cache • How can we make the cache more effective? • (Brainstorm and think of all the ways that we can improve the performance of our programs when using a cache.)

Speed of a cache (2) • Key factors affecting speed: • Speed of the data bus from CPU to cache. • Determining what data must be stored in the cache. • Strategies to use if the cache is full – how can we add new information to the cache? This is the cache replacement policy. What data do you think should be stored in the cache?

Replacement Policies • When there are no available slots in which to place a block, a replacement policy is implemented. The replacement policy governs the choice of which slot is freed up for the new block. • Replacement policies are used for associative and set-associative mapping schemes, and also for virtual memory. • Least recently used (LRU) • First-in/first-out (FIFO) • Least frequently used (LFU) • Random • Optimal (used for analysis only – look backward in time and reverse-engineer the best possible strategy for a particular sequence of memory references.)
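The difference between two of these policies can be sketched in a few lines. This is an illustrative simulation only, assuming a fully associative cache with a fixed number of slots. The function name `simulate` and the block labels are hypothetical.

```python
from collections import OrderedDict


def simulate(references, num_slots, policy="LRU"):
    """Replay block references and return (hit count, final slot contents).

    policy "LRU" evicts the least recently used block;
    policy "FIFO" evicts the block that was loaded first.
    """
    if policy not in ("LRU", "FIFO"):
        raise ValueError("policy must be 'LRU' or 'FIFO'")
    slots = OrderedDict()  # oldest entry first
    hits = 0
    for block in references:
        if block in slots:
            hits += 1
            if policy == "LRU":
                # A hit makes the block the most recently used one.
                slots.move_to_end(block)
        else:
            if len(slots) >= num_slots:
                # No free slot: evict the entry at the front of the order.
                slots.popitem(last=False)
            slots[block] = None
    return hits, list(slots)


if __name__ == "__main__":
    refs = ["A", "B", "A", "C", "B"]
    print("LRU :", simulate(refs, 2, "LRU"))
    print("FIFO:", simulate(refs, 2, "FIFO"))
```

With the reference string A, B, A, C, B and two slots, LRU evicts B when C arrives (A was used more recently), while FIFO evicts A (it was loaded first), so the two policies end up with different hit counts.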

Data missing? • We have specific cache read and write policies.

Read / Write Policies

Performance • Next: We use hit ratios to measure cache performance….

Performance of a cache • Assume a program (instructions) is loaded into memory at addresses 0 to 57. Also assume a cache access time of 80 ns and a memory access time of 2500 ns. • What is the performance when using a cache? [Figure: main memory address ranges 0-16, 17-33, 34-40, 41-57, next to the cache]

Performance of a cache • What is the memory address of the first instruction accessed? • Address: 0 • Is this in the cache? • Initially, cache is empty. Hence, no! • So what do you do? • Depends on the policy [Figure: main memory address ranges 0-16, 17-33, 34-40, 41-57, next to the cache]

Performance of a cache • As the data is not in the cache, the block 0-16 is loaded into slot 1 of the cache. • The next instruction is at address 1. • Is this in the cache? • Yes. The whole block 0-16 is now in the cache, so the next 16 instructions (addresses 1-16) are hits. [Figure: main memory address ranges 0-16, 17-33, 34-40, 41-57, next to the cache]

Performance of cache • Hence:
Event      Location   Time
1 miss     0          2500 ns
16 hits    1-16       80 ns × 16
…
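The walk-through can be replayed with a short simulation. This is a sketch under stated assumptions: a miss costs one memory access (2500 ns), a hit costs one cache access (80 ns), the block boundaries are the address ranges from the figure, and the cache has enough slots that nothing is ever evicted. The names `BLOCKS`, `block_of` and `run` are hypothetical.

```python
CACHE_NS = 80      # cache access time from the slide
MEMORY_NS = 2500   # main memory access time from the slide

# Address ranges from the figure, treated here as the cache blocks.
BLOCKS = [(0, 16), (17, 33), (34, 40), (41, 57)]


def block_of(address):
    """Return the (low, high) block that contains the given address."""
    for low, high in BLOCKS:
        if low <= address <= high:
            return (low, high)
    raise ValueError(f"address {address} is outside the program")


def run(addresses):
    """Replay the accesses and return (hits, misses, total time in ns)."""
    cached = set()  # assumption: enough slots, so no replacement is needed
    hits = misses = 0
    for address in addresses:
        block = block_of(address)
        if block in cached:
            hits += 1
        else:
            misses += 1
            cached.add(block)  # load the whole block on a miss
    total_ns = hits * CACHE_NS + misses * MEMORY_NS
    return hits, misses, total_ns


if __name__ == "__main__":
    # Sequential execution through the first block, addresses 0 to 16.
    print(run(range(0, 17)))
    # Sequential execution through the whole program, addresses 0 to 57.
    print(run(range(0, 58)))
```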

Hit Ratios and Effective Access Times • Hit ratio and effective access time for single level cache: • Hit ratios and effective access time for multi-level cache:
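The formulas themselves are not reproduced in this transcript. As a hedged reference, the usual textbook definitions are sketched here. The symbols ($N$ for counts, $T$ for times, $H$ for hit ratios) are assumptions and may differ from the notation on the original slide.

```latex
% Single-level cache
H = \frac{N_{\text{hits}}}{N_{\text{accesses}}}
\qquad
T_{\text{eff}} = \frac{N_{\text{hits}} \, T_{\text{hit}} + N_{\text{misses}} \, T_{\text{miss}}}{N_{\text{accesses}}}

% Two-level cache (L1 on chip, L2 behind it)
H_1 = \frac{N_{\text{L1 hits}}}{N_{\text{accesses}}}
\qquad
H_2 = \frac{N_{\text{L2 hits}}}{N_{\text{L1 misses}}}

T_{\text{eff}} = \frac{N_{\text{L1 hits}} \, T_{\text{L1 hit}} + N_{\text{L2 hits}} \, T_{\text{L2 hit}} + N_{\text{L2 misses}} \, T_{\text{L2 miss}}}{N_{\text{accesses}}}
```

Here each time is taken to be the full time to satisfy a reference at that level, so the terms are weighted by how many references are satisfied there.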

Multilevel caches • As the size of ICs has increased, packing density has also increased. • Multilevel caches have been developed. • The fastest level, L1, is on the chip. • Usually data and instructions are kept separate at this level; this is called a split cache. • Levels L2 and L3 are slower than L1 and are unified caches.

Question • Let's say 10000 memory references are made while executing a process. • 90 cause L1 misses, and of these, 10 also cause L2 misses. • Let the L1 hit time be 5 ns (this is the time to access a memory location if it is in the L1 cache). • Let the L2 hit time be 20 ns. • Let the L2 miss time be 100 ns (the time to access a location in main memory). • What is the effective access time for a memory reference?
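One way to work the question is sketched below. It assumes that each quoted time is the total time for a reference satisfied at that level (5 ns for an L1 hit, 20 ns for an L2 hit, 100 ns for an L2 miss), so the effective access time is the weighted average over all references. The function name and parameter names are hypothetical.

```python
def effective_access_time(total_refs, l1_misses, l2_misses,
                          t_l1_hit, t_l2_hit, t_l2_miss):
    """Return the average time per memory reference, in the units given.

    Assumes every L2 miss is also an L1 miss, and that each time is the
    full cost of a reference satisfied at that level.
    """
    if not 0 <= l2_misses <= l1_misses <= total_refs or total_refs == 0:
        raise ValueError("inconsistent reference counts")
    l1_hits = total_refs - l1_misses   # satisfied by L1
    l2_hits = l1_misses - l2_misses    # missed L1, satisfied by L2
    total_time = (l1_hits * t_l1_hit
                  + l2_hits * t_l2_hit
                  + l2_misses * t_l2_miss)
    return total_time / total_refs


if __name__ == "__main__":
    t_eff = effective_access_time(10000, 90, 10, 5, 20, 100)
    print(f"L1 hit ratio: {(10000 - 90) / 10000:.3f}")
    print(f"L2 hit ratio: {(90 - 10) / 90:.3f}")
    print(f"Effective access time: {t_eff:.3f} ns")
```

With these numbers there are 9910 L1 hits, 80 L2 hits and 10 L2 misses, giving (9910 × 5 + 80 × 20 + 10 × 100) / 10000 = 5.215 ns under the stated assumption.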

Summary so far … • We have seen different types of memory elements and gone up the hierarchy of memory. • We have developed RAM, ROM, and registers, and looked at caches. • Next: we will see how programs that we develop are allocated memory. • Some terminology: • Process: any program in execution is called a process. • E.g., a Java program that you write is simply a program until you execute it. During its execution it becomes a process. • There can be multiple processes of the same program.

Summary • Cache access
