ITEC 352 Lecture 25 Memory(2)
Review • RAM • Why it isn’t on the CPU • What it is made of • Building blocks to black boxes • How it is accessed • Problems with decoders • Row / Column solution
Objectives • ROM • Cache memory
Other elements • We will look briefly at one more memory element: ROM (read-only memory). • For the other memory types, cache and virtual memory, we will look at the main concepts but not at the hardware details.
Read Only Memory • Good news: we do not need any flip-flops to build a ROM. Why not? • You can think of a ROM as a “hard-coded” memory, i.e., memory whose contents cannot be changed. • However, these days ROMs are being replaced by special chips that store software called firmware.
A ROM that Stores Four Four-Bit Words (the two four’s in the title are not typos!)
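A minimal sketch of how such a ROM behaves, assuming a 2-bit address that selects one of four fixed 4-bit words. The stored values and the name `rom_read` are hypothetical, chosen only for illustration; they are not the contents shown on the slide.

```python
# Hypothetical contents: four 4-bit words, fixed when the ROM is "built".
# A tuple is used because, like a ROM, it cannot be modified afterwards.
ROM_CONTENTS = (0b0101, 0b1011, 0b1110, 0b0000)


def rom_read(a1: int, a0: int) -> int:
    """Return the 4-bit word selected by the two address bits a1 a0."""
    # A 2-to-4 decoder would turn the two address bits into one of four
    # word-select lines; here that is modelled as computing an index.
    address = (a1 << 1) | a0
    return ROM_CONTENTS[address]


for address in range(4):
    word = rom_read(address >> 1, address & 1)
    print(f"address {address:02b} -> word {word:04b}")
```

Because the words are wired in rather than stored, nothing here needs to remember state, which is why no flip-flops are required.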
Cache • Cache is a small memory area that is faster to access than main memory (RAM). The speed is due to two reasons: • It is connected to the CPU by a faster bus, or it is located on the CPU itself, e.g., on-chip L1/L2 caches. • It stores the data that have been most recently or most often used, so lookups for such data are fast.
Cache Placement • Hypothetical case (current CPUs run at up to 1600 MHz). • A cache speeds up access to the data stored in it. In this example the cache is accessed over a 400 MHz bus, whereas main memory is accessed over a 66 MHz bus. • MHz: megahertz, or million clock cycles per second. E.g., 400 MHz is 400 million clock cycles per second. Which is faster, 400 MHz or 66 MHz?
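One way to compare the two bus speeds is to convert each frequency into the duration of a single clock cycle. The sketch below assumes only the definition of MHz given on the slide; the function name `cycle_time_ns` is made up for this example.

```python
def cycle_time_ns(frequency_mhz: float) -> float:
    """Duration of one clock cycle in nanoseconds."""
    # 1 MHz is 10^6 cycles per second, so one cycle lasts 1000 / f nanoseconds.
    return 1000.0 / frequency_mhz


cache_bus = cycle_time_ns(400)   # 2.5 ns per cycle
memory_bus = cycle_time_ns(66)   # roughly 15.15 ns per cycle

print(f"400 MHz bus: {cache_bus:.2f} ns per cycle")
print(f"66 MHz bus:  {memory_bus:.2f} ns per cycle")
print(f"The 400 MHz bus cycles about {memory_bus / cache_bus:.1f} times faster")
```

A higher frequency means a shorter cycle, so the 400 MHz bus is the faster of the two.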
Speed of a cache • How can we make the cache more effective? • (Brainstorm and think of all the ways that we can improve the performance of our programs when using a cache.)
Speed of a cache (2) • Key factors affecting speed: • Speed of the data bus from the CPU to the cache. • Determining what data must be stored in the cache. • Strategies to use if the cache is full: how can we add new information to the cache? This is the cache replacement policy. • What data do you think should be stored in the cache?
Replacement Policies • When there are no available slots in which to place a block, a replacement policy is implemented. The replacement policy governs the choice of which slot is freed up for the new block. • Replacement policies are used for associative and set-associative mapping schemes, and also for virtual memory. • Least recently used (LRU) • First-in/first-out (FIFO) • Least frequently used (LFU) • Random • Optimal (used for analysis only: look backward in time and reverse-engineer the best possible strategy for a particular sequence of memory references.)
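As an illustration of one of these policies, here is a small software sketch of LRU. It is a simplified model, not how cache hardware implements it: the class name `LRUCache`, the two-slot size, and the block labels are all assumptions made for this example.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache that evicts the least recently used block."""

    def __init__(self, num_slots: int):
        self.num_slots = num_slots
        self.slots = OrderedDict()  # keys are blocks, ordered oldest use first

    def access(self, block) -> bool:
        """Return True on a hit; on a miss, load the block and return False."""
        if block in self.slots:
            self.slots.move_to_end(block)    # now the most recently used
            return True
        if len(self.slots) >= self.num_slots:
            self.slots.popitem(last=False)   # free the least recently used slot
        self.slots[block] = None
        return False


cache = LRUCache(num_slots=2)
references = ["A", "B", "A", "C", "B"]
results = [cache.access(block) for block in references]
print(results)  # [False, False, True, False, False]
```

In this trace, the reference to C evicts B (A was used more recently), so the final reference to B misses. Under FIFO the same reference to C would evict A instead, and the final B would hit, which shows that the choice of policy changes the hit count for the same reference sequence.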
Data missing? • We have specific cache read and write policies.
Performance • Next: We use hit ratios to measure cache performance.
Performance of a cache • Assume a program (instructions) is loaded into memory at addresses 0 to 57. Also assume a cache access time of 80 ns and a memory access time of 2500 ns. • What is the performance when using a cache? • [Figure: memory address ranges 0-16, 17-33, 34-40, 41-57 in main memory, alongside the cache]
Performance of a cache • What is the memory address of the first instruction accessed? • Address: 0 • Is this in the cache? • Initially, the cache is empty. Hence, no! • So what do you do? • Depends on the policy. • [Figure: memory address ranges 0-16, 17-33, 34-40, 41-57 in main memory, alongside the cache]
Performance of a cache • As the data is not in the cache, the block 0-16 is loaded into slot 1 of the cache. • The next instruction is at address 1. • Is this in the cache? • Yes, so the next 15 instructions are in the cache. • [Figure: memory address ranges 0-16, 17-33, 34-40, 41-57 in main memory, alongside the cache]
Performance of cache • Hence:
Event      Location   Time
1 miss     0          2500 ns
15 hits    1-16       80 ns × 15
…
Hit Ratios and Effective Access Times • Hit ratio and effective access time for single level cache: • Hit ratios and effective access time for multi-level cache:
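The formulas themselves are not reproduced in this text. The standard definitions are given below as a reference; the symbol names ($H$, $T_{\text{eff}}$, and so on) are chosen here and may differ from the notation on the original slide. They assume that each access time is the total time for that case (a hit in L1, a hit in L2, or a trip to main memory).

```latex
% Single-level cache
\begin{align*}
H &= \frac{\text{number of cache hits}}{\text{total number of memory references}} \\
T_{\text{eff}} &= \frac{(\text{hits}) \times T_{\text{cache}} + (\text{misses}) \times T_{\text{memory}}}{\text{total number of memory references}}
\end{align*}

% Two-level cache
\begin{align*}
H_1 &= \frac{\text{number of L1 hits}}{\text{total number of memory references}} \\
H_2 &= \frac{\text{number of L2 hits}}{\text{number of L1 misses}} \\
T_{\text{eff}} &= \frac{(\text{L1 hits}) \times T_{L1} + (\text{L2 hits}) \times T_{L2} + (\text{L2 misses}) \times T_{\text{memory}}}{\text{total number of memory references}}
\end{align*}
```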
Multilevel caches • As the size of ICs has increased, packing density has also increased. • Multilevel caches have been developed. • The fastest level, L1, is on the chip. • Usually data and instructions are kept separate in this cache. This is called a split cache. • Levels L2 and L3 are slower than L1 and are unified caches.
Question • Let's say a process makes 10000 memory references during its execution. • 90 of them cause L1 misses, and of these, 10 cause L2 misses. • Let the L1 hit time be 5 ns (this is the time to access a memory location if it is in the L1 cache). • Let the L2 hit time be 20 ns. • Let the L2 miss time be 100 ns (the time to access the location in main memory). • What is the effective access time for a memory reference?
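One way to work this out is sketched below. It assumes that each given time is the total time for that kind of access (5 ns for an L1 hit, 20 ns for an L2 hit, 100 ns for a reference that goes to main memory), rather than times that add up along the way. The function name `multilevel_eat` is made up for this example.

```python
def multilevel_eat(total_refs, l1_misses, l2_misses, t_l1, t_l2, t_mem):
    """Effective access time (ns) for a two-level cache, weighted by event counts."""
    l1_hits = total_refs - l1_misses   # references satisfied by L1
    l2_hits = l1_misses - l2_misses    # L1 misses that are satisfied by L2
    total_time = l1_hits * t_l1 + l2_hits * t_l2 + l2_misses * t_mem
    return total_time / total_refs


eat = multilevel_eat(total_refs=10_000, l1_misses=90, l2_misses=10,
                     t_l1=5, t_l2=20, t_mem=100)
h1 = (10_000 - 90) / 10_000   # L1 hit ratio
h2 = (90 - 10) / 90           # L2 hit ratio, measured over the L1 misses only

print(f"L1 hit ratio: {h1:.3f}")
print(f"L2 hit ratio: {h2:.3f}")
print(f"Effective access time: {eat:.3f} ns")
```

Under these assumptions the counts are 9910 L1 hits, 80 L2 hits, and 10 main memory accesses, giving (9910 × 5 + 80 × 20 + 10 × 100) / 10000 = 5.215 ns. The result stays close to the L1 hit time because almost all references hit in L1.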
Summary so far … • We have seen different types of memory elements and gone up the memory hierarchy. • We have developed RAM, ROM, and registers, and looked at caches. • Next: we will see how the programs that we develop are allocated memory. • Some terminology: • Process: any program in execution is called a process. • E.g., a Java program that you write is simply a program until you execute it. During its execution it becomes a process. • There can be multiple processes of the same program.
Summary • Cache access