
CSL718 : Memory Hierarchy






Presentation Transcript


  1. CSL718 : Memory Hierarchy Cache Memories 6th Feb, 2006 Anshul Kumar, CSE IITD

  2. Memory technologies • Semiconductor (random access): Registers, SRAM, DRAM, FLASH • Magnetic: FDD, HDD • Optical (random + sequential): CD, DVD

  3. Hierarchical structure [Figure: pyramid of memory levels below the CPU — the fastest, smallest, highest-cost-per-bit memory sits nearest the CPU; each level further away is slower, bigger, and cheaper per bit]

  4. [Figure-only slide; no text captured in the transcript]

  5. Main memory for Pentium IV: DDR (double data rate) DRAM

  6. Disk drives: Seagate Barracuda, 7200 RPM

  7. Data transfer between levels [Figure: the processor accesses the upper level; a hit is served there, while a miss triggers a data transfer from the level below] Unit of transfer = block

  8. Principle of locality • Temporal locality: references repeated in time • Spatial locality: references repeated in space • Special case: sequential locality
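
As a rough illustration (not part of the original slides), the two kinds of locality can be counted over a hypothetical reference trace: a repeated address shows temporal locality, while a new address falling in an already-touched block shows spatial locality. The block size and trace below are made up.

```python
# Illustrative sketch (not from the slides): classifying a hypothetical
# reference trace. A repeated address shows temporal locality; a new
# address in an already-touched block shows spatial locality.
BLOCK = 64  # assumed block size in bytes

def block_of(addr):
    return addr // BLOCK

trace = [0, 4, 8, 12, 0, 4, 8, 12, 64, 68]  # made-up byte addresses

temporal = sum(1 for i, a in enumerate(trace) if a in trace[:i])
spatial = sum(1 for i, a in enumerate(trace)
              if a not in trace[:i]
              and block_of(a) in {block_of(b) for b in trace[:i]})
print(temporal, spatial)  # counts of temporally / spatially local references
```

In this trace, the second pass over addresses 0–12 is temporal locality, and each new address inside an already-fetched 64-byte block is spatial locality — which is exactly what makes caching with multi-word blocks pay off.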

  9. Memory hierarchy analysis
  Memories Mi: M1, M2, …, Mn
  Capacity si: s1 < s2 < … < sn
  Unit cost ci: c1 > c2 > … > cn; total cost Ctotal = Σi ci·si
  Access time ti = τ1 + τ2 + … + τi (τi is the time spent at level i), with τ1 < τ2 < … < τn
  Hit ratio hi(si): h1 < h2 < … < hn = 1
  Probability that an access misses all levels above i: mi = (1 − h1)(1 − h2) … (1 − h(i−1))
  Effective access time: Teff = Σi mi·hi·ti = Σi mi·τi
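
With made-up numbers, the formulas above can be checked numerically — a minimal sketch for a two-level hierarchy (cache + main memory), confirming that the two expressions for Teff agree:

```python
# Sketch with assumed numbers: a two-level hierarchy plugged into the
# slide's formulas, checking that sum(mi*hi*ti) equals sum(mi*taui).
tau = [1, 100]    # assumed per-level access times (cycles)
h = [0.75, 1.0]   # assumed hit ratios; the last level always hits

m, reach = [], 1.0
for hi in h:
    m.append(reach)       # probability the access reaches this level
    reach *= (1 - hi)

t = [sum(tau[:i + 1]) for i in range(len(tau))]  # cumulative access times
teff = sum(mi * hi * ti for mi, hi, ti in zip(m, h, t))
teff_alt = sum(mi * taui for mi, taui in zip(m, tau))
print(teff, teff_alt)
```

The identity holds because every access that reaches level i pays τi exactly once, whether it ultimately hits there or not.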

  10. Cache types • Instruction | Data | Unified | Split • Split vs. unified: split allows specializing each part; unified allows best use of the capacity • On-chip | Off-chip: on-chip is fast but small; off-chip is large but slow • Single level | Multi-level

  11. Cache policies • Placement: what gets placed where? • Read: when? from where? • Load: in what order of bytes/words? • Fetch: when to fetch a new block? • Replacement: which block to evict? • Write: when? to where?

  12. Block placement strategies [Figure: direct mapped — the block number selects a unique frame; set associative — the set number selects a set and the tag is searched within it; fully associative — the tag is searched across all frames]

  13. Organization/placement policy [Figure: a cache of S sets; each set holds SE sectors with LRU state; each sector has a tag and B blocks; each block carries status bits (V, D, S) and A addressable units (AUs)]

  14. Addressing the cache • Address fields: sector name | set index | block displacement • The set index selects the set; the sector name is compared with the tags; the displacement selects the block and the AU within it • Early select: access data after tag matching • Late select: access data while tag matching
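
As a hedged sketch, the field extraction can be written as plain shifts and masks. The widths below (12-bit set index, 2-bit displacement on a 32-bit address) are assumptions chosen to match the example cache that appears later in the deck.

```python
# Hedged sketch: splitting a 32-bit byte address into the slide's three
# fields. Field widths are assumptions for illustration.
INDEX_BITS = 12   # 4096 sets (assumed)
DISP_BITS = 2     # 4 bytes per block (assumed)

def split(addr):
    disp = addr & ((1 << DISP_BITS) - 1)
    index = (addr >> DISP_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (DISP_BITS + INDEX_BITS)
    return tag, index, disp

tag, index, disp = split(0x12345678)
print(tag, index, disp)
```

Reassembling `(tag << 14) | (index << 2) | disp` recovers the original address, which is a quick sanity check that the three fields partition the address bits.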

  15. Cache organization example [Figure: 8 sets, each holding 2 sectors; each sector has a tag and 2 blocks; each block carries V and D bits and 2 AUs]

  16. Cache access mechanism [Figure: direct-mapped cache; the 32-bit address splits into an 18-bit tag, a 12-bit index and a 2-bit byte offset; the index selects one of 4096 entries (valid bit, tag, 32-bit data); an 18-bit comparison of the stored tag against the address tag generates Hit]
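
The lookup in that figure can be sketched as a minimal hit/miss model (data storage omitted). The field widths follow the slide; the access trace is made up to show a conflict miss.

```python
# Minimal direct-mapped model with the slide's field widths (18-bit tag,
# 12-bit index, 2-bit byte offset); it only tracks hits and misses.
NUM_SETS = 4096

class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * NUM_SETS   # None plays the role of v = 0

    def access(self, addr):
        index = (addr >> 2) & (NUM_SETS - 1)
        tag = addr >> 14
        hit = self.tags[index] == tag
        self.tags[index] = tag          # allocate/replace on a miss
        return hit

cache = DirectMappedCache()
# 0x10000 maps to index 0 just like 0x0, so it evicts it (conflict miss)
hits = [cache.access(a) for a in (0x0, 0x4, 0x0, 0x10000, 0x0)]
print(hits)
```

The last two misses illustrate why direct mapping is cheap but conflict-prone: two hot addresses that share an index keep evicting each other, which is the motivation for the set-associative organization on the following slides.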

  17. Cache with 4-word blocks [Figure: the 32-bit address splits into an 18-bit tag, 10-bit index, 2-bit block offset and 2-bit byte offset; the index selects one of 1024 entries, each holding a valid bit, tag and four 32-bit words; a multiplexer driven by the block offset picks the word; tag comparison generates Hit]

  18. 4-way set associative cache [Figure: the 32-bit address splits into a 20-bit tag, 8-bit index, 2-bit block offset and 2-bit byte offset; the index selects one of 256 sets with four ways, each holding a valid bit, 20-bit tag and 128 bits (4 words) of data; four comparators and multiplexers select the hitting way, and a final multiplexer delivers the word]

  19. Read policies • Sequential or concurrent • sequential: initiate the memory access only after detecting a miss • concurrent: initiate the memory access along with the cache access, in anticipation of a miss • With or without forwarding • without forwarding: give data to the CPU only after the missing block has been filled into the cache • with forwarding: forward data to the CPU as it is being filled into the cache

  20. Read policies: effective access time (cache access = 1 cycle, memory access = T cycles, miss probability pm)
  • Sequential simple: Teff = (1 − pm)·1 + pm·(T + 2)
  • Concurrent simple: Teff = (1 − pm)·1 + pm·(T + 1)
  • Sequential forward: Teff = (1 − pm)·1 + pm·(T + 1)
  • Concurrent forward: Teff = (1 − pm)·1 + pm·T
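
Plugging assumed numbers into the four formulas makes the ranking concrete (pm and T below are made up for illustration):

```python
# The four read-policy formulas evaluated with assumed numbers:
# cache access = 1 cycle, memory access = T cycles, miss probability pm.
pm, T = 0.05, 20

teff = {
    "sequential simple":  (1 - pm) * 1 + pm * (T + 2),
    "concurrent simple":  (1 - pm) * 1 + pm * (T + 1),
    "sequential forward": (1 - pm) * 1 + pm * (T + 1),
    "concurrent forward": (1 - pm) * 1 + pm * T,
}
for name, t in teff.items():
    print(f"{name}: {t:.2f} cycles")
```

Each step from "sequential simple" toward "concurrent forward" shaves pm cycles off the effective time, since starting memory early and forwarding early each hide one cycle of the miss penalty.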

  21. Load policies [Figure: a 4-AU block with a cache miss on AU 1] • Block load: fetch the whole block from its start • Load forward: start at the missed AU and continue to the end of the block • Fetch bypass (wrap-around load): start at the missed AU, forward it to the CPU, and wrap around to fill the remaining AUs
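
The wrap-around fill order can be sketched in a few lines (a hedged illustration, with the 4-AU block size from the slide's figure assumed as the default):

```python
# Sketch of wrap-around (fetch-bypass) fill order: on a miss to AU
# `crit` in a block of B AUs, memory returns the critical AU first,
# then wraps around to fill the rest of the block.
def wrap_around_order(crit, B=4):
    return [(crit + k) % B for k in range(B)]

print(wrap_around_order(1))  # the slide's case: miss on AU 1
```

Delivering the critical AU first lets the CPU resume immediately while the rest of the block streams in behind it.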

  22. Fetch policies • Fetch on miss (demand fetching) • Software prefetching • Hardware prefetching

  23. Fetch policies • Demand fetching: fetch only when required (on a miss) • Hardware prefetching: automatically prefetch the next block • Software prefetching: the programmer decides what to prefetch • Questions: how far ahead (prefetch distance)? how often?

  24. Software control of the cache • Software-visible cache operations: • mode selection (WT, WB, etc.) • block flush • block invalidate • block prefetch

  25. Replacement policies • Least Recently Used (LRU) • Least Frequently Used (LFU) • First In First Out (FIFO) • Random
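
As a hedged sketch of the first policy, LRU within one set of a 4-way set-associative cache can be modeled with a list ordered from least (front) to most (back) recently used. The 4-way width and the tag trace are assumptions for illustration.

```python
# Hedged sketch: LRU replacement within one 4-way set, using a list
# ordered from least recently used (front) to most recently used (back).
WAYS = 4

class LRUSet:
    def __init__(self):
        self.tags = []

    def access(self, tag):
        hit = tag in self.tags
        if hit:
            self.tags.remove(tag)       # will be re-inserted as MRU
        elif len(self.tags) == WAYS:
            self.tags.pop(0)            # evict the LRU way
        self.tags.append(tag)
        return hit

s = LRUSet()
hits = [s.access(t) for t in (1, 2, 3, 4, 1, 5, 2)]
print(hits, s.tags)
```

Note that touching tag 1 on the fifth access saves it from eviction: the subsequent misses evict tags 2 and 3 instead, which is exactly the behavior that distinguishes LRU from FIFO.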

  26. Write policies • On a write hit: write back or write through • On a write miss: write back or write through (with or without write allocate) • Buffers are used in all cases to hide latencies
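
The traffic difference between the two hit policies can be sketched with a toy one-block cache and a made-up trace (an illustration under stated assumptions, not a full cache model):

```python
# Sketch comparing memory write traffic: write-through sends every store
# to memory; write-back writes a block back only when its dirty copy is
# evicted. One-block cache, made-up trace of (op, block) pairs.
def write_through(trace):
    return sum(1 for op, _ in trace if op == "w")  # every store reaches memory

def write_back(trace):
    cached, dirty, writes = None, False, 0
    for op, block in trace:
        if block != cached:             # miss: evict the resident block
            if dirty:
                writes += 1             # write the dirty block back
            cached, dirty = block, False
        if op == "w":
            dirty = True
    return writes + (1 if dirty else 0)  # final write-back of dirty data

trace = [("w", 0), ("w", 0), ("w", 0), ("r", 1), ("w", 0)]
print(write_through(trace), write_back(trace))
```

Three stores to the same block cost three memory writes under write-through but only one write-back at eviction, which is why write-back wins when stores exhibit locality, and why both policies rely on write buffers to hide the remaining latency.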
