
Chapter 19 Translation Lookaside Buffer

Chapter 19 Translation Lookaside Buffer. Chien-Chung Shen, CIS, UD, cshen@cis.udel.edu. Introduction. High performance overheads of paging: a large amount of mapping information (in memory) and an extra memory access for each virtual address. Hardware support


Presentation Transcript


  1. Chapter 19: Translation Lookaside Buffer. Chien-Chung Shen, CIS, UD, cshen@cis.udel.edu

  2. Introduction
     • High performance overheads of paging
       • large amount of mapping information (in memory)
       • extra memory access for each virtual address
     • Hardware support
       • translation-lookaside buffer (TLB)
       • part of the MMU
       • hardware cache of popular virtual-to-physical address translations
       • a better name would be address-translation cache
     • Upon each virtual memory reference, the hardware first checks the TLB to see if the desired translation is held therein; if so, the translation is performed (quickly) without having to consult the page table (which has all translations)

  3. TLB Algorithm

     VPN = (VirtualAddress & VPN_MASK) >> SHIFT
     (Success, TlbEntry) = TLB_Lookup(VPN)
     if (Success == True) // TLB Hit
         if (CanAccess(TlbEntry.ProtectBits) == True)
             Offset = VirtualAddress & OFFSET_MASK
             PhysAddr = (TlbEntry.PFN << SHIFT) | Offset
             AccessMemory(PhysAddr)
         else
             RaiseException(PROTECTION_FAULT)
     else // TLB Miss
         PTEAddr = PTBR + (VPN * sizeof(PTE))
         PTE = AccessMemory(PTEAddr)
         if (PTE.Valid == False)
             RaiseException(SEGMENTATION_FAULT)
         else if (CanAccess(PTE.ProtectBits) == False)
             RaiseException(PROTECTION_FAULT)
         else
             TLB_Insert(VPN, PTE.PFN, PTE.ProtectBits)
             RetryInstruction()
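The control flow on this slide can be modeled in software. The following is a minimal sketch, not the hardware's actual implementation. It assumes 8-bit virtual addresses with 16-byte pages, a 4-entry TLB, a single "readable" bit standing in for the protection bits, and round-robin insertion. The names `mmu_t`, `translate`, and the `XLATE_*` codes are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative parameters (assumptions): 8-bit VAs, 16-byte pages, 4-entry TLB. */
#define SHIFT       4
#define OFFSET_MASK 0x0Fu
#define VPN_MASK    0xF0u
#define NUM_PAGES   16
#define TLB_SIZE    4

enum { XLATE_OK = 0, XLATE_SEGFAULT = -1, XLATE_PROTFAULT = -2 };

typedef struct { bool valid; bool readable; uint8_t pfn; } pte_t;
typedef struct { bool valid; bool readable; uint8_t vpn, pfn; } tlb_entry_t;

typedef struct {
    pte_t       page_table[NUM_PAGES]; /* linear page table, indexed by VPN */
    tlb_entry_t tlb[TLB_SIZE];
    unsigned    next_victim;           /* round-robin replacement (assumption) */
    unsigned    hits, misses;
} mmu_t;

/* Search every entry; real hardware does this comparison in parallel. */
static tlb_entry_t *tlb_lookup(mmu_t *m, uint8_t vpn) {
    for (size_t i = 0; i < TLB_SIZE; i++)
        if (m->tlb[i].valid && m->tlb[i].vpn == vpn)
            return &m->tlb[i];
    return NULL;
}

/* Translate va; on success store the physical address in *pa. */
int translate(mmu_t *m, uint8_t va, uint8_t *pa) {
    assert(m != NULL && pa != NULL);
    uint8_t vpn = (uint8_t)((va & VPN_MASK) >> SHIFT);
    bool missed = false;

    for (;;) {
        tlb_entry_t *e = tlb_lookup(m, vpn);
        if (e != NULL) {                              /* TLB hit */
            if (!e->readable)
                return XLATE_PROTFAULT;
            if (!missed)
                m->hits++;
            *pa = (uint8_t)((e->pfn << SHIFT) | (va & OFFSET_MASK));
            return XLATE_OK;
        }
        /* TLB miss: consult the page table. */
        missed = true;
        m->misses++;
        pte_t pte = m->page_table[vpn];
        if (!pte.valid)
            return XLATE_SEGFAULT;
        if (!pte.readable)
            return XLATE_PROTFAULT;
        m->tlb[m->next_victim] = (tlb_entry_t){
            .valid = true, .readable = pte.readable, .vpn = vpn, .pfn = pte.pfn };
        m->next_victim = (m->next_victim + 1) % TLB_SIZE;
        /* Looping back plays the role of RetryInstruction(). */
    }
}
```

With VPN 6 mapped to PFN 3, translating VA 100 (0x64) misses once and yields 0x34; a following access to VA 104 hits in the TLB.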

  4. Example: Access Array
     • 8-bit virtual address space and 16-byte pages
       • 4-bit VPN and 4-bit offset
     • 10 4-byte integers starting at VA 100

         int i, sum = 0;
         for (i = 0; i < 10; i++) {
             sum += a[i];
         }

     • TLB hit rate: 70% (the array spans three pages: a[0..2] on VPN 6, a[3..6] on VPN 7, a[7..9] on VPN 8; the first access to each page misses and the other 7 accesses hit)
     • Spatial locality
     • Any other way to improve hit rate?
       • larger pages
     • Quick re-reference of memory in time
       • temporal locality
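The 70% figure can be checked with a short simulation. This is a sketch under stated assumptions: the TLB starts empty and is large enough that nothing is evicted during the loop, and only the accesses to the array are counted. The function name `count_tlb_hits` and the `MAX_PAGES` bound are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PAGES 256 /* upper bound on VPNs tracked (assumption) */

/* Count TLB hits for a sequential walk over an array, assuming an
 * initially empty TLB that never evicts during the walk. */
unsigned count_tlb_hits(unsigned base_va, unsigned n_elems,
                        unsigned elem_size, unsigned page_size) {
    bool cached[MAX_PAGES] = { false };
    unsigned hits = 0;

    assert(page_size > 0);
    for (unsigned i = 0; i < n_elems; i++) {
        unsigned vpn = (base_va + i * elem_size) / page_size;
        assert(vpn < MAX_PAGES);
        if (cached[vpn])
            hits++;             /* translation already in the TLB */
        else
            cached[vpn] = true; /* first touch of this page: miss, then insert */
    }
    return hits;
}
```

Calling `count_tlb_hits(100, 10, 4, 16)` gives 7 hits out of 10 accesses, matching the slide. With 32-byte pages the same array spans only two pages, so `count_tlb_hits(100, 10, 4, 32)` gives 8 hits, which illustrates why larger pages improve the hit rate.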

  5. Caching and Locality
     • Caching is one of the most fundamental performance techniques in computer systems to make the common case faster
     • The idea behind caching is to take advantage of locality in instruction and data references
       • Temporal locality: an instruction or data item that has been recently accessed will likely be re-accessed soon in the future (e.g., instructions in a loop)
       • Spatial locality: if a program accesses memory x, it will likely soon access memory near x

  6. Who Handles TLB Misses
     • For CISC (complex-instruction set computer) architectures, by hardware
       • using the page-table base register
     • For RISC (reduced-instruction set computer) architectures, by software (the hardware simply raises an exception and jumps to a trap handler)
       • advantage: flexibility (the OS may use any data structure to implement the page table) and simplicity
       • return-from-trap must return to the same instruction that caused the trap, so that the instruction is retried and now hits in the TLB
       • the OS must avoid causing an infinite chain of TLB misses
         • keep TLB miss handlers in physical memory (not subject to address translation), or
         • reserve some entries in the TLB for permanently valid translations and use some of those permanent translation slots for the handler code itself

  7. TLB Contents
     • 32, 64, or 128 entries
     • Fully associative: any given translation can be anywhere in the TLB, and the hardware searches the entire TLB in parallel to find the desired translation
     • An entry looks like: VPN | PFN | other bits
       • e.g., valid bit
     • TLB valid bit ≠ page table valid bit
       • in the page table, when a PTE is marked invalid, it means that the page has not been allocated by the process
       • a TLB valid bit refers to whether a TLB entry has a valid translation within it

  8. Context Switch
     • The TLB contains virtual-to-physical translations that are valid only for the currently running process; they are not meaningful for other processes
     • What to do on a context switch?
       • flush the TLB on context switches by setting all valid bits to 0
       • this incurs TLB misses after each context switch: what can you do better?
     • With an ASID (Address Space ID), the TLB may hold translations from different processes

         VPN   PFN   valid   prot   ASID
         10    100   1       rwx    1
         —     —     0       —      —
         10    170   1       rwx    2
         —     —     0       —      —

     • Sharing of a page: entries from two processes may point to the same PFN

         VPN   PFN   valid   prot   ASID
         10    101   1       rwx    1
         —     —     0       —      —
         50    101   1       rwx    2
         —     —     0       —      —
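The two options on this slide, flushing and ASID tagging, can be sketched as follows. This is an illustrative model only: the entry layout, field widths, and the names `tlb_flush` and `tlb_lookup_asid` are assumptions, and the protection bits are omitted for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical TLB entry tagged with an address-space ID. */
typedef struct {
    uint32_t vpn;
    uint32_t pfn;
    bool     valid;
    uint8_t  asid;
} tlb_entry_t;

/* Option 1: flush on a context switch by clearing every valid bit. */
void tlb_flush(tlb_entry_t *tlb, size_t n) {
    for (size_t i = 0; i < n; i++)
        tlb[i].valid = false;
}

/* Option 2: keep entries across switches and match on (VPN, ASID).
 * Returns true on a hit and stores the frame number in *pfn. */
bool tlb_lookup_asid(const tlb_entry_t *tlb, size_t n,
                     uint32_t vpn, uint8_t asid, uint32_t *pfn) {
    assert(pfn != NULL);
    for (size_t i = 0; i < n; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn && tlb[i].asid == asid) {
            *pfn = tlb[i].pfn;
            return true;
        }
    }
    return false;
}
```

Loaded with the entries from the first table, a lookup of VPN 10 returns PFN 100 for ASID 1 and PFN 170 for ASID 2, so the two processes no longer need a flush between them.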

  9. Replacement Policy
     • Cache replacement with the goal of minimizing the miss rate
     • Policies
       • evict the least-recently-used (LRU) entry
         • how about a loop accessing n + 1 pages, a TLB of size n, and an LRU replacement policy?
       • random
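The question about a loop over n + 1 pages can be explored with a small simulation. This is a sketch assuming a fully associative TLB with exact LRU tracked by timestamps, an initially empty TLB, and one access per page per pass; the name `lru_loop_hits` and the `MAX_TLB` bound are hypothetical.

```c
#include <assert.h>

#define MAX_TLB 128 /* largest TLB size this sketch supports (assumption) */

/* Cycle through pages 0..n_pages-1 for n_passes passes and count hits
 * in a fully associative TLB of tlb_size entries with LRU replacement. */
unsigned lru_loop_hits(unsigned tlb_size, unsigned n_pages, unsigned n_passes) {
    unsigned vpn[MAX_TLB];
    unsigned long last_used[MAX_TLB];
    unsigned used = 0, hits = 0;
    unsigned long clock = 0;

    assert(tlb_size >= 1 && tlb_size <= MAX_TLB);

    for (unsigned pass = 0; pass < n_passes; pass++) {
        for (unsigned page = 0; page < n_pages; page++) {
            unsigned slot = used; /* "not found" sentinel */
            for (unsigned i = 0; i < used; i++) {
                if (vpn[i] == page) { slot = i; break; }
            }
            if (slot < used) {
                hits++;                       /* hit */
            } else if (used < tlb_size) {
                slot = used++;                /* miss: fill a free slot */
                vpn[slot] = page;
            } else {
                slot = 0;                     /* miss: evict the LRU entry */
                for (unsigned i = 1; i < used; i++)
                    if (last_used[i] < last_used[slot])
                        slot = i;
                vpn[slot] = page;
            }
            last_used[slot] = ++clock;
        }
    }
    return hits;
}
```

With a 4-entry TLB, `lru_loop_hits(4, 5, 10)` returns 0: each access evicts exactly the page that will be needed next, so LRU misses every time. Shrinking the loop to 4 pages, `lru_loop_hits(4, 4, 10)`, returns 36, since only the first pass misses. This worst case is one reason a random policy can be attractive.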

  10. A Real TLB Entry
     • MIPS R4000 with a software-managed TLB

  11. Culler’s Law
     • The term random-access memory (RAM) implies that you can access any part of RAM just as quickly as another. While it is generally good to think of RAM in this way, because of hardware/OS features such as the TLB, accessing a particular page of memory may be costly, particularly if that page isn’t currently mapped by the TLB. Thus, it is always good to remember the implementation tip: RAM isn’t always RAM. Sometimes randomly accessing your address space, particularly if the number of pages accessed exceeds the TLB coverage, can lead to severe performance penalties. -- David Culler
     • The TLB is the source of many performance problems
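"TLB coverage" in the quote can be read as the amount of memory reachable without a TLB miss, that is, the number of entries times the page size. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the 64-entry, 4 KB-page figures in the note below are illustrative assumptions rather than values from the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes of address space the TLB can map at once: entries * page size. */
size_t tlb_coverage_bytes(size_t entries, size_t page_size) {
    /* Guard against overflow in the multiplication. */
    assert(page_size == 0 || entries <= SIZE_MAX / page_size);
    return entries * page_size;
}

/* True if a working set is too large to be fully mapped by the TLB. */
bool exceeds_tlb_coverage(size_t working_set_bytes,
                          size_t entries, size_t page_size) {
    return working_set_bytes > tlb_coverage_bytes(entries, page_size);
}
```

For example, 64 entries with 4 KB pages cover 262144 bytes (256 KB), so randomly touching a 1 MB working set would be expected to miss in the TLB frequently.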
