
Virtual Memory



  1. Virtual Memory

  2. virtual memory --- When the combined size of the program, its data, and its stack exceeds the amount of physical memory, the computer keeps only the parts it is currently using in memory and keeps the rest on disk.

  3. Key Concept: Locality. The address space is logically partitioned: code, data, stack, heap; the code itself into initialization, main phases, and error handlers. Different parts have different reference patterns. [Figure: address space for process pi, showing initialization code (used once), code for phases 1-3, code for errors 1-3, and data & stack, annotated with the fraction of execution time spent in each region: <1%, 15%, 35%, 20%, <1%, 30%.]

  4. Memory Management Unit. The MMU converts virtual addresses into physical addresses. [Figure: the CPU's MMU sits between the CPU and memory on the bus.]

  5. Virtual memory is divided into pages; real memory is divided into page frames. Pages and page frames are the same size. The size of virtual memory is defined by the number of bits in an address: for a 32-bit address, the virtual address space is 4,294,967,296 bytes (4 GB).

  6. Virtual memory example. For a computer with 16-bit addresses (2^16 = 64K of virtual memory) but only 32K of real memory (15 bits of addressability), define a page as 4K: there are 16 virtual pages and 8 page frames. The map below gives the page frame holding each virtual page (X = unmapped):

     0K-4K   -> 2      16K-20K -> 4      32K-36K -> X
     4K-8K   -> 1      20K-24K -> 3      36K-40K -> 5
     8K-12K  -> 6      24K-28K -> X      40K-44K -> X
     12K-16K -> 0      28K-32K -> X      . . .
                                         60K-64K -> X

  7. mov reg, 0 --- The MMU sees that the virtual address is in the range 0 to 4095 (virtual page 0), and this maps to page frame 2 (8K-12K).

  8. So it then transforms the address to 8192 (the start of page frame 2) and puts this address on the address bus.
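The translation from slides 6-8, including the trap on an unmapped page, can be sketched in a few lines of Python. The page-table values are the ones from the slide's map; the function name and the exception are our own choices.

```python
PAGE_SIZE = 4096  # 4K pages, as in the example

# Virtual page -> page frame, taken from the slide's memory map;
# pages absent from the dict are unmapped.
page_table = {0: 2, 1: 1, 2: 6, 3: 0, 4: 4, 5: 3, 9: 5}

def translate(vaddr):
    """Split a virtual address into (page, offset); return the physical
    address, or raise to model the MMU's trap (page fault)."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    if page not in page_table:
        raise LookupError(f"page fault: virtual page {page} is unmapped")
    return page_table[page] * PAGE_SIZE + offset

print(translate(0))       # virtual page 0 -> frame 2 -> 8192
```

Address 32780 (virtual page 8) would raise, mirroring the page fault on slide 9.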

  9. mov reg, 32780 --- This address is on virtual page 8 (32K-36K). The MMU sees that this page is unmapped, so it traps to the OS (a page fault). The OS picks an empty or little-used page frame and brings virtual page 8 into that page frame.

  10. Once the fault is handled, the map is updated: virtual page 8 (32K-36K) now maps to page frame 7, and the faulting instruction is restarted.

  11. Page Tables

  12. Inside the MMU. The incoming 16-bit virtual address (8196 = 0010 000000000100) is split in two: the top 4 bits are the virtual page number (2), used as the index into the page table, and the 12-bit offset is moved directly from input to output. The page table entry for page 2 contains a present/absent bit and page frame number 110 (6). The frame number is prepended to the offset, giving the outgoing 15-bit physical address 110 000000000100 = 24580. [Figure: 16-entry page table; the entries shown hold frame numbers 010, 001, 110, 100, 011, ..., 101, 111 with their present/absent bits.]
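The bit-level version of this translation can be reproduced directly; the frame numbers mirror the slide's page table, and the function name is ours.

```python
OFFSET_BITS = 12                  # 4K pages -> 12-bit offset
OFFSET_MASK = (1 << OFFSET_BITS) - 1

# Page frame numbers from the slide's page table (page 2 -> frame 110 = 6)
frame_of = {0: 0b010, 1: 0b001, 2: 0b110, 3: 0b100, 4: 0b011, 9: 0b101}

def mmu(vaddr):
    page = vaddr >> OFFSET_BITS     # top 4 bits of the 16-bit address
    offset = vaddr & OFFSET_MASK    # low 12 bits, copied to the output
    return (frame_of[page] << OFFSET_BITS) | offset

print(mmu(8196))   # 0010 000000000100 -> 110 000000000100 = 24580
```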

  13. There are two issues that must be faced:
  • The page table can be extremely large: with a 4-KB page size and 32-bit addresses, a page table needs 1 million entries. And each process needs its own page table!
  • The mapping must be really fast: virtual-to-physical mapping takes place on every memory reference. That might mean 1, 2, or maybe even more page table references per instruction!

  14. Inside the MMU: hardware registers. At one extreme, the MMU can be constructed as an array of fast hardware registers. When a process is loaded into memory, the process's page table is loaded into the registers. Advantage: fast, no memory access required. Disadvantages: a million registers would be expensive, and loading them increases the time to load a process. [Figure: the page table held entirely in MMU registers.]

  15. Inside the MMU: page table in memory. At the other extreme, the page tables are kept completely in memory and a single register points to the beginning of the current page table. Advantage: when a context switch occurs, only the contents of this one register change. Disadvantages: lots more memory is used, and every translation requires a slow memory access. [Figure: a page-table base register pointing into memory.]

  16. Multi-Level Page Tables Multi-Level page tables address the issue of keeping huge page tables in memory all of the time.

  17. A 32-bit virtual address is split into three fields: a 10-bit PT1, a 10-bit PT2, and a 12-bit offset (pages are 4K). PT1 indexes into the 1024-entry top-level page table, where each entry represents 4M of the 4GB address space and points to a second-level page table. PT2 indexes into that 1024-entry second-level table to get the page frame, to which the offset is added. Only the second-level page tables actually required need to be in memory. [Figure: top-level table with entries 0-1023, each pointing to a second-level table.]
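The 10 + 10 + 12 field split described above can be written out directly; the example address is our own, chosen so all three fields are small and easy to check.

```python
def split(vaddr):
    """Split a 32-bit virtual address into (PT1, PT2, offset)."""
    pt1 = (vaddr >> 22) & 0x3FF   # top 10 bits: top-level index
    pt2 = (vaddr >> 12) & 0x3FF   # next 10 bits: second-level index
    offset = vaddr & 0xFFF        # low 12 bits: offset within the page
    return pt1, pt2, offset

# 0x00403004 = 4M + 3*4K + 4, so PT1=1, PT2=3, offset=4
print(split(0x00403004))   # -> (1, 3, 4)
```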

  18. Page Table Entries. Usually 32 bits. The layout is machine dependent, but the following fields are representative:
  • Page frame number
  • Present/absent bit: 1 = entry is valid, 0 = page not in memory
  • Protection bits
  • Modified bit (the dirty bit): 0 = the page is "clean", 1 = the page has been modified
  • Referenced bit: set whenever the page is referenced
  • Cache disable bit
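As a concrete, purely hypothetical encoding of such an entry (real layouts are machine dependent): put a 20-bit frame number in the low bits and the flag bits above it.

```python
# Hypothetical 32-bit PTE layout: 20-bit frame number, then flags.
PRESENT       = 1 << 20
MODIFIED      = 1 << 21   # the dirty bit
REFERENCED    = 1 << 22
CACHE_DISABLE = 1 << 23

def make_pte(frame, present=True, modified=False, referenced=False):
    pte = frame & 0xFFFFF          # 20-bit page frame number
    if present:    pte |= PRESENT
    if modified:   pte |= MODIFIED
    if referenced: pte |= REFERENCED
    return pte

pte = make_pte(6, modified=True)
print(pte & 0xFFFFF, bool(pte & PRESENT), bool(pte & MODIFIED))   # 6 True True
```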

  19. Translation Look-aside Buffers. Without paging, "Load R1, address" takes 1 memory cycle. With two-level paging it takes three: access the primary page table, access the 2nd-level page table, calculate the real address, and then execute the Load instruction. In most paging schemes the page tables are kept in memory rather than in the MMU, so the act of doing an address translation requires extra memory fetches. Having to do two extra memory fetches per memory reference reduces performance by 2/3.

  20. Observation: most programs tend to make a large number of references to a small number of pages. Thus, only a small number of page references are used heavily in a program. As a result chip manufacturers have created a hardware solution, usually part of the MMU, called a translation lookaside buffer (TLB): a small number of entries, each holding an in-use bit, virtual page, modified bit, protection bits, and page frame:

     In use   Virtual page   Modified   Protection   Page frame
     1        40             1          rw           31
     1        20             0          rx           38
     1        130            1          rw           29
     1        129            1          rw           62
     1        19             0          rx           50
     1        21             0          rx           45
     1        860            1          rw           14
     1        861            1          rw           75

  21. An example that could generate this TLB might be an execution loop in virtual pages 19-21, the array being processed on pages 129 and 130, and the stack on pages 860 and 861.

  22. When a memory reference is presented to the MMU, the MMU checks whether the page is in the TLB by comparing it against all entries in parallel. If a match is found and the access does not violate the protection bits, the page frame number is taken directly out of the TLB. If the instruction is trying to write a read-only page, a protection fault is generated.

  23. If the page is not in the TLB, then the MMU does a normal page table lookup, and the new entry replaces one already in the TLB.
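The hit / protection-fault / miss-and-replace behavior of slides 22-23 can be modeled in software. The capacity and the FIFO replacement policy here are illustrative stand-ins, not the hardware's actual policy.

```python
from collections import OrderedDict

class TLB:
    """Tiny software model of a TLB: virtual page -> (frame, protection)."""
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, vpage, write):
        if vpage not in self.entries:
            return None                      # miss -> walk the page table
        frame, prot = self.entries[vpage]
        if write and 'w' not in prot:
            raise PermissionError("protection fault: write to read-only page")
        return frame

    def insert(self, vpage, frame, prot):
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # replace an existing entry
        self.entries[vpage] = (frame, prot)

tlb = TLB()
tlb.insert(19, 50, 'rx')
tlb.insert(20, 38, 'rx')
print(tlb.lookup(19, write=False))   # hit -> 50
print(tlb.lookup(99, write=False))   # miss -> None
```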

  24. Inverted Page Tables. Traditional page tables require one entry per virtual page, since they are indexed by virtual page number. If the address space is 2^32 bytes, with 4096 bytes per page, then over 1 million page table entries are required. At 32 bits per entry, this is 4M bytes per page table. What about an address space of 2^64? The page table would be over 30 million gigabytes!

  25. An inverted page table has one entry per page frame in real memory instead of one per page in the virtual address space. With 64-bit virtual addresses, a 4K page, and 256M of real memory, an inverted page table only requires 65,536 entries. Each entry keeps track of which virtual page is located in the page frame. So ... how do you do a translation in this case?

  26. Because it's an inverted table, you can't index by virtual page number, so the entire page table has to be searched until you find the virtual page that matches the memory reference, resulting in a 64K-entry table search on every memory reference!

  27. So ... use the TLB for heavily used pages. When the TLB misses, look up the page in a hash table hashed on the virtual address, with the number of slots equal to the number of physical pages; the matching entry gives the physical page. Multiple hits on the same hash slot are linked together, and these lists are typically small (2-3 entries). [Figure: virtual address -> TLB; on a miss, the hash of the virtual page indexes the inverted page table.]
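A sketch of this hashed inverted page table; the number of frames and the entries inserted are illustrative values only.

```python
NFRAMES = 16   # illustrative number of physical page frames

# One bucket per frame; each bucket chains (vpage, frame) pairs
# whose virtual page numbers hash to the same slot.
buckets = [[] for _ in range(NFRAMES)]

def insert(vpage, frame):
    buckets[hash(vpage) % NFRAMES].append((vpage, frame))

def lookup(vpage):
    for vp, frame in buckets[hash(vpage) % NFRAMES]:
        if vp == vpage:
            return frame
    return None   # not resident -> page fault

insert(129, 62)
insert(130, 29)
print(lookup(129))   # -> 62
print(lookup(7))     # -> None (not resident)
```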

  28. Paging Performance. A page fault causes the following sequence to occur:
  • A trap to the operating system
  • Save the registers and process state
  • Determine that the interrupt was a page fault
  • Check that the page reference is legal
  • Determine the location of the page on disk
  • Issue a read to get the page and move it into a free frame: wait for the device to be ready, wait for seek/latency times, begin transfer of the page to a free frame
  • While waiting, allocate the cpu to another process
  • Receive an interrupt that the I/O is complete
  • Determine that the interrupt was from the disk
  • Wait for the cpu to be allocated to this process again
  • Restore the registers, process state, and new page table

  29. Page Replacement Algorithms. When a page fault occurs, the operating system has to choose a page in memory to evict so that it can bring the new page into that page frame. If the page that is removed has been modified (the dirty bit is set), then that page has to be written to disk first. The problem is figuring out which page to replace.

  30. Not Recently Used. Recall that each entry in the page table has an R bit and an M bit. The R bit is set by the hardware whenever that page is referenced; the M bit is set by the hardware whenever that page is modified. The operating system sets both bits to 0 for every page when a process starts. Periodically (e.g. on each clock interrupt) the operating system resets the R bit to 0. When a page fault occurs, the operating system inspects the page table and puts pages into one of four classes:
  Class 0: not referenced since last interrupt, not modified
  Class 1: not referenced since last interrupt, modified
  Class 2: referenced since last interrupt, not modified
  Class 3: referenced since last interrupt, modified
  Replace a page at random from the lowest-numbered class that has any pages in it.

  31. Not Recently Used (continued). Class 0 pages are the most likely to get paged out, Class 3 pages the least likely: replace a page at random from the lowest-numbered class that has any pages in it.
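The NRU classification and eviction rule can be sketched as follows; the page tuples stand in for real page-table entries.

```python
import random

def nru_victim(pages):
    """pages: list of (name, referenced, modified) tuples.
    Classify each page as 2*R + M, then evict a random page
    from the lowest-numbered non-empty class."""
    classes = {0: [], 1: [], 2: [], 3: []}
    for name, r, m in pages:
        classes[2 * r + m].append(name)
    for c in range(4):
        if classes[c]:
            return random.choice(classes[c])

# A: class 3, B: class 2, C: class 1 -> C is the only lowest-class page
pages = [("A", 1, 1), ("B", 1, 0), ("C", 0, 1)]
print(nru_victim(pages))   # -> "C"
```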

  32. First In First Out The operating system maintains a linked list of pages currently in memory. The head of the list contains the oldest page, the tail the most recent. When a page fault occurs the page at the head of the list is removed and the new page is added to the tail of the list. Problem?

  33. First In First Out. [Figure: page list from the oldest page at the head to the newest page at the tail.] The oldest page is not necessarily the least used page. In fact, it could be the most often used page!
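A short FIFO simulation makes the problem concrete; the reference string here is our own, chosen so the most-used page is the one evicted.

```python
from collections import deque

def fifo_faults(frames, refs):
    """Count page faults under FIFO replacement."""
    resident, order, faults = set(), deque(), 0
    for page in refs:
        if page not in resident:
            faults += 1
            if len(resident) == frames:
                resident.discard(order.popleft())   # evict the oldest page
            resident.add(page)
            order.append(page)
    return faults

# Page 0 is referenced most often, yet FIFO evicts it first.
print(fifo_faults(3, [0, 1, 2, 0, 3, 0]))   # -> 5 faults
```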

  34. Second Chance. Uses a linked list like First In First Out, but the R bit of the oldest page is inspected. If it is set, the bit is cleared and the page is moved to the tail of the list just as if it had just arrived; then the new head of the list is inspected. This process continues until a page whose R bit is not set is found at the head of the list. Problem? It keeps recently used pages in memory, but requires a lot of processing to manage the linked list.

  35. Clock Replacement. This is exactly like second chance except that the list is circular, which keeps pages from being moved around the list. [Figure: pages A-L arranged in a circle with a clock hand.] When a page fault occurs, inspect the page that the hand is pointing to. If the R bit is 0, evict the page; if the R bit is 1, clear R and advance the hand. Worst case: you go all the way around the list and evict the page whose R bit you first cleared.
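A minimal sketch of the clock hand's scan; the page names and R-bit values are illustrative.

```python
class Clock:
    """Circular list of (page, R-bit) pairs with a clock hand."""
    def __init__(self, frames):
        self.slots = [(p, 1) for p in frames]
        self.hand = 0

    def evict(self):
        while True:
            page, r = self.slots[self.hand]
            if r == 0:                          # unreferenced: evict it
                self.slots[self.hand] = (None, 0)
                return page
            self.slots[self.hand] = (page, 0)   # clear R: a second chance
            self.hand = (self.hand + 1) % len(self.slots)

clock = Clock(["A", "B", "C"])
clock.slots[1] = ("B", 0)   # only B is unreferenced
print(clock.evict())        # -> "B" (A's R bit is cleared on the way)
```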

  36. Least Recently Used (LRU). It is commonly understood that pages that have been heavily used in the last few instructions will probably be used again in the next few instructions. The Least Recently Used algorithm therefore discards the page that has been unused for the longest time.

  37. Ideal Implementation. Maintain a linked list of the pages in memory, with the most recently used page at the front of the list and the least recently used at the tail. Problem?

  38. Ideal Implementation (continued). On every memory reference, you have to find the page containing the reference in the list and move it to the front of the list. For 1G of memory and a 4K page frame, the linked list would contain 262,144 entries. That's a ton of processing to do on every memory reference!
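Using an ordered map as a stand-in for the linked list, the ideal LRU bookkeeping looks like this (the class and method names are ours):

```python
from collections import OrderedDict

class LRU:
    """Track resident pages; most recently used at the back."""
    def __init__(self, frames):
        self.frames = frames
        self.pages = OrderedDict()

    def reference(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)    # move to "front" of the list
            return None
        victim = None
        if len(self.pages) == self.frames:
            victim, _ = self.pages.popitem(last=False)   # least recent
        self.pages[page] = True
        return victim                       # the evicted page, if any

lru = LRU(3)
for p in [0, 1, 2, 0]:
    lru.reference(p)
print(lru.reference(3))   # -> 1 (page 0 was just touched, so 1 is LRU)
```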

  39. LRU Hardware Solution. Provide a 64-bit counter and increment it after every instruction. Add a counter field to every entry in the page table; on each reference, store the counter value in the entry for the page just referenced. When a page fault occurs, scan the table and pick the page with the smallest value in its counter field. [Figure: page table with a 64-bit counter field in each entry.]

  40. Another LRU Hardware Solution. In a machine with n page frames, maintain an n x n bit array in hardware. Initially all bits are zero. When page frame k is referenced, set all of the bits in row k to 1, then set all of the bits of column k to 0. At any moment, the row whose binary value is the lowest is the least recently used.

  41. When page frame k is referenced, set all of the bits in row k to 1 and all of the bits of column k to 0. Consider a system with 4 page frames and page references in the order 0 1 2 3 2 1 0 3 2 3. Initially all bits are zero:

         0 1 2 3
     0 | 0 0 0 0
     1 | 0 0 0 0
     2 | 0 0 0 0
     3 | 0 0 0 0

  42. After the first reference (frame 0):

         0 1 2 3
     0 | 0 1 1 1
     1 | 0 0 0 0   <- least recently used (row sum = 0)
     2 | 0 0 0 0
     3 | 0 0 0 0

  43. After referencing frame 1:

         0 1 2 3
     0 | 0 0 1 1
     1 | 1 0 1 1
     2 | 0 0 0 0   <- least recently used (row sum = 0)
     3 | 0 0 0 0

  44. After referencing frame 2:

         0 1 2 3
     0 | 0 0 0 1
     1 | 1 0 0 1
     2 | 1 1 0 1
     3 | 0 0 0 0   <- least recently used (row sum = 0)

  45. After referencing frame 3:

         0 1 2 3
     0 | 0 0 0 0   <- least recently used (row sum = 0)
     1 | 1 0 0 0
     2 | 1 1 0 0
     3 | 1 1 1 0

  46. After referencing frame 2 again:

         0 1 2 3
     0 | 0 0 0 0   <- least recently used
     1 | 1 0 0 0
     2 | 1 1 0 1
     3 | 1 1 0 0

  47. After referencing frame 1 again:

         0 1 2 3
     0 | 0 0 0 0   <- least recently used
     1 | 1 0 1 1
     2 | 1 0 0 1
     3 | 1 0 0 0

  48. After referencing frame 0 again:

         0 1 2 3
     0 | 0 1 1 1
     1 | 0 0 1 1
     2 | 0 0 0 1
     3 | 0 0 0 0   <- least recently used

  49. After referencing frame 3 again:

         0 1 2 3
     0 | 0 1 1 0
     1 | 0 0 1 0
     2 | 0 0 0 0   <- least recently used
     3 | 1 1 1 0

  50. After the ninth reference (frame 2):

         0 1 2 3
     0 | 0 1 0 0
     1 | 0 0 0 0   <- least recently used (smallest value)
     2 | 1 1 0 1
     3 | 1 1 0 0
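The whole worked example can be checked by simulating the matrix directly:

```python
def matrix_lru(n, refs):
    """Apply the n x n matrix rule for each reference in refs, then
    return the frame whose row has the smallest binary value (LRU)."""
    m = [[0] * n for _ in range(n)]
    for k in refs:
        for j in range(n):
            m[k][j] = 1        # set all bits in row k to 1
        for i in range(n):
            m[i][k] = 0        # then clear all bits in column k
    row_value = lambda row: int("".join(map(str, row)), 2)
    return min(range(n), key=lambda i: row_value(m[i]))

# The slides' reference string: after the first nine references,
# frame 1 has the smallest row value and is the LRU frame.
print(matrix_lru(4, [0, 1, 2, 3, 2, 1, 0, 3, 2]))   # -> 1
```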
