
Computer System Chapter 10. Virtual Memory


Presentation Transcript


  1. Computer System Chapter 10. Virtual Memory Lynn Choi Korea University

  2. A System with Physical Memory Only Memory 0: Physical Addresses 1: CPU N-1: • Examples: • Most Cray machines, early PCs, nearly all embedded systems, etc. • Addresses generated by the CPU correspond directly to bytes in physical memory

  3. A System with Virtual Memory 0: 1: CPU N-1: • Examples: • Workstations, servers, modern PCs, etc. • Address Translation: Hardware converts virtual addresses to physical addresses via OS-managed lookup table (page table) Memory Page Table Virtual Addresses Physical Addresses 0: 1: P-1: Disk

  4. Page Faults (like “Cache Misses”) • What if an object is on disk rather than in memory? • Page table entry indicates virtual address not in memory • OS exception handler invoked to move data from disk into memory • Current process suspends, others can resume • OS has full control over placement, etc. Before fault After handling fault Memory Memory Page Table Page Table Virtual Addresses Physical Addresses Virtual Addresses Physical Addresses CPU CPU Disk Disk

  5. Servicing a Page Fault disk Disk (1) Initiate Block Read • Processor Signals Controller • Read block of length P starting at disk address X and store starting at memory address Y • Read Occurs • Direct Memory Access (DMA) • Under control of I/O controller • I / O Controller Signals Completion • Interrupt processor • OS resumes suspended process Processor Reg (3) Read Done Cache Memory-I/O bus (2) DMA Transfer I/O controller Memory disk Disk

  6. Memory Management • Multiple processes can reside in physical memory. • How do we resolve address conflicts? • What if two processes access something at the same address? memory invisible to user code kernel virtual memory stack %esp Memory mapped region for shared libraries Linux/x86 process memory image the “brk” ptr runtime heap (via malloc) uninitialized data (.bss) initialized data (.data) program text (.text) forbidden 0

  7. Solution: Separate Virt. Addr. Spaces • Virtual and physical address spaces divided into equal-sized blocks • Blocks are called “pages” (both virtual and physical) • Each process has its own virtual address space • Operating system controls how virtual pages are assigned to physical memory 0 Physical Address Space (DRAM) Address Translation Virtual Address Space for Process 1: 0 VP 1 PP 2 VP 2 ... N-1 (e.g., read/only library code) PP 7 Virtual Address Space for Process 2: 0 VP 1 PP 10 VP 2 ... M-1 N-1

  8. Protection 0: Read? Write? Physical Addr 1: VP 0: VP 0: Yes No PP 9 VP 1: VP 1: Yes Yes PP 4 VP 2: VP 2: No No XXXXXXX • • • • • • • • • Read? Write? Physical Addr Yes Yes PP 6 N-1: Yes No PP 9 No No XXXXXXX • • • • • • • • • • Page table entry contains access rights information • Hardware enforces this protection (trap into OS if violation occurs) Page Tables Memory Process i: Process j:

  9. Address Translation Symbols • Virtual Address Components • VPO: virtual page offset • VPN: virtual page number • TLBI: TLB index • TLBT: TLB tag • Physical Address Components • PPO: physical page offset • PPN: physical page number • CO: byte offset within cache block • CI: cache index • CT: cache tag

  10. Simple Memory System Example • Addressing • 14-bit virtual addresses • 12-bit physical addresses • Page size = 64 bytes • Virtual address: bits 13-6 are the VPN (Virtual Page Number), bits 5-0 are the VPO (Virtual Page Offset) • Physical address: bits 11-6 are the PPN (Physical Page Number), bits 5-0 are the PPO (Physical Page Offset)
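The split on this slide can be sketched in C. With 64-byte pages the low 6 bits of an address are the page offset, which leaves 8 VPN bits in a 14-bit virtual address and 6 PPN bits in a 12-bit physical address. The function names are illustrative and do not come from the slides.

```c
#include <stdint.h>

/* Simple memory system: 14-bit VA, 12-bit PA, 64-byte pages. */
enum { PAGE_OFFSET_BITS = 6 };                          /* log2(64) */
enum { PAGE_OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1 };

/* Virtual page number: bits 13..6 of the virtual address. */
uint16_t va_vpn(uint16_t va)
{
    return (uint16_t)((va & 0x3FFF) >> PAGE_OFFSET_BITS);
}

/* Virtual page offset: bits 5..0 of the virtual address. */
uint16_t va_vpo(uint16_t va)
{
    return (uint16_t)(va & PAGE_OFFSET_MASK);
}

/* Physical address: 6-bit PPN followed by the unchanged offset (PPO == VPO). */
uint16_t make_pa(uint16_t ppn, uint16_t vpo)
{
    return (uint16_t)(((ppn & 0x3F) << PAGE_OFFSET_BITS) | (vpo & PAGE_OFFSET_MASK));
}
```

For the address 0x03D4 used in Example #1, va_vpn gives 0x0F and va_vpo gives 0x14. The PPN has to come from the TLB or page table, whose contents are not reproduced in this transcript.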

  11. Simple Memory System Page Table • Only show first 16 entries

  12. Simple Memory System TLB • TLB • 16 entries • 4-way associative • Virtual address: bits 13-6 are the VPN, split into TLBT (bits 13-8) and TLBI (bits 7-6); bits 5-0 are the VPO
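A minimal sketch of how the TLB fields follow from the geometry on this slide: 16 entries at 4 ways gives 4 sets, so the low 2 bits of the VPN select the set (TLBI) and the remaining 6 bits form the tag (TLBT). The names are illustrative.

```c
/* TLB geometry from the slide: 16 entries, 4-way set associative. */
enum {
    TLB_ENTRIES    = 16,
    TLB_WAYS       = 4,
    TLB_SETS       = TLB_ENTRIES / TLB_WAYS,   /* 4 sets */
    TLB_INDEX_BITS = 2                          /* log2(TLB_SETS) */
};

/* TLB index: low bits of the VPN select one of the 4 sets. */
unsigned tlb_index(unsigned vpn)
{
    return vpn & (TLB_SETS - 1);
}

/* TLB tag: the remaining high bits of the VPN. */
unsigned tlb_tag(unsigned vpn)
{
    return vpn >> TLB_INDEX_BITS;
}
```

For VPN 0x0F (virtual address 0x03D4) this yields TLBI 3 and TLBT 0x03. Whether the lookup hits depends on the TLB contents, which are not part of this transcript.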

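  13. Simple Memory System Cache • Cache • 16 lines • 4-byte line size • Direct mapped • Physical address: bits 11-6 are the CT (the same bits as the PPN), bits 5-2 are the CI, bits 1-0 are the CO (CI and CO together cover the PPO)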
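The cache fields can be extracted the same way. A 4-byte line needs 2 offset bits, 16 direct-mapped lines need 4 index bits, and the remaining 6 bits of the 12-bit physical address are the tag. This is a sketch with illustrative names; the physical address in the note below uses an assumed PPN, since the page table contents are not in the transcript.

```c
/* Cache geometry from the slide: 16 lines, 4-byte lines, direct mapped. */
enum {
    CACHE_OFFSET_BITS = 2,   /* log2(4-byte line) */
    CACHE_INDEX_BITS  = 4    /* log2(16 lines)    */
};

/* CO: byte offset within the cache line (bits 1..0). */
unsigned cache_offset(unsigned pa)
{
    return pa & ((1u << CACHE_OFFSET_BITS) - 1);
}

/* CI: cache line index (bits 5..2). */
unsigned cache_index(unsigned pa)
{
    return (pa >> CACHE_OFFSET_BITS) & ((1u << CACHE_INDEX_BITS) - 1);
}

/* CT: cache tag (bits 11..6 of the 12-bit physical address). */
unsigned cache_tag(unsigned pa)
{
    return (pa >> (CACHE_OFFSET_BITS + CACHE_INDEX_BITS)) & 0x3F;
}
```

Assuming a physical address of 0x354 (a hypothetical PPN of 0x0D combined with offset 0x14), the fields are CO 0, CI 5, and CT 0x0D.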

  14. Address Translation Example #1 • Virtual Address 0x03D4 • VPN ___ TLBI ___ TLBT ____ TLB Hit? __ Page Fault? __ PPN: ____ • Physical Address • Offset ___ CI ___ CT ____ Hit? __ Byte: ____

  15. Address Translation Example #2 • Virtual Address 0x0B8F • VPN ___ TLBI ___ TLBT ____ TLB Hit? __ Page Fault? __ PPN: ____ • Physical Address • Offset ___ CI ___ CT ____ Hit? __ Byte: ____

  16. Address Translation Example #3 • Virtual Address 0x0040 • VPN ___ TLBI ___ TLBT ____ TLB Hit? __ Page Fault? __ PPN: ____ • Physical Address • Offset ___ CI ___ CT ____ Hit? __ Byte: ____

  17. Program Start Scenario • Before starting the process • Load the page directory into physical memory • Load the PDBR (page directory base register) with the beginning of the page directory • Load the PC with the start address of code • When the 1st reference to code triggers • iTLB miss (translation failed for instruction address) • Exception handler looks up PTE1 • dTLB miss (translation failed for PTE1) • Exception handler looks up PTE2 • Lookup page directory and find PTE2 • Add PTE2 to dTLB • dTLB hit, but page miss (PTE1 not in memory) • Load page containing PTE1 • Lookup page table and find PTE1 • Add PTE1 to iTLB • iTLB hit, but page miss (code page not present in memory) • Load the instruction page • Cache miss, but memory returns the instruction

  18. P6 Memory System 32 bit address space 4 KB page size L1, L2, and TLBs • 4-way set associative inst TLB • 32 entries • 8 sets data TLB • 64 entries • 16 sets L1 i-cache and d-cache • 16 KB • 32 B line size • 128 sets L2 cache • unified • 128 KB -- 2 MB DRAM external system bus (e.g. PCI) L2 cache cache bus bus interface unit inst TLB data TLB instruction fetch unit L1 i-cache L1 d-cache processor package
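The set counts on this slide follow from the sizes and the 4-way associativity. A small sketch of that arithmetic, with illustrative function names:

```c
/* Number of sets in a set-associative cache:
 * total size / (line size * lines per set). */
unsigned cache_sets(unsigned size_bytes, unsigned line_bytes, unsigned ways)
{
    return size_bytes / (line_bytes * ways);
}

/* Number of sets in a set-associative TLB: entries / entries per set. */
unsigned tlb_sets(unsigned entries, unsigned ways)
{
    return entries / ways;
}
```

With the slide's numbers, cache_sets(16 * 1024, 32, 4) is 128 for the L1 caches, tlb_sets(32, 4) is 8 for the instruction TLB, and tlb_sets(64, 4) is 16 for the data TLB.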

  19. Overview of P6 Address Translation CPU 32 L2 and DRAM result 20 12 virtual address (VA) VPN VPO L1 miss L1 hit 16 4 TLBT TLBI L1 (128 sets, 4 lines/set) TLB hit TLB miss ... ... TLB (16 sets, 4 entries/set) 10 10 VPN1 VPN2 20 7 5 20 12 CT CI CO PPN PPO physical address (PA) PDE PTE Page tables PDBR

  20. P6 2-level Page Table Structure Up to 1024 page tables • Page directory • 1024 4-byte page directory entries (PDEs) that point to page tables • One page directory per process. • Page directory must be in memory when its process is running • Always pointed to by PDBR • Page tables: • 1024 4-byte page table entries (PTEs) that point to pages. • Page tables can be paged in and out. 1024 PTEs page directory ... 1024 PDEs 1024 PTEs ... 1024 PTEs

  21. P6 Page Directory Entry (PDE) 31 12 11 9 8 7 6 5 4 3 2 1 0 Page table physical base addr Avail G PS A CD WT U/S R/W P=1 Page table physical base address: 20 most significant bits of physical page table address (forces page tables to be 4KB aligned) Avail: These bits available for system programmers G: global page (don’t evict from TLB on task switch) PS: page size 4K (0) or 4M (1) A: accessed (set by MMU on reads and writes, cleared by software) CD: cache disabled (1) or enabled (0) WT: write-through or write-back cache policy for this page table U/S: user or supervisor mode access R/W: read-only or read-write access P: page table is present in memory (1) or not (0) 31 1 0 Available for OS (page table location in secondary storage) P=0

  22. P6 Page Table Entry (PTE) 31 12 11 9 8 7 6 5 4 3 2 1 0 Page physical base address Avail G 0 D A CD WT U/S R/W P=1 Page base address: 20 most significant bits of physical page address (forces pages to be 4 KB aligned) Avail: available for system programmers G: global page (don’t evict from TLB on task switch) D: dirty (set by MMU on writes) A: accessed (set by MMU on reads and writes) CD: cache disabled or enabled WT: write-through or write-back cache policy for this page U/S: user/supervisor R/W: read/write P: page is present in physical memory (1) or not (0) 31 1 0 Available for OS (page location in secondary storage) P=0
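The PTE layout above maps naturally onto bit masks. The sketch below takes the bit positions from the slide (P in bit 0, R/W in bit 1, U/S in bit 2, WT in bit 3, CD in bit 4, A in bit 5, D in bit 6, G in bit 8, base address in bits 31-12). The macro and function names are assumptions made for illustration, not names from any real kernel header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of a P6 page table entry, as laid out on the slide. */
#define PTE_P   (1u << 0)   /* present in physical memory      */
#define PTE_RW  (1u << 1)   /* read-write (1) or read-only (0) */
#define PTE_US  (1u << 2)   /* user (1) or supervisor (0)      */
#define PTE_WT  (1u << 3)   /* write-through cache policy      */
#define PTE_CD  (1u << 4)   /* cache disabled                  */
#define PTE_A   (1u << 5)   /* accessed                        */
#define PTE_D   (1u << 6)   /* dirty                           */
#define PTE_G   (1u << 8)   /* global page                     */
#define PTE_BASE_MASK 0xFFFFF000u   /* 20 high bits: page base */

bool pte_present(uint32_t pte)  { return (pte & PTE_P) != 0; }
bool pte_writable(uint32_t pte) { return (pte & PTE_RW) != 0; }
bool pte_dirty(uint32_t pte)    { return (pte & PTE_D) != 0; }

/* Physical base address of the page; meaningful only when P = 1. */
uint32_t pte_page_base(uint32_t pte)
{
    return pte & PTE_BASE_MASK;
}
```

Because the low 12 bits hold flags, the base address is always a multiple of 4 KB, which is the alignment the slide mentions.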

  23. How P6 Page Tables Map Virtual Addresses to Physical Ones 10 10 12 Virtual address VPN1 VPN2 VPO word offset into page directory word offset into page table word offset into physical and virtual page page directory page table physical address of page base (if P=1) PTE PDE PDBR physical address of page table base (if P=1) physical address of page directory 20 12 Physical address PPN PPO
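A simplified sketch of this two-level walk: VPN1 (10 bits) indexes the page directory, VPN2 (10 bits) indexes the selected page table, and the 12-bit VPO is carried over unchanged. To stay self-contained, the code simulates the page directory as an array of C pointers, with NULL standing in for a PDE whose P bit is 0. Real hardware stores physical addresses in PDEs instead, and all names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

uint32_t p6_vpn1(uint32_t va) { return va >> 22; }            /* bits 31..22 */
uint32_t p6_vpn2(uint32_t va) { return (va >> 12) & 0x3FF; }  /* bits 21..12 */
uint32_t p6_vpo(uint32_t va)  { return va & 0xFFF; }          /* bits 11..0  */

/* Simulated walk. page_dir has 1024 slots; each is NULL (page table not
 * present) or points to 1024 PTEs holding (PPN << 12) | flags, with bit 0
 * as the present bit. Returns false where a real MMU would raise a page
 * fault. */
bool p6_translate(uint32_t **page_dir, uint32_t va, uint32_t *pa)
{
    const uint32_t *page_table = page_dir[p6_vpn1(va)];
    if (page_table == NULL)
        return false;                 /* PDE has P = 0 */

    uint32_t pte = page_table[p6_vpn2(va)];
    if ((pte & 1u) == 0)
        return false;                 /* PTE has P = 0 */

    *pa = (pte & 0xFFFFF000u) | p6_vpo(va);   /* PPN followed by PPO */
    return true;
}
```

The two failure returns correspond to the cases where the OS page fault handler has to bring in a page table or a data page before the access can be restarted.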

  24. Representation of Virtual Address Space Page 15 PT 3 Page 14 Page 13 Page Directory Page 12 Page 11 P=0, M=1 P=1, M=1 P=1, M=1 P=1, M=1 • • • • PT 2 P=0, M=0 P=0, M=1 P=0, M=0 P=1, M=1 • • • • Page 10 P=0, M=0 P=0, M=0 P=1, M=1 P=1, M=1 • • • • Page 9 P=0, M=1 P=0, M=1 P=0, M=1 P=0, M=0 • • • • Page 8 PT 0 Page 7 Page 6 Mem Addr Page 5 Disk Addr Page 4 Page 3 In Mem Page 2 On Disk Page 1 Page 0 Unmapped • Simplified Example • 16 page virtual address space • Flags • P: Is entry in physical memory? • M: Has this part of VA space been mapped?

  25. P6 TLB Translation CPU 32 L2 and DRAM result 20 12 virtual address (VA) VPN VPO L1 miss L1 hit 16 4 TLBT TLBI L1 (128 sets, 4 lines/set) TLB hit TLB miss ... ... TLB (16 sets, 4 entries/set) 10 10 VPN1 VPN2 20 7 5 20 12 CT CI CO PPN PPO physical address (PA) PDE PTE Page tables PDBR

  26. P6 TLB 32 16 1 1 PDE/PTE Tag PD V set 0 entry entry entry entry set 1 entry entry entry entry set 2 entry entry entry entry ... set 15 entry entry entry entry • TLB entry (not all documented, so this is speculative): • V: indicates a valid (1) or invalid (0) TLB entry • PD: is this entry a PDE (1) or a PTE (0)? • tag: disambiguates entries cached in the same set • PDE/PTE: page directory or page table entry • Structure of the data TLB: • 16 sets, 4 entries/set

  27. Translating with the P6 Page Tables(case 1/1) • Case 1/1: page table and page present. • MMU Action: • MMU builds physical address and fetches data word. • OS action • none 20 12 VPN VPO 20 12 VPN1 VPN2 PPN PPO Mem PDE p=1 PTE p=1 data PDBR Data page Page directory Page table Disk

  28. Translating with the P6 Page Tables(case 1/0) • Case 1/0: page table present but page missing. • MMU Action: • Page fault exception • Handler receives the following args: • VA that caused fault • Fault caused by non-present page or page-level protection violation • Read/write • User/supervisor 20 12 VPN VPO VPN1 VPN2 Mem PDE p=1 PTE p=0 PDBR Page directory Page table data Disk Data page

  29. Translating with the P6 Page Tables(case 1/0) • OS Action: • Check for a legal virtual address. • Read PTE through PDE. • Find free physical page (swapping out current page if necessary) • Read virtual page from disk and copy it to the physical page • Restart faulting instruction by returning from exception handler. 20 12 VPN VPO 20 12 VPN1 VPN2 PPN PPO Mem PDE p=1 PTE p=1 data PDBR Data page Page directory Page table Disk

  30. Translating with the P6 Page Tables(case 0/1) • Case 0/1: page table missing but page present. • Introduces consistency issue. • Potentially every page out requires update of disk page table. • Linux disallows this • If a page table is swapped out, then swap out its data pages too. 20 12 VPN VPO VPN1 VPN2 Mem PDE p=0 data PDBR Data page Page directory PTE p=1 Disk Page table

  31. Translating with the P6 Page Tables(case 0/0) • Case 0/0: page table and page missing. • MMU Action: • Page fault exception 20 12 VPN VPO VPN1 VPN2 Mem PDE p=0 PDBR Page directory PTE p=0 data Disk Page table Data page

  32. Translating with the P6 Page Tables(case 0/0) • OS action: • Swap in page table. • Restart faulting instruction by returning from handler. • Like case 1/0 from here on. 20 12 VPN VPO VPN1 VPN2 Mem PDE p=1 PTE p=0 PDBR Page table Page directory data Disk Data page

  33. P6 L1 Cache Access CPU 32 L2 and DRAM result 20 12 virtual address (VA) VPN VPO L1 miss L1 hit 16 4 TLBT TLBI L1 (128 sets, 4 lines/set) TLB hit TLB miss ... ... TLB (16 sets, 4 entries/set) 10 10 VPN1 VPN2 20 7 5 20 12 CT CI CO PPN PPO physical address (PA) PDE PTE Page tables PDBR

  34. Speeding Up L1 Access • Observation • Bits that determine CI identical in virtual and physical address • Can index into cache while address translation taking place • Then check with CT from physical address • “Virtually indexed, physically tagged” • Cache carefully sized to make this possible Tag Check 20 7 5 CT CI CO Physical address (PA) PPN PPO Addr. Trans. No Change CI virtual address (VA) VPN VPO 20 12
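The sizing condition behind "virtually indexed, physically tagged" can be checked with a little arithmetic: the index and offset bits (CI and CO) must all fall inside the page offset, which translation leaves unchanged. A sketch, assuming every size is a power of two; the function names are illustrative.

```c
#include <stdbool.h>

/* floor(log2(x)) for x >= 1; exact when x is a power of two. */
unsigned floor_log2(unsigned x)
{
    unsigned n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* True when the cache set can be selected before translation finishes,
 * i.e. when CI bits + CO bits fit within the page-offset bits. */
bool can_index_before_translation(unsigned sets, unsigned line_bytes,
                                  unsigned page_bytes)
{
    return floor_log2(sets) + floor_log2(line_bytes) <= floor_log2(page_bytes);
}
```

For the P6 L1 cache (128 sets, 32-byte lines, 4 KB pages) the check gives 7 + 5 = 12 bits, exactly the page offset width, so the condition holds. Doubling the number of sets at the same associativity would break it.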

  35. Linux Organizes VM as Collection of “Areas” • Area • Contiguous chunk of (allocated) virtual memory whose pages are related • Examples: code segment, data segment, heap, shared library segment, etc. • Any existing virtual page is contained in some area. • Any virtual page that is not part of some area does not exist and cannot be referenced! • Thus, the virtual address space can have gaps. • The kernel does not keep track of virtual pages that do not exist. • task_struct • Kernel maintains a distinct task structure for each process • Contains all the information that the kernel needs to run the process • PID, pointer to the user stack, name of the executable object file, program counter, etc. • mm_struct • One of the entries in the task structure; it characterizes the current state of virtual memory • pgd – base of the page directory table • mmap – points to a list of vm_area_struct

  36. Linux Organizes VM as Collection of “Areas” process virtual memory • vm_prot: • read/write permissions for this area • vm_flags • shared with other processes or private to this process vm_area_struct task_struct mm_struct vm_end vm_start mm pgd vm_prot vm_flags mmap shared libraries vm_next 0x40000000 vm_end vm_start data vm_prot vm_flags 0x0804a020 text vm_next vm_end vm_start 0x08048000 vm_prot vm_flags 0 vm_next

  37. Linux Page Fault Handling vm_end vm_end vm_end vm_start vm_start vm_start r/o r/o r/w vm_next vm_next vm_next process virtual memory • Is the VA legal? • i.e. is it in an area defined by a vm_area_struct? • if not then signal segmentation violation (e.g. (1)) • Is the operation legal? • i.e., can the process read/write this area? • if not then signal protection violation fault (e.g., (2)) • If OK, handle the page fault • e.g., (3) vm_area_struct shared libraries 1 read 3 data read 2 text write 0
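The three checks on this slide can be sketched as a walk over a linked list of areas. The struct below borrows the field names from the slides (vm_start, vm_end, vm_prot, vm_next), but it is a simplified stand-in and not the kernel's actual vm_area_struct. The permission flags and result codes are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { AREA_READ = 1, AREA_WRITE = 2 };        /* hypothetical permission bits */

enum fault_result {
    FAULT_SEGV,     /* (1) address is not inside any area     */
    FAULT_PROT,     /* (2) area exists but forbids the access */
    FAULT_HANDLE    /* (3) legal access: service the fault    */
};

struct area {
    uintptr_t vm_start;       /* first address of the area    */
    uintptr_t vm_end;         /* one past the last address    */
    int vm_prot;              /* AREA_READ and/or AREA_WRITE  */
    struct area *vm_next;     /* next area in the list        */
};

enum fault_result classify_fault(const struct area *areas, uintptr_t va,
                                 int access)
{
    for (const struct area *a = areas; a != NULL; a = a->vm_next) {
        if (va >= a->vm_start && va < a->vm_end) {
            if ((a->vm_prot & access) != access)
                return FAULT_PROT;
            return FAULT_HANDLE;
        }
    }
    return FAULT_SEGV;
}
```

A write to a read-only text area yields FAULT_PROT, a read in the data area yields FAULT_HANDLE, and any address in a gap between areas yields FAULT_SEGV.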

  38. Memory Mapping • Linux (also, UNIX) initializes the contents of a virtual memory area by associating it with an object on disk • Create new vm_area_struct and page tables for area • Areas can be mapped to one of two types of objects (i.e., get their initial values from): • Regular file on disk (e.g., an executable object file) • The file is divided into page-sized pieces. • The initial contents of each virtual page come from one piece. • If the area is larger than the file section, then the area is padded with zeros. • Anonymous file (e.g., bss) • An area can be mapped to an anonymous file, created by the kernel. • The initial contents of these pages are initialized as zeros • Also called demand-zero pages • Key point: no virtual pages are copied into physical memory until they are referenced! • Known as “demand paging” • Crucial for time and space efficiency

  39. User-Level Memory Mapping • void *mmap(void *start, int len, int prot, int flags, int fd, int offset) • Maps len bytes starting at offset offset of the file specified by file descriptor fd, preferably at address start (usually 0 for don’t care). • prot: PROT_EXEC, PROT_READ, PROT_WRITE • flags: MAP_PRIVATE, MAP_SHARED, MAP_ANON • MAP_PRIVATE indicates a private copy-on-write object • MAP_SHARED indicates a shared object • MAP_ANON indicates an anonymous file (demand-zero pages); fd is ignored and is conventionally passed as -1 • Returns a pointer to the mapped area. • int munmap(void *start, int len) • Deletes the area starting at virtual address start with length len
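A minimal sketch of an anonymous mapping with mmap and munmap. It assumes a POSIX system where MAP_ANON is available (Linux with glibc, the BSDs, macOS); the wrapper names are illustrative. Modern prototypes use size_t and off_t for the length and offset, not the int shown on the slide.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* asks glibc to expose MAP_ANON */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes of private, demand-zero memory.
 * Returns NULL on failure instead of MAP_FAILED. */
void *map_zero_pages(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Remove the mapping; returns 0 on success, -1 on error. */
int unmap_pages(void *p, size_t len)
{
    return munmap(p, len);
}
```

The pages read as zero the first time they are touched, which is the demand-zero behaviour described for anonymous files.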

  40. Shared Objects • Why shared objects? • Many processes need to share identical read-only text areas. For example, • Each tcsh process has the same text area. • Standard library functions such as printf • It would be extremely wasteful for each process to keep duplicate copies in physical memory • An object can be mapped as either a shared object or a private object • Shared object • Any write to that area is visible to any other processes that have also mapped the shared object. • The changes are also reflected in the original object on disk. • A virtual memory area into which a shared object is mapped is called a shared area. • Private object • Any write to that area is not visible to other processes. • The changes are not reflected back to the object on disk. • Private objects are mapped into virtual memory using copy-on-write. • Only one copy of the private object is stored in physical memory. • The page table entries for the private area are flagged as read-only • Any write to some page in the private area triggers a protection fault • The handler needs to create a new copy of the page in physical memory and then restore the write permission to the page. • After the handler returns, the process proceeds normally
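The difference between shared and private mappings can be observed from user code. The sketch below maps one int anonymously, forks, lets the child write to it, and reports what the parent sees afterwards. It uses an anonymous mapping for brevity, so there is no on-disk object involved; the function name is illustrative and the code assumes a POSIX system with MAP_ANON.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* asks glibc to expose MAP_ANON */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* share_flag is MAP_SHARED or MAP_PRIVATE.
 * Returns the value the parent reads after the child stores 42,
 * or -1 if a system call fails. */
int parent_view_after_child_write(int share_flag)
{
    int *cell = mmap(NULL, sizeof *cell, PROT_READ | PROT_WRITE,
                     share_flag | MAP_ANON, -1, 0);
    if (cell == MAP_FAILED)
        return -1;
    *cell = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(cell, sizeof *cell);
        return -1;
    }
    if (pid == 0) {              /* child: write, then exit immediately */
        *cell = 42;
        _exit(0);
    }

    int seen = -1;
    if (waitpid(pid, NULL, 0) == pid)
        seen = *cell;            /* parent: read after the child is done */
    munmap(cell, sizeof *cell);
    return seen;
}
```

With MAP_SHARED the parent reads 42, because both processes map the same physical page. With MAP_PRIVATE the child's write lands in its own copy of the page, so the parent still reads 0.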

  41. Shared Object (figure: process 1 virtual memory, process 2 virtual memory, and physical memory holding one shared object)

  42. Private Object (figure: process 1 virtual memory, process 2 virtual memory, and physical memory holding a private copy-on-write object, shown before and after a write to a private copy-on-write page)

  43. Exec() Revisited • To run a new program p in the current process using exec(): • Free vm_area_struct’s and page tables for old areas. • Create new vm_area_struct’s and page tables for new areas. • stack, bss, data, text, shared libs. • text and data backed by ELF executable object file. • bss and stack initialized to zero. • Set PC to entry point in .text • Linux will swap in code and data pages as needed. process-specific data structures (page tables, task and mm structs) physical memory same for each process kernel code/data/stack kernel VM 0xc0 demand-zero stack %esp process VM Memory mapped region for shared libraries .data .text libc.so brk runtime heap (via malloc) demand-zero uninitialized data (.bss) initialized data (.data) .data program text (.text) .text p forbidden 0

  44. Fork() Revisited • To create a new process using fork(): • Make copies of the old process’s mm_struct, vm_area_struct’s, and page tables. • At this point the two processes are sharing all of their pages. • How to get separate spaces without copying all the virtual pages from one space to another? • “copy on write” technique. • copy-on-write • Make pages of writeable areas read-only • flag vm_area_struct’s for these areas as private “copy-on-write”. • Writes by either process to these pages will cause page faults. • Fault handler recognizes copy-on-write, makes a copy of the page, and restores write permissions. • Net result: • Copies are deferred until absolutely necessary (i.e., when one of the processes tries to modify a shared page).

  45. Dynamic Memory Allocation • Heap • An area of demand-zero memory that begins immediately after the bss area. • Allocator • Maintains the heap as a collection of various sized blocks. • Each block is a contiguous chunk of virtual memory that is either allocated or free. • Explicit allocator requires the application to allocate and free space • E.g., malloc and free in C • Implicit allocator requires the application to allocate, but not to free space • The allocator needs to detect when an allocated block is no longer being used • Implicit allocators are also known as garbage collectors. • The process of automatically freeing unused blocks is known as garbage collection. • E.g. garbage collection in Java, ML or Lisp

  46. Heap memory invisible to user code kernel virtual memory stack %esp Memory mapped region for shared libraries the “brk” ptr points to the top of the heap run-time heap (via malloc) uninitialized data (.bss) initialized data (.data) program text (.text) 0

  47. Malloc Package • #include <stdlib.h> • void *malloc(size_t size) • If successful: • Returns a pointer to a memory block of at least size bytes • (Typically) aligned to an 8-byte boundary so that any kind of data object can be contained in the block • If size == 0, may return NULL (implementation-defined) • If unsuccessful (e.g., the request is larger than the available virtual memory): returns NULL (0) and sets errno. • Two other variations: calloc (initializes the allocated memory to zero) and realloc • The allocator obtains heap memory with the mmap and munmap functions, or with the sbrk function • void *realloc(void *p, size_t size) • Changes the size of the block pointed to by p and returns a pointer to the new block. • Contents of the new block are unchanged up to the minimum of the old and new sizes. • void free(void *p) • Returns the block pointed to by p to the pool of available memory • p must come from a previous call to malloc, calloc, or realloc.
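One detail worth illustrating is that realloc returns NULL on failure and leaves the old block allocated, so assigning its result straight back to the only pointer would leak that block. The helper below is a sketch of the usual pattern with a temporary pointer; grow_int_array is a hypothetical name and is not part of the malloc package.

```c
#include <stdint.h>
#include <stdlib.h>

/* Grow *arr from old_n to new_n ints and zero the new elements.
 * Returns 0 on success. Returns -1 on failure, in which case *arr is
 * untouched and still valid. */
int grow_int_array(int **arr, size_t old_n, size_t new_n)
{
    if (new_n == 0 || new_n > SIZE_MAX / sizeof **arr)
        return -1;                       /* reject empty or overflowing sizes */

    int *bigger = realloc(*arr, new_n * sizeof **arr);
    if (bigger == NULL)
        return -1;                       /* old block is still owned by *arr */

    for (size_t i = old_n; i < new_n; i++)
        bigger[i] = 0;                   /* realloc does not zero new space */

    *arr = bigger;
    return 0;
}
```

A caller that starts from calloc(2, sizeof(int)) and grows to five elements keeps its first two values and sees zeros in the three new slots.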

  48. Malloc Example

      #include <stdio.h>
      #include <stdlib.h>

      void foo(int n, int m) {
          int i, *p;

          /* allocate a block of n ints */
          if ((p = (int *) malloc(n * sizeof(int))) == NULL) {
              perror("malloc");
              exit(1);
          }
          for (i = 0; i < n; i++)
              p[i] = i;

          /* grow the block so it holds m more ints */
          if ((p = (int *) realloc(p, (n+m) * sizeof(int))) == NULL) {
              perror("realloc");
              exit(1);
          }
          for (i = n; i < n+m; i++)
              p[i] = i;

          /* print new array */
          for (i = 0; i < n+m; i++)
              printf("%d\n", p[i]);

          free(p); /* return p to available memory pool */
      }

  49. Allocation Examples p1 = malloc(4) p2 = malloc(5) p3 = malloc(6) free(p2) p4 = malloc(2)

  50. Requirements (Explicit Allocators) • Applications: • Can issue arbitrary sequence of allocation and free requests • Free requests must correspond to an allocated block • Allocators • Can’t control the number or the size of allocated blocks • Must respond immediately to all allocation requests • i.e., can’t reorder or buffer requests • Must allocate blocks from free memory • i.e., can only place allocated blocks in free memory • Must align blocks so they satisfy all alignment requirements • 8 byte alignment for GNU malloc (libc malloc) on Linux boxes • Can only manipulate and modify free memory • Can’t move the allocated blocks once they are allocated • i.e., compaction is not allowed
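The alignment requirement is commonly met by rounding every request up to a multiple of the alignment before placing the block. A minimal sketch for the 8-byte case mentioned above; align_up is an illustrative name, and the bit trick relies on the alignment being a power of two.

```c
#include <stddef.h>

enum { ALIGNMENT = 8 };   /* 8-byte alignment, as stated for GNU malloc */

/* Round size up to the next multiple of ALIGNMENT. */
size_t align_up(size_t size)
{
    return (size + (ALIGNMENT - 1)) & ~(size_t)(ALIGNMENT - 1);
}
```

A request for 13 bytes would occupy 16 bytes of payload under this rule, while a request for 8 bytes stays at 8.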
