Practical, transparent operating system support for superpages Juan Navarro, Sitaram Iyer, Peter Druschel, Alan Cox OSDI 2002
What’s a Superpage? • A very large page size, much greater than the base page size • Supported by most computer architectures today • Machines that support superpages usually have several different page sizes, beginning with the base page and then in increasing sizes, each a power of 2 – today, some as large as a gigabyte.
Background Summary • Virtual memory automates the movement of a process’s address space (code and data) between disk and primary memory. • Virtual addresses are translated using information stored in the page table. • Page tables are stored in primary memory. • The extra memory references needed to read the page table degrade performance.
Translation Lookaside Buffer • TLB (translation lookaside buffer) – faster memory; caches portions of the page table • If most memory references “hit” in the TLB, the overhead of address translation is acceptable. • TLB coverage: the amount of memory that can be accessed strictly through TLB entries.
The Problem • Computer memories have increased in size faster than TLBs. • TLB coverage as a percentage of total memory has decreased over the years. • At the time this paper was written, most TLBs covered a megabyte or less of physical memory • Many applications have working sets that are not completely covered by the TLB • Result: more TLB misses, poorer performance.
The Solution • Superpages! • Increase coverage without increasing TLB size. • How? • By increasing amount of memory each TLB entry can map
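A rough worked example of the coverage argument. The TLB size (128 entries) and the page sizes (8 KB base, 4 MB superpage) are illustrative assumptions, not figures taken from these slides.

```python
KB = 1024
MB = 1024 * KB

def tlb_coverage(entries: int, page_size: int) -> int:
    """Bytes of memory reachable through TLB entries alone."""
    return entries * page_size

ENTRIES = 128  # assumed number of TLB entries

base_coverage = tlb_coverage(ENTRIES, 8 * KB)        # every entry maps a base page
superpage_coverage = tlb_coverage(ENTRIES, 4 * MB)   # every entry maps a superpage

print(base_coverage // MB, "MB covered with base pages")
print(superpage_coverage // MB, "MB covered with superpages")
```

The TLB itself is unchanged in both cases; only the amount of memory mapped per entry differs.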
Hardware-Imposed Constraints • Must have enough contiguous free memory to store each superpage • Superpage addresses (physical and virtual) must be aligned on the superpage size: e.g., a 64KB SP must start at address 0, or 64KB, or 128KB, etc. • TLB entry only has one set of bits (R, M, etc.) and thus can only provide coarse-grained info – not good for efficient page management.
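The alignment constraint can be sketched as a simple modulus check. The helper names `is_aligned` and `superpage_base` are hypothetical, chosen only for this illustration.

```python
KB = 1024

def is_aligned(addr: int, superpage_size: int) -> bool:
    """True if addr is a legal starting address for a superpage of this size."""
    return addr % superpage_size == 0

def superpage_base(addr: int, superpage_size: int) -> int:
    """Start address of the aligned superpage-sized region containing addr."""
    return addr - (addr % superpage_size)

print(is_aligned(128 * KB, 64 * KB))             # a multiple of 64 KB
print(is_aligned(96 * KB, 64 * KB))              # not a multiple of 64 KB
print(superpage_base(100 * KB, 64 * KB) // KB)   # region start, in KB
```

The same check applies to both the virtual and the physical address of the superpage.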
Design Issues • Issues for a superpage management system: • Storage allocation and fragmentation control • Promotion • Demotion • Eviction
Issues: Frame Allocation • When a page fault occurs, must choose a frame for the new page • In non-superpage systems any frame will do • In a superpage system we may later decide to include this page in a superpage – how does this affect the decision? • Possible approaches to allocation: • Reservation based • Relocation based
Reservation-based Allocation • When a page is initially loaded choose a superpage size and reserve aligned, contiguous frames to hold it. • As other pages are referenced, load them into the previously reserved frames • Will adjoining pages ever be needed by the program?
[Figure 2: Reservation-based allocation. The diagram shows an object mapping from the virtual address space (mapped pages, superpage alignment boundaries) to the physical address space (allocated frames, unused page frame reservation).]
Relocation-based Allocation • Wait until a superpage is formed, then move pages to contiguous locations • Incurs overhead of moving pages when superpages are created. • Tradeoff: relocation costs versus unused reservations (internal fragmentation)
Choosing a Page Size • Whether reservation-based or relocation-based allocation is used, a superpage size must also be chosen. • When a computer has several page sizes (base page + several larger sizes), how do we choose which size to use? • The issue: larger versus smaller
Choosing a Page Size • Possibilities: • the largest superpage size available • a superpage size that most closely matches the VM object the page belongs to • a smaller size, based on memory availability. • Tradeoff: possible performance gains from large page versus possible loss of contiguous physical memory space that may be needed later
Large Pages? • Large page sizes increase TLB coverage the most, optimize I/O. • But … they can also greatly increase the memory requirements of a process • Some pages are only partially filled • Small localities = a kind of internal fragmentation (page only partially referenced) • If pages are not filled or have internal fragmentation, paging traffic can actually increase instead of decrease.
Small Pages? • Small page sizes reduce internal fragmentation (the amount of wasted space in an allocated block & the amount of unreferenced content in a loaded page). • But … they have all the problems that large pages solve, and they may also cause more page faults.
So Why Not Use Multiple Page Sizes? • Memory management is more complex • Uniform page size is simple • Multiple page sizes cause external fragmentation • It’s hard to maintain blocks of contiguous free space to accommodate large superpages.
[Figure: External fragmentation. Memory holds superpages SP1 through SP4; SP4 leaves and is replaced by SP5, then SP2 leaves. The free space that remains is not contiguous, so there is no room for a large superpage.]
Fragmentation Control • Memory can become fragmented with reservation-based approach and pages of various sizes. • Possible solutions: • Page out or overwrite areas of memory that haven’t been used recently • Preempt unused portions of existing reservations
Issues: Promotion • Initially, base pages are treated normally. • Promote when enough pages have been loaded to justify creating a superpage: • Combine TLB entries into one entry • Load remaining pages, if necessary, to fill reservation • Promotion may be incremental • Tradeoff: early promotion (before all base pages have been faulted in) reduces TLB misses but wastes memory if not all pages of the superpage are needed; late promotion delays the benefits of greater TLB coverage.
Issues • Issues for a superpage management system: • Allocation and fragmentation control – done • Promotion – done • Demotion & eviction
Issues: Demotion • Reduce superpage size • To individual base pages • To a smaller superpage • All or some of the base pages may have been chosen for eviction • Difficulty: the use (reference) and dirty bits in a superpage’s TLB entry describe the whole superpage, so they are less helpful than bits that refer to a single base page. • If the dirty bit is set, the entire superpage must be written to disk, even if only part of it has changed.
Design of System Proposed by Navarro, et al. • The system discussed in this paper is reservation-based. • It supports multiple superpage sizes to reduce internal fragmentation • Effect on external fragmentation? • It demotes infrequently referenced pages to reclaim memory frames • It is able to maintain contiguity (large blocks of contiguous free frames) without using compaction
Design Decisions in This System • With respect to allocation and fragmentation • Storage Management • Reservation-based allocation • Choosing a page size • Fragmentation control
Storage Management • Free space (available for reservations) is stored on multiple lists, ordered by superpage size • Buddy system is used for allocation • Partially filled reservations are kept on a multi-list (one list for each page size) by largest page size that can be obtained by preempting unused portion • Population maps track allocated portions of reservations
Frame Allocation • A page fault triggers a decision: does the page have an existing reservation or not? • If not, then • select a preferred SP size, • locate a set of contiguous, aligned frames • load the page into the correct (aligned) frame • enter the mapping in the page table • reserve the remaining frames • Or, load the page into a previously reserved frame & enter mapping in PT
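The fault-handling steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `Reservation` and `ReservingAllocator` are hypothetical, sizes are counted in base pages, and a simple ever-increasing frame counter stands in for the buddy allocator.

```python
from dataclasses import dataclass, field

@dataclass
class Reservation:
    vstart: int       # first virtual page number, aligned to `size`
    base_frame: int   # first physical frame, aligned to `size`
    size: int         # length of the reservation, in base pages
    populated: set = field(default_factory=set)

class ReservingAllocator:
    def __init__(self):
        self.next_free_frame = 0   # stand-in for a real free-list manager
        self.reservations = []
        self.page_table = {}       # virtual page number -> physical frame

    def _reserve(self, vpn, size):
        # Round the next free frame up to the alignment the superpage needs.
        base = (self.next_free_frame + size - 1) // size * size
        self.next_free_frame = base + size
        res = Reservation(vstart=vpn - vpn % size, base_frame=base, size=size)
        self.reservations.append(res)
        return res

    def handle_fault(self, vpn, preferred_size):
        # Does the faulting page fall inside an existing reservation?
        res = next((r for r in self.reservations
                    if r.vstart <= vpn < r.vstart + r.size), None)
        if res is None:
            res = self._reserve(vpn, preferred_size)
        # The page goes at the same offset in the frames as in the virtual range.
        frame = res.base_frame + (vpn - res.vstart)
        res.populated.add(vpn)
        self.page_table[vpn] = frame
        return frame

alloc = ReservingAllocator()
print(alloc.handle_fault(10, preferred_size=8))  # creates a reservation for pages 8..15
print(alloc.handle_fault(13, preferred_size=8))  # reuses that reservation
```

Loading each page at its matching offset is what lets the populated frames later be mapped by a single superpage entry without copying.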
Choosing a Superpage Size in the Navarro System • Since the decision is made early, it can’t be based on the process’s behavior. • Base the decision on the memory object type; prefer too large to too small • If the chosen size is too large, it is easy to reclaim the unneeded space • If the chosen size is too small, relocation is needed
Guidelines for Choosing Superpage Size • For fixed-size memory objects (e.g., code segments), reserve the largest superpage that does not extend beyond the end of the object. • For dynamic-sized objects (stacks, heaps) that grow one page at a time: allocate extra space for growth.
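One way to read these guidelines is sketched below. The size list `SIZES` (in base pages) and the function `preferred_size` are assumptions made for the example; an actual policy would likely also cap how far a growable object's reservation may extend.

```python
SIZES = (1, 8, 64, 512)  # hypothetical page sizes, counted in base pages

def preferred_size(fault, obj_start, obj_end, growable=False):
    """Pick a reservation size for the page `fault` of an object that
    occupies virtual pages [obj_start, obj_end)."""
    for size in sorted(SIZES, reverse=True):
        start = fault - fault % size   # aligned region containing the fault
        end = start + size
        if start < obj_start:
            continue                   # would reach before the object
        if end <= obj_end or growable:
            return size                # fits, or the object is allowed to grow into it
    return 1

print(preferred_size(10, 0, 100))                 # fixed-size object
print(preferred_size(70, 0, 100))                 # fault near the end of the object
print(preferred_size(70, 0, 100, growable=True))  # heap or stack
```

For the fixed-size object, the fault near the end gets a smaller reservation because the larger aligned region would run past the object.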
Preempting Reservations in the Navarro System • After a page fault, if the guidelines call for a superpage that is too large for any available free block: • Reserve a smaller size superpage or • Preempt an existing reservation that has enough unallocated frames to satisfy the request • This system uses preemption wherever possible.
Preemption Policy - LRA • Which reservation is preempted if more than one can satisfy the request? • Choose the one “whose most recent page allocation occurred least recently” - LRA • Reason: spatial locality suggests that related pages will all be accessed fairly closely together in time (e.g., arrays, memory-mapped files). If a reservation hasn’t added new pages recently, it’s unlikely to do so any time soon.
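The LRA choice can be sketched as a minimum over last-allocation times. The `Reservation` fields and the function `choose_victim` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Reservation:
    name: str
    unallocated_frames: int  # frames reserved but not yet populated
    last_alloc_time: int     # time of the most recent page allocation

def choose_victim(reservations: List[Reservation],
                  frames_needed: int) -> Optional[Reservation]:
    """Among reservations with enough unallocated frames, pick the one whose
    most recent page allocation occurred least recently."""
    candidates = [r for r in reservations
                  if r.unallocated_frames >= frames_needed]
    return min(candidates, key=lambda r: r.last_alloc_time, default=None)

reservations = [
    Reservation("A", unallocated_frames=6, last_alloc_time=50),
    Reservation("B", unallocated_frames=6, last_alloc_time=10),
    Reservation("C", unallocated_frames=2, last_alloc_time=1),
]
victim = choose_victim(reservations, frames_needed=4)
print(victim.name if victim else None)
```

Reservation C has the oldest allocation but too few unallocated frames, so it is not a candidate.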
Fragmentation Control • Contiguity (of storage) is a contended resource • Memory becomes fragmented due to • Multiple page sizes • Wired pages (can’t be paged out) • Result: not enough large, properly aligned blocks of free memory. • Navarro et al. propose several implementation techniques to address this problem
Fragmentation Control in the Navarro System* • The “buddy allocator” (free list manager) maintains multiple lists of free blocks, ordered by size • When possible, coalesce adjacent blocks of free memory to form larger blocks. • Modify the page replacement daemon to include contiguity as one of the factors to be considered.
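A minimal binary buddy allocator, to illustrate per-size free lists and coalescing. It is a generic textbook sketch under the assumption that every block size is a power of two base frames; it is not the FreeBSD allocator used by the authors, and the class name `BuddyAllocator` is hypothetical.

```python
class BuddyAllocator:
    def __init__(self, max_order):
        # free[o] holds the start frames of free blocks of 2**o frames.
        self.max_order = max_order
        self.free = {o: set() for o in range(max_order + 1)}
        self.free[max_order].add(0)  # initially one maximal free block

    def alloc(self, order):
        """Return the start frame of a block of 2**order frames, or None."""
        for o in range(order, self.max_order + 1):
            if self.free[o]:
                start = min(self.free[o])
                self.free[o].remove(start)
                # Split down, returning each upper half to its free list.
                while o > order:
                    o -= 1
                    self.free[o].add(start + (1 << o))
                return start
        return None

    def release(self, start, order):
        """Free a block and coalesce it with its buddy as far as possible."""
        while order < self.max_order:
            buddy = start ^ (1 << order)
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free[order].add(start)

b = BuddyAllocator(max_order=3)   # 8 frames in total
x = b.alloc(0)
y = b.alloc(0)
print(b.alloc(3))                 # no 8-frame block while x and y are in use
b.release(x, 0)
b.release(y, 0)
print(b.alloc(3))                 # blocks coalesced back into one
```

Coalescing only helps when both buddies are free, which is why the slide also calls for a contiguity-aware page replacement daemon.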
Navarro System: Design Decisions • With respect to promotion, demotion & eviction • Incremental promotions • Speculative demotions • Paging out dirty superpages
Promotion & Demotion • Navarro et al. implement incremental promotion • e.g., if 4 aligned pages of a 16-page reservation become filled, promote to a mid-size superpage • Demotion: when a base page is evicted, its superpage is demoted. • Speculative demotion: demote active superpages to determine whether the whole page is still in use or just parts
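Incremental promotion can be sketched with a flat set standing in for the population map. The function `promotable_size` and the size tuple `(4, 16)` are assumptions that mirror the 4-of-16 example above.

```python
def promotable_size(populated, page, sizes=(4, 16)):
    """Largest superpage size (in base pages) whose aligned region around
    `page` is fully populated; 1 means no promotion is possible yet."""
    best = 1
    for size in sorted(sizes):
        start = page - page % size
        if all(p in populated for p in range(start, start + size)):
            best = size
        else:
            break  # a larger region cannot be full if a smaller one is not
    return best

populated = {0, 1, 2}
print(promotable_size(populated, 2))   # region 0..3 still has a hole
populated.add(3)
print(promotable_size(populated, 2))   # 4 aligned pages are now filled
populated.update(range(4, 16))
print(promotable_size(populated, 2))   # the whole 16-page reservation is filled
```

Each promotion step only happens once the corresponding aligned region is completely populated, so no unreferenced page is mapped.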
Paging Out Dirty Superpages* • If a dirty superpage is to be flushed to disk, there is no way to tell whether one base page is dirty or all of them. • Writing out the entire superpage is a huge performance hit. • Navarro et al.’s solution: don’t allow writes to a clean superpage. • If a process tries to write to a clean SP, demote the SP. • Repromote later if all base pages are dirty. • They also experimented with a content hash that could tell whether a page had been changed
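The demote-on-first-write idea can be sketched as a small state machine. The class `Superpage` and its fields are hypothetical; a real system would act on page-table and TLB entries rather than Python sets.

```python
class Superpage:
    def __init__(self, n_pages):
        self.n_pages = n_pages
        self.promoted = True   # currently mapped by a single superpage entry
        self.dirty = set()     # base pages known to be modified

    def write(self, page):
        # First write to a clean superpage: demote, so that dirtiness is
        # tracked per base page from here on.
        if self.promoted and not self.dirty:
            self.promoted = False
        self.dirty.add(page)
        # Once every base page is dirty, nothing is lost by re-promoting.
        if len(self.dirty) == self.n_pages:
            self.promoted = True

    def pages_to_flush(self):
        """Base pages that must be written to disk on eviction."""
        return sorted(self.dirty)

sp = Superpage(n_pages=4)
sp.write(1)
print(sp.promoted, sp.pages_to_flush())   # demoted; only one page to write
for p in (0, 2, 3):
    sp.write(p)
print(sp.promoted, sp.pages_to_flush())   # all dirty; promoted again
```

Between demotion and re-promotion the mapping uses more TLB entries, which is the price paid for avoiding whole-superpage writes.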
Goal of Superpage Management Systems • Good TLB coverage with minimal internal fragmentation • Navarro et al.’s conclusion: create the largest superpage possible that isn’t larger than the size of the memory object (except for stack/heap). • If there isn’t enough memory, preempt existing reservations (these pages had their chance)
Current Usage • Superpages were most often used at the time this paper was written to store portions of the kernel and various buffers. • Reason: the memory requirements for these objects are static and can be known in advance. • Superpage size can be chosen to fit the object. • More likely to be implemented in clusters and large servers than in desktop machines.
Current Research • This paper focuses on supporting superpage use in application memory, as opposed to kernel memory. • An ongoing research area: memory compaction – whenever there are idle CPU cycles, work to establish large contiguous blocks of free memory • Compare to disk management
Summary: Potential Advantages of Superpages • Ideally, superpages can improve performance • Without increasing size of TLB (which would be expensive and increase TLB access time) • Without increasing base page size (which can lead to internal fragmentation) • Superpages allow use of small (base) and large (super) page sizes at the same time.
Summary - Tradeoff • Large superpages increase TLB coverage • Large superpages are more likely to fragment memory. (Why?) • Benefits of large superpages must be weighed against “contiguity restoration techniques” • Pages loaded into reserved areas must be loaded at the proper offset. • Must be enough space for the entire superpage • More overhead for free space management
Authors’ Conclusions • Can achieve 30%-60% improvement in performance, based on tests using an accepted set of benchmark programs as well as actual applications. • Must employ contiguity restoration techniques: demotion, preemption, contiguity-aware page replacement • Must be able to support a variety of page sizes
Conclusion • Superpage management can be transparently integrated into an existing OS (FreeBSD, in this case). • “hooks” connect the OS to the superpage module at critical events: page faults, page allocation, page replacement, etc. • Tests show this technique scales well, according to authors.
Follow-up • “Supporting superpage allocation without additional hardware support”, Mel Gorman, Patrick Healy, Proceedings of the 7th International Symposium on Memory Management, 2008
Premise • Fragmentation control is essential for successful implementation of superpages. • Navarro’s approach doesn’t always work. • Major hindrance: “wired” pages – pages that can’t be paged out or moved – tend to become scattered throughout memory • (Navarro addressed this issue; proposed to monitor creation of kernel wired pages, cluster them in one location.)
Another problem: page replacement processes that don’t consider superpage structure • They reclaim pages based on age and do not consider contiguity. [Note: the Navarro system does claim to take this into consideration, for example by activating the paging daemon whenever the system fails to satisfy a request for a certain superpage size]
GPBM • Grouping Pages By Mobility (GPBM) is a placement policy described by Gorman & Healy that allocates frames to pages based on whether or not the pages can later be relocated. • Treats the address space as if divided into arenas, which correspond in size to the largest superpage.
Page Mobility Types • Movable – no restrictions; can be relocated as long as PT is updated • Reclaimable – kernel pages that can be added to the free list (certain kinds of caches, for example) • Temporary – pages that are known to be needed for a short time; treated as reclaimable • Non-reclaimable – wired pages
How are the classes used? • Group pages of the same type into arenas of the same type. • The number of movable and reclaimable arenas has the most effect on the number of superpages that can be allocated. • Contiguity-aware page replacement is used.
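A toy sketch of grouping by mobility: each arena takes on the type of the first page placed in it, and later pages of that type go to the same arena. The names `Mobility` and `ArenaAllocator` are hypothetical, freeing and fallback between types are omitted, and this is not Gorman and Healy's implementation.

```python
from enum import Enum

class Mobility(Enum):
    MOVABLE = "movable"
    RECLAIMABLE = "reclaimable"          # temporary pages are treated the same way
    NON_RECLAIMABLE = "non-reclaimable"  # wired pages

class ArenaAllocator:
    def __init__(self, n_arenas, arena_size):
        self.arena_size = arena_size           # frames per arena (largest superpage)
        self.arena_type = [None] * n_arenas    # None marks an unclaimed arena
        self.used = [0] * n_arenas

    def _take(self, i):
        frame = i * self.arena_size + self.used[i]
        self.used[i] += 1
        return frame

    def alloc(self, mobility):
        """Return a frame number for a page of the given mobility type."""
        # Prefer an arena that already holds pages of this type.
        for i, kind in enumerate(self.arena_type):
            if kind is mobility and self.used[i] < self.arena_size:
                return self._take(i)
        # Otherwise claim an unused arena for this type.
        for i, kind in enumerate(self.arena_type):
            if kind is None:
                self.arena_type[i] = mobility
                return self._take(i)
        return None  # a real allocator would fall back to another arena here

arenas = ArenaAllocator(n_arenas=2, arena_size=4)
print(arenas.alloc(Mobility.MOVABLE))          # claims arena 0
print(arenas.alloc(Mobility.NON_RECLAIMABLE))  # wired page goes to arena 1
print(arenas.alloc(Mobility.MOVABLE))          # stays with the other movable page
```

Keeping wired pages out of movable arenas means an entire movable arena can later be emptied to form a superpage.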
Summary • Superpages promise performance improvement but so far no generally accepted approach for user level pages. • Reservation based approach seems to be most popular • Contiguity is the biggest problem • Some researchers propose hardware solutions, such as re-designing the memory controller to allow holes in SPs, or re-designing TLB to permit SPs that consist of non-contiguous base pages. • To date, no hardware solutions implemented.