Efficient Software-Based Fault Isolation—sandboxing Presented by Carl Yao
Revisit last week’s questions • What is the motivation for a modular operating system? • Extensibility (vendors have the freedom to enhance parts of the OS) • What is the motivation for using hardware protection at module boundaries? • Fault isolation (extension modules can render the OS unreliable) • As a result of hardware protection, why does the performance of modular systems degrade? • RPC cost • Purpose of sandboxing: realizing fault isolation without the high cost of RPC
Different approaches to reduce cost of RPC • LRPC: the same thread runs in both the caller and callee domains, with less data copying than RPC, but it still pays the context switch cost (sandboxing eliminates the context switch cost) • Tags: allow multiple address spaces to share the TLB, but rely on a specialized architecture and do not reduce the cost of register management or system calls (sandboxing incurs little of these costs)
Different approaches to realize fault isolation • Restrictive programming languages (Pilot)—eliminates possible faults, thus no need for fault isolation; but no other languages can be used in these systems (sandboxing is supposed to be language-independent) • Interpreter—filters faults in a programming language for OS; but again is language-dependent and has high filtering cost (sandboxing does not have high filtering cost, but it does have a 4% encapsulation cost)
What is sandboxing? • An assembly-language-level software approach to implementing fault isolation within a single address space • Load the code and data for a distrusted module into its own fault domain, a contiguous region of memory within the application’s address space. Each fault domain has a unique identifier, which is used to control its access to process resources such as file descriptors. • Modify the object code of a distrusted module to prevent it from writing or jumping to an address outside its fault domain. A cross-fault-domain RPC interface (much cheaper than conventional RPC) is used for communication between fault domains.
What is a fault domain? • An application’s virtual address space is divided into segments, aligned so that all virtual addresses within a segment share a unique pattern of upper bits, called the segment identifier. A fault domain consists of two segments, one for a distrusted module’s code, the other for its static data, heap and stack. • Example: If 101 is the segment identifier, then 10111001 is inside this segment, but 11011001 is not. • Next we’ll talk about segment matching.
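The membership test in the example above can be sketched in a few lines of Python. This is only a toy model: it assumes 8-bit addresses and a 3-bit segment identifier, as in the slide's example, and the names ADDR_BITS, SEG_BITS, segment_id and in_segment are illustrative rather than taken from the paper.

```python
ADDR_BITS = 8                 # toy address width, matching the 8-bit example
SEG_BITS = 3                  # number of upper bits forming the segment identifier
SHIFT = ADDR_BITS - SEG_BITS  # how far to shift to isolate the identifier


def segment_id(addr: int) -> int:
    """Return the upper SEG_BITS bits of an address."""
    return addr >> SHIFT


def in_segment(addr: int, seg_id: int) -> bool:
    """True if addr carries the given segment identifier."""
    return segment_id(addr) == seg_id


print(in_segment(0b10111001, 0b101))  # True
print(in_segment(0b11011001, 0b101))  # False
```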
Statically verifying jump and store instructions • Say we have an instruction: store value, register A • If the value of register A can be determined at compile time, we can statically verify whether the target address is inside or outside a given segment. • However, if the value of register A cannot be determined until run time (for example, because it depends on user input), we cannot statically verify that this instruction is safe.
What is segment matching? • Some instructions jump to or store to an address which cannot be statically verified. This is unsafe because it could corrupt critical data. One approach to prevent this is to insert checking code before every unsafe instruction. The checking code determines whether the unsafe instruction’s target address has the correct segment identifier or not. See next page for example.
Example of segment matching • Say an unsafe instruction is located in segment 101 but wants to write to segment 100. Initially, target address = 10001111; segment-reg = 101; shift-reg = 5 (binary 101). • dedicated-reg <= target address, so dedicated-reg = 10001111. • scratch-reg <= dedicated-reg >> shift-reg, so scratch-reg = 100. • Compare scratch-reg with segment-reg: 100 vs. 101, not equal! A trap is generated to trigger a system error routine outside the distrusted module’s fault domain. • Had they matched, we would know the target address is inside the same segment, and we could run: store value, dedicated-reg
Why use dedicated registers? • Why not simply: scratch-reg <= (target address >> shift-reg); compare scratch-reg and segment-reg; trap if not equal; store value, target address • Because distrusted code could jump directly to the final store instruction and bypass the checking instructions. Using a dedicated register prevents this: the store only ever uses the dedicated register, which is set only by the inserted checking code and never by the distrusted module’s own code.
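The checking sequence can be simulated as follows. This is a minimal sketch rather than the paper's actual instruction sequence: memory is modelled as a Python dict, the trap is modelled as an exception, and the names checked_store and SegmentFault are made up for illustration.

```python
class SegmentFault(Exception):
    """Stands in for the trap to the error routine outside the fault domain."""


def checked_store(memory, target_addr, value, segment_reg, shift_reg):
    """Store value at target_addr only if it lies in the caller's segment."""
    dedicated_reg = target_addr               # dedicated-reg <= target address
    scratch_reg = dedicated_reg >> shift_reg  # isolate the segment identifier
    if scratch_reg != segment_reg:            # compare with this domain's identifier
        raise SegmentFault(
            f"store to {dedicated_reg:08b} is outside segment {segment_reg:03b}"
        )
    memory[dedicated_reg] = value             # store value, dedicated-reg


memory = {}
checked_store(memory, 0b10101111, 7, segment_reg=0b101, shift_reg=5)  # allowed
try:
    checked_store(memory, 0b10001111, 9, segment_reg=0b101, shift_reg=5)
except SegmentFault as fault:
    print("trap:", fault)
```

Because the check raises before the store, the offending address is still available to report, which is what lets segment matching pinpoint the faulting instruction.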
What is sandboxing? • Segment matching can pinpoint the offending instruction. Sandboxing reduces runtime overhead further, at the cost of providing no info about the source of faults. See next page for an example.
Example of sandboxing • Again, say an unsafe instruction is located in segment 101 but wants to write to segment 100. Initially, target-reg = 10001111; segment-reg = 10100000 (segment identifier 101 in the upper bits); and-mask-reg = 00011111. • dedicated-reg <= target-reg & and-mask-reg, so dedicated-reg = 00001111. • dedicated-reg <= dedicated-reg | segment-reg, so dedicated-reg = 10101111. • store value, dedicated-reg • Instead of writing to the intended unsafe location 10001111, sandboxing changes the target address to 10101111, which is in the same fault domain as the writing instruction. This fault domain will probably be corrupted, but other fault domains are not affected.
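The two-instruction sandboxing sequence from this example can be sketched as below. It assumes the same toy 8-bit layout, with the segment register holding the identifier already positioned in the upper three bits; the function name sandbox_address and the constants are illustrative.

```python
def sandbox_address(target_reg, and_mask_reg, segment_reg):
    """Force an address into the caller's segment instead of checking it."""
    dedicated_reg = target_reg & and_mask_reg    # clear the segment-identifier bits
    dedicated_reg = dedicated_reg | segment_reg  # set this domain's identifier
    return dedicated_reg


AND_MASK = 0b00011111  # keeps only the five offset bits
SEGMENT = 0b10100000   # identifier 101 placed in the upper three bits

print(f"{sandbox_address(0b10001111, AND_MASK, SEGMENT):08b}")  # 10101111
print(f"{sandbox_address(0b10110011, AND_MASK, SEGMENT):08b}")  # 10110011
```

An address that already belongs to the segment passes through unchanged, and an illegal one is silently redirected, so no comparison or trap is needed.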
Guard Zone Optimization • RISC architectures include a register-plus-offset instruction mode. Consider instruction “store value, offset(reg)”. To avoid calculating reg+offset, we directly sandbox the reg, at the cost of creating guard zones at the top and bottom of each segment.
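A toy model may help show why guard zones make it safe to sandbox only the base register. The sketch assumes a 256-byte segment, a hypothetical maximum offset magnitude of 16, and guard zones that are left unmapped so the hardware faults on access; all names and sizes are illustrative, not the paper's.

```python
OFFSET_BITS = 8              # toy segment size: 256 bytes
SEG_SIZE = 1 << OFFSET_BITS
GUARD = 16                   # hypothetical largest |offset| an instruction can encode


def sandbox(reg, seg_id):
    """Force reg into the segment with identifier seg_id."""
    return (reg & (SEG_SIZE - 1)) | (seg_id << OFFSET_BITS)


def classify_store(reg, offset, seg_id):
    """Sandbox only the base register, then add the immediate offset."""
    if not -GUARD <= offset <= GUARD:
        raise ValueError("offset exceeds what the guard zones cover")
    addr = sandbox(reg, seg_id) + offset
    base = seg_id << OFFSET_BITS
    if base <= addr < base + SEG_SIZE:
        return "in segment"
    # The sandboxed base is inside the segment and |offset| <= GUARD,
    # so the only other possibility is one of the two guard zones.
    return "guard zone"


print(classify_store(0x1F0, 8, seg_id=2))   # in segment
print(classify_store(0x1FF, 15, seg_id=2))  # guard zone
```

The effective address can overshoot the segment by at most the offset range, so it never reaches a neighbouring segment.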
Optimizing stack pointer sandboxing • The stack pointer in a segment is much more often read than set. So the MIPS stack pointer is treated as a dedicated register. The stack pointer is only sandboxed when it is set, saving the sandboxing cost when it is read. • We can avoid sandboxing the stack pointer after it is modified by a small constant offset as long as the modified stack pointer is used as part of a load or store address before the next control transfer instruction.
Preventing one fault domain from corrupting another in the same address space • Solution 1: Modify the OS to know about fault domains. This would make sandboxing non-portable, so it is not used. • Solution 2: Distrusted modules must access system resources only through cross-fault-domain RPC. One fault domain is reserved to hold trusted arbitration code. This fault domain acts as a proxy for all other fault domains in the address space when system calls are made.
Sharing data among fault domains: Lazy Pointer Swizzling • A technique to share data among fault domains in the same address space with no additional runtime overhead • Hardware page tables are modified to map the shared memory regions into every segment at the same offset (aliasing). • When one segment makes changes to the shared data, all segments immediately see the changes because the data is aliased.
Implementing software encapsulation • The authors have not developed a tool that rewrites object code to implement sandboxing. Instead they modified the gcc compiler, which made the prototype language-dependent at the time the paper was finished. • A program is broken into unsafe regions. At the exit of each unsafe region, a load-time verifier checks that any dedicated register modified in the region is valid. If not, the code is rejected.
Low-cost cross fault domain communication • The only way for control to escape a fault domain is via a jump table, which guarantees that every target address is a legal entry point. • A call stub and a return stub are created for each pair of fault domains. Parameters are copied from the caller’s stub to the callee’s stub, then read by the callee. The stubs are also responsible for managing machine state and registers. • Fatal errors are handled by the UNIX signal facility.
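The aliasing idea can be illustrated with a toy memory model. It assumes a 5-bit segment offset and a shared region occupying the same offset range in every segment; the AliasedMemory class is a stand-in for the modified page tables, and every name here is hypothetical.

```python
OFFSET_BITS = 5
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def sandbox(addr, seg_id):
    """Force addr into the segment with identifier seg_id."""
    return (addr & OFFSET_MASK) | (seg_id << OFFSET_BITS)


class AliasedMemory:
    """Toy page table: offsets in [shared_lo, shared_hi) of every segment
    map to a single backing store; all other addresses are private."""

    def __init__(self, shared_lo, shared_hi):
        self.shared_lo, self.shared_hi = shared_lo, shared_hi
        self.shared = {}
        self.private = {}

    def _slot(self, addr):
        offset = addr & OFFSET_MASK
        if self.shared_lo <= offset < self.shared_hi:
            return self.shared, offset  # the same cell in every segment
        return self.private, addr

    def store(self, addr, value):
        table, key = self._slot(addr)
        table[key] = value

    def load(self, addr):
        table, key = self._slot(addr)
        return table.get(key, 0)


mem = AliasedMemory(shared_lo=0b10000, shared_hi=0b11000)
ptr_in_a = sandbox(0b10010, seg_id=0b101)   # domain A's view of the shared cell
mem.store(ptr_in_a, 42)
ptr_in_b = sandbox(ptr_in_a, seg_id=0b100)  # domain B sandboxes the same pointer
print(mem.load(ptr_in_b))                   # 42
```

In this model a pointer handed from one domain to another needs no explicit translation: sandboxing rewrites its upper bits, and the aliasing makes the resulting address refer to the same data.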
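The jump-table and stub mechanism can be sketched as follows. This is an illustrative model only: Python callables stand in for entry points, an immutable tuple stands in for a table the distrusted module cannot modify, and the names JumpTable, make_call_stub and exported_append are hypothetical.

```python
import copy


class IllegalTransfer(Exception):
    """Raised when a module asks for an entry that is not in its jump table."""


class JumpTable:
    def __init__(self, entry_points):
        # Immutable tuple: the distrusted module cannot add or change entries.
        self._entries = tuple(entry_points)

    def transfer(self, index, *args):
        if not 0 <= index < len(self._entries):
            raise IllegalTransfer(f"no entry point {index}")
        return self._entries[index](*args)


def make_call_stub(table, index):
    """Build a stub that copies arguments across the domain boundary."""
    def stub(*args):
        callee_args = copy.deepcopy(args)  # the callee only sees copies
        result = table.transfer(index, *callee_args)
        return copy.deepcopy(result)       # the return path copies the result back
    return stub


def exported_append(items, x):
    """Hypothetical exported procedure living in another fault domain."""
    items.append(x)
    return items


table = JumpTable([exported_append])
append_rpc = make_call_stub(table, 0)
mine = [1, 2]
print(append_rpc(mine, 3))  # [1, 2, 3]
print(mine)                 # [1, 2]
```

The caller's list is untouched because the callee operated on a copy, mirroring the way the stubs copy parameters between domains.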
Performance lab test result • Sandboxing lowers RPC cost by more than an order of magnitude. • Sandboxing incurs an average of 4% execution time overhead on a DECstation 5000/240 and a DEC Alpha 400.
Performance lab test result • Combining the time saved and the overhead incurred, here is the result from running an application on the POSTGRES database system
Summary • Sandboxing is a tradeoff between RPC cost and code execution cost. But because normal programs have a large amount of inter-process communications, sandboxing is the better option in most cases. • Sandboxing is a tradeoff between level of trust and encapsulation overhead.
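The first tradeoff can be written down as a simple break-even calculation. The model below is an assumption made for illustration, not the paper's analysis: it charges a flat fractional overhead on the time spent in distrusted code (defaulting to the 4% figure quoted earlier) and credits the difference between a hardware RPC and a cross-fault-domain call for every crossing. The timings in the usage lines are made-up placeholders, not measurements.

```python
def sfi_net_saving(n_calls, t_rpc, t_sfi_call, t_exec, overhead=0.04):
    """Seconds saved by sandboxing; a negative result means it costs more."""
    return n_calls * (t_rpc - t_sfi_call) - overhead * t_exec


def break_even_calls(t_rpc, t_sfi_call, t_exec, overhead=0.04):
    """Number of domain crossings at which sandboxing starts to pay off."""
    return overhead * t_exec / (t_rpc - t_sfi_call)


# Placeholder timings (seconds), chosen only to exercise the formula.
T_RPC, T_SFI_CALL, T_EXEC = 200e-6, 2e-6, 1.0
print(sfi_net_saving(1000, T_RPC, T_SFI_CALL, T_EXEC) > 0)  # True
print(sfi_net_saving(0, T_RPC, T_SFI_CALL, T_EXEC) > 0)     # False
```

Under this model a module that rarely crosses domains pays the execution overhead for nothing, while one that crosses often comes out ahead.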