Review of: Operating System Support for Planetary-Scale Network Services


Presentation Transcript


  1. CS7701: Research Seminar on Networking http://arl.wustl.edu/~jst/cse/770/ Review of: Operating System Support for Planetary-Scale Network Services Paper by: Andy Bavier, Scott Karlin, Steve Muir, Larry Peterson, Tammo Spalink, Mike Wawrzoniak (Princeton) Mic Bowman, Brent Chun, Timothy Roscoe (Intel) David Culler (Berkeley) Published in: First Symposium on Network Systems Design and Implementation (NSDI), 2004 Additional presentation information: www.planet-lab.org Reviewed by: Chip Kastner Discussion Leader: Manfred Georg

  2. Outline • Introduction and Virtualization • The Operating System • Motivation and overview • Virtualization • Isolation / Resource Management • Evolution • Evaluation • Conclusions

  3. Introduction • This paper examines PlanetLab’s operating system • What is PlanetLab? • 441 machines at 201 sites in 25 countries • Supporting 450 research projects • Each machine runs a Linux-based OS and other tools • Why does it exist? • Mainly a testbed for planetary-scale network services • Lets researchers test services in real-world conditions at a large scale • Also a deployment platform • Who uses it? • AT&T labs, HP labs, Intel • Many prestigious universities – including Wash U

  4. Introduction • A PlanetLab user sees a private set of machines (nodes) on which they can run their applications [Diagram: two users, each seeing their own private set of nodes]

  5. Introduction • Reality is a bit different • PlanetLab is an overlay network [Diagram: sites such as Intel Labs, Princeton, Wash U, and Tel-Aviv University connected across the Internet]

  6. Introduction • Users are granted slices – Sets of nodes on which users get a portion of the resources [Diagram: a slice spanning Intel Labs, Princeton, Wash U, and Tel-Aviv University across the Internet]

  7. Introduction • So, a PlanetLab user is managing a set of distributed Virtual Machines (VMs) [Diagram: two users, each managing their own set of VMs across the nodes]

  8. The Operating System • The operating system was designed around two high-level goals • Distributed virtualization: Each service runs in an isolated section of PlanetLab’s resources • Unbundled management: The OS is separate from services that support the infrastructure

  9. The Operating System • Researchers wanted to use PlanetLab as soon as the first machines were set up • No time to build a new OS • PlanetLab designers chose to deploy Linux on the machines • Linux acts as a Virtual Machine Monitor • Designers slowly transform Linux via kernel extensions

  10. The Operating System • A node manager runs on top of Linux • A root VM that manages other VMs on a node • Services create new VMs by calling the node manager (see the sketch below) • Services can directly call only the local node manager • The node manager is hard-coded • Local control can be added • Infrastructure services can be given privileges to perform special tasks
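
Reviewer's sketch: the paper does not spell out the node manager's call signatures, so the following C sketch is purely illustrative; every name (nm_create_vm, struct rspec, the slice name) is an assumption, not the real API.

```c
/* Illustrative only: a service asking its local node manager for a VM.
 * The real node manager interface is not specified at this level in
 * the paper; all identifiers here are invented. */
#include <stdio.h>

struct rspec {            /* resources requested for the new VM */
    int cpu_shares;       /* proportional-share CPU allocation */
    int link_kbps;        /* outgoing bandwidth cap */
    int disk_mb;          /* disk quota for the VM's file system */
};

/* Stub standing in for the node manager; note that a service can only
 * invoke the node manager on the node it is running on. */
static int nm_create_vm(const char *slice, const struct rspec *rs) {
    printf("node manager: create VM for %s (%d shares, %d kbps, %d MB)\n",
           slice, rs->cpu_shares, rs->link_kbps, rs->disk_mb);
    return 0;
}

int main(void) {
    struct rspec rs = { .cpu_shares = 10, .link_kbps = 512, .disk_mb = 29 };
    return nm_create_vm("wustl_example", &rs);
}
```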

  11. The Operating System [Architecture diagram, top to bottom:] • Unprivileged slices: P2P networks, embedded network storage • Privileged slices: slice creation, monitoring, environment services • Node manager: resource allocation, auditing, bootstrapping • Linux w/ kernel extensions: the Virtual Machine Monitor

  12. The Operating System • PlanetLab Central (PLC) is a service responsible for creating slices • It maintains a database of slice information on a central server • Users request slices through the PLC • The PLC communicates with a resource manager on each node to start up slices and bind them to resources
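
A hedged sketch of the flow this slide describes: PLC records the slice in its central database, then contacts the resource manager on each requested node. The function and host names below are illustrative assumptions, not PLC's actual interface.

```c
/* Illustrative sketch of PLC-driven slice creation. PLC holds the
 * central slice database and contacts each node's resource manager;
 * the names below are invented for illustration. */
#include <stdio.h>

static const char *nodes[] = {
    "planetlab-1.example.princeton.edu",   /* hypothetical node names */
    "planetlab-1.example.wustl.edu",
    "planetlab-1.example.tau.ac.il",
};

/* Stub for the per-node resource manager call PLC would make. */
static int rm_instantiate(const char *node, const char *slice) {
    printf("PLC -> %s: instantiate slice %s and bind its resources\n",
           node, slice);
    return 0;
}

int main(void) {
    /* 1. A user requests a slice through PLC (recorded in the database).
     * 2. PLC pushes the slice out to every node named in the request. */
    for (size_t i = 0; i < sizeof nodes / sizeof nodes[0]; i++)
        rm_instantiate(nodes[i], "wustl_cs7701");
    return 0;
}
```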

  13. Virtualization • Virtualization can be done at several levels • Virtualization of hardware would allow each VM to run its own OS • This is infeasible due to performance/space constraints • PlanetLab uses system-call level virtualization • Each VM sees itself as having exclusive access to an OS • All VMs on a node are actually making system calls to the same OS
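
A minimal user-space sketch of the interposition idea, assuming a hypothetical per-VM context: the VM issues an ordinary-looking call, and a wrapper redirects it through the VM's private view before it reaches the one shared kernel. PlanetLab does this inside the kernel; this only illustrates the concept.

```c
/* Concept sketch of system-call level virtualization: every VM calls
 * the same underlying OS, but a per-VM context gives each one a
 * private view. The real mechanism lives in kernel extensions; the
 * names here are invented. */
#include <fcntl.h>
#include <stdio.h>

struct vm_context {
    const char *fs_root;   /* this VM's private file-system subtree */
};

/* "Virtualized" open: the VM asks for /etc/passwd, the shared kernel
 * sees a path confined to the VM's own subtree. */
static int vm_open(const struct vm_context *vm, const char *path, int flags) {
    char real[4096];
    snprintf(real, sizeof real, "%s%s", vm->fs_root, path);
    return open(real, flags);
}

int main(void) {
    struct vm_context vm = { .fs_root = "/vservers/slice42" };
    int fd = vm_open(&vm, "/etc/passwd", O_RDONLY);
    printf("fd = %d\n", fd);   /* -1 unless that subtree actually exists */
    return 0;
}
```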

  14. Virtualization • How is virtualization maintained? • The OS schedules clock cycles, bandwidth, memory, and storage for VMs, and provides performance guarantees • It must separate name spaces – such as network addresses and file names – so VMs can’t access each other’s resources • It must provide a stable base so that VMs can’t affect each other (no root access)

  15. Virtualization • Virtualization is handled by a Linux kernel extension called vserver • Vservers are given their own file system and a root account that can customize that file system • Hardware resources, including network addresses, are shared on a node • Vservers are sufficiently isolated from one another (no access to each other’s files or processes) • VMs are implemented as vservers

  16. Isolation • Buggy/malfunctioning or even malicious software might be run on PlanetLab • This can cause problems at many levels [Diagram: PlanetLab sites (Intel Labs, Princeton, Wash U, Tel-Aviv U) and outside sites (Yahoo, Google, Slashdot, AOL) reachable over the Internet]

  17. Isolation • The problem of isolation is solved in two ways • Resource usage must be tracked and limited • It must be possible to audit resource usage later to determine what actions each slice performed

  18. Resource Management • Non-Renewable Resources are monitored and controlled • System calls are wrapped and intercepted • The OS keeps track of resources allocated to VMs • Resource requests are accepted or denied based on a VM’s current resource usage
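
A minimal sketch of the accept/deny logic described above, for one non-renewable resource (disk). In PlanetLab the check happens inside wrapped system calls in the kernel; the struct and field names here are assumptions for illustration.

```c
/* Sketch of per-VM accounting for a non-renewable resource (disk).
 * PlanetLab performs checks like this inside wrapped system calls;
 * this user-space version only illustrates the accept/deny logic. */
#include <stdbool.h>
#include <stdio.h>

struct vm_account {
    long disk_used_mb;     /* what this VM has already been granted */
    long disk_limit_mb;    /* its allocated quota */
};

/* Accept or deny an allocation request against the VM's quota. */
static bool charge_disk(struct vm_account *vm, long mb) {
    if (vm->disk_used_mb + mb > vm->disk_limit_mb)
        return false;              /* deny: would exceed the limit */
    vm->disk_used_mb += mb;        /* accept: record the allocation */
    return true;
}

int main(void) {
    struct vm_account vm = { .disk_used_mb = 25, .disk_limit_mb = 29 };
    printf("4 MB request: %s\n", charge_disk(&vm, 4) ? "accepted" : "denied");
    printf("1 MB request: %s\n", charge_disk(&vm, 1) ? "accepted" : "denied");
    return 0;
}
```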

  19. Resource Management • Renewable Resources can be guaranteed • A VM that requires a certain amount of a renewable resource will receive it • A fair best-effort method allocates resources to remaining VMs • If there are N VMs on a node, each VM receives at least 1 / N of the available resources

  20. Resource Management • Scheduling & Management are performed by Scout on Linux Kernel (SILK) • The Linux CPU scheduler cannot provide fairness and guarantees between vservers • SILK uses a proportional sharing method • Each vserver receives a number of CPU shares • Each share is a guarantee for a portion of the CPU time
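
The arithmetic behind shares is simple enough to show directly: vserver i is guaranteed shares_i / total_shares of the CPU, which reduces to the 1/N split of the previous slide when all shares are equal. The share values below are made up.

```c
/* Proportional-share arithmetic: vserver i is guaranteed
 * shares[i] / sum(shares) of the CPU. With equal shares this is
 * exactly the 1/N fair split from the previous slide. */
#include <stdio.h>

int main(void) {
    int shares[] = { 10, 10, 20 };     /* three vservers (made-up values) */
    int n = sizeof shares / sizeof shares[0];
    int total = 0;
    for (int i = 0; i < n; i++)
        total += shares[i];
    for (int i = 0; i < n; i++)
        printf("vserver %d: %2d shares -> %4.1f%% of the CPU\n",
               i, shares[i], 100.0 * shares[i] / total);
    return 0;
}
```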

  21. Auditing • The PlanetLab OS provides “safe” sockets • Each socket must be specified as TCP or UDP • Each socket must be bound to a port • Only one socket can send on each port • Outgoing packets are filtered to ensure this • Privileged slices can sniff packets sent from each VM • Anomalies can be caught before they become disruptive
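
The slide's rules, restated as code: a slice declares the protocol when it creates the socket and binds to a port before sending; the safe-socket layer then filters outgoing traffic so only the bound slice can send on that port. This is an ordinary sockets sketch of that discipline, with an arbitrary example port; the kernel-side enforcement is PlanetLab's and is not shown.

```c
/* Sketch of the safe-socket discipline from a slice's point of view:
 * declare UDP up front, bind to a specific port, then send. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int s = socket(AF_INET, SOCK_DGRAM, 0);    /* protocol stated: UDP */
    struct sockaddr_in me = { 0 };
    me.sin_family = AF_INET;
    me.sin_addr.s_addr = htonl(INADDR_ANY);
    me.sin_port = htons(33000);                /* arbitrary example port */
    if (bind(s, (struct sockaddr *)&me, sizeof me) != 0) {
        perror("bind");        /* e.g., another slice already owns it */
        return 1;
    }
    struct sockaddr_in to = me;                /* send to ourselves */
    to.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sendto(s, "ping", 4, 0, (struct sockaddr *)&to, sizeof to);
    close(s);
    return 0;
}
```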

  22. Evolution • PlanetLab can be easily extended • Researchers can create their own infrastructure services for PlanetLab • Services that help maintain PlanetLab itself • Slice creation, performance monitoring, software distribution, etc. • It is possible for multiple researchers to develop similar services in parallel • An unusual system-evolution problem

  23. Evolution • What does this mean for PlanetLab’s OS? • It must be responsible only for the most basic tasks • It must be fair in granting privileges to infrastructure services • It must provide an interface for creating a VM

  24. Evolution • New services should be implemented at the highest level possible • Services should be given only the privileges necessary to perform their task [Diagram: the layered stack again: unprivileged slices, privileged slices, node manager, Linux w/ kernel extensions]

  25. Evaluation • Scalability • A VM’s file system requires 29 MB • 1,000 VMs have been created on a single node • Slice creation • Nodes look for new slice info every 10 minutes • Creating a new VM from scratch takes over a minute • Service initialization • Upgrading a package takes 25.9 sec on a single node • Slower when many nodes update a package at once

  26. Conclusion • PlanetLab is an excellent option for researchers to test new network services • Virtualization provides an easy interface • Services are reasonably well-protected from one another • Opportunities exist to expand PlanetLab’s functionality
