A Fast Rejuvenation Technique for Server Consolidation with Virtual Machines

Kenichi Kourai, Shigeru Chiba • Tokyo Institute of Technology

Server consolidation with VMs • Server consolidation is widely carried out • Multiple server machines are integrated onto one physical machine • Recently, this is done using virtual machines (VMs) • VMs run on a virtual machine monitor (VMM) • The VMM multiplexes resources among the VMs [Figure: several VMs running on one VMM on top of the hardware]

Software aging of VMMs • Software aging of a VMM is critical • Software aging is the phenomenon in which software state degrades over time • E.g. exhaustion of system resources • Software aging of a VMM affects all VMs on it • E.g. performance degradation [Figure: VMs running on a VMM]

Software rejuvenation of VMMs • Preventive maintenance • Performed before software aging of a VMM affects its VMs • Occasionally stops a VMM, cleans its internal state, and restarts it • Typical example: rebooting a VMM • Cleans the internal state automatically and completely • The easiest way

Drawbacks (1/2): Increasing service downtime • The VMM reboot needs: • Rebooting all OSes running on the VMs • The time tends to be long • It grows with the number of VMs • It grows with the startup time of services • A hardware reset • The BIOS power-on self test is time-consuming [Figure: reboot timeline: OS shutdown, VMM shutdown, hardware reset, VMM boot, OS boot]

Drawbacks (2/2): Performance degradation • The file cache is lost when the OS reboots • OSes cannot restore performance until the file cache is refilled • They rely strongly on the file cache to speed up file accesses • Refilling tends to take a long time • The file cache size is increasing • VMs are given large amounts of memory • Free memory is used as the file cache [Figure: a process accessing the disk through the OS file cache]

Warm-VM reboot • Fast rejuvenation technique • Efficiently reboots only a VMM • The VMM reboot causes no OS reboot • Basic idea • Suspend all VMs before the VMM reboot • Resume them after the reboot • Challenge • How does a VMM efficiently deal with the large memory images of VMs?

On-memory suspend of VMs • Freezes the memory images of VMs in main memory • That memory area is simply reserved • The suspend time does not depend on the memory size • Saving the images to a slow disk is inefficient • Corresponds to the ACPI S3 state (suspend to RAM) for VMs • Traditional suspend corresponds to the ACPI S4 state (suspend to disk) [Figure: a VM's memory image frozen in main memory rather than written to disk]

On-memory resume of VMs • Unfreezes the memory images preserved in main memory • They are reused directly as the memory of VMs • No need to read them from a slow disk • The file cache of OSes is also restored • No performance degradation [Figure: a VM's memory image unfrozen in main memory rather than read from disk]

Quick reload of VMMs • Directly boots a new VMM without a hardware reset • The memory images of VMs are preserved through the VMM reboot • Software can keep track of them • A hardware reset does not guarantee this • A VMM is rebooted quickly • No overhead due to a hardware reset [Figure: main memory holding the VM memory images, the old VMM, and the preloaded new VMM]
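The three mechanisms (on-memory suspend, quick reload, on-memory resume) can be pictured with a toy model. The sketch below is only an illustration under simplifying assumptions: `Machine`, `ToyVMM`, and all method names are hypothetical and have nothing to do with Xen's real interfaces. "Memory" is a Python dict of frames, and suspending a VM records only which frames belong to it.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Toy physical machine (hypothetical): frame number -> page contents."""
    frames: dict = field(default_factory=dict)
    # Bookkeeping that survives a quick reload: VM name -> frame numbers.
    preserved: dict = field(default_factory=dict)

    def hardware_reset(self):
        # A hardware reset gives no guarantee about memory contents,
        # so the toy model simply discards everything.
        self.frames.clear()
        self.preserved.clear()


class ToyVMM:
    def __init__(self, machine):
        self.machine = machine
        self.vms = {}  # running VMs: name -> list of frame numbers

    def create_vm(self, name, pages):
        start = len(self.machine.frames)
        numbers = list(range(start, start + len(pages)))
        for number, page in zip(numbers, pages):
            self.machine.frames[number] = page
        self.vms[name] = numbers

    def read_vm(self, name):
        return [self.machine.frames[n] for n in self.vms[name]]

    def on_memory_suspend(self):
        # Only the mapping is recorded; no page is copied, so the cost
        # does not depend on how much memory the VMs have.
        self.machine.preserved.update(self.vms)
        self.vms = {}

    def on_memory_resume(self):
        # Reuse the preserved frames directly as VM memory.
        self.vms = dict(self.machine.preserved)
        self.machine.preserved.clear()


def warm_vm_reboot(vmm):
    vmm.on_memory_suspend()
    new_vmm = ToyVMM(vmm.machine)  # quick reload: no hardware reset
    new_vmm.on_memory_resume()
    return new_vmm


def cold_vm_reboot(vmm):
    vmm.machine.hardware_reset()
    return ToyVMM(vmm.machine)  # every OS would have to boot again


machine = Machine()
vmm = ToyVMM(machine)
vmm.create_vm("vm1", ["kernel", "file cache"])
vmm = warm_vm_reboot(vmm)
print(vmm.read_vm("vm1"))  # ['kernel', 'file cache']
```

After `warm_vm_reboot` the new VMM sees the same pages, including the stand-in for the file cache, while `cold_vm_reboot` leaves no VM state behind.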

Comparison with other methods • Cold-VM reboot • Needs the OS reboot • Saved-VM reboot • A naive implementation of the warm-VM reboot • VMs are saved into a disk

Model for availability • Must consider the software rejuvenation of both the VMM and the OSes • Warm-VM reboot • OS rejuvenation is independent of VMM rejuvenation • Cold-VM reboot • OS rejuvenation is affected by VMM rejuvenation • The number of OS rejuvenations increases [Figure: timelines of OS rejuvenation and VMM rejuvenation for the warm-VM and cold-VM reboots]

RootHammer • We have implemented the warm-VM reboot in Xen 3.0.0 • On-memory suspend/resume • Based on Xen's suspend/resume • Manages the mapping from the VM memory to the physical memory • Quick reload • Based on the kexec mechanism in Linux • Kexec for a VMM is included in the latest Xen • It is not for reusing the memory images [Figure: mapping from VM memory to physical memory]

Experiments • Examine whether the warm-VM reboot reduces downtime and performance degradation • Comparison • Cold-VM reboot with the OS reboot • Saved-VM reboot using Xen's suspend/resume [Figure: experimental setup: a server running Linux VMs on the VMM and a Linux client, connected by gigabit Ethernet; 2 dual-core Opteron, 12 GB SDRAM, 15,000 rpm SCSI disk]

Performance of on-memory suspend/resume • Suspend/resume of one VM with 11 GB of memory • Ours: 1 sec • Xen's: 280 sec • Depends on the memory size • Suspend/resume of 11 VMs • Ours: 4 sec • OS reboot: 58 sec • Depends on # of VMs

Effect of quick reload • The time of rebooting a VMM with no VMs • Warm-VM reboot • 11 sec • The time of quick reload is negligible • Cold-VM reboot • 59 sec • The time due to a hardware reset is 48 sec

Downtime of services • Warm-VM reboot • Always the same • 42 sec • Saved-VM reboot • Depends on # of VMs • 429 sec (11 VMs) • Cold-VM reboot • Affected by the service type • 157 sec (sshd) • 241 sec (JBoss)
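The downtime figures above can be turned into relative reductions. The sketch below uses only the numbers on this slide; the dictionary labels and the `reduction` helper are illustrative names.

```python
WARM_VM = 42  # sec, the same regardless of # of VMs or service type

OTHER_METHODS = {
    "saved-VM (11 VMs)": 429,
    "cold-VM (sshd)": 157,
    "cold-VM (JBoss)": 241,
}


def reduction(baseline, improved=WARM_VM):
    """Fraction of the baseline downtime that the warm-VM reboot removes."""
    return 1 - improved / baseline


for name, seconds in OTHER_METHODS.items():
    print(f"{name}: {reduction(seconds):.0%} less downtime")
# saved-VM (11 VMs): 90% less downtime
# cold-VM (sshd): 73% less downtime
# cold-VM (JBoss): 83% less downtime
```

The cold-VM (JBoss) case comes to about 83%, which appears to be the comparison behind the maximum reduction quoted in the conclusion.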

Availability of JBoss • The warm-VM reboot achieves four 9s (99.99%) • Assumptions • OS rejuvenation every week • 34 sec each • VMM rejuvenation every 4 weeks • Performed 0.5 week after the last OS rejuvenation [Figure: timeline with OS rejuvenation every week and VMM rejuvenation 0.5 week after an OS rejuvenation]
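A rough check of the "four 9s" claim is possible from the numbers on the slides. The sketch below is a simplified model, not the authors' availability formula. It assumes the 34 sec is the downtime of one OS rejuvenation, takes the JBoss downtimes of 42 sec (warm-VM) and 241 sec (cold-VM) for one VMM rejuvenation, and counts four OS rejuvenations per 4-week period in both cases. The count for the cold-VM reboot is a placeholder, since the slides do not fully specify that schedule.

```python
import math

WEEK = 7 * 24 * 3600  # seconds


def availability(vmm_downtime, os_downtime=34, os_rejuvenations=4,
                 period=4 * WEEK):
    """Availability over one VMM rejuvenation period (simplified model)."""
    downtime = vmm_downtime + os_rejuvenations * os_downtime
    return 1 - downtime / period


def nines(a):
    """Number of leading 9s, e.g. 0.9999x -> 4."""
    return math.floor(-math.log10(1 - a))


warm = availability(42)   # warm-VM reboot, JBoss
cold = availability(241)  # cold-VM reboot, JBoss
print(f"warm-VM: {warm:.6f} ({nines(warm)} nines)")  # warm-VM: 0.999926 (4 nines)
print(f"cold-VM: {cold:.6f} ({nines(cold)} nines)")  # cold-VM: 0.999844 (3 nines)
```

Under these assumptions the warm-VM reboot loses 178 sec per 4 weeks, which is consistent with four 9s, while the cold-VM reboot stays at three.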

Performance degradation • The throughput of the Apache web server before and after the VMM reboot • Warm-VM reboot • No degradation • Cold-VM reboot • Degraded by 69%

Software rejuvenation in a cluster environment • Clustering achieves zero downtime • Multiple hosts can provide the same service • Let us consider the total throughput of all hosts in a cluster (m: # of hosts, p: throughput of one host) • Warm-VM reboot • (m-1)p during the reboot (42 sec) • Cold-VM reboot • (m-1)p during the reboot (241 sec) • (m-0.69)p for a while after the reboot [Figure: total throughput over time t, dropping from mp to (m-1)p for 42 sec (warm-VM) or 241 sec (cold-VM)]
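The throughput levels on this slide can be written as a piecewise function of time. In the sketch below, `total_throughput`, the cluster size `m = 4`, the per-host throughput `p = 100.0`, and especially `WARMUP` are assumptions for illustration. The slide only says the degraded phase lasts "for a while", so 600 sec is a made-up placeholder.

```python
def total_throughput(t, m, p, reboot_time, degraded_time=0.0, degradation=0.0):
    """Total cluster throughput t seconds after one host starts its VMM reboot."""
    if t < 0:
        return m * p                  # normal run
    if t < reboot_time:
        return (m - 1) * p            # the rebooting host serves nothing
    if t < reboot_time + degraded_time:
        return (m - degradation) * p  # file cache still being refilled
    return m * p


m, p = 4, 100.0  # hypothetical cluster size and per-host throughput
WARMUP = 600     # hypothetical length of the degraded phase, in seconds


def warm(t):
    return total_throughput(t, m, p, reboot_time=42)


def cold(t):
    return total_throughput(t, m, p, reboot_time=241,
                            degraded_time=WARMUP, degradation=0.69)


for t in (10, 100, 300, 1000):
    print(t, warm(t), cold(t))
```

With these placeholder values the warm-VM cluster is back at full throughput after 42 sec, while the cold-VM cluster is still one host short at t = 100 and at (m-0.69)p at t = 300.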

Comparison with VM migration in a cluster environment • VM migration achieves nearly zero downtime • VMs are moved to another host • Xen's live migration, VMware's VMotion • Total throughput • Normal run • (m-1)p • One host is reserved for migration • Live migration • (m-1.12)p [Figure: total throughput over time t: the warm-VM reboot drops from mp to (m-1)p for 42 sec; live migration runs at (m-1)p and drops to (m-1.12)p for 17 min]
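The cost of reserving a host shows up when throughput is averaged over a whole rejuvenation period. The sketch below is a back-of-the-envelope comparison under assumed values: `m = 4`, `p = 100.0`, and a 4-week period borrowed from the availability slide are all illustrative, and the function name is hypothetical.

```python
WEEK = 7 * 24 * 3600  # seconds


def average_throughput(m, p, normal_hosts_lost, event_hosts_lost,
                       event_seconds, period):
    """Time-weighted average throughput over one rejuvenation period."""
    normal = (m - normal_hosts_lost) * p
    during = (m - event_hosts_lost) * p
    return (during * event_seconds + normal * (period - event_seconds)) / period


m, p = 4, 100.0    # hypothetical cluster size and per-host throughput
PERIOD = 4 * WEEK  # assumed VMM rejuvenation interval

# Warm-VM reboot: all m hosts serve normally, one is down for 42 sec.
warm = average_throughput(m, p, 0, 1, 42, PERIOD)
# Live migration: one host is always reserved, plus 0.12p lost for 17 min.
migration = average_throughput(m, p, 1, 1.12, 17 * 60, PERIOD)

print(f"warm-VM reboot: {warm:.3f}")
print(f"live migration: {migration:.3f}")
```

Under these assumptions the reserved host dominates: the migration-based cluster averages roughly one host's worth of throughput less than the cluster using the warm-VM reboot.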

Related work • Microreboot [Candea et al. '04] • Reboots only some of the subcomponents • The warm-VM reboot enables rebooting only a parent component (the VMM, for its VMs) • Checkpointing/restart [Randell '75] • Saves/restores OS processes • Similar to suspend/resume of VMs • Optimizations of suspend/resume • Incremental suspend, compression of memory images

Conclusion • We proposed the warm-VM reboot • On-memory suspend/resume • Freezes/unfreezes the memory images of VMs • Quick reload • Preserves the memory images through the VMM reboot • It achieved fast rejuvenation • Downtime reduced by up to 83% • No performance degradation
