In search of a virtual yardstick: Measuring Virtual CPU time
Wade Satterfield, 2/26/2009
Percent busy vs. percent of capacity used
• It used to be that the percent of time a CPU was busy was also the percent of the CPU’s capacity that was used.
• This is not true of virtual machines.
Errors in measuring CPU time on a VM
• The Guest OS counts time it didn’t get
• The Guest OS doesn’t count time it did get
Counted time the VM didn’t actually get
[Figure: timeline. Event labels: Hypervisor takes CPU from the VM; Hypervisor returns CPU to the VM; Guest OS becomes idle; Guest OS finds useful work to do. Span labels: possible time the Guest OS thinks it used; time the Guest OS actually used.]
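The timeline can be turned into a small calculation. The sketch below is an illustration, not the author's tooling: it assumes the guest charges itself for the whole span between finding work and going idle, and that the intervals when the hypervisor took the CPU away are known and do not overlap each other. The names `Interval`, `overlap`, and `guest_vs_actual` and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    start: float  # seconds
    end: float    # seconds

    @property
    def length(self) -> float:
        return self.end - self.start


def overlap(a: Interval, b: Interval) -> float:
    """Length of the time shared by two intervals (0 if they do not overlap)."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def guest_vs_actual(busy: Interval, descheduled: list[Interval]) -> tuple[float, float]:
    """Return (time the guest may think it used, time it actually had the CPU).

    Assumes the descheduled intervals do not overlap one another.
    """
    counted = busy.length
    taken_away = sum(overlap(busy, gap) for gap in descheduled)
    return counted, counted - taken_away


# Hypothetical numbers: the guest is busy from t=0 to t=10,
# and the hypervisor takes the CPU away from t=3 to t=7.
counted, actual = guest_vs_actual(Interval(0.0, 10.0), [Interval(3.0, 7.0)])
print(counted, actual)  # 10.0 6.0
```

In this example the guest could report 10 seconds of CPU time even though it only ran for 6.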
First Experiment
• Run a tight loop program on a physical server and again on a VM on an otherwise idle VM host
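The benchmark
#!/bin/sh
# Tight CPU-bound loop: count down from ten billion to zero.
# The start value needs a shell with 64-bit arithmetic.
a=10000000000
while [ "$a" -gt 0 ]
do
  a=$((a - 1))
done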
Second Experiment
• Run a tight loop program on a VM on a busy VM host (5 VMs running the same thing on a 4-way host)
Further Experiments
• Run a tight loop program on a VM on a busy VM host (8 VMs running the same thing on a 4-way host)
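As a rough guide to what these setups imply, the sketch below estimates the share of a physical CPU each VM can get. It is a model, not a measurement from these experiments: it assumes one virtual CPU per VM, every VM running the same CPU-bound loop, a perfectly fair hypervisor scheduler, and no virtualization overhead. The function name `fair_share_estimate` is hypothetical.

```python
def fair_share_estimate(num_vms: int, host_cpus: int) -> tuple[float, float]:
    """Return (fraction of a physical CPU per VM, elapsed-time stretch factor).

    Assumes single-vCPU VMs, identical CPU-bound work in each VM,
    a perfectly fair scheduler, and no virtualization overhead.
    """
    if num_vms <= 0 or host_cpus <= 0:
        raise ValueError("num_vms and host_cpus must be positive")
    share = min(1.0, host_cpus / num_vms)
    return share, 1.0 / share


for vms in (1, 5, 8):
    share, stretch = fair_share_estimate(vms, host_cpus=4)
    print(f"{vms} VMs on a 4-way host: {share:.0%} of a CPU each, "
          f"loop takes about {stretch:.2f}x as long")
```

Under these assumptions the 5-VM case gives each VM 80% of a CPU and the 8-VM case gives 50%. A guest that counts every moment of its loop as busy would still report close to 100% busy in both cases.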
What does this mean for %CPU busy?
• Suppose you have a physical workload that is normally 20% busy, and you make it into a VM on a VM host that is 40% busy. (Assume no IO or other virtualization overhead.)
• How busy will the VM think it is?
• 20%, 20.5%, 28%, or 40%?
CPU data that does not get counted
[Figure: layered stack. Two VMs, each with an app on a Guest OS on virtual hardware, sit on the Host OS (hypervisor), which sits on the physical hardware.]
Threads within a virtual machine
[Figure: layered stack of app, Guest OS, virtual hardware, Host OS (hypervisor), and physical hardware, with a CPU thread and IO threads marked.]
The guest OS knows about the threads that represent the virtual CPUs, but it does not know that there are threads representing the NICs and disks.
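The benchmark
#!/bin/sh
# IO-bound loop: fetch a large file over ftp 1000 times.
a=1000
while [ "$a" -gt 0 ]
do
  ftp -f getbigfile
  # Count down; without this the loop never ends.
  a=$((a - 1))
done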
Experiment
• Run an IO-intensive program on a VM on an otherwise idle VM host
How bad can this be?
• ftp is an extreme example
• High-IO applications like a web server can have their throughput reduced to 70%
• In our testing we have found that the VM can get near 100% utilization while its IO threads get CPU time too, giving the VM 125% utilization.
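The 125% figure follows from adding the CPU time of the VM's IO threads to the CPU time of its virtual CPU thread. A minimal sketch of that arithmetic, assuming both thread times can be read at the VM host for the same interval (the function name and sample numbers are hypothetical):

```python
def vm_utilization_at_host(vcpu_seconds: float, io_thread_seconds: float,
                           interval_seconds: float) -> float:
    """Percent of one physical CPU consumed on behalf of a VM during an interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return 100.0 * (vcpu_seconds + io_thread_seconds) / interval_seconds


# Hypothetical sample: over 60 s the vCPU thread ran for 60 s
# and the IO threads ran for another 15 s on other physical CPUs.
print(vm_utilization_at_host(60.0, 15.0, 60.0))  # 125.0
```

The guest OS sees only the virtual CPU thread, so in this sample it would report at most 100% busy.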
What you want to measure
• Generally you want to measure the CPU time used by your VMs at the VM host
• Your CPU bottleneck is on the VM host
• Some people may want to know how much CPU time would be used if the VM were on a physical system
VMware
• VMware has a WBEM interface on the VM host and on the Virtual Center host
[Figure: data collector with two sources. Guest OS: global memory, disk, network, queues, and process data. Hypervisor: “Global” CPU, disk, network.]
HP-VM
• Integrity VMs have a library call available inside the VM to get the CPU time the VM has used on the VM host.
[Figure: data collector with two sources. Guest OS: all data. Hypervisor: “Global” CPU.]
Hyper-V
• Hyper-V does most of its IO processing in the parent partition
• The CPU time used by the VM’s worker process in the parent partition should be added to the time used by the VM
• Hyper-V has a WMI interface for a guest’s CPU time
[Figure: data collector with three sources. Guest OS: global memory, disk, network, queues, and process data. Parent partition: worker process CPU data. Hypervisor: “Global” CPU, DOM0 CPU, disk, network.]
http://blogs.msdn.com/virtual_pc_guy/archive/2008/02/28/hyper-v-virtual-machine-cpu-usage-and-task-manager.aspx
Xen
• Xen does most of its IO processing in DOM0
• The CPU time used by DOM0 should be prorated across the VMs in proportion to how much CPU time each of them used.
[Figure: data collector with three sources. Guest OS: global memory, disk, network, queues, and process data. DOM0: CPU data for IO. Hypervisor: “Global” CPU, DOM0 CPU, disk, network.]
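The proration rule can be sketched in a few lines. This is one possible reading of the rule, not a Xen API: it assumes you already have per-VM CPU seconds and DOM0 CPU seconds for the same interval, and it charges all DOM0 time to the guests even though some of it may not be IO work done for them. The function name `prorate_dom0` and the sample values are hypothetical.

```python
def prorate_dom0(vm_cpu_seconds: dict[str, float],
                 dom0_cpu_seconds: float) -> dict[str, float]:
    """Add each VM's proportional share of DOM0 CPU time to its own CPU time."""
    total = sum(vm_cpu_seconds.values())
    if total == 0:
        # No guest used any CPU, so there is no basis for attributing DOM0 time.
        return dict(vm_cpu_seconds)
    return {
        name: used + dom0_cpu_seconds * used / total
        for name, used in vm_cpu_seconds.items()
    }


# Hypothetical sample: three guests and 10 s of DOM0 CPU time in the interval.
charged = prorate_dom0({"vm1": 30.0, "vm2": 15.0, "vm3": 5.0}, dom0_cpu_seconds=10.0)
print(charged)  # {'vm1': 36.0, 'vm2': 18.0, 'vm3': 6.0}
```

The charged times add up to the guests' CPU time plus the DOM0 time, so nothing is lost or double counted.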