
Research on Embedded Hypervisor Scheduler Techniques



  1. Research on Embedded Hypervisor Scheduler Techniques Midterm Report 2014/06/25

  2. Background • Asymmetric multi-core is becoming increasingly popular over homogeneous multi-core systems. • An asymmetric multi-core platform consists of cores with different capabilities, for example, the ARM big.LITTLE architecture.

  3. ARM big.LITTLE Core • Developed by ARM in Oct. 2011. • Combines two kinds of architecturally compatible cores to create a multi-core processor that can adjust better to dynamic computing needs and use less power than clock scaling alone. • big cores are more powerful but power-hungry, while LITTLE cores are low-power but (relatively) slower.

  4. Three Types of Models • Cluster migration • CPU migration (In-Kernel Switcher) • Heterogeneous multi-processing (global task scheduling)

  5. Motivation • Traditional schedulers for homogeneous multi-core platforms focus on load balancing. • Since each core has the same computing ability, workloads are distributed evenly in order to obtain maximum performance.

  6. Motivation (Cont.) • New scheduling strategies are needed for asymmetric multi-core platforms. • Their cores have different power and computing characteristics.

  7. Current Hypervisor Architecture and Problem • If the Guest OS scheduler is not big.LITTLE-aware, it assigns tasks to vCPUs evenly in order to achieve load balancing, regardless of whether the tasks have high or low computing resource requirements. • The hypervisor vCPU scheduler is not big.LITTLE-aware either, so it assigns the vCPUs evenly to the physical ARM cores (Cortex-A15 for performance, Cortex-A7 for power saving). • As a result, the system cannot take advantage of the big.LITTLE core architecture.

  8. Current Hypervisor Architecture and Problem (Cont.) • Now assume the scheduler in the Guest OS is big.LITTLE-aware, so each vCPU is either big or LITTLE. • The hypervisor vCPU scheduler still assigns vCPUs evenly to the physical ARM cores (Cortex-A15 and Cortex-A7) in order to achieve load balancing. • Big vCPUs can end up on Cortex-A7 cores (performance degradation) and LITTLE vCPUs on Cortex-A15 cores (wasted energy), so the system still cannot take advantage of the big.LITTLE core architecture.

  9. Project Goal • Research on the current scheduling algorithms. • Design and tune the hypervisor scheduler on asymmetric multi-core platform. • Assign virtual cores to physical cores for execution. • Minimize the power consumption with performance guarantee.

  10. Challenge • The hypervisor scheduler cannot take advantage of big.LITTLE architecture if the scheduler inside guest OS is not big.LITTLE aware.

  11. Current Hypervisor Architecture and Problem (Cont.) • If the Guest OS scheduler is not big.LITTLE-aware, it assigns tasks to vCPUs evenly in order to achieve load balancing. • Even if the hypervisor vCPU scheduler is big.LITTLE-aware, it will schedule these vCPUs either all to big cores (Cortex-A15) or all to LITTLE cores (Cortex-A7), since the vCPUs all have the same loading. • Either way, the system cannot take advantage of the big.LITTLE core architecture.

  12. Possible Solution • Apply VM introspection (VMI) to retrieve the process list in a VM. • VMI is a technique that allows the hypervisor to inspect the contents of a VM in real time. • Modify the CPU masks of tasks in the VM in order to create the illusion of “big vCPUs” and “LITTLE vCPUs”. • The hypervisor scheduler can then assign each vCPU to a corresponding big or LITTLE core.
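The sketch below illustrates the CPU-mask idea in a few lines of Python, assuming the task-to-vCPU mapper can act on tasks inside the guest (for example through a guest agent driven by the VMI data). The vCPU sets and the load threshold are made-up values, not parameters from the report.

```python
# Minimal sketch of the "big vCPU / LITTLE vCPU" illusion. The vCPU index
# sets and the load threshold below are illustrative, not from the report.
import os

BIG_VCPUS = {0, 1}       # vCPUs the hypervisor will place on Cortex-A15 cores
LITTLE_VCPUS = {2, 3}    # vCPUs the hypervisor will place on Cortex-A7 cores
LOAD_THRESHOLD = 0.5     # hypothetical cutoff for "high computing requirement"

def pin_task(pid: int, load: float) -> None:
    """Restrict a task's CPU mask so it runs only on big or only on LITTLE vCPUs."""
    mask = BIG_VCPUS if load >= LOAD_THRESHOLD else LITTLE_VCPUS
    os.sched_setaffinity(pid, mask)   # Linux-only; raises if the pid has exited
```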

  13. Hypervisor Architecture with VMI • The VM Introspector gathers task information from each Guest OS. • The Task-to-vCPU Mapper modifies the CPU mask of each task according to that information, so tasks with high computing resource requirements are confined to one set of vCPUs and tasks with low requirements to another. • A vCPU on which only low-requirement tasks are scheduled is treated as a LITTLE core. • The hypervisor b-L vCPU scheduler then schedules big vCPUs to the Cortex-A15 cores (performance) and LITTLE vCPUs to the Cortex-A7 cores (power saving).

  14. Hypervisor Architecture with VMI (Cont.) • Example: Guest OS 1 has two tasks with a high computing requirement and two tasks with a low computing requirement; Guest OS 2 has two tasks with a low computing requirement. • The VM Introspector gathers the task information from each Guest OS, and the Task-to-vCPU Mapper modifies the CPU mask of each task accordingly. • The vCPUs on which only low-requirement tasks are scheduled are treated as LITTLE cores. • The hypervisor vCPU scheduler schedules the big vCPUs to the Cortex-A15 cores and the LITTLE vCPUs to the Cortex-A7 cores.

  15. Hypervisor Scheduler • Schedules the virtual cores to physical cores for execution. • Decides the execution order and amount of time assigned to each virtual core according to some scheduling policies. • Xen - credit-based scheduler • KVM - completely fair scheduler

  16. Credit-Based Scheduler • Each domain (OS) is assigned a weight and a cap. • The weight decides the amount of time a domain will get in a time interval. • The cap optionally fixes the maximum amount of CPU a domain is able to consume.

  17. Credit-Based Scheduler (Cont.) • Each CPU manages a local run queue of runnable virtual cores. • The queue is sorted by virtual core priority. • A virtual core's priority is either OVER or UNDER, depending on whether it has exceeded its fair share of CPU resources in the current time interval. • When a virtual core is inserted, it is put after all other virtual cores of equal priority.

  18. Credit-Based Scheduler(Cont.) • As a virtual core runs, it consumes credits. • The next virtual core to run is picked from the head of the run queue. • A CPU will look on other CPUs for runnable virtual cores before going idle.
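As a rough illustration of the run-queue behaviour described on slides 17 and 18 (a toy model, not Xen's actual code; the VCpu class and function names are invented for the example):

```python
# Toy model of the credit scheduler's run-queue rules: UNDER vCPUs run
# before OVER vCPUs, a newly inserted vCPU goes after all vCPUs of equal
# priority, and a CPU checks its peers for runnable vCPUs before idling.
from dataclasses import dataclass

UNDER, OVER = 0, 1        # UNDER = still within its fair share of credits

@dataclass
class VCpu:
    name: str
    priority: int         # UNDER or OVER

def insert(runq: list, vcpu: VCpu) -> None:
    """Insert vcpu after all vCPUs of equal (or higher) priority."""
    i = 0
    while i < len(runq) and runq[i].priority <= vcpu.priority:
        i += 1
    runq.insert(i, vcpu)

def pick_next(my_runq: list, other_runqs: list):
    """Run the head of the local queue, or steal work before going idle."""
    if my_runq:
        return my_runq.pop(0)
    for peer in other_runqs:
        if peer:
            return peer.pop(0)    # look on other CPUs before going idle
    return None                   # nothing runnable anywhere: go idle
```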

  19. Credit-Based Scheduler on Asymmetric Multi-core • The credit-based scheduler considers only a “fair share” of time slices. • Assigning the same amount of time slices on big and LITTLE cores results in different performance and power consumption.

  20. Virtual Core Scheduling Problem • For every time period, the hypervisor scheduler is given a set of virtual cores. • Given the operating frequency of each virtual core, the scheduler will generate a scheduling plan, such that the power consumption is minimized, and the performance is guaranteed.

  21. Core Models • There are two types of cores: virtual cores and physical cores. • v_j: frequency of virtual core j • f_i: frequency of physical core i • t_i, t_j: type of the core

  22. Power Model • To decide on the power model, we have done preliminary experiments measuring the power consumption of the cores on an ODROID-XU board.

  23. Result – bzip2

  24. Power Model(Cont.) • The power consumption of a physical core is a function of core type, core frequency, and the load of the core. • The load of a core is the percentage of time a core is executing virtual cores.
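Using the notation from slide 21 (and a_{i,j} from slide 27), one way to write this down is the following; the functional form is left abstract because only the measured behaviour, not a formula, survives in the transcript:

```latex
% Power of physical core i as a function of its type, frequency, and load;
% the load l_i is the fraction of the interval T spent executing vCPUs.
P_i = P(t_i, f_i, l_i), \qquad l_i = \frac{1}{T}\sum_{j} a_{i,j}
```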

  25. Performance • The ratio of the computing resource assigned to the computing resource requested. • Ex: a virtual core running at 800MHz runs on a physical core of 1200MHz for 60% of a time interval. • The performance of this virtual core is 0.6*1200/800 = 0.9.
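Written with the same symbols, the ratio generalizes the example above (a reconstruction consistent with the worked numbers, not copied from the slide):

```latex
% Performance of virtual core j: computing resource assigned divided by
% computing resource requested, over an interval of length T.
\mathrm{perf}_j = \frac{\sum_i a_{i,j}\, f_i}{T\, v_j}
% Slide example: (0.6T \cdot 1200) / (T \cdot 800) = 0.9
```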

  26. Objective Function • Generate a scheduling plan, such that the power consumption is minimized, and the performance is guaranteed. • Assume there are n physical cores.
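A plausible reconstruction of the objective (the formula on the original slide is not preserved in the transcript):

```latex
% Choose the scheduling plan a_{i,j} that minimizes total power over the
% n physical cores, with performance constrained on the following slides.
\min_{\{a_{i,j}\}} \; \sum_{i=1}^{n} P\!\left(t_i,\, f_i,\, \tfrac{1}{T}\sum_{j} a_{i,j}\right)
```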

  27. Scheduling Plan • A set of a_{i,j}, where a_{i,j} indicates the amount of time virtual core j executes on physical core i in a time interval. • A feasible scheduling plan must satisfy certain constraints.

  28. Constraints • Each virtual core should be assigned sufficient computing resources in order to meet the performance guarantee.
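With a performance target of 1 (the full requested capacity), this constraint can be written as follows (again a reconstruction, since the slide's formula is missing):

```latex
% Performance guarantee: every virtual core j receives at least the
% computing resource it requests over the interval T.
\sum_{i} a_{i,j}\, f_i \;\ge\; T\, v_j \qquad \forall j
```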

  29. Constraints (Cont.) • A physical core has a fixed amount of computing resources in a time interval, determined by its frequency.
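In the same notation, the capacity constraint reads:

```latex
% A physical core i cannot be scheduled for more than the interval T.
\sum_{j} a_{i,j} \;\le\; T \qquad \forall i
```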

  30. Current Solution • Given the objective function and the constraints, we can use integer programming to find a feasible scheduling plan. • Divide each time interval into 100 time slices. • The a_{i,j} in the scheduling plan can then be expressed as the number of time slices virtual core j receives on physical core i.
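The sketch below shows what such a formulation could look like with the PuLP solver; PuLP is only one possible tool (not named in the report), and the frequencies and per-slice power costs are illustrative rather than the measured power model:

```python
# Sketch of the integer program: minimize a (simplified, linear) power
# cost while guaranteeing each vCPU its requested cycles. All numbers
# below are illustrative.
import pulp

T = 100                                   # time slices per interval
f = [1600, 1600, 600, 600]                # physical core frequencies (MHz)
cost = [3.0, 3.0, 1.0, 1.0]               # hypothetical power cost per busy slice
v = [1200, 600, 250]                      # virtual core frequencies (MHz)

prob = pulp.LpProblem("vcpu_schedule", pulp.LpMinimize)
a = {(i, j): pulp.LpVariable(f"a_{i}_{j}", 0, T, cat="Integer")
     for i in range(len(f)) for j in range(len(v))}

# Objective: minimize total (simplified) power.
prob += pulp.lpSum(cost[i] * a[i, j] for (i, j) in a)

for j, vj in enumerate(v):
    # Performance guarantee: vCPU j gets the cycles it requests.
    prob += pulp.lpSum(a[i, j] * f[i] for i in range(len(f))) >= T * vj
    # A vCPU cannot run on two physical cores at the same time.
    prob += pulp.lpSum(a[i, j] for i in range(len(f))) <= T

for i in range(len(f)):
    # Capacity: a physical core has only T slices per interval.
    prob += pulp.lpSum(a[i, j] for j in range(len(v))) <= T

prob.solve()
plan = {key: int(var.value()) for key, var in a.items()}   # slices of vCPU j on core i
```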

  31. Assign Virtual Cores to Physical Cores • The scheduling plan from integer programming alone is not enough. • We need a way to assign the virtual cores to physical cores according to the plan. • A virtual core cannot run on two or more physical cores at the same time.

  32. Example – 3 vCPUs, 2 Physical Cores • Scheduling plan over one interval (t=0 to t=100), written as the slices assigned on each of the two physical cores: vCPU0 (60, 20), vCPU1 (0, 50), vCPU2 (20, 30).

  33. Assign Virtual Cores to Physical Cores(Cont.) • Given a feasible scheduling plan, we can schedule the virtual cores to physical cores without violating the constraints. • Consider n physical cores with m virtual cores, n < m.

  34. Example • Scheduling plan over one interval (t=0 to t=100), written as the slices assigned on each of the four physical cores: vCPU0 (50, 40, 0, 0), vCPU1 (20, 20, 20, 20), vCPU2 (10, 10, 20, 20), vCPU3 (10, 10, 20, 20), vCPU4 (10, 10, 10, 10), vCPU5 (0, 0, 10, 10).
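One simple way to turn such a plan into a concrete timeline is a greedy slice-by-slice pass, sketched below (the function and its structure are invented for illustration; a production scheduler may need a more careful decomposition to always find a conflict-free assignment):

```python
# Greedy sketch: hand out one time slice at a time per physical core,
# never placing the same vCPU on two cores within the same slice.
def build_timeline(plan: dict, num_cores: int, T: int = 100):
    """plan[(i, j)] = number of slices vCPU j should receive on core i."""
    remaining = dict(plan)
    timeline = [[None] * T for _ in range(num_cores)]   # timeline[i][t] = vCPU id
    for t in range(T):
        busy = set()                                    # vCPUs already placed at slice t
        for i in range(num_cores):
            # pick any vCPU that still owes slices on core i and is free at t
            for (ci, j), left in remaining.items():
                if ci == i and left > 0 and j not in busy:
                    timeline[i][t] = j
                    remaining[(ci, j)] -= 1
                    busy.add(j)
                    break
            # if no eligible vCPU exists, core i idles in this slice
    return timeline
```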

  35. Flow of Each Interval • Tasks run in the Guest OSes; based on their loading and/or QoS, each Guest OS schedules them and adjusts its virtual core frequencies, which affects task performance. • The hypervisor scheduler takes the virtual core frequencies and generates a scheduling plan. • The scheduling plan triggers the DVFS mechanism on the physical cores, and the virtual cores are executed on the physical cores.

  36. Simulation • Conduct simulations to compare the power consumption of our asymmetry-aware scheduler with that of a credit-based scheduler.

  37. Simulation Environment • Two types of physical cores: power-hungry “big” cores (frequency: 1600MHz) and power-efficient “little” cores (frequency: 600MHz). • The DVFS mechanism is disabled.

  38. Scenario I – 2 Big and 2 Little • Each VM has two virtual cores. • Two sets of input: • Case 1: Both VMs with light workloads (250MHz for each virtual core). • Case 2: One VM with heavy workloads (1200MHz for each virtual core), the other with modest workloads (600MHz for each virtual core).

  39. Scenario I – Results • Case 1: the energy consumption of the asymmetry-aware method is about 43.2% of that of the credit-based method. • Case 2: the asymmetry-aware method uses 95.6% of the energy used by the credit-based method.

  40. Scenario 2 – 4 Big and 4 Little • The hardware specification follows an ARM 64-bit board. • Each case has three quad-core VMs.

  41. Scenario 2 – Results • In case 3, the loading of the physical cores is 100% under both methods. • Power cannot be saved when the computing resources are insufficient.

  42. Summary • We developed an energy-efficient, asymmetry-aware scheduler for asymmetric multi-core platforms. • The goal is to generate an energy-efficient scheduling plan with a performance guarantee. • Our simulation results show that the asymmetry-aware strategy saves up to 57.2% of the energy compared with the credit-based method, while still providing the performance guarantee.

  43. Q&A
