A Survey on Power Management Solutions for Individual Systems and Cloud
Outline • Motivation of Power Management • Power Management Techniques in a Single System • Dynamic Component Deactivation (DCD) • Dynamic Performance Scaling (DPS) • Power Efficient Task Assignment in Multi-core System • Power Management Techniques in Data Centers • Dynamic server Deactivation • VM Consolidation
Motivation for Reducing Power • Increasing electricity bills • The cost of the energy a server consumes over its lifetime may exceed its hardware cost, and the problem is even worse for clusters and data centers. • U.S. data center energy use could double from 2006 to 2011 to more than 120 billion kWh, equal to annual electricity costs of $7.4 billion and about 2% of all electricity
Motivation for Reducing Power • Increasing carbon dioxide (CO2) emissions • The annual source energy use of a 2 MW data center equals the energy consumed by 4,600 typical U.S. cars in one year.
Power Consumption • Where does the power go in a single computer system?
Power Consumption • Static power consumption (leakage power) • Caused by leakage current; present in any powered circuit • Independent of clock rate and usage scenario; depends on the transistor type, the process technology, and the operating temperature • [Figure: Si MOSFET]
Power Consumption • Dynamic power consumption • Created by circuit activity (i.e. transistor switching) • Depends on the usage scenario, clock rate and I/O activity • Short-circuit current • ~10%-15% of total power; so far there is no way to reduce it without compromising performance • Switched capacitance • The primary source of dynamic power • Modeled as $P_{\text{dynamic}} = a \cdot C \cdot V^2 \cdot f$ • a: switching activity; C: physical capacitance • V: supply voltage; f: clock frequency
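The switched-capacitance model is commonly written as P = a · C · V² · f. The sketch below evaluates it at a hypothetical operating point (the values of a, C, V and f are illustrative assumptions, not from the slides) to show why lowering V and f together pays off so strongly: scaling both by 0.8 cuts dynamic power to about 0.8³ of its original value.

```python
def dynamic_power(a, c, v, f):
    """Switched-capacitance power: activity * capacitance * voltage^2 * frequency."""
    return a * c * v ** 2 * f


# Hypothetical operating point: 1.2 V at 2 GHz, then V and f both scaled by 0.8.
base = dynamic_power(a=0.5, c=1e-9, v=1.2, f=2.0e9)
scaled = dynamic_power(a=0.5, c=1e-9, v=0.96, f=1.6e9)
ratio = scaled / base
print(round(ratio, 3))  # 0.512
```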
Power Management in Computer Systems • Different levels of power management
Power Management in Computer Systems • Different levels of power management • Static power management • Optimizes the circuit, logic and architecture of the system at design time • Dynamic power management • Controls the system at run time to reduce power consumption • Turn off a component when it is idle • Saves static power consumption • Decrease V and f when the CPU is not fully utilized (DVFS technique) • Saves dynamic power consumption • Energy-efficient task assignment on multi-core systems • E.g., assign tasks to one core and turn off the others • E.g., assign tasks to multiple cores and lower the frequencies
Dynamic Component Deactivation • Considerations • When to deactivate • A component that has been turned off may need to be turned on again very soon • Performance penalty of on-off-on transitions • Energy cost of on-off-on transitions • Break-even time Tbe • The minimum length of an idle period that compensates for the energy cost of the state transitions • Depends on the individual device and is independent of the requests • E.g. a device with sleep power 1 W, active power 5 W, transition energy 12 J and transition time 2 s: Tbe × 5 = 12 + 1 × (Tbe − 2), so Tbe = 2.5 s • The idle time must be longer than Tbe for turning off to be worthwhile • How do we know the length of a future idle period? • Based on predictions
Techniques based on DCD • Aggressive on/off switching • Turn off the component whenever it is idle • Activate the component when a new task arrives • Static timeout technique • When idle, wait for a fixed time T0 • If no task arrives within T0, turn off • Adaptive techniques • Apply the timeout technique with a timeout value that is adjusted adaptively • Increase the timeout value if the actual idle time is smaller than the predicted one • Decrease the timeout value if the actual idle time is larger than the predicted one
Techniques based on DCD • Predictive technique • Predict the idle time from historical behavior • If the predicted idle time is long enough, turn off the component • Stochastic technique • Model request arrivals and device power-state transitions as Markov processes • Requests arrive and power-state transitions occur with their own probabilities; the transition policy is obtained by solving a stochastic optimization problem
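The break-even example can be checked with a few lines of Python. This is a minimal sketch under the energy model used in the example (staying on costs the on-state power for the whole idle period; sleeping costs the transition energy plus the sleep power for the remainder); the function and parameter names are hypothetical.

```python
def break_even_time(p_on, p_sleep, e_transition, t_transition):
    """Idle length (s) at which sleeping costs the same energy as staying on.

    Solves p_on * T = e_transition + p_sleep * (T - t_transition) for T.
    """
    if p_on <= p_sleep:
        raise ValueError("sleeping must draw less power than staying on")
    t_be = (e_transition - p_sleep * t_transition) / (p_on - p_sleep)
    # An idle period shorter than the transition itself can never be exploited.
    return max(t_be, t_transition)


def worth_turning_off(idle_time, t_be):
    """Deactivation pays off only if the idle period exceeds the break-even time."""
    return idle_time > t_be


t_be = break_even_time(p_on=5.0, p_sleep=1.0, e_transition=12.0, t_transition=2.0)
print(t_be)  # 2.5
```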
Techniques based on DCD • Paper: Adaptive Timeout Policies for Fast Fine-Grained Power Management, B. Kveton, S. Mannor, 2007, AAAI • Introduces and compares three DCD techniques
Techniques based on DCD • 1. Static timeout policy • When idle, wait for a fixed time period T0 • If no task arrives within T0 • The probability that more than Tbe of idle time remains is high • Turn off the CPU • How to determine T0? • Generally chosen based on Tbe; it can simply be set to Tbe
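To see what a fixed timeout buys, the sketch below sums the energy spent over a trace of idle periods. The device parameters reuse the earlier break-even example; the trace itself and the function name are made up for illustration, and wake-up latency is ignored.

```python
def idle_energy(idle_periods, timeout, p_on=5.0, p_sleep=1.0,
                e_transition=12.0, t_transition=2.0):
    """Energy (J) spent over a list of idle periods under a fixed timeout T0."""
    total = 0.0
    for idle in idle_periods:
        if idle <= timeout:
            # A request arrived before the timeout expired: component stayed on.
            total += p_on * idle
        else:
            # Stay on for T0, pay the transition energy, sleep for what is left.
            asleep = max(idle - timeout - t_transition, 0.0)
            total += p_on * timeout + e_transition + p_sleep * asleep
    return total


trace = [0.5, 1.0, 10.0, 0.8, 20.0]  # hypothetical idle periods in seconds
always_on = idle_energy(trace, timeout=float("inf"))
with_timeout = idle_energy(trace, timeout=2.5)  # T0 set to Tbe
print(always_on, with_timeout)
```

With this trace the two long idle periods dominate, so the timeout policy spends roughly half the energy of leaving the component on.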
Techniques based on DCD • 2. Adaptive timeout policy • Initially set the timeout threshold to Tbe (5 ms in this paper) • Adaptively change the timeout threshold based on how the previous timeout performed • TA(t): the adaptive timeout at time t • Raise TA(t) to 5 ms whenever the policy incurs latency • Otherwise decrease TA(t) toward 1 ms
Techniques based on DCD • 3. Online learning adaptive policy • Weight different static policies by their estimated historical performance to obtain the best global policy • Gradually decrease the weight of a policy that performed badly • Gradually increase the weight of a policy that performed well • TG: the global timeout policy to be learned • Ti: the candidate static timeout policies • If n policies are used for learning, then 1 ≤ i ≤ n • wi(t): the weight of policy Ti at time t • wi(t) is updated based on the historical performance of policy Ti
Techniques based on DCD • 3. Online learning adaptive policy (continued) • Symbols used in the weight update: • wi(t): the weight of policy Ti • η > 0: the learning rate • 0 < α < 1: a shared parameter • Loss(t, Ti): the loss incurred if Ti had been applied between times t−1 and t • n: the total number of policies • cwasting: the idle time during which the device is not asleep • clatency: the latency caused by the static policy
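The slide's own update formulas are not reproduced here, so the following is only a sketch of one plausible scheme: a fixed-share style exponential-weights update, with the global timeout taken as the weighted average of the candidate timeouts. Both choices are assumptions, as are the function names and the loss values; the per-policy loss would be built from cwasting and clatency.

```python
import math


def share_update(weights, losses, eta=1.0, alpha=0.1):
    """One fixed-share step (assumed form): down-weight by loss, then share."""
    n = len(weights)
    scaled = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    # Each policy keeps (1 - alpha) of its weight and receives an equal part
    # of the alpha fraction given up by the other n - 1 policies.
    shared = [(1 - alpha) * s + alpha * (total - s) / (n - 1) for s in scaled]
    norm = sum(shared)
    return [s / norm for s in shared]


def global_timeout(weights, timeouts):
    """Assumed combination rule: weighted average of the static timeouts."""
    return sum(w * t for w, t in zip(weights, timeouts)) / sum(weights)


timeouts = [1.0, 2.0, 3.0, 4.0, 5.0]      # candidate static policies Ti (ms)
weights = [1.0 / len(timeouts)] * len(timeouts)
losses = [0.1, 0.5, 1.0, 1.5, 2.0]        # hypothetical per-step losses
for _ in range(10):
    weights = share_update(weights, losses)
t_g = global_timeout(weights, timeouts)
print(weights, t_g)
```

Because the shortest timeout has the smallest loss in this made-up run, its weight grows and the global timeout drifts toward it, while the sharing term keeps every policy's weight above zero so the learner can recover if the workload changes.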
Techniques based on DPS • What is DPS? • Dynamic Performance Scaling • E.g. frequency and voltage scaling for the CPU • How does DPS save energy? • Reducing V and f reduces power consumption • Reducing the CPU frequency saves total energy in any case • How much is saved depends on the characteristics of the tasks • V: voltage, f: frequency • [Figure: experiments on two tasks with different memory access patterns; DCM/instr: cache misses per instruction] • Task 1: high memory access, DCM/instr = 10^-2 • Task 2: low memory access, DCM/instr = 10^-6
Techniques based on DPS • On-demand • Periodically monitor the CPU utilization • Keep the CPU utilization between 30% and 80% • Raise the CPU frequency to the highest level when the CPU utilization is above 80% • Lower the CPU frequency to the lowest level when the CPU utilization is below 30% • Power save • Always use the lowest frequency • Performance • Always use the highest frequency
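A threshold-based on-demand policy can be sketched as a single decision function: a busy CPU gets the highest frequency, a mostly idle CPU drops to the lowest, and anything in between is left alone. The frequency table and function name are hypothetical, and this simplified version is not the exact algorithm of any real governor.

```python
FREQS_MHZ = [600, 800, 1000, 1200]  # hypothetical available frequencies


def ondemand_step(utilization, current_freq, freqs=FREQS_MHZ,
                  up_threshold=0.80, down_threshold=0.30):
    """Pick the next CPU frequency from the utilization of the last period."""
    if utilization > up_threshold:
        return max(freqs)   # CPU is saturated: jump to the highest frequency
    if utilization < down_threshold:
        return min(freqs)   # CPU is mostly idle: drop to the lowest frequency
    return current_freq     # within the target band: keep the current setting


print(ondemand_step(0.95, 800))  # 1200
print(ondemand_step(0.10, 800))  # 600
print(ondemand_step(0.50, 800))  # 800
```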
Techniques based on DPS • Task categorization based approach • Categorize tasks by QoS requirement • Soft real-time tasks, e.g. a video decoder • Set the lowest frequency at which the decoder can still decode each frame within the decoding period • Interactive tasks, e.g. an e-book reader • Start at the lowest frequency and increase it gradually within the human-perceivable time (typically 50 ms) • Batch tasks, e.g. most background tasks • Set the lowest available frequency, since the user does not care about their performance • The frequency is chosen subject to the specific QoS requirement • Save as much power as possible while the QoS requirement is satisfied
Techniques based on DPS • Power-budget based approach • Aim: do not exceed the power budget • Policy: set the highest CPU frequency that stays within the power budget • Choose the CPU frequency from the system power model and the power budget constraint • User-specified performance loss based approach • Reducing the frequency also reduces system performance • The user specifies the performance loss that can be tolerated in exchange for power savings • Choose the CPU frequency based on that performance loss level
Techniques based on DPS • Paper: Chameleon: Application-Level Power Management, X. Liu, P. Shenoy, 2008, IEEE Transactions on Mobile Computing • Tasks are categorized and a different policy is applied to each category • Tasks in different categories have different QoS requirements • Soft real-time tasks • Have a specific deadline, e.g. • Video player • Interactive tasks • User-interactive, need a good response time, e.g. • E-book reader • Word processor • Batch tasks • The user is unlikely to care about their performance, e.g. • Background tasks
Techniques based on DPS • Categorize tasks, with a different policy for each category • Soft real-time task • Set the lowest frequency at which the task still finishes before its deadline • Deadline • Already known when the task starts • Estimate processor demand • CPU time needed to finish the task at the maximum frequency fmax • Estimate processor availability • CPU time slice allocated to the task • Determine the CPU frequency • Attempt to "match" the processor demand to the processor availability • E.g. if the demand is half of the processor availability, set the frequency to half of fmax
Techniques based on DPS • Soft real-time task (continued) • Three execution scenarios for a soft real-time task (t: task arrival time, d: task deadline) • Case 1: the task starts too late; run at full speed • Case 2: not enough CPU time slice; run at full speed • Case 3: slow down the CPU to save power
Techniques based on DPS • E.g. video decoder • Decoding each frame is one task • t: the timestamp at which the decoding task is initiated • Already known • c: the time to decode the next frame at full speed • Predicted by linear regression on the frame size • e: the estimated CPU time quota scheduled until the frame deadline • Exponential moving average of e over the past periods • d: the deadline for decoding the frame • Already known
Techniques based on DPS • Interactive task • Human-perceivable threshold T0 • A delay cannot be perceived if the task responds within T0 • Experiments show T0 is 50~100 ms • Set the lowest frequency at which the task can respond within T0 • Possible approach • Estimate c, e and d, then determine the frequency • Drawback: too complex, since an application (e.g. a word processor or web browser) has too many interactive events
Techniques based on DPS • Interactive task • Proposed approach: gradual processor acceleration (GPA) • Start at the lowest frequency and increase it gradually within the human-perceivable threshold, e.g. • T0 = 50 ms, with inspection points at 30 ms, 40 ms and 50 ms • Run at the lowest speed for the first 30 ms • If the task has not finished, raise the speed to the medium level for the next 10 ms • If the task has still not finished, run at the highest speed for the last 10 ms, and keep the highest speed until it finishes • [Figure: timeline with marks at 0 ms, 30 ms, 40 ms and 50 ms]
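The three scenarios and the "match demand to availability" rule can be combined into one selection function. This is a minimal sketch, assuming the target frequency is fmax · c / e rounded up to the next available level; the frequency table, the function name and the numbers in the usage lines are hypothetical.

```python
FREQS_MHZ = [600, 800, 1000, 1200]  # hypothetical available frequencies


def pick_frequency(t, c, e, d, freqs=FREQS_MHZ):
    """Choose a CPU frequency for one soft real-time task.

    t: start timestamp, c: demand at full speed, e: CPU time available
    before the deadline, d: deadline (all in the same time unit).
    """
    f_max = max(freqs)
    if t + c > d:
        return f_max  # Case 1: started too late, run at full speed
    if e <= c:
        return f_max  # Case 2: not enough CPU time allotted, run at full speed
    target = f_max * c / e  # Case 3: slow down to match demand to availability
    # Round up to the nearest available level so the deadline is still met.
    return min(f for f in freqs if f >= target)


print(pick_frequency(t=0, c=10, e=20, d=33))  # 600
print(pick_frequency(t=0, c=10, e=14, d=33))  # 1000
print(pick_frequency(t=30, c=10, e=20, d=33))  # 1200
```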
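The GPA schedule amounts to a step function of the time elapsed since the interactive event began. The sketch below uses the 30 ms and 40 ms inspection points from the example; the three-level frequency table and the function name are assumptions.

```python
def gpa_frequency(elapsed_ms, freqs=(600, 800, 1000), checkpoints=(30, 40)):
    """Frequency to use `elapsed_ms` after an interactive event started."""
    ordered = sorted(freqs)
    low, high = ordered[0], ordered[-1]
    medium = ordered[len(ordered) // 2]
    if elapsed_ms < checkpoints[0]:
        return low     # first 30 ms: lowest speed
    if elapsed_ms < checkpoints[1]:
        return medium  # 30-40 ms: medium speed
    return high        # from 40 ms on: highest speed until the task finishes


for ms in (0, 35, 45, 80):
    print(ms, gpa_frequency(ms))
```

A task that finishes early never leaves the lowest frequency, which is where the power saving comes from; only slow events pay for the higher levels.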
Power Efficient Task Assignment in Multi-core System • Paper: Load Unbalancing Strategy for Multicore Embedded Processors, 2010 • Considers energy-efficient task assignment strategies for multicore platforms • Tasks need to be assigned to cores before they are executed • Concentrate tasks on a single core • Saves more power by turning off more cores • But degrades performance • Distribute tasks evenly • Achieves better performance • But costs more power • Task assignment also depends on the type of task • Periodic task: needs to preserve cache data • Aperiodic task: cares more about response time
Power Efficient Task Assignment in Multi-core System • Task assignment strategies • Load balancing • Distribute the tasks across cores for concurrent execution and shorter execution time • Advantage: improves performance • Disadvantage: higher power consumption • Load unbalancing • Consolidate the tasks on the same core • Advantage: reduces power consumption to some extent, but not fully • Disadvantages: • Degrades performance • May lead to L1 cache misses (too many tasks on one core) • A shared L2 cache can reduce the L1 cache miss penalty
Power Efficient Task Assignment in Multi-core System • Types of tasks • Periodic task: occurs periodically • Examples: multimedia streaming, running an MP3 player • Aperiodic task: occurs sparsely • Examples: user-interactive programs, touching the screen, pushing a button • Performance metric • Periodic task • Deadline satisfaction rather than fast execution • Cache memory cannot be shut down; it must be put in a low-power (standby) state to retain the context of the task • Aperiodic task • Response time, waiting time • Cache memory can be shut down if not required
Power Efficient Task Assignment in Multi-core System • Distributed vs. concentrated periodic tasks from the perspective of power consumption • Example: 30 fps for H.264 and AAC (video and audio codecs) • Mean decoding time of 30 frames of H.264 in QCIF format = 200 ms • Period T = 1 s • Execution time C = 200 ms • Utilization U = C/T = 0.2
Power Efficient Task Assignment in Multi-core System • Distributed vs. concentrated aperiodic tasks from the perspective of power consumption and mean waiting time • [Figures: mean waiting time of aperiodic tasks; energy consumption for 1 s]
Power Efficient Task Assignment in Multi-core System • Task-character based load unbalancing technique • Periodic tasks • Load concentration (unbalancing) technique • Assign the periodic tasks to the minimum number of cores • such that all deadlines can still be met under a real-time scheduler • Aperiodic tasks • Load balancing technique (tasks execute concurrently) • Distribute the aperiodic tasks evenly over the maximum number of cores • The mean waiting time of the tasks is shortened • Free cores are available to run them, since the periodic tasks are concentrated on one core
Power Efficient Task Assignment in Multi-core System • Experimental setup • Processor: ARM11 MPCore • Cores: four ARM1136 • Clock frequency: 210 MHz / 30 MHz • Cache: four 32 KB L1 caches (the data cache is write-through) and one 1 MB L2 cache • Board: CT11MPCore + Emulation Baseboard • RTOS: LEO multi-core version (Samsung LEO multiprocessor, spin lock, load balancing, multiple ready queues) • Application: Samsung's proprietary H.264 codec
Power Efficient Task Assignment in Multi-core System • Proposed load unbalancing vs. load balancing vs. aggressive load unbalancing (periodic + aperiodic tasks) • [Figures: energy consumption for 12 s; mean waiting time of aperiodic tasks]
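The two placement rules can be sketched as follows. The schedulability test used here (total utilization per core at most 1.0, as under EDF) and the first-fit-decreasing packing order are assumptions made for illustration; the paper's own scheduler and test may differ, and the function names are hypothetical.

```python
def pack_periodic(utilizations, n_cores, bound=1.0):
    """Concentrate periodic tasks on as few cores as a first-fit pass allows."""
    loads = []       # utilization per active core
    assignment = {}  # task index -> core index
    order = sorted(range(len(utilizations)), key=lambda i: -utilizations[i])
    for idx in order:
        u = utilizations[idx]
        for core, load in enumerate(loads):
            if load + u <= bound + 1e-9:
                loads[core] += u
                assignment[idx] = core
                break
        else:
            # No active core can take the task: wake up one more core.
            if len(loads) == n_cores:
                raise ValueError("task set does not fit on the available cores")
            loads.append(u)
            assignment[idx] = len(loads) - 1
    return assignment, loads


def spread_aperiodic(n_tasks, n_cores):
    """Balance aperiodic tasks round-robin over all cores."""
    return [i % n_cores for i in range(n_tasks)]


# Four periodic decoders with U = C/T = 0.2 each fit on a single core.
assignment, loads = pack_periodic([0.2, 0.2, 0.2, 0.2], n_cores=4)
print(assignment, loads)
print(spread_aperiodic(6, 4))  # [0, 1, 2, 3, 0, 1]
```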
Outline • Motivation of Power Management • Reduce cost • Reduce CO2 Emissions • Power Management Techniques in Single Devices • Dynamic Component Deactivation (DCD) • Dynamic Performance Scaling (DPS) • Power Efficient Task Assignment in Multi-core System • Power Management Techniques in Data Centers • Turn off or hibernate idle servers • VM consolidation
Power Consumption • Where does the power go in a data center?
Power Management Techniques in Data Centers • Approaches to save power • Turn off or hibernate idle servers • Dynamically scale operating frequency/voltage (DVFS) for underutilized servers • VM consolidation
PM Techniques in Data Centers • Load Management for Power and Performance in Clusters • Homogeneous cluster • Only server power switching is used • The difference in power consumption between an idle node (70 W) and a fully utilized node (94 W) is relatively small, so using fewer servers always saves more power for the same workload • Predict the workload and the resulting performance improvement or degradation by keeping track of the demand for resources • The acceptable performance degradation (QoS) is specified by users • Activate as few servers as possible
PM Techniques in Data Centers • Energy-Efficient Server Clusters • Homogeneous clusters • Even workload distribution • Vary-on/vary-off mechanism • Independent DVFS (IVS) • Coordinated DVFS (CVS) • N or N+1 servers to process the current workload? • The next slide shows how to decide
PM Techniques in Data Centers • Energy-Efficient Server Clusters • Compare the power consumption of N servers with that of N−1 servers: when does the power consumption of N servers exceed that of N−1 servers? • When the servers' frequency decreases to fvaryoff(n), one server should be turned off • When the servers' frequency increases to fvaryon(n), one more server should be turned on
PM Techniques in Data Centers • PM in virtualized data centers • Multiple QoS requirements • Resource allocation between VMs • DVFS may affect all VMs hosted on one server • Live migration: a VM can be moved from one host to another
PM Techniques in Data Centers • Paper: VirtualPower: Coordinated Power Management, 2007 • Goal: support the isolated and independent operation of guest VMs • How: • Intercept guest VMs' ACPI calls that request power-state changes • Map them onto 'soft' states and use them as hints for actual changes to the hardware's power state • Soft scaling: scale down a VM's performance by giving it less time on the resource, which increases the server's idle time and saves power • Global policies • Use knowledge of rack- or blade-level characteristics and requirements to consolidate VMs through migration • Then hibernate idle physical servers to save power
Power Management Techniques in Data Centers • Paper: Energy Aware Consolidation for Cloud Computation, 2009 • How workload consolidation affects system performance and energy consumption
Power Management Techniques in Data Centers • Energy Aware Consolidation for Cloud Computation • How workload consolidation affects system performance and energy consumption • Consolidation increases resource utilization, which degrades performance in a nonlinear way • Energy consumption per transaction follows a "U-shaped" curve as a function of resource utilization • At low utilization, idle power is not amortized effectively • At high utilization, performance degradation increases the energy per transaction • An optimal combination of CPU and disk utilization exists • Aim for this optimal combination of CPU and disk utilization during consolidation • This is a multi-dimensional bin packing problem, which is NP-hard • A heuristic approach is needed
Power Management Techniques in Data Centers • Energy Aware Consolidation for Cloud Computation • Proposed heuristic consolidation algorithm • Assume that resource utilization is additive • When a request arrives, allocate it to the server for which the Euclidean distance between the new utilization and the optimal point is minimized • When the request cannot be allocated to any active server, start a new server chosen at random
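The slide's power expressions are not reproduced here, so the sketch below assumes a per-server model P(f) = c0 + c1 · f³ (the cubic term follows from dynamic power scaling with V² · f when V scales with f) and an evenly shared workload, so that N−1 servers must run at f · N/(N−1) to do the work of N servers at f. Setting N · P(f) equal to (N−1) · P(f · N/(N−1)) gives the vary-off threshold. The constants, names and normalized frequency scale are illustrative assumptions.

```python
def cluster_power(n, f, c0, c1):
    """Total power of n servers at normalized frequency f (assumed model)."""
    return n * (c0 + c1 * f ** 3)


def vary_off_frequency(n, c0, c1, f_max=1.0):
    """Frequency below which n - 1 servers draw less power than n servers."""
    # From n*(c0 + c1*f^3) = (n-1)*(c0 + c1*(f*n/(n-1))^3), solved for f.
    power_limit = (c0 * (n - 1) ** 2 / (c1 * n * (2 * n - 1))) ** (1.0 / 3.0)
    # The remaining servers must also be able to absorb the load at all.
    capacity_limit = f_max * (n - 1) / n
    return min(power_limit, capacity_limit)


n, c0, c1 = 4, 10.0, 84.0  # hypothetical constants (W, and W at f = 1)
f_off = vary_off_frequency(n, c0, c1)
f = 0.9 * f_off
print(f_off)
print(cluster_power(n, f, c0, c1) > cluster_power(n - 1, f * n / (n - 1), c0, c1))
```

When idle power dominates (large c0 relative to c1), the power threshold exceeds the capacity limit and the decision reduces to "turn a server off as soon as the rest can carry the load".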
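The placement rule can be sketched in a few lines, with each server described by its (CPU, disk) utilization. The optimal point used below is a hypothetical value chosen for illustration, the function name is made up, and "start a new server at random" is simplified to appending a fresh server.

```python
import math


def place_request(servers, request, optimum=(0.7, 0.5)):
    """Place a (cpu, disk) request on the server that ends up closest to the
    optimal utilization point; open a new server if none has room."""
    best, best_dist = None, math.inf
    for i, (cpu, disk) in enumerate(servers):
        new_cpu, new_disk = cpu + request[0], disk + request[1]
        if new_cpu > 1.0 or new_disk > 1.0:
            continue  # utilizations are additive and cannot exceed 100%
        dist = math.hypot(new_cpu - optimum[0], new_disk - optimum[1])
        if dist < best_dist:
            best, best_dist = i, dist
    if best is None:
        servers.append(request)  # no active server fits: start a new one
        return len(servers) - 1
    cpu, disk = servers[best]
    servers[best] = (cpu + request[0], disk + request[1])
    return best


servers = [(0.3, 0.2), (0.6, 0.45)]
first = place_request(servers, (0.2, 0.1))   # lands on server 1
second = place_request(servers, (0.5, 0.5))  # server 1 is full, so server 0
third = place_request(servers, (0.9, 0.9))   # fits nowhere, so a new server
print(first, second, third, len(servers))
```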