This article delves into the energy efficiency of database servers, evaluating performance through completed tasks per energy consumed. It analyzes various components such as CPUs, RAM, and storage, highlighting their power consumption behaviors. Different optimization strategies, including hardware management and workload distribution, are discussed to improve energy efficiency without severely affecting performance. Key findings reveal significant variations in CPU power usage based on database operations, underscoring the importance of tailored configurations for energy-efficient database computing.
Analyzing the Energy Efficiency of a Database Server Hanskamal Patel SE 521
Article • Analyzing the Energy Efficiency of a Database Server • Dimitris Tsirogiannis – University of Toronto • Stavros Harizopoulos – HP Labs • Mehul A. Shah – HP Labs
Introduction • Database system performance is measured in tasks per second or queries per second. • Similarly, energy efficiency is measured in completed tasks per unit of energy, i.e. queries per joule. • Performance improvements are either hardware/platform oriented or workload-management oriented. • The article explores ways to improve the energy efficiency of a single-machine database server.
Power Breakdown • About half of the peak power is drawn by the idle system • Two CPUs • Fixed RAM power • Board components • SSD and HDD: minimal power • Left side of the chart is active power consumption • CPU is the dominant component • SSD and HDD draw similar power
What affects energy efficiency? • EE = Work/Energy = Performance/Power • Several options affect power use and potentially affect energy efficiency • CPU cycles to fetch data from disk • Scans, record access, compression, sorting, and joining • Energy efficiency can be improved, but possibly at the cost of performance
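The identity EE = Work/Energy = Performance/Power can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names and all measurements (query count, elapsed time, average power) are hypothetical values chosen for illustration, not figures from the article.

```python
def energy_efficiency(work_done: float, energy_joules: float) -> float:
    """Work per unit of energy, e.g. queries per joule."""
    if energy_joules <= 0:
        raise ValueError("energy must be positive")
    return work_done / energy_joules


def efficiency_from_rates(throughput: float, avg_power_watts: float) -> float:
    """Same metric from rates: (work per second) / (joules per second)."""
    if avg_power_watts <= 0:
        raise ValueError("power must be positive")
    return throughput / avg_power_watts


# Hypothetical run: 1200 queries in 60 s at an average draw of 200 W.
queries = 1200
elapsed_s = 60.0
avg_power_w = 200.0

energy_j = avg_power_w * elapsed_s                 # energy = power * time
ee_from_totals = energy_efficiency(queries, energy_j)
ee_from_rates = efficiency_from_rates(queries / elapsed_s, avg_power_w)

print(ee_from_totals, ee_from_rates)               # both in queries per joule
```

Both routes give the same number because dividing work and energy by the same elapsed time leaves the ratio unchanged.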
Energy efficiency vs. Performance • Experimented with five different overhead kernels • Parallel performing, cache-conscious hash join, sorting, alphasort and parallel merging • High-performance storage engine that supports column-oriented and row-oriented database scans • PostgreSQL and System-X DBMS
Assembling data-management architectures • Scale-up • Shared memory and shared disk • Choosing the balance of components and powering down unneeded resources • Scale-out • Shared nothing • Single-node configurations connected by a scaled network • Choose energy-efficient components for one node and performance-optimized components for another
Power Profiles of Hardware Components • RAM • RAM accounts for 20% of the power consumption, and its draw stays the same throughout • The only way to vary memory power usage is to physically remove modules from the board
Power Profiles of Hardware Components • Disks • Both HDD and SSD in the configuration • Support active and idle states, consuming different amounts of power – 15% in the active state • Test Configuration • RAID-0 configuration for both HDD and SSD • Reading a 100 GB file with a block size of 128 KB
Power Profiles of Hardware Components • CPU • The two CPUs account for 85% of the system's power increase while active • Interested in understanding: • How CPU power is affected by database operations, and how effective hardware and software power management are • Developed a set of micro-benchmarks that perform three classes of database operations: hashing, sorting, and scans.
Micro-benchmarks • Custom Join Kernel • Hash join algorithm for computing the join of two relations in parallel • Sort Kernel • Two in-memory parallel sorting algorithms • Scan Kernel • Scan uncompressed rows in memory • Scan compressed columns on disk
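To make the join kernel concrete, here is a minimal sketch of the build and probe phases of a hash join. It is sequential rather than parallel, it is not the kernel from the article, and the function name, row layout, and sample relations are all assumptions made for illustration.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dict rows using a build phase and a probe phase."""
    # Build phase: hash the (ideally smaller) relation on its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: look up each row of the other relation in the hash table.
    joined = []
    for row in probe_rows:
        for match in table.get(row[probe_key], ()):
            joined.append({**match, **row})
    return joined


# Hypothetical sample relations.
customers = [
    {"cust_id": 10, "name": "Ada"},
    {"cust_id": 30, "name": "Bob"},
]
orders = [
    {"order_id": 1, "cust_id": 10},
    {"order_id": 2, "cust_id": 20},
    {"order_id": 3, "cust_id": 10},
]

result = hash_join(customers, orders, "cust_id", "cust_id")
for row in result:
    print(row)
```

A parallel variant would typically partition both relations on the join key and run this build/probe pair per partition, which is one place where the degree of parallelism becomes a tunable parameter.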
Energy vs. Performance • Parameters that have greatest impact on energy • Algorithm/plan selection • Intra-operator parallelism • Inter-query parallelism
Algorithm/Plan selection • Access Methods • Join Algorithms • Complex Queries and Join Ordering
Intra-operator and Inter-query Parallelism • Intra-operator parallelism • Parallel hash join • Parallel Sorts • Inter-query parallelism • Executing multiple queries at the same time
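Intra-operator parallelism for sorting can be sketched as: split the input into chunks, sort the chunks concurrently, then merge the sorted runs. The example below is an illustration only and is not one of the article's sort kernels; `parallel_sort` and its `workers` parameter are hypothetical names. It uses threads to stay self-contained, so under CPython's global interpreter lock it shows the structure of the technique rather than a real speedup.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(values, workers=4):
    """Sort by chunking, sorting chunks concurrently, and merging the runs."""
    if not values:
        return []
    if workers < 1:
        raise ValueError("workers must be at least 1")

    # Ceiling division so every element lands in exactly one chunk.
    chunk_size = -(-len(values) // workers)
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]

    # Each worker sorts one chunk independently (the intra-operator parallel step).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sorted, chunks))

    # A k-way merge combines the sorted runs into one sorted output.
    return list(heapq.merge(*sorted_runs))


data = [42, 7, 19, 3, 88, 1, 56, 23, 7, 64]
print(parallel_sort(data, workers=3))
```

The `workers` argument plays the role of the degree of parallelism, the knob whose effect on energy and performance the slides list as a key parameter.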
Implications for Database Computing • One size fits all • Collection of nodes, where each node is optimized for a specific task • CPUs with high parallelism, low frequency, small caches, and a simple design • Solid state drives • Shared nothing, everything, or in-between • Shared nothing and shared disk • Controlling peak power
Conclusion • CPU power usage by different operators can vary by up to 60% • The best performing system was the most energy efficient • Future investigations: • Improving resources across unutilized nodes to save power • Alternative energy efficient hardware for lower fixed-power cost
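One way to see why the fastest configuration can also be the most energy efficient is that a large fixed idle power is paid for the whole run, so finishing sooner saves energy even when active power is higher. The sketch below uses a simple model, energy = (idle power + active power) × time, with entirely hypothetical wattages and run times; it is not data from the article.

```python
def energy_joules(idle_w: float, active_w: float, seconds: float) -> float:
    """Total energy for one run: fixed idle draw plus active draw, over the run time."""
    return (idle_w + active_w) * seconds


# Hypothetical configurations completing the same workload.
# "fast" draws more active power but finishes sooner.
fast = energy_joules(idle_w=150.0, active_w=150.0, seconds=10.0)
# "slow" halves active power but takes 50% longer.
slow = energy_joules(idle_w=150.0, active_w=75.0, seconds=15.0)

print(f"fast: {fast:.0f} J, slow: {slow:.0f} J")
```

In this model the slower configuration only wins if its run time grows by less than the ratio of the two total power draws, which becomes harder as the idle share of power increases.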