
Application-driven Energy-efficient Architecture Explorations for Big Data

Authors: Xiaoyan Gu, Rui Hou, Ke Zhang, Lixin Zhang, Weiping Wang (Institute of Computing Technology, Chinese Academy of Sciences). Reviewed by: Siddharth Bhave (University of Washington, Tacoma).


Presentation Transcript


  1. Application-driven Energy-efficient Architecture Explorations for Big Data Authors: Xiaoyan Gu, Rui Hou, Ke Zhang, Lixin Zhang, Weiping Wang (Institute of Computing Technology, Chinese Academy of Sciences) Reviewed by: Siddharth Bhave (University of Washington, Tacoma)

  2. Big Data • What is Big Data? • Problems with Big Data • Energy consumption • Velocity (operation latency and throughput) • Volume (storage capacity) • Variety • Managing Big Data problems • Storage technologies • Partitioning • Multithreading • Parallel processing • Efficient architecture • Hadoop, MapReduce, Mahout • Finding the bottleneck

  3. Introduction • Big data management at the architecture level • Two architecture systems • Xeon-based cluster • Atom-based (micro-server) cluster • Comparison based on: • Energy consumption • Execution time

  4. Motivation • Ever-increasing data volumes • Energy and time tradeoff between Xeon-based and Atom-based clusters • Bottleneck caused by compression/decompression • Stateless data processing

  5. Mastiff • Mastiff: the target application for performance analysis • Big data processing engine • Columnar store policy

  6. Workflow of Mastiff

  7. Methodology • TPC-H benchmark queries run on concurrent data • 1 TB of verification data • Two cases: data load and data query • Power measured with a Fluke NORMA 4000 power analyzer • Average and median results are reported

  8. Power and Performance Evaluation • Three cluster configurations compared on time and energy consumption • 31 nodes – Atom cluster (1 master node) • 31 nodes – Xeon cluster (1 master node) • 16 nodes – Xeon cluster (1 master node)

  9. Power and Performance Evaluation (cont’d) Energy consumption of the 30-node Atom cluster compared with the 30-node Xeon cluster

  10. Power and Performance Evaluation (cont’d) Energy consumption of the 30-node Atom cluster compared with the 15-node Xeon cluster

  11. Power and Performance Evaluation (cont’d) Time Breakdown in Map Phase

  12. Power and Performance Evaluation (cont’d) Time Breakdown in Reduce Phase

  13. Findings • The Atom platform is more power-efficient • Data compression and decompression take up a significant percentage of the time • Compression and decompression can be done in a software-pipelined fashion, i.e. as multiple interleaved tasks
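The interleaving idea in the findings can be illustrated in software. The following is a minimal sketch, not the authors' implementation: it assumes the data can be split into independent chunks, uses Python's standard `zlib` as a stand-in for whatever codec Mastiff actually uses, and the function names, chunk size, and worker count are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut the input into fixed-size, independently compressible chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_interleaved(data: bytes, chunk_size: int = 64 * 1024,
                         workers: int = 4) -> list[bytes]:
    """Compress several chunks concurrently (hypothetical helper).

    zlib releases the interpreter lock while compressing, so the worker
    threads can overlap; pool.map returns results in input order.
    """
    chunks = split_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, chunks))


def decompress_interleaved(blocks: list[bytes], workers: int = 4) -> bytes:
    """Decompress the blocks concurrently and reassemble them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, blocks))


if __name__ == "__main__":
    payload = b"big data " * 100_000
    blocks = compress_interleaved(payload)
    restored = decompress_interleaved(blocks)
    print(len(blocks), "blocks, round trip ok:", restored == payload)
```

Chunking trades some compression ratio for parallelism, since each chunk is compressed without knowledge of the others.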

  14. Propositions • Heterogeneous architecture • Accelerators to perform data compression/decompression • Multiple interleaved compression/decompression

  15. Off-chip and On-chip Accelerators

  16. Multiple Interleaved Tasks

  17. Strengths • A much-needed, innovative concept • Well organized • Detailed description of the energy and time investigation • Propositions are already implemented

  18. Weaknesses • Not enough power meters to monitor all nodes • Two assumptions are made: • The power of every network router is counted evenly towards the nodes • The energy consumption of each node is similar • Results obtained on Hadoop are generalized, even though they might not hold for every application • Vague implementation of the propositions
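The effect of the two assumptions listed above can be made concrete with a small calculation. This is only a sketch of how such an estimate might be formed, not the paper's actual procedure; the function name, parameters, and all numbers are hypothetical.

```python
from statistics import mean


def estimate_cluster_energy_joules(metered_node_watts: list[float],
                                   n_nodes: int,
                                   router_watts: float,
                                   runtime_s: float) -> float:
    """Estimate whole-cluster energy from a few metered nodes.

    Assumption 1: router power is split evenly across all nodes.
    Assumption 2: every unmetered node draws the same power as the
    average of the metered ones.
    """
    avg_node_watts = mean(metered_node_watts)   # assumption 2
    router_share_watts = router_watts / n_nodes  # assumption 1
    per_node_watts = avg_node_watts + router_share_watts
    # Energy (J) = power (W) * time (s), summed over all nodes.
    return per_node_watts * n_nodes * runtime_s


if __name__ == "__main__":
    # Hypothetical readings: two metered nodes, 30 nodes, one 60 W router.
    joules = estimate_cluster_energy_joules([40.0, 44.0], 30, 60.0, 100.0)
    print(f"estimated energy: {joules:.0f} J")
```

If the metered nodes are not representative, the error in their average is multiplied by the node count, which is why the reviewer lists the limited metering as a weakness.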

  19. FAWN: A Fast Array of Wimpy Nodes Authors: David G. Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan (Carnegie Mellon University)

  20. Introduction • High-performance, energy-efficient system for storage • Large number of small, low-performance (hence "wimpy") nodes with moderate amounts of local storage • Two parts: FAWN-DS (data store) and FAWN-KV (key-value) • Motivation • Traditional architectures consume too much power • I/O bottleneck due to the limitations of current storage

  21. Features • Pairs low-power embedded nodes with flash storage • FAWN-DS is the backend, consisting of a large number of nodes • Each node has some RAM and flash • FAWN-KV is a consistent, replicated, highly available, high-performance key-value storage system

  22. FAWN Architecture

  23. Efficient Data Streaming with On-chip Accelerators: Opportunities and Challenges Authors: Rui Hou, Lixin Zhang, Michael C. Huang, Kun Wang, Hubertus Franke, Yi Ge, Xiaotao Chang (University of Rochester)

  24. Motivation • Transistor density keeps increasing • Many cores are integrated on a single die • Advantages of an on-chip accelerator over one attached via PCI

  25. On-Chip Accelerator Architecture

  26. Features • Three types of accelerators • Crypto accelerators • Decompression accelerators • Network offload accelerators • Common characteristics of the data streams in the three accelerators • Optimizing the power and performance of the accelerators

  27. Thank You
