Download
shimin chen big data reading group n.
Skip this Video
Loading SlideShow in 5 Seconds..
Shimin Chen Big Data Reading Group PowerPoint Presentation
Download Presentation
Shimin Chen Big Data Reading Group

Shimin Chen Big Data Reading Group

107 Vues Download Presentation
Télécharger la présentation

Shimin Chen Big Data Reading Group

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. Gordon: Using Flash Memory to Build Fast, Power-efficient Clusters for Data-intensive ApplicationsA. Caulfield, L. Grupp, S. Swanson, UCSD, ASPLOS’09 Shimin Chen Big Data Reading Group

  2. Introduction • Gordon: a flash-based system architecture for massively parallel, data-centric computing. • Solid-state disks • Low-power processors • Data-centric programming paradigms • Can deliver: • Up to 2.5X the computation per energy of a conventional cluster based system • Increasing performance by up to 1.5X

  3. Outline • Gordon’s system architecture • Gordon’s storage system • Configuring Gordon • Discussion • Summary

  4. Gordon’s System Architecture Gordon node 16 nodes in an enclosure

  5. Prototype Photo: ASPLOS’09 paper uses simulation http://www-cse.ucsd.edu/users/swanson/projects/gordon.html

  6. Gordon node • Configuration: • 256GB of flash storage • A flash storage controller (w/ 512MB dedicated DRAM) • 2GB ECC DDR2 SDRAM • 1.9Ghz Intel Atom processor • Running a minimal linux installation • Power: no more than 19w • Compared to 81w of a server • 900MB/s read and write bandwidth to 256GB disk

  7. Enclosures • Within an enclosure, 16 nodes plug into a backplane that provides 1Gb Ethernet-style network • A rack holds 16 enclosures (16x16=256 nodes) • 64 TB of storage • 230 GB/s of aggregate IO bandwidth

  8. Programming • From the SW and users’ perspectives, a Gordon system appears to be a conventional computing cluster • Benchmarks: Hadoop

  9. Outline • Gordon’s system architecture • Gordon’s storage system • Configuring Gordon • Discussion • Summary

  10. Flash memory Flash memory Flash memory Flash memory Flash Array Hardware The flash controller has over 300 pins 2GB System DRAM CPU 512MB FTL DRAM Flash controller 4 buses, each support 4 flash packages

  11. Super-Page • Three ways to stripe data across flash memory • Horizontal: across buses • Vertical: across packages on the same bus • 2-D: combined

  12. Bypassing and Write Combining • Read bypassing: merging read requests to the same page • Write combining: merging write requests to the same page

  13. Trace-Based Study 64KB super-page achieves the best performance

  14. Outline • Gordon’s system architecture • Gordon’s storage system • Configuring Gordon • Discussion • Summary

  15. Workloads • Map-Reduce workloads using Hadoop Each workload is run on 8 2.4GHz Core 2 Quad machines with 8GB RAM and a single SATA hard disk with 1Gb Ethernet to collect traces.

  16. Power Model

  17. Simulation • High-level trace-driven cluster simulator • Trace: second-by-second utilization (number of instructions, number of L2 cache misses, number of bytes sent and received over network, number of hard drive accesses) • Replay the trace for a conventional cluster model and a Gordon model • Detailed storage system simulators • Disksim for simulation 1TB, 7200rpm SATAII hard drive • Gordon simulator

  18. 88 different node configurations

  19. Optimal Designs • MinT: Min Time • MaxE: Max performance/watt • MinP: Min Power

  20. Outline • Gordon’s system architecture • Gordon’s storage system • Configuring Gordon • Discussion • Summary

  21. Exploit Disks • Cheap redundancy • Virtualize Gorgon: • Gordon + a large disk-based data warehouse • Load data from disk storage into Gordon for a job • May overlap data transfer time for preparing a job with computation

  22. System Costs Actual price is $7200

  23. System Costs for Storing 1PB Data

  24. Summary • Use flash memory + low-power processor (Atom) • Support data intensive computing: such as Map-Reduce operations • The design choice is attractive because of higher power/performance efficiency