CS 294-42: Technology Trends
September 12, 2011
Ion Stoica (http://www.cs.berkeley.edu/~istoica/classes/cs294/11/)
“Skate where the puck's going, not where it's been” – Walter Gretzky
Processors
• MIMD (multi-core processors): linear increase, two additional cores every two years
• SIMD (GPUs): exponential increase, width doubles every four years
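A minimal sketch of how the two growth patterns diverge over time. The starting values (4 cores, SIMD width 8) are hypothetical placeholders chosen for illustration; only the growth rates come from the slide.

```python
def mimd_cores(years, base_cores=4):
    """Linear growth: two additional cores every two years.

    base_cores is an assumed starting point, not a figure from the lecture.
    """
    return base_cores + 2 * (years / 2)


def simd_width(years, base_width=8):
    """Exponential growth: width doubles every four years.

    base_width is an assumed starting point, not a figure from the lecture.
    """
    return base_width * 2 ** (years / 4)


if __name__ == "__main__":
    for years in (0, 4, 8, 12, 16):
        print(f"year +{years:2d}: cores = {mimd_cores(years):5.1f}, "
              f"SIMD width = {simd_width(years):6.1f}")
```

Whatever baselines are chosen, the exponential curve eventually overtakes the linear one, which is the point behind the later GPU prediction.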
SSDs
• Performance:
  • Reads: 25 µs latency
  • Writes: 200 µs latency
  • Erase: 1.5 ms
• Steady state, when SSD is full:
  • One erase every 64 or 128 reads (depending on page size)
• Lifetime: 100,000 to 1 million writes per page
Rule of thumb: writes are 10x more expensive than reads, and erases are 10x more expensive than writes
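A quick check of the rule of thumb against the latencies on this slide, plus the erase cost spread over the 64 or 128 page operations between erases. Spreading the erase evenly is an assumption made here for illustration, and the names are hypothetical.

```python
# Latencies from the slide, in microseconds.
READ_US = 25.0
WRITE_US = 200.0
ERASE_US = 1500.0


def erase_overhead_us(ops_per_erase):
    """Erase latency amortized evenly over the operations between erases.

    Even amortization is an assumption, not something stated in the lecture.
    """
    return ERASE_US / ops_per_erase


if __name__ == "__main__":
    print(f"write / read  = {WRITE_US / READ_US:.1f}x")
    print(f"erase / write = {ERASE_US / WRITE_US:.1f}x")
    for ops in (64, 128):
        print(f"erase overhead per operation ({ops} per erase): "
              f"{erase_overhead_us(ops):.1f} us")
```

The measured ratios come out a little under 10x, so the rule of thumb is an order-of-magnitude approximation rather than an exact figure.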
Storage Performance & Price
• Bandwidth: SSD up to 10x HDD; DRAM more than 10x SSD
• Price: HDD 20x cheaper than SSD; SSD 5x cheaper than DRAM
Footnote 1: http://www.fastestssd.com/featured/ssd-rankings-the-fastest-solid-state-drives/
Storage Price Trends
• RAM: 2x decrease every ~20 months (http://www.jcmit.com/memoryprice.htm)
  • 1990-2000: 75x decrease
  • 2000-2010: 63x decrease
• Disks: 2x decrease every ~2 years
• SSD prices dropped faster than disk prices over the last 5 years (http://rogerluethy.wordpress.com/2010/12/07/price-trends-of-hdds-vs-ssds/)
  • But decreased slightly less over the last year
Storage price halves every ~2 years
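The "~20 months" halving period for RAM can be recovered from the per-decade decrease factors. A small sketch, assuming a constant exponential decline within each decade (the function name is hypothetical):

```python
import math


def halving_period_months(decrease_factor, span_years):
    """Months per 2x price drop, given a total decrease over span_years.

    Assumes the price falls at a constant exponential rate over the span.
    """
    halvings = math.log2(decrease_factor)
    return span_years * 12 / halvings


if __name__ == "__main__":
    for label, factor in (("1990-2000", 75), ("2000-2010", 63)):
        months = halving_period_months(factor, 10)
        print(f"RAM {label}: {factor}x decrease -> halves every {months:.1f} months")
```

Both decades work out to roughly 19 to 20 months per halving, consistent with the figure on the slide.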
Hard Drives (25 years ago)
• IBM Personal Computer/AT (1986)
  • 30 MB hard disk: $500
  • 30-40 ms seek time
  • 0.7-1 MB/s (est.)
30-40 sec to scan the entire disk
Memory (today)
• 96 GB RAM: $650-800 (ECC RAM)
• Memory bus speed: 10-16 GB/s
6-10 sec to scan the entire memory!
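The scan times for the 1986 disk and for today's memory both follow from capacity divided by sequential bandwidth. A minimal sketch, using decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes) as an assumption:

```python
MB = 10**6  # decimal megabyte (assumed unit convention)
GB = 10**9  # decimal gigabyte (assumed unit convention)


def scan_seconds(capacity_bytes, bandwidth_bytes_per_s):
    """Time to read the whole device once at the given sequential bandwidth."""
    return capacity_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    # 1986 IBM PC/AT disk: 30 MB at 0.7-1 MB/s
    print(f"disk (1986):    {scan_seconds(30 * MB, 1.0 * MB):.0f}-"
          f"{scan_seconds(30 * MB, 0.7 * MB):.0f} s")
    # Memory today: 96 GB at 10-16 GB/s
    print(f"memory (today): {scan_seconds(96 * GB, 16 * GB):.1f}-"
          f"{scan_seconds(96 * GB, 10 * GB):.1f} s")
```

Scanning all of today's memory takes a time comparable to scanning an entire disk 25 years ago.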
Working Set (Today)
• When was the last time you experienced thrashing on your laptop?
• Memory is growing faster than applications’ needs (conjecture)
Today’s memory, yesterday’s disk!
Working Set – Datacenters (Ganesh Ananthanarayanan)
• % of jobs whose full inputs fit in memory (~1 week)
Will nearly all jobs’ inputs fit in main memory in the near future?
(Random) Thoughts
• Today’s disks, yesterday’s tapes [John Ousterhout]
• Today’s memory, yesterday’s disk? Or should it be today’s SSDs, yesterday’s disks?
  • SSDs are not great for caches (due to limited writes)
  • Perfect for archival, though, and for GFS-like file systems ;-)
• In-memory computation is not enough for interactive workloads
  • Parallelism is the only way out if you need to touch a lot of data
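A back-of-the-envelope sketch of why parallelism is needed for interactive workloads: how many machines must scan in parallel to finish within a latency target. The 1-second target and the assumption of perfectly even partitioning with no coordination overhead are illustrative choices, not figures from the lecture.

```python
import math

GB = 10**9  # decimal gigabyte (assumed unit convention)


def machines_needed(data_bytes, per_machine_bandwidth, target_seconds):
    """Minimum number of machines to scan data_bytes within target_seconds.

    Assumes the data is split evenly and machines scan independently.
    """
    return math.ceil(data_bytes / (per_machine_bandwidth * target_seconds))


if __name__ == "__main__":
    # Memory bus speeds of 10-16 GB/s, hypothetical 1 s interactive target.
    for bandwidth in (10 * GB, 16 * GB):
        n = machines_needed(96 * GB, bandwidth, target_seconds=1)
        print(f"96 GB at {bandwidth // GB} GB/s per machine: {n} machines")
```

Even with all data in memory, a single machine cannot scan 96 GB within a second at these bus speeds, so the work has to be spread across machines.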
(Random) Thoughts (cont’d)
• Today’s servers in Hadoop clusters: 10-12 disks
  • Up to 1 GB/s bandwidth
  • How to take advantage of this?
• GPU use will only increase: faster increase in processing power than CPUs
  • Need better support for virtualization
• What to do about memory bandwidth?
• For data-intensive apps, locality will continue to be critical
Predictions??
• Memory is the new disk
  • Working sets of more and more apps will fit in memory
• SSDs will become the new tape (archival)
• GPUs: main driver for increasing processing power
  • Will be integrated into the main processor