This project investigates the optimal configuration of cores and caches in a 16-processor system executing web-based applications. As technology advances and feature sizes shrink, adding hardware such as additional cores and caches presents various trade-offs. Through simulation experiments, we analyze performance outcomes with different cache sizes and configurations. Our findings suggest that performance per cost is maximized when all 16 processors are housed on a single chip with a total of 2 MB L2 cache. We also outline future work needed to explore L3 cache integration.
Cores vs. Caches CS 838 Project Matt Ramsay & Chris Feucht
Motivation • As feature sizes push smaller, additional hardware can be placed on chip • Various trade-offs result • Among these for a CMP is how many cores and how much cache on each chip • Our project results suggest an optimal configuration for a 16-processor system running web-based applications
Outline • Motivation • Experiments Performed • Simulator Environment • Results • Project Shortcomings • Future Work • Conclusions & Summary
Experiments • Intended experiments not performed due to simulator limitations • Intended experiments: each core treated as equivalent to 0.5 MB of L2 cache • Ran apache_8, oltp_2, zeus_8
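The core-for-cache equivalence in the intended experiments can be illustrated with a little arithmetic. The sketch below assumes the equivalence is one of chip area and that every chip has the same fixed budget. The budget of 20 core-equivalents is a hypothetical value, chosen only so that the 16-core chip ends up with 2 MB of L2; it is not a figure from the project, and the function name is likewise illustrative.

```python
# Sketch of the cores-vs-cache trade-off under a fixed per-chip budget.
# From the slide: one core is treated as equivalent to 0.5 MB of L2.
CORE_EQUIV_MB = 0.5
TOTAL_CORES = 16

# ASSUMPTION (not from the slides): each chip has room for 20
# "core-equivalents"; whatever is not spent on cores becomes L2.
CHIP_BUDGET = 20


def configurations(total_cores=TOTAL_CORES, chip_budget=CHIP_BUDGET):
    """Return (cores_per_chip, chips, l2_per_chip_mb, l2_system_mb) tuples."""
    rows = []
    cores_per_chip = total_cores
    while cores_per_chip >= 1:
        chips = total_cores // cores_per_chip
        # Budget left after placing the cores is converted to L2 capacity.
        l2_per_chip = (chip_budget - cores_per_chip) * CORE_EQUIV_MB
        rows.append((cores_per_chip, chips, l2_per_chip, chips * l2_per_chip))
        cores_per_chip //= 2  # halve the cores per chip, doubling the chip count
    return rows


if __name__ == "__main__":
    for cores, chips, per_chip, total in configurations():
        print(f"{cores:2d} cores/chip x {chips:2d} chips: "
              f"{per_chip:4.1f} MB L2/chip, {total:6.1f} MB L2 system-wide")
```

Under these assumptions, system-wide L2 grows quickly as the same 16 cores are spread over more chips, which is the effect the conclusions refer to.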
Simulator Environment • All nodes include 32 KB, 2-way L1 I and D caches • Each node has its own L2 bank, regardless of L2 size or associativity • All other Ruby and Opal settings left at default
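For reference, the L1 geometry above fixes the number of sets once a line size is chosen. The slides do not state the line size, so the 64-byte default below is an assumption; the helper name is illustrative and is not part of any simulator.

```python
def num_sets(size_bytes, associativity, line_bytes=64):
    """Number of sets in a set-associative cache.

    line_bytes=64 is an ASSUMPTION; the slides do not give the line size.
    """
    sets, remainder = divmod(size_bytes, associativity * line_bytes)
    if remainder:
        raise ValueError("cache size must be a multiple of associativity * line size")
    return sets


if __name__ == "__main__":
    # 32 KB, 2-way L1 as described on the slide.
    print(num_sets(32 * 1024, 2))
```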
Project Shortcomings & Future Work • Longer runs needed for convincing data • Test different numbers of processors per system • Add L3 cache to memory hierarchy
Conclusions • CPI (IPC) changes little in a 16-processor system as number of cores/chip varies • This happens despite rapid system-wide L2 cache growth with added chips • Best performance per cost is with all 16 processors on one chip, even with only 2 MB total L2 • The single-chip configuration would be helped by an off-chip L3
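The performance-per-cost argument can be sketched as follows. This is only an illustration under stated assumptions: performance is taken as 1/CPI, cost is taken as proportional to the number of chips, and the CPI value is a placeholder, not a measurement from the project. The function name is hypothetical.

```python
def perf_per_cost(cpi, chips, cost_per_chip=1.0):
    """Performance (taken as 1/CPI) divided by system cost.

    ASSUMPTIONS: cost scales linearly with chip count, and every chip
    costs the same. Neither is stated on the slides.
    """
    return (1.0 / cpi) / (chips * cost_per_chip)


if __name__ == "__main__":
    # Placeholder CPI, NOT measured data: the same value is used for every
    # configuration to mirror the claim that CPI changes little.
    placeholder_cpi = 1.0
    for chips in (1, 2, 4, 8, 16):
        cores_per_chip = 16 // chips
        print(f"{cores_per_chip:2d} cores/chip, {chips:2d} chips: "
              f"perf/cost = {perf_per_cost(placeholder_cpi, chips):.4f}")
```

If CPI really is nearly flat across configurations, the ratio is dominated by chip count, so the single-chip system comes out ahead under this cost model.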
Project Summary