This paper presents a quantitative comparison between two prominent multithreaded benchmark suites: PARSEC and SPLASH-2. By analyzing both suites, we explore the overlaps in program selection and highlight significant differences, advocating for a re-evaluation of SPLASH-2. Our findings reveal that PARSEC showcases greater diversity in workloads, aligning better with current technology trends and processing demands. We employ statistical methods to measure workload similarity and detail how fundamental shifts in data processing affect benchmark performance and characteristics.
PARSEC vs. SPLASH-2: A Quantitative Comparison of Two Multithreaded Benchmark Suites on Chip-Multiprocessors • Presenter: 이보선
INTRODUCTION (1/2) • PARSEC: Princeton Application Repository for Shared-Memory Computers • What distinguishes PARSEC from other benchmark suites? • SPLASH-2, SPEC OMP2001 - focus on High-Performance Computing • BioParallel - bioinformatics programs • ALPBench - suite of multimedia workloads • MineBench - data mining • PARSEC vs. SPLASH-2
INTRODUCTION (2/2) • This paper makes three contributions • We compare SPLASH-2 with PARSEC to determine how much the program selections of the two suites overlap. Significant differences exist that justify an overhaul of the popular SPLASH-2 benchmark suite. • We identify workloads in both suites that resemble each other, which can help researchers interpret results. A few benchmarks of the two suites have similar characteristics. • We demonstrate how current technology trends are changing programs. The direct comparison of the PARSEC suite with SPLASH-2 shows that the proliferation of CMPs and the massive growth of data have a measurable impact on workload behavior.
OVERVIEW • The SPLASH-2 suite • one of the most widely used collections of multithreaded workloads • released at the beginning of the 90s • High-Performance Computing domain • PARSEC • released at the beginning of 2008 • PARSEC has the following five main features • Multithreaded Applications • Emerging Workloads • Diverse • Employ State-of-Art Techniques • Support Research
METHODOLOGY • Execution-driven simulation to obtain the relevant data • Standard statistical method to compute the similarity of the workloads
Experimental Setup & Removing Correlated Data • Simulate abstract cache hierarchy with CMP$im • Compute similarity with hierarchical clustering • Visualize results with dendrograms and scatter plots • Correlated characteristics can skew the redundancy analysis • It is therefore necessary to eliminate correlated information with Principal Component Analysis (PCA)
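To illustrate why PCA removes correlated information, here is a minimal sketch in plain Python. It is not the authors' tooling: the characteristic values are made up, standardizing each characteristic to zero mean and unit variance is an assumption, and power iteration with deflation stands in for a library eigen-solver so that the example needs no dependencies. All function names are hypothetical.

```python
import math

def standardize(rows):
    """Scale every characteristic (column) to zero mean and unit variance."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
            for col, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def covariance_matrix(rows):
    """Sample covariance of data whose columns already have zero mean."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def mat_vec(matrix, v):
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

def top_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    start = [float(i + 1) for i in range(len(matrix))]
    norm = math.sqrt(sum(x * x for x in start))
    v = [x / norm for x in start]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no variance left in this matrix
            break
        v = [x / norm for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v

def pca(rows):
    """Return standardized data and (variance, direction) pairs, largest first."""
    data = standardize(rows)
    cov = covariance_matrix(data)
    d = len(cov)
    components = []
    for _ in range(d):
        eigenvalue, vector = top_eigenpair(cov)
        components.append((eigenvalue, vector))
        # Deflation: remove the component just found before searching again.
        cov = [[cov[i][j] - eigenvalue * vector[i] * vector[j]
                for j in range(d)] for i in range(d)]
    return data, components

def project(data, components, keep):
    """Coordinates of every workload on the first `keep` principal components."""
    return [[sum(x * w for x, w in zip(row, vec)) for _, vec in components[:keep]]
            for row in data]

# Made-up data: 4 workloads x 3 characteristics. The second characteristic is
# exactly twice the first, so it carries no independent information.
characteristics = [[1, 2, 1], [2, 4, -1], [3, 6, -1], [4, 8, 1]]
data, components = pca(characteristics)
reduced = project(data, components, keep=2)
```

In this toy input the two perfectly correlated characteristics collapse into a single principal component, so two components carry all of the variance and the duplicated characteristic no longer counts twice in a later distance computation.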
Measuring Similarity • The Euclidean distance between the program characteristics is a measure for the similarity of the programs • Hierarchical clustering works as follows: • 1. Assign each workload to its own cluster • 2. Compute the pair-wise distances of all clusters • 3. Merge the two clusters with the smallest distance • 4. Repeat steps 2 - 3 until only a single cluster is left
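The four clustering steps above can be sketched as follows. This is a minimal illustration, not the setup used in the paper: the workload names and coordinates are invented, and average linkage is an assumed choice for the distance between multi-member clusters, since the slide does not name one.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two characteristic vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_distance(c1, c2, points):
    """Average pairwise distance between members (average linkage, an assumption)."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def hierarchical_clustering(points):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    # Step 1: assign each workload to its own cluster.
    clusters = [(name,) for name in points]
    merges = []
    # Step 4: repeat until only a single cluster is left.
    while len(clusters) > 1:
        # Step 2: compute the pair-wise distances of all clusters.
        # Step 3: pick the two clusters with the smallest distance and merge them.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(pair[0], pair[1], points))
        merges.append((a, b, cluster_distance(a, b, points)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical workloads described by two (already decorrelated) characteristics.
workloads = {
    "workload_a": (0.0, 0.0),
    "workload_b": (0.0, 1.0),
    "workload_c": (5.0, 5.0),
    "workload_d": (5.0, 7.0),
}
history = hierarchical_clustering(workloads)
```

The merge history is the information a dendrogram draws: each entry records which two clusters were joined and at what distance, so workloads that merge early are the most similar.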
REDUNDANCY ANALYSIS RESULTS (1/2) • How much do the two program collections overlap? • In particular, which workloads of the PARSEC suite resemble which SPLASH-2 codes? • Which benchmark suite is more diverse?
REDUNDANCY ANALYSIS RESULTS (2/2) • Diversity by computing the total • SPLASH-2: 19.55, PARSEC: 18.98 • Direct comparison • the PARSEC suite contains significantly more diversity than SPLASH-2
Multiple Differences • Instruction Mix Differences • Working Set Differences • Sharing Behavior Differences • No single source for the differences of the two suites.
Inclusion of Pipeline Model • Another difference between SPLASH-2 and PARSEC is the inclusion of workloads that employ the pipeline programming model in PARSEC.
Data Growth • World data is currently doubling every three years • These workloads employ models that allow them to have a basic understanding of the data they process. • For example, the bodytrack program employs a model of the human body to detect a person being shown in multiple video streams. • The compressed archive that contains the whole suite with all inputs is 16 MB in the case of SPLASH-2. • For PARSEC, it is 2.7 GB.
Large Working Sets More Common • For smaller caches, PARSEC workloads have a significantly higher average miss rate. • The difference is 0.26% for a 1 MB cache, approximately one fourth more. • It decreases to 0.11% for an 8 MB cache. • SPLASH-2 workloads have an average miss rate 0.02% higher than PARSEC workloads if 16 MB caches are used. • This trend continues to the end of the spectrum of cache sizes.
CONCLUSIONS • PARSEC is the more diverse suite in direct comparison