Enhancing Program Performance with Adaptive Microarchitecture Framework
This paper presents a flexible framework for microarchitecture adaptivity, aiming to optimize performance by separating software policies from hardware mechanisms. We conclude with a case study on adaptive caches, evaluating our framework through simulators like SimpleScalar and Wattch, using workloads from the SPEC 2000 suite. The findings demonstrate the potential of adaptive microarchitecture in handling complex software systems efficiently. Our results support the need for tailored optimization strategies rather than one-size-fits-all solutions.
Presentation Transcript
On Tuning Microarchitecture for Programs Daniel Crowell, Wenbin Fang, and Evan Samanas
Summary • A flexible framework for microarchitecture adaptivity, which separates software policies from hardware mechanisms • Case study: adaptive cache • Evaluation: SimpleScalar / Wattch / SPEC2000 / User program • Conclusion: Microarchitecture adaptivity is awesome, and our framework is awesome too
Outline • Motivation • Adaptivity Framework • Case study: Adaptive Cache • Evaluation • Conclusion
Motivation • Optimizing for all is optimizing for nothing • Software is increasingly complex, and much of it is closed source • S/W and H/W codesign is infeasible for legacy software
One size doesn’t fit all • Show the cache result from our primitive benchmarking • To back our motivation to do this project • To support our decision of doing case study on adaptive cache, rather than other components
Three Questions for Microarchitecture Adaptivity • When to adapt? => Policy • Interval? Context switch? Function boundary? • What goal(s)? => Policy • Performance first? Performance-power ratio first? • How to adapt? => Mechanism • E.g., parameters of cache include block size, # of blocks, # of sets, replacement algorithm, …
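The cache parameters listed under "How to adapt?" are not independent: total capacity and associativity follow from block size, number of blocks, and number of sets. The sketch below makes that relationship explicit. The class name `CacheConfig`, its field names, and the power-of-two restriction are illustrative assumptions, not part of the framework described in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    """Hypothetical description of one cache configuration."""
    block_size: int            # bytes per block
    num_blocks: int            # total number of blocks in the cache
    num_sets: int              # number of sets the blocks are divided into
    replacement: str = "lru"   # replacement algorithm name (label only)

    def __post_init__(self):
        # Assumption: all sizes are powers of two, as in typical cache designs.
        for name in ("block_size", "num_blocks", "num_sets"):
            value = getattr(self, name)
            if value <= 0 or value & (value - 1):
                raise ValueError(f"{name} must be a positive power of two")
        if self.num_sets > self.num_blocks:
            raise ValueError("num_sets cannot exceed num_blocks")

    @property
    def ways(self) -> int:
        # Associativity: blocks per set.
        return self.num_blocks // self.num_sets

    @property
    def size_bytes(self) -> int:
        # Total data capacity.
        return self.block_size * self.num_blocks


cfg = CacheConfig(block_size=32, num_blocks=512, num_sets=128)
print(cfg.ways, cfg.size_bytes)
```

With 512 blocks of 32 bytes in 128 sets, this describes a 16 KB, 4-way cache; changing any one parameter changes what the mechanism has to reconfigure.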
Mechanism • Basically, this is to list some related work on adaptivity, e.g., adaptive cache, adaptive TLB, adaptive processor, … • And list some interesting findings during the course of this project, if we make any progress …
Policy • Instruction 1: adapt_advise • Inspired by the “madvise” OS system call • Used in software: OS, compiler, user programs • Operand: performance first or performance-power ratio first • Instruction 2: adapt_setup • Privileged, only used by OS • Operand: whether user programs are allowed to use adapt_advise
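A small software model can illustrate how the two instructions might interact. Everything below is a sketch under stated assumptions: the `Goal` enum values, the `AdaptivityUnit` class, the `privileged` flag, and the choice to silently ignore disallowed advice (by analogy with advisory hints such as madvise) are all hypothetical and are not specified in the slides.

```python
from enum import Enum


class Goal(Enum):
    """The two operand values named for adapt_advise (encoding is assumed)."""
    PERFORMANCE_FIRST = 0
    PERF_POWER_RATIO_FIRST = 1


class AdaptivityUnit:
    """Hypothetical model of the state touched by the two instructions."""

    def __init__(self):
        self.user_advise_allowed = False     # set by adapt_setup
        self.goal = Goal.PERFORMANCE_FIRST   # set by adapt_advise

    def adapt_setup(self, allow_user: bool, privileged: bool) -> None:
        # adapt_setup is privileged: only the OS may execute it.
        if not privileged:
            raise PermissionError("adapt_setup is a privileged instruction")
        self.user_advise_allowed = allow_user

    def adapt_advise(self, goal: Goal, privileged: bool = False) -> bool:
        # Assumption: advice from a user program that is not allowed
        # to advise is ignored rather than trapping.
        if not privileged and not self.user_advise_allowed:
            return False
        self.goal = goal
        return True


unit = AdaptivityUnit()
unit.adapt_setup(allow_user=True, privileged=True)     # OS enables user advice
unit.adapt_advise(Goal.PERF_POWER_RATIO_FIRST)         # user program advises
print(unit.goal)
```

The point of the split is visible in the model: the OS decides who may express a goal, while any permitted layer (OS, compiler, or user program) only states the goal and leaves the reconfiguration itself to the hardware mechanism.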
Policy • [OS] Interval / Predicted Interval • [OS] Context switch / Application boundary • [Compiler] Function boundary • [User] User program
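Of the trigger points listed above, the fixed interval is the simplest to sketch. The model below counts retired instructions and fires an adaptation callback once per interval. The class name `IntervalPolicy`, the `retire` method, and the use of instruction counts as the interval unit are assumptions made for illustration; the slides do not say how intervals are measured.

```python
class IntervalPolicy:
    """Hypothetical OS-level policy: adapt once every `interval` instructions."""

    def __init__(self, interval, on_adapt):
        if interval <= 0:
            raise ValueError("interval must be positive")
        self.interval = interval
        self.on_adapt = on_adapt   # callback standing in for adapt_advise
        self.count = 0             # instructions seen since the last trigger

    def retire(self, n_instructions):
        """Account for retired instructions; return how many times we adapted."""
        self.count += n_instructions
        fired = 0
        while self.count >= self.interval:
            self.count -= self.interval
            self.on_adapt()
            fired += 1
        return fired


events = []
policy = IntervalPolicy(1000, lambda: events.append("adapt"))
policy.retire(2500)
policy.retire(500)
print(len(events))
```

Context-switch, function-boundary, and user-program triggers would call the same callback from different places (the scheduler, compiler-inserted code, or the program itself), which is what keeps the policy separate from the mechanism.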
Case study: Adaptive Cache • According to our experimental result, we find cache is more interesting than other components …
Selective set VS Selective way • Why do we want to do selective set? • Any interesting
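The difference between the two resizing schemes shows up in the address breakdown. Disabling ways leaves the number of sets, and therefore the index bits, unchanged; disabling sets shrinks the index and lengthens the tag. The sketch below computes both cases. The function names, the 32-bit address width, and the example geometry are assumptions chosen for illustration, not the configuration used in this project.

```python
def log2_exact(value):
    """log2 of a positive power of two, without floating point."""
    if value <= 0 or value & (value - 1):
        raise ValueError("expected a positive power of two")
    return value.bit_length() - 1


def describe(num_sets, ways, block_size, addr_bits=32):
    """Capacity and address-field widths for one cache geometry."""
    offset = log2_exact(block_size)
    index = log2_exact(num_sets)
    return {
        "sets": num_sets,
        "ways": ways,
        "size": num_sets * ways * block_size,
        "index_bits": index,
        "tag_bits": addr_bits - index - offset,
    }


def selective_ways(num_sets, ways, block_size, enabled_ways):
    # Only associativity shrinks; indexing is untouched.
    if not 1 <= enabled_ways <= ways:
        raise ValueError("enabled_ways out of range")
    return describe(num_sets, enabled_ways, block_size)


def selective_sets(num_sets, ways, block_size, enabled_sets):
    # Fewer sets: one index bit becomes a tag bit per halving.
    if enabled_sets > num_sets:
        raise ValueError("enabled_sets out of range")
    return describe(enabled_sets, ways, block_size)


# Halve a 16 KB, 4-way, 32-byte-block cache both ways.
print(selective_ways(128, 4, 32, enabled_ways=2))
print(selective_sets(128, 4, 32, enabled_sets=64))
```

Both calls yield an 8 KB cache, but the selective-sets version keeps 4-way associativity at the cost of a wider tag, which suggests the tag array would have to be sized for the smallest supported number of sets.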
Implementation detail • Hopefully we can put a block diagram here, making it look more professional in architecture area.
Evaluation • Simulator • SimpleScalar 3.0 • Wattch • Workload • 6 programs from SPEC 2000 • 3 microbenchmark programs • Case study: Adaptive Cache
Microbenchmark • Hong-Tai Chou, David J. DeWitt: An Evaluation of Buffer Management Strategies for Relational Database Systems. Algorithmica 1(3): 311-336 (1986). Six data access patterns: • Straight Sequential (SS) References • Clustered Sequential (CS) References • Looping Sequential (LS) References • Independent Random (IR) References • Clustered Random (CR) References • Looping Hierarchical (LH) References
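To make the pattern names concrete, the sketch below generates page-reference strings for three of the six patterns. These are one plain reading of the pattern names, not the microbenchmark programs used in the evaluation; the slides do not say which three patterns were implemented, and the function names, parameters, and fixed seed are assumptions.

```python
import random


def straight_sequential(n_pages):
    """SS: each page is referenced exactly once, in order."""
    return list(range(n_pages))


def looping_sequential(n_pages, loops):
    """LS: the same sequential scan is repeated `loops` times."""
    return [page for _ in range(loops) for page in range(n_pages)]


def independent_random(n_pages, n_refs, seed=0):
    """IR: each reference picks a page uniformly at random."""
    rng = random.Random(seed)   # seeded so runs are repeatable
    return [rng.randrange(n_pages) for _ in range(n_refs)]


print(straight_sequential(4))
print(looping_sequential(3, 2))
print(len(independent_random(8, 5)))
```

Streams like these stress a cache differently: a straight scan gains nothing from extra capacity, a loop benefits sharply once the whole loop fits, and random references improve gradually with size, which is why they are useful for exercising an adaptive cache.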
Mechanism • Use 3 microbenchmark programs and 6 programs from SPEC 2000 • Use a simple policy, e.g., application boundary • Show effectiveness of adaptive cache • Figure 1: bar chart on performance • Figure 2: bar chart on performance-power ratio
Policy • Use 3 microbenchmark programs • Don’t use SPEC2000, due to some limitations, e.g., SimpleScalar doesn’t support multi-process • Use an idealistic mechanism: best configuration • Show the flexibility of software policies • Figure 1: bar chart on performance [x-axis: policies; y-axis: normalized performance] • Figure 2: bar chart on performance-power ratio [x-axis: policies; y-axis: normalized performance-power ratio]
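The normalized y-axes planned for both figures can be computed as below. This is a sketch under assumptions: the slides do not define the performance-power ratio, so it is taken here as instructions per cycle divided by average power in watts, and the policy names and numbers in the example are made up for illustration only.

```python
def perf_power_ratio(ipc, watts):
    """Assumed metric: performance (IPC) per watt of average power."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return ipc / watts


def normalize(results, baseline):
    """Divide every policy's value by the baseline policy's value."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


# Illustrative numbers only, not measured results.
ipc = {"fixed": 2.0, "interval": 3.0}
print(normalize(ipc, baseline="fixed"))
```

Normalizing against a fixed, non-adaptive configuration puts every policy on the same scale, so a bar above 1.0 reads directly as an improvement over the baseline.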
Mechanism + Policy • If time allows, work out this part to make the project complete.
Conclusion • Adaptivity is useful • A flexible adaptivity framework • Mechanism • Policy