
Fast Configurable-Cache Tuning with a Unified Second-Level Cache


Presentation Transcript


1. Fast Configurable-Cache Tuning with a Unified Second-Level Cache Ann Gordon-Ross and Frank Vahid* Department of Computer Science and Engineering, University of California, Riverside *Also with the Center for Embedded Computer Systems, UC Irvine Nikil Dutt Center for Embedded Computer Systems, School of Information and Computer Science, University of California, Irvine This work was supported by the U.S. National Science Foundation and by the Semiconductor Research Corporation.

2. Cache Hierarchy Optimizations • The cache hierarchy is a good candidate for optimization • Applications require highly diverse cache configurations for optimal energy consumption of the cache subsystem • Over 50% energy savings are possible in the cache subsystem through configuration [Gordon-Ross '04] [Figure: ARM920T power breakdown (Segars '01), in which the caches account for a large fraction of total power]

3. Previous Cache Tuning Methodologies • Previous methods limit configurability to facilitate easier heuristic development • Single-level cache subsystem with separate I$ and D$: fewer than 50 configurations • Multi-level cache subsystem with separate I$ and D$: a few hundred configurations [Figure: microprocessor with tuner, separate I$ and D$, and main memory]

4. Motivation • Unified second-level caches are commonplace in desktop computers and are becoming increasingly popular in embedded microprocessors • Current cache tuning heuristics do not directly apply due to the complexity of tuning in the presence of a unified second level of cache: a change in any cache affects the performance of every other cache in the hierarchy, a circular dependency • The search space explodes to ≈18,000 configurations (a rough count is sketched below) [Figure: L1 I$, L1 D$, and unified L2 cache]
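The search-space size comes from the cross product of the per-cache options. The sketch below is a back-of-the-envelope count only; the option lists and feasibility rules are illustrative assumptions, not the paper's exact parameter ranges, but they show how the product reaches the tens of thousands, the same order of magnitude as the ≈18,000 configurations cited above.

```python
from math import comb

# Back-of-the-envelope count of the joint search space.
# All option counts below are assumptions for illustration.

# L1 (per cache): 2 KB banks give a handful of feasible (size, associativity)
# pairs via way shutdown / concatenation, times a few line sizes.
l1_size_assoc_pairs = 6      # e.g. 2KB/1-way, 4KB/1- or 2-way, 8KB/1-, 2-, 4-way (assumed)
l1_line_sizes = 3            # assumed
l1_configs = l1_size_assoc_pairs * l1_line_sizes      # 18 per L1 cache

# L2: each of 4 interchangeable ways is instruction, data, unified, or off,
# i.e. multisets of 4 labels over 4 ways, times a few line sizes.
l2_way_assignments = comb(4 + 4 - 1, 4)               # 35
l2_line_sizes = 3            # assumed
l2_configs = l2_way_assignments * l2_line_sizes       # 105

# Joint space: separate L1 I$ and D$, plus the unified L2.
total = l1_configs * l1_configs * l2_configs
print(total)   # 34020 -- same order of magnitude as the ~18,000 cited on the slide
```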

5. Motivation • We present an effective and efficient cache tuning heuristic for a highly configurable cache hierarchy including a unified second level of cache [Figure: microprocessor with tuner, L1 I$ and D$, unified L2, and main memory]

6. Level One Configurable Cache • The base cache consists of four 2 KB banks that may individually be shut down for size configuration (way shutdown) • Way concatenation allows configurable associativity • Line size is configurable • For evaluation of energy savings, we used a base cache of 8 KB with a 32-byte line size and 4-way associativity [Figure: way shutdown and way concatenation over the four 2 KB banks]
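A minimal sketch of the level-one configuration space this slide implies, assuming 2 KB banks, associativity limited by the number of active banks, and an illustrative (assumed) set of line sizes:

```python
# Enumerate plausible L1 configurations: way shutdown sets size,
# way concatenation sets associativity, line size is independent.
# LINE_SIZES values are assumptions for illustration.

BANK_KB = 2
LINE_SIZES = [16, 32, 64]   # bytes (assumed)

def l1_configurations():
    configs = []
    for active_banks in (1, 2, 4):              # way shutdown
        size_kb = active_banks * BANK_KB        # 2, 4, or 8 KB
        for assoc in (1, 2, 4):                 # way concatenation
            if assoc > active_banks:            # cannot have more ways than active banks
                continue
            for line in LINE_SIZES:
                configs.append((size_kb, assoc, line))
    return configs

cfgs = l1_configurations()
print(len(cfgs), cfgs[:3])   # 18 configurations per L1 cache under these assumptions
# Base cache used for the energy comparison on the slide: (8, 4, 32)
```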

7. Level Two Configurable Cache • For maximum configurability, the level two cache uses Motorola M*CORE-style way management • Each way can be designated as instruction, data, unified, or off • Line size is configurable • For evaluation of energy savings, we used a base cache of 64 KB with a 64-byte line size and 4 fully unified ways [Figure: example way designations, e.g. I-way, D-way, U-way]
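To make the way-management space concrete, the sketch below enumerates the possible way designations, under the assumption (made here only for illustration) that the four ways are interchangeable, so only the count of each designation matters.

```python
from itertools import combinations_with_replacement

# Each of the four L2 ways is designated instruction, data, unified, or off.
# Treating ways as interchangeable, a configuration is a multiset of roles.
WAY_ROLES = ("I", "D", "U", "off")

def l2_way_assignments():
    return list(combinations_with_replacement(WAY_ROLES, 4))

assignments = l2_way_assignments()
print(len(assignments))      # 35 distinct way assignments under this assumption
print(assignments[0])        # e.g. ('I', 'I', 'I', 'I')
```

Note how capacity and associativity are entangled here: giving a partition another way raises both its size and its associativity at once, which is why the next slide calls the level-two size and associativity steps "synonymous".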

8. Alternating Cache Exploration with Additive Way Tuning (ACE-AWT) • The heuristic alternates between the two levels, tuning the level one sizes, associativities, and line sizes (for both I$ and D$) and the level two size, associativity, and line size • The level two size and associativity steps are difficult because changing size and changing associativity are synonymous in a way-management style cache (see the sketch below) [Figure: flowchart of the alternating level-one and level-two tuning steps]
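The outline below is a hedged sketch of the alternating exploration described on this slide, not the authors' exact algorithm: it tunes one parameter at a time, alternating between the two levels, and keeps whichever value minimizes measured energy. The evaluate_energy callback, the options structure, and the exact schedule are assumptions for illustration.

```python
# Greedy, one-parameter-at-a-time tuning sketch (assumed, for illustration).

def tune_parameter(config, cache, parameter, candidates, evaluate_energy):
    """Pick the candidate value with the lowest energy, holding the rest of
    the configuration fixed."""
    best_value = config[cache][parameter]
    best_energy = evaluate_energy(config)
    for value in candidates:
        config[cache][parameter] = value
        energy = evaluate_energy(config)
        if energy < best_energy:
            best_energy, best_value = energy, value
    config[cache][parameter] = best_value
    return config

def ace_awt(config, options, evaluate_energy):
    # Alternate between level one (separate I$ and D$) and level two (unified),
    # tuning size, then associativity, then line size.
    # For the way-managed L2, "size" and "associativity" are entangled, so in
    # the actual heuristic they are handled by the two ACE-AWT phases rather
    # than tuned as independent parameters as this simplification suggests.
    schedule = [
        ("L1-I", "size"), ("L1-D", "size"), ("L2", "size"),
        ("L1-I", "assoc"), ("L1-D", "assoc"), ("L2", "assoc"),
        ("L1-I", "line"), ("L1-D", "line"), ("L2", "line"),
    ]
    for cache, parameter in schedule:
        config = tune_parameter(config, cache, parameter,
                                options[cache][parameter], evaluate_energy)
    return config
```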

9. ACE-AWT - First Phase • The first phase is applied during size exploration [Figure: flowchart of the first phase]

10. ACE-AWT - Fine Tuning Phase • The fine tuning phase is applied during associativity exploration, starting from the cache that results from the first phase [Figure: flowchart of the fine tuning phase]

11. Results - Energy Savings • The heuristic achieved near-optimal results (when the optimal could be computed) • 62% energy savings compared to the base cache • Yet it searched only 0.2% of the search space (roughly 36 of the ≈18,000 configurations) • Also improved performance by 35% compared to the base cache due to tuned line sizes

12. Conclusions and Future Work • We developed an efficient and effective cache tuning heuristic to tune a two-level cache with a unified second level of cache • ≈18,000 possible configurations • Compared to a reasonable base cache configuration: • 62% energy savings • Explores only 0.2% of the search space • 35% improvement in performance • Future work includes applying the tuning heuristic to different execution phases within an application
