A Highly Configurable Cache Architecture for Embedded Systems

Chuanjun Zhang, Frank Vahid and Walid Najjar, University of California, Riverside. ISCA 2003. Presenter: Jianwei Dai.

Presentation Transcript


  1. A Highly Configurable Cache Architecture for Embedded Systems. Chuanjun Zhang, Frank Vahid and Walid Najjar, University of California, Riverside. ISCA 2003. Presenter: Jianwei Dai

  2. Outline • Introduction • Background • Way Concatenation for Dynamic Power Reduction • Adding Way Shutdown for Static Power Reduction • Application • Conclusion

  3. Introduction • Caches consume up to 50% of a microprocessor's energy • Important observations: 1. Lower cache associativity results in lower power consumption per access. Example: a direct-mapped cache consumes only about 30% of the per-access energy of a same-sized four-way set-associative cache. 2. In some cases, not all of the cache's capacity is required. • How can these features be exploited to reduce the power consumed by caches? 1. Way concatenation 2. Way shutdown

  4. Background 1. Energy consumed by caches. Simplification: k_miss_energy is the ratio of energy_miss to energy_hit, typically 50 to 200; k_static is static energy as a percentage of total energy, typically 30% to 50%
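The slide's equations did not survive in the transcript. As a minimal sketch of how the two constants might be used, the code below assumes the usual split of memory energy into a dynamic part (hits and misses) and a static part (leakage per cycle). The function names and the `static_per_cycle` parameter are illustrative assumptions, not taken from the slides; only `k_miss_energy` comes from the text.

```python
def dynamic_energy(hits, misses, energy_hit, k_miss_energy):
    """Dynamic energy, with a miss costed as k_miss_energy times a hit."""
    energy_miss = k_miss_energy * energy_hit
    return hits * energy_hit + misses * energy_miss


def total_energy(hits, misses, cycles, energy_hit, k_miss_energy, static_per_cycle):
    """Dynamic energy plus static (leakage) energy accumulated over all cycles.

    static_per_cycle is an assumed input; the slides only say that static
    energy makes up 30% to 50% of total energy.
    """
    dynamic = dynamic_energy(hits, misses, energy_hit, k_miss_energy)
    return dynamic + cycles * static_per_cycle
```

With k_miss_energy at 50 or more, even a small miss rate can dominate the dynamic term, which is why lowering associativity only pays off when it does not add many misses.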

  5. Background (Continued) • The impact of cache associativity: tuning the associativity to a particular application is extremely important for minimizing energy. For general-purpose microprocessors: this is hard to do because of the wide range of applications they have to support. For embedded microprocessors: the applications executed are well defined, so this approach is easy to realize.

  6. Background (Continued) • Base cache: 8 Kbytes, four-way set associative, 32-byte line size
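To make the base-cache geometry concrete, the following sketch derives the number of sets and the offset, index and tag widths from the three parameters on the slide. The 32-bit address width and the function name `cache_geometry` are assumptions for illustration; the slides do not state the address width.

```python
def cache_geometry(size_bytes, ways, line_bytes, addr_bits=32):
    """Return sets and address-field widths for a set-associative cache.

    Assumes all sizes are powers of two and a 32-bit address by default.
    """
    lines = size_bytes // line_bytes          # total cache lines
    sets = lines // ways                      # lines are split evenly across ways
    offset_bits = line_bytes.bit_length() - 1  # log2(line size)
    index_bits = sets.bit_length() - 1         # log2(number of sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return {
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


base = cache_geometry(8 * 1024, ways=4, line_bytes=32)
direct = cache_geometry(8 * 1024, ways=1, line_bytes=32)
```

Under these assumptions the four-way base cache has 64 sets (6 index bits), while the same 8 Kbytes organized as direct mapped has 256 sets (8 index bits). The two extra index bits are what a lower-associativity configuration has to take from the address.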

  7. Way Concatenation for Dynamic Power Reduction • A way-concatenable four-way set-associative cache architecture

  8. Way Concatenation for Dynamic Power Reduction (Continued)

  9. Way Concatenation for Dynamic Power Reduction (Continued) • How it works (three cases shown in the figure): reg1 reg2 = 00 with a11 a12 = 00; reg1 reg2 = 00 with a11 a12 = 11; reg1 reg2 = 11 with a11 a12 = 00
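The figure for this slide is missing, so the following is only a sketch of the idea behind the configuration circuit: depending on the configured associativity, address bits a11 and a12 either are ignored (four-way) or help select which ways are activated (two-way, direct mapped). The mapping of address bits to ways and the function name `active_ways` are assumptions for illustration; the actual register encoding used by the authors is not recoverable from the transcript.

```python
def active_ways(assoc, a11, a12):
    """Return four booleans: which of the four ways are activated for an access.

    assoc is the configured associativity (4, 2 or 1); a11 and a12 are the
    address bits above the base cache's index field. The bit-to-way mapping
    here is hypothetical.
    """
    if assoc == 4:
        # Four-way: every way is probed, address bits a11/a12 stay in the tag.
        return [True, True, True, True]
    if assoc == 2:
        # Two-way: ways are concatenated in pairs; a11 picks one way per pair.
        return [a11 == 0, a11 == 1, a11 == 0, a11 == 1]
    if assoc == 1:
        # Direct mapped: all four ways are concatenated; a12 a11 pick one way.
        selected = (a12 << 1) | a11
        return [way == selected for way in range(4)]
    raise ValueError("assoc must be 4, 2 or 1")
```

Fewer activated ways per access is where the dynamic energy saving comes from, while the full 8 Kbytes remain usable in every configuration.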

  10. Way Concatenation for Dynamic Power Reduction (Continued) • Timing and area overhead: 1. Negligible impact on timing performance: (1) the configuration circuit is not on the critical path; (2) by resizing the configuration circuit, its operation time can be hidden, since the circuit executes concurrently with index decoding; (3) the NAND gates on the critical path are enlarged to speed them up. 2. Area overhead: 1%

  11. Way Concatenation for Dynamic Power Reduction • A way-concatenable four-way set-associative cache architecture

  12. Way Concatenation for Dynamic Power Reduction (Continued) • Timing and area overhead: 1. Negligible impact on timing performance: (1) the configuration circuit is not on the critical path; (2) by resizing the configuration circuit, its operation time can be hidden, since the circuit executes concurrently with index decoding; (3) the NAND gates on the critical path are enlarged to speed them up. 2. Area overhead: 1%

  13. Way Concatenation for Dynamic Power Reduction • A way-concatenable four-way set-associative cache architecture

  14. Way Concatenation for Dynamic Power Reduction (Continued) • Simulation and results for benchmark g3fax. Configuration naming, e.g. I8KD8KI4D4: an instruction cache with 8 Kbytes active (I8K), a data cache with 8 Kbytes active (D8K), with the instruction cache configured as four-way set associative (I4) and the data cache configured as four-way set associative (D4). First group: configurable cache with way concatenation. Second group: configurable cache with way shutdown. Third group: conventional four-way and direct-mapped caches.
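The naming scheme above can be decoded mechanically. The sketch below parses a label such as I8KD8KI4D4 into its four fields. It assumes every label follows the same I<size>KD<size>KI<ways>D<ways> pattern as the one example on the slide; the function name `parse_config` and the dictionary keys are illustrative.

```python
import re

# Pattern inferred from the single example on the slide (an assumption).
_CONFIG_PATTERN = re.compile(r"I(\d+)KD(\d+)KI(\d+)D(\d+)")


def parse_config(name):
    """Split a label like 'I8KD8KI4D4' into active sizes and associativities."""
    match = _CONFIG_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized configuration label: {name!r}")
    icache_kb, dcache_kb, icache_ways, dcache_ways = map(int, match.groups())
    return {
        "icache_kb": icache_kb,
        "dcache_kb": dcache_kb,
        "icache_ways": icache_ways,
        "dcache_ways": dcache_ways,
    }
```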

  15. Way Concatenation for Dynamic Power Reduction (Continued) • Simulation and results. Two main observations: • A way-concatenation cache performs better than a non-configurable direct-mapped cache in many cases • Way concatenation is better than way shutdown at reducing dynamic power consumption

  16. Adding Way Shutdown for Static Energy Reduction • Motivation: for some applications, way shutdown has negligible impact on performance.

  17. Adding Way Shutdown for Static Energy Reduction (Continued) • Results • Penalties: area overhead of 5%; performance overhead of 8%

  18. Using a Configurable Cache • How to determine the configuration for a specific application: the choice is based on simulation or on actual execution on the platform; the designer may then need to modify the boot or reset part of the program to apply it. Parameters used: k_static = 30%, k_miss_energy = 50
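One way to picture the selection step is a search over candidate configurations for the lowest estimated energy. The sketch below does this for dynamic energy only, using k_miss_energy = 50 from the slide. The function names and the hit/miss counts in the example are made up for illustration and are not measured results; the 0.3 relative hit energy for direct mapped reflects the "about 30%" figure from the introduction. Costing a miss relative to the base four-way hit energy is also an assumption.

```python
K_MISS_ENERGY = 50  # value given on the slide


def dynamic_energy(hits, misses, energy_hit, energy_miss):
    """Dynamic energy of one configuration for one program run."""
    return hits * energy_hit + misses * energy_miss


def pick_configuration(candidates, base_energy_hit):
    """Return the name of the candidate with the lowest dynamic energy.

    candidates maps a name to (hits, misses, energy_hit). A miss is costed
    as K_MISS_ENERGY times the base four-way hit energy (an assumption).
    """
    energy_miss = K_MISS_ENERGY * base_energy_hit
    return min(
        candidates,
        key=lambda name: dynamic_energy(*candidates[name], energy_miss),
    )


# Hypothetical counts from simulating one application (not real data).
example = {
    "four-way": (990, 10, 1.0),
    "direct-mapped": (980, 20, 0.3),
}
best = pick_configuration(example, base_energy_hit=1.0)
```

In this made-up case the direct-mapped configuration wins because its extra misses are few. With a higher miss count the four-way configuration would come out ahead, which is the trade-off the per-application tuning is meant to resolve.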

  19. Conclusion • Introduced a novel configurable cache design method called way concatenation • A way-concatenation cache is very efficient at saving energy: compared to a conventional four-way set-associative cache, it saves 37% of power consumption when only dynamic power is considered, and 40% when both dynamic and static power are considered • It imposes little area overhead.
