
Performance Optimization for Low-Leakage Caches based on Sleep-Line Access Density

Reiko Komiya†, Koji Inoue‡ and Kazuaki Murakami‡ († Fukuoka University, Japan; ‡ Kyushu University, Japan)


Presentation Transcript


  1. Performance Optimization for Low-Leakage Caches based on Sleep-Line Access Density
     Reiko Komiya†, Koji Inoue‡ and Kazuaki Murakami‡
     † Fukuoka University, Japan; ‡ Kyushu University, Japan
     ODES-4

  2. Outline
     • Introduction
     • Leakage energy of cache memory
     • Conventional low-leakage cache: cache decay
     • Problem of the cache decay approach
     • Solution: Always-Active approach
     • Evaluation
     • Conclusions

  3. Introduction
     • Energy consumption = dynamic energy (consumed by charging and discharging) + static energy (consumed by leakage current)
     • [Figure: the breakdown of energy consumption (static power vs. dynamic power) in a processor family *1] Leakage energy increases as process technology advances.
     • [Figure: power analysis of the ARM920T] Cache energy is 44%.
     • Cache leakage reduction is very important!
     *1 Fred Pollack (Intel Fellow), "New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies" [Micro32]
     *2 Simon Segars, "Low Power Design Techniques for Microprocessors," ISSCC 2001

  4. Conventional Low-Leakage Cache
     • A conventional cache does not support any leakage-reduction technique.
     • Conventional low-leakage cache: cache decay
       - Active mode: high leakage, preserves the data
       - Sleep mode: low leakage, destroys the data
       - Sleep-miss: degrades processor performance
     • Each line changes mode according to a state transition diagram (which also marks the initial state):
       - sleep mode → active mode: on an access (miss)
       - active mode → sleep mode: when no-access time ≧ decay interval
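The two-state transition described on this slide can be sketched in a few lines of Python. This is only an illustrative model, not the authors' hardware: the class name `DecayLine`, the per-cycle `tick` method, and the choice to start every line in sleep mode are assumptions, and tags are not modelled, so every access to a sleeping line is simply reported as a miss.

```python
from dataclasses import dataclass


@dataclass
class DecayLine:
    """One cache line under cache decay (hypothetical model for illustration)."""

    asleep: bool = True   # assumption: a line starts in sleep mode
    idle_cycles: int = 0  # cycles since the last access while active

    def tick(self, decay_interval: int) -> None:
        """Advance one cycle without an access.

        An active line goes to sleep once its no-access time
        reaches the decay interval.
        """
        if not self.asleep:
            self.idle_cycles += 1
            if self.idle_cycles >= decay_interval:
                self.asleep = True

    def access(self) -> bool:
        """Access the line and wake it up.

        Returns True if the line was asleep, i.e. its data had been
        destroyed and the access has to be served as a miss.
        """
        was_asleep = self.asleep
        self.asleep = False
        self.idle_cycles = 0
        return was_asleep


line = DecayLine()
first_access_missed = line.access()   # line was asleep initially
for _ in range(4):                     # four idle cycles, decay interval of 4
    line.tick(decay_interval=4)
second_access_missed = line.access()  # line decayed, so it misses again
```

A shorter decay interval puts lines to sleep sooner, which saves leakage but makes the second kind of miss more likely.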

  5. Performance Impact of Sleep-misses
     • Many sleep-misses cause large performance degradation!

  6. Our Goal: High-performance, low-leakage cache!
     • Problem of the conventional low-leakage cache
       - Performance degradation caused by sleep-misses
     • Our approach
       - To improve performance, reduce sleep-misses
       - Prohibit some cache lines from going to sleep mode

  7. Analysis of Sleep-misses
     • Sleep-Miss Density (SMD): shows the amount of sleep-misses in each line
       $\mathrm{SMD}_i = \dfrac{\text{the number of sleep-misses at cache line } i}{\text{the average number of sleep-misses for all cache lines}}$
     • Example [Figure: the number of sleep-misses at each cache line]
       - The total number of sleep-misses: 90
       - The number of lines: 9
       - ⇒ The average number of sleep-misses: 10
       - $\mathrm{SMD}_6 = 6$, $\mathrm{SMD}_7 = 0.1$, $\mathrm{SMD}_8 = 1$
     • Cache lines which often cause sleep-misses have a high SMD!
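The SMD example on this slide can be reproduced with a short Python sketch. The slide only fixes the total (90), the number of lines (9) and the SMD of lines 6, 7 and 8, so the per-line counts below are hypothetical values chosen to match those figures; the function name and the 0-based line indexing are also assumptions.

```python
def sleep_miss_density(sleep_misses):
    """Return the SMD of every line: its sleep-miss count divided by
    the average sleep-miss count over all lines."""
    average = sum(sleep_misses) / len(sleep_misses)
    if average == 0:
        # No sleep-misses anywhere: define every SMD as 0.
        return [0.0 for _ in sleep_misses]
    return [count / average for count in sleep_misses]


# Hypothetical counts for lines 0..8; only lines 6, 7, 8 follow from the slide.
counts = [3, 3, 3, 3, 3, 4, 60, 1, 10]
smd = sleep_miss_density(counts)
print(sum(counts), len(counts))  # 90 sleep-misses over 9 lines
print(smd[6], smd[7], smd[8])    # SMD of lines 6, 7 and 8
```

Because SMD is relative to the average, it singles out lines that miss far more often than their neighbours regardless of how many sleep-misses the program produces overall.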

  8. Characteristics of Sleep-misses
     • [Figures: the breakdown of cache lines in terms of SMD, and the breakdown of sleep-misses in terms of SMD; categories: SMD < 1, 1 ≦ SMD < 2, 2 ≦ SMD < 4, 4 ≦ SMD]
     • 3.1% of lines cause 94.4% of sleep-misses
     • A small number of high-SMD lines produce most of the sleep-misses

  9. Always-Active Approach
     • Support an "Always-Active mode (AA mode)"
       - AA mode prohibits the corresponding line from going to sleep mode
     • Cache lines which frequently cause sleep-misses should operate in AA mode
       - Such lines are called "Always-Active lines (AA lines)"

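  10. How to Decide AA Lines
     • A line which frequently causes sleep-misses ⇒ AA line
     • [Figures: the number of sleep-misses at each cache line, and the SMD at each cache line]
     • State transition diagram (sleep mode, active mode, always-active mode; the initial state is marked):
       - sleep mode → active mode: access, SMD ≦ Threshold
       - sleep mode → always-active mode: SMD > Threshold
       - active mode → sleep mode: no-access time ≧ decay interval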
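One way to read this three-state diagram is as a transition function. The sketch below is hypothetical: the names `Mode` and `next_mode` are invented for illustration, and two points the transcript does not settle are treated as assumptions, namely that the SMD check happens when a sleeping line is accessed and that a line never leaves always-active mode.

```python
from enum import Enum


class Mode(Enum):
    SLEEP = "sleep"
    ACTIVE = "active"
    ALWAYS_ACTIVE = "always-active"


def next_mode(mode, accessed, idle_time, decay_interval, smd, threshold):
    """Return the next mode of one cache line.

    accessed:  whether the line is accessed in this step
    idle_time: no-access time of the line so far
    smd:       current sleep-miss density of the line
    """
    if mode is Mode.ALWAYS_ACTIVE:
        # Assumption: an AA line is never put back to sleep.
        return Mode.ALWAYS_ACTIVE
    if mode is Mode.SLEEP:
        if not accessed:
            return Mode.SLEEP
        # Assumption: the SMD check is made when the sleeping line is accessed.
        return Mode.ALWAYS_ACTIVE if smd > threshold else Mode.ACTIVE
    # Active mode: decay back to sleep after a long enough idle period.
    if not accessed and idle_time >= decay_interval:
        return Mode.SLEEP
    return Mode.ACTIVE


woken = next_mode(Mode.SLEEP, True, 0, 1000, smd=0.5, threshold=1)
promoted = next_mode(Mode.SLEEP, True, 0, 1000, smd=6.0, threshold=1)
decayed = next_mode(Mode.ACTIVE, False, 1000, 1000, smd=0.5, threshold=1)
```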

  11. How to Measure SMD Dynamically
     • ① = the number of sleep-misses at cache line i
     • ② = the average number of sleep-misses for all cache lines
     • ③ = Threshold
     • SMD_i = ① / ② > ③ ⇔ ① > ② × ③
     • Example: the number of cache lines = 1024 (= $2^{10}$), Threshold = 2 (= $2^{1}$)
       - the total number of sleep-misses → 10-bit right shift → ②
       - ② → 1-bit left shift → ② × ③
       - ① >? ② × ③: yes → AA mode; no → active mode
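The shift-based comparison in this example avoids a division and can be sketched as follows. The function name and parameter names are hypothetical; the sketch assumes, as in the slide's example, that both the number of lines and the threshold are powers of two.

```python
def exceeds_threshold(local_misses, total_misses, log2_lines=10, log2_threshold=1):
    """Decide whether a line's SMD exceeds the threshold using shifts only.

    local_misses:   sleep-miss counter of the line            (term 1)
    total_misses:   total sleep-miss counter of the cache
    log2_lines:     log2 of the number of cache lines (1024 -> 10)
    log2_threshold: log2 of the threshold             (2 -> 1)
    """
    average = total_misses >> log2_lines  # term 2: total / number of lines
    bound = average << log2_threshold     # term 2 x term 3
    return local_misses > bound           # term 1 > term 2 x term 3 ?


# 4096 sleep-misses over 1024 lines: average 4, bound 8.
goes_to_aa = exceeds_threshold(local_misses=9, total_misses=4096)
stays_active = exceeds_threshold(local_misses=8, total_misses=4096)
```

Note that the right shift truncates: while the total is below the number of lines, the computed average is 0, so in this sketch any line with at least one sleep-miss would exceed the bound.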

  12. Hardware Implementation
     • If a line is in sleep mode:
       - Cache decay ⇒ the tag is in sleep mode
       - AA approach ⇒ the tag is in active mode
     • The line is in sleep mode && tag match ⇒ a sleep-miss occurs!
     • [Block diagram, lines 0 to 1023] Per-line fields: tag, data, decay flag, 2-bit local counter, sleep-miss counter, always-active flag. Other labeled blocks: global counter, ¼ decay interval, total sleep-miss counter, shifter, comparators (>?, =), voltage control, gated Vdd (Vdd or 0 V).

  13. Experimental Setup
     • Evaluation model
       - Cache decay: conventional low-leakage cache
       - AA1: cache decay with the AA approach (threshold value = 1)
     • Cache configuration
       - L1 data cache
       - Cache size: 32 KB
       - Associativity: 2-way
       - Hit latency: 1 clock cycle
       - Miss penalty: 32 clock cycles
     • Evaluation items
       - Performance improvement
       - Energy reduction

  14. Results
     • [Figures: normalized execution time and normalized energy for Cache decay and AA1]
     • Two outcomes are annotated on the charts:
       - Performance is improved by increasing energy consumption
       - Higher performance and lower energy consumption

  15. Conclusions
     • We have proposed a high-performance, low-leakage cache: the AA approach
       - Detect lines which frequently cause sleep-misses at run time
       - Performance is improved by operating such lines in AA mode
     • Evaluation results
       - Higher performance and lower energy consumption
       - The best case (f183.equake):
         - Performance degradation: 19% → 4.2%
         - Energy consumption: 20% reduction
     • Future work
       - Compare the AA approach with an adaptive decay technique (Kaxiras ISCA'00)

  16. Thank you! ありがとう! (in Japanese)

  18. Impact of Threshold
     • [Figures: normalized execution time and normalized energy for Cache decay, AA1, AA2 and AA4]
     • A smaller threshold gives higher performance, because the number of AA lines increases!

  19. Breakdown of Energy Consumption
     • [Figure: breakdown of energy (J) for Cache decay and AA1]
     • AA1 compared with cache decay:
       ・Leakage energy increases
       ・The accompanying dynamic energy decreases
         - Because the number of sleep-misses decreases
     • Energy reduction is a trade-off between DEmemory and LEL1

  20. Performance Impact of Decay Interval
     • Cache decay: performance improves as the decay interval is extended
     • AA approach: performance improves fully even with a short decay interval

  21. Energy Impact of Decay Interval
     • Cache decay: leakage energy increases as the decay interval is extended
     • AA approach: leakage reduction is larger than that of cache decay using a long decay interval

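  22. Energy Model (1/3)
     • $E_{total} = LE_{L1} + DE_{L1} + DE_{memory}$
       - $LE_{L1}$: leakage energy of the L1 cache
       - $DE_{L1}$: dynamic energy of the L1 cache
       - $DE_{memory}$: energy of main-memory accesses
     • $LE_{L1} = \sum_{i=1}^{CC} \{ LE_{bit} \times N_{active}(i) \}$
       - $CC$: program execution time (clock cycles)
       - $LE_{bit}$: average leakage energy of one SRAM bit cell in one clock cycle
       - $N_{active}(i)$: number of SRAM bits in the active state at clock cycle $i$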
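The energy model on this slide is a plain sum, which a small Python sketch makes concrete. The function names and all numeric values are hypothetical and in arbitrary units; they are not taken from the presentation's evaluation.

```python
def leakage_energy_l1(le_bit, active_bits_per_cycle):
    """LE_L1: sum over all clock cycles of LE_bit x N_active(i).

    le_bit:                leakage energy of one SRAM bit cell per cycle
    active_bits_per_cycle: N_active(i) for every cycle of the run;
                           its length is the execution time CC
    """
    return sum(le_bit * n_active for n_active in active_bits_per_cycle)


def total_energy(le_l1, de_l1, de_memory):
    """E_total = LE_L1 + DE_L1 + DE_memory."""
    return le_l1 + de_l1 + de_memory


# Hypothetical three-cycle run: 100, 50 and 0 bits active.
le_l1 = leakage_energy_l1(le_bit=2, active_bits_per_cycle=[100, 50, 0])
e_total = total_energy(le_l1, de_l1=20, de_memory=5)
print(le_l1, e_total)
```

Keeping more lines active raises the `N_active(i)` terms and therefore `LE_L1`, while avoiding sleep-misses shortens the run and lowers `DE_memory`; the model makes that trade-off explicit.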

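  23. Energy Model (2/3)
     • $DE_{L1} = DE_{AA} + DE_{conv\text{-}LL} + DE_{conv}$
       - $DE_{AA}$: dynamic energy overhead caused by applying the always-active line approach
       - $DE_{conv\text{-}LL}$: dynamic energy overhead caused by applying the conventional low-leakage cache
       - $DE_{conv}$: access energy of a conventional cache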

  24. Energy Model (3/3)
     [1] K. Flautner, N. S. Kim, S. Martin, D. Blaauw, and T. Mudge, "Drowsy Caches: Simple Techniques for Reducing Leakage Power," Proc. of the 29th Int. Symp. on Computer Architecture, pp. 148-157, May 2002.
     [2] S. Kaxiras, Z. Hu, and M. Martonosi, "Cache Decay: Exploiting Generational Behavior to Reduce Cache Leakage Power," Proc. of the 28th Int. Symp. on Computer Architecture, pp. 240-251, June 2001.
