Predicting Performance Impact of DVFS for Realistic Memory Systems

Rustam Miftakhutdinov, Eiman Ebrahimi, Yale N. Patt


Presentation Transcript


  1. Predicting Performance Impact of DVFS for Realistic Memory Systems. Rustam Miftakhutdinov, Eiman Ebrahimi, Yale N. Patt

  2. Dynamic Voltage/Frequency Scaling. [Figure: voltage (V) and frequency (f). Image source: intel.com]

  3. Impact of Frequency Scaling. [Plot: time, energy, and power versus frequency, with f_opt marked]

  4. Impact of Frequency Scaling. [Plot: time and power versus frequency, with f_o marked]

  5. Prediction Overview. [Figure: time versus frequency (our work) × power versus frequency gives energy versus frequency, with f_o and f_opt marked; frequency and energy per instruction are plotted over 0 to 300K instructions]

  6. Outline: ✓ Intro to performance prediction; Why realistic memory systems?; Variable memory latency; Prefetching

  7. Why Realistic Memory Systems? [Figure: voltage (V) and frequency (f)]

  8. Prior Work: • Stall time • Leading loads (2010): S. Eyerman et al., G. Keramidas et al., B. Rountree. Evaluated with a constant-access-latency memory system.

  9. Energy Savings. [Chart] * Geometric mean of relative savings for 13 memory-intensive SPEC 2006 benchmarks. Baseline: most energy-efficient static frequency for SPEC 2006.

  10. Energy Savings. [Chart] * Geometric mean of relative savings for 13 memory-intensive SPEC 2006 benchmarks. Baseline: most energy-efficient static frequency for SPEC 2006.

  11. Outline: ✓ Intro to performance prediction; ✓ Why realistic memory systems?; Variable memory latency; Prefetching

  12. Execution Example. [Timeline: memory requests A through E and chip-activity intervals 1 through 4 over time]

  13. T = T_memory + T_compute, where T_memory is independent of frequency and T_compute is proportional to cycle time.

  14. Linear Model. [Plot: execution time T versus cycle time t, split into T_compute and T_memory; 0 and t_o marked on the cycle-time axis]
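
A minimal sketch of how the linear model on slides 13 and 14 could be used for prediction, assuming T_memory and T_compute have already been measured at the current cycle time t_o. The function name, parameter names, and the numbers in the example are illustrative assumptions, not taken from the talk.

```python
def predict_time_linear(t_memory, t_compute, t_old, t_new):
    """Predict execution time at cycle time t_new.

    t_memory  : memory time measured at t_old (assumed independent of frequency)
    t_compute : compute time measured at t_old (assumed proportional to cycle time)
    t_old     : cycle time at which the measurements were taken
    t_new     : cycle time to predict for (same unit as t_old)
    """
    return t_memory + t_compute * (t_new / t_old)


# Hypothetical numbers: measured at a 0.5 ns cycle (2 GHz),
# with 4 ms of memory time and 6 ms of compute time.
predicted = predict_time_linear(t_memory=4.0, t_compute=6.0, t_old=0.5, t_new=1.0)
print(predicted)  # 16.0 (ms): compute time doubles, memory time is unchanged
```

Only the compute term is rescaled, which is why halving the frequency in this example does not double the total time.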

  15. Measuring T_memory. [Timeline: memory requests and chip activity over time]

  16. Measuring T_memory. [Timeline: memory requests and chip activity over time]

  17. Causes of Request Dependences: Pointer Chasing; Finite Chip Resources. [Diagrams: a chain of misses linked by next pointers; an instruction window]

  18. Measuring T_memory. [Timeline: memory requests and chip activity over time]

  19. Critical Path Algorithm. T_memory is the length of the critical path. At T_start of a request: 1. record T_start and T_memory. At T_end of the request: 2. compute path = T_memory(T_start) + (T_end - T_start); 3. set T_memory = max(T_memory, path). [Diagram: a request with latency T_end - T_start extends the old critical path to give the new T_memory]
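
The three steps on slide 19 can be sketched in software as follows. This is only an illustration under assumptions: the class name, the per-request dictionary, and the request timings are hypothetical, events are assumed to arrive in time order, and the talk does not say how the counter is actually implemented.

```python
class CriticalPathCounter:
    """Tracks T_memory as the length of the critical path through memory requests."""

    def __init__(self):
        self.t_memory = 0.0
        # request id -> (T_start, T_memory as it was at T_start)
        self._at_start = {}

    def on_request_start(self, req_id, t_start):
        # Step 1: record T_start and the current T_memory.
        self._at_start[req_id] = (t_start, self.t_memory)

    def on_request_end(self, req_id, t_end):
        t_start, t_memory_at_start = self._at_start.pop(req_id)
        # Step 2: path = T_memory(T_start) + (T_end - T_start).
        path = t_memory_at_start + (t_end - t_start)
        # Step 3: T_memory = max(T_memory, path).
        self.t_memory = max(self.t_memory, path)


# Hypothetical trace: A and B overlap; C starts only after A has returned.
counter = CriticalPathCounter()
counter.on_request_start("A", 0.0)
counter.on_request_start("B", 10.0)
counter.on_request_end("A", 100.0)   # path = 0 + 100 -> T_memory = 100
counter.on_request_end("B", 110.0)   # path = 0 + 100 -> T_memory stays 100
counter.on_request_start("C", 120.0)
counter.on_request_end("C", 220.0)   # path = 100 + 100 -> T_memory = 200
print(counter.t_memory)  # 200.0
```

Overlapping requests A and B contribute only one latency to the path, while the dependent request C adds its full latency on top.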

  20. Linear Model. [Plot: execution time T versus cycle time t, split into T_compute and T_memory; 0 and t_o marked on the cycle-time axis]

  21. Linear Model. [Figure: the linear model plot (execution time T versus cycle time t, with T_compute, T_memory, and t_o marked) supplies time versus frequency; time × power gives energy versus frequency, with f_o and f_opt marked]
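
A hedged sketch of the flow that slide 21 appears to depict: predict time at each candidate frequency with the linear model, multiply by power to get energy, and pick the frequency with the lowest energy as f_opt. The power table, frequencies, and function name below are made-up assumptions for illustration; the talk does not give these values.

```python
def most_efficient_frequency(t_memory, t_compute, f_old, power_at):
    """Return (f_opt, energies) for the candidate frequencies in power_at.

    t_memory, t_compute : times measured while running at frequency f_old
    power_at            : dict mapping candidate frequency -> power at that frequency
    """
    energies = {}
    for f, power in power_at.items():
        # Linear model: only compute time scales with cycle time (1 / f).
        time = t_memory + t_compute * (f_old / f)
        energies[f] = time * power  # energy = time x power
    f_opt = min(energies, key=energies.get)
    return f_opt, energies


# Hypothetical inputs: times in ms measured at 2.0 GHz, power in watts.
f_opt, energies = most_efficient_frequency(
    t_memory=4.0, t_compute=6.0, f_old=2.0,
    power_at={1.0: 20.0, 2.0: 30.0, 3.0: 60.0},
)
print(f_opt)  # 2.0 for these made-up numbers
```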

  22. Critical Path: Variable Access Latency. [Timeline: memory requests and chip activity over time] Leading Loads: Constant Access Latency. [Timeline: memory requests and chip activity over time]

  23. Leading Loads. [Plot: execution time T versus cycle time t, with the leading-loads estimate, T_compute, T_memory, 0, and t_o marked]

  24. Leading Loads. [Figure: the leading-loads plot (execution time T versus cycle time t, with T_compute, T_memory, and t_o marked) supplies time versus frequency; time × power gives energy versus frequency, with f_o and f_opt marked]

  25. Energy Savings. [Chart] * Geometric mean of relative savings for 13 memory-intensive SPEC 2006 benchmarks. Baseline: most energy-efficient static frequency for SPEC 2006.

  26. Outline: ✓ Intro to performance prediction; ✓ Why realistic memory systems?; ✓ Variable memory latency; Prefetching

  27. Streaming Workload. [Four timelines of memory requests and chip activity over time: Prefetcher OFF; Prefetcher ON; Prefetcher ON, frequency +200 MHz; Prefetcher ON, frequency +500 MHz]

  28. Limited Bandwidth Model. [Plot: execution time T versus cycle time t, with T_compute, T_memory^min, T_demand, 0, and t_crossover marked]
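
The slide gives only the plot labels, so the following is one plausible reading rather than the authors' stated formula: execution time follows the linear model built on T_demand until it hits a floor, T_memory^min, below which a bandwidth-bound workload cannot go, and t_crossover is the cycle time where the two meet. All names and numbers are assumptions made for illustration.

```python
def predict_time_limited_bw(t_demand, t_compute, t_memory_min, t_old, t_new):
    """Assumed piecewise model: max of a bandwidth floor and the linear model.

    t_demand     : memory time on the demand-request critical path (measured at t_old)
    t_compute    : compute time measured at t_old
    t_memory_min : assumed minimum time to serve all memory traffic at full bandwidth
    """
    linear = t_demand + t_compute * (t_new / t_old)
    return max(t_memory_min, linear)


def crossover_cycle_time(t_demand, t_compute, t_memory_min, t_old):
    """Cycle time at which the linear prediction equals the bandwidth floor."""
    return t_old * (t_memory_min - t_demand) / t_compute


# Hypothetical numbers (ms), measured at a cycle time of 1.0 ns.
print(crossover_cycle_time(2.0, 4.0, 8.0, 1.0))           # 1.5
print(predict_time_limited_bw(2.0, 4.0, 8.0, 1.0, 1.0))   # 8.0, bandwidth bound
print(predict_time_limited_bw(2.0, 4.0, 8.0, 1.0, 2.0))   # 10.0, linear region
```

Under this reading, raising the frequency past the crossover point buys no further speedup, which matches the streaming-workload timelines where the prefetcher keeps memory saturated.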

  29. Energy Savings. [Chart] * Geometric mean of relative savings for 13 memory-intensive SPEC 2006 benchmarks. Baseline: most energy-efficient static frequency for SPEC 2006.

  30. Recap: ✓ Intro to performance prediction; ✓ Why realistic memory systems?; ✓ Variable memory latency; ✓ Prefetching

  31. Final Thought: Performance predictors need realistic evaluation.