
Quantifying and Comparing the Impact of Wrong-Path Memory References in Multiple-CMP Systems

Ayse Yilmazer, University of Rhode Island; Resit Sendag, University of Rhode Island; Joshua J. Yi, Freescale Semiconductor, Inc.


Presentation Transcript


  1. Quantifying and Comparing the Impact of Wrong-Path Memory References in Multiple-CMP Systems • Ayse Yilmazer, University of Rhode Island • Resit Sendag, University of Rhode Island • Joshua J. Yi, Freescale Semiconductor, Inc.

  2. Motivation • Previous work on Wrong-path (WP) effects in Uniprocessors • Positive Effects: Prefetching • Up to 20% better performance for 181.mcf (SPECint 2000) • Negative Effects: Pollution • L1 and L2 cache pollution • Extra traffic • Important to simulate WP, especially for some applications • How about WP effects in Multiple-CMP systems?

  3. Outline • Wrong-Path Effects in SMPs and Multi-CMPs • Simulation Methodology • Evaluation Results • Conclusion

  4. Wrong-path effects in SMPs – 0 / 4 • Broadcast (snoop)- and directory-based SMP systems • MSI, MOSI, MESI, MOESI cache coherence protocols • The same issues as in uniprocessors apply • Pollution effect • Prefetching effect • Extra cache/memory traffic • Unlike in uniprocessors, WP references also cause: • Extra coherence traffic: data, invalidations, write-backs, acknowledgements • Additional cache block state transitions

  5. Wrong-path effects in SMPs – 1 / 4 • Replacements • Initial states • A speculatively replaces B • A is a wrong-path block!
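The replacement scenario on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the simulator used in the talk: the `LRUSet` class, the two-way set, the LRU policy, and the extra block `C` are all assumptions made for the example. It shows a wrong-path fill of block A evicting block B, so that a later correct-path access to B misses.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with LRU replacement (hypothetical model, not from the slides)."""

    def __init__(self, ways):
        self.ways = ways
        # Maps block address -> True if the block was filled by a wrong-path access.
        self.blocks = OrderedDict()

    def access(self, addr, wrong_path=False):
        """Return (hit, evicted_addr); evicted_addr is None if nothing was evicted."""
        if addr in self.blocks:
            self.blocks.move_to_end(addr)  # mark as most recently used
            return True, None
        evicted = None
        if len(self.blocks) >= self.ways:
            evicted, _ = self.blocks.popitem(last=False)  # evict the LRU block
        self.blocks[addr] = wrong_path
        return False, evicted


cache = LRUSet(ways=2)
cache.access("B")                                    # correct-path fill of B
cache.access("C")                                    # correct-path fill of C
hit_a, evicted = cache.access("A", wrong_path=True)  # speculative fill of A
hit_b, _ = cache.access("B")                         # correct-path reuse of B
print(evicted, hit_b)
```

If the correct path had instead needed A, the same fill would have acted as a prefetch, which is the positive effect mentioned earlier in the deck.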

  6. Wrong-path effects in SMPs – 2 / 4 • Write-backs • Write-back of the dirty copy of B • M -> S: write-back of the dirty copy of A (only for MESI or MSI)
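The "only for MESI (or MSI)" remark can be illustrated with a small sketch based on the textbook versions of these protocols, which may differ in detail from the protocols simulated in the talk. The function name `downgrade_on_remote_read` is hypothetical. In protocols without an Owned state, a Modified block that is read by another processor drops to Shared and its dirty data must be written back; with an Owned state, the block moves to Owned and the write-back is deferred.

```python
def downgrade_on_remote_read(state, protocol):
    """Return (new_state, writeback_needed) for a cache that holds a block in
    `state` and observes a read request from another processor.

    Textbook behaviour only; `protocol` is one of "MSI", "MESI", "MOSI", "MOESI".
    """
    has_owned = "O" in protocol  # MOSI and MOESI provide an Owned state
    if state == "M":
        # With an Owned state the dirty data stays in the cache; otherwise
        # memory must be updated when the block drops to Shared.
        return ("O", False) if has_owned else ("S", True)
    if state == "E":
        return ("S", False)  # clean block, nothing to write back
    return (state, False)    # S, O and I are unchanged by a remote read


for proto in ("MSI", "MESI", "MOSI", "MOESI"):
    print(proto, downgrade_on_remote_read("M", proto))
```

A wrong-path read that triggers this downgrade therefore costs an extra write-back only under MSI or MESI, although it causes a state transition under all four protocols.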

  7. Wrong-path effects in SMPs – 3 / 4 • Invalidations • P1 loses its write privileges for block A • P1 asks for grant to write and sends invalidation
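The invalidation effect can be sketched as follows, assuming a simple MSI-style model of a single block. The function `request_write` and the second processor `P2` are hypothetical names introduced for the example. When one processor requests write permission, possibly on a wrong path, every other holder of the block is invalidated and loses its privileges.

```python
def request_write(states, requester):
    """Grant write permission for one block to `requester`.

    `states` maps processor name -> MSI state of the block in that cache.
    Every other valid copy is invalidated. Returns the number of
    invalidations sent.
    """
    invalidations = 0
    for proc, state in states.items():
        if proc != requester and state != "I":
            states[proc] = "I"  # the other holder loses its copy
            invalidations += 1
    states[requester] = "M"
    return invalidations


# P1 holds block A with write permission; P2 (hypothetical) requests to write it.
block_a = {"P1": "M", "P2": "I"}
sent = request_write(block_a, "P2")
print(sent, block_a)
```

If the request came from a mispredicted path, P1 must later reacquire write permission even though no correct-path instruction needed the block elsewhere.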

  8. Wrong-path effects in SMPs – 4 / 4 • Data/bus and coherence traffic increases • L1 references • L2 references • Coherence traffic • Snoop and directory requests for data and invalidations • Power consumption increases • Due to extra cache references, coherence traffic, and cache block state transitions • Resource contention • WP references compete with correct-path references for resources • Unlike in uniprocessors, service buffers become full more frequently • Critical when there are many cache-to-cache transfers

  9. WP effects in Multiple-CMPs – 0 / 2 • A CMP node and a 4-CMP system • We study inclusive L1 and L2 caches • The L2 cache also tracks the coherence state of cache blocks in the L1 caches

  10. WP effects in Multiple-CMPs – 1 / 2 • State transitions when an SO line in the L2 cache is replaced (states shown in the diagram: SO, S, OIV, OIN, I)

  11. WP effects in Multiple-CMPs – 2 / 2 • State transitions when an MT line in the L2 cache receives a WP request (states shown in the diagram: MT, MO, M, SO, S)

  12. Outline • Wrong-Path Effects in SMPs and Multi-CMPs • Simulation Methodology • Evaluation Results • Conclusion

  13. Experimental Methodology • GEMS simulator – Wisconsin Multifacet Group • Based on Virtutech SIMICS • Aggressive out-of-order superscalar processor • Detailed shared-memory model • We evaluate a 16-processor (4- and 8-CMP) SPARC V9 system running unmodified Solaris 9 • We evaluate a 2-level MOSI directory coherence protocol • MOSI: Modified, Owned, Shared, Invalid • We track the speculatively generated memory references and mark them as wrong-path once the branch misprediction is known
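The last bullet describes tagging speculative references once a branch resolves. The sketch below is one simplified way to do that bookkeeping and is not the GEMS implementation: the `WrongPathTracker` class and its methods are hypothetical, and it assumes each reference is attributed to a single unresolved branch, so nested speculation is not modelled.

```python
class WrongPathTracker:
    """Classify memory references as correct-path or wrong-path (hypothetical sketch)."""

    def __init__(self):
        self.pending = {}  # branch id -> addresses issued under that unresolved branch
        self.correct = []  # references confirmed to be on the correct path
        self.wrong = []    # references found to be on a wrong path

    def issue(self, branch_id, addr):
        """Record a speculative memory reference issued after `branch_id`."""
        self.pending.setdefault(branch_id, []).append(addr)

    def resolve(self, branch_id, mispredicted):
        """Classify all references of a branch once its outcome is known."""
        refs = self.pending.pop(branch_id, [])
        if mispredicted:
            self.wrong.extend(refs)
        else:
            self.correct.extend(refs)


tracker = WrongPathTracker()
tracker.issue(branch_id=1, addr=0x100)
tracker.issue(branch_id=1, addr=0x140)
tracker.issue(branch_id=2, addr=0x200)
tracker.resolve(branch_id=1, mispredicted=True)   # both references were wrong-path
tracker.resolve(branch_id=2, mispredicted=False)  # this reference was correct-path
print(len(tracker.wrong), len(tracker.correct))
```

Keeping the two lists separate is what allows wrong-path references to be counted on their own in per-category statistics.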

  14. Experimental Methodology

  15. Outline • Wrong-Path Effects in SMPs and Multi-CMPs • Simulation Methodology • Evaluation Results • Conclusion

  16. Evaluation Results 1 / 5 -- L1 and L2 Cache Traffic (panels: 4 CMPs, 8 CMPs) • Total memory references increase by 16% and 14% for 4 and 8 CMPs, respectively. • L2 cache references increase by 35% and 36%, respectively. • For em3d, the number of L1 misses increases by as much as 70%.

  17. Evaluation Results 2 / 5 -- Coherence Traffic (panels: 4 CMPs, 8 CMPs) • Internal -- 36% • External -- 30%

  18. Evaluation Results 3 / 5 -- L1 and L2 cache replacements • L1 -- 30%, L2 -- 17% • Potential Cache Performance Impact

  19. Evaluation Results 4 / 5 -- Write Misses • 4 CMPs: on average 7% • 8 CMPs: on average 4%

  20. Evaluation Results 5 / 5 -- Cache Line State Transitions • 4 CMPs: internal 2% to 13%, external 1% to 9% • 8 CMPs: internal 2% to 17%, external 1% to 10%

  21. Outline • Wrong-Path Effects in SMPs and Multi-CMPs • Simulation Methodology • Evaluation Results • Conclusion

  22. Conclusion • It is important to model WP memory references in cache-coherent multi-CMP systems • In multi-CMPs, WP references not only affect the performance of individual processors through prefetching and pollution, but also affect the performance of the entire system by increasing • cache coherence transactions • cache block state transitions • write-backs • invalidations • resource contention • For a workload with many cache-to-cache transfers, WP references can significantly affect coherence actions.

  23. The End Thank You !
