
Effects of wrong path mem. ref. in CC MP Systems

Effects of wrong path mem. ref. in CC MP Systems. Gökay Burak AKKUŞ Cmpe 511 – Computer Architecture. About the papers.


Presentation Transcript


  1. Effects of Wrong-Path Memory References in Cache-Coherent Multiprocessor Systems
     Gökay Burak AKKUŞ
     Cmpe511 – Computer Architecture

  2. About the papers
     - R. Sendag, A. Yilmazer, J.J. Yi, and Augustus K. Uht. Quantifying and Reducing the Effects of Wrong-Path Memory References in Cache-Coherent Multiprocessor Systems. IPDPS 2006, 2006.
     - O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Cache filtering techniques to reduce the negative impact of useless speculative memory references on processor performance. Symposium on Computer Architecture and High Performance Computing, 2004.
     - O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Understanding the effects of wrong-path memory references on processor performance. Workshop on Memory Performance Issues, 2004.

  3. What is it all about?
     - How wrong-path memory accesses affect cache coherence traffic, state transitions, and resource utilization.
     - The paper proposes a filtering mechanism and a replacement policy.

  4. Subjects
     - SMPs: shared-memory multiprocessor systems
     - Cache coherence
     - Branch prediction and prefetching
     - Wrong paths

  5. Cache Coherence Solutions
     Snooping solution (snoopy bus):
     - Send all requests for data to all processors.
     - Processors snoop to see if they have a copy and respond accordingly.
     - Requires broadcast, since caching information is held at the processors.
     - Works well with a bus (a natural broadcast medium).
     - Dominates for small-scale machines (most of the market).
     Directory-based schemes:
     - Keep track of what is being shared in one (logically) centralized place.
     - Distributed memory => distributed directory for scalability (avoids bottlenecks).
     - Send point-to-point requests to processors via the network.
     - Scale better than snooping.
     - Actually existed BEFORE snooping-based schemes.
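The difference in request fan-out between the two approaches can be sketched as follows. This is an illustrative toy only: the function names, the 16-processor count, and the directory entry are assumptions made for the example, and a real directory protocol would route the request through the block's home node first.

```python
def snoop_targets(requester, processors):
    """Snooping: the request is broadcast to every other processor."""
    return [p for p in processors if p != requester]


def directory_targets(requester, block, directory):
    """Directory: the request goes only to processors recorded as holding the block."""
    return sorted(directory.get(block, set()) - {requester})


processors = list(range(16))     # hypothetical 16-processor machine
directory = {0x40: {3, 7}}       # hypothetical entry: block 0x40 is cached by P3 and P7

broadcast = snoop_targets(0, processors)
point_to_point = directory_targets(0, 0x40, directory)
print(len(broadcast), point_to_point)
```

Every request costs a message per processor under snooping, while the directory lookup limits messages to the actual sharers, which is why the directory approach scales better.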

  6. Cache Coherence Protocols
     - MSI (Modified, Shared, Invalid)
     - MESI (Modified, Exclusive, Shared, Invalid)
     - MOESI (Modified, Owned, Exclusive, Shared, Invalid)

  7. Wrong-path effects
     - Replacements
     - Writebacks
     - Invalidations
     - Cache block state transitions
     - Data/bus traffic and coherence transactions
     - Power consumption
     - Resource contention

  8. Replacements
     - Cause: a load instruction is executed speculatively on a mispredicted path.
     - The load brings a cache block into the data cache.
     - One of the resident cache blocks is replaced by the new one.

  9. Writebacks
     - A replacement caused by a wrong-path reference may evict a block in state M (exclusive, dirty) or O (shared, dirty).
     - Before such a block is removed from the cache, a writeback occurs.
     - In MSI and MESI, if a requested cache block is in state M, it is written back to memory before it is sent to the requestor.
     - Its state is then set to S in the original owner's cache.
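How a wrong-path load can trigger both a replacement and a writeback can be sketched with a toy cache. This is a minimal sketch, not the simulator used in the papers: the direct-mapped organization, the class name, and the block addresses are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

DIRTY_STATES = {"M", "O"}  # MOESI states whose data differs from memory


@dataclass
class DirectMappedCache:
    """Toy direct-mapped cache: one block per set, indexed by block address modulo set count."""
    num_sets: int
    lines: dict = field(default_factory=dict)        # set index -> (block address, state)
    writebacks: list = field(default_factory=list)   # block addresses written back to memory

    def fill(self, addr, state):
        """Bring a block in, evicting (and writing back if dirty) the current occupant."""
        index = addr % self.num_sets
        victim = self.lines.get(index)
        if victim is not None and victim[0] != addr and victim[1] in DIRTY_STATES:
            self.writebacks.append(victim[0])        # extra memory traffic
        self.lines[index] = (addr, state)


cache = DirectMappedCache(num_sets=4)
cache.fill(0x10, "M")      # a correct-path store leaves block 0x10 dirty
cache.fill(0x14, "S")      # a wrong-path load maps to the same set and evicts it
print(cache.writebacks)    # the dirty victim had to be written back
```

The wrong-path fill costs one replacement and one writeback, and a later correct-path access to block 0x10 would miss.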

  10. Invalidations
      - Assume the MOESI protocol.
      - A wrong-path load instruction accesses a cache block that has been modified by another processor.
      - The owner sets its state to O.
      - The requestor gets the block in state S.
      - If the owner later needs to write to that block, it changes the state from O to M and then invalidates all other copies.

  11. Cache Block State Transitions
      - A wrong-path request causes 2 extra state transitions in the owner's cache.
      - When a modified block is requested, its state changes from M to O.
      - When the owner modifies that block again, its state returns to M.
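The MOESI sequence described for invalidations and state transitions can be traced in a few lines. This is a hedged sketch covering only this one scenario: the function names and the processor labels P0 and P1 are hypothetical, and the full protocol has many more transitions.

```python
def remote_read(states, owner, requester):
    """A load on `requester` hits a block that `owner` holds in state M."""
    assert states[owner] == "M"
    states[owner] = "O"        # owner keeps the dirty data but now shares it
    states[requester] = "S"    # requester receives a shared copy


def owner_write(states, owner):
    """The owner writes again: it invalidates all other copies and upgrades O -> M."""
    invalidated = [p for p, s in states.items() if p != owner and s != "I"]
    for p in invalidated:
        states[p] = "I"
    states[owner] = "M"
    return invalidated


states = {"P0": "M", "P1": "I"}   # per-processor state of a single block
history = [states["P0"]]

remote_read(states, "P0", "P1")   # the wrong-path load on P1
history.append(states["P0"])

sent = owner_write(states, "P0")  # P0 writes the block again
history.append(states["P0"])

print(history, sent)
```

If the load on P1 was on a wrong path, the M to O and O to M transitions in P0 and the invalidation sent to P1 were all unnecessary.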

  12. Data/Bus Traffic and Coherence Transactions
      - Wrong-path L1 and L2 cache accesses cause extra replacements, writebacks, invalidations, and state transitions.
      - Data and bus traffic increase as a result.
      - The additional snoop or directory requests also increase traffic.

  13. Power Consumption
      - Unnecessary snoops, traffic overhead, and state transition overhead all increase power consumption.
      - Example: filtering unnecessary snoops may reduce L2 cache power by 30% (see Moshovos et al.).

  14. Resource Contention
      - Wrong-path memory accesses compete with correct-path memory accesses for the multiprocessor's resources.
      - Additional cache coherence transactions may cause service buffers to fill up more often.
      - Result: an increased chance of deadlocks.

  15. Simulation
      - SPLASH-2 benchmark suite
      - em3d simulation benchmark
      - MOSI and MOESI protocols used
      - 16-processor SPARC v9

  16. Statement based on experiments
      - Mispredicted branches are resolved before 94% of wrong-path L2 misses complete.
      - Therefore, whether an L2 cache miss is speculative is usually known before the block is placed into the L2 cache. [REF2]

  17. Reducing Cache Pollution: Filtering
      - Filtering is applied to the L2 cache.
      - Observation: if a speculatively fetched cache block is not used while it resides in the L1 cache, it is likely that the block will not be used at all, or will not be used before being evicted from the L2 cache.
      - In this mechanism, all memory references made by wrong-path instructions or the prefetcher are fetched only into the first-level cache.
      - The processor monitors whether they are referenced by non-speculative (correct-path) instructions.
      - Based on this observation, the processor may choose not to write the block into the L2 cache, or may adopt a policy that gives lower priority to the unused speculatively fetched block.
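The filtering idea can be sketched as a toy two-level hierarchy. This is a minimal sketch under stated assumptions, not the mechanism as implemented in the cited paper: the class and method names are hypothetical, capacity limits are ignored, and it models only the first option (dropping unused speculative blocks instead of writing them into L2), not the lower-priority variant.

```python
class FilteredHierarchy:
    """Toy L1/L2 pair in which speculative fills bypass L2 until proven useful."""

    def __init__(self):
        self.l1 = {}      # block address -> {"speculative": bool, "used": bool}
        self.l2 = set()   # block addresses resident in L2

    def fill(self, addr, speculative):
        # Correct-path fills go to both levels; speculative ones go to L1 only.
        self.l1[addr] = {"speculative": speculative, "used": not speculative}
        if not speculative:
            self.l2.add(addr)

    def correct_path_access(self, addr):
        # A non-speculative reference marks the block as useful.
        if addr in self.l1:
            self.l1[addr]["used"] = True

    def evict_from_l1(self, addr):
        line = self.l1.pop(addr)
        if line["speculative"] and line["used"]:
            self.l2.add(addr)   # the block proved useful, so keep it in L2
        # unused speculative blocks are dropped and never pollute L2


h = FilteredHierarchy()
h.fill(1, speculative=True)    # wrong-path block that is used later
h.fill(2, speculative=True)    # wrong-path block that is never used
h.correct_path_access(1)
h.evict_from_l1(1)
h.evict_from_l1(2)
print(sorted(h.l2))
```

Only block 1 reaches L2, because it was referenced by a correct-path access while it sat in L1.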

  18. Wrong-Path-Aware Replacement Policy
      - When a block is brought into the cache, it is marked as being on either the correct path or the wrong path.
      - When a block needs to be evicted, wrong-path blocks are evicted first, on an LRU basis if there are multiple wrong-path blocks.
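Victim selection under this policy can be sketched as below. The record layout (`addr`, `wrong_path`, `last_use`) and the function name are assumptions made for the example. The fallback to plain LRU when a set holds no wrong-path block is also an assumption, since the slide does not state it.

```python
def choose_victim(cache_set):
    """Pick the block to evict from one cache set.

    cache_set is a list of dicts with keys 'addr', 'wrong_path' (bool),
    and 'last_use' (timestamp of the most recent access).
    """
    wrong = [b for b in cache_set if b["wrong_path"]]
    candidates = wrong if wrong else cache_set          # wrong-path blocks go first
    return min(candidates, key=lambda b: b["last_use"])  # LRU among the candidates


cache_set = [
    {"addr": 0xA0, "wrong_path": False, "last_use": 1},  # oldest block, but correct-path
    {"addr": 0xB0, "wrong_path": True,  "last_use": 5},
    {"addr": 0xC0, "wrong_path": True,  "last_use": 3},
    {"addr": 0xD0, "wrong_path": False, "last_use": 7},
]
victim = choose_victim(cache_set)
print(hex(victim["addr"]))
```

Plain LRU would evict block 0xA0 here; the wrong-path-aware policy instead evicts 0xC0, the least recently used of the two wrong-path blocks.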

  19. Performance Evaluation

  20. Conclusions & Critique
      - IPC (instructions per cycle) can be used as the metric.
      - In some cases wrong-path execution positively affects overall performance: mcf, parser, and perlbmk.
      - In some cases it has a significantly negative effect: vpr and gcc.
      - To model or not to model? Especially relevant for future systems with longer memory interconnect latencies and processors with larger instruction windows.
      - The real effect: cache pollution.
      - In the SMP case especially: for a workload with many cache-to-cache transfers, wrong-path memory references can significantly affect the coherence actions.
      - The proposed solutions have not yet been studied in depth.

  21. References
      - R. Sendag, A. Yilmazer, J.J. Yi, and Augustus K. Uht. Quantifying and Reducing the Effects of Wrong-Path Memory References in Cache-Coherent Multiprocessor Systems. IPDPS 2006, 2006.
      - O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Cache filtering techniques to reduce the negative impact of useless speculative memory references on processor performance. Symposium on Computer Architecture and High Performance Computing, 2004.
      - O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Understanding the effects of wrong-path memory references on processor performance. Workshop on Memory Performance Issues, 2004.
