Memory Dependence Prediction using Store Sets - ISCA-25 Proceedings 1998
Explore the concept of store sets for efficient memory dependence prediction in this insightful 1998 ISCA paper, examining aggressive approaches, naive speculation, ideal models, and practical implementations.
CS 7810 Lecture 8: Memory Dependence Prediction using Store Sets, G.Z. Chrysos and J.S. Emer, Proceedings of ISCA-25, 1998
LSQ Basics • An incomplete store stalls all future loads ("No Speculation"); the paper's version is overly conservative because it also waits for store values • Most of these stalls are unnecessary: they are artificial dependences
Aggressive Approach • Assume that loads do not conflict with earlier stores: all loads and stores execute out of order (naive speculation) • When there is a conflict, the load behaves like a branch mispredict: all subsequent instructions are squashed and re-fetched • Expensive: 30-cycle penalty • Rename checkpoints for all instructions • Re-execute only the dependent instructions? More complex, but better performance
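To make the conflict case concrete, here is a minimal sketch of how a memory-order violation could be detected after the fact under naive speculation. The `MemOp` record, the `find_violations` helper, and the example addresses are illustrative assumptions rather than anything from the paper; only the 30-cycle penalty comes from the slide. The check is deliberately conservative: it flags any load that issued before an older store to the same address.

```python
from dataclasses import dataclass

SQUASH_PENALTY_CYCLES = 30  # penalty figure quoted on the slide


@dataclass(frozen=True)
class MemOp:
    seq: int   # position in program order (smaller = older)
    kind: str  # "ld" or "st"
    addr: int  # resolved effective address


def find_violations(program, issue_order):
    """Return (store_seq, load_seq) pairs where a younger load issued
    before an older store to the same address."""
    issued_at = {seq: t for t, seq in enumerate(issue_order)}
    violations = []
    for st in program:
        if st.kind != "st":
            continue
        for ld in program:
            if (ld.kind == "ld" and ld.seq > st.seq and ld.addr == st.addr
                    and issued_at[ld.seq] < issued_at[st.seq]):
                violations.append((st.seq, ld.seq))
    return violations


# Hypothetical three-instruction program: one store, two younger loads.
program = [MemOp(0, "st", 0x100), MemOp(1, "ld", 0x100), MemOp(2, "ld", 0x200)]
in_order = find_violations(program, issue_order=[0, 1, 2])
speculative = find_violations(program, issue_order=[1, 2, 0])
penalty = SQUASH_PENALTY_CYCLES * len(speculative)
```

Only the load to `0x100` conflicts; the load to `0x200` was safe to hoist, which is the case naive speculation is betting on.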
Ideal Model • In the perfect model, loads only wait for conflicting stores: no artificial dependences and no memory-order violations
Store Sets Concept • For every load, keep track of all stores that it has conflicted with in the past • A load does not issue if members of its store set have not finished (dependences are introduced at the time of dispatch) • The implementation is easy if (a) a load depends on only one store and (b) a store is present in only one store set
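The concept above can be sketched as an unbounded, idealized predictor keyed by instruction PC. The class and method names are hypothetical, and a real design would use fixed-size tables instead of Python sets.

```python
class StoreSetPredictor:
    """Idealized store sets: one unbounded set of store PCs per load PC."""

    def __init__(self):
        self.store_sets = {}  # load PC -> set of store PCs it has conflicted with

    def record_violation(self, load_pc, store_pc):
        # Called when a memory-order violation between the pair is detected.
        self.store_sets.setdefault(load_pc, set()).add(store_pc)

    def may_issue(self, load_pc, inflight_store_pcs):
        # The load waits while any member of its store set is still in flight.
        members = self.store_sets.get(load_pc, set())
        return members.isdisjoint(inflight_store_pcs)


predictor = StoreSetPredictor()
before = predictor.may_issue(0x40, {0x80})      # no history yet: speculate
predictor.record_violation(load_pc=0x40, store_pc=0x80)
blocked = predictor.may_issue(0x40, {0x80})     # now waits for store 0x80
unrelated = predictor.may_issue(0x40, {0x90})   # other stores do not block it
```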
Trivial Implementations • Execution time normalized to an ideal store set implementation
Ideal Store Set Predictor • An occasional memory-order violation can introduce many false dependences; hence, use saturating counters
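One way to read the saturating-counter idea is a small counter per (load, store) pair that must reach a threshold before the dependence is enforced, so that a one-off violation does not create a lasting false dependence. The 2-bit width, the threshold of 2, and the increment/decrement policy below are illustrative assumptions, not values taken from the paper.

```python
class CountedStoreSets:
    """Per-pair saturating counters gating whether a dependence is enforced."""

    MAX_COUNT = 3   # assumed 2-bit counter
    THRESHOLD = 2   # assumed: enforce only once the counter reaches this

    def __init__(self):
        self.counters = {}  # (load PC, store PC) -> counter value

    def on_violation(self, load_pc, store_pc):
        key = (load_pc, store_pc)
        self.counters[key] = min(self.MAX_COUNT, self.counters.get(key, 0) + 1)

    def on_false_dependence(self, load_pc, store_pc):
        # The pair turned out to access different addresses this time.
        key = (load_pc, store_pc)
        self.counters[key] = max(0, self.counters.get(key, 0) - 1)

    def enforce(self, load_pc, store_pc):
        return self.counters.get((load_pc, store_pc), 0) >= self.THRESHOLD


c = CountedStoreSets()
c.on_violation(0x40, 0x80)
after_one = c.enforce(0x40, 0x80)    # a single violation is not enough
c.on_violation(0x40, 0x80)
after_two = c.enforce(0x40, 0x80)    # repeated conflicts enable the dependence
c.on_false_dependence(0x40, 0x80)
after_decay = c.enforce(0x40, 0x80)  # one false dependence drops it below threshold
```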
Implementation Overview • Every load/store depends on the last store in its set • This causes serialized stores and false dependences
Store Set Implementation • Every load and store belongs to one color; keep track of the last writer for each color (mispredicts can pose problems) • Colors are merged as memory-order violations are discovered
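The "last writer per color" bookkeeping can be sketched with two tables, under the assumption that a color corresponds to a store set ID (SSID): one table maps a PC to its SSID, the other maps an SSID to the instruction number of the most recently fetched store. The class, the modulo indexing, and the method names are hypothetical; the default sizes follow the figures quoted in these slides.

```python
class StoreSetTables:
    """Sketch of a PC -> SSID table plus an SSID -> last-fetched-store table."""

    def __init__(self, ssit_entries=4096, lfst_entries=128):
        self.ssit = [None] * ssit_entries  # PC index -> SSID (color)
        self.lfst = [None] * lfst_entries  # SSID -> inum of last fetched store

    def _index(self, pc):
        return pc % len(self.ssit)  # assumed trivial hash of the PC

    def assign(self, pc, ssid):
        self.ssit[self._index(pc)] = ssid

    def dispatch_load(self, pc):
        """Return the inum of the store this load must wait for, or None."""
        ssid = self.ssit[self._index(pc)]
        return None if ssid is None else self.lfst[ssid]

    def dispatch_store(self, pc, inum):
        """Return the earlier store to order behind, then become last writer."""
        ssid = self.ssit[self._index(pc)]
        if ssid is None:
            return None
        previous = self.lfst[ssid]
        self.lfst[ssid] = inum
        return previous

    def store_issued(self, pc, inum):
        # Clear the entry only if no younger store in the set has replaced it.
        ssid = self.ssit[self._index(pc)]
        if ssid is not None and self.lfst[ssid] == inum:
            self.lfst[ssid] = None


t = StoreSetTables()
t.assign(0x40, 5)                          # load PC, color 5
t.assign(0x80, 5)                          # store PC, same color
first_dep = t.dispatch_store(0x80, 10)     # no earlier store in the set
load_dep = t.dispatch_load(0x40)           # load waits on store inum 10
second_dep = t.dispatch_store(0x80, 11)    # stores in a set are serialized
t.store_issued(0x80, 10)                   # entry now names inum 11, so it stays
t.store_issued(0x80, 11)
load_dep_after = t.dispatch_load(0x40)     # nothing left to wait for
```

The `second_dep` result shows where the serialized stores come from: a store in a set is ordered behind the previous store of the same set, whether or not the two actually conflict.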
Store Set Merging • Store set merging improves performance by 12% • Note that merging happens gradually: there is no need to instantly correct all entries in the table
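A sketch of gradual merging, assuming one plausible assignment rule: on a violation, whichever of the two instructions lacks a set joins the other's, and if both already have one, the smaller SSID wins. The smaller-ID tie-break is stated here as an assumption rather than quoted from the slides, and the dictionary-based table and PC names are illustrative.

```python
import itertools


def merge_on_violation(ssit, load_pc, store_pc, fresh_ids):
    """Update only the two offending PCs; other members migrate later."""
    ld, st = ssit.get(load_pc), ssit.get(store_pc)
    if ld is None and st is None:
        ssid = next(fresh_ids)   # neither has a set: allocate a new one
    elif ld is None:
        ssid = st                # load joins the store's set
    elif st is None:
        ssid = ld                # store joins the load's set
    else:
        ssid = min(ld, st)       # assumed tie-break: smaller SSID wins
    ssit[load_pc] = ssit[store_pc] = ssid
    return ssid


ids = itertools.count()
ssit = {}
merge_on_violation(ssit, "L1", "S1", ids)   # L1, S1 -> set 0
merge_on_violation(ssit, "L2", "S2", ids)   # L2, S2 -> set 1
merge_on_violation(ssit, "L1", "S2", ids)   # S2 moves to set 0; L2 is untouched
snapshot = dict(ssit)
merge_on_violation(ssit, "L2", "S2", ids)   # a later violation pulls L2 over too
```

The snapshot taken after the third call still has `L2` in set 1, which illustrates the gradual behavior: only the pair involved in each violation is rewritten, and the rest of the old set converges as further violations occur.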
Design Details • Merging store sets • To deal with occasional dependences and conflicts: clear the table every million cycles; use saturating counters for each entry • The SSIT (Store Set ID Table) needs 4K entries and the LFST (Last Fetched Store Table) needs 128 entries
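Periodic clearing can be sketched as a cycle counter that wipes the table at a fixed interval, so stale or one-off dependences eventually disappear and must be relearned. The class and method names are hypothetical; the defaults mirror the 4K entries and million-cycle interval mentioned above.

```python
class ClearingSSIT:
    """PC-indexed SSID table that is wiped every `clear_interval` cycles."""

    def __init__(self, entries=4096, clear_interval=1_000_000):
        self.entries = [None] * entries
        self.clear_interval = clear_interval
        self.cycles_since_clear = 0

    def assign(self, pc, ssid):
        self.entries[pc % len(self.entries)] = ssid

    def lookup(self, pc):
        return self.entries[pc % len(self.entries)]

    def tick(self, cycles=1):
        # Advance time; forget every learned dependence at the interval.
        self.cycles_since_clear += cycles
        if self.cycles_since_clear >= self.clear_interval:
            self.entries = [None] * len(self.entries)
            self.cycles_since_clear = 0


table = ClearingSSIT(entries=16, clear_interval=10)  # tiny values for the demo
table.assign(0x40, 3)
table.tick(9)
before_clear = table.lookup(0x40)   # still learned
table.tick(1)
after_clear = table.lookup(0x40)    # wiped at the interval boundary
```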
Related Work • Store barrier cache: identify stores that are likely to pose conflicts • Keep track of all store-load conflict pairs and associatively check for dependences while dispatching instructions
Next Week's Paper • "Effective Hardware-Based Prefetching for High-Performance Microprocessors", T.F. Chen and J.L. Baer, IEEE Transactions on Computers, May 1995