
John Cavazos Dept of Computer & Information Sciences University of Delaware



Presentation Transcript


  1. Arch Explorer, Lecture 5. John Cavazos, Dept of Computer & Information Sciences, University of Delaware. www.cis.udel.edu/~cavazos/cisc879

  2. Motivation: Need for systematic quantitative comparison [MICRO 2004, Gracia-Pérez et al.]

  3. Computer Arch Research. [Diagram labels: IDEA; REPRODUCTION; EXISTING TECHNIQUES; FAIR COMPARISON; EXPLORATION]

  4. Design space exploration: AUTOMATIC EXPLORATION. Need more than intuition and experience!
     • Multi-objectives
     • Time-to-market
     [Figure axes: execution time; power; area]

  5. ArchExplorer (archexplorer.org): FULLY AUTOMATIC
     [Diagram labels: upload; test; pick design points; database; add results; daily update; Website; Server-side Infrastructure; simulation cluster]

  6. How to compare?
     • Custom simulator
     • Hardware compatibility
     • Software compatibility
     • Upload
     [Diagram labels: Custom Simulator (F, D, S, EX, WB, CM, SM, M; $, TLB, MEM); Wrapped Simulator & Parameter ranges (CPU, BP, IL1, DL1, L2, MEM)]

  7. Hardware compatibility
     • Instruction caches
     • Data caches
     • Branch predictors
     • Interconnects
     • Main memory
     • Accelerators
     • ...

  8. Software compatibility: Isolate the hardware block, possibly by moving from centralized control to distributed control

  9. Software compatibility
     • Wrapping in SystemC, based on the UNISIM communication layer
     • Models of computation
     • Self-configuration and parameter legality

  10. Case study: Memory sub-system for embedded processor
     • PowerPC405
     • 8 different cache modules available
     • Complex hierarchies automatically explored
     • Ranking designs for performance, power, energy, area, ...
     Cache modules: Victim Cache; Timekeeping Victim Cache; Stride Prefetcher; Content-Directed Prefetcher; Stride + Content-Directed Prefetcher; Tag Prefetcher; Global History Prefetcher; Skewed associative cache
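One way to picture "ranking designs" across several objectives at once is a Pareto filter. This is a swapped-in illustration, not the ranking ArchExplorer itself is described as using (slide 19 mentions ranking per area bucket by speedup or power). The design names and metric values below are hypothetical placeholders, and every objective is assumed to be "lower is better".

```python
from typing import Dict, List

Design = Dict[str, float]  # objective name -> value (lower is better, by assumption)

def dominates(a: Design, b: Design, objectives: List[str]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(a[o] <= b[o] for o in objectives)
    strictly_better = any(a[o] < b[o] for o in objectives)
    return no_worse and strictly_better

def pareto_front(designs: Dict[str, Design], objectives: List[str]) -> List[str]:
    """Return the names of designs that no other design dominates."""
    front = []
    for name, metrics in designs.items():
        dominated = any(
            dominates(other, metrics, objectives)
            for other_name, other in designs.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)

# Hypothetical, normalized metrics (not measurements from the lecture).
designs = {
    "design_a": {"time": 1.00, "energy": 1.00, "area": 1.00},
    "design_b": {"time": 0.90, "energy": 1.10, "area": 1.20},
    "design_c": {"time": 1.05, "energy": 1.05, "area": 1.10},
}
front = pareto_front(designs, ["time", "energy", "area"])
print(front)
```

Here `design_c` is dropped because `design_a` is at least as good on every objective, while `design_a` and `design_b` trade execution time against energy and area and so both remain.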

  11. Accurate comparison needs compiler tuning as well
     [Figure labels: baseline; tuned to P1; tuned to P2; P1 < P2; P1 > P2; values 2.62, 1.23, 1.09]

  12. Best data cache mechanisms per area
     CONCLUSIONS:
     • Contrast to Gracia-Pérez et al. [MICRO 2004]
     • No clear winner
     • Close to tuned parametric cache


  14. Composing cache hierarchies

  15. Speedup and Energy Improvement

  16. Check out this website: ARCHEXPLORER.ORG

  17. Conclusion
     • Permanent open competition(s)
     • Future:
        • superscalar processor
        • branch predictor repository
        • multi-cores
     • Open for your ideas!
        • NoC, compiler extensions, ...


  19. Veerle Desmet, Sylvain Girbal, Olivier Temam. 6th HiPEAC Industrial Workshop, Thales, Nov 26th, 2008
     Statistical Exploration: Genetic Search Algorithm
     • Permanently ranks all designs
        • per area bucket
        • speedup or power
        • assigning higher probability to better points
     • Picking a point according to distribution
     • Mutations & crossover
     • Natural selection
     [Diagram labels: CPU; $; BP; MEM; Convergence]
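The bullets on slide 19 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the ArchExplorer implementation: the linear rank weighting, the bucket width, the mutation rate, and the parameter names `size_kb` and `assoc` are all hypothetical, and the "natural selection" step (discarding weak points over time) is not modeled.

```python
import random
from collections import defaultdict

def selection_weights(designs, bucket_width):
    """Rank designs inside each area bucket by speedup; better rank gets a larger weight."""
    buckets = defaultdict(list)
    for i, d in enumerate(designs):
        buckets[int(d["area"] // bucket_width)].append(i)
    weights = [0.0] * len(designs)
    for members in buckets.values():
        ranked = sorted(members, key=lambda i: designs[i]["speedup"])  # worst first
        for rank, i in enumerate(ranked, start=1):
            weights[i] = float(rank)  # linear rank weighting (assumption)
    return weights

def crossover(a, b, rng):
    """Uniform crossover: take each parameter from either parent."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in a}

def mutate(params, ranges, rng, rate=0.2):
    """With probability `rate`, resample a parameter from its legal range."""
    return {k: (rng.choice(ranges[k]) if rng.random() < rate else v)
            for k, v in params.items()}

def next_point(designs, ranges, rng, bucket_width=1.0):
    """Pick two parents according to the weight distribution, then recombine and mutate."""
    weights = selection_weights(designs, bucket_width)
    p1, p2 = rng.choices(designs, weights=weights, k=2)
    return mutate(crossover(p1["params"], p2["params"], rng), ranges, rng)

# Hypothetical parameter ranges and already-evaluated designs.
ranges = {"size_kb": [4, 8, 16, 32], "assoc": [1, 2, 4]}
designs = [
    {"params": {"size_kb": 4, "assoc": 1}, "area": 0.4, "speedup": 1.00},
    {"params": {"size_kb": 8, "assoc": 2}, "area": 0.7, "speedup": 1.10},
    {"params": {"size_kb": 32, "assoc": 4}, "area": 1.6, "speedup": 1.15},
]
rng = random.Random(0)
child = next_point(designs, ranges, rng)
print(selection_weights(designs, 1.0), child)
```

Ranking within an area bucket, rather than globally, keeps small designs from being crowded out by larger ones that are faster only because they spend more area.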

  20. Features for Systematic DSE (http://unisim.org)
     • Features: Standardized Interfaces; Module Repository; Compatibility Database; Parameter Check; Parameter Introspection; Compiler Flag Database
     • Feature details: module category; module interfaces; known models; configuration validity; ranges; params. relationship; probing neighbors' parameters; machine description (PPC, ARM)
     • Modules: WB$; NBWB$; VC$; SP$; TVC$; CDP$; CDPSP$; TagP$; GHB$; BUS; DRAM
     • Inputs: configs; compiler flags; benchmarks; datasets; compatibility database
     • Example DRAM constraints: nBanks ∈ {2;4;8}; tRAS+tCD<tRCD
     • Exploration: predictive modeling; module exploration; module parameter tuning; compiler exploration; focused search algorithm; design space exploration; selection probability; fast convergence
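The "Parameter Check" idea on slide 20 (ranges plus relationships between parameters) could look something like the sketch below. The two rules are copied as they appear on the slide and have not been checked against any DRAM specification; reading the `nBanks` rule as set membership is an assumption, and the function name and error strings are hypothetical.

```python
def check_dram_params(params):
    """Return a list of rule violations; an empty list means the configuration is legal."""
    errors = []
    # Range rule from the slide: nBanks in {2; 4; 8} (set membership is assumed).
    if params["nBanks"] not in {2, 4, 8}:
        errors.append("nBanks must be one of 2, 4, 8")
    # Relationship rule, transcribed verbatim from the slide: tRAS + tCD < tRCD.
    if not (params["tRAS"] + params["tCD"] < params["tRCD"]):
        errors.append("tRAS + tCD must be less than tRCD")
    return errors

# Hypothetical configurations, for illustration only.
legal = {"nBanks": 4, "tRAS": 2, "tCD": 1, "tRCD": 5}
illegal = {"nBanks": 3, "tRAS": 4, "tCD": 2, "tRCD": 5}
print(check_dram_params(legal))
print(check_dram_params(illegal))
```

Running such a check before a design point is simulated keeps the search from spending cluster time on configurations that a module would reject anyway.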
