Performance Analysis of NUCA Policies for CMPs Using Parsec v2.0 Benchmark Suite

XX Jornadas de Paralelismo, A Coruña (Spain) – September 17, 2009. Performance Analysis of NUCA Policies for CMPs Using Parsec v2.0 Benchmark Suite. Javier Lira ψ Carlos Molina ф Antonio González λ. ф Dept. Enginyeria Informàtica Universitat Rovira i Virgili

Presentation Transcript


  1. XX Jornadas de Paralelismo, A Coruña (Spain) – September 17, 2009. Performance Analysis of NUCA Policies for CMPs Using Parsec v2.0 Benchmark Suite. Javier Lira ψ, Carlos Molina ф, Antonio González λ. ф Dept. Enginyeria Informàtica, Universitat Rovira i Virgili, Tarragona, Spain, carlos.molina@urv.net. ψ Dept. Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain, javier.lira@ac.upc.edu. λ Intel Barcelona Research Center, Intel Labs - UPC, Barcelona, Spain, antonio.gonzalez@intel.com

  2. Outline • Introduction • Methodology • Analysis of NUCA policies • Bank Placement Policy • Bank Access Policy • Bank Migration Policy • Bank Replacement Policy • Conclusions

  3. Introduction • CMPs have emerged as a dominant paradigm in system design. • They sustain performance improvements while reducing power consumption. • They take advantage of thread-level parallelism. • Commercial CMPs are currently available. • CMPs incorporate larger, shared last-level caches. • Wire delay is a key constraint.

  4. NUCA • Non-Uniform Cache Architecture (NUCA) was first proposed at ASPLOS 2002 by Kim et al. [1]. • NUCA divides a large cache into smaller and faster banks. • Banks close to the cache controller have lower latencies than banks farther away. [1] C. Kim, D. Burger and S. W. Keckler. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches. ASPLOS '02
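
The distance-dependent latency described on this slide can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not taken from the presentation: the banks are laid out on a grid, the cache controller sits at position (0, 0), and the cycle counts `base_cycles` and `hop_cycles` are hypothetical.

```python
def bank_latency(row, col, base_cycles=3, hop_cycles=1):
    """Round-trip latency (in cycles) of the bank at grid position (row, col).

    Assumptions (illustrative only): the controller is at (0, 0), the request
    and the reply each travel the Manhattan distance, and every hop costs
    `hop_cycles`. `base_cycles` is the time spent inside the bank itself.
    """
    hops = row + col  # Manhattan distance from the controller
    return base_cycles + 2 * hop_cycles * hops


# Latency map for a hypothetical 4x4 bank array.
latencies = [[bank_latency(r, c) for c in range(4)] for r in range(4)]
for row in latencies:
    print(row)
```

In this toy model the bank next to the controller answers in 3 cycles and the farthest bank in 15, which is the non-uniformity that the NUCA policies in the rest of the talk try to exploit.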

  5. NUCA Policies Bank Placement Policy Bank Access Policy Bank Migration Policy Bank Replacement Policy

  6. Outline • Introduction • Methodology • Analysis of NUCA policies • Bank Placement Policy • Bank Access Policy • Bank Migration Policy • Bank Replacement Policy • Conclusions

  7. Methodology • Simulation tools: • Simics + GEMS • CACTI v6.0 • PARSEC v2.0 Benchmark Suite

  8. Baseline NUCA cache architecture 8 cores 256 banks [2] B. M. Beckmann and D. A. Wood. Managing wire delay in large chip-multiprocessor caches. MICRO ‘04

  9. Outline • Introduction • Methodology • Analysis of NUCA policies • Bank Placement Policy • Bank Access Policy • Bank Migration Policy • Bank Replacement Policy • Conclusions

  10. Bank Placement Policy • 1B + Static • 16B + Static • 16B + Local

  11. Bank Placement Policy • 1B + Static placement provides a fair distribution of data. • 16B configurations concentrate data in a few banks. • Placement and migration policies are strictly correlated.
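
A minimal sketch of what static placement could look like is given below. The slides do not define "1B" and "16B"; the sketch assumes that "1B" means a block maps to exactly one bank and "16B" means a block may reside in any of 16 candidate banks. The 256-bank count comes from the baseline slide, while the 64-byte block size, the modulo mapping, and the function names are hypothetical.

```python
NUM_BANKS = 256    # from the baseline architecture slide
BLOCK_BYTES = 64   # assumed block size, not stated in the slides


def static_bank(addr):
    """Assumed '1B + Static' mapping: address bits select exactly one bank."""
    block = addr // BLOCK_BYTES
    return block % NUM_BANKS


def candidate_banks(addr, ways=16):
    """Assumed '16B' mapping: one candidate bank in each of `ways` groups."""
    block = addr // BLOCK_BYTES
    banks_per_group = NUM_BANKS // ways
    slot = block % banks_per_group
    return [group * banks_per_group + slot for group in range(ways)]


print(static_bank(0x1040))       # a single home bank
print(candidate_banks(0x1040))   # 16 banks the block could occupy
```

Under these assumptions, consecutive blocks spread evenly over all 256 banks in the 1B case, whereas in the 16B case the policy that picks among the 16 candidates decides how evenly the banks fill up.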

  12. Outline • Introduction • Methodology • Analysis of NUCA policies • Bank Placement Policy • Bank Access Policy • Bank Migration Policy • Bank Replacement Policy • Conclusions

  13. Bank Access Policy • Serial • 9P + 7P • Parallel

  14. Bank Access Policy • Power efficiency vs. performance. • 9P + 7P is a trade-off, but it is still far from the performance potential. • These results suggest there is broad room for improvement in this policy.
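
The power versus performance trade-off among the three access schemes can be made concrete with a small cost model. It is a sketch under stated assumptions: a block has 16 candidate banks, "Serial" probes them one at a time, "Parallel" probes all at once, and "9P + 7P" probes 9 banks in parallel and then the remaining 7. This reading of the labels is ours and is not spelled out in the slides. The function `lookup_cost` is hypothetical; the number of steps stands in for latency and the number of probed banks for energy.

```python
def lookup_cost(stages, hit_position):
    """Return (steps, banks_probed) for one lookup.

    stages: sizes of the bank groups probed in parallel at each step.
    hit_position: 0-based index, in search order, of the bank holding the
    block, or None for a miss (every stage is then probed).
    """
    probed = 0
    for step, size in enumerate(stages, start=1):
        probed += size
        if hit_position is not None and hit_position < probed:
            return step, probed
    return len(stages), probed


SERIAL = [1] * 16      # one bank per step
PARALLEL = [16]        # all banks at once
NINE_SEVEN = [9, 7]    # assumed meaning of "9P + 7P"

for name, stages in [("serial", SERIAL), ("9P+7P", NINE_SEVEN), ("parallel", PARALLEL)]:
    print(name, lookup_cost(stages, hit_position=12))
```

For a block sitting in the 13th bank of the search order, the serial scheme needs 13 steps but probes only 13 banks, the parallel scheme needs one step but probes all 16, and the two-stage scheme lands in between on latency.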

  15. Outline • Introduction • Methodology • Analysis of NUCA policies • Bank Placement Policy • Bank Access Policy • Bank Migration Policy • Bank Replacement Policy • Conclusions

  16. Bank Migration Policy • Static • Gradual + Swapping • Gradual + Replication

  17. Bank Migration Policy • Replication reduces the effective size of the cache. • Migration approaches concentrate data blocks in a few banks. • The static approach distributes data blocks fairly across the whole cache. • Placement and migration policies are strictly correlated.
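
Gradual migration with swapping can be sketched as follows. The sketch assumes that "gradual" means a block moves one bank closer to the requesting core on each hit and that "swapping" means it trades places with the block already stored there; the slides name the scheme but do not describe it at this level. The list `chain` and the function `migrate_on_hit` are hypothetical.

```python
def migrate_on_hit(chain, block):
    """Move `block` one bank closer to the core on a hit, swapping places.

    chain: list of blocks, one per bank, ordered from the bank closest to
    the requesting core (index 0) to the farthest. Modified in place.
    """
    i = chain.index(block)
    if i > 0:  # already in the closest bank: nothing to do
        chain[i - 1], chain[i] = chain[i], chain[i - 1]
    return chain


chain = ["A", "B", "C", "D"]
migrate_on_hit(chain, "C")
print(chain)  # "C" has moved one bank closer, "B" one bank farther
```

Repeated hits pull a hot block toward the closest bank, which is consistent with the observation above that migration concentrates data blocks in a few banks.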

  18. Outline • Introduction • Methodology • Analysis of NUCA policies • Bank Placement Policy • Bank Access Policy • Bank Migration Policy • Bank Replacement Policy • Conclusions

  19. Bank Replacement Policy • Zero-copy • One-copy • Last Bank

  20. Bank Replacement Policy • Giving a second chance to evicted data blocks provides a significant performance gain. • Last Bank is a promising mechanism, but its benefit is limited by its small size. • Further exploration of this policy is required.
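
The "second chance" idea can be illustrated by contrasting the two simpler schemes. The sketch assumes that zero-copy discards the evicted block immediately, while one-copy moves it into another bank, which may in turn push out one of that bank's blocks. These semantics are our assumption, since the slides only name the policies. The function `evict`, the FIFO victim choice, and the two-block bank capacity are hypothetical; Last Bank is not modeled.

```python
def evict(victim, policy, next_bank, capacity=2):
    """Handle a block evicted from a bank; return the blocks leaving the cache.

    next_bank: list of blocks in the bank that receives the victim under
    one-copy (oldest first). Modified in place.
    """
    if policy == "zero-copy":
        return [victim]                  # victim is dropped from the cache
    if policy == "one-copy":
        next_bank.append(victim)         # victim gets a second chance
        if len(next_bank) > capacity:
            return [next_bank.pop(0)]    # oldest block there leaves instead
        return []
    raise ValueError(f"unknown policy: {policy}")


bank = ["X"]
print(evict("V", "zero-copy", bank), bank)
print(evict("V", "one-copy", bank), bank)
```

With one-copy the evicted block stays on chip for a while longer, so a later access to it can still hit, at the price of displacing data elsewhere.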

  21. Outline • Introduction • Methodology • Analysis of NUCA policies • Bank Placement Policy • Bank Access Policy • Bank Migration Policy • Bank Replacement Policy • Conclusions

  22. Conclusions • NUCA is characterized by four policies. • NUCA policies are related. • Static placement with no-migration: Good trade-off. • Bank placement and bank migration are strictly correlated. • Bank access: Power efficiency vs. Performance. • Bank replacement: ↑ Performance (unbounded last bank). • Still room for improvement in all policies.

  23. Performance Analysis of NUCA Policies for CMPs Using Parsec v2.0 Benchmark Suite Questions?