Core-Selectability in Chip-Multiprocessors

Hashem H. Najaf-abadi, Niket K. Choudhary, Eric Rotenberg.

Presentation Transcript


  1. Core-Selectability in Chip-Multiprocessors. Hashem H. Najaf-abadi, Niket K. Choudhary, Eric Rotenberg

  2. Dividing the Design: A definition. Processing cores; all levels of cache; interconnect; ports to memory and I/O.

  3. What this Talk is About: How to improve the performance of a CMP by enabling exploitation of the full potential of the interconnect, by improving the processing. The interconnect is not fully utilized by all workloads; if it is, there is nothing to gain here.

  4. The Provisioning Factor: Balance in provisioned resources. Need ports to the interconnect. If the same interconnect is enough for a quad-core, then it was over-provisioned for a dual-core.

  5. The Provisioning Factor: Balance in provisioned resources. Some technique that boosts general performance. If the design is well provisioned with the same interconnect, then it must have been over-provisioned in the baseline.

  6. The Underutilization Factor: Interconnect not fully utilized by all applications. Workloads that depend the most on the interconnect have a louder say in what constitutes a well-provisioned design.
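The underutilization argument can be sketched in a few lines of Python. This is a minimal illustration, not data or code from the talk: the workload names and bandwidth demands are hypothetical, and the sizing rule (provision for the most demanding workload) is an assumption standing in for "a louder say".

```python
# Hypothetical interconnect bandwidth demand per workload (arbitrary units).
# These numbers are illustrative assumptions, not measurements from the talk.
DEMAND = {"streaming": 8.0, "pointer_chasing": 5.0, "compute_bound": 1.0}


def provisioned_capacity(demand):
    """Size the interconnect for the most interconnect-dependent workload."""
    return max(demand.values())


def utilization(demand):
    """Fraction of the provisioned interconnect each workload actually uses."""
    capacity = provisioned_capacity(demand)
    return {workload: need / capacity for workload, need in demand.items()}


if __name__ == "__main__":
    for workload, frac in utilization(DEMAND).items():
        print(f"{workload}: {frac:.1%} of interconnect capacity used")
```

Under these assumed numbers, only the most demanding workload saturates the interconnect; the others leave headroom that better processing could exploit.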

  7. The One-size-fits-all Factor: A single solution has limited performance. Trade-offs: RISC vs. CISC, wide vs. narrow issue, deep vs. shallow pipelining, large vs. small issue queue. Changing these trade-offs will improve performance for some workloads and degrade it for others. He’s not much for a conversation. But if he was, it would be a conversation about saving you execution time.

  8. The Shrinking Factor: Progressively less die area for the cores. Better return on increasing the interconnection resources.

  9. The Shrinking Factor: Progressively less die area for the cores.

  10. The Shrinking Factor: Progressively less die area for the cores. [Chart: percentage axis from 10% to 100% against year, 1990 to 2010; processors plotted: Intel 8086, Intel 8088, Intel 80286, Intel386, Intel 486DX, Intel Pentium, Intel Pentium III, Intel Pentium IV, Intel Core Duo, IBM Power3, IBM Power4, IBM Power5, IBM Power6, Niagara-1, Niagara-2.]

  11. The Diversity Factor: Can provide diversity in the core designs. Single Core Design: Optimized for all workloads. [Diagram labels: Program 1, Program 2]

  12. The Diversity Factor: Can provide diversity in the core designs. Heterogeneous Cores: Optimized for workload. [Diagram labels: Code 1, Code 2]

  13. Core-Selectability: Optimized for workload. [Diagram labels: Program 1, Program 2]
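The contrast between a single core design and selecting a core per workload can be sketched as follows. This is a minimal illustration under stated assumptions: the design names, the execution-time table, and the use of an arithmetic mean are all hypothetical and do not come from the talk's evaluation.

```python
from statistics import mean

# Hypothetical normalized execution times (lower is better).
# Design names and values are illustrative assumptions, not results from the talk.
EXEC_TIME = {
    "wide_shallow": {"prog1": 0.80, "prog2": 1.30},
    "narrow_deep": {"prog1": 1.25, "prog2": 0.85},
    "balanced": {"prog1": 1.00, "prog2": 1.00},
}


def best_single_design(table):
    """One-size-fits-all: the one design with the lowest mean time over all workloads."""
    return min(table, key=lambda design: mean(table[design].values()))


def select_core(table, available, workload):
    """Selectability: among the core designs available, pick the fastest for this workload."""
    return min(available, key=lambda design: table[design][workload])


def mean_time_with_selection(table, available):
    """Mean execution time when each workload runs on its selected core."""
    workloads = next(iter(table.values())).keys()
    return mean(table[select_core(table, available, w)][w] for w in workloads)


if __name__ == "__main__":
    single = best_single_design(EXEC_TIME)
    print("best single design:", single, mean(EXEC_TIME[single].values()))
    pair = ["wide_shallow", "narrow_deep"]
    print("with selection among", pair, mean_time_with_selection(EXEC_TIME, pair))
```

With these assumed numbers, the compromise design wins when only one design is allowed, while selecting between two specialized designs gives a lower mean time than any single design.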

  14. Core-Selectability Selectability

  15. Recap. Factors: One-size-fits-all Factor, Provisioning Factor, Shrinking Factor, Underutilization Factor, Diversity Factor. Core-Selectability. Port Sharing: can improve performance without increasing power density; results in a homogeneous design; can reduce verification effort by splitting up the workload space.

  16. Core-Selectability: Remains homogeneous at a high level. [Diagram label: CMP]

  17. Empirical Evaluation: Based on FabScalar. A library of synthesized implementations of different configurations of the microarchitectural units of a contemporary superscalar processor.

  18. The selection of cores. [Chart: normalized execution time]

  19. On Individual Benchmarks. [Chart: normalized execution time]

  20. The Effect of Selectability. [Chart: normalized execution time]

  21. Under Different Task Arrival Patterns: Average task turnaround time for (a) normal traffic and (b) bursty traffic.
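For readers unfamiliar with the metric, average task turnaround time can be sketched with a toy queue. This is only an illustration of the metric, assuming a single first-come-first-served server and made-up arrival times; it is not the simulation setup used in the talk.

```python
def average_turnaround(arrivals, service_times):
    """Average turnaround on one FIFO server; turnaround = completion time - arrival time.

    `arrivals` must be sorted in non-decreasing order.
    """
    clock = 0.0
    total = 0.0
    for arrival, service in zip(arrivals, service_times):
        start = max(clock, arrival)   # wait if the server is still busy
        clock = start + service       # completion time of this task
        total += clock - arrival
    return total / len(arrivals)


if __name__ == "__main__":
    service = [1.0, 1.0, 1.0, 1.0]
    evenly_spaced = [0.0, 2.0, 4.0, 6.0]   # hypothetical "normal" arrivals
    burst = [0.0, 0.0, 0.0, 0.0]           # hypothetical burst: all tasks arrive at once
    print("evenly spaced:", average_turnaround(evenly_spaced, service))
    print("burst:", average_turnaround(burst, service))
```

The same tasks take longer on average when they arrive in a burst, because later tasks queue behind earlier ones; shorter execution times per task reduce that queueing.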

  22. Overhead of Reconfigurability

  23. Implementation of Port Sharing: 26 ps added propagation delay. [Diagram labels: L1 Data Cache, Core A, Core B, core-selection, extra switching, extra wire (100 fF)]
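To put the 26 ps figure in perspective, the added delay can be expressed as a fraction of a clock period. The sketch below is plain arithmetic; the clock frequencies are hypothetical assumptions, since the slide does not state the clock rate of the evaluated design.

```python
def delay_fraction_of_cycle(added_delay_s, clock_hz):
    """Return an added delay as a fraction of one clock period (delay / period)."""
    return added_delay_s * clock_hz


ADDED_DELAY_S = 26e-12  # 26 ps added propagation delay, as stated on the slide

if __name__ == "__main__":
    for clock_ghz in (1.0, 2.0, 3.0):  # hypothetical clock frequencies
        frac = delay_fraction_of_cycle(ADDED_DELAY_S, clock_ghz * 1e9)
        print(f"{clock_ghz:.1f} GHz: {frac:.1%} of the cycle")
```

Whether this overhead matters depends on how much slack the path through the shared port has, which the slide does not specify.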

  24. Overhead of Reconfigurability • With reconfigurability, change is implemented within a core – with complex coupling between pipeline stages. • With Core-Selectability, change is implemented at the core level – with less complex coupling between core and interconnect.

  25. Thank you. It’s as if he knows you like to save execution time.
