
Embedded Computer Architecture 5KK73 MPSoC Platforms

Embedded Computer Architecture 5KK73 MPSoC Platforms. Part 2: Cell. Bart Mesman and Henk Corporaal. The Complexity Crisis. "I have always wished that my computer would be as easy to use as my telephone. My wish has come true. I no longer know how to use my telephone." --Bjarne Stroustrup.


Presentation Transcript


  1. Embedded Computer Architecture 5KK73 MPSoC Platforms. Part 2: Cell. Bart Mesman and Henk Corporaal

  2. The Complexity Crisis: "I have always wished that my computer would be as easy to use as my telephone. My wish has come true. I no longer know how to use my telephone." --Bjarne Stroustrup

  3. The Software Crisis

  4. The first SW crisis • Time frame: ’60s and ’70s • Problem: assembly language programming • Computers could handle larger, more complex programs • Needed abstraction and portability without losing performance • Solution: high-level languages for von Neumann machines (FORTRAN and C)

  5. The second SW crisis • Time frame: ’80s and ’90s • Problem: inability to build and maintain complex and robust applications requiring multi-million lines of code developed by hundreds of programmers • Computers could handle larger, more complex programs • Needed composability and maintainability • High performance was not an issue: left to Moore’s Law

  6. Solution • Object-oriented programming: C++, C# and Java • Also… • Better tools: component libraries, Purify • Better software engineering methodology: design patterns, specification, testing, code reviews

  7. Today: Programmers are Oblivious to Processors • Solid boundary between hardware and software • Programmers don’t have to know anything about the processor • High-level languages abstract away the processors • Ex: Java bytecode is machine independent • Thanks to Moore’s Law, programmers get good speedups without knowing anything about the processors • Programs are oblivious of the processor -> work on all processors • A program written in the ’70s using C still works and is much faster today • This abstraction provides a lot of freedom for the programmers

  8. The third crisis: Powered by PlayStation

  9. Contents • Hammer your head against 4 walls • Or: Why Multi-Processor • Cell Architecture • Programming and porting • plus case-study

  10. Moore’s Law

  11. Single Processor SPECint Performance

  12. What’s stopping them? • General-purpose uni-cores have stopped historic performance scaling • Power consumption • Wire delays • DRAM access latency • Diminishing returns of more instruction-level parallelism

  13. Power density

  14. Power Efficiency (Watts/Spec)

  15. 1 clock cycle wire range

  16. Global wiring delay becomes dominant over gate delay

  17. Memory: the processor-memory performance gap. [Figure: performance (log scale, 1 to 1000) versus time (1980 to 2005). µProc (CPU): 55%/year (“Moore’s Law”); DRAM: 7%/year; the processor-memory performance gap grows 50%/year. Source: Patterson]

  18. Now what? • Latest research drained • Tried every trick in the book • So: we’re fresh out of ideas. Multi-processor is all that’s left!

  19. Low power through parallelism • Sequential processor: switching capacitance C, frequency f, voltage V, so $P = f C V^2$ • Parallel processor (two times the number of units): switching capacitance 2C, frequency f/2, voltage V’ < V, so $P = \frac{f}{2} \cdot 2C \cdot V'^2 = f C V'^2$ • Same throughput, but power drops by a factor $(V'/V)^2$
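As a rough illustration of the formula on slide 19, the sketch below evaluates $P = f C V^2$ for the sequential and the two-unit parallel design and returns their ratio. The function names and the numeric values used in the usage note are assumptions made for this example, not figures from the slides.

```c
/* Dynamic switching power P = f * C * V^2
 * (f in Hz, C in farads, V in volts, result in watts). */
double dynamic_power(double f, double c, double v)
{
    return f * c * v * v;
}

/* Ratio P_parallel / P_sequential for the two-unit design on the slide:
 * capacitance 2C, frequency f/2, reduced voltage v_par instead of v_seq.
 * f and C cancel, so the result equals (v_par / v_seq)^2. */
double parallel_power_ratio(double f, double c, double v_seq, double v_par)
{
    double p_seq = dynamic_power(f, c, v_seq);
    double p_par = dynamic_power(f / 2.0, 2.0 * c, v_par);
    return p_par / p_seq;
}
```

For instance, with the hypothetical values f = 1 GHz, C = 1 nF, V = 1.0 V and V’ = 0.8 V, `parallel_power_ratio(1e9, 1e-9, 1.0, 0.8)` comes out at about 0.64, that is, roughly a third less power at the same throughput.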

  20. Architecture methods: Powerful Instructions (1) • MD-technique: multiple data operands per operation • SIMD: Single Instruction Multiple Data • Vector instruction: the loop for (i = 0; i < 64; i++) c[i] = a[i] + 5*b[i]; becomes c = a + 5*b • Assembly: set vl,64; ldv v1,0(r2); mulvi v2,v1,5; ldv v1,0(r1); addv v3,v1,v2; stv v3,0(r3)
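To make the vector sequence on slide 20 concrete, here is a minimal C sketch that emulates each vector instruction as a function over a 64-lane register and checks it against the scalar loop. The `vreg` type and the C functions are illustrative stand-ins named after the slide's mnemonics, not a real ISA binding. The mapping r1 = a, r2 = b, r3 = c is an assumption inferred from the instruction order.

```c
#include <stddef.h>

#define VL 64  /* vector length, as set by "set vl,64" */

/* One emulated vector register: VL integer lanes. */
typedef struct { int lane[VL]; } vreg;

/* ldv: load VL consecutive elements from memory. */
static vreg ldv(const int *mem)
{
    vreg v;
    for (size_t i = 0; i < VL; i++) v.lane[i] = mem[i];
    return v;
}

/* mulvi: multiply every lane by an immediate. */
static vreg mulvi(vreg v, int imm)
{
    for (size_t i = 0; i < VL; i++) v.lane[i] *= imm;
    return v;
}

/* addv: lane-wise addition of two vector registers. */
static vreg addv(vreg x, vreg y)
{
    for (size_t i = 0; i < VL; i++) x.lane[i] += y.lane[i];
    return x;
}

/* stv: store VL lanes to memory. */
static void stv(vreg v, int *mem)
{
    for (size_t i = 0; i < VL; i++) mem[i] = v.lane[i];
}

/* The slide's six-instruction sequence; v1 is reused, as on the slide. */
void vector_kernel(const int *a, const int *b, int *c)
{
    vreg v1 = ldv(b);          /* ldv   v1,0(r2)  (assumed: r2 -> b) */
    vreg v2 = mulvi(v1, 5);    /* mulvi v2,v1,5                      */
    v1 = ldv(a);               /* ldv   v1,0(r1)  (assumed: r1 -> a) */
    vreg v3 = addv(v1, v2);    /* addv  v3,v1,v2                     */
    stv(v3, c);                /* stv   v3,0(r3)  (assumed: r3 -> c) */
}

/* Scalar reference: the original loop. */
void scalar_kernel(const int *a, const int *b, int *c)
{
    for (size_t i = 0; i < VL; i++)
        c[i] = a[i] + 5 * b[i];
}
```

Each emulated instruction hides a 64-iteration loop, which is the point of the slide: one instruction fetch and decode covers 64 data operations.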

  21. Architecture methods: Powerful Instructions (1) • Sub-word parallelism • SIMD on restricted scale • Used for multi-media instructions • Motivation: use a powerful 64-bit ALU as 4 x 16-bit ALUs • Examples: MMX, SUN-VIS, HP MAX-2, AMD-K7/Athlon 3DNow!, TriMedia II • Example: $\sum_{i=1}^{4} |a_i - b_i|$
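The sub-word idea on slide 21 can be sketched in portable C by treating one 64-bit word as four 16-bit lanes. The helper names below are hypothetical. `add4x16` uses a standard masking trick to keep carries from crossing lane boundaries, while `sad4x16` simply loops over the lanes to show what a single sum-of-absolute-differences multimedia instruction would compute. It is not how any of the listed instruction sets implement it.

```c
#include <stdint.h>

/* Most significant bit of each of the four 16-bit lanes. */
#define LANE_MSB UINT64_C(0x8000800080008000)

/* Pack four 16-bit values into one 64-bit word (lane 0 in the low bits). */
uint64_t pack4x16(uint16_t l0, uint16_t l1, uint16_t l2, uint16_t l3)
{
    return (uint64_t)l0 | ((uint64_t)l1 << 16) |
           ((uint64_t)l2 << 32) | ((uint64_t)l3 << 48);
}

/* Lane-wise wrapping add of 4 x 16-bit values using one 64-bit adder.
 * The low 15 bits of each lane are added with the lane MSBs cleared, so a
 * carry can never leave its lane; the MSBs are then fixed up with XOR. */
uint64_t add4x16(uint64_t a, uint64_t b)
{
    uint64_t low = (a & ~LANE_MSB) + (b & ~LANE_MSB);
    return low ^ ((a ^ b) & LANE_MSB);
}

/* Sum of absolute differences over the four unsigned lanes:
 * sum_{i=1..4} |a_i - b_i|. Written lane by lane for clarity. */
uint32_t sad4x16(uint64_t a, uint64_t b)
{
    uint32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t x = (uint32_t)((a >> (16 * i)) & 0xFFFF);
        uint32_t y = (uint32_t)((b >> (16 * i)) & 0xFFFF);
        sum += (x > y) ? (x - y) : (y - x);
    }
    return sum;
}
```

For example, `sad4x16(pack4x16(10, 3, 7, 100), pack4x16(4, 9, 7, 50))` sums 6 + 6 + 0 + 50.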

  22. MPSoC Issues • Homogeneous vs Heterogeneous • Shared memory vs local memory • Topology • Communication (Bus vs. Network) • Granularity (many small vs few large) • Mapping • Automatic vs manual parallelization • TLP vs DLP • Parallel vs Pipelined

  23. Multi-core

  24. Cell

  25. What can it do?

  26. Cell/B.E. - the history • Sony/Toshiba/IBM consortium • Austin, TX – March 2001 • Initial investment: $400,000,000 • Official name: STI Cell Broadband Engine • Also goes by Cell BE, STI Cell, Cell • In production for: • PlayStation 3 from Sony • Mercury’s blades

  27. Cell blade

  28. Cell/B.E. – the architecture • 1 x PPE: 64-bit PowerPC (L1: 32 KB I$ + 32 KB D$; L2: 512 KB) • 8 x SPE cores (local store: 256 KB each; 128 x 128-bit vector registers each) • Hybrid memory model: PPE uses Rd/Wr, SPEs use asynchronous DMA • EIB: 205 GB/s sustained aggregate bandwidth • Processor-to-memory bandwidth: 25.6 GB/s • Processor-to-processor: 20 GB/s in each direction
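A back-of-the-envelope sketch of what the numbers on slide 28 imply, assuming 1 GB = 10^9 bytes and ignoring DMA setup latency and contention. The constant and function names are made up for this example.

```c
#include <stddef.h>

/* Figures taken from the slide. */
enum {
    NUM_SPES          = 8,
    LOCAL_STORE_BYTES = 256 * 1024,
    VREGS_PER_SPE     = 128,
    VREG_BITS         = 128
};

/* Total local-store capacity across all SPEs, in bytes. */
size_t total_local_store_bytes(void)
{
    return (size_t)NUM_SPES * LOCAL_STORE_BYTES;
}

/* Size of one SPE's vector register file, in bytes. */
size_t register_file_bytes(void)
{
    return (size_t)VREGS_PER_SPE * VREG_BITS / 8;
}

/* Best-case time, in microseconds, to fill one local store from main
 * memory at the given bandwidth (GB/s, with 1 GB = 1e9 bytes assumed). */
double local_store_fill_time_us(double mem_bw_gb_per_s)
{
    return LOCAL_STORE_BYTES / (mem_bw_gb_per_s * 1e9) * 1e6;
}
```

Under these assumptions the eight local stores total 2 MB, each register file holds 2 KB, and filling a single local store at 25.6 GB/s takes on the order of 10 µs, which is one reason explicit, overlapped DMA matters on this architecture.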

  29. Cell chip

  30. SPE

  31. SPE

  32. SPE pipeline

  33. Communication

  34. 8 parallel transactions

  35. C++ on Cell: (1) Send the code of the function to be run on the SPE (2) Send the address from which to fetch the data (3) DMA the data into the LS from main memory (4) Run the code on the SPE (5) DMA the data out of the LS to main memory (6) Signal the PPE that the SPE has finished the function
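The six steps on slide 35 can be mimicked on an ordinary host with a small sequential simulation. This is only a sketch of the control flow: it does not use the Cell SDK, `memcpy` stands in for DMA, and every name in it (`spe_sim`, `spe_run`, `double_in_place`, `LS_WORDS`) is hypothetical. On real hardware the DMA transfers are asynchronous and the SPE runs concurrently with the PPE.

```c
#include <stddef.h>
#include <string.h>

#define LS_WORDS 1024  /* hypothetical data slice of the 256 KB local store */

/* Kernel signature: operates only on data resident in the local store. */
typedef void (*spe_kernel)(int *ls, size_t n);

/* Simulated SPE state. */
typedef struct {
    int        local_store[LS_WORDS];
    spe_kernel code;   /* step 1: function to run        */
    const int *src;    /* step 2: main-memory input      */
    int       *dst;    /*         main-memory output     */
    size_t     n;      /*         number of elements     */
    int        done;   /* step 6: completion flag        */
} spe_sim;

/* Example kernel: doubles every element in place. */
void double_in_place(int *ls, size_t n)
{
    for (size_t i = 0; i < n; i++) ls[i] *= 2;
}

/* Walk through the six steps. Returns 0 on success, -1 if the data
 * does not fit in the simulated local store. */
int spe_run(spe_sim *spe, spe_kernel code, const int *src, int *dst, size_t n)
{
    spe->done = 0;
    if (n > LS_WORDS) return -1;

    spe->code = code;                                    /* 1: send code      */
    spe->src = src; spe->dst = dst; spe->n = n;          /* 2: send addresses */
    memcpy(spe->local_store, src, n * sizeof *src);      /* 3: DMA in         */
    spe->code(spe->local_store, n);                      /* 4: run on SPE     */
    memcpy(dst, spe->local_store, n * sizeof *dst);      /* 5: DMA out        */
    spe->done = 1;                                       /* 6: signal PPE     */
    return 0;
}
```

The size check reflects the main porting constraint: code and data must both fit in the local store, so larger data sets have to be processed in chunks.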

  36. Conclusions • Multi-processors inevitable • Huge performance increase, but… • Hell to program • Got to be an architecture expert • Portability?
