
Farhan Mohamed Ali Jigar Vora Sonali Kapoor Avni Jhunjhunwala

MAD MAC 525. Farhan Mohamed Ali, Jigar Vora, Sonali Kapoor, Avni Jhunjhunwala. Design Manager: Zack Menegakis. 1st May, 2006, Final Presentation. Design a crucial part of a GPU called the Multiply Accumulate Unit (MAC), which is revolutionizing graphics.


Presentation Transcript


  1. MAD MAC 525
     Farhan Mohamed Ali, Jigar Vora, Sonali Kapoor, Avni Jhunjhunwala
     Design Manager: Zack Menegakis
     1st May, 2006
     Final Presentation
     Design a crucial part of a GPU called the Multiply Accumulate Unit (MAC), which is revolutionizing graphics

  2. Agenda
     • Marketing – Jigar
     • Project and Algorithm Description – Farhan
     • Implementation Part I – Farhan
     • Implementation Part II – Sonali
     • Floorplan – Sonali
     • Layout – Avni
     • Verification – Avni
     • Design Specifications – Avni
     • Conclusion – Jigar

  3. Marketing – Jigar

  4. Purpose
     MAD MAC 525 accelerates FP16 blending to enable true HDR graphics. Huh??
     [Navigation bar: Marketing | Description | Implementing | Floorplan | Layout | Verify | Specifications]

  5. Beauty of High Dynamic Range
     • With HDR rendering, pixel intensity can extend beyond the range of traditional graphics
     • Nature doesn't have a limited pixel intensity and neither should Computer Graphics
     • In other words:
        • Bright things can be really bright
        • Dark things can be really dark
        • And the details can be seen in both

  6. Applications of HDR

  7. Target Market
     • Target Market Segment
        • Graphics chip manufacturers
        • High-speed DSP manufacturers
        • CPU co-processors
     • Potential Customers

  8. Design Comparison
     • Top 180nm graphics chip is the NVIDIA NV16
        • Highest speed only 250MHz
        • 9-bit integer precision
     • As games become more advanced, they need fast graphics chips
     • Conclusion: Market Needs a FAST MAD MAC

  9. Description and Implementation I – Farhan

  10. Project Description
      • Multiply Accumulate unit (MAC)
      • Executes the function AB+C on 16-bit floating point inputs
      • Format: 1-bit sign, 5-bit exponent and 10-bit significand
      • Multiply and add in parallel to greatly speed up operation
      • Rounding is performed only once, so accuracy is greater than with individual multiply and add functions
      • Also known as:
         • Fused Multiply Add (FMA)
         • Multiply Add (MAD/MADD) in graphics shader programs
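The format and the single-rounding claim on this slide can be illustrated with a short Python sketch. It is not the team's design: it relies on the standard library's `struct` half-precision code (`"e"`), the helper names `fp16_fields` and `to_fp16` are made up for illustration, and the inputs are chosen so that double precision holds `a*b + c` exactly.

```python
import struct

def fp16_fields(x: float) -> tuple[int, int, int]:
    """Split x (rounded to FP16) into sign, 5-bit exponent, 10-bit significand."""
    (bits,) = struct.unpack("<H", struct.pack("<e", x))
    return bits >> 15, (bits >> 10) & 0x1F, bits & 0x3FF

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest FP16 value (round half to even)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

a = b = 1 + 2**-10      # smallest FP16 value above 1.0
c = -(1 + 2**-9)

# Separate multiply and add: round after the multiply, then again after the add.
separate = to_fp16(to_fp16(a * b) + c)
# Fused: round once at the end (the double result is exact for these inputs).
fused = to_fp16(a * b + c)

print(fp16_fields(a))    # (0, 15, 1)
print(separate, fused)   # 0.0 9.5367431640625e-07
```

With separate rounding the low-order product bit is lost before the add, so the result collapses to zero; rounding once keeps it.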

  11. Algorithm
      • FP Multiply (A*B)
         • Multiply significands
         • Add exponents
         • Normalize
         • Round
      • FP Add (A+B)
         • Align smaller number to larger number
         • Add significands
         • Normalize
         • Round
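The four FP multiply steps can be sketched at the bit level as follows. This is a simplified model under stated assumptions, not the presented hardware: it handles only normal inputs and normal results (no zeros, subnormals, infinities or NaNs), uses round-to-nearest-even, and the names `f2h`, `h2f` and `fp16_mul` are hypothetical.

```python
import struct

def f2h(x: float) -> int:
    """Float -> FP16 bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def h2f(bits: int) -> float:
    """FP16 bit pattern -> float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def fp16_mul(a: int, b: int) -> int:
    sign = (a ^ b) >> 15
    # Add exponents (remove one copy of the bias, 15).
    exp = ((a >> 10) & 0x1F) + ((b >> 10) & 0x1F) - 15
    # Multiply significands, with the hidden leading 1 restored (11 x 11 bits).
    prod = ((a & 0x3FF) | 0x400) * ((b & 0x3FF) | 0x400)
    # Normalize: the product lies in [1, 4), so shift by at most one extra place.
    shift = 10
    if prod >> 21:
        shift, exp = 11, exp + 1
    sig = prod >> shift
    # Round to nearest, ties to even, using the discarded bits.
    rem = prod & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if rem > half or (rem == half and sig & 1):
        sig += 1
        if sig == 0x800:            # rounding carried out of the significand
            sig, exp = 0x400, exp + 1
    assert 0 < exp < 31, "sketch covers normal results only"
    return (sign << 15) | (exp << 10) | (sig & 0x3FF)

print(h2f(fp16_mul(f2h(1.5), f2h(2.5))))  # 3.75
```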

  12. Algorithm
      • FP Multiply-Add (AB+C)
         • Align sig C based on exp A+B-C
         • Multiply significands A and B
         • Add sig A*B result to aligned sig C
         • Normalize
         • Round
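A behavioral sketch of these five steps, using exact Python integers in place of the datapath, might look like the code below. The helper names (`decompose`, `round_to_fp16`, `fma16`) are hypothetical, inputs are assumed to be finite values already representable in FP16, and overflow to infinity is not modeled.

```python
import math

def decompose(x: float) -> tuple[int, int]:
    """Return (m, e) with x == m * 2**e exactly, m a signed integer."""
    n, d = x.as_integer_ratio()          # d is always a power of two
    return n, -(d.bit_length() - 1)

def round_to_fp16(m: int, e: int) -> float:
    """Normalize m * 2**e to an 11-bit significand and round once (ties to even)."""
    if m == 0:
        return 0.0
    mag = abs(m)
    # Exponent of the result's last bit; -24 is the FP16 subnormal floor.
    ulp_exp = max(e + mag.bit_length() - 11, -24)
    shift = ulp_exp - e
    if shift > 0:
        q, rem = mag >> shift, mag & ((1 << shift) - 1)
        half = 1 << (shift - 1)
        if rem > half or (rem == half and q & 1):
            q += 1
    else:                                # already exact, nothing to discard
        q, ulp_exp = mag, e
    value = math.ldexp(q, ulp_exp)
    return -value if m < 0 else value

def fma16(a: float, b: float, c: float) -> float:
    (ma, ea), (mb, eb), (mc, ec) = decompose(a), decompose(b), decompose(c)
    mp, ep = ma * mb, ea + eb            # multiply significands, add exponents
    e = min(ep, ec)                      # align both terms to a common exponent
    total = (mp << (ep - e)) + (mc << (ec - e))   # exact add
    return round_to_fp16(total, e)       # normalize and round a single time

print(fma16(1.5, 2.5, 0.25))  # 4.0
```

Because the sum is kept exact until the final step, the only rounding error is the one introduced by `round_to_fp16`.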

  13. Block Diagram
      [Figure labels: A, B, C, Multiplier, Exp Calc, Align, Adder, Leading 0 Anticipator, Normalize, Y, Round, Output, Ovf Checker]

  14. Implementation
      • Design target: 300MHz
         • Speed is the design goal
         • Ambitious target?
      • How we planned to achieve this
         • Fast logic: parallelize ops as much as possible
         • Pipelining

  15. Implementation
      • Adder
         • Carry Select vs Carry Lookahead tree

  16. Implementation
      • Adder
         • Han-Carlson based carry lookahead adder
         • 6 lookahead logic stages for 32-bit adder
         • Less logic than a Kogge-Stone adder
         • Less wiring than a Brent-Kung adder
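The stage count quoted on this slide can be checked with a small software model of a Han-Carlson prefix network. This follows the textbook structure (one pre-stage on odd bits, a Kogge-Stone tree over the odd bits, one post-stage for even bits) and is only an assumption about how the team's adder is organized; the function name is hypothetical and carry-in is fixed at 0.

```python
def han_carlson_add(a: int, b: int, n: int = 32) -> tuple[int, int]:
    """Return (sum mod 2**n, number of prefix stages used)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]     # bit generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]   # bit propagate
    G, P = g[:], p[:]
    stages = 0

    # Stage 1: each odd bit absorbs the even bit directly below it.
    for i in range(1, n, 2):
        G[i], P[i] = G[i] | (P[i] & G[i - 1]), P[i] & P[i - 1]
    stages += 1

    # Middle stages: Kogge-Stone prefix over the odd bits only (spans 2, 4, ...).
    d = 2
    while d < n:
        nG, nP = G[:], P[:]
        for i in range(1 + d, n, 2):
            nG[i] = G[i] | (P[i] & G[i - d])
            nP[i] = P[i] & P[i - d]
        G, P = nG, nP
        d *= 2
        stages += 1

    # Final stage: even bits pick up the group carry from the odd bit below.
    for i in range(2, n, 2):
        G[i] = g[i] | (p[i] & G[i - 1])
    stages += 1

    # Sum bit i = propagate XOR carry into bit i (carry into bit 0 is 0).
    total = 0
    for i in range(n):
        carry_in = G[i - 1] if i else 0
        total |= (p[i] ^ carry_in) << i
    return total, stages

print(han_carlson_add(0xFFFF_FFFF, 1))  # (0, 6)
```

For n = 32 the model uses 1 + 4 + 1 = 6 prefix stages, which agrees with the figure on the slide.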

  17. Implementation
      • Multiplier
         • Carry-Save Multiplier
         • Avoids having ripple carry in every stage
         • Enables regular and compact layout
         • Easy to pipeline
         • Final 10-bit add stage using carry lookahead adder
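The carry-save idea can be sketched in a few lines of Python. This is a word-level illustration, not the array layout from the slides: each partial product is folded into a (sum, carry) pair with a bitwise 3:2 compressor, and carries are only propagated in one final addition. The function names are hypothetical, and the sketch uses a full-width final add rather than the 10-bit one mentioned above.

```python
def csa(x: int, y: int, z: int) -> tuple[int, int]:
    """Bitwise 3:2 compressor: x + y + z == s + c, with no carry ripple."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c

def carry_save_multiply(a: int, b: int, width: int = 11) -> int:
    """Multiply two unsigned width-bit significands (11 bits for FP16)."""
    s = c = 0
    for i in range(width):
        partial = (a << i) if (b >> i) & 1 else 0
        s, c = csa(s, c, partial)     # carries are saved, not propagated
    return s + c                      # the single carry-propagate add

print(carry_save_multiply(1536, 1280))  # 1966080
```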

  18. Implementation
      • Leading Zero Anticipator
         • Predicts the number of shifts to do in normalize
         • Normalize begins with zero delay
         • Operates in parallel with the adder, so normalize shifts can be predicted with an accuracy of 1 shift to the left or right

  19. Implementation
      • Latches
         • Pulse Latches
         • Practically eliminates setup time
         • 16 transistors per pulse generator
         • Simplified version of those used in a certain high speed CPU
      [Figure: Clock pulse generator]

  20. Implementation II and Floorplan – Sonali

  21. Design Decision: Pass Logic
      • Extensive use of Pass Logic
         • Reduces transistor count
         • Reduces area
      • Transistor count reduced from 20,200 to 12,800
      • Example
         • Normalize: 3400 -> 942
         • Align: 1500 -> 530
      • Ensure all pass logic is buffered

  22. Design Decision: Pipelining
      • Initially planned 6 pipeline stages
      • Reduced to 4 pipeline stages
         • Adder – Fast Carry Lookahead architecture
         • Multiplier – Ripple Carry to Carry Lookahead

  23. Pipeline Stages
      [Figure labels: Reg A, Reg B, Reg C, Multiplier, Exp Calc, Align C, Adder, Ld Zero, Normalize, Round, Output]

  24. Schematics
      • Multiplier
      [Figure labels: Inputs, Pipeline, Outputs, Pipeline, Outputs]

  25. Schematic
      • Adder
      [Figure labels: Inputs, six Look Ahead Logic stages, Sum Logic, Outputs]

  26. Floorplan Evolution: Initial Floorplan
      [Figure labels: Multiplier, Reg A, Reg C, Exp Calc, Reg B, Align C, Pipeline Reg, Pipeline Reg, Adder, Ld Zero, Pipeline Reg, Round, Normalize, Overflow checker, Reg Y]

  27. Floorplan Evolution: Final Floorplan
      [Figure labels: Reg C, Reg A, Exponents, Multiplier, Reg B, Ld zero, Align, Adder, Ovf, Normalize, Output, Round]

  28. Layout, Verification & Specification – Avni

  29. Layout Decisions
      • 3 cell heights – 6.03, 5.04 and 3.55
      • Uniform width vdd and ground rails
      • Wider vdd and ground rails in power hungry modules
      • Max of 8 latches per clock pulse generator
      • Uniform metal directionality within each block

  30. Final Layout

  31. Final Layout
      [Figure label: Multiplier]

  32. Multiplier
      • Height: 191.6
      • Width: 206.38
      • Area: 20,388
      [Figure labels: In, In, Bit slice, Pipeline reg, Output, Output]

  33. Final Layout
      [Figure labels: Multiplier, Adder]

  34. Adder
      • Height: 122.9
      • Width: 110.2
      • Area: 13,202
      [Figure labels: Adder, Incrementer]

  35. Final Layout
      [Figure labels: Input, Exponents, Input, Multiplier, Ld zero, Align, Adder, Ovf, Out, Normalize, Round]

  36. Layer Masks: Active: 14.04%

  37. Layer Masks: Poly: 9.25%

  38. Layer Masks: Metal 1: 34.08%

  39. Layer Masks: Metal 2: 18.00%

  40. Layer Masks: Metal 3: 14.99%

  41. Layer Masks: Metal 4: 6.23%

  42. Verification Of Design
      • Behavioral and Structural Verilog
         • Extensive Testing – Unable to find C or Matlab Code
      • Schematic and Layout testing
         • Analog Simulations – Compare Output with Behavioral
      • Full Chip Verification

  43. Design Specifications
      • Critical path delay = 2.25ns
      • Clock speed = 400MHz
      • Pipeline stages = 4
      • Height by width = 195.26 um * 303.255 um
      • Area = 59,214 um^2
      • Aspect ratio = 1:1.55
      • Transistor density = 0.22
      • Total Pin Count = 67
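The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch assumes that the final transistor count is the 12,800 quoted on the pass-logic slide and that the density is given in transistors per um^2; neither is stated explicitly in the deck.

```python
critical_path_ns = 2.25
height_um, width_um = 195.26, 303.255
transistors = 12_800          # assumption: final count from the pass-logic slide

f_max_mhz = 1e3 / critical_path_ns      # 1 / 2.25 ns, expressed in MHz
area_um2 = height_um * width_um
aspect = width_um / height_um
density = transistors / area_um2        # assumed unit: transistors per um^2

print(f"max clock  = {f_max_mhz:.1f} MHz")
print(f"area       = {area_um2:,.0f} um^2")
print(f"aspect     = 1:{aspect:.2f}")
print(f"density    = {density:.2f}")
```

Under these assumptions the 2.25ns critical path allows roughly 444MHz, so the quoted 400MHz clock leaves some margin, and the area, aspect ratio and density agree with the listed values.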


  46. Conclusion – Jigar

  47. Everyone Needs a MAD MAC
      • Graphics – HDR Rendering, Blending and Shader ops
         • Fastest 180nm GPU: 250 MHz (9-bit Int)
         • MAD MAC 525: 400 MHz (16-bit FP)

  48. Everyone Needs a MAD MAC
      • DSPs – Computing Vector Dot-Products in Digital Filters
