
FPGA Defect Tolerance: Impact of Granularity

FPGA Defect Tolerance: Impact of Granularity. Anthony Yu, Guy Lemieux. December 14, 2005. Outline: introduction and motivation; previous works; new architectures, namely coarse-grain redundancy (CGR) and fine-grain redundancy (FGR); experimentation results; conclusions.


Presentation Transcript


  1. FPGA Defect Tolerance: Impact of Granularity • Anthony Yu, Guy Lemieux • December 14, 2005

  2. Outline • Introduction and motivation • Previous works • New architectures • Coarse-grain redundancy (CGR) • Fine-grain redundancy (FGR) • Experimentation Results • Conclusions Field-Programmable Technology (FPT) '05

  3. Introduction and Motivation • Scaling introduces new types of defects • Smaller feature sizes are susceptible to smaller defects • Expected results • Defects per chip increase • Chip yield declines • FPGAs are mostly interconnect • FPGAs must tolerate multiple interconnect defects to improve yield (and $$$)
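The link between defects per chip and yield can be illustrated with the classical Poisson yield model. The slides do not say which yield model the authors used, so this is only a sketch under that assumption; the function name, die area, and defect densities are hypothetical.

```python
import math


def poisson_yield(die_area_cm2: float, defect_density_per_cm2: float) -> float:
    """Expected fraction of defect-free dies under a Poisson defect model.

    Assumption: defects are independent and uniformly distributed, so the
    number of defects on a die is Poisson with mean area * density.
    """
    expected_defects = die_area_cm2 * defect_density_per_cm2
    return math.exp(-expected_defects)


if __name__ == "__main__":
    # Hypothetical 2 cm^2 die at increasing defect densities.
    for density in (0.1, 0.5, 1.0):
        print(f"D={density}/cm^2 -> yield {poisson_yield(2.0, density):.3f}")
```

Under this model, yield without any redundancy falls exponentially as the expected number of defects per chip grows, which is the trend the slide describes.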

  4. General Defect-Tolerant Techniques • Defect-tolerant techniques minimize the impact (cost) of manufacturing defects • FPGA defect tolerance can be loosely categorized into three classes: • Software Redundancy – use CAD tools to map around the defects • Hardware Redundancy – incorporate spare resources to assist in defect correction (e.g. spare row/column) • Run-time Redundancy – protection against transient faults such as SEUs (e.g. TMR)
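As a small illustration of the run-time class, triple modular redundancy (TMR) runs three copies of a module and takes a majority vote over their outputs. The sketch below shows only the voter, on integer bit vectors; the function name is hypothetical and nothing here is taken from the talk itself.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant module outputs.

    Each output bit is 1 when at least two of the three inputs have a 1
    in that position, so a single upset copy is outvoted.
    """
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    good = 0b1010
    upset = 0b0110  # one copy disturbed by a transient fault
    print(bin(tmr_vote(good, good, upset)))
```

The voter masks a fault in any one copy, but not simultaneous faults in two copies at the same bit position.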

  5. Previous work – 1 – Xilinx • Xilinx’s Defect-Tolerant Approach • Customer (knowingly) purchases “less than perfect” parts • Customer gives Xilinx the configuration bitstream • Xilinx tests FPGA devices against the bitstream • Sells FPGA parts that “appear” perfect • Defects lie in resources the bitstream does not use • Limitation: • Chips work only with the given bitstream – no changes!
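The bitstream-specific screening described above amounts to checking that no resource used by the customer's design is defective. A minimal sketch, assuming resources can be named by simple string identifiers (the names and the function are hypothetical, not a vendor API):

```python
def part_appears_perfect(used_resources: set, defective_resources: set) -> bool:
    """True when the design touches none of the device's defective resources."""
    return used_resources.isdisjoint(defective_resources)


if __name__ == "__main__":
    # Hypothetical resource identifiers for one design and one tested device.
    design = {"clb_3_4", "wire_h_3_7", "wire_v_5_2"}
    device_defects = {"wire_h_9_1"}
    print(part_appears_perfect(design, device_defects))
```

Because the check depends on the exact set of used resources, any change to the design can invalidate it, which is the limitation the slide notes.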

  6. Previous work – 2 – Altera • Altera’s Defect-Tolerant Approach • Customer purchases “seemingly perfect” parts • Make defective resources inaccessible to user • Coarse-grain architecture • Spare row and column in array (like memories) • Defective row/column must be bypassed • Use the spare row/column instead • Limitation: • Does not scale well (multiple defects)

  7. Objective • Problem • FPGA yield is declining because of aggressive technology scaling • Proposed Solutions • Defect tolerance through redundancy • Important Objectives • Interconnect defects are important (interconnect dominates area) • Tolerate multiple defects (future trend) • Preserve timing (no timing re-verification) • Fast correction time (production use) • Understand the factors that influence yield

  8. Background

  9. Island-style FPGA

  10. Directional Switch Block

  11. Directional Switch Block

  12. Coarse-grain Redundancy (CGR)

  13. Coarse-grain Redundancy (CGR)

  14. So…what’s wrong with it?

  15. Improving yield for CGR – Adding Multiple Global Spares • Add multiple global spares to traditional CGR • Global spares can be used to repair any defective row/column in the array • Wire extensions are now longer
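To see why extra global spares help, consider a deliberately simplified model: only rows are spared, each defect lands in a uniformly random row, and a chip is repairable when the number of distinct defective rows does not exceed the number of spare rows. These are assumptions made for illustration, not the authors' yield model, and the function name is hypothetical.

```python
def repair_probability(num_rows: int, num_defects: int, num_spares: int) -> float:
    """Probability that num_defects random defects hit at most num_spares distinct rows."""
    # dist[j] = probability that exactly j distinct rows are defective so far.
    dist = [1.0] + [0.0] * num_defects
    for _ in range(num_defects):
        nxt = [0.0] * (num_defects + 1)
        for j, p in enumerate(dist):
            if p == 0.0:
                continue
            # The new defect lands in a row that is already defective.
            nxt[j] += p * j / num_rows
            # The new defect lands in a previously clean row.
            if j + 1 <= num_defects:
                nxt[j + 1] += p * (num_rows - j) / num_rows
        dist = nxt
    return sum(dist[: num_spares + 1])


if __name__ == "__main__":
    # Hypothetical 32-row array with 4 defects and a growing number of spares.
    for spares in range(5):
        print(spares, round(repair_probability(32, 4, spares), 4))
```

In this toy model the repair probability only reaches 1 once the spare count equals the defect count, which hints at why a single spare copes poorly with multiple defects.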

  16. Yield Impact of Multiple Global Spares

  17. Increasing Area+Delay Overhead • More spares means more mux overhead in every switch element • [Figure: switch elements with no spares, 1 global spare, 2 global spares, and 4 global spares] • May be impractical!

  18. Improving yield for CGR – Adding Multiple Local Spares • Divide FPGA into subdivisions • Each subdivision has local spare(s) • Distributes spares across chip • Reduces mux area overhead (of the Global scheme) • Limitation: • Spare(s) can only repair defects within the subdivision

  19. Yield Impact of Multiple Local Spares (not as good as Global with the same number of spares)
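The gap between local and global sparing can be reproduced with a small Monte Carlo experiment. The sketch below reuses a simplified rows-only model (each defect hits a uniformly random row; a spare row replaces one defective row) and gives both schemes the same total number of spares. The model, parameter values, and function name are assumptions for illustration and are not taken from the study.

```python
import random


def compare_sparing(num_rows, num_groups, spares_per_group, num_defects,
                    trials=2000, seed=0):
    """Return (global_yield, local_yield) estimated over random defect placements."""
    if num_rows % num_groups != 0:
        raise ValueError("num_rows must be divisible by num_groups")
    rng = random.Random(seed)  # seeded so runs are repeatable
    rows_per_group = num_rows // num_groups
    total_spares = num_groups * spares_per_group
    global_ok = 0
    local_ok = 0
    for _ in range(trials):
        bad_rows = {rng.randrange(num_rows) for _ in range(num_defects)}
        # Global: any spare can replace any defective row.
        if len(bad_rows) <= total_spares:
            global_ok += 1
        # Local: a spare can only replace a row in its own subdivision.
        bad_per_group = [0] * num_groups
        for row in bad_rows:
            bad_per_group[row // rows_per_group] += 1
        if all(count <= spares_per_group for count in bad_per_group):
            local_ok += 1
    return global_ok / trials, local_ok / trials


if __name__ == "__main__":
    g, l = compare_sparing(num_rows=32, num_groups=4, spares_per_group=1, num_defects=3)
    print(f"global: {g:.3f}  local: {l:.3f}")
```

Every placement that local sparing can repair is also repairable with global sparing, so the local estimate can never exceed the global one in this model; the price of global sparing is the larger mux overhead noted earlier.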

  20. Fine-grain Redundancy (FGR)

  21. Fine-grain Redundancy (FGR) – Defect Avoidance by Shifting
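The transcript preserves only the title of this slide, so the following is a guess at the general idea rather than the authors' circuit: assume each routing channel carries one spare track, and that signals at or beyond a defective track are shifted over by one so the defective track goes unused. The function name and track numbering are hypothetical.

```python
from typing import List, Optional


def shift_around_defect(num_signals: int, defective_track: Optional[int]) -> List[int]:
    """Map each signal index to a physical track, skipping one defective track.

    Assumes num_signals + 1 physical tracks (one spare) and at most one
    defective track in this channel segment.
    """
    if defective_track is None:
        return list(range(num_signals))
    # Signals below the defect keep their track; the rest move up by one.
    return [s if s < defective_track else s + 1 for s in range(num_signals)]


if __name__ == "__main__":
    print(shift_around_defect(4, 2))
```

Because the remapping is a local one-track shift, it would not require rerunning place and route, which is consistent with the stated goals of fast correction and preserved timing.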

  22. Defect-tolerant Switch Block

  23. Switch Implementation Options • Several detailed implementations are possible • Trade off area / delay / yield (repairability)

  24. Minimum Fault-free Radius (MFFR)

  25. Experimentation Results • Switch implementation • Array size • Wire length • Area • Summary

  26. Switch Implementation * Assumes all bridging defects

  27. Fixed Array Size (32x32) – Global Sparing

  28. Fixed Array Size (32x32) – Local Sparing

  29. Increasing Array Size

  30. Yield for Varying Wire Length

  31. Estimated Area overhead at equal yield (80%) * CGR-G1 can only tolerate 1-2 defects

  32. Limitations of Study & Architectures • Logic and power/ground shorts were not considered • Assumed that all defects are randomly distributed • Assumed that all defects can be corrected with a single row/column • Switch area was not accounted for in our yield model • Area results for CGR are approximated

  33. Conclusions • CGR is effective for 1 or 2 defects • FGR meets desired objectives: • Tolerates multiple randomly distributed defects • Defect correction does not perturb timing • Tolerates an increasing number of defects as array size increases • Correction can be applied quickly

  34. Thank you! anthonyy@ece.ubc.ca

  35. Summary • As the density of FPGAs increases, they become increasingly susceptible to manufacturing defects • Fault-redundant techniques alleviate this growing problem • Depending on the desired level of protection, we can apply different techniques • At low defect rates, the spare row and column approach has lower overhead than the fine-grain approach • At large array sizes, the spare row and column approach requires more area overhead to tolerate the same number of defects as the fine-grain approach
