Kaijie Wu and Ramesh Karri CAD Lab Department of Electrical Engineering Polytechnic University

Algorithm Level RE-computing with Shifted Operands - A Register Transfer Level Concurrent Error Detection Technique
Kaijie Wu and Ramesh Karri, CAD Lab, Department of Electrical Engineering, Polytechnic University (kwu03@utopia.poly.edu, ramesh@india.poly.edu)

Outline
• Review time redundancy based CED techniques
• Describe ARESO
  • operation
  • checking ratio
  • benefits / drawbacks
• Integrate pipelining with ARESO
• Summary of ARESO overhead
• Examples and Experimental Results
• Conclusion

RE-computing with Shifted Operands (RESO)
1. Perform the basic computation on operands X and Y to obtain result 1.
2. Repeat the computation with 1-bit shifted operands to obtain result 2.
3. Compare result 1 with result 2; a mismatch raises Error.
[Figure: 4-bit operands X3..X0 and Y3..Y0 passing through adder bit-slices +0 to +4, first unshifted and then shifted by one bit, with a comparator (C) driving the Error output]
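The three RESO steps can be sketched in a few lines of Python. This is a minimal illustration under a simplified functional fault model (one sum output bit stuck at 0), not the gate-level bit-slice model behind the coverage claims in this deck; the names `faulty_add` and `reso_add` are hypothetical.

```python
def faulty_add(x, y, stuck_bit=None):
    """Adder model: output bit `stuck_bit` is stuck at 0 (None = fault-free)."""
    s = x + y
    if stuck_bit is not None:
        s &= ~(1 << stuck_bit)  # force the faulty output bit to 0
    return s


def reso_add(x, y, k=1, stuck_bit=None):
    """Return (result, error_flag) using re-computation with k-bit shifted operands."""
    r1 = faulty_add(x, y, stuck_bit)            # step 1: basic computation
    r2 = faulty_add(x << k, y << k, stuck_bit)  # step 2: same adder, shifted operands
    error = r1 != (r2 >> k)                     # step 3: shift back and compare
    return r1, error


print(reso_add(5, 3))               # fault-free adder
print(reso_add(5, 3, stuck_bit=3))  # output bit 3 stuck at 0
```

Because the shifted operands exercise different bit positions of the same adder, a fault that corrupts result 1 corrupts result 2 at a different position, so the two results disagree after shifting back.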

Fault detection capability of RESO
• With a k-bit shift, RESO can detect errors in
  • all bit-wise logical operations when failures are confined to k adjacent bit-slices.
  • arithmetic operations in a ripple-carry adder and a carry-lookahead adder when failures are confined to k-1 adjacent bit-slices, k>1.
  • arithmetic operations in a group carry-lookahead adder when failures are confined to a group. Each group i consists of a (k-1)-bit adder and circuits for group-carry generate Gi, group-carry propagate Pi, and group carry-in Ci.
• Up to k errors in a bit-slice of an array multiplier can be detected by shifting at most log2(2k+1) bits in one of the operands.

Comparison
• No CED
  • (a) Example CDFG
• Logic Level CED
  • (b) Duplication
  • (c) RESO, RERO, REDWC, etc.
• Algorithm Level CED
  • (d) Algorithm level time redundancy
[Figure: four CDFG panels (a) to (d) built from addition (+), multiplication (*), and comparator (C) nodes]

Algorithm Level Re-Computing with Shifted Operands (ARESO)
• Does not use fault-tolerant logic operators
• Performs checking operations at the Register Transfer Level
• Supports hardware overhead vs. performance penalty vs. error detection latency trade-offs

RTL Data Path Operation of ARESO
[Figure: RTL data path with an indicator signal, two inputs, three shift registers, and a comparator (C) that produces the Output and Error signals]

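ARESO - Checking Ratio (R)
• L = # of clock cycles per iteration
• After R input samples, input R is re-computed with shifted operands (Sh Input R)
• R = 1 → check all results
[Figure: timeline of Input 1, Input 2, ..., Input R followed by Sh Input R, each taking L clock cycles]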
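The effect of the checking ratio can be sketched as follows. This is an illustrative model only: it assumes the algorithm is linear in its inputs, so that shifting every input left by k bits shifts the fault-free output by k bits, and it uses a hypothetical three-tap filter `fir` with a permanent stuck-at-0 fault on one accumulator bit. `COEFFS`, `fir`, and `areso_stream` are names invented for this sketch.

```python
COEFFS = [3, 1, 2]  # hypothetical filter taps


def fir(window, stuck_bit=None):
    """Dot product of taps and samples; optional stuck-at-0 fault on one output bit."""
    acc = sum(c * v for c, v in zip(COEFFS, window))
    if stuck_bit is not None:
        acc &= ~(1 << stuck_bit)
    return acc


def areso_stream(algorithm, inputs, R, k=1):
    """Run `algorithm` on every input and re-compute every R-th input with
    k-bit shifted operands. Returns (outputs, indices of inputs that failed the check)."""
    outputs, errors = [], []
    for n, window in enumerate(inputs, start=1):
        y = algorithm(window)
        outputs.append(y)
        if n % R == 0:  # one check per R input samples
            y_shifted = algorithm([v << k for v in window])
            if (y_shifted >> k) != y:
                errors.append(n)
    return outputs, errors


windows = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 1, 1]]
_, errors = areso_stream(lambda w: fir(w, stuck_bit=0), windows, R=2)
print(errors)  # the check on input 2 fails; the check on input 4 passes
```

With R = 2 only half of the results are checked, which halves the re-computation work; the permanent fault is still caught, but only when a checked input happens to expose it.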

ARESO features
• Good points
  • The number of comparisons is reduced
  • Increasing the checking ratio reduces the time overhead
  • Compared to straightforward duplication, the area overhead is reduced
• Bad points
  • Large detection latency, (R+1) × L

Integrating ARESO with Pipelining
• Reduces error detection latency
• If L = 18, R = 2
  • Detection latency = 54 cycles for basic ARESO: (R+1) × L
  • Detection latency = 30 cycles for pipelined ARESO with initiation interval I_ARESO = 6: (R × I_ARESO) + L
[Figure: timelines of Input 1, Input 2, and Shifted Input 2 for basic ARESO (time marks 0, 18, 36, 54) and pipelined ARESO (time marks 0, 6, 12, 18, 30), each with the detection latency marked]
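The two latency expressions on this slide can be checked with a short calculation. The function names are hypothetical; the formulas and the values L = 18, R = 2, and initiation interval 6 are the ones quoted above.

```python
def basic_areso_latency(L, R):
    """Detection latency in clock cycles for basic ARESO: (R + 1) * L."""
    return (R + 1) * L


def pipelined_areso_latency(L, R, I_areso):
    """Detection latency in clock cycles for pipelined ARESO: R * I_ARESO + L."""
    return R * I_areso + L


print(basic_areso_latency(L=18, R=2))                 # 54
print(pipelined_areso_latency(L=18, R=2, I_areso=6))  # 30
```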

ARESO - Throughput
• Throughput: # of results that come from non-shifted inputs per clock cycle
• To maintain this throughput, the pipelined ARESO design needs a shorter initiation interval I_ARESO than the initiation interval I of the pipelined design without ARESO
[Figure: input timelines for the pipeline design w/o ARESO (time marks 0, 12) and the pipeline design with ARESO, R = 1 (Input 1, Shifted Input 1, Input 2, Shifted Input 2; time marks 0, 6, 12, 18, 30)]
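The initiation intervals quoted for the FIR designs in this deck (I = 12 with I_ARESO = 6, 8, and 9 for R = 1, 2, and 3) are consistent with the relation I_ARESO = I × R / (R + 1). That relation is an inference from those numbers, not a formula recovered from the slide; the sketch below only checks that it preserves the throughput of non-shifted results. Function names are hypothetical.

```python
from fractions import Fraction


def areso_initiation_interval(I, R):
    """Assumed relation: R + 1 initiations (R inputs plus one shifted
    re-computation) must fit into the time of R original initiations."""
    return Fraction(I * R, R + 1)


def non_shifted_throughput(I_areso, R):
    """Results from non-shifted inputs per clock cycle: R out of every R + 1 initiations."""
    return Fraction(R, (R + 1) * I_areso)


for R in (1, 2, 3):
    I_areso = areso_initiation_interval(12, R)
    print(R, I_areso, non_shifted_throughput(I_areso, R))
```

For each R the non-shifted throughput stays at 1/12 result per cycle, the same as a pipeline without ARESO and I = 12.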

ARESO Design Tradeoffs

Error detection capability of ARESO
• All RESO-detectable permanent faults
• The transient fault detection capability varies with R (the checking ratio) and D (the # of data outputs that will be affected)
  • when 1 ≤ R ≤ D, 100% of RESO-detectable faults
  • when D < R, 100 × (D / R) % of RESO-detectable faults
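The transient-fault rule above reduces to a small piecewise function. The sketch below simply encodes the two cases from the slide; the function name `transient_coverage` is hypothetical.

```python
def transient_coverage(R, D):
    """Percentage of RESO-detectable transient faults caught by ARESO.

    R: checking ratio, D: number of data outputs affected by the fault.
    """
    if R < 1 or D < 1:
        raise ValueError("R and D must be at least 1")
    if R <= D:
        return 100.0          # at least one affected output is always checked
    return 100.0 * D / R      # only D out of every R outputs are affected


print(transient_coverage(R=2, D=3))  # 100.0
print(transient_coverage(R=4, D=2))  # 50.0
```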

FIR Filter Example - overhead (50 ns clock)

| | FIR (I=12, L=23) | ARESO-1 FIR (I_ARESO=6, R=1, L=24) | ARESO-2 FIR (I_ARESO=8, R=2, L=24) |
|---|---|---|---|
| Multipliers | (8×8→14), (9×8→17) | (10×10→16), (10×10→17), (10×10→19) | (10×10→16), (10×10→19) |
| Adders | 2 (19×19→19) | 3 (21×21→21) | 2 (21×21→21) |
| Register bits | 419 | 963 | 750 |
| Combinational area (unit cells) | 4051 | 6960 (71.8%) | 5483 (35.3%) |
| Sequential area (unit cells) | 4983 | 11506 (130.9%) | 8635 (73.3%) |
| Total area (unit cells) | 9034 | 18466 (104.4%) | 14118 (56.3%) |
| Detection latency (ns) | - | (6+24)×50 = 1500 | (2×8+24)×50 = 2000 |

Percentages are area overheads relative to the FIR design without ARESO.
ARESO-2 vs. ARESO-1: 30.8% reduction in area at the expense of a 33.3% increase in error detection latency.
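The percentages in this table can be reproduced from the unit-cell counts. The helper below is a hypothetical convenience function, and the area and latency figures are copied from the table above.

```python
def overhead_percent(base, design):
    """Relative increase of `design` over `base`, in percent, rounded to one decimal."""
    return round(100.0 * (design - base) / base, 1)


fir_total, areso1_total, areso2_total = 9034, 18466, 14118

print(overhead_percent(fir_total, areso1_total))     # ARESO-1 total area overhead
print(overhead_percent(fir_total, areso2_total))     # ARESO-2 total area overhead
print(overhead_percent(areso2_total, areso1_total))  # ARESO-1 area relative to ARESO-2
print(overhead_percent(1500, 2000))                  # detection latency increase
```

The first two values reproduce the 104.4% and 56.3% total-area overheads. The quoted 30.8% area figure matches the third value, that is, the ARESO-1 area measured relative to ARESO-2.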

FIR Filter Example - Schedule
• 17 multiplications, 16 additions
• ARESO with
  • checking ratio = 2
  • I_ARESO = 8 clock cycles
  • L = 24 clock cycles
  • 50 ns clock cycle
• ARESO constraints were incorporated into Synopsys BC synthesis scripts
• Two 21×21→21 adders
• One 10×10→16 and one 10×10→19 multiplier
• Detection latency of 2000 ns
[Figure: schedule of multiplications *1 to *17 and additions +1 to +16, together with the "read inputs" and "test checking ratio counter" steps]

FIR using multi-cycle operations (30 ns clock)

| | FIR (I=12, L=36) | ARESO-1 FIR (I_ARESO=6, R=1, L=36) | ARESO-3 FIR (I_ARESO=9, R=3, L=36) |
|---|---|---|---|
| Combinational area (unit cells) | 5318 | 8666 (63.0%) | 6868 (29.1%) |
| Sequential area (unit cells) | 7898 | 14410 (82.5%) | 10637 (34.7%) |
| Total area (unit cells) | 13216 | 23076 (74.6%) | 17505 (32.5%) |
| Detection latency (ns) | - | (6+36)×30 = 1260 | (3×9+36)×30 = 1890 |

ARESO-3 vs. ARESO-1: 31.8% reduction in area at the expense of a 50% increase in error detection latency.

FIR using chained operations (100 ns clock)

| | FIR (I=12, L=24) | ARESO-1 FIR (I_ARESO=6, R=1, L=24) | ARESO-2 FIR (I_ARESO=8, R=2, L=24) |
|---|---|---|---|
| Combinational area (unit cells) | 4186 | 6912 (65.1%) | 5491 (31.2%) |
| Sequential area (unit cells) | 4983 | 11044 (121.6%) | 8910 (78.8%) |
| Total area (unit cells) | 9169 | 17956 (95.8%) | 14401 (57.1%) |
| Detection latency (ns) | - | (6+24)×100 = 3000 | (2×8+24)×100 = 4000 |

ARESO-2 vs. ARESO-1: 24.7% reduction in area at the expense of a 33.3% increase in error detection latency.

Conclusions
• Compared to straightforward duplication, the area overhead of ARESO-R designs is in the range 30%-100%.
• The detection latency of ARESO-R increases with the checking ratio R.
• For a given throughput, the area overhead decreases as the checking ratio R increases.
• ARESO constraints were incorporated into Synopsys BC.
