CSE246 Adder – Part II
Learn about Zimmerman's Heuristic Approach for generating a parallel prefix adder of minimum size with depth constraint. Explore the advantages, disadvantages, and dynamic programming involved in constructing the fastest prefix adder under arbitrary input arrival time profiles.
CSE246 Adder – Part II • Instructor: Prof. Chung-Kuan Cheng
Zimmerman’s Heuristic Approach • Problem formulation • Given a depth constraint, generate a parallel prefix adder of minimum size • Two-step heuristic, starting from a serial prefix adder • Step 1: Compress to the fastest prefix structure, at the cost of increased size • Proceed from LSB to MSB, low level to high level • Step 2: Expand to reduce size, subject to the depth constraint • Proceed from MSB to LSB, high level to low level
Zimmerman’s Heuristic Approach • Local compression/expansion operation • Up/down shift
Zimmerman’s Heuristic Approach • Advantages • Simple and fast • Product depth-size optimal result in many cases • Handles non-uniform input arrival times • Disadvantage • No guarantee on optimality
Prefix Adder with arbitrary input arrival time profile • Non-uniform input arrival times are represented as real numbers • How do we construct the fastest prefix adder under an arbitrary input arrival time profile?
Prefix Adder with arbitrary input arrival time profile (cont'd) • Timing model • All (G,P) generators have the same delay C • Denote the output timing of generator (G,P)[i:j] as t[i:j] • Suppose that in the prefix graph (G,P)[i:j] is generated from (G,P)[i:k] and (G,P)[k-1:j]; then t[i:j] = max{t[i:k], t[k-1:j]} + C
Dynamic Programming – The idea • Imagine a full array of partial prefix results • All (G,P) signals of length i are on level i • The rightmost signals are the wanted prefix results • Generate all the (G,P) signals row by row, from lower level to higher level • For each (G,P) signal, find the scheme that leads to the best timing, i.e., find the partition point k such that t[i:j] = min over k of {max{t[i:k], t[k-1:j]} + C}, where (G,P)[i:j] = (G,P)[i:k] ∘ (G,P)[k-1:j]
[Figure: triangular array of timings. Level 1: t[n:n], t[n-1:n-1], …, t[2:2], t[1:1]; Level 2: t[n:n-1], …, t[2:1]; Level 3: t[n:n-2], …, t[3:1]; …; Level n-1: t[n:2], t[n-1:1]; Level n: t[n:1]]
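The row-by-row procedure above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the function name `fastest_prefix_timing`, the 0-based bit numbering (bit 0 is the LSB), and the tie-breaking rule (keep the first best k found) are assumptions made for the example.

```python
from typing import List, Tuple


def fastest_prefix_timing(
    arrival: List[float], C: float
) -> Tuple[List[List[float]], List[List[int]]]:
    """Return (t, split) for the timing model t[i:j] = max(t[i:k], t[k-1:j]) + C.

    arrival[b] is the arrival time of the bit-b (g, p) signal (assumed 0-based).
    t[i][j] (i >= j) is the earliest time at which (G,P)[i:j] can be ready.
    split[i][j] is one partition point k that achieves t[i][j] (-1 for leaves).
    """
    n = len(arrival)
    t = [[0.0] * n for _ in range(n)]
    split = [[-1] * n for _ in range(n)]

    # Level 1: signals of length 1 are the inputs themselves.
    for b in range(n):
        t[b][b] = arrival[b]

    # Higher levels: signals of length 2, 3, ..., n, built from shorter ones.
    for length in range(2, n + 1):
        for j in range(0, n - length + 1):
            i = j + length - 1
            best, best_k = None, -1
            # Try every partition (G,P)[i:k] combined with (G,P)[k-1:j].
            for k in range(j + 1, i + 1):
                cand = max(t[i][k], t[k - 1][j]) + C
                if best is None or cand < best:
                    best, best_k = cand, k
            t[i][j] = best
            split[i][j] = best_k
    return t, split
```

For instance, calling `fastest_prefix_timing([0, 1, 3, 4, 2], 2)` fills the whole triangular array; `t[i][0]` then holds the timing of each wanted prefix result, and following `split` from `(n-1, 0)` downward recovers one fastest prefix structure.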
Dynamic Programming • A 5-bit example (entries are signal timings, listed from MSB to LSB)
Level 1: 2 (g4p4), 4 (g3p3), 3 (g2p2), 1 (g1p1), 0 (G0)
Level 2: 6, 6, 5, 3 (GP[1,0])
Level 3: 8, 7, 5 (GP[2,0])
Level 4: 8, 7 (GP[3,0])
Level 5: 8 (GP[4,0])
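As a worked check of the last entry in the 5-bit example, the recurrence can be evaluated by hand for (G,P)[4:0]. The generator delay is not stated on the slide; C = 2 is assumed here because it is the value consistent with every number in the example.

```latex
\begin{aligned}
t[4{:}0] &= \min_{k \in \{4,3,2,1\}} \bigl\{ \max\{ t[4{:}k],\, t[k{-}1{:}0] \} \bigr\} + C \\
         &= \min\bigl\{ \max\{2,7\},\ \max\{6,5\},\ \max\{8,3\},\ \max\{8,0\} \bigr\} + 2 \\
         &= \min\{7,\ 6,\ 8,\ 8\} + 2 = 8
\end{aligned}
```

The minimum is reached at k = 3, i.e. (G,P)[4:0] is best formed from (G,P)[4:3] and (G,P)[2:0].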
Dynamic Programming • Complexity • For (G,P)[i:j], search (i-j) combinations • Overall O(n^3) • Hints for reducing complexity • For (G,P)[i:j], there might be more than one optimal partition point, but we want just one • At least one optimal partition point of (G,P)[i:j] is bounded by the optimal partition points of (G,P)[i-1:j] and (G,P)[i:j+1]
Backward Reduction I • Some of the partial prefix results are not used, hence can be removed
[Figure: two prefix graphs, panels (a) and (b), drawn over Levels 1 to 5]
Backward Reduction II • Some nodes may be over-tightened, and can be relaxed to reduce area
[Figure: two copies of a 4-bit prefix graph with input arrival times 3 (g4p4), 6 (g3p3), 7 (g2p2), 11 (g1p1) and internal nodes (G,P)[2,1], (G,P)[3,1], (G,P)[4,2], (G,P)[4,1]; the per-node timing annotations are not recoverable from the transcript]
A missing detail • Allowing (G,P) signals to overlap increases the search space • However, allowing overlap does not produce better timing • Overlapping form: (G,P)[i:j] = (G,P)[i:k] ∘ (G,P)[l:j], l ≥ k
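Function level optimization • Carry Skip Adder • If p3,0 = p3p2p1p0 = 1, then x = cin
[Figure: three 4-bit adder blocks A2, A1, A0 with inputs (a11,8, b11,8), (a7,4, b7,4), (a3,0, b3,0); block carries c12, c8, c4 and cin; each block is followed by a 2-to-1 MUX selected by p11,8, p7,4, p3,0 respectively]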
False Path • A1 <- MUX <- A0 <- cin is a false path • If the carry comes from cin, then block A0 must have p3p2p1p0 = 1 • Since p3,0 = 1, g3,0 must be 0 • So the carry is not generated inside A0 • The carry need not propagate through A0; it reaches A1 through the MUX
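The skip behaviour described above can be illustrated with a small bit-level model. This is a hedged sketch: the function names `ripple_block` and `carry_skip_add` are made up for the example, and the 12-bit width with 4-bit blocks is assumed to mirror the A2/A1/A0 figure.

```python
def ripple_block(a: int, b: int, cin: int, width: int = 4):
    """Ripple-carry add of one block.

    Returns (sum_bits, ripple_carry_out, block_propagate), where
    block_propagate is the AND of all bit propagates (p3 p2 p1 p0 for width 4).
    """
    carry = cin
    total = 0
    block_p = 1
    for bit in range(width):
        x = (a >> bit) & 1
        y = (b >> bit) & 1
        p = x ^ y              # bit propagate
        g = x & y              # bit generate
        total |= (p ^ carry) << bit
        carry = g | (p & carry)
        block_p &= p
    return total, carry, block_p


def carry_skip_add(a: int, b: int, cin: int = 0, width: int = 12, block: int = 4):
    """Add two width-bit numbers block by block with a skip MUX after each block."""
    mask = (1 << block) - 1
    result = 0
    carry = cin
    for lo in range(0, width, block):
        s, ripple_cout, block_p = ripple_block(
            (a >> lo) & mask, (b >> lo) & mask, carry, block
        )
        result |= s << lo
        # Skip MUX: if the whole block propagates, forward the block's carry-in
        # directly; otherwise use the carry produced inside the block.
        carry = carry if block_p else ripple_cout
    return result, carry
```

When `block_p` is 1 the rippled carry-out equals the carry-in anyway, so the MUX never changes the result; it only shortens the path a carry has to travel, which is why the path through the block is false.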
False Path: Cycles • Cycles of false paths, e.g. 1’s complement number addition • Positive: x • Negative: (2^n - 1) - x • Addition: ((2^n - 1) - x) + ((2^n - 1) - y) = 2^n + (2^n - 1) - (x + y) - 1
[Figure: adder with inputs A3,0 and B3,0, output S3,0, and Cout connected back to Cin]
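Example (5-bit 1’s complement) • 0 + 0 = 0: 11111 (0) + 11111 (0) = 1 11110; adding the end-around carry gives 11111 (0) • -3 - 5 = -8: 11100 (-3) + 11010 (-5) = 1 10110; adding the end-around carry gives 10111 (-8)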
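The two additions above can be reproduced with a short model of 1's complement addition with end-around carry. The helper names `encode`, `decode` and `ones_complement_add` are illustrative only, and the word width n = 5 is taken from the example.

```python
def encode(x: int, n: int = 5) -> int:
    """1's complement encoding: x for x >= 0, (2^n - 1) - |x| for x < 0."""
    mask = (1 << n) - 1
    return x if x >= 0 else mask - (-x)


def decode(v: int, n: int = 5) -> int:
    """Inverse of encode; the all-ones pattern decodes to (negative) zero."""
    mask = (1 << n) - 1
    if v >> (n - 1):          # sign bit set: negative value
        return -(mask - v)
    return v


def ones_complement_add(a: int, b: int, n: int = 5) -> int:
    """Add two n-bit 1's complement patterns with end-around carry."""
    mask = (1 << n) - 1
    s = a + b
    # Feed the carry-out back into the carry-in (the cycle in the adder).
    s = (s & mask) + (s >> n)
    return s & mask
```

For example, `ones_complement_add(encode(-3), encode(-5))` yields the pattern `0b10111`, which `decode` maps back to -8. The carry-out fed back to the carry-in is exactly the structural cycle that makes the corresponding timing path false.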
Multi-Operand Addition • Carry save adder: a (3,2) counter
Example • Each layer of (3,2) counters compresses X rows to (2/3)X rows • Implemented as a tree structure
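The row compression can be sketched on Python integers, treating each operand as one row of bits. This is an illustrative model under stated assumptions: the names `carry_save` and `multi_operand_add` are made up, operands are assumed non-negative, and the final two rows are summed with ordinary integer addition standing in for a carry-propagate adder.

```python
from typing import Iterable, Tuple


def carry_save(x: int, y: int, z: int) -> Tuple[int, int]:
    """One layer of (3,2) counters applied bitwise: three rows in, two rows out.

    The sum row holds the per-bit XOR; the carry row holds the per-bit
    majority, shifted left by one because a carry has twice the weight.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def multi_operand_add(operands: Iterable[int]) -> int:
    """Reduce the rows three at a time until two remain, then add them."""
    rows = list(operands)
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3   # number of rows that form triples
        nxt = []
        for idx in range(0, full, 3):
            s, c = carry_save(rows[idx], rows[idx + 1], rows[idx + 2])
            nxt += [s, c]
        nxt += rows[full:]                 # leftover rows pass through unchanged
        rows = nxt
    return sum(rows)                       # final carry-propagate addition
```

Each pass through the `while` loop corresponds to one level of the tree: every group of three rows becomes two, with no carry propagation inside the level.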
Other Counters • (7,3) counter • (5,3) counter • Design of (5,3) counter using full adders, with outputs Ca, Cb, S0
[Figure: counter block diagrams with output labels S2, S1, S0 and Ca, Cb, S0]
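One plausible way to build a (5,3) counter with outputs Ca, Cb, S0 from two full adders is sketched below. The wiring is an assumption, since the slide's figure is not available in the transcript: the first full adder takes three inputs, and its sum feeds the second full adder together with the remaining two inputs. The names `full_adder` and `counter_5_3` are illustrative.

```python
from typing import Tuple


def full_adder(a: int, b: int, c: int) -> Tuple[int, int]:
    """Return (sum, carry) for three input bits."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def counter_5_3(x1: int, x2: int, x3: int, x4: int, x5: int) -> Tuple[int, int, int]:
    """Assumed two-full-adder (5,3) counter; returns (Ca, Cb, S0).

    Ca and Cb both have weight 2 and S0 has weight 1, so
    x1 + x2 + x3 + x4 + x5 == S0 + 2 * (Ca + Cb).
    """
    s_mid, ca = full_adder(x1, x2, x3)
    s0, cb = full_adder(s_mid, x4, x5)
    return ca, cb, s0
```

In this arrangement the three outputs are not a plain binary count: the two carries share the same weight, which keeps the cell shallow and leaves their combination to the next stage of the compression tree.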