Programmable Logic Circuits: Computer Arithmetic: Introduction

ELECT 90X. Programmable Logic Circuits: Computer Arithmetic: Introduction. Dr. Eng. Amr T. Abdel-Hamid. Slides based on slides prepared by: B. Parhami, Computer Arithmetic: Algorithms and Hardware Design, Oxford University Press, 2000.

Presentation Transcript


  1. ELECT 90X Programmable Logic Circuits: Computer Arithmetic: Introduction Dr. Eng. Amr T. Abdel-Hamid • Slides based on slides prepared by: • B. Parhami, Computer Arithmetic: Algorithms and Hardware Design, Oxford University Press, 2000. • I. Koren, Computer Arithmetic Algorithms, 2nd Edition, A.K. Peters, Natick, MA, 2002. Fall 2009

  2. What is Computer Arithmetic? Pentium Division Bug (1994-95): the Pentium’s radix-4 SRT algorithm occasionally gave an incorrect quotient. First noted in 1994 by T. Nicely, who computed sums of reciprocals of twin primes: 1/5 + 1/7 + 1/11 + 1/13 + . . . + 1/p + 1/(p + 2) + . . . Worst-case example of division error in Pentium:

  3. A Motivating Example Using a calculator with $\sqrt{x}$, $x^2$, and $x^y$ functions, compute: • $u = \sqrt{\sqrt{\cdots\sqrt{2}}} = 1.000\,677\,131$ (ten successive square roots: the “1024th root of 2”) • $v = 2^{1/1024} = 1.000\,677\,131$ • Save u and v; if you can’t save them, recompute the values when needed • $x = (((u^2)^2)\cdots)^2 = 1.999\,999\,963$ and $x' = u^{1024} = 1.999\,999\,973$ • $y = (((v^2)^2)\cdots)^2 = 1.999\,999\,983$ and $y' = v^{1024} = 1.999\,999\,994$ • Perhaps v and u are not really the same value: $w = v - u = 1 \times 10^{-11}$, nonzero due to hidden digits • $(u - 1) \times 1000 = 0.677\,130\,680$ [Hidden ... (0) 68] • $(v - 1) \times 1000 = 0.677\,130\,690$ [Hidden ... (0) 69]

  4. Finite Range Can Lead to Disaster Example: Explosion of the Ariane rocket (June 4, 1996). The unmanned Ariane 5 rocket of the European Space Agency veered off its flight path, broke up, and exploded only 30 s after lift-off (at an altitude of 3700 m). The $500 million rocket (with cargo) was on its first voyage after a decade of development costing $7 billion. Cause: “software error in the inertial reference system”. Problem specifics: a 64-bit floating-point number relating to the horizontal velocity of the rocket was being converted to a 16-bit signed integer. An SRI* software exception arose during the conversion because the 64-bit floating-point number had a value greater than what could be represented by a 16-bit signed integer (max 32 767). *SRI = Inertial Reference System

  5. Encoding Numbers in 4 Bits Some of the possible ways of assigning 16 distinct codes to represent numbers.

  6. The Binary Number System • In conventional digital computers - integers represented as binary numbers of fixed length n • An ordered sequence of binary digits: $(x_{n-1}, x_{n-2}, \ldots, x_1, x_0)$ • Each digit $x_i$ (bit) is 0 or 1 • The above sequence represents the integer value $X = \sum_{i=0}^{n-1} x_i 2^i$ • Upper case letters represent numerical values or sequences of digits • Lower case letters, usually indexed, represent individual digits

  7. Radix of a Number System • The weight of the digit $x_i$ is the i-th power of 2 • 2 is the radix of the binary number system • Binary numbers are radix-2 numbers - allowed digits are 0, 1 • Decimal numbers are radix-10 numbers - allowed digits are 0, 1, 2, …, 9 • Radix indicated in subscript as a decimal number • Example: • $(101)_{10}$ - decimal value 101 • $(101)_2$ - decimal value 5

  8. Range of Representations • Operands and results are stored in registers of fixed length n - finite number of distinct values that can be represented within an arithmetic unit • $X_{min}$ ; $X_{max}$ - smallest and largest representable values • $[X_{min}, X_{max}]$ - range of the representable numbers • A result larger than $X_{max}$ or smaller than $X_{min}$ - incorrectly represented • The arithmetic unit should indicate that the generated result is in error - an overflow indication

  9. Example - Overflow in Binary System • Unsigned integers with 5 binary digits (bits) • $X_{max} = (31)_{10}$ - represented by $(11111)_2$ • $X_{min} = (0)_{10}$ - represented by $(00000)_2$ • Increasing $X_{max}$ by 1 gives $(32)_{10} = (100000)_2$ • 5-bit representation - only the last five digits retained - yielding $(00000)_2 = (0)_{10}$ • In general - • A number X not in the range $[X_{min}, X_{max}] = [0, 31]$ is represented by X mod 32 • If X+Y exceeds $X_{max}$ - the result is S = (X+Y) mod 32 • Example: X = 10001 (17), Y = 10010 (18); X + Y = 1 00011, stored as 00011 = 3 = 35 mod 32 • Result has to be stored in a 5-bit register - the most significant bit (with weight $2^5 = 32$) is discarded
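
The wraparound on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `add_unsigned` and the returned overflow flag are hypothetical choices, not part of the slides.

```python
def add_unsigned(x: int, y: int, n: int = 5) -> tuple[int, bool]:
    """Add two n-bit unsigned integers the way an n-bit register would.

    Returns the stored sum (reduced mod 2**n) and an overflow flag.
    """
    modulus = 1 << n              # 2**n, i.e. 32 for n = 5
    total = x + y                 # exact sum, may need n + 1 bits
    stored = total % modulus      # the bit of weight 2**n is discarded
    overflow = total >= modulus   # exact sum exceeded Xmax = 2**n - 1
    return stored, overflow


stored, overflow = add_unsigned(0b10001, 0b10010)   # 17 + 18
print(format(stored, "05b"), stored, overflow)      # 00011 3 True
```

The flag plays the role of the overflow indication from the previous slide: the stored value 3 is only meaningful together with it.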

  10. Fixed Radix Systems • r - the radix of the number system • Conventional number systems are also called fixed-radix systems • With no redundancy - $0 \le x_i \le r-1$ • $x_i \ge r$ introduces redundancy into the fixed-radix number system (how?) • If $x_i \ge r$ is allowed - • two machine representations for the same value - $(\ldots, x_{i+1}, x_i, \ldots)$ and $(\ldots, x_{i+1}+1, x_i - r, \ldots)$

  11. Representation of Mixed Numbers • A sequence of n digits in a register - not necessarily representing an integer • Can represent a mixed number with a fractional part and an integral part • The n digits are partitioned into two - k in the integral part and m in the fractional part (k+m=n) • The value of an n-tuple $(x_{k-1} \ldots x_1 x_0 \,.\, x_{-1} \ldots x_{-m})_r$ with a radix point between the k most significant digits and the m least significant digits is $X = \sum_{i=-m}^{k-1} x_i r^i$

  12. Fixed Point Representations • Radix point not stored in register - understood to be in a fixed position between the k most significant digits and the m least significant digits • These are called fixed-point representations • Programmer not restricted to the predetermined position of the radix point • Operands can be scaled - same scaling for all operands • Add and subtract operations are correct - • $aX \pm aY = a(X \pm Y)$ (a - scaling factor) • Corrections required for multiplication and division • $aX \cdot aY = a^2\, X \cdot Y$ ; $aX / aY = X / Y$ • Commonly used positions for the radix point - • rightmost side of the number (pure integers - m=0) • leftmost side of the number (pure fractions - k=0)
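
A minimal sketch of the scaling rules above, assuming a scaling factor a = 2^4 and operands stored as plain Python integers. The names `to_fixed`, `fx_add`, `fx_mul`, and `fx_div` are hypothetical helpers introduced for illustration; the corrections simply divide or multiply by a once so that the result is again scaled by a.

```python
M = 4            # assumed number of fractional bits
A = 1 << M       # scaling factor a = 2**M = 16


def to_fixed(value: float) -> int:
    """Store a real value X as the integer aX (rounded to the nearest ulp)."""
    return round(value * A)


def fx_add(ax: int, ay: int) -> int:
    return ax + ay              # aX + aY = a(X + Y): no correction needed


def fx_mul(ax: int, ay: int) -> int:
    return (ax * ay) // A       # aX * aY = a^2 XY, so divide by a once


def fx_div(ax: int, ay: int) -> int:
    return (ax * A) // ay       # aX / aY = X/Y, so multiply by a once


x, y = to_fixed(2.5), to_fixed(1.25)      # stored as 40 and 20
print(fx_add(x, y) / A)   # 3.75
print(fx_mul(x, y) / A)   # 3.125
print(fx_div(x, y) / A)   # 2.0
```

The integer divisions truncate, so results that are not multiples of the ulp lose their low-order part; the example values were chosen to avoid that.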

  13. ULP - Unit in Last Position • Given the length n of the operands, the weight $r^{-m}$ of the least significant digit indicates the position of the radix point • Unit in the last position (ulp) - the weight of the least significant digit • $ulp = r^{-m}$ • This notation simplifies the discussion • No need to distinguish between the different partitions of numbers into fractional and integral parts

  14. Representation of Negative Numbers • Fixed-point numbers in a radix r system • Two ways of representing negative numbers: • Sign and magnitude representation (or signed-magnitude representation) • Complement representation with two alternatives • Radix complement (two's complement in the binary system) • Diminished-radix complement (one's complement in the binary system)

  15. Signed-Magnitude Representation • Sign and magnitude are represented separately • First digit is the sign digit, remaining n-1 digits represent the magnitude • Binary case - sign bit is 0 for positive, 1 for negative numbers • Non-binary case - 0 and r-1 indicate positive and negative numbers • Only $2r^{n-1}$ out of the $r^n$ possible sequences are utilized • Two representations for zero - positive and negative • Inconvenient when implementing an arithmetic unit - when testing for zero, the two different representations must be checked

  16. Disadvantage of the Signed-Magnitude Representation • Operation may depend on the signs of the operands • Example - adding a positive number X and a negative number -Y : X+(-Y) • If Y>X, final result is -(Y-X) • Calculation - • switch order of operands • perform subtraction rather than addition • attach the minus sign • A sequence of decisions must be made, costing excess control logic and execution time • This is avoided in the complement representation methods

  17. Complement Representations of Negative Numbers • Two alternatives - • Radix complement (called two's complement in the binary system) • Diminished-radix complement (called one's complement in the binary system) • In both complement methods - positive numbers represented as in the signed-magnitude method • A negative number -Y is represented by R-Y where R is a constant • This representation satisfies -(-Y )=Y since R-(R-Y)=Y

  18. Advantage of Complement Representation • No decisions made before executing addition or subtraction • Example: X-Y=X+(-Y) • -Y is represented by R-Y • Addition is performed by X+(R-Y) = R-(Y-X) • If Y>X, -(Y-X) is already represented as R-(Y-X) • No need to interchange the order of the two operands

  19. Two’s Complement • r=2, k=n=4, m=0, $ulp = 2^0 = 1$ • Radix complement (called two's complement in the binary case) of a number X = $2^4 - X$ • It can instead be calculated by $\bar{X} + 1$, where $\bar{X}$ is the bitwise complement of X • 0000 to 0111 represent positive numbers $0_{10}$ to $7_{10}$ • The two's complement of 0111 is 1000+1=1001 • it represents the value $(-7)_{10}$ • The two's complement of 0000 is 1111+1=10000 = 0 mod $2^4$ - single representation of zero • Each positive number has a corresponding negative number that starts with a 1 • 1000 representing $(-8)_{10}$ has no corresponding positive number • Range of representable numbers is $-8 \le X \le 7$
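
The two ways of forming the two's complement on this slide can be compared directly. The sketch below assumes the slide's 4-bit word; the function names are hypothetical and chosen only for this illustration.

```python
N = 4                    # word length used on the slide
MODULUS = 1 << N         # 2**4 = 16
MASK = MODULUS - 1       # 1111


def twos_complement(x: int) -> int:
    """Radix complement 2**N - X, kept to N bits."""
    return (MODULUS - x) % MODULUS


def invert_plus_one(x: int) -> int:
    """Same value obtained as bitwise complement plus one ulp."""
    return ((x ^ MASK) + 1) & MASK


def to_signed(pattern: int) -> int:
    """Interpret an N-bit pattern as a two's-complement value."""
    return pattern - MODULUS if pattern >> (N - 1) else pattern


print(format(twos_complement(0b0111), "04b"))   # 1001
print(to_signed(0b1001), to_signed(0b1000))     # -7 -8
```

Complementing 0000 gives 0000 again, which is the single representation of zero noted above.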

  20. The Two’s Complement Representation

  21. Example - Addition in Two’s complement • Calculating X+(-Y) with Y>X - 3+(-5): 0011 (3) + 1011 (-5) = 1110 (-2) • Correct result represented in the two's complement method - no need for preliminary decisions or post corrections • Calculating X+(-Y) with X>Y - 5+(-3): 0101 (5) + 1101 (-3) = 1 0010 (2) • Only the four least significant digits are retained, yielding 0010
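
Both additions on this slide can be reproduced with a short sketch. It assumes 4-bit operands and a hypothetical helper `add_twos_complement`; the only step beyond plain addition is discarding the carry out of the top position.

```python
def add_twos_complement(x: int, y: int, n: int = 4) -> tuple[int, int]:
    """Add two signed values as n-bit two's-complement patterns.

    Returns (n-bit result pattern, its signed interpretation).
    """
    mask = (1 << n) - 1
    raw = (x & mask) + (y & mask)     # plain binary addition of the patterns
    pattern = raw & mask              # keep only the n least significant bits
    value = pattern - (1 << n) if pattern >> (n - 1) else pattern
    return pattern, value


for x, y in [(3, -5), (5, -3)]:
    pattern, value = add_twos_complement(x, y)
    print(format(pattern, "04b"), value)
# 1110 -2
# 0010 2
```

No sign test or operand swap appears anywhere in the function, which is the point made on the preceding slides about complement representations.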

  22. One’s Complement in Binary System • r=2, k=n=4, m=0, $ulp = 2^0 = 1$ • Diminished-radix complement (called one's complement in the binary case) of a number X = $(2^4 - 1) - X = \bar{X}$ • As before, the sequences 0000 to 0111 represent the positive numbers $0_{10}$ to $7_{10}$ • The one's complement of 0111 is 1000, representing $(-7)_{10}$ • The one's complement of zero is 1111 - two representations of zero • Range of representable numbers is $-7 \le X \le 7$
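
As a small check of the statements above, the following sketch forms the one's complement by bit inversion and decodes every 4-bit pattern. The helper names are hypothetical, and the word length n = 4 follows the slide.

```python
def ones_complement(x: int, n: int = 4) -> int:
    """Diminished-radix complement (2**n - 1) - X, i.e. bitwise inversion."""
    return x ^ ((1 << n) - 1)


def decode_ones_complement(pattern: int, n: int = 4) -> int:
    """Interpret an n-bit pattern as a one's-complement value."""
    if pattern >> (n - 1):                     # leading 1: negative (or -0)
        return -ones_complement(pattern, n)
    return pattern


print(format(ones_complement(0b0111), "04b"))            # 1000
print(decode_ones_complement(0b1000))                    # -7
print(decode_ones_complement(0b0000),
      decode_ones_complement(0b1111))                    # 0 0
```

The last line shows the two patterns that both decode to zero.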

  23. Comparing the Three Representations in a Binary System

  24. 5.1 Bit-Serial and Ripple-Carry Adders Half-adder (HA): Truth table and block diagram Full-adder (FA): Truth table and block diagram

  25. Half-Adder Implementations Three implementations of a half-adder.

  26. Full-Adder Implementations Possible designs for a full-adder in terms of half-adders, logic gates, and CMOS transmission gates.

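  27. Full-Adder Details Logic equations for a full-adder: $s = x \oplus y \oplus c_{in}$ (odd parity function) $= x y c_{in} \lor \bar{x} \bar{y} c_{in} \lor \bar{x} y \bar{c}_{in} \lor x \bar{y} \bar{c}_{in}$ ; $c_{out} = x y \lor x c_{in} \lor y c_{in}$ (majority function). CMOS transmission gate and its use in a 2-to-1 mux.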
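
The two full-adder equations can be checked against ordinary integer addition over all eight input combinations. This is a behavioural sketch in Python, not a gate-level design; `full_adder` is a name chosen for the example.

```python
from itertools import product


def full_adder(x: int, y: int, c_in: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum bit, carry-out)."""
    s = x ^ y ^ c_in                               # odd parity of the inputs
    c_out = (x & y) | (x & c_in) | (y & c_in)      # majority of the inputs
    return s, c_out


# Print the truth table; 2*c_out + s should equal x + y + c_in on every row.
for x, y, c_in in product((0, 1), repeat=3):
    s, c_out = full_adder(x, y, c_in)
    print(x, y, c_in, "->", c_out, s)
```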

  28. Simple Adders Built of Full-Adders Using full-adders in building bit-serial and ripple-carry adders.

  29. Critical Path Through a Ripple-Carry Adder $T_{ripple\text{-}add} = T_{FA}(x, y \to c_{out}) + (k - 2)\, T_{FA}(c_{in} \to c_{out}) + T_{FA}(c_{in} \to s)$ Critical path in a k-bit ripple-carry adder.
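
A ripple-carry adder is a chain of full adders in which each stage waits for the carry of the stage below it, which is why the critical path grows with k. The sketch below models that chain in Python under the assumption of unsigned k-bit operands; the function names are illustrative only.

```python
def full_adder(x: int, y: int, c_in: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum bit, carry-out)."""
    s = x ^ y ^ c_in
    c_out = (x & y) | (x & c_in) | (y & c_in)
    return s, c_out


def ripple_carry_add(x: int, y: int, k: int, c_in: int = 0) -> tuple[int, int]:
    """Add two k-bit unsigned integers with a chain of k full adders."""
    carry = c_in
    total = 0
    for i in range(k):                        # stage 0 is the least significant
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i                       # place sum bit i
    return total, carry                       # (k-bit sum, c_out)


print(ripple_carry_add(0b0101, 0b0110, k=4))   # (11, 0)
print(ripple_carry_add(0b1111, 0b0001, k=4))   # (0, 1)
```

In the second call the carry ripples through all four stages, the software counterpart of the worst-case path in the timing formula above.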

  30. Binary Adders as Versatile Building Blocks Set one input to 0: $c_{out}$ = AND of other inputs. Set one input to 1: $c_{out}$ = OR of other inputs. Set one input to 0 and another to 1: s = NOT of third input. Four-bit binary adder used to realize the logic function f = w + xyz and its complement.

  31. Conditions and Exceptions Two’s-complement adder with provisions for detecting conditions and exceptions. $\text{overflow}_{2's\text{-compl}} = c_k \oplus c_{k-1} = c_k \bar{c}_{k-1} \lor \bar{c}_k c_{k-1}$
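
The overflow rule can be exercised in software by recovering the two carries involved. This is a sketch under assumptions: k = 4 by default, `add_with_flags` is a hypothetical name, and the carry into the sign position is obtained by adding only the low k-1 bits.

```python
def add_with_flags(x: int, y: int, k: int = 4) -> tuple[int, int, int]:
    """Add two k-bit two's-complement patterns.

    Returns (k-bit result pattern, carry-out c_k, overflow flag).
    """
    mask = (1 << k) - 1
    x &= mask
    y &= mask
    low_mask = mask >> 1                                  # the k-1 low bits
    c_km1 = ((x & low_mask) + (y & low_mask)) >> (k - 1)  # carry into sign bit
    total = x + y
    c_k = total >> k                                      # carry out of sign bit
    overflow = c_k ^ c_km1                                # c_k XOR c_(k-1)
    return total & mask, c_k, overflow


print(add_with_flags(7, 1))     # (8, 0, 1): 7 + 1 does not fit in 4 bits
print(add_with_flags(3, -5))    # (14, 0, 0): pattern 1110 is -2
```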

  32. Manchester Carry Chains and Adders Sum digit in radix r: $s_i = (x_i + y_i + c_i) \bmod r$. Special case of radix 2: $s_i = x_i \oplus y_i \oplus c_i$. Computing the carries $c_i$ is thus our central problem. For this, the actual operand digits are not important. What matters is whether in a given position a carry is generated, propagated, or annihilated (absorbed). For binary addition: $g_i = x_i y_i$, $p_i = x_i \oplus y_i$, $a_i = \bar{x}_i \bar{y}_i = \overline{x_i \lor y_i}$. It is also helpful to define a transfer signal: $t_i = g_i \lor p_i = \bar{a}_i = x_i \lor y_i$. Using these signals, the carry recurrence is written as $c_{i+1} = g_i \lor c_i p_i = g_i \lor c_i g_i \lor c_i p_i = g_i \lor c_i t_i$
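
The signal definitions and the two equivalent forms of the carry recurrence can be checked with a small Python model. It is a sketch only: `add_via_signals` is a hypothetical name, operands are assumed to be unsigned k-bit integers, and the internal assertions restate the identities from the slide.

```python
def add_via_signals(x: int, y: int, k: int, c0: int = 0) -> tuple[int, list[int]]:
    """Binary addition expressed through g, p, a, t signals.

    Returns (k-bit sum, [c_0, c_1, ..., c_k]).
    """
    carries = [c0]
    total = 0
    for i in range(k):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        g = xi & yi                      # generate
        p = xi ^ yi                      # propagate
        a = (1 - xi) & (1 - yi)          # annihilate (absorb)
        t = g | p                        # transfer
        assert t == (xi | yi) == 1 - a   # t_i = x_i OR y_i = NOT a_i
        c = carries[-1]
        c_next = g | (c & p)             # c_{i+1} = g_i OR c_i p_i
        assert c_next == g | (c & t)     # ... = g_i OR c_i t_i
        carries.append(c_next)
        total |= (p ^ c) << i            # s_i = p_i XOR c_i
    return total, carries


print(add_via_signals(0b0111, 0b0001, k=4))   # (8, [0, 1, 1, 1, 0])
```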

  33. Carry Network is the Essence of a Fast Adder $g_i = x_i y_i$, $p_i = x_i \oplus y_i$. Carry network: ripple; skip; lookahead; parallel-prefix. The main part of an adder is the carry network. The rest is just a set of gates to produce the g and p signals and the sum bits.

  34. Ripple-Carry Adder Revisited The carry recurrence: $c_{i+1} = g_i \lor p_i c_i$. Latency of k-bit adder is roughly 2k gate delays: 1 gate delay for production of p and g signals, plus 2(k – 1) gate delays for carry propagation, plus 1 XOR gate delay for generation of the sum bits. The carry propagation network of a ripple-carry adder.

  35. The Complete Design of a Ripple-Carry Adder $g_i = x_i y_i$, $p_i = x_i \oplus y_i$

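  36. Unrolling the Carry Recurrence Recall the generate, propagate, annihilate (absorb), and transfer signals (signal: radix-r definition; binary form): • $g_i$ is 1 iff $x_i + y_i \ge r$; binary: $x_i y_i$ • $p_i$ is 1 iff $x_i + y_i = r - 1$; binary: $x_i \oplus y_i$ • $a_i$ is 1 iff $x_i + y_i < r - 1$; binary: $\bar{x}_i \bar{y}_i = \overline{x_i \lor y_i}$ • $t_i$ is 1 iff $x_i + y_i \ge r - 1$; binary: $x_i \lor y_i$ • $s_i = (x_i + y_i + c_i) \bmod r$; binary: $x_i \oplus y_i \oplus c_i$ • The carry recurrence can be unrolled to obtain each carry signal directly from inputs, rather than through propagation: $c_i = g_{i-1} \lor c_{i-1} p_{i-1} = g_{i-1} \lor (g_{i-2} \lor c_{i-2} p_{i-2}) p_{i-1} = g_{i-1} \lor g_{i-2} p_{i-1} \lor c_{i-2} p_{i-2} p_{i-1} = g_{i-1} \lor g_{i-2} p_{i-1} \lor g_{i-3} p_{i-2} p_{i-1} \lor c_{i-3} p_{i-3} p_{i-2} p_{i-1} = g_{i-1} \lor g_{i-2} p_{i-1} \lor g_{i-3} p_{i-2} p_{i-1} \lor g_{i-4} p_{i-3} p_{i-2} p_{i-1} \lor c_{i-4} p_{i-4} p_{i-3} p_{i-2} p_{i-1} = \ldots$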

  37. Full Carry Lookahead (Figure: inputs x3 y3, x2 y2, x1 y1, x0 y0, and cin feeding circuits that produce s3, s2, s1, s0.) Theoretically, it is possible to derive each sum digit directly from the inputs that affect it. Carry-lookahead adder design is simply a way of reducing the complexity of this ideal, but impractical, arrangement by hardware sharing among the various lookahead circuits.

  38. Four-Bit Carry-Lookahead Adder Complexity reduced by deriving the carry-out indirectly. Four-bit carry network with full lookahead. Full carry lookahead is quite practical for a 4-bit adder: $c_1 = g_0 \lor c_0 p_0$ ; $c_2 = g_1 \lor g_0 p_1 \lor c_0 p_0 p_1$ ; $c_3 = g_2 \lor g_1 p_2 \lor g_0 p_1 p_2 \lor c_0 p_0 p_1 p_2$ ; $c_4 = g_3 \lor g_2 p_3 \lor g_1 p_2 p_3 \lor g_0 p_1 p_2 p_3 \lor c_0 p_0 p_1 p_2 p_3$
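
The four lookahead equations translate directly into code in which every carry depends only on the g and p signals and on c0, never on another carry. The following Python model is a sketch for checking the equations; `cla4` is a hypothetical name and the operands are assumed to be unsigned 4-bit values.

```python
def cla4(x: int, y: int, c0: int = 0) -> tuple[int, int]:
    """4-bit addition with fully unrolled carry lookahead.

    Returns (4-bit sum, carry-out c4).
    """
    g = [((x >> i) & 1) & ((y >> i) & 1) for i in range(4)]   # generate
    p = [((x >> i) & 1) ^ ((y >> i) & 1) for i in range(4)]   # propagate

    # Each carry is a two-level AND-OR function of g, p, and c0.
    c1 = g[0] | (c0 & p[0])
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (c0 & p[0] & p[1] & p[2])
    c4 = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
          | (g[0] & p[1] & p[2] & p[3])
          | (c0 & p[0] & p[1] & p[2] & p[3]))

    carries = [c0, c1, c2, c3]
    total = sum((p[i] ^ carries[i]) << i for i in range(4))   # s_i = p_i XOR c_i
    return total, c4


print(cla4(0b1011, 0b0110))   # (1, 1): 11 + 6 = 17 = 1 0001
```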

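  39. Carry Lookahead Beyond 4 Bits Consider a 32-bit adder: $c_1 = g_0 \lor c_0 p_0$ ; $c_2 = g_1 \lor g_0 p_1 \lor c_0 p_0 p_1$ ; $c_3 = g_2 \lor g_1 p_2 \lor g_0 p_1 p_2 \lor c_0 p_0 p_1 p_2$ ; . . . ; $c_{31} = g_{30} \lor g_{29} p_{30} \lor g_{28} p_{29} p_{30} \lor g_{27} p_{28} p_{29} p_{30} \lor \ldots \lor c_0 p_0 p_1 p_2 p_3 \cdots p_{29} p_{30}$. The expression for $c_{31}$ needs a 32-input OR, and its last term a 32-input AND. High fan-ins necessitate tree-structured circuits.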

  40. Solutions to the Fan-in Problem • Multilevel lookahead • Block Adders • High-radix addition (i.e., radix $2^h$): increases the latency for generating g and p signals and sum digits, but simplifies the carry network (optimal radix?) • Example: 16-bit addition • Radix-16 (four digits) • Two-level carry lookahead (four 4-bit blocks) • Either way, the carries $c_4$, $c_8$, and $c_{12}$ are determined first • (Figure: the carries $c_{16}$ down to $c_0$, with $c_{16}$ labeled $c_{out}$, $c_0$ labeled $c_{in}$, and three intermediate carries marked “?”.)

  41. Block Ripple Adder

  42. Larger Carry-Lookahead Adder Design • Block generate and propagate signals • $g_{[i,i+3]} = g_{i+3} \lor g_{i+2} p_{i+3} \lor g_{i+1} p_{i+2} p_{i+3} \lor g_i p_{i+1} p_{i+2} p_{i+3}$ • $p_{[i,i+3]} = p_i p_{i+1} p_{i+2} p_{i+3}$ • If all 4 bits in a block propagate, the block propagates a carry. • If at least one of the 4 bits generates a carry and it can be propagated to the MSB, the block generates a carry.
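
The block signals can be checked by confirming that they predict the carry out of a 4-bit block. The sketch below assumes that the block carry-out is $g_{[i,i+3]} \lor c_i\, p_{[i,i+3]}$, which follows from the $c_4$ equation of the four-bit lookahead slide; the function names are hypothetical.

```python
def bit_signals(x: int, y: int, width: int = 4) -> tuple[list[int], list[int]]:
    """Per-bit generate and propagate signals, index 0 = least significant."""
    g = [((x >> i) & 1) & ((y >> i) & 1) for i in range(width)]
    p = [((x >> i) & 1) ^ ((y >> i) & 1) for i in range(width)]
    return g, p


def block_gp(g: list[int], p: list[int]) -> tuple[int, int]:
    """Block generate and propagate for one 4-bit block."""
    block_g = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
               | (g[0] & p[1] & p[2] & p[3]))
    block_p = p[0] & p[1] & p[2] & p[3]
    return block_g, block_p


g, p = bit_signals(0b1010, 0b0101)     # every position propagates
print(block_gp(g, p))                  # (0, 1)
g, p = bit_signals(0b1100, 0b0100)     # bit 2 generates, bit 3 propagates
print(block_gp(g, p))                  # (1, 0)
```

With these two signals a second-level lookahead unit can compute the carry into each block without waiting for the bits inside the lower blocks.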

  43. A Building Block for Carry-Lookahead Addition Four-bit lookahead carry generator. Four-bit adder

  44. Combining Block g and p Signals Combining of g and p signals of four blocks of arbitrary widths into the g and p signals for the overall block

  45. A Two-Level Carry-Lookahead Adder Building a 64-bit carry-lookahead adder from 16 4-bit adders and 5 lookahead carry generators.

  46. Ling Adder and Related Designs Consider the carry recurrence and its unrolling by 4 steps: $c_i = g_{i-1} \lor c_{i-1} t_{i-1} = g_{i-1} \lor g_{i-2} t_{i-1} \lor g_{i-3} t_{i-2} t_{i-1} \lor g_{i-4} t_{i-3} t_{i-2} t_{i-1} \lor c_{i-4} t_{i-4} t_{i-3} t_{i-2} t_{i-1}$. Ling’s modification: propagate $h_i = c_i \lor c_{i-1}$ instead of $c_i$: $h_i = g_{i-1} \lor h_{i-1} t_{i-2} = g_{i-1} \lor g_{i-2} \lor g_{i-3} t_{i-2} \lor g_{i-4} t_{i-3} t_{i-2} \lor h_{i-4} t_{i-4} t_{i-3} t_{i-2}$. CLA: 5 gates, max 5 inputs, 19 gate inputs. Ling: 4 gates, max 5 inputs, 14 gate inputs. The advantage of $h_i$ over $c_i$ is even greater with wired-OR: CLA: 4 gates, max 5 inputs, 14 gate inputs. Ling: 3 gates, max 4 inputs, 9 gate inputs. Once $h_i$ is known, however, the sum is obtained by a slightly more complex expression compared with $s_i = p_i \oplus c_i$: $s_i = (t_i \oplus h_{i+1}) \lor h_i g_i t_{i-1}$

  47. Carry Determination as Prefix Computation

  48. Formulating the Prefix Computation Problem The problem of carry determination can be formulated as: Given $(g_0, p_0), (g_1, p_1), \ldots, (g_{k-2}, p_{k-2}), (g_{k-1}, p_{k-1})$, find $(g_{[0,0]}, p_{[0,0]}), (g_{[0,1]}, p_{[0,1]}), \ldots, (g_{[0,k-2]}, p_{[0,k-2]}), (g_{[0,k-1]}, p_{[0,k-1]})$, which correspond to the carries $c_1, c_2, \ldots, c_{k-1}, c_k$. The desired pairs are found by evaluating all prefixes of $(g_0, p_0)$ ¢ $(g_1, p_1)$ ¢ . . . ¢ $(g_{k-2}, p_{k-2})$ ¢ $(g_{k-1}, p_{k-1})$. The carry operator ¢ is associative, but not commutative: $[(g_1, p_1)$ ¢ $(g_2, p_2)]$ ¢ $(g_3, p_3)$ = $(g_1, p_1)$ ¢ $[(g_2, p_2)$ ¢ $(g_3, p_3)]$. Prefix sums analogy: Given $x_0, x_1, x_2, \ldots, x_{k-1}$, find $x_0,\; x_0 + x_1,\; x_0 + x_1 + x_2,\; \ldots,\; x_0 + x_1 + \cdots + x_{k-1}$
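
The prefix formulation maps naturally onto a scan. The sketch below uses `itertools.accumulate` as a sequential stand-in for a parallel prefix network and assumes the usual definition of the carry operator, (g', p') ¢ (g'', p'') = (g'' ∨ g' p'', p' p''), with the first operand being the less significant one; that definition is not spelled out in this transcript. It also assumes c0 = 0, and the function names are hypothetical.

```python
from itertools import accumulate


def carry_op(low: tuple[int, int], high: tuple[int, int]) -> tuple[int, int]:
    """Carry operator: combine (g, p) of a lower span with a higher span."""
    g_lo, p_lo = low
    g_hi, p_hi = high
    return g_hi | (g_lo & p_hi), p_lo & p_hi


def prefix_carries(x: int, y: int, k: int) -> list[int]:
    """Return [c_1, ..., c_k] for x + y with c_0 = 0, via a prefix scan."""
    pairs = []
    for i in range(k):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        pairs.append((xi & yi, xi ^ yi))          # (g_i, p_i)
    prefixes = accumulate(pairs, carry_op)        # (g[0,i], p[0,i]) for each i
    return [g for g, _ in prefixes]               # c_{i+1} = g[0,i]


print(prefix_carries(0b0111, 0b0001, k=4))   # [1, 1, 1, 0]
print(carry_op((1, 0), (0, 0)), carry_op((0, 0), (1, 0)))   # (0, 0) (1, 0)
```

The second print shows that swapping the operands changes the result, so the operator is not commutative. Associativity is the property that lets a hardware network regroup the same computation into a tree.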

  49. Example Prefix-Based Carry Network (Figure: a four-input prefix sums network, with its scan order, shown beside a four-bit carry lookahead network built from ¢ operators. The carry network takes the inputs $(g_3, p_3)$, $(g_2, p_2)$, $(g_1, p_1)$, $(g_0, p_0)$ and produces $(g_{[0,3]}, p_{[0,3]}) = (c_4, --)$, $(g_{[0,2]}, p_{[0,2]}) = (c_3, --)$, $(g_{[0,1]}, p_{[0,1]}) = (c_2, --)$, $(g_{[0,0]}, p_{[0,0]}) = (c_1, --)$.)

  50. Alternative Parallel Prefix Networks Parallel prefix sums network built of two k/2-input networks and k/2 adders. (Ladner-Fischer)
