Floating Point Operations - Part II

Presentation Transcript


  1. Floating Point Operations - Part II

  2. Multiplication • Do unsigned multiplication on the mantissas, including the hidden bits • Add the true exponents, or unbias one of the biased exponents (subtract 127 from it) and add it to the other using 2’s complement addition • Normalize the result • Set the sign bit of the result

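  3. Setting the Sign bit The following table gives the sign bit of the result, which is the exclusive OR of the operand sign bits:
       Sign of A   Sign of B   Sign of result
           0           0             0
           0           1             1
           1           0             1
           1           1             0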

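  4. Example 12.5: 18.0 x 9.5
     Operands (sign, biased exponent, mantissa with the hidden bit in parentheses; the least 16 bits of each mantissa are zero and are eliminated):
       18.0 = 0 1000 0011 (1)001 0000
        9.5 = 0 1000 0010 (1)001 1000
     Mantissas:
                  10010000
                x 10011000
                ----------
               10010000000
              100100000000
           100100000000000
           ---------------
           101010110000000     (14 bits after the binary point: 1.01010110000000, already normalized)
     Exponents: unbias (subtract 127 from) one exponent, then perform 2’s complement addition:
           1000 0011
         + 0000 0011
         -----------
           1000 0110
     Result: 0 1000 0110 (1)010 1011 = 171.0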
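The multiplication steps and the worked example above can be sketched in Python. This is a minimal illustration, not a full IEEE implementation: the helper names `fields` and `fp32_multiply` are made up for this sketch, the result is truncated rather than rounded, and zero, infinities, NaN, overflow and underflow are ignored.

```python
import struct

def fields(x):
    """Split x (stored as single precision) into sign, biased exponent, fraction."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def fp32_multiply(a, b):
    """Multiply two normalized single-precision values field by field.

    Hypothetical helper: truncates instead of rounding and does not
    handle zero, infinities, NaN, overflow or underflow.
    """
    sa, ea, fa = fields(a)
    sb, eb, fb = fields(b)
    ma = (1 << 23) | fa            # restore the hidden bit (24-bit mantissa)
    mb = (1 << 23) | fb
    product = ma * mb              # unsigned multiply, 46 bits after the point
    exponent = ea + (eb - 127)     # unbias one exponent, then add
    if product >> 47:              # product is in [2, 4): normalize
        product >>= 1
        exponent += 1
    fraction = (product >> 23) & 0x7FFFFF   # drop hidden bit, truncate low bits
    sign = sa ^ sb                 # sign bit is the XOR of the operand signs
    bits = (sign << 31) | (exponent << 23) | fraction
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

With these definitions, `fp32_multiply(18.0, 9.5)` returns `171.0`, and `fields(171.0)` gives the exponent `1000 0110` and a fraction starting `010 1011`, matching the example.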

  5. Division • Do unsigned division of the mantissas, including the hidden bits • Subtract the exponent of the divisor from the exponent of the dividend (if both exponents are biased, add 127 back to the difference) • Normalize the result • Set the sign bit of the result

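  6. Setting the sign bit of the quotient The sign bit of the quotient is set using the following table (the exclusive OR of the operand sign bits):
       Sign of dividend   Sign of divisor   Sign of quotient
              0                  0                 0
              0                  1                 1
              1                  0                 1
              1                  1                 0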
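The division steps can be sketched the same way. Again this is only an illustration under simplifying assumptions: `fields` and `fp32_divide` are hypothetical names, the quotient is truncated rather than rounded, and zero, infinities, NaN, overflow and underflow are not handled.

```python
import struct

def fields(x):
    """Split x (stored as single precision) into sign, biased exponent, fraction."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def fp32_divide(a, b):
    """Divide two normalized single-precision values field by field.

    Hypothetical helper: truncates instead of rounding and does not
    handle zero, infinities, NaN, overflow or underflow.
    """
    sa, ea, fa = fields(a)
    sb, eb, fb = fields(b)
    ma = (1 << 23) | fa              # restore the hidden bits
    mb = (1 << 23) | fb
    quotient = (ma << 24) // mb      # unsigned division, scaled by 2**24
    exponent = ea - eb + 127         # subtract exponents, then re-bias
    if quotient < (1 << 24):         # mantissa quotient below 1: normalize
        exponent -= 1                # hidden bit already sits at bit 23
    else:
        quotient >>= 1               # move the hidden bit down to bit 23
    fraction = quotient & 0x7FFFFF   # drop the hidden bit
    sign = sa ^ sb                   # sign bit is the XOR of the operand signs
    bits = (sign << 31) | (exponent << 23) | fraction
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

For instance, `fp32_divide(171.0, 9.5)` returns `18.0`, undoing the multiplication example.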

  7. Rounding • In floating point operations, some results may not be exactly representable. • Each such result must be rounded, which introduces a small error. • Errors tend to accumulate over time • Operations performed in a different order might give different results • Exact comparison of two floating point variables is unreliable
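The last point can be illustrated with a short Python sketch. Note that Python floats are double precision rather than the single precision used in these slides, and that the helper name `nearly_equal` and its default tolerance are assumptions made for this example.

```python
import math

def nearly_equal(a, b, rel_tol=1e-9):
    """Compare two floats within a relative tolerance instead of exactly."""
    return math.isclose(a, b, rel_tol=rel_tol)

total = 0.1 + 0.2                     # neither 0.1 nor 0.2 is exactly representable
exact_match = (total == 0.3)          # False: the rounding errors differ
close_match = nearly_equal(total, 0.3)  # True: equal within the tolerance
```

A suitable tolerance depends on the computation; the value used here is only a placeholder.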

  8. Floating point addition is not associative. Example 13.1 Suppose x = -1.5 x 10^38, y = 1.5 x 10^38 and z = 1.0, and suppose these are single-precision numbers. Because 1.0 is far smaller than the spacing between single-precision numbers near 10^38, y + z rounds back to y.
       x + (y + z) = -1.5 x 10^38 + (1.5 x 10^38 + 1.0) = -1.5 x 10^38 + 1.5 x 10^38 = 0.0
       (x + y) + z = (-1.5 x 10^38 + 1.5 x 10^38) + 1.0 = 0.0 + 1.0 = 1.0
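Example 13.1 can be reproduced in Python by rounding every intermediate result to single precision. The helpers `f32` and `add32` are hypothetical names for this sketch; they emulate single precision with the `struct` module because Python floats are double precision.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def add32(a, b):
    """Single-precision addition: add, then round the sum to single precision."""
    return f32(a + b)

x, y, z = f32(-1.5e38), f32(1.5e38), f32(1.0)

right_first = add32(x, add32(y, z))   # y + z rounds back to y, so the sum is 0.0
left_first = add32(add32(x, y), z)    # x + y is exactly 0.0, so the sum is 1.0
```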

  9. Rounding Rules • Round to nearest. Same as taught in school, except for ties: if the lsb is 1, add 1; if the lsb is 0, truncate. After a tie the lsb is therefore always 0. • Round toward zero. Truncate the magnitude to the correct number of bits. • Round toward positive infinity. The smallest representable value that is not less than the unrounded value is chosen. • Round toward negative infinity. The largest representable value that is not greater than the unrounded value is chosen.
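The four rules can be sketched on an unsigned integer mantissa from which a number of low-order bits must be dropped. The function name `round_mantissa`, the mode names and the sign-magnitude interface are assumptions made for this illustration; a real implementation would also renormalize if the rounding carries out of the top bit.

```python
def round_mantissa(magnitude, drop, mode, negative=False):
    """Drop the lowest `drop` bits of an unsigned magnitude using one rounding rule.

    mode is "nearest" (ties to even), "zero", "up" (toward +infinity)
    or "down" (toward -infinity); `negative` is the sign of the value.
    """
    if drop < 1:
        raise ValueError("drop must be at least 1")
    kept = magnitude >> drop                    # bits that survive
    remainder = magnitude & ((1 << drop) - 1)   # bits being discarded
    half = 1 << (drop - 1)                      # exactly halfway between neighbours
    if mode == "nearest":
        # Round up if above half, or on a tie when the lsb is 1.
        if remainder > half or (remainder == half and kept & 1):
            kept += 1
    elif mode == "zero":
        pass                                    # truncate the magnitude
    elif mode == "up":
        if remainder and not negative:          # positive values grow in magnitude
            kept += 1
    elif mode == "down":
        if remainder and negative:              # negative values grow in magnitude
            kept += 1
    else:
        raise ValueError("unknown rounding mode")
    return kept
```

For example, dropping two bits from `0b10110` is a tie with lsb 1, so round to nearest gives `0b110`; dropping two bits from `0b10010` is a tie with lsb 0, so the result is `0b100`.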

  10. Overflow • Overflow occurs when the exponent of the normalized result exceeds the range of representable exponents • The smallest normalized number has an exponent of e = -126, i.e. E = 1 = 0000 0001, and the largest number has an exponent of e = 127, i.e. E = 254 = 1111 1110

  11. The IEEE FPS assigns special meanings to the extreme values of the exponent • -∞ (S=1, E=255, F=0) • +∞ (S=0, E=255, F=0) • NaN (E=255, F ≠ 0) • 0 (E=0, F=0)
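The encodings listed above can be checked by decoding the fields of a single-precision value. The function name `classify` and its return strings are made up for this sketch. The denormal case (E=0, F ≠ 0) is not covered on the slide and is included here only so that every bit pattern is classified.

```python
import struct

def classify(x):
    """Name the kind of single-precision value from its E and F fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s, e, f = bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF
    if e == 255:                       # largest exponent: infinity or NaN
        if f == 0:
            return "-inf" if s else "+inf"
        return "NaN"
    if e == 0:                         # smallest exponent: zero or denormal
        return "zero" if f == 0 else "denormal"
    return "normal"                    # 1 <= E <= 254
```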

  12. Underflow • Underflow occurs when the result is too close to zero to be represented • Repeatedly dividing a number by a constant greater than 1 yields values that approach zero but mathematically never reach it, e.g. dividing 1 by 10 repeatedly • In floating point, such a computation eventually returns zero after enough iterations

  13. Until underflow occurs, the computation is reversible (up to rounding error), i.e. if we multiply the current result by the constant the same number of times we divided by it, we get back the original number • Once underflow occurs, any number of multiplications will still produce zero
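The behaviour described on the last two slides can be observed with a small experiment that rounds each step to single precision. The helper names `f32`, `divide_until_zero` and `round_trip` are hypothetical, and the exact number of steps before underflow depends on how the intermediate results round, so no specific count is claimed here.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def divide_until_zero(start=1.0, constant=10.0):
    """Count single-precision divisions needed before the result underflows to 0."""
    x, steps = f32(start), 0
    while x != 0.0:
        x = f32(x / constant)
        steps += 1
    return steps

def round_trip(start, constant, n):
    """Divide n times, then multiply n times, rounding to single precision each step."""
    x = f32(start)
    for _ in range(n):
        x = f32(x / constant)
    for _ in range(n):
        x = f32(x * constant)
    return x
```

Calling `round_trip(1.0, 10.0, 10)` returns a value very close to 1.0, whereas `round_trip(1.0, 10.0, 60)` returns 0.0 because the value underflows before the multiplications begin.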
