The New IEEE-754 Standard for Floating Point Arithmetic Peter Markstein
Original IEEE 754 Standard • Adopted in 1985 • Intended for hardware implementation • Provision for software implementation of some operations • Appendix of recommended functions • Standard and alternate exception handling • No provisions for software access to floating point features, e.g. rounding mode.
Objectives of new standard • Incorporate good existing standard practice • Include decimal arithmetic, instead of separately updating IEEE 854 (radix-independent arithmetic) • Clarify the standard • Reproducible floating point results • Try not to invalidate systems which conform to IEEE-754 (1985)
Existing Practice Added to Standard • Fused multiply-add (a × b + c) • The full double-length product a × b participates in the addition, and the result is rounded only once. • Two different existing implementations of invalid exception reporting • Quad precision data format • 128-bit container, 113 bits of significand precision (112 stored explicitly), 15 exponent bits
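The effect of the single rounding in a fused multiply-add can be sketched in a few lines. This is an illustration only, not a hardware fma: the helper `fma_reference` is a hypothetical name, it emulates the operation with exact rational arithmetic, and it assumes finite binary64 inputs.

```python
from fractions import Fraction

def fma_reference(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: exact a*b + c, rounded once to binary64.

    Hypothetical helper for illustration; finite inputs only.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    # Converting a Fraction to float rounds to nearest, ties to even.
    return float(exact)

a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# Unfused: the exact product 1 - 2**-54 is rounded to 1.0 before the addition.
unfused = a * b + c
# Fused: the exact product takes part in the addition, so -2**-54 survives.
fused = fma_reference(a, b, c)
```

Here `unfused` is 0.0 while `fused` is -2⁻⁵⁴, which is why replacing a multiply and an add by an fma changes results.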
Existing Practice Added to Standard - 2 • Non-homogeneous operations • 1985 standard suggests not allowing the result to be of a different format from the inputs • 2008 standard requires most operations to accept mixed inputs (in the same radix), with a specified result format.
Decimal Arithmetic • Motivated by business applications • 2 computational formats
Decimal Arithmetic -2 • Representation of data • Two proposed methods • Binary encoded mantissa: mantissa encoded as a binary number • Decimal encoded mantissa: uses 10-bit subfields to encode 3 decimal digits (not the obvious binary encoding) • Standard requires conversion between both formats, although an implementation is expected to perform arithmetic on only one representation. • Decimal encoding proposed by hardware implementor; binary encoding by software implementor.
Decimal Arithmetic -3 • Cohorts • Number may have several representations • Members of a cohort compare equal, but can be distinguished • Unnormalized data allows position of decimal point to be represented • Preferred exponent for all arithmetic operations
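Cohorts can be illustrated with Python's `decimal` module. This is a sketch under an assumption: `decimal` follows the General Decimal Arithmetic Specification rather than implementing the IEEE decimal64 or decimal128 formats, but it models the same idea of several representations for one value.

```python
from decimal import Decimal

x = Decimal("1.0")
y = Decimal("1.00")

# Members of a cohort compare equal...
same_value = (x == y)
# ...but their (sign, digits, exponent) representations differ.
same_representation = (x.as_tuple() == y.as_tuple())

# The exponent of a result follows a fixed rule per operation:
# addition keeps the smaller exponent, multiplication adds the exponents.
total = Decimal("1.20") + Decimal("1.3")
product = Decimal("1.20") * Decimal("2")
```

Printing `total` and `product` gives `2.50` and `2.40`, so the position of the decimal point is carried through the arithmetic.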
Some clarifications • Exceptions • Signal • Setting of flags • In default mode, signals set flags • Underflow flag is set only if inexact is also signalled • Retain two methods of determining underflow • Alternate exception handling • Optional • Interrupts are an example
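The split between flags and alternate handling can be sketched with the `decimal` module, whose context keeps a flag and a trap enabler per signal. The function names `divide_default` and `divide_trapping` are hypothetical, and a Python exception stands in here for whatever alternate handling a language might define.

```python
from decimal import Decimal, DivisionByZero, localcontext

def divide_default(x: str, y: str):
    """Default-style handling: the signal sets a flag and a result is delivered."""
    with localcontext() as ctx:
        ctx.clear_flags()
        ctx.traps[DivisionByZero] = False   # no trap: only the flag is raised
        result = ctx.divide(Decimal(x), Decimal(y))
        return result, bool(ctx.flags[DivisionByZero])

def divide_trapping(x: str, y: str):
    """Alternate-style handling: the signal transfers control to a handler."""
    with localcontext() as ctx:
        ctx.traps[DivisionByZero] = True
        try:
            return ctx.divide(Decimal(x), Decimal(y))
        except DivisionByZero:
            return "trapped"
```

With these definitions, `divide_default("1", "0")` returns infinity together with a raised flag, while `divide_trapping("1", "0")` never delivers a numeric result.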
Number formats • Formats characterized by p (precision), emax (maximum exponent), b (radix) • Fixed Width Interchange formats • Encodings defined in standard • Binary 16, 32, 64, 128 • Decimal 32, 64, 128 • Basic formats • At least one basic format must be supported • Binary 32, 64, 128; Decimal 64, 128
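As a worked example of how p and emax characterize a binary format, the sketch below derives them from the container width and the exponent field width. The table and function names are hypothetical; the exponent widths (5, 8, 11, 15 bits) are those of the binary interchange formats.

```python
import sys

# Exponent field width w, keyed by total container width k in bits.
BINARY_FORMATS = {16: 5, 32: 8, 64: 11, 128: 15}

def parameters(k: int):
    """Return (p, emax) for the binary interchange format of width k."""
    w = BINARY_FORMATS[k]
    # k = 1 sign bit + w exponent bits + (p - 1) stored significand bits
    p = k - w
    emax = 2 ** (w - 1) - 1
    return p, emax

def largest_finite(k: int) -> int:
    """Largest finite value, (2 - 2**(1 - p)) * 2**emax, as an exact integer."""
    p, emax = parameters(k)
    return (2 ** p - 1) * 2 ** (emax - p + 1)
```

For example, `parameters(128)` gives `(113, 16383)`, and `largest_finite(64)` equals `int(sys.float_info.max)` on a platform whose `float` is binary64.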
Extended and extendable precisions • Not required by the standard • Extended precision format extends a basic format with wider precision and range • Extendable precision format has precision and range specified under program control.
Additional functionality • Control of modes (or attributes) • Static and dynamic • Rounding mode • Decimal requires round to nearest, ties to larger magnitude • Alternate exception handling • Control of expression evaluation
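The difference between the two round-to-nearest tie rules can be shown with the `decimal` module, used here as a stand-in for IEEE decimal arithmetic. `ROUND_HALF_UP` in that module sends ties away from zero (larger magnitude) and `ROUND_HALF_EVEN` sends ties to the even neighbor; `round_to_integer` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_to_integer(text: str, mode: str) -> Decimal:
    """Round a decimal literal to an integer under the given rounding mode."""
    return Decimal(text).quantize(Decimal("1"), rounding=mode)

ties_even = round_to_integer("2.5", ROUND_HALF_EVEN)      # tie goes to the even neighbor
ties_away = round_to_integer("2.5", ROUND_HALF_UP)        # tie goes to larger magnitude
ties_away_negative = round_to_integer("-2.5", ROUND_HALF_UP)
```

The two modes differ only on exact ties: 2.5 becomes 2 under ties-to-even and 3 under ties-to-larger-magnitude, and -2.5 becomes -3 under the latter.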
Optional operations • Elementary Function Library • Conforming functions must round correctly • Special case behavior specified • Dilemma about correct rounding producing results outside the expected range e.g. atan(infinity)
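The atan(infinity) dilemma can be made concrete by comparing π/2 with its nearest floating point neighbors. This sketch assumes binary64 Python floats and uses a 36-digit decimal truncation of π/2 as the reference; the names `HALF_PI` and `to_binary32` are introduced only for illustration.

```python
import math
import struct
from fractions import Fraction

# pi/2 truncated to 36 significant digits, precise enough to order it against floats.
HALF_PI = Fraction("1.57079632679489661923132169163975144")

def to_binary32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

half_pi_64 = math.pi / 2               # nearest binary64 value to pi/2
half_pi_32 = to_binary32(half_pi_64)   # nearest binary32 value to pi/2

below_in_binary64 = Fraction(half_pi_64) < HALF_PI
above_in_binary32 = Fraction(half_pi_32) > HALF_PI
```

In binary64 the rounded value lies just below π/2, but in binary32 it lies above it, so a correctly rounded atan(infinity) in binary32 falls outside the mathematical range of the function.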
Optional Operations - 2 • Product Reductions • Inserted to allow fast, special purpose evaluations, even in the presence of over/underflow. • Not generally reproducible. • Result is scaled • ∏aᵢ, ∏(bᵢ + cᵢ), ∏(bᵢ − cᵢ)
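One way to read "result is scaled" is that the product is returned as a fraction plus a separate power-of-two exponent, so intermediate overflow and underflow cannot occur. The sketch below shows that idea with `math.frexp`; the function names are hypothetical and this is not the standard's exact specification of the operation.

```python
import math

def naive_product(values):
    """Plain left-to-right product; may overflow or underflow on the way."""
    result = 1.0
    for v in values:
        result *= v
    return result

def scaled_product(values):
    """Return (m, e) with the product approximately equal to m * 2**e."""
    mantissa, exponent = 1.0, 0
    for v in values:
        m, e = math.frexp(v)                         # v == m * 2**e, 0.5 <= |m| < 1
        mantissa, shift = math.frexp(mantissa * m)   # renormalize after each step
        exponent += e + shift
    return mantissa, exponent

data = [1e200, 1e200, 1e-200, 1e-200]
m, e = scaled_product(data)
```

For `data`, the naive product overflows to infinity after the second factor, whereas `math.ldexp(m, e)` recovers a value close to 1.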
Optional Operations-3 • Sum reductions • ∑aᵢ • ∑aᵢbᵢ • ∑|aᵢbᵢ| • ∑aᵢ² • Open question: should a correctly rounded dot product (like in Acrith) be specified? At the moment it is not included.
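The gap between an ordinary dot product and a correctly rounded one can be sketched as follows. `exact_dot` is a hypothetical reference that accumulates exactly with rational arithmetic and rounds once at the end; it assumes finite inputs and makes no claim about how a real implementation would do this efficiently.

```python
from fractions import Fraction

def naive_dot(a, b):
    """Dot product with a rounding after every multiply and every add."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

def exact_dot(a, b):
    """Correctly rounded dot product: exact accumulation, one final rounding."""
    exact = sum(Fraction(x) * Fraction(y) for x, y in zip(a, b))
    return float(exact)

a = [2.0**60, 1.0, -2.0**60]
b = [2.0**60, 1.0, 2.0**60]
```

With these vectors the naive loop absorbs the middle term into 2¹²⁰ and returns 0.0, while the correctly rounded result is 1.0.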
Optional operations -4 • Operations on dynamic modes • Language standards should provide • getBinaryRoundingDirection • setBinaryRoundingDirection • Same for decimal if decimal supported • saveModes, restoreModes • defaultModes (sets all dynamically specified modes to their default values)
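A language binding for these operations might look like the sketch below, which wraps the thread-local context of Python's `decimal` module. All five function names are hypothetical, chosen to mirror the operations listed above; the saved state here is the whole context (rounding, precision, flags, traps), which is broader than rounding direction alone.

```python
import decimal

def get_decimal_rounding_direction() -> str:
    """Return the current dynamic rounding direction."""
    return decimal.getcontext().rounding

def set_decimal_rounding_direction(mode: str) -> None:
    """Change the dynamic rounding direction for the current thread."""
    decimal.getcontext().rounding = mode

def save_modes() -> decimal.Context:
    """Snapshot all dynamic modes."""
    return decimal.getcontext().copy()

def restore_modes(saved: decimal.Context) -> None:
    """Reinstate a snapshot taken by save_modes."""
    decimal.setcontext(saved.copy())

def default_modes() -> None:
    """Reset all dynamic modes to the module defaults."""
    decimal.setcontext(decimal.Context())
```

A typical use is to call `save_modes()` on entry to a routine, change the rounding direction locally, and call `restore_modes()` before returning, so callers never observe the change.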
Expression evaluation • Single operations are reproducible if performed with same operands and same environment • Languages may define order of operations, formats of implicit intermediate results, possible double-rounding of final result before storing
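Double rounding can be demonstrated with a value that sits just above a binary32 tie. The sketch uses exact rationals and a hypothetical helper `round_to_binary32` that only handles values in [1, 2); it compares rounding straight to binary32 with rounding first to binary64 and then to binary32.

```python
from fractions import Fraction

def round_to_binary32(x: Fraction) -> Fraction:
    """Round x in [1, 2) to 24-bit precision, ties to even."""
    assert 1 <= x < 2
    # In [1, 2) binary32 values are multiples of 2**-23;
    # round() on a Fraction rounds half to even.
    return Fraction(round(x * 2**23), 2**23)

# Slightly above the midpoint between 1 and 1 + 2**-23.
exact = 1 + Fraction(1, 2**24) + Fraction(1, 2**60)

direct = round_to_binary32(exact)
# float() rounds to binary64 first, which discards the 2**-60 term
# and leaves an exact binary32 tie for the second rounding.
via_binary64 = round_to_binary32(Fraction(float(exact)))
```

The direct rounding gives 1 + 2⁻²³, but the two-step path gives 1, which is why the format of implicit intermediate results matters for reproducibility.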
Expression evaluation- 2 • widenTo attributes: noWidenTo, widenToFormat • widenToFormat does not affect width of final rounding of a specific destination format • Value-changing optimizations, e.g. fma generation, reduction operation generation, use of associative and distributive laws, are not reproducible
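A minimal illustration of why use of the associative law is value-changing, assuming binary64 Python floats:

```python
# The same three operands, grouped two ways.
left_to_right = (0.1 + 0.2) + 0.3
reassociated = 0.1 + (0.2 + 0.3)

# Each individual addition is reproducible, but the groupings round differently.
results_differ = left_to_right != reassociated
```

The first grouping yields 0.6000000000000001 and the second 0.6, so a compiler that regroups the sum changes the result even though every single operation is correctly rounded.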
Reproducible floating point results • Reproducible operations, reproducible attributes • Only overflow, divideByZero, and invalid are reproducible status flags • Reproducible programs must use reproducible operations, reproducible attributes • Avoid value-changing optimizations • Avoid fma(0, Inf, NaN)