This paper presents a decimal floating-point (DFP) adder designed to reduce latency through several modifications, chiefly a new internal format and a decimal Leading-Zero Anticipator (LZA). The LZA anticipates the position of the most significant non-zero digit of the result; earlier LZA designs targeted binary rather than decimal arithmetic. The internal format uses an uncompressed exponent field, a BCD-encoded significand, and a stored leading zero count, which takes leading-zero detection off the critical path. The adder passed numerous random tests and the corner cases of IBM’s test suite.
A Decimal Floating-Point Adder with Decoded Operands and a Decimal Leading-Zero Anticipator
By Liang-Kai Wang and Michael J. Schulte
Joseph Schneider
March 12, 2010
Objective
• Goal is to reduce the latency of the DFP adder
• Several modifications are made to achieve this, including a new internal format
• Main focus is the design of a decimal LZA
Leading-Zero Anticipator
• Detects the location of the most significant non-zero digit
• Previous designs have been for binary, not decimal
• A decimal LZA is expected to improve latency
New Internal Format
• Exponent field is uncompressed
• Significand is encoded in BCD
• New field for the Leading Zero Count (LZC); removes leading-zero detection from the critical path
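As a rough illustration of the format described above, the sketch below models a decoded operand as a record with an uncompressed exponent, a BCD significand, and a stored leading zero count. The names `DecodedOperand`, `to_bcd`, and `decode` are hypothetical, and the 16-digit precision is an assumption (decimal64); the slides do not give field names or widths.

```python
from dataclasses import dataclass

P = 16  # assumed precision in decimal digits (decimal64); not stated in the slides


@dataclass(frozen=True)
class DecodedOperand:
    """Hypothetical model of the decoded internal format."""
    sign: int             # 0 = positive, 1 = negative
    exponent: int         # kept uncompressed, as a plain integer
    significand_bcd: int  # P digits packed as BCD, 4 bits per digit
    lzc: int              # leading zero count carried along with the operand


def to_bcd(value: int, p: int = P) -> int:
    """Pack a non-negative integer of at most p digits into BCD nibbles."""
    if not 0 <= value < 10 ** p:
        raise ValueError("value does not fit in p decimal digits")
    bcd = 0
    for shift, ch in enumerate(reversed(str(value).zfill(p))):
        bcd |= int(ch) << (4 * shift)
    return bcd


def decode(sign: int, exponent: int, coefficient: int, p: int = P) -> DecodedOperand:
    """Build a decoded operand; the LZC is computed once, at decode time."""
    lzc = p - len(str(coefficient)) if coefficient else p
    return DecodedOperand(sign, exponent, to_bcd(coefficient, p), lzc)


op = decode(0, -2, 12345)          # represents 123.45
print(hex(op.significand_bcd))     # 0x12345
print(op.lzc)                      # 11
```

Because the count travels with the operand, a unit that needs it (for example, to compute a shift amount) can read `lzc` directly instead of scanning the significand.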
Improvements
• Internal format removes the need for forward and backward conversion units
• Pre-correction is moved in front of the Swapping unit and duplicated, which keeps it out of the critical path
• Leading-zero detection is no longer performed in the Shift Amount unit; the Leading Zero Count is now an input signal
• The LZA produces the Leading Zero Count of the result, so later decimal operations do not need to recalculate it
Leading-Zero Anticipator
• Needed in addition and subtraction to guarantee that the leading zero count (LZC) of the output is correct
• Only needed when the result of the addition or subtraction is not rounded; the LZC is always zero when the result is rounded
LZA - Addition
• Preliminary LZC is the smaller of the leading zero counts of the two significands being added
• If the addition produces a carry, the final LZC is the preliminary LZC reduced by one
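The addition rule above can be sketched in a few lines. This is a behavioral model only, not the hardware: it works on integer coefficients, assumes a 16-digit precision, and treats "carry" as a carry out of the most significant non-zero digit position of the larger operand. The function names are hypothetical.

```python
P = 16  # assumed precision in decimal digits; not stated in the slides


def leading_zeros(x: int, p: int = P) -> int:
    """Count leading zero digits of x when written with exactly p digits."""
    assert 0 <= x < 10 ** p
    return p - len(str(x)) if x else p


def lza_add(a: int, b: int, p: int = P) -> int:
    """Leading zero count of a + b, derived from the operands' counts."""
    prelim = min(leading_zeros(a, p), leading_zeros(b, p))
    # A carry means the sum needs one more digit than the larger operand.
    carry = (a + b) >= 10 ** (p - prelim)
    if carry and prelim == 0:
        # The sum no longer fits in p digits, so it is rounded; per the
        # earlier slide, the LZC of a rounded result is zero.
        return 0
    return prelim - 1 if carry else prelim


print(lza_add(567, 812))  # 12: both operands have 13 leading zeros, 1379 has 12
print(lza_add(123, 45))   # 13: no carry, so the preliminary count stands
```

The point of the rule is that the preliminary count depends only on the inputs, so it is available before the sum is; only the one-digit adjustment waits on the carry.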
LZA - Subtraction
• Requires an Encoding unit, a Correction unit, and a parallel array of decimal digit adders
• Encoding unit
  • Converts BCD digits into strings of zeros and ones
  • Detects the position of the most significant non-zero digit in the string
• Correction unit
  • Flag generation modules and correction trees determine whether the Encoding unit’s result needs to be corrected
Testing
• IBM decNumber library version 3.56 used to verify the correctness of the adder
• Sign, exponent, and the length and value of the significand were randomly generated
• Adder passed numerous random tests and the corner cases of IBM’s test suite
• Previous adder version and new adder both implemented in Verilog RTL using TSMC 45 nm bulk technology
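A minimal sketch of the random-operand approach described above, for readers who want to try it in software. It substitutes Python's standard `decimal` module for the decNumber library as the reference model, which is not what the authors used. The precision, the exponent range, and the function names are assumptions made for illustration.

```python
import random
from decimal import Context, Decimal, ROUND_HALF_EVEN

P = 16  # assumed precision in decimal digits; not stated in the slides


def random_operand(rng: random.Random, p: int = P) -> Decimal:
    """Draw a random sign, exponent, significand length, and significand value."""
    sign = rng.randint(0, 1)
    exponent = rng.randint(-20, 20)   # arbitrary small range for this sketch
    length = rng.randint(1, p)        # number of significant digits
    coefficient = rng.randrange(10 ** (length - 1), 10 ** length)
    digits = tuple(int(d) for d in str(coefficient))
    return Decimal((sign, digits, exponent))


def reference_add(a: Decimal, b: Decimal, p: int = P) -> Decimal:
    """Reference sum rounded to p digits (stand-in for the decNumber result)."""
    return Context(prec=p, rounding=ROUND_HALF_EVEN).add(a, b)


rng = random.Random(2010)
for _ in range(5):
    a, b = random_operand(rng), random_operand(rng)
    print(a, "+", b, "=", reference_add(a, b))
```

In a real flow, each operand pair would also be driven through the RTL simulation and the adder's output compared with the reference sum.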
Results
• Both designs use the same floorplan, so the area utilization rate reflects how much area each design uses
• New adder is 14% faster, at the cost of 18% more area
Results – Adder Area Profile
• The LZA takes up a significant amount of area, though the Kogge-Stone adder is still the largest component
Results – LZA
• LZA synthesized alone; its critical path has a maximum delay of 24 FO4 inverter delays
• Subtractor takes up over 60% of the LZA area