What You Have Always Wanted to Know about FP Hardware Implementation (But Were Afraid to Ask)
Acknowledgements: Based on Prof. Shaaban’s lecture notes, Prof. M. Flynn and S. Oberman’s lecture notes, and past work on the SNAP project.
Nhon Quach
Outline
• IEEE 754 Standard: motivations and implementation challenges
• Common implementation practices in current FP adders and multipliers
• Advanced implementation topics (based on the Stanford SNAP project)
EE270 Special Lecture on FP Arithmetic
IEEE Standard Motivations
• Enhance portability of math libraries
• Preserve simple mathematical properties such as a*b = b*a
• Graceful degradation through support of denormalized numbers
• Multiple rounding modes: round to nearest (RN) for lower rounding bias; round toward plus infinity (RP) and round toward minus infinity (RM) for interval arithmetic
• Well worth the complexity and cost in hardware
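Two of these motivations can be observed directly from software. The sketch below assumes a Python build whose `float` is an IEEE 754 double with gradual underflow enabled (the usual case); the variable names are illustrative only. It checks that multiplication commutes and that the difference of two distinct tiny numbers lands on a denormalized value instead of being flushed to zero.

```python
import sys

# Commutativity of multiplication is preserved by correctly rounded arithmetic.
a, b = 0.1, 0.3
commutes = (a * b == b * a)

# Gradual underflow: x and y are distinct normal numbers whose difference
# is smaller than the smallest normal number.
tiny = sys.float_info.min      # smallest positive normal double, 2**-1022
x = 1.5 * tiny
y = tiny
diff = x - y                   # exactly 0.5 * tiny, a denormalized number

# With denormals, x != y implies x - y != 0.
is_denormal = 0.0 < diff < tiny
```

On hardware or in modes that flush denormals to zero, `diff` would be 0.0 even though `x != y`, which is the kind of surprise graceful degradation is meant to avoid.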
FP Addition Algorithm (The 1st Time)
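The slide's figure is not reproduced in this transcript. As a stand-in, here is a minimal sketch of the textbook addition steps (exponent difference, alignment, significand add, normalize, round), written with integer arithmetic so each step is visible. It is an assumption-laden illustration, not the flow from the lecture: it handles only positive, normal doubles, uses round to nearest even, ignores overflow, and the helper names `decompose` and `fp_add` are made up for this example.

```python
import math

P = 53  # significand precision of an IEEE 754 double

def decompose(x):
    """Split a positive normal double into (m, e) with x == m * 2**e
    and 2**52 <= m < 2**53."""
    frac, exp = math.frexp(x)              # x == frac * 2**exp, 0.5 <= frac < 1
    return int(math.ldexp(frac, P)), exp - P

def fp_add(a, b):
    """Add two positive normal doubles step by step (round to nearest even)."""
    (ma, ea), (mb, eb) = decompose(a), decompose(b)
    if ea < eb:                            # keep the larger exponent in (ma, ea)
        ma, ea, mb, eb = mb, eb, ma, ea
    d = ea - eb                            # 1. exponent difference
    ma <<= 3                               # make room for guard, round, sticky
    mb <<= 3
    sticky = 1 if mb & ((1 << d) - 1) else 0
    mb = (mb >> d) | sticky                # 2. align the smaller operand
    s = ma + mb                            # 3. significand addition
    if s >> (P + 3):                       # 4. carry-out: shift right by one
        s = (s >> 1) | (s & 1)             #    (keep the sticky information)
        ea += 1
    m, grs = s >> 3, s & 7                 # 5. round to nearest, ties to even
    if grs > 4 or (grs == 4 and m & 1):
        m += 1
    if m >> P:                             # 6. rounding overflowed: renormalize
        m >>= 1
        ea += 1
    return math.ldexp(m, ea)
```

For inputs in the supported range the result should agree bit for bit with the native `a + b`, for example `fp_add(0.1, 0.2) == 0.1 + 0.2`. Each step is a dependent operation, which hints at why a straightforward implementation is slow.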
Why Do We Need Direct Hardware Support?
Latency and Throughput of Various FP Units
FP Addition Algorithm (The 2nd Time)
• Predict the number of leading 0’s in the result based on the significands
• A negative result needs to be 2’s complemented before normalization
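To make the two points concrete, the following sketch subtracts two n-bit significands the way an adder would (add the 1's complement plus a carry-in), complements the result when it comes out negative, and then counts leading zeros to find the normalization shift. The function name and return layout are hypothetical, and the leading zeros are counted after the subtraction here, which is exactly the serial dependency that prediction tries to remove.

```python
def subtract_and_normalize(ma, mb, n):
    """Return (negative, normalized_magnitude, left_shift) for ma - mb,
    where ma and mb are unsigned n-bit significands."""
    mask = (1 << n) - 1
    raw = ma + (~mb & mask) + 1        # ma - mb + 2**n, as an adder computes it
    carry_out = raw >> n
    diff = raw & mask
    negative = (carry_out == 0)        # no carry-out means ma < mb
    if negative:
        diff = (~diff + 1) & mask      # 2's complement to recover the magnitude
    if diff == 0:
        return False, 0, 0             # exact cancellation, nothing to normalize
    shift = n - diff.bit_length()      # number of leading zeros
    return negative, diff << shift, shift
```

For example, with n = 8, subtracting 0b10000011 from 0b10000000 gives a negative result of magnitude 3, which needs a left shift of 6 to normalize.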
Leading One Prediction (LOP)
• Detect the patterns of Z*, T*GZ, G*, and T*ZG*, where Z = a’b’, T = a xor b, G = ab, and Z* means any number of Z’s.
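A small behavioral sketch can show how one of these patterns locates the leading one. The assumptions are loud ones: it covers only the case a > b, where the per-bit string of a and the 1's complement of b always starts with T's, then one G, then a run of Z's; it uses a regular expression in place of gates; and the function names are invented for the example. The predicted position is the last symbol of that prefix, and the true leading one of a - b is at that position or one bit lower.

```python
import re

def classify(x, y, n):
    """Per-bit symbols, MSB first: G = both bits 1, Z = both 0, T = exactly one 1."""
    out = []
    for i in range(n - 1, -1, -1):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        out.append("G" if xi & yi else "Z" if not (xi | yi) else "T")
    return "".join(out)

def predict_leading_one(a, b, n):
    """Predicted bit index of the leading one of a - b, for n-bit a > b >= 0."""
    mask = (1 << n) - 1
    symbols = classify(a, ~b & mask, n)    # subtraction adds the complement of b
    prefix = re.match(r"T*GZ*", symbols)   # always matches when a > b
    return n - prefix.end()                # bit index of the last matched symbol

def actual_leading_one(a, b):
    """Bit index of the leading one of the true difference."""
    return (a - b).bit_length() - 1
```

For n = 4, a = 8 and b = 1 give the string GTTZ, so the prediction is bit 3 while the true leading one of 7 is bit 2. The one-bit uncertainty is resolved by a final conditional shift once the carry is known.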
(Figure not reproduced; note the use of “.” notation. Stages shown: multiplicand and multiplier; partial products; carry and sum vectors, combined by the CPA; round; final significand.)
Sign Extension Trick (Favorite Interview Question)
(Figure not reproduced: partial-product rows in which the sign-extension bits s are replaced by a pattern of 1’s.)
• Adding the 1’s does not change the value of the significand
• The sum of the sign bit and 1 is the complement of the sign
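One common formulation of this trick, which may differ in detail from the slide's figure, rests on the identity that a k-bit two's complement value equals the same bit pattern with its sign bit complemented, minus 2^(k-1). The subtracted constants of all rows can be merged into a single precomputed string of 1's, so no row needs its sign bit copied across the full width. The sketch below checks this on shifted rows; the function names and the (bits, shift) row layout are assumptions made for the example.

```python
def sum_with_sign_extension(rows, k, w):
    """Reference: sign-extend every k-bit row, then add modulo 2**w."""
    total = 0
    for bits, shift in rows:
        value = bits - (1 << k) if bits >> (k - 1) else bits
        total += value << shift
    return total & ((1 << w) - 1)

def sum_without_sign_extension(rows, k, w):
    """Complement each sign bit and add one merged constant instead."""
    mask = (1 << w) - 1
    total = 0
    constant = 0
    for bits, shift in rows:
        total += (bits ^ (1 << (k - 1))) << shift   # row with sign bit flipped
        constant -= 1 << (k - 1 + shift)            # accumulates the -2**(k-1) terms
    return (total + (constant & mask)) & mask       # constant & mask is the 1's pattern
```

With rows `[(0b1011, 0), (0b0110, 2), (0b1111, 4)]`, k = 4 and w = 12, both functions return 3, which matches -5 + 6*4 + (-1)*16.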
Many Ways to Build A Tree
Many Types of Counters Too
• 3-2 counters
• 7-3 counters
• 9-2 counters
• Binary tree, ZM trees, overturned Staircase trees, etc.
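As an illustration of the simplest case, a 3-2 counter takes three bits of equal weight and produces a sum bit and a carry bit, so applying it across whole words turns three addends into two without any carry propagation. The sketch below reduces the partial products of an unsigned multiply with layers of 3-2 counters and leaves the last two vectors to a final carry-propagate add. The grouping order is a simple assumption chosen for clarity and does not correspond to any particular tree topology named on the slide.

```python
def counter_3_2(a, b, c):
    """Bitwise 3-2 counter: a + b + c == sum_vec + carry_vec."""
    sum_vec = a ^ b ^ c
    carry_vec = ((a & b) | (a & c) | (b & c)) << 1
    return sum_vec, carry_vec

def multiply(x, y, n):
    """Unsigned n-bit multiply via partial products and 3-2 counter layers."""
    rows = [(x << i) if (y >> i) & 1 else 0 for i in range(n)]
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3      # rows that form complete triples
        reduced = []
        for i in range(0, full, 3):
            reduced.extend(counter_3_2(*rows[i:i + 3]))
        reduced.extend(rows[full:])           # leftovers pass to the next layer
        rows = reduced
    return sum(rows)                          # final carry-propagate addition
```

For instance, `multiply(13, 11, 4)` returns 143. Each layer has the delay of one counter regardless of word width, which is why the choice of tree shape and counter type matters for multiplier latency.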
FP Addition Algorithm (The 3rd Time)
Advanced FP Addition Algorithm
N. Quach & M. Flynn, SNAP Addition Algorithm, Stanford, 1991
Summary
• Fast FP adder tricks: two-path implementation, LOP, and integrated rounding
• Fast FP multiplier tricks: sign extension (elimination) logic, partial product reduction tree, rounding, and Booth’s encoding
To Probe Further
• Arith.stanford.edu (lots of technical reports and papers on FP adder, multiplier, and divider implementation)
• IEEE 754 standard specification
• David Goldberg, “What Every Computer Scientist Should Know About Floating-Point Arithmetic,” ACM Computing Surveys, 23(1), 5-48, 1991
• W. Kahan, “IEEE Standard 754 for Binary Floating-Point Arithmetic,” lecture notes on the status of the IEEE 754 standard