
Decimal Multiplication with Efficient Partial Product Generation



Presentation Transcript


  1. Decimal Multiplication with Efficient Partial Product Generation Mike Schulte Dept. of Electrical & Computer Engineering University of Wisconsin at Madison Mark Erle, Eric Schwarz Server & Technology Group IBM

  2. Outline • Introduction and motivation • Decimal multiplication challenges • Novel aspects of algorithm • Algorithm components • Operand recode • Digit-by-digit multiplication • Partial product generation • Overlap removal & encoding • Partial product accumulation • Final product correction • Summary

  3. Introduction and Motivation • Preponderance of business data in decimal form • Inexact mapping between decimal and binary • Decimal arithmetic used (required) in banking, finance, insurance, accounting • Increasing support in arithmetic community (revising IEEE 754/854) • Significant speedup achievable in hardware • Multiplication a key function

  4. By the way, we’re about 20% through the talk: 0.20₁₀ = 0.00110011…₂
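
The repeating expansion on this slide can be checked directly. The following is a minimal Python sketch, not part of the talk; the helper name `binary_fraction_bits` is hypothetical. It expands 1/5 in base 2 using exact rational arithmetic, then shows that a binary double initialized from 0.2 does not equal the decimal value 0.2.

```python
from decimal import Decimal
from fractions import Fraction


def binary_fraction_bits(value: Fraction, n_bits: int) -> str:
    """Return the first n_bits binary fraction digits of 0 <= value < 1."""
    bits = []
    for _ in range(n_bits):
        value *= 2                 # shift the next binary digit above the point
        bit = 1 if value >= 1 else 0
        bits.append(str(bit))
        value -= bit               # keep only the fractional remainder
    return "0." + "".join(bits)


# 0.20 (base 10) never terminates in base 2: the pattern 0011 repeats.
print(binary_fraction_bits(Fraction(1, 5), 12))   # 0.001100110011

# Decimal(0.2) captures the exact value of the nearest binary double,
# which differs from the decimal literal "0.2".
print(Decimal(0.2) == Decimal("0.2"))             # False
```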

  5. Decimal Multiplication Challenges • Greater number of multiplicand tuples • Complicates partial product generation • Representing decimal values with two-state devices • Complicates partial product generation • Complicates partial product accumulation • Inability to use binary arithmetic techniques directly

  6. Novel Aspects of Algorithm • Recode operands • Simplify partial product generation • Improve latency of partial product generation • Restrict magnitude range of partial product digits • Simplify partial product accumulation • Improve latency of partial product accumulation

  7. Key Aspect of Algorithm • Generate partial products as needed, not a priori • Benefits: • Reduces cycles to generate tuples • Reduces wiring to distribute tuples • Eliminates registers needed to store tuples • Cost can be delay during iterative portion of algorithm • Reduce cost via pipelining • Generate partial product in cycle i • Accumulate partial product in cycle i+1

  8. Operand Recode - Complexity of Digit-by-digit Products

  9. Operand Recode - Mechanism • Need signed-digits to restrict range • E.g., 2 5 6 is recoded into 3 -4 -4 • a_i^S ∈ {-5, -4, …, 0, …, +4, +5} • Recode in parallel all digits ≥ 5 • Four cases: a_i ≥ 5?, a_(i-1) ≥ 5? • Need three operations • “Do nothing” • Increment • Radix complement • Diminished radix complement
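
The recoding on this slide can be sketched in software as follows. This is an illustrative Python model, not the authors' hardware: the function names are hypothetical, and the mapping of the four cases onto the listed operations is an assumption inferred from the 2 5 6 example. Each output digit depends only on its own digit and the next lower digit, which is what allows all positions to be recoded in parallel. An extra leading digit is emitted to hold a possible carry out of the most significant position.

```python
def recode_signed_digits(digits):
    """Recode non-empty BCD digits (most significant first) into {-5..+5}.

    Assumed case mapping (a_i = this digit, a_(i-1) = next lower digit):
      a_i <  5, a_(i-1) <  5 -> do nothing                  (a_i)
      a_i <  5, a_(i-1) >= 5 -> increment                   (a_i + 1)
      a_i >= 5, a_(i-1) <  5 -> radix complement            (a_i - 10)
      a_i >= 5, a_(i-1) >= 5 -> diminished radix complement (a_i - 9)
    """
    lsd_first = digits[::-1]
    recoded = []
    for i, a in enumerate(lsd_first):
        carry_in = 1 if i > 0 and lsd_first[i - 1] >= 5 else 0
        borrow = 10 if a >= 5 else 0
        recoded.append(a - borrow + carry_in)
    # Carry out of the most significant digit becomes one extra digit.
    recoded.append(1 if lsd_first[-1] >= 5 else 0)
    return recoded[::-1]


def signed_digit_value(digits):
    """Evaluate a signed-digit string given most significant digit first."""
    value = 0
    for d in digits:
        value = 10 * value + d
    return value


print(recode_signed_digits([2, 5, 6]))   # [0, 3, -4, -4]
```

The leading 0 in the output is the unused carry position; the remaining digits match the slide's 3 -4 -4.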

  10. Operand Recode - Implementation • Recode entire multiplicand, recode multiplier digit by digit • Fig. a: single digit • Fig. b: n-digit

  11. Digit-by-digit Product - Mechanism • Restrict digits to yield only 16 combinations • Magnitude: {0, …, 9} → {-5, …, +5} (100) • Absolute value: {-5, …, +5} → {0, …, 5} (36) • Zero & identity: {0, …, 5} → {2, …, 5} (16) • Lookup-table or combinatorial logic • Product characteristics • Absolute value → sign correction • {0, …, 25}, i.e., two digits → overlap removal • Restrict LSD to |5| → signed-digit addition • LSD magnitude restriction eases • Overlap removal • Partial product accumulation
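
A small Python model of the reduction to 16 combinations might look like the sketch below. The names `PRODUCT_TABLE`, `split_restricted`, and `digit_product` are hypothetical, and the rule used to keep the least significant digit within magnitude 5 (subtract 10 from any low digit above 5 and add 1 to the high digit) is an assumption; the slides do not spell out the exact encoding.

```python
def split_restricted(magnitude):
    """Split 0..25 into (msd, lsd) with |lsd| <= 5 and 10*msd + lsd == magnitude."""
    msd, lsd = divmod(magnitude, 10)
    if lsd > 5:          # assumed rule: push the excess into the upper digit
        lsd -= 10
        msd += 1
    return msd, lsd


# Only operand magnitudes 2..5 need a real lookup: 4 x 4 = 16 entries.
PRODUCT_TABLE = {
    (x, y): split_restricted(x * y)
    for x in range(2, 6)
    for y in range(2, 6)
}


def digit_product(a, b):
    """Multiply two signed digits in {-5..+5}; return sign-corrected (msd, lsd)."""
    x, y = abs(a), abs(b)
    if x == 0 or y == 0:          # zero case: no lookup needed
        msd, lsd = 0, 0
    elif x == 1:                  # identity cases: pass the other operand through
        msd, lsd = 0, y
    elif y == 1:
        msd, lsd = 0, x
    else:                         # both magnitudes > 1: table lookup
        msd, lsd = PRODUCT_TABLE[(x, y)]
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * msd, sign * lsd


print(len(PRODUCT_TABLE))       # 16
print(digit_product(-4, 4))     # (-2, 4), i.e. -20 + 4 = -16
```

Under this assumed rule the upper digit never exceeds magnitude 2, since the largest product is 5 × 5 = 25.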

  12. Partial Product - Implementation • LSD mux selects: • a_0^S or b_i^S = 0 • a_0^S = 1 • b_i^S = 1 • a_0^S and b_i^S > 1 • MSD mux selects: • a_0^S and b_i^S < 2 • a_0^S and b_i^S > 1 • Fig. a: single digit • Fig. b: (n+1)-digit

  13. Overlap Removal & Encoding • Partial products are sign-corrected, signed-magnitude digits in overlapped form • In each digit position • Four-bit, signed-magnitude digit {-5, …, +5} • Three-bit, signed-magnitude digit {-2, …, +2} • Prepare for partial product accumulation via Svoboda signed-digit adder • Use combinatorial circuit to • remove the overlap • produce Svoboda-encoded signed-digits
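
To make the overlapped form concrete, the sketch below builds one partial product from a recoded multiplicand and a single recoded multiplier digit, then collapses the two digits that share each position. It is a simplified Python model under stated assumptions: all function names are hypothetical, overlap removal is modeled as a plain per-position sum (giving digits in {-7, …, +7}), and the Svoboda encoding itself is not reproduced. Whether the actual circuit restricts the digit range further is not stated on the slides.

```python
def split_digit_product(a, b):
    """Signed-digit product as (msd, lsd) with |lsd| <= 5 (assumed splitting rule)."""
    msd, lsd = divmod(abs(a * b), 10)
    if lsd > 5:
        msd, lsd = msd + 1, lsd - 10
    sign = -1 if a * b < 0 else 1
    return sign * msd, sign * lsd


def overlapped_partial_product(multiplicand, b):
    """Return per-position (lsd, msd) pairs, least significant position first.

    Position j holds the LSD of a_j * b and the MSD of a_(j-1) * b.
    """
    pairs = [split_digit_product(a, b) for a in multiplicand]
    positions = []
    for j in range(len(pairs) + 1):
        lsd = pairs[j][1] if j < len(pairs) else 0
        msd = pairs[j - 1][0] if j > 0 else 0
        positions.append((lsd, msd))
    return positions


def remove_overlap(positions):
    """Collapse each overlapped pair into a single signed digit."""
    return [lsd + msd for lsd, msd in positions]


def value(digits_lsd_first):
    """Evaluate a signed-digit string given least significant digit first."""
    return sum(d * 10 ** k for k, d in enumerate(digits_lsd_first))


# 256 recoded as 3 -4 -4, stored least significant digit first, times 3.
partial = remove_overlap(overlapped_partial_product([-4, -4, 3], 3))
print(partial, value(partial))   # [-2, -3, -2, 1] 768
```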

  14. Partial Product Accumulation • Addition with signed-digits eliminates carry propagation • Use Svoboda signed-digit adder to accumulate • Partial product in encoded form • Shifted intermediate product (previous iteration) • One final product digit converted to BCD each cycle • Four cases: IP_i[0] ≥ 0?, IP_(i-1)[0] ≥ 0? • Need four operations • Convert to BCD • Convert to BCD and decrement • Convert additive inverse to BCD and radix complement • Convert additive inverse to BCD, radix complement, and decrement
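
The per-cycle conversion of one signed digit to BCD can be illustrated with the Python sketch below. It is a sequential software model, not the authors' circuit: the function name is hypothetical, and it tracks an explicit borrow flag from the previously retired digit rather than reproducing the slide's exact sign tests. The four branches are intended to correspond to the four listed operations.

```python
def signed_digits_to_bcd(digits_lsd_first):
    """Convert signed digits (|d| <= 9, least significant first) to BCD digits.

    One digit is retired per loop iteration, mirroring one digit per cycle.
    Assumes the overall value is non-negative, as for a product of magnitudes.
    """
    bcd = []
    borrow = 0                      # set when the previous digit went negative
    for d in digits_lsd_first:
        d -= borrow                 # "decrement" when the lower digit borrowed
        if d >= 0:
            bcd.append(d)           # convert to BCD (with or without decrement)
            borrow = 0
        else:
            bcd.append(d + 10)      # radix complement of the additive inverse
            borrow = 1
    if borrow:
        raise ValueError("signed-digit value is negative")
    return bcd


# 768 in signed-digit form, least significant digit first.
print(signed_digits_to_bcd([-2, -3, -2, 1]))   # [8, 6, 7, 0]
```

Read most significant digit first, the result is 0768.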

  15. Cycle By Cycle

  16. Block Diagram - Top

  17. Block Diagram - Bottom

  18. Summary • Algorithm utilizes restricted-range, signed digits throughout • Original aspects include: • Recoding operands into restricted-range, signed-digits • Generating non-overlapping, sign-corrected partial products from recoded operands • Recoding partial products for entry into signed-digit adder • Algorithm achieves n+5 latency • Extendable to floating-point multiplication

  19. Questions & Perhaps Some Answers
