
Design of a Faithful LNS Interpolator


Presentation Transcript


  1. Design of a Faithful LNS Interpolator Mark Arnold University of Manchester Institute of Science and Technology

  2. Outline
  • Why choose Logarithmic Number Systems (LNS)?
  • Floating Point versus LNS
  • Round to Nearest is Hard
  • Restricted versus Unrestricted Faithful Rounding
  • Interpolation and Partitioning
  • Prior Interpolators (Coleman et al., Lewis)
  • Proposed Interpolators
  • Conclusions

  3. Arithmetic Choices
  • Fixed-point (FX): scaled integers with manual rescaling after each multiply; hard to design, but the common choice for cost-sensitive applications
  • Floating-point IEEE-754 (FP): the exponent provides automatic scaling for the mantissa; easier to use but more expensive
  • Logarithmic Number System (LNS): convert to logarithms once and keep values as logs throughout the computation; as easy as FP, and can be faster, cheaper, and lower power than FX

  4. Advantages of LNS
  • Cheaper multiply, divide, and square root
  • Good for applications with a high proportion of multiplications
  • Multiplication introduces no additional rounding error: log(2) + log(3) = log(6) exactly
  • Most significant bits change less frequently: power savings
  [Figure: slide-rule-style logarithmic scales illustrating log(2) + log(3) = log(6).]

  5. Commercial Interest in LNS
  • Motorola: 120 MHz, 1 GFLOP LNS chip [pan99]
  • European Union: LNS microprocessor [col00]
  • Yamaha: music synthesizer [kah98]
  • Boeing: aircraft controls
  • Interactive Machines, Inc.: IMI-500, animation for Jay Jay the Jet Plane
  • Advanced Rendering Technologies: ray-tracing engine hardware

  6. Notation
  • x = real value; X = corresponding logarithmic representation
  • b = base of the logarithm (b = 2 is typical)
  • F = precision (number of fractional bits in the representation)
  • Δ = b^(2^-F), i.e., the smallest representable value > 1.0
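
  A quick numerical check of the last definition, using the illustrative choices F = 23 and b = 2 (values consistent with the later slides); a minimal Python sketch:

      # Assumed values for illustration: F = 23 fractional bits, base b = 2.
      F, b = 23, 2.0
      smallest_above_one = b ** (2.0 ** -F)   # Delta = b^(2^-F), smallest representable value > 1.0
      print(smallest_above_one - 1.0)         # ~8.3e-8, a constant relative step of about ln(2) * 2**-F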

  7. LNS Addition
  Given X = logb(x) and Y = logb(y):
  1. Let Z = X - Y, so Z = logb(x/y)
  2. Look up sb(Z) = logb(1 + b^Z) = logb(1 + x/y)
  3. T = Y + sb(Z) = logb(y·(1 + x/y)); thus T = logb(x + y)
  Hardware: one subtractor; one function-approximation unit (a lookup table in ROM or RAM for F < 12, interpolation for higher precision); one adder.
  A similar function, db, handles subtraction.
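
  A minimal Python sketch of the three steps above; the table lookup for sb is replaced here by a direct call to math.log, which is an assumption for clarity (a hardware implementation would use a ROM or an interpolator):

      import math

      def lns_add(X, Y, b=2.0):
          """Add two LNS values X = log_b(x), Y = log_b(y); returns log_b(x + y)."""
          hi, lo = (X, Y) if X >= Y else (Y, X)
          Z = lo - hi                       # step 1: Z <= 0, so b**Z lies in (0, 1]
          s = math.log(1.0 + b ** Z, b)     # step 2: s_b(Z) = log_b(1 + b**Z), normally a table/interpolator
          return hi + s                     # step 3: log_b(larger * (1 + smaller/larger)) = log_b(x + y)

      # Example: adding 2 and 3 in the log domain should give log2(5).
      X, Y = math.log2(2.0), math.log2(3.0)
      print(2.0 ** lns_add(X, Y))           # ~5.0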

  8. Floating Point versus LNS
  Exactly representable points are shown for precision F = 2; floating point has greater relative error over part of each binade (largest just above each power of two).
  [Figure: number lines from 1.0 to 4.0 marking the exactly representable values of LNS and of floating point.]

  9. Floating Point versus LNS
  LNS: the continuous change in spacing between representable values means constant relative precision.
  Floating point: the discrete change in spacing causes a wobble in relative precision.
  Lewis' observation: round-to-nearest LNS is Better Than Floating Point (BTFP): its worst-case relative error is smaller by a factor of about ln(2), leaving margin for rounding error while still being BTFP. But is it worth the cost?
  [Figure: representable values between 1.0 and 4.0 for LNS and floating point.]
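
  A small numerical comparison of worst-case relative spacing, assuming F = 23 for both formats; this is only an illustrative check of the ln(2) factor mentioned above, not a computation from the slides:

      import math

      F = 23
      # Floating point: within a binade [2**e, 2**(e+1)) the spacing is 2**(e - F),
      # so the relative spacing is largest just above 2**e, where it equals 2**-F.
      fp_worst_relative_step = 2.0 ** -F

      # LNS: representable values are b**(k * 2**-F), so the relative step is constant.
      lns_relative_step = 2.0 ** (2.0 ** -F) - 1.0

      print(lns_relative_step / fp_worst_relative_step)   # ~0.693, i.e. about ln(2)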

  10. Rounding Modes
  • Round to Nearest: prescribed by IEEE-754 for floating point (FP); affordable for FP at any precision; economical for LNS only at low precision (F < 12)
  • Restricted Faithful
  • Unrestricted Faithful

  11. Round to Nearest Non-exactly-representable values round to the nearest of the two possible exact representations

  12. Round to Nearest The green point is closer to the left representation

  13. Round to Nearest All values on the left, no matter how close to the midpoint, round to this representation

  14. Round to Nearest Points on the right of the midpoint round to this representation

  15. Table Makers' Dilemma
  • Interpolation of sb is needed for high precision
  • Some results are hard to round to nearest, and guaranteeing the nearest result costs much more memory
  • Relax the rounding requirement: faithful rounding chooses one of the two closest points
  • Allowing more next-nearest results decreases memory

  16. Faithful Rounding Modes
  • Restricted Faithful: "Better than Floating Point" (BTFP) in the worst case; like round-to-nearest except near the midpoint
  • Unrestricted Faithful: our previous simulations show it is good enough for some applications; cuts LNS memory size 3- to 6-fold versus restricted
  • Probabilistic model: p = probability that a faithful result does not round to the nearest
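
  To make the probabilistic model concrete, here is a small Monte Carlo sketch that counts how often a given approximation of sb fails to deliver the nearest representable result. The argument range, trial count, and the stand-in truncated approximation are assumptions for illustration and do not model the paper's actual interpolators:

      import math, random

      def estimate_p(sb_approx, F=23, b=2.0, trials=100_000):
          """Estimate p: the fraction of results that are NOT the nearest representable value,
          for a given approximation sb_approx of s_b(Z) = log_b(1 + b**Z)."""
          ulp = 2.0 ** -F
          misses = 0
          for _ in range(trials):
              Z = -random.uniform(0.0, 8.0)            # assumed argument range for the table
              exact = math.log(1.0 + b ** Z, b)        # "infinitely precise" s_b(Z)
              delivered = round(sb_approx(Z) / ulp)    # result after rounding to F fractional bits
              nearest = round(exact / ulp)             # true round-to-nearest result
              if delivered != nearest:
                  misses += 1
          return misses / trials

      # Stand-in approximation: s_b evaluated with Z truncated to 25 fractional bits,
      # whose error is well under an ulp, so its rounded results remain faithful.
      print(estimate_p(lambda Z: math.log2(1.0 + 2.0 ** (math.floor(Z * 2**25) / 2**25))))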

  17. Unrestricted Faithful p = .25 Non-exactly-representable values round to either of the two possible exact representations

  18. Unrestricted Faithful p = .25 3/4 of the points to the left of the midpoint are rounded to the nearest 1/4 of the points to the left of the midpoint are rounded to the next-nearest

  19. Unrestricted Faithful p = .25 The situation on the right of the midpoint is similar

  20. Restricted Faithful p = .25 Non-exactly-representable values round to one of the two possible exact representations, so that the result is better than floating point (BTFP)

  21. Restricted Faithful p = .25 Values close to the left always round there (to the nearest)

  22. Restricted Faithful p = .25 Values near midpoint can round either way

  23. Restricted Faithful p = .25 Values close to the right always round there (to the nearest)

  24. Linear Interpolator
  [Block diagram: the high bits of Z drive the partitioning logic and index a function ROM and a slope ROM; the slope is multiplied by the low bits of Z and added, with rounding, to the function ROM output to produce f(Z) ≈ sb(Z).]
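
  A behavioural Python sketch of this datapath: the "high bits of Z" select a table entry and the "low bits of Z" are multiplied by the stored slope. A single uniform spacing delta is assumed here for simplicity (the partitioned spacings come later), and the table range and delta are illustrative only:

      import math

      def sb(Z, b=2.0):
          """The LNS addition function s_b(Z) = log_b(1 + b**Z)."""
          return math.log(1.0 + b ** Z, b)

      def make_linear_tables(zmin, zmax, delta):
          """Precompute function and slope ROMs, one entry per interval of width delta."""
          n = int((zmax - zmin) / delta) + 1
          f_rom = [sb(zmin + i * delta) for i in range(n)]
          slope_rom = [(sb(zmin + (i + 1) * delta) - f_rom[i]) / delta for i in range(n)]
          return f_rom, slope_rom

      def linear_interp(Z, zmin, delta, f_rom, slope_rom):
          """f(Z) ~ f_rom[i] + slope_rom[i] * z_low, where i comes from the high bits of Z."""
          i = int((Z - zmin) / delta)            # "high bits of Z": which interval
          z_low = (Z - zmin) - i * delta         # "low bits of Z": offset within the interval
          return f_rom[i] + slope_rom[i] * z_low

      f_rom, slope_rom = make_linear_tables(-8.0, 0.0, 1.0 / 64)
      print(linear_interp(-1.3, -8.0, 1.0 / 64, f_rom, slope_rom), sb(-1.3))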

  25. Quadratic Interpolator
  [Block diagram: as in the linear interpolator, but with an additional quadratic-coefficient ROM and a multiplier that squares the low bits of Z; the function, slope, and quadratic terms are summed with rounding to produce f(Z).]
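
  A corresponding sketch of the quadratic scheme, again with a uniform illustrative spacing. Each interval stores three coefficients (function, slope, and quadratic ROMs), fitted here through the interval's endpoints and midpoint as one plausible degree-2 construction; the coefficient derivation in the cited designs may differ:

      import math

      def sb(Z, b=2.0):                         # the LNS addition function, as before
          return math.log(1.0 + b ** Z, b)

      def make_quadratic_tables(zmin, zmax, delta):
          """Coefficient ROMs for f(Z) ~ c0 + c1*zl + c2*zl**2 on each interval of width delta,
          fitted through the interval's endpoints and midpoint (degree D = 2 Lagrange)."""
          n = int((zmax - zmin) / delta) + 1
          c0, c1, c2 = [], [], []
          for i in range(n):
              z0 = zmin + i * delta
              y0, ym, y1 = sb(z0), sb(z0 + delta / 2), sb(z0 + delta)
              c0.append(y0)                                       # function ROM
              c1.append((4 * ym - y1 - 3 * y0) / delta)           # slope ROM
              c2.append((2 * y1 + 2 * y0 - 4 * ym) / delta ** 2)  # quadratic-coefficient ROM
          return c0, c1, c2

      def quad_interp(Z, zmin, delta, c0, c1, c2):
          i = int((Z - zmin) / delta)            # high bits of Z select the interval
          zl = (Z - zmin) - i * delta            # low bits of Z: offset within the interval
          return c0[i] + c1[i] * zl + c2[i] * zl * zl

      c0, c1, c2 = make_quadratic_tables(-8.0, 0.0, 1.0 / 16)
      print(quad_interp(-1.3, -8.0, 1.0 / 16, c0, c1, c2), sb(-1.3))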

  26. Partition Definitions
  • δ: distance between adjacent tabulated points
  • Interval: width δ, using a particular polynomial approximation
  • D: degree of the polynomial
  • Region: set of interval(s) with the same δ
  • W: width of a region
  • Segment: largest region with the same δ
  • W/δ = the number of words in a region

  27. Interpolator Partitioning: Simple Example
  Z from 0.0 to 4.0 is divided into W = 1 regions: the first two regions use δ = 0.25 (4 words each) and together form a W = 2 segment; the next regions use δ = 0.5 (2 words each) and form another W = 2 segment.
  Memory per region, summarized as the number of words per W = 1 region:
  Z = 0: 4 words; Z = 1: 4 words; Z = 2: 2 words; Z = 3: 2 words; ...
  [Figure: intervals, regions, and segments marked along the axis from 0.0 to 4.0.]
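
  The memory bookkeeping here is just the width of each region divided by its spacing; a tiny sketch using illustrative region widths and spacings like those in the simple example above:

      # Hypothetical partition: (region width W, point spacing delta) pairs, illustrative values only.
      partition = [(1.0, 0.25), (1.0, 0.25), (1.0, 0.5), (1.0, 0.5)]
      words = [int(W / d) for W, d in partition]       # words per region = W / delta
      print(words, "total =", sum(words))              # [4, 4, 2, 2] total = 12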

  28. Choosing Partition Points
  • Minimum δ is determined by: the starting point of the interval; the interpolation method, such as Lagrange; the degree of the polynomial, D; and the (D+1)th derivative of the function (a numerical sketch follows this slide)
  • To minimise memory, partition at multiples of D+1:
    Linear, D = 1: multiples of 2 [Lewis90]
    Quadratic, D = 2: multiples of 3 [Lewis94 and Proposed]
  • To simplify partition hardware, but double the memory:
    Quadratic, D = 2: powers of 2 [Coleman]
  • To simplify partition hardware and keep memory almost optimal:
    Quadratic, D = 2: multiples of 4 [Proposed]
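
  One numerical way to see how the starting point, the degree, and the curvature of the function determine the minimum spacing: shrink a candidate delta until the worst-case error of the degree-2 fit meets a target. The half-ulp target, the power-of-two search, and the sampling density are assumptions for illustration, not the paper's actual procedure:

      import math

      def sb(Z, b=2.0):
          return math.log(1.0 + b ** Z, b)

      def worst_error(z0, delta, samples=200):
          """Worst observed error of the degree-2 Lagrange fit on [z0, z0 + delta],
          sampled at `samples` points (a numerical stand-in for the derivative bound)."""
          y0, ym, y1 = sb(z0), sb(z0 + delta / 2), sb(z0 + delta)
          c1 = (4 * ym - y1 - 3 * y0) / delta
          c2 = (2 * y1 + 2 * y0 - 4 * ym) / delta ** 2
          return max(abs(y0 + c1 * zl + c2 * zl * zl - sb(z0 + zl))
                     for zl in (k * delta / samples for k in range(samples + 1)))

      def min_delta(z0, F=23, start=1.0):
          """Largest power-of-two spacing whose quadratic error stays below half an ulp (2**-F / 2)."""
          target = 0.5 * 2.0 ** -F
          delta = start
          while worst_error(z0, delta) > target:
              delta /= 2.0
          return delta

      print(min_delta(-1.0), min_delta(-6.0))   # intervals farther from Z = 0 tolerate a larger delta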

  29. Memory per Region: F=23 Quadratic Interpolation
  Z     Proposed (mult. of 3)   Proposed (mult. of 4)   Lewis   Coleman
  0            32                      32                128      256
  1            32                      32                128      256
  2            32                      32                128      128
  3            32                      32                 64      128
  4            16                      16                 64       64
  5            16                      16                 64       64
  6            16                      16                 32       64
  7             8                      16                 32       64
  8             8                       8                 32       32
  9             8                       8                 16       32
  10            4                       8                 16       32
  11            4                       8                 16       32
  12            4                       4                  8       32
  13            2                       4                  8       32
  14            2                       4                  8       32
  15            2                       4                  4       32
  16            2                       2                  4       16
  17            2                       2                  4       16
  18            2                       2                  2       16
  ...         ...                     ...                ...      ...
  Total       234                     256                768     1536

  30. Memory per Region: F=23 Quadratic Interpolation (continued)
  (Same table as the previous slide.)
  The power-of-2 method is inefficient compared to multiple-of-3 because each power-of-2 partition has to take the largest word count from the Lewis table within that power-of-two segment.

  31. Memory per Region: F=23 Quadratic Interpolation (continued)
  (Same table as the previous slide.)
  This doesn't fit the multiple-of-3 pattern, so the proposed design uses multiple-of-4 partitioning, with the first seven regions the same as in the multiple-of-3 scheme.

  32. Next-Nearest Probability: Lewis' Restricted Faithful Interpolator
  [Plot: probability of a next-nearest result versus zend, multiple-of-3 partitioning. Average p = 0.0032.]

  33. Next-Nearest Probability: Coleman Restricted Faithful Interpolator
  [Plot: probability of a next-nearest result versus zend, multiple-of-3 partitioning. Average p = 0.0006.]

  34. Next-Nearest Probability: Proposed Unrestricted Faithful Interpolator
  [Plot: probability of a next-nearest result versus zend, multiple-of-3 partitioning. Average p = 0.074.]

  35. Next-Nearest Probability: Proposed Unrestricted Faithful Interpolator
  [Plot: probability of a next-nearest result versus zend, multiple-of-4 partitioning. Average p = 0.039.]

  36. Effect of Partitioning for Quadratic Interpolation
  • Partitioning method interacts with rounding mode
  • Multiple of 3: δ increases when z increases by 3
  • Multiple of 4: δ increases when z increases by 4
  • Power of 2: δ increases when z doubles

  Who              Partitioning    Rounding                Words   Probability
  Proposed         Multiple of 3   Unrestricted Faithful     234   0.074
  Proposed         Multiple of 4   Unrestricted Faithful     256   0.039
  Lewis            Multiple of 3   Restricted Faithful       768   0.0032
  Coleman et al.   Power of 2      Restricted Faithful      1500   0.00063

  37. Conclusions
  • Round to nearest is essentially impossible for an F=23 quadratic LNS interpolator.
  • Restricted faithful rounding has a low probability (< 0.003) of returning the next-nearest value.
  • Restricted faithful rounding costs too much memory: 768 words with multiple-of-3 partitioning, 1500 words with power-of-2 partitioning.

  38. Conclusions (continued)
  • Unrestricted faithful rounding increases the next-nearest probability only slightly: p < 0.07 with multiple-of-3 partitioning, p < 0.04 with multiple-of-4.
  • A previous FFT study suggests p < 0.12 is acceptable.
  • Unrestricted faithful rounding reduces the memory cost 3- to 6-fold: 234 words with multiple-of-3 partitioning, 256 words with multiple-of-4.
