Improved Cotransformation for Logarithmic Number System (LNS) Subtraction

  1. Improved Cotransformation for Logarithmic Number System (LNS) Subtraction
     Mark Arnold, University of Manchester Institute of Science and Technology

  2. Outline
     • Advantages of Logarithmic Number Systems (LNS)
     • LNS addition with interpolation of $s_b$
     • Subtraction and $d_b$
     • Cotransformation overcomes the singularity of $d_b$
     • Coleman’s cotransformation
     • Arnold’s cotransformation
     • New cotransformation
     • Simulation results
     • Conclusions

  3. Arithmetic Choices
     • Fixed-point (FX)
       • Scaled integer: manual rescale after multiply
       • Hard to design, but a common choice for cost-sensitive applications
     • Floating-point IEEE-754 (FP)
       • Exponent provides automatic scaling for the mantissa
       • Easier to use but more expensive
     • Logarithmic Number System (LNS)
       • Converts to logarithms once, then keeps values as logarithms during computation
       • As easy to use as FP; can be faster, cheaper, and lower power than FX

  4. Advantages of LNS
     • Cheaper multiply, divide, square root
     • Good for applications with a high proportion of multiplications
     [Figure: slide-rule-style logarithmic scales showing log(2) + log(3) = log(6)]
     • Most significant bits change less frequently: power savings
     • Table-based arithmetic is ideal for FPGAs
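The "cheaper multiply, divide, square root" point can be illustrated with a minimal Python sketch. It uses floating-point logarithms as a stand-in for a fixed-point hardware representation; base 2, the function names, and the restriction to positive values (no sign or zero handling) are assumptions made for illustration only.

```python
import math

def to_lns(X):
    """Convert a positive real X to its base-2 logarithm x (assumed encoding)."""
    return math.log2(X)

def from_lns(x):
    """Convert a logarithmic representation x back to the real value X."""
    return 2.0 ** x

def lns_mul(x, y):
    """Multiplication of real values is addition of their logarithms."""
    return x + y

def lns_div(x, y):
    """Division of real values is subtraction of their logarithms."""
    return x - y

def lns_sqrt(x):
    """Square root halves the logarithm (a one-bit shift in fixed point)."""
    return x / 2
```

For example, `from_lns(lns_mul(to_lns(2.0), to_lns(3.0)))` recovers a value close to 6, which is the log(2) + log(3) = log(6) relationship from the slide.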

  12. Commercial and Practical Interest in LNS
      • Motorola: 120 MHz LNS 1 GFLOP chip for satellite
      • European Union: LNS microprocessor (Coleman, 800K €)
      • Yamaha: music synthesizer
      • Boeing: aircraft controls
      • Interactive Machines, Inc.: IMI-500, animation for Jay Jay the Jet Plane
      • Advanced Rendering Technologies: hardware ray-tracing engine
      • Cambridge/Microsoft: HTK Hidden Markov Model Toolkit
      • Univ. of Tokyo: N-body gravity pipeline (GRAPE), won the 1999 Gordon Bell Prize

  15. Notation
      • Upper-case variables (e.g., $X$) = real value
      • Lower-case variables (e.g., $x$) = corresponding logarithmic representation
      • $b$ = base of the logarithm ($b = 2$ is typical)

  27. LNS Addition
      Given $x = \log_b(X)$ and $y = \log_b(Y)$:
      1. Let $z = x - y$
      2. Look up $s_b(z) = \log_b(1 + b^z)$
      3. $t = y + s_b(z)$
      Why it works:
      1. $z = \log_b(X/Y)$
      2. $s_b(z) = \log_b(1 + X/Y)$
      3. $t = \log_b(Y(1 + X/Y))$
      Thus, $t = \log_b(Y + X)$
      Hardware: 1 subtractor, 1 function approximation unit, 1 adder
      History: Leonelli 1803; Gauss 1812; Matula and Marasa 1969; Kingsbury and Rayner 1971; Swartzlander et al. 1975; Lee and Edgar 1977; Barlow and Bareiss 1985
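The three addition steps can be sketched in Python as follows. This is an illustrative model only: `s_b` is evaluated directly with floating-point math rather than by the table or interpolator a hardware unit would use, and the function names and the default base 2 are assumptions.

```python
import math

def s_b(z, b=2.0):
    """Addition function s_b(z) = log_b(1 + b**z).

    In hardware this would be a function approximation unit;
    here it is computed directly for illustration.
    """
    return math.log(1.0 + b ** z, b)

def lns_add(x, y, b=2.0):
    """Return t = log_b(X + Y) given x = log_b(X) and y = log_b(Y)."""
    z = x - y              # step 1: one subtractor
    return y + s_b(z, b)   # steps 2 and 3: lookup, then one adder
```

With `x = math.log2(3)` and `y = math.log2(5)`, `lns_add(x, y)` is close to 3, since 3 + 5 = 8 = 2^3.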

  28. Plot of $s_b(z)$
      [Figure: curve $y = s_b(z)$ plotted against $z$, together with the line $y = z$]

  31. Ways to reduce $s_b$ table size:
      • Don’t tabulate for positive $z$ (only tabulate for $z < 0$): $s_b(z) = s_b(-z) + z$
      • Don’t tabulate for large $|z|$: $s_b(-z) \approx 0$ if $z >$ precision
      • Interpolate from a smaller table: cuts the number of address bits in half
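The first two table-reduction ideas can be sketched as below. The direct evaluation of `s2` stands in for a table lookup, and the cutoff `frac_bits + 2` is an assumption chosen so that the discarded value is below half of the least significant bit for base 2; an actual design would pick its own cutoff.

```python
import math

def s2(z):
    """Reference s_2(z) = log2(1 + 2**z); stands in for the table."""
    return math.log2(1.0 + 2.0 ** z)

def s2_reduced(z, frac_bits=10):
    """Evaluate s_2(z) while only 'tabulating' a limited range of z <= 0.

    frac_bits is a hypothetical number of fractional bits of precision.
    """
    if z > 0:
        # Symmetry: s_b(z) = s_b(-z) + z, so positive z needs no table.
        return z + s2_reduced(-z, frac_bits)
    if -z > frac_bits + 2:
        # For large |z| the result is below the assumed precision.
        return 0.0
    return s2(z)  # the only range that would actually be stored
```

With `frac_bits=10`, `s2_reduced(20.0)` returns exactly 20.0, which differs from the true value by less than 2^-11.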

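  32. Linear Interpolator
      [Figure: block diagram of a linear interpolator. The high bits of $z$ address a function ROM and a slope ROM; the slope is multiplied by the low bits of $z$ and added, with rounding, to the function ROM output, giving an approximation of $f(z)$.]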
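A minimal software model of such an interpolator for $s_2$ is sketched below. The table range, the spacing `STEP`, and the use of secant slopes between table points are all assumptions for illustration; rounding to a fixed-point word is omitted.

```python
import math

Z_MIN = -16.0      # assumed table range is [Z_MIN, 0)
STEP = 2.0 ** -4   # assumed spacing selected by the high bits of z
N = int(-Z_MIN / STEP)

def s2(z):
    """Reference s_2(z) = log2(1 + 2**z)."""
    return math.log2(1.0 + 2.0 ** z)

# "Function ROM": s_2 sampled at the left end of each interval.
FUNC_ROM = [s2(Z_MIN + i * STEP) for i in range(N)]
# "Slope ROM": secant slope across each interval.
SLOPE_ROM = [(s2(Z_MIN + (i + 1) * STEP) - FUNC_ROM[i]) / STEP
             for i in range(N)]

def interp_s2(z):
    """Approximate s_2(z) for Z_MIN <= z < 0 by linear interpolation."""
    i = min(int((z - Z_MIN) / STEP), N - 1)  # high bits of z select the entry
    z_low = z - (Z_MIN + i * STEP)           # low bits of z
    return FUNC_ROM[i] + SLOPE_ROM[i] * z_low
```

With these assumed parameters the interpolation error stays below 1e-4 across the range, using 256 entries per ROM instead of one entry per representable $z$.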

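  35. Subtraction
      • Similar to addition, except that it uses $d_b(z) = \log_b|1 - b^z|$ instead of $s_b(z)$
      • $d_b$ is harder to interpolate due to the singularity near $z = 0$
      [Figure: plot of $y = s_b(z)$ and $y = d_b(z)$ against $z$, with the line $y = z$; the singularity of $d_b$ at $z = 0$ is marked]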
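The subtraction analogue of the addition steps, and the behaviour of $d_b$ near $z = 0$, can be sketched as follows. As before, this is a floating-point illustration with assumed function names and base 2, not a hardware description.

```python
import math

def d_b(z, b=2.0):
    """Subtraction function d_b(z) = log_b|1 - b**z|.

    Undefined at z = 0, where |1 - b**z| is zero and math.log
    raises ValueError; the value tends to minus infinity nearby.
    """
    return math.log(abs(1.0 - b ** z), b)

def lns_sub(x, y, b=2.0):
    """Return t = log_b|Y - X| given x = log_b(X) and y = log_b(Y)."""
    z = x - y
    return y + d_b(z, b)
```

For instance, `d_b(-2**-10)` is roughly -10.5 and `d_b(-2**-20)` is roughly -20.5: the function changes steeply as $z$ approaches 0, which is why a table of fixed spacing interpolates it poorly there.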

  39. Prior solutions for $d_b$ singularity
      1. Partition the range of $z$ non-uniformly [lew90]

         Precision     15     17     19     21     23
         $s_b$ bits    0.3K   1K     2K     5K     10K
         $d_b$ bits    1K     4K     10K    24K    60K

         • Problem: $d_b$ takes most of the ROM
      2. Use Arnold’s cotransformation [arn97] to convert $d_b$ to $s_b$
         • $d_b(z_H + z_L) = d_b(z_L) + s_b(z_L + d_b(z_H) - d_b(z_L))$, where $z_H > 0$ and $z_L > 0$
         • $d_b(z_H)$ and $d_b(z_L)$ are in tables
         • Problem: doesn’t work with $z < 0$, as needed for table reduction
      3. Use Coleman’s cotransformation [col95] to convert $d_b$ away from 0
         • $d_b(z_H + z_L) = d_b(z_L) + d_b(z_L + d_b(z_H) - d_b(z_L))$, where $z_H < 0$ and $z_L > 0$
         • Problem: not as accurate as Arnold’s
         • Needs a $d_b$ interpolator rather than $s_b$
         • Needs more input guard bits to the interpolator
      This talk reviews cotransformation and proposes a new cotransformation that overcomes these problems.
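Arnold's cotransformation from item 2 can be checked numerically with a short sketch. Here `d2` and `s2` are computed directly in floating point; in the scheme described on the slide, $d_b(z_H)$ and $d_b(z_L)$ would come from tables and only $s_b$ would be interpolated. The function names and the sample split of $z$ are assumptions.

```python
import math

def s2(z):
    """s_2(z) = log2(1 + 2**z)."""
    return math.log2(1.0 + 2.0 ** z)

def d2(z):
    """d_2(z) = log2|1 - 2**z|, undefined at z = 0."""
    return math.log2(abs(1.0 - 2.0 ** z))

def d2_arnold(z_h, z_l):
    """d_2(z_h + z_l) for z_h > 0 and z_l > 0 via Arnold's cotransformation."""
    return d2(z_l) + s2(z_l + d2(z_h) - d2(z_l))
```

Comparing `d2_arnold(0.5, 2**-6)` with `d2(0.5 + 2**-6)` shows agreement to within floating-point error, even though the final function evaluated is $s_2$ rather than $d_2$.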

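  50. Review of Prior Cotransformations
      Choose $z_1$ and $z_2$ such that $z = z_1 + z_2$:
      $Z_1 = b^{z_1}$, $Z_2 = b^{z_2}$
      $Z = Z_1 Z_2 = b^{z_1} b^{z_2} = b^{z_1 + z_2} = b^z$
      $T = b^t = |1 - Z| = |1 - Z_1 Z_2|$
      Coleman [col95] assumes that $Z_1 < 1$ and $Z_2 > 1$, so $z_1 < 0$ and $z_2 > 0$:
      $t = d_b(z_1) + d_b(z_1 + d_b(z_2) - d_b(z_1))$
      Recall that $d_b(z) = \log_b|1 - b^z|$, so raising $b$ to the power of both sides gives:
      $T = |1 - Z_1| \cdot \left|1 - \frac{Z_1 \, |Z_2 - 1|}{|1 - Z_1|}\right|$
      With $Z_1 < 1$, $Z_2 > 1$ and $Z = Z_1 Z_2 < 1$, the absolute values can be dropped:
      $T = (1 - Z_1) \left(1 - \frac{Z_1 (Z_2 - 1)}{1 - Z_1}\right)$
      $= 1 - Z_1 - Z_1 Z_2 + Z_1 = 1 - Z_1 Z_2 = 1 - b^z$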
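The Coleman identity derived above can be confirmed numerically with a small sketch. The function names are hypothetical and everything is evaluated in floating point with base 2; the sample values satisfy $z_1 < 0$, $z_2 > 0$ and $z_1 + z_2 < 0$.

```python
import math

def d2(z):
    """d_2(z) = log2|1 - 2**z|, undefined at z = 0."""
    return math.log2(abs(1.0 - 2.0 ** z))

def d2_coleman(z1, z2):
    """d_2(z1 + z2) for z1 < 0 and z2 > 0 via Coleman's cotransformation."""
    return d2(z1) + d2(z1 + d2(z2) - d2(z1))
```

For $z_1 = -0.5$ and $z_2 = 0.25$, the inner argument is about -1.13, well away from the singularity, and `d2_coleman(-0.5, 0.25)` agrees with `d2(-0.25)` (about -2.65) to within floating-point error.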
