Unrestricted Faithful Rounding is Good Enough for Some LNS
Applications
Mark Arnold
Colin Walter
University of Manchester Institute
of Science and Technology
Why choose Logarithmic Number Systems (LNS)?
Floating Point versus LNS
Round to Nearest is Hard
Restricted versus Unrestricted Faithful Rounding
FFT simulation
Multimedia Example
Conclusions
Outline
Fixed-point (FX): scaled integers, with manual rescaling after each multiply. Hard to design, but a common choice for cost-sensitive applications.
Floating-point IEEE-754 (FP): the exponent provides automatic scaling for the mantissa. Easier to use, but more expensive.
Logarithmic Number System (LNS): convert to logarithms once and keep values as logs during computation. As easy as FP; can be faster, cheaper, and lower power than FX.
Arithmetic Choices
Cheaper multiply, divide, square root
Good for applications with high proportion of multiplications
Multimedia example coming
Advantages of LNS
[Figure: two slide-rule scales, each marked 1 through 9 twice, showing that adding the lengths log(2) and log(3) yields log(6)]
Most significant bits change less frequently: power savings
Motorola: 120MHz LNS 1GFLOP chip [pan99]
European Union: LNS microprocessor [col00]
Yamaha: Music Synthesizer [kah98]
Boeing: Aircraft controls
Interactive Machines, Inc.: IMI-500, animation for Jay Jay the Jet Plane
Commercial Interest in LNS
Notation
x = real values,
X = corresponding logarithmic representations
b = base of the logarithm (b=2 is typical)
F = precision
Δ = 2^-F, the step between adjacent representations; b^Δ is thus the smallest representable value > 1.0
if x is an exact LNS value, x+ is the next larger exact value
X = log_b(x) is the LNS representation of x; X and X+ are the representations corresponding to x and x+
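To make the notation concrete, here is a tiny round-to-nearest encoder and decoder sketch (the function names to_lns/from_lns are mine, not from the talk):

```python
import math

# Sketch of the notation above: F fractional bits in the log domain,
# base b = 2, step Delta = 2**-F. Function names are my own.
def to_lns(x, F=8, b=2.0):
    """Round-to-nearest LNS representation X = log_b(x), quantized to F bits."""
    delta = 2.0 ** -F
    return round(math.log(x, b) / delta) * delta

def from_lns(X, b=2.0):
    """Recover the real value represented by X."""
    return b ** X

X = to_lns(3.0)
print(X, from_lns(X))   # X is a multiple of 2**-8; b**X is close to 3.0
```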
Given X = log_b(x) and Y = log_b(y):
1. Let Z = X - Y
2. Look up sb(Z) = log_b(1 + b^Z)
3. T = Y + sb(Z)
Why it works:
1. Z = log_b(x/y)
2. sb(Z) = log_b(1 + x/y)
3. T = log_b(y·(1 + x/y))
Thus, T = log_b(y + x)
Hardware:
1 subtractor
1 function approximation unit (lookup table in ROM or RAM for F < 12; interpolation for higher precision)
1 adder
Similar function, db, for subtraction
LNS Addition
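The three steps above translate directly into code. A minimal sketch for base b = 2, with sb computed on the fly rather than read from a table (a real implementation would use a ROM/RAM lookup):

```python
import math

# Sketch of the LNS addition algorithm above, base b = 2.
def sb(Z, b=2.0):
    return math.log(1.0 + b ** Z, b)       # sb(Z) = log_b(1 + b^Z)

def lns_add(X, Y, b=2.0):
    """Given X = log_b(x) and Y = log_b(y), return T = log_b(x + y)."""
    Z = X - Y                              # step 1: Z = log_b(x/y)
    return Y + sb(Z, b)                    # steps 2-3: T = Y + sb(Z)

X, Y = math.log2(6.0), math.log2(2.0)
T = lns_add(X, Y)
print(2.0 ** T)                            # ~8.0, since 6 + 2 = 8
```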
[Figure: stored-value linear interpolation datapath. Z splits into high bits ZH and low bits ZL; a dual-port memory supplies the two adjacent stored values sb(ZH) and sb(ZH+), and a multiply-add combines them to produce the approximation sb(Z) + ε]
Stored Value Linear Interpolation
Virtex FPGA has dual-port block RAM
Can obtain adjacent tabulated points in one cycle
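A software sketch of this scheme, with the dual-port memory reads played by two direct evaluations of sb (the table spacing 2**-H and the helper names are my choices):

```python
import math

# Sketch of stored-value linear interpolation for sb: Z splits into high
# bits ZH (the table index) and low bits ZL (the offset); a dual-port
# memory would supply sb(ZH) and the next stored point in one cycle.
def sb(Z, b=2.0):
    return math.log(1.0 + b ** Z, b)

def sb_interp(Z, H=6, b=2.0):
    """Linear interpolation of sb(Z) from points spaced 2**-H apart."""
    step = 2.0 ** -H
    ZH = math.floor(Z / step) * step       # high part: tabulated point <= Z
    ZL = Z - ZH                            # low part: offset into the interval
    y0, y1 = sb(ZH, b), sb(ZH + step, b)   # the two "dual-port" reads
    return y0 + ZL * (y1 - y0) / step

Z = -1.3
print(sb(Z), sb_interp(Z))                 # the interpolation error is small
```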
Floating Point versus LNS
[Figure: the exactly representable points between 1.0 and 4.0 for precision F = 2, floating point on one axis and LNS on the other]
Exactly representable points shown for precision F = 2
Floating point has greater relative error here (just above each power of two) than LNS has here
Discrete change in distance between representable values in floating point causes wobble in relative precision
Continuous change in distance between representable values in LNS means constant relative precision
Focus analysis between 1.0 and b^Δ; all other cases are analogous
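The wobble and the constant relative precision claimed above can be checked numerically. A small sketch for F = 2 (helper names are mine, and the FP model is just a 2**-F-grained mantissa):

```python
import math

# Numeric check of the claim above: the relative gap to the next
# representable value wobbles for floating point but is constant for LNS.
F = 2

def fp_next_gap(x):
    """Gap to the next representable value with a 2**-F-grained mantissa."""
    e = math.floor(math.log2(x))
    return 2.0 ** (e - F)                  # mantissa ulp scales with the binade

def lns_next_gap(x):
    delta = 2.0 ** -F
    return x * (2.0 ** delta - 1.0)        # next LNS value is x * b**delta

for x in (1.0, 1.9, 2.0, 3.9):
    print(x, fp_next_gap(x) / x, lns_next_gap(x) / x)
# The FP relative gap jumps when x crosses 2.0; the LNS column is the
# same constant (2**0.25 - 1) everywhere.
```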
Lewis’ Observation: Round-to-Nearest LNS error is ln(2) ≈ 0.69 times that of FP, leaving margin for interpolation error while still being BTFP (better than floating point)
Rounding Modes
Round to Nearest: prescribed by IEEE-754 for Floating Point (FP); affordable for FP at any precision; economical for LNS only at low precision (F < 12)
Unrestricted Faithful
Restricted Faithful
Non-exactly-representable values round to the nearest of the two possible exact representations
Round to Nearest
All values on the left, no matter how close to the midpoint, round to this representation
Table Makers’ Dilemma
Need interpolation of sb and db for moderate to high precision; some results will be hard to round to nearest, and doing so would cost much more memory
Relax the rounding requirements: faithful rounding chooses one of the two closest points, and allowing more next-nearest points decreases the memory requirements for sb
Table Requirements for 32-bit Addition (words):
Proposed, Unrestricted Faithful: 234
Lewis, Restricted Faithful: 768
Coleman et al., Restricted Faithful: 1500
Swartzlander, Round to Nearest: 2^28
Faithful Rounding Modes
Unrestricted Faithful: allowed by the Brown Model for Floating Point; proposed here as “good enough” for some LNS apps; cuts LNS memory size 3- to 6-fold vs. Restricted
Restricted Faithful: higher LNS precision (F = 23) than Round-to-Nearest FP; “Better than Floating Point” (BTFP) in the worst case; like Round to Nearest except near the midpoint
Probabilistic Model
p = probability faithful result does not round to the nearest
Round to Nearest: p = 0 (Restricted or Unrestricted)
Restricted Faithful: 0 < p < .443
Unrestricted Faithful: 0 < p < 1.0
ε = distance (in the log domain) from the midpoint within which rounding either way is permitted
p = ε / (Δ/2) = 1/ln(2) - 1 ≈ .443 is the maximum probability allowed for BTFP; p = .443 acts like FP
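The parameter p can be modeled in a few lines: with probability p the rounder returns the next-nearest of the two neighbouring representations. A minimal statistical sketch of unrestricted faithful rounding (function and parameter names are mine):

```python
import math
import random

# Toy model of the parameter p above: a faithful rounder that, with
# probability p, returns the next-nearest of the two neighbouring
# representable values (unrestricted faithful; p = 0 is round to nearest).
def faithful_round(X, F=8, p=0.25, rng=random):
    delta = 2.0 ** -F
    lo = math.floor(X / delta) * delta         # the two faithful candidates
    hi = lo + delta
    nearest, other = (lo, hi) if X - lo <= hi - X else (hi, lo)
    return other if rng.random() < p else nearest

random.seed(0)
F, p, trials = 8, 0.25, 10_000
delta = 2.0 ** -F
misses = 0
for _ in range(trials):
    X = random.uniform(0.0, 1.0)
    misses += faithful_round(X, F, p) != round(X / delta) * delta
print(misses / trials)                         # close to p = 0.25
```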
Non-exactly-representable values round to either of the two possible exact representations
Unrestricted Faithful p = .25
On average, 3 out of 4 green points will round to the nearest because p=.25
Unrestricted Faithful p = .25
About 1/4 of the time, the blue point rounds here to a less accurate representation
Values slightly left of the midpoint (blue) generally round here
Unrestricted Faithful p = .25
3/4 of the points to the left of the midpoint are rounded to the nearest
1/4 of the points to the left of the midpoint are rounded to the next-nearest
Unrestricted Faithful p = .25
An occasional purple point rounds to the next-nearest representation
Most purple points round here
Unrestricted Faithful p = .25
Non-exactly-representable values round to one of the two possible exact representations, chosen so that the result is better than floating point (BTFP)
Restricted Faithful p = .25
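Restricted faithful rounding can be sketched similarly: outside the midpoint region of half-width ε the result must be the nearest point, and inside it either neighbour is allowed. The names and the uniform choice inside the region are my own:

```python
import math
import random

# Sketch of restricted faithful rounding: the next-nearest value may be
# returned only within eps of the midpoint, which keeps the worst-case
# log-domain error at Delta/2 + eps (the BTFP bound). Names are mine.
def restricted_round(X, F=8, eps_frac=0.443, rng=random):
    delta = 2.0 ** -F
    eps = eps_frac * delta / 2.0               # midpoint region half-width
    lo = math.floor(X / delta) * delta
    hi = lo + delta
    mid = lo + delta / 2.0
    if abs(X - mid) <= eps:                    # near midpoint: either is fine
        return rng.choice((lo, hi))
    return lo if X < mid else hi               # otherwise: round to nearest

random.seed(2)
delta = 2.0 ** -8
X = 5 * delta + 0.1 * delta                    # well below the midpoint
print(restricted_round(X))                     # always rounds to 5 * delta
```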
Values slightly left of the midpoint (blue) generally round here
More than 1/4 of the blue points round here (adjusted for the width of the midpoint region, ε)
Restricted Faithful p = .25
[BTFP analysis (Figure 1, p. 246): logarithmic representations 0, Δ/2, Δ/2 + ε, and Δ correspond to real values 1.0, b^(Δ/2) ≈ 1 + log_e(2)·Δ/2, b^(Δ/2+ε) ≈ 1 + log_e(2)·(Δ/2 + ε), and b^Δ; only one half of the interval is shown, the other half being symmetrical]
Simulation
A 2^n-point radix-two FFT: n·2^n complex MACs, hence 4n·2^n LNS additions and subtractions; error in the first stage can propagate to all results
Compare F-bit arithmetics against 64-bit Floating Point
Input: 2^n-point 25%-duty square wave + white noise
For 5 < n < 10 and 17 < F < 26, note RMS errors:
Ep = RMS error for Restricted rounding
Up = RMS error for Unrestricted rounding
E = RMS error for comparable round-to-nearest FP (the normalizing baseline of the ratios Ep/E and Up/E)
Re-run 250 times
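As a much smaller stand-in for the FFT experiment, the sketch below sums 64 random values using LNS additions whose results pass through a faithful rounder with miss probability p, then reports RMS error against double precision. Every parameter here (F = 12, 64 values, 250 reruns) is illustrative only, not the paper's setup, and sb is computed rather than tabulated:

```python
import math
import random

# Illustrative stand-in for the FFT simulation: accumulate faithfully
# rounded LNS additions and measure RMS error against double precision.
def sb(Z):
    return math.log2(1.0 + 2.0 ** Z)            # sb(Z) = log2(1 + 2^Z)

def faithful(X, F, p, rng):
    delta = 2.0 ** -F
    lo = math.floor(X / delta) * delta
    hi = lo + delta
    nearest, other = (lo, hi) if X - lo <= hi - X else (hi, lo)
    return other if rng.random() < p else nearest

def rms_error(F=12, p=0.0, runs=250, n=64, seed=1):
    rng = random.Random(seed)
    err2 = 0.0
    for _ in range(runs):
        xs = [rng.uniform(0.5, 2.0) for _ in range(n)]
        X = faithful(math.log2(xs[0]), F, p, rng)
        for x in xs[1:]:
            Y = faithful(math.log2(x), F, p, rng)
            # LNS add: max(X, Y) + sb(-|X - Y|), then round the result
            X = faithful(max(X, Y) + sb(-abs(X - Y)), F, p, rng)
        err2 += (2.0 ** X - sum(xs)) ** 2
    return math.sqrt(err2 / runs)

print(rms_error(p=0.0), rms_error(p=0.5))       # RMS error grows with p
```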
Simulation: Ep/E, n = 11
[Plot: Ep/E versus p for p from .01 to .06, vertical axis from 0.70 to 0.90; fitted trend Ep/E ≈ 0.71 + 0.73·p]
Simulation: Up/E, n = 11
[Plot: Up/E versus p for p from .01 to .06, vertical axis from 0.70 to 0.90; fitted trend Up/E ≈ 0.72 + 2.25·p]
64-bit Floating Point, Round to Nearest (p = 0)
14-bit Fixed Point Round to Nearest
MPEG Frame: Conventional Arithmetics
8 x 8 Inverse Discrete Cosine Transforms
11-bit LNS, Round to Nearest (p = 0)
11-bit LNS Unrestricted Faithful (p=.5)
MPEG Frame: LNS Arithmetic
8 x 8 Inverse Discrete Cosine Transforms
10-bit LNS, Round to Nearest (p = 0)
10-bit LNS Unrestricted Faithful (p=.5)
MPEG Frame: LNS Arithmetic
8 x 8 Inverse Discrete Cosine Transforms