Digital Design: Floating-Point Number

CS3104: Principles of Digital Design (數位系統導論)

[Project 2] Floating-Point Number Addition

Professor 吳中浩, Teaching Assistant 高鵬程

Department of Computer Science, National Tsing Hua University (國立清華大學資訊工程學系), academic year 87, first semester


Scientific Notation

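Examples: $6.02 \times 10^{23}$ and $1.673 \times 10^{-24}$

Parts of a number in scientific notation, using $6.02 \times 10^{23}$:

  Mantissa: 6.02 (sign, magnitude), which contains the decimal point
  Radix (base): 10
  Exponent: 23 (sign, magnitude)

IEEE F.P. (SP): $1.M \times 2^{e-127}$, where e here is the stored exponent field (written E on the following slides)

Issues:

  Arithmetic (+, -, *, /)
  Representation, normal form
  Range and precision
  Rounding
  Exceptions (e.g., divide by zero, overflow, underflow)
  Errors
  Properties (negation, inversion, if A = B then A - B = 0)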


Floating-Point Representation

Representation of floating-point numbers in the IEEE 754 standard, single precision:

  Field   Width (bits)   Meaning
  S       1              sign
  E       8              exponent: excess-127 binary integer; the actual exponent is e = E - 127
  M       23             mantissa: sign + magnitude, normalized binary significand with hidden integer bit: 1.M

$N = (-1)^{S} \times 2^{E-127} \times (1.M)$, with $0 < E < 255$

Magnitude of numbers that can be represented is in the range

$2^{-126} \times (1.0)$ to $2^{127} \times (2 - 2^{-23})$

which is approximately

$1.18 \times 10^{-38}$ to $3.40 \times 10^{38}$

(Integer comparison is valid on IEEE floating-point numbers of the same sign!)
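The field layout and the formula for N can be checked with a short Python sketch. The helper name `decode_single` is hypothetical, and the sketch assumes only normalized numbers (0 < E < 255); it uses the standard `struct` module to obtain the 32-bit pattern.

```python
import struct

def decode_single(x: float):
    """Split x (as a single-precision value) into S, E, M and rebuild N."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31              # 1-bit sign
    e = (bits >> 23) & 0xFF     # 8-bit excess-127 exponent
    m = bits & 0x7FFFFF         # 23-bit mantissa (hidden 1 not stored)
    if not 0 < e < 255:
        raise ValueError("only normalized numbers (0 < E < 255) are handled")
    # N = (-1)^S * 2^(E-127) * (1.M)
    n = (-1) ** s * 2.0 ** (e - 127) * (1 + m / 2 ** 23)
    return s, e, m, n

smallest = 2.0 ** -126                      # 2^-126 * (1.0)
largest = 2.0 ** 127 * (2 - 2.0 ** -23)     # 2^127 * (2 - 2^-23)
print(decode_single(smallest))
print(decode_single(largest))
```

The two printed tuples show E = 1 with M = 0 for the smallest normalized magnitude, and E = 254 with all mantissa bits set for the largest.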


Floating-Point Representation

Examples:

-0.75 = -0.11 x 2**0 = -1.1 x 2**(-1) = 1 01111110 10000…

i.e. $(-1)^{1} \times (1 + .1000\,0000\,0000\,0000\,0000\,000) \times 2^{126-127}$

0 = 0 00000000 0 . . . 0

-1.5 = 1 01111111 10 . . . 0
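The three example encodings can be reproduced with a small Python sketch. The function name `single_fields` is an assumption made for illustration; it prints the sign, exponent, and mantissa fields separated by spaces.

```python
import struct

def single_fields(x: float) -> str:
    """Return the single-precision pattern of x as 'S EEEEEEEE MMM...M'."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    b = format(bits, "032b")
    return f"{b[0]} {b[1:9]} {b[9:]}"

for value in (-0.75, 0.0, -1.5):
    print(value, "=", single_fields(value))
```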


Floating-Point Addition

Basic addition algorithm:

Illustrate: $9.999_{ten} \times 10^{1} + 1.610_{ten} \times 10^{-1}$

Assume 4 decimal digits for the significand and two decimal digits for the exponent.

STEP 1: compute $Y_e - X_e$ (getting ready to align the binary point)

STEP 2: right shift the smaller number, e.g., $X_m$, that many positions to form $X_m \times 2^{X_e - Y_e}$

  EX: $1.610_{ten} \times 10^{-1} = 0.1610_{ten} \times 10^{0} = 0.01610_{ten} \times 10^{1}$

  Since we can represent only four decimal digits, the number is really $0.016_{ten} \times 10^{1}$

STEP 3: compute $X_m \times 2^{X_e - Y_e} + Y_m$

  EX:    9.999ten
       + 0.016ten
       ----------
        10.015ten

  The sum is $10.015_{ten} \times 10^{1}$


Floating-Point Addition

Basic addition algorithm (continued):

If the representation demands normalization, then normalize.

STEP 4: left shift the result and decrement the result exponent (e.g., 0.001xx), or right shift the result and increment the result exponent (e.g., 101.1xx); continue until the MSB of the data is 1 (NOTE: hidden bit in the IEEE standard)

  EX: we pick the normalized form: $10.015_{ten} \times 10^{1} = 1.0015_{ten} \times 10^{2}$

  NOTE: check for overflow or underflow during the shift

STEP 5: round the mantissa

  EX: we must round the number: $1.0015_{ten} \times 10^{2} \approx 1.002_{ten} \times 10^{2}$
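The five steps can be traced on the decimal example with a minimal Python sketch. It makes several assumptions that the slides do not state: significands are held as positive integers (9999 stands for 9.999), digits shifted out during alignment are simply dropped (as in the 0.016 example), and the final rounding is round-half-up. The function name `add_decimal_fp` is hypothetical.

```python
def add_decimal_fp(xm: int, xe: int, ym: int, ye: int, digits: int = 4):
    """Add xm*10**xe and ym*10**ye; return (significand, exponent)."""
    # Make X the operand with the smaller exponent.
    if xe > ye:
        xm, xe, ym, ye = ym, ye, xm, xe
    shift = ye - xe                 # STEP 1: exponent difference
    xm //= 10 ** shift              # STEP 2: right shift, dropping digits
    total, exp = xm + ym, ye        # STEP 3: add aligned significands
    if total >= 10 ** digits:       # STEP 4: carry out, so shift right once
        total, dropped = divmod(total, 10)
        exp += 1
        if dropped >= 5:            # STEP 5: round half up (assumption)
            total += 1
        if total == 10 ** digits:   # rounding overflowed: renormalize
            total //= 10
            exp += 1
    return total, exp

# 9.999 x 10^1 + 1.610 x 10^-1
print(add_decimal_fp(9999, 1, 1610, -1))  # → (1002, 2), i.e. 1.002 x 10^2
```

Only the right-shift case of normalization appears because the sum of two positive normalized operands can never need a left shift.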


Extra Bits for Accuracy

Extra bits are kept during intermediate calculations to help rounding, so that the result gets closer to the actual number. IEEE 754 uses two extra bits on the right: guard and round.

E.g., base = 10, precision = 3 digits:

     sign  exponent  mantissa
      0       2       1.69      =   1.6900 * 10**(2-bias)
  -   0       0       7.85      = -  .0785 * 10**(2-bias)
      0       2       1.61      =   1.6115 * 10**(2-bias)

Guard bits: digits to the right of the mantissa that guard against loss of digits; they can later be shifted left into the mantissa during normalization (especially when the two numbers are very close, or in multiplication)

Round bits: after the guard bits have been shifted into the mantissa, the result can be rounded according to the round bits

Sticky bit: an additional bit to the right of the round digit to fine-tune rounding; set to 1 if any 1 bits fall off the end of the round digit

     d0 . d1 d2 d3 . . . dp-1   0 0 0
  +   0 .  0  0  X . . .  X     X X S
                                X X S
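How the guard, round, and sticky bits are collected during an alignment shift can be sketched in Python. The function name `shift_right_grs` and the integer representation of the significand (hidden bit included) are assumptions made for illustration.

```python
def shift_right_grs(sig: int, shift: int):
    """Right shift sig by `shift` bits; return (kept, guard, round, sticky)."""
    ext = sig << 2                       # make room for guard and round
    kept_gr = ext >> shift               # kept bits followed by G and R
    lost = ext & ((1 << shift) - 1)      # everything below the round bit
    sticky = 1 if lost else 0            # 1 if any 1 bit fell off the end
    return kept_gr >> 2, (kept_gr >> 1) & 1, kept_gr & 1, sticky

# Significand 1.0000 0011 0000 0101 1001 010 shifted right by 4 positions
b = int("100000011000001011001010", 2)
kept, g, r, s = shift_right_grs(b, 4)
print(format(kept, "024b"), g, r, s)
```

For this operand the sketch yields G = 1, R = 0, S = 1, since the four bits shifted out are 1010 and the last two are ORed into the sticky bit.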


Four Rounding Modes

  Always round up (toward +∞)
  Always round down (toward -∞)
  Truncate
  Round-to-nearest-even

  Number   Round(x)   Error      Number   Round(x)   Error
  X0.00    X0.         0         X1.00    X1.         0
  X0.01    X0.        -1/4       X1.01    X1.        -1/4
  X0.10    X0.        -1/2       X1.10    X1. + 1    +1/2
  X0.11    X1.        +1/4       X1.11    X1. + 1    +1/4

[Figure: staircase plot of Round-to-nearest-even(x), vertical axis from 000. to 100., against x, horizontal axis from 00.0 to 11.1]
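The round-to-nearest-even table can be reproduced with a short Python sketch. It assumes the input is given as an integer count of quarters (two fractional bits), and the function name `round_nearest_even` is hypothetical.

```python
from fractions import Fraction

def round_nearest_even(quarters: int) -> int:
    """Round quarters/4 to an integer; ties go to the even neighbour."""
    whole, frac = divmod(quarters, 4)
    # Above one half always rounds up; exactly one half rounds up
    # only when the integer part is odd.
    if frac > 2 or (frac == 2 and whole % 2 == 1):
        whole += 1
    return whole

for q in range(8):
    rounded = round_nearest_even(q)
    error = Fraction(rounded) - Fraction(q, 4)
    print(f"{q >> 2}.{q & 3:02b} -> {rounded}  error {error}")
```

The tie cases are the instructive ones: 0.10 rounds down to 0 while 1.10 rounds up to 2, so the errors of -1/2 and +1/2 cancel on average.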


Example with G, R, and S bits

                     sign  exponent          mantissa                          GRS
  A                  0     1000 0011 (131)   1.1000 0010 1100 0000 0000 000   000
  B                  0     0111 1111 (127)   1.0000 0011 0000 0101 1001 010   000
  B aligned          0     1000 0011 (131)   0.0001 0000 0011 0000 0101 100   101
  A-B                0     1000 0011 (131)   1.0111 0010 1000 1111 1010 011   011
  Postnormalization  0     1000 0011 (131)   1.0111 0010 1000 1111 1010 011   011

                                                                               RS
  Rounding           0     1000 0011 (131)   1.0111 0010 1000 1111 1010 011   01
  Result             0     1000 0011 (131)   1.0111 0010 1000 1111 1010 011

Note 1: The original G bit can serve as the R bit, and the original R and S bits must be ORed to generate the new sticky bit.

Note 2: If the Boolean expression $R \cdot S + R \cdot S' \cdot L = R \cdot (S + L)$, where L is the LSB of the resultant significand, equals 1, we must round up.
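The round-up condition in Note 2 can be checked exhaustively with a few lines of Python. The function name `round_up` is an assumption; the loop confirms that the two forms of the Boolean expression agree for all eight input combinations.

```python
from itertools import product

def round_up(r: int, s: int, l: int) -> int:
    """Return 1 when the significand must be incremented: R AND (S OR L)."""
    return r & (s | l)

for r, s, l in product((0, 1), repeat=3):
    long_form = (r & s) | (r & (1 - s) & l)   # R.S + R.S'.L
    assert long_form == round_up(r, s, l)

# In the example the rounding row has R = 0, S = 1 and the significand ends in 1
print(round_up(0, 1, 1))  # → 0, so the significand is left unchanged
```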


Flow of Floating-Point Addition

Start

1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent matches the larger exponent.

2. Add the significands.

3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent.

   Overflow or underflow? Yes: Exception. No: continue with step 4.

4. Round the significand to the appropriate number of bits.

   Still normalized? Yes: Done. No: go back to step 3.

Done
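The flow can be sketched end to end in Python for one restricted case. This is an illustration under stated assumptions, not the project's required design: both operands are positive, normalized single-precision numbers given as 32-bit patterns, rounding is round-to-nearest-even, and the names `f2b`, `b2f`, and `fp_add` are hypothetical. Underflow cannot occur in this case, so only overflow is checked.

```python
import struct

def f2b(x: float) -> int:
    """Single-precision bit pattern of x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def b2f(bits: int) -> float:
    """Value of a single-precision bit pattern."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def fp_add(a_bits: int, b_bits: int) -> int:
    ea, eb = (a_bits >> 23) & 0xFF, (b_bits >> 23) & 0xFF
    # Restore the hidden 1 and append three zero bits for G, R, S.
    sa = ((a_bits & 0x7FFFFF) | 0x800000) << 3
    sb = ((b_bits & 0x7FFFFF) | 0x800000) << 3
    # Step 1: compare exponents and shift the smaller number right.
    if ea < eb:
        ea, eb, sa, sb = eb, ea, sb, sa
    shift = min(ea - eb, 27)
    sticky = 1 if sb & ((1 << shift) - 1) else 0
    sb = (sb >> shift) | sticky
    # Step 2: add the significands.
    total, exp = sa + sb, ea
    while True:
        # Step 3: normalize (a carry out needs one right shift).
        if total >> 27:
            total = (total >> 1) | (total & 1)   # keep the sticky bit
            exp += 1
        if exp >= 255:
            raise OverflowError("exponent overflow")
        # Step 4: round to nearest even using the three extra bits.
        low3, sig = total & 0b111, total >> 3
        if low3 > 0b100 or (low3 == 0b100 and sig & 1):
            sig += 1
        if sig >> 24 == 0:       # still normalized: done
            break
        total = sig << 3         # otherwise go back to step 3
    return (exp << 23) | (sig & 0x7FFFFF)

print(b2f(fp_add(f2b(1.5), f2b(2.25))))  # → 3.75
```

The loop mirrors the "Still normalized?" decision: rounding can carry into a 25th significand bit, in which case the sum is normalized once more.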


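Architecture Example

[Figure: block diagram of a floating-point adder. Inputs: two operands, each with Sign, Exponent, and Significand fields. Blocks: Small ALU (exponent difference), three 0/1 multiplexers, Shift right, Big ALU, Increment or Decrement, Shift left or right, Rounding Hardware, and Control. Output: Sign, Exponent, Significand.]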

Schedule and score rules

Schedule:

Project 2-1 (deadline on 11/30): Steps 1 and 2

Project 2-2 (deadline on 12/21): Step 3 (Post-normalization)

Project 2-3 (deadline on 01/11): Step 4 (Rounding)

Score rules:

(1). Completeness.

(2). Modularity.

(3). Architecture.

Documents
