Upload
estella-lynch
View
231
Download
0
Tags:
Embed Size (px)
Citation preview
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 1
Part IIIThe Arithmetic/Logic Unit
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 2
About This Presentation
This presentation is intended to support the use of the textbook Computer Architecture: From Microprocessors to Supercomputers, Oxford University Press, 2005, ISBN 0-19-515455-X. It is updated regularly by the author as part of his teaching of the upper-division course ECE 154, Introduction to Computer Architecture, at the University of California, Santa Barbara. Instructors can use these slides freely in classroom teaching and for other educational purposes. Any other use is strictly prohibited. ©
Behrooz Parhami
Edition Released Revised Revised Revised Revised
First July 2003 July 2004 July 2005 Mar. 2006 Jan. 2007
Jan. 2008 Jan. 2009 Jan. 2011
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 3
III The Arithmetic/Logic Unit
Topics in This Part
Chapter 9 Number Representation
Chapter 10 Adders and Simple ALUs
Chapter 11 Multipliers and Dividers
Chapter 12 Floating-Point Arithmetic
Overview of computer arithmetic and ALU design:• Review representation methods for signed integers• Discuss algorithms & hardware for arithmetic ops• Consider floating-point representation & arithmetic
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 4
Preview of Arithmetic Unit in the Data Path
Fig. 13.3 Key elements of the single-cycle MicroMIPS data path.
/
ALU
Data cache
Instr cache
Next addr
Reg file
op
jta
fn
inst
imm
rs (rs)
(rt)
Data addr
Data in 0
1
ALUSrc ALUFunc DataWrite
DataRead
SE
RegInSrc
rt
rd
RegDst RegWrite
32 / 16
Register input
Data out
Func
ALUOvfl
Ovfl
31
0 1 2
Next PC
Incr PC
(PC)
Br&Jump
ALU out
PC
0 1 2
Instruction fetch Reg access / decode ALU operation Data access
Register writeback
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 5
9 Number Representation Arguably the most important topic in computer arithmetic:
• Affects system compatibility and ease of arithmetic• Two’s complement, flp, and unconventional methods
Topics in This Chapter
9.1 Positional Number Systems
9.2 Digit Sets and Encodings
9.3 Number-Radix Conversion
9.4 Signed Integers
9.5 Fixed-Point Numbers
9.6 Floating-Point Numbers
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 6
9.1 Positional Number Systems
Representations of natural numbers {0, 1, 2, 3, …}
||||| ||||| ||||| ||||| ||||| || sticks or unary code 27 radix-10 or decimal code 11011 radix-2 or binary code XXVII Roman numerals
Fixed-radix positional representation with k digits
Value of a number: x = (xk–1xk–2 . . . x1x0)r = S xi r i
For example: 27 = (11011)two = (124) + (123) + (022) + (121) +
(120)
k–1
i=0
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 7
Unsigned Binary Integers
Figure 9.1 Schematic representation of 4-bit code for integers in [0, 15].
0000 0001 1111
0010 1110
0011 1101
0100 1100
1000
0101 1011
0110 1010
0111 1001
0 1
2
3
4
5
6 7
15
11
14
13
12
8 9
10
Inside: Natural number Outside: 4-bit encoding
0
1 2
3
15
4
5 6
7 8 9
Turn x notches counterclockwise
to add x
Turn y notches clockwise
to subtract y
11
14 13
12
10
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 8
9.2 Digit Sets and Encodings
Conventional digit sets
• Decimal digits in [0, 9]
• 4-bit Binary Coded Decimal (BCD)
• 8-bit ASCII
• Hexadecimal (Hex): digits 0-9 & A-F
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 9
9.3 Number Radix Conversion
Perform arithmetic in the new radix R Suitable for conversion from radix r to radix 10
Horner’s rule: (xk–1xk–2 . . . x1x0)r = (…((0 + xk–1)r + xk–2)r + . . . + x1)r + x0 (1 0 1 1 0 1 0 1)two = 0 + 1 1 2 + 0 2 2 + 1 5 2 + 1
11 2 + 0 22 2 + 1 45 2 + 0 90 2 + 1 181
Perform arithmetic in the old radix r Suitable for conversion from radix 10 to radix R
Divide the number by R, use the remainder as the LSD and the quotient to repeat the process
19 / 3 rem 1, quo 6 / 3 rem 0, quo 2 / 3 rem 2, quo 0 Thus, 19 = (2 0 1)three
Two ways to convert numbers from an old radix r to a new radix R
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 10
9.4 Signed Integers
We dealt with representing the natural numbers
Signed or directed whole numbers = integers
{ . . . , -3, -2, -1, 0, 1, 2, 3, . . . }
Signed-magnitude representation
+27 in 8-bit signed-magnitude binary code 0 0011011 –27 in 8-bit signed-magnitude binary code 1 0011011
Biased representation
Represent the interval of numbers [-N, P] by the unsigned interval [0, P + N]; i.e., by adding N to every number
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 11
Two’s-Complement Representation
Figure 9.5 Schematic representation of 4-bit 2’s-complement code for integers in [–8, +7].
0000 0001 1111
0010 1110
0011 1101
0100 1100
1000
0101 1011
0110 1010
0111 1001
+0 +1
+2
+3
+4
+5
+6 +7
–1
–5
–2
–3
–4
–8 –7
–6
+ _ 0
1 2
3
–1
4 5
6 7
–8
–7
Turn x notches counterclockwise
to add x
Turn 16 – y notches counterclockwise to add –y (subtract y)
–5
–2 –3
–4
–6
With k bits, numbers in the range [–2k–1, 2k–1 – 1] represented.Negation is performed by inverting all bits and adding 1.
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 12
Conversion from 2’s-Complement to Decimal
Example 9.7
Convert x = (1 0 1 1 0 1 0 1)2’s-compl to decimal.
Solution
Given that x is negative, one could change its sign and evaluate –x.
Shortcut: Use Horner’s rule, but take the MSB as negative –1 2 + 0 –2 2 + 1 –3 2 + 1 –5 2 + 0 –10 2 + 1 –19
2 + 0 –38 2 + 1 –75
Example 9.8
Sign Change for a 2’s-Complement Number
Given y = (1 0 1 1 0 1 0 1)2’s-compl, find the representation of –y.
Solution
–y = (0 1 0 0 1 0 1 0) + 1 = (0 1 0 0 1 0 1 1)2’s-compl (i.e., 75)
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 13
Two’s-Complement Addition and Subtraction
Figure 9.6 Binary adder used as 2’s-complement adder/subtractor.
AddSub
x y
y
x
k /
k /
k /
y or y
Adder
c out
c in
k /
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 14
9.5 Fixed-Point NumbersPositional representation: k whole and l fractional digits
Value of a number: x = (xk–1xk–2 . . . x1x0 . x–1x–2 . . . x–l )r = S xi r i
For example:
2.375 = (10.011)two = (121) + (020) + (02-1) + (12-2) + (12-3)
Numbers in the range [0, rk – ulp] representable, where ulp = r –l
Fixed-point arithmetic same as integer arithmetic (radix point implied, not explicit)
Two’s complement properties (including sign change) hold here as well:
(01.011)2’s-compl = (–021) + (120) + (02–1) + (12–2) + (12–3) = +1.375
(11.011)2’s-compl = (–121) + (120) + (02–1) + (12–2) + (12–3) = –0.625
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 15
Fixed-Point 2’s-Complement Numbers
Figure 9.7 Schematic representation of 4-bit 2’s-complement encoding for (1 + 3)-bit fixed-point numbers in the range [–1, +7/8].
0.000 0.001 1.111
0.010 1.110
0.011 1.101
0.100 1.100
1.000
0.101 1.011
0.110 1.010
0.111 1.001
+0 +.125
+.25
+.375
+.5
+.625
+.75
+.875
–.125
–.625
–.25
–.375
–.5
–1 –.875
–.75
+ _
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 16
9.6 Floating-Point Numbers
Fixed-point representation must sacrifice precision for small values to represent large values
x = (0000 0000 . 0000 1001)two Small number
y = (1001 0000 . 0000 0000)two Large number
Neither y2 nor y / x is representable in the format above
Floating-point representation is like scientific notation: -20 000 000 = -2 10 7 +0.000 000 007 = +7 10–9
Useful for applications where very large and very small numbers are needed simultaneously
Also, 7E-9Significand
ExponentExponent base
Sign
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 17
ANSI/IEEE Standard Floating-Point Format (IEEE 754)
Figure 9.8 The two ANSI/IEEE standard floating-point formats.
Short (32-bit) format
Long (64-bit) format
Sign Exponent Significand
8 bits, bias = 127, –126 to 127
11 bits, bias = 1023, –1022 to 1023
52 bits for fractional part (plus hidden 1 in integer part)
23 bits for fractional part (plus hidden 1 in integer part)
Short exponent range is –127 to 128but the two extreme values
are reserved for special operands(similarly for the long format)
Revision (IEEE 754R) was completed in 2008: The revised version includes 16-bit and 128-bit binary formats, as well as 64- and 128-bit decimal formats
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 18
Short and Long IEEE 754 Formats: Features
Table 9.1 Some features of ANSI/IEEE standard floating-point formats Feature Single/Short Double/LongWord width in bits 32 64Significand in bits 23 + 1 hidden 52 + 1 hiddenSignificand range [1, 2 – 2–23] [1, 2 – 2–52]Exponent bits 8 11Exponent bias 127 1023Zero (±0) e + bias = 0, f = 0 e + bias = 0, f = 0Denormal e + bias = 0, f ≠ 0
represents ±0.f 2–126e + bias = 0, f ≠ 0represents ±0.f 2–1022
Infinity (∞) e + bias = 255, f = 0 e + bias = 2047, f = 0Not-a-number (NaN) e + bias = 255, f ≠ 0 e + bias = 2047, f ≠ 0Ordinary number e + bias [1, 254]
e [–126, 127]represents 1.f 2e
e + bias [1, 2046]e [–1022, 1023]represents 1.f 2e
min 2–126 1.2 10–38 2–1022 2.2 10–308
max 2128 3.4 1038 21024 1.8 10308
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 19
10 Adders and Simple ALUs Addition is the most important arith operation in computers:
• Even the simplest computers must have an adder• An adder, plus a little extra logic, forms a simple ALU
Topics in This Chapter
10.1 Simple Adders
10.2 Carry Propagation Networks
10.3 Counting and Incrementation
10.4 Design of Fast Adders
10.5 Logic and Shift Operations
10.6 Multifunction ALUs
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 20
10.1 Simple Adders
Figures 10.1/10.2 Binary half-adder (HA) and full-adder (FA).
x y c s 0 0 0 0 0 1 0 1 1 0 0 1 1 1 1 0
Inputs Outputs
HA
x y
c
s
x y c c s 0 0 0 0 0 0 0 1 0 1 0 1 0 0 1 0 1 1 1 0 1 0 0 0 1 1 0 1 1 0 1 1 0 1 0 1 1 1 1 1
Inputs Outputs
c out c in
out in x
y
s
FA
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 21
Ripple-Carry Adder: Slow But Simple
Figure 10.4 Ripple-carry binary adder with 32-bit inputs and output.
x
s
y
c c
x
s
y
c
x
s
y
c
c out c in
0 0
0
c 0
1 1
1
1 2
31
31
31
31
FA FA FA 32 . . .
Critical path
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 22
10.2 Carry Propagation Networks
Figure 10.5 The main part of an adder is the carry network. The rest is just a set of gates to produce the g and p signals and the sum bits.
Carry network
. . . . . .
x i y i
g p
s
i i
i
c i c i+1
c k 1
c k
c k 2 c 1
c 0
g p 1 1 g p 0 0
g p k 2 k 2 g p i+1 i+1 g p k 1 k 1
c 0 . . . . . .
0 0 0 1 1 0 1 1
annihilated or killed propagated generated (impossible)
Carry is: g i p i
gi = xi yi pi = xi yi
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 23
Ripple-Carry Adder Revisited
Figure 10.6 The carry propagation network of a ripple-carry adder.
. . . c
k 1
c
k c
k 2
c
1
g
p
1
1
g
p
0
0
g
p
k 2
k 2
g
p
k 1
k 1
c
0 c
2
The carry recurrence: ci+1 = gi pi ci
Latency of k-bit adder is roughly 2k gate delays:
1 gate delay for production of p and g signals, plus 2(k – 1) gate delays for carry propagation, plus1 XOR gate delay for generation of the sum bits
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 24
First Carry Speed-Up Method: Carry Skip
Figures 10.7/10.8 A 4-bit section of a ripple-carry network with skip paths and the driving analogy.
c
g
p
4j+1
4j+1
g
p
4j
4j
g
p
4j+2
4j+2
g
p
4j+3
4j+3
c
4j
4j+4
c
4j+3
c
4j+2
c
4j+1
One-way street
Freeway
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 25
10.3 Counting and Incrementation
Figure 10.9 Schematic diagram of an initializable synchronous counter.
D Q
C _ Q
D
c out
c in
Adder
Update
/ k
k /
a (Increment
amount)
Count register k
/
1
0
Data in
k /
k /
IncrInit
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 26
Carries can be computed directly without propagation
For example, by unrolling the equation for c3, we get:
c3 = g2 p2 c2 = g2 p2 g1 p2 p1 g0 p2 p1 p0 c0
We define “generate” and “propagate” signals for a block extending from bit position a to bit position b as follows:
g[a,b] = gb pb gb–1 pb pb–1 gb–2 . . . pb pb–1 … pa+1 ga
p[a,b] = pb pb–1 . . . pa+1 pa
10.4 Design of Fast Adders
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 27
Carry-Lookahead Logic with 4-Bit Block
Figure 10.13 Blocks needed in the design of carry-lookahead adders with four-way grouping of bits.
Blo
ck s
ign
al g
en
era
tion
p [i, i+3]
c i
Inte
rme
idte
ca
rrie
s
c i+1 c i+2 c i+3 g [i, i+3]
p i+3 g i+3 p i+2 g i+2 p i+1 g i+1 p i g i
Carry Look-Ahead Adder
418_04 28
16-bit CLA Adder
418_04 29
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 30
Third Carry Speed-Up Method: Carry Select
Figure 10.14 Carry-select addition principle.
c out c in Adder
Version 1 of sum bits 1
0
x [a, b ]
c out c in Adder
Version 0 of sum bits
y [a, b]
s [a, b]
c a
0 1
Allows doubling of adder width with a single-mux additional delay
The lowera positions, (0 to a – 1) are added as usual
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 31
10.5 Logic and Shift Operations
Conceptually, shifts can be implemented by multiplexing
Figure 10.15 Multiplexer-based logical shifting unit.
Multiplexer
0 1 2 31 32 33 62 63
5
6
Right’Left Shift amount 0, x[31, 1]
x[31, 0]
00, x[30, 2]
00...0, x[31]
x[31, 0]
x[30, 0], 0
x[1, 0], 00...0
x[0], 00...0
. . . . . .
32
32 32 32 32 32 32 32 32
6-bit code specifying shift direction & amount
Right-shifted values
Left-shifted values
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 32
Arithmetic Shifts
Figure 10.16 The two arithmetic shift instructions of MiniMIPS.
Purpose: Multiplication and division by powers of 2
sra $t0,$s1,2 # $t0 ($s1) right-shifted by 2 srav $t0,$s1,$s0 # $t0 ($s1) right-shifted by ($s0)
1 1
1 1
0 0 0
fn
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0
31 25 20 15 0
ALU instruction
Unused Source register
op rs rt
R rd sh
10 5
Destination register
Shift amount
sra = 3
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
31 25 20 15 0
ALU instruction
Amount register
Source register
op rs rt
R rd sh
10 5 fn
Destination register
Unused srav = 7
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 33
Practical Shifting in Multiple Stages
Figure 10.17 Multistage shifting in a barrel shifter.
2
0, x[31, 1]
x[31, 0]
x[30, 0], 0
32
0 1 2 3
32 32 32 32
0 0 No shift 0 1 Logical left 1 0 Logical right 1 1 Arith right
x[31], x[31, 1]
Multiplexer
2
0 1 2 3 (0 or 4)-bit shift
2
0 1 2 3 (0 or 2)-bit shift
2
0 1 2 3 (0 or 1)-bit shift
(a) Single-bit shifter (b) Shifting by up to 7 bits
y[31, 0]
z[31, 0]
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 34
10.6 Multifunction ALUs
General structure of a simple arithmetic/logic unit.
Logicunit
Arithunit
0
1
Operand 1
Operand 2
Result
Logic fn (AND, OR, . . .)
Arith fn (add, sub, . . .)
Select fn type (logic or arith)
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 35
An ALU for MiniMIPS
Figure 10.19 A multifunction ALU with 8 control signals (2 for function class, 1 arithmetic, 3 shift, 2 logic) specifying the operation.
AddSub
x y
y
x
Adder
c 32
c 0
k /
Shifter
Logic unit
s
Logic function
Amount
5
2
Constant amount
Variable amount
5
5
ConstVar
0
1
0
1
2
3
Function class
2
Shift function
5 LSBs Shifted y
32
32
32
2
c 31
32-input NOR
Ovfl Zero
32
32
MSB
ALU
y
x
s
Shorthand symbol for ALU
Ovfl Zero
Func
Control
0 or 1
AND 00 OR 01
XOR 10 NOR 11
00 Shift 01 Set less 10 Arithmetic 11 Logic
00 No shift 01 Logical left 10 Logical right 11 Arith right
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 36
11 Multipliers and DividersModern processors perform many multiplications & divisions:
• Encryption, image compression, graphic rendering• Hardware vs programmed shift-add/sub algorithms
Topics in This Chapter
11.1 Shift-Add Multiplication
11.2 Hardware Multipliers
11.3 Programmed Multiplication
11.4 Shift-Subtract Division
11.5 Hardware Dividers
Computer Architecture, The Arithmetic/Logic Unit
Multiplicand
Partial products bit-matrix
x y
z
y x 2 0 0
y x 2 1 1
y x 2 2 2
y x 2 3 3
Multiplier
Product
Jan. 2011 Slide 37
11.1 Shift-Add Multiplication
Figure 11.1 Multiplication of 4-bit numbers in dot notation.
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 38
11.2 Hardware Multipliers
Multiplier y
Mux
Adder out c
0 1
Doublewidth partial product z
Multiplicand x
Shift
Shift
(j)
j y
Add’Sub
Enable
Select
in c
Figure 11.4 Hardware multiplier based on the shift-add algorithm.
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 39
The Shift Part of Shift-Add
Figure11.5 Shifting incorporated in the connections to the partial product register rather than as a separate phase.
out c
To adder j y
From adder
Sum
Partial product Multiplier
/ k – 1
/ k – 1
/ k
/ k
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 40
Array Multipliers
Figure 11.7 Array multiplier for 4-bit unsigned operands.
3
2
1
0
4 5 6 7
0
1
2
3
2 1 0 x x x
y
y
y
z
y
3 x
0
0
0
0
0
0
0
0
0
0
0
z
z
z
z z z z
HA FA FA
MA MA MA MA
MA MA MA MA
MA MA MA MA
MA MA MA MA
FA
0
Our original dot-notation representing multiplication
Straightened dots to depict array multiplier to the left
s c
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 41
11.3 Programmed Multiplication
MiniMIPS instructions related to multiplication
mult $s0,$s1 # set Hi,Lo to ($s0)($s1); signedmultu $s2,$s3 # set Hi,Lo to ($s2)($s3); unsignedmfhi $t0 # set $t0 to (Hi)mflo $t1 # set $t1 to (Lo)
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 42
Figure 11.8 Register usage for programmed multiplication superimposed on the block diagram for a hardware multiplier.
Emulating a Hardware Multiplier in Software
$t2 (counter)
Part of the control in hardware
Also, holds LSB of Hi during shift
Multiplier y
Mux
Adder out
c
0 1
Doublewidth partial product z
Multiplicand x
Shift
Shift
(j)
j
y
Add’Sub
Enable
Select
in
c
$a0 (multiplicand x)
$a1 (multiplier y)
$v1 (Lo part of z) $v0 (Hi part of z)
$t0 (carry-out)
$t1 (bit j of y)
Example 11.4 (MiniMIPS shift-add program for multiplication)
Computer Architecture, The Arithmetic/Logic Unit
2 1
2
y
2
x 2
2
1
0
3
0
Subtracted bit-matrix
Divisor x Dividend z
s Remainder
Quotient y
y x 3
y x 2
y x
Jan. 2011 Slide 43
11.4 Shift-Subtract Division
Figure11.9 Division of an 8-bit number by a 4-bit number in dot notation.
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 44
11.5 Hardware Dividers
Figure 11.12 Hardware divider based on the shift-subtract algorithm.
Load
Quotient y
Mux
Adder
0 1
Partial remainder z (initially z)
Divisor x
Shift
Shift
(j)
k– j
y
1
Enable
Select
Quotient
digit selector
1
out
c
in
c
Trial difference (Always subtract)
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 45
Array Dividers
Figure 11.14 Array divider for 8/4-bit unsigned integers.
2 1 0 x x x
1 y
2 y
3 y 3 z
0 y
3 x
2 z
1 z
0 z
4 z 5 z 6 z 7 z
MS MS MS MS
MS MS MS MS
MS MS MS MS
MS MS MS MS
Our original dot-notation for division
Straightened dots to depict an array divider 2 1 0 s s s 3 s
0
0
0
0
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 46
12 Floating-Point Arithmetic Floating-point is no longer reserved for high-end machines
• Multimedia and signal processing require flp arithmetic• Details of standard flp format and arithmetic operations
Topics in This Chapter
12.2 Special Values and Exceptions
12.3 Floating-Point Addition
12.4 Other Floating-Point Operations
12.6 Result Precision and Errors
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 47
ANSI/IEEE Standard Floating-Point Format (IEEE 754)
Figure 9.8 The two ANSI/IEEE standard floating-point formats.
Short (32-bit) format
Long (64-bit) format
Sign Exponent Significand
8 bits, bias = 127, –126 to 127
11 bits, bias = 1023, –1022 to 1023
52 bits for fractional part (plus hidden 1 in integer part)
23 bits for fractional part (plus hidden 1 in integer part)
Short exponent range is –127 to 128but the two extreme values
are reserved for special operands(similarly for the long format)
Revision (IEEE 754R) was completed in 2008: The revised version includes 16-bit and 128-bit binary formats, as well as 64- and 128-bit decimal formats
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 48
Figure 12.1 Distribution of floating-point numbers on the real line.
Denser Denser Sparser Sparser
Negative numbers FLP FLP 0 +
–
Overflow region
Overflow region
Underflow regions
Positive numbers
Underflow example
Overflow example
Midway example
Typical example
min max min max + + – – – +
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 49
12.2 Special Values and Exceptions
Zeros, infinities, and NaNs (not a number)
0 Biased exponent = 0, significand = 0 (no hidden 1) Biased exponent = 255 (short) or 2047 (long), significand = 0NaN Biased exponent = 255 (short) or 2047 (long), significand 0
Undefined results lead to NaN (not a number)
(0) / (0) = NaN (+) + (–) = NaN
(0) () = NaN() / () = NaN
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 50
12.3 Floating-Point Addition
Figure 12.4 Alignment shift and rounding in floating-point addition.
Operands after alignment shift: x = 2 1.00101101 y = 2 0.000111101101
Numbers to be added: x = 2 1.00101101 y = 2 1.11101101
5
5
Extra bits to be rounded off
Operand with smaller exponent to be preshifted
Result of addition: s = 2 1.010010111101 s = 2 1.01001100 Rounded sum
5
1
5 5
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 51
Hardware for Floating-Point
Addition
Figure 12.5 Simplified schematic of a floating-point adder.
Normalize & round
Add
Align significands
Possible swap & complement
Unpack
Control & sign logic
Pack
Input 1
Output
Significands Exponents Signs
Significand Exponent Sign
Sub
Mux
Input 2
AddSub
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 52
12.4 Other Floating-Point Operations
Floating-point multiplication
(2e1s1) (2e2s2) = 2e1+ e2(s1 s2)If necessary, halve to normalize (increment exponent)
Floating-point division
(2e1s1) / (2e2s2) = 2e1– e2(s1 / s2)If necessary, double to normalize (decrement exponent)
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 53
Hardware for Floating-Point Multiplication and Division
Figure 12.6 Simplified schematic of a floating-point multiply/divide unit.
Normalize & round
Multiply or divide
Unpack
Control & sign logic
MulDiv
Pack
Input 1
Output
Significands Exponents Signs
Significand Exponent Sign
Input 2
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 54
The Floating-Point Unit in MiniMIPS
Figure 5.1 Memory and processing subsystems for MiniMIPS.
Memory up to 2 words 30
Loc 0 Loc 4 Loc 8
Loc m 4
Loc m 8
4 B / location
m 2 32
$0
$1
$2
$31
Hi Lo
ALU
$0
$1
$2
$31 FP
arith
EPC
Cause
BadVaddr Status
EIU FPU
TMU
Execution & integer unit
Floating- point unit
Trap & memory unit
. . .
. . .
(Coproc. 1)
(Coproc. 0)
(Main proc.)
Integer mul/div
Chapter 10
Chapter 11
Chapter 12
Pairs of registers, beginning with an even-numbered one, are used for double operands
Coprocessor 1
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 55
12.6 Result Precision and ErrorsExample 12.4
Laws of algebra may not hold in floating-point arithmetic. For example, the following computations show that the associative law of addition, (a + b) + c = a + (b + c), is violated for the three numbers shown.
Compute a + b 2 0.00000011 a+b = 2 1.10000000 c =-2 1.01100101
Numbers to be added first a =-2 1.10101011 b = 2 1.10101110
5
5
Compute (a + b) + c 2 0.00011011 Sum = 2 1.10110000
5
2 2
2 6
Compute b + c (after preshifting c) 2 1.101010110011011 b+c = 2 1.10101011 (Round) a =-2 1.10101011
Numbers to be added first b = 2 1.10101110 c =-2 1.01100101
5
5
Compute a + (b + c) 2 0.00000000 Sum = 0 (Normalize to special code for 0)
5 5
5
2
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 56
Evaluation of Elementary FunctionsApproximating polynomials
ln x = 2(z + z3/3 + z5/5 + z7/7 + . . . ) where z = (x – 1)/(x + 1)ex = 1 + x/1! + x2/2! + x3/3! + x4/4! + . . . cos x = 1 – x2/2! + x4/4! – x6/6! + x8/8! – . . . tan–1 x = x – x3/3 + x5/5 – x7/7 + x9/9 – . . .
Iterative (convergence) schemes
For example, beginning with an estimate for x1/2, the following iterative formula provides a more accurate estimate in each step
q(i+1) = 0.5(q(i) + x/q(i))
Table lookup (with interpolation)
A pure table lookup scheme results in huge tables (impractical); hence, often a hybrid approach, involving interpolation, is used.
Computer Architecture, The Arithmetic/Logic UnitJan. 2011 Slide 57
Figure 12.12 Function evaluation by table lookup and linear interpolation.
Function Evaluation by Table Lookup
L
Add
x
Table for a
Output
Table for b
x Input x H L
f(x)
Multiply
h bits k - h bits
x H
f(x) x
x
Best linear approximation in subinterval
The linear approximation above is characterized by the line equation a + b x , where a and b are read out from tables based on x
L
H