Arithmetic III
CPSC 321
Andreas Klappenecker
Any Questions?
Today’s Menu
• Addition
• Multiplication
• Floating Point Numbers
Recall: Full Adder
[Figure: full adder with inputs a, b, cin and outputs s, cout]
3 gate delays for the first adder, 2(n-1) for the remaining adders
Ripple Carry Adders
• Each gate causes a delay
  • our example: 3 gates for carry generation
  • book has example with 2 gates
• Carry might ripple through all n adders
  • O(n) gates causing delay
  • intolerable delay if n is large
• Carry lookahead adders
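The rippling carry can be sketched in a few lines of Python. This is only an illustrative software model, not a gate-level description; the function names `full_adder` and `ripple_carry_add` are made up for this sketch.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum bit, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, n=32, cin=0):
    """Add two n-bit numbers one bit at a time, passing the carry along.

    Bit i cannot be computed before the carry from bit i-1 is known,
    which is the source of the O(n) delay.
    """
    result = 0
    carry = cin
    for i in range(n):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)
        result |= s << i
    return result, carry
```

For example, `ripple_carry_add(15, 1, n=4)` wraps around to a 4-bit sum of 0 with a carry out of 1.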
Faster Adders
cin a b cout s
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1
cout = ab + cin(a xor b)
     = ab + a cin + b cin
     = ab + (a + b) cin
     = g + p cin
Generate: g = ab
Propagate: p = a + b
Why are they called like that?
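One way to see the naming: if g = 1, the position generates a carry regardless of cin; if p = 1, an incoming carry is propagated to cout. The sketch below checks that all four forms of cout agree on every row of the truth table. The helper names are hypothetical.

```python
from itertools import product


def carry_out_forms(a, b, cin):
    """Evaluate the four equivalent expressions for cout."""
    g = a & b  # generate
    p = a | b  # propagate
    return [
        (a & b) | (cin & (a ^ b)),
        (a & b) | (a & cin) | (b & cin),
        (a & b) | ((a | b) & cin),
        g | (p & cin),
    ]


def all_forms_agree():
    """True if the four expressions match for all 8 input combinations."""
    return all(
        len(set(carry_out_forms(a, b, cin))) == 1
        for a, b, cin in product((0, 1), repeat=3)
    )
```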
Fast Adders
Iterate the idea, generate and propagate:
c_{i+1} = g_i + p_i c_i
        = g_i + p_i (g_{i-1} + p_{i-1} c_{i-1})
        = g_i + p_i g_{i-1} + p_i p_{i-1} c_{i-1}
        = g_i + p_i g_{i-1} + p_i p_{i-1} g_{i-2} + … + p_i p_{i-1} … p_1 g_0
          + p_i p_{i-1} … p_1 p_0 c_0
Two-level AND-OR circuit
Carry is known early!
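The expanded sum of products can be evaluated directly, without waiting for lower carries. The following is a rough Python sketch of that idea; `lookahead_carries` and `all_ones` are names invented here, and software loops only model what hardware would compute in parallel.

```python
def all_ones(bits):
    """AND of a list of bits; the empty product is 1."""
    return int(all(bits))


def lookahead_carries(x, y, n=4, c0=0):
    """Return [c_0, c_1, ..., c_n] computed from g, p and c_0 only."""
    g = [(x >> i) & (y >> i) & 1 for i in range(n)]    # generate bits
    p = [((x >> i) | (y >> i)) & 1 for i in range(n)]  # propagate bits
    carries = [c0]
    for i in range(n):
        # c_{i+1} = OR over j of (g_j AND p_{j+1} ... p_i), plus p_i ... p_0 c_0
        terms = [g[j] & all_ones(p[j + 1:i + 1]) for j in range(i + 1)]
        terms.append(c0 & all_ones(p[:i + 1]))
        carries.append(int(any(terms)))
    return carries
```

Each carry depends only on the g and p bits and on c_0, so no carry has to ripple through the lower positions.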
• Need to support the set-on-less-than instruction (slt)
  • remember: slt is an arithmetic instruction
  • produces 1 if rs < rt and 0 otherwise
  • use subtraction: (a-b) < 0 implies a < b
• Need to support test for equality (beq $t5, $t6, Label)
  • use subtraction: (a-b) = 0 implies a = b
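Both tests reduce to a single 32-bit subtraction, as the following sketch illustrates. It assumes the operands are given as 32-bit two's complement bit patterns, and it ignores overflow in a-b, so the sign-bit test is only reliable when the subtraction does not overflow. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sub32(a, b):
    """32-bit two's complement subtraction (result wraps around)."""
    return (a - b) & MASK32


def slt(a, b):
    """Return 1 if a < b, judged by the sign bit of a - b (overflow ignored)."""
    return (sub32(a, b) >> 31) & 1


def is_equal(a, b):
    """Return 1 if a - b is zero, the test a beq would use."""
    return int(sub32(a, b) == 0)
```

For instance, `slt(3, 5)` yields 1 because 3 - 5 is negative, and `is_equal(7, 7)` yields 1 because the difference is zero.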
A Simple ALU for MIPS
ALU operation codes:
• 000 = and
• 001 = or
• 010 = add
• 110 = subtract
• 111 = slt
• Note: Zero is 1 when the result is zero!
[Figure: 32-bit ALU built from 1-bit ALUs ALU0 … ALU31. Inputs a0 … a31, b0 … b31, Bnegate, Operation; the CarryOut of each stage feeds the CarryIn of the next; the Less inputs of ALU1 … ALU31 are 0 and Set from ALU31 feeds Less of ALU0; outputs Result0 … Result31, Zero, Overflow.]
Multipliers
• More complicated than addition
• accomplished via shifting and addition
• Let's look at 3 versions based on the grade school algorithm

       0010    (multiplicand)
     x 1011    (multiplier)
       0010 x 1
      00100 x 1
     001000 x 0
    0010000 x 1
   00010110

• Shift and add if multiplier bit equals 1
Multiplication
Start
1. Test Multiplier0
   • Multiplier0 = 1: go to step 1a
   • Multiplier0 = 0: go to step 2
1a. Add multiplicand to product and place the result in Product register
2. Shift the Multiplicand register left 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
   • No: < 32 repetitions, go back to step 1
   • Yes: 32 repetitions, Done
[Figure: first-version multiplier hardware. Multiplicand register (64 bits, shift left), 64-bit ALU, Product register (64 bits, write), Multiplier register (32 bits, shift right), control test.]
Multiplication
If each step took a clock cycle, this algorithm would use almost 100 clock cycles to multiply two 32-bit numbers.
Requires a 64-bit wide adder
Multiplicand register is 64 bits wide
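The first version can be modeled in Python as follows. This is only a sketch of the register behavior for unsigned operands; `multiply_v1` and its parameter `n` (the operand width) are names chosen for illustration.

```python
def multiply_v1(multiplicand, multiplier, n=32):
    """First version: 2n-bit multiplicand shifts left, multiplier shifts right."""
    mask_n = (1 << n) - 1
    mask_2n = (1 << (2 * n)) - 1
    mcand = multiplicand & mask_n  # held in a 2n-bit register
    mplier = multiplier & mask_n   # n-bit register
    product = 0                    # 2n-bit register
    for _ in range(n):
        if mplier & 1:                           # 1. test Multiplier0
            product = (product + mcand) & mask_2n  # 1a. 2n-bit add
        mcand = (mcand << 1) & mask_2n           # 2. shift multiplicand left
        mplier >>= 1                             # 3. shift multiplier right
    return product
```

With `n=4`, `multiply_v1(0b0010, 0b1011, n=4)` reproduces the slide's example and gives `0b00010110`.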
Variations on a Theme
• Product register has to be 64-bit
• Nothing we can do about that!
• Can we take advantage of that fact?
• Yes! Add multiplicand to 32 MSBs
• product = product >> 1
• Repeat last steps
Second Version
[Figure: second-version multiplier hardware. Multiplicand register (32 bits), 32-bit ALU, Product register (64 bits, shift right, write), Multiplier register (32 bits, shift right), control test.]
Start
1. Test Multiplier0
   • Multiplier0 = 1: go to step 1a
   • Multiplier0 = 0: go to step 2
1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register
2. Shift the Product register right 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
   • No: < 32 repetitions, go back to step 1
   • Yes: 32 repetitions, Done
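A possible software model of the second version is shown below for unsigned operands. One modeling assumption: the carry out of the n-bit adder is kept and shifted into the product, which Python's unbounded integers do implicitly here. The name `multiply_v2` is made up for this sketch.

```python
def multiply_v2(multiplicand, multiplier, n=32):
    """Second version: fixed n-bit multiplicand, product shifts right."""
    mask_n = (1 << n) - 1
    mcand = multiplicand & mask_n
    mplier = multiplier & mask_n
    product = 0  # 2n-bit register (plus the adder's carry out before the shift)
    for _ in range(n):
        if mplier & 1:             # 1. test Multiplier0
            product += mcand << n  # 1a. add to the left half only
        product >>= 1              # 2. shift product right (carry shifts in)
        mplier >>= 1               # 3. shift multiplier right
    return product
```

Only the upper half of the product takes part in each addition, which is why an n-bit adder suffices.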
Version 1 versus Version 2
[Figure: the two designs side by side. Version 2: Multiplicand (32 bits), 32-bit ALU, Product (64 bits, shift right, write), Multiplier (32 bits, shift right), control test. Version 1: Multiplicand (64 bits, shift left), 64-bit ALU, Product (64 bits, write), Multiplier (32 bits, shift right), control test.]
Critique
• Registers needed for
  • multiplicand
  • multiplier
  • product
• Use lower 32 bits of product register:
  • place multiplier in lower 32 bits
  • add multiplicand to higher 32 bits
  • product = product >> 1
  • repeat
Final Version
[Figure: final-version multiplier hardware. Multiplicand register (32 bits), 32-bit ALU, Product register (64 bits, shift right, write) whose right half initially holds the multiplier (shifts right), control test.]
Start
1. Test Product0
   • Product0 = 1: go to step 1a
   • Product0 = 0: go to step 2
1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register
2. Shift the Product register right 1 bit
32nd repetition?
   • No: < 32 repetitions, go back to step 1
   • Yes: 32 repetitions, Done
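The final version might be modeled like this, again for unsigned operands and with the adder's carry out kept by Python's unbounded integers (a modeling assumption). The name `multiply_final` is hypothetical.

```python
def multiply_final(multiplicand, multiplier, n=32):
    """Final version: the multiplier lives in the right half of the product."""
    mask_n = (1 << n) - 1
    mcand = multiplicand & mask_n
    product = multiplier & mask_n  # left half 0, right half = multiplier
    for _ in range(n):
        if product & 1:            # 1. test Product0
            product += mcand << n  # 1a. add to the left half
        product >>= 1              # 2. shift the whole register right
    return product
```

After n iterations every multiplier bit has been shifted out, and the register holds the full 2n-bit product.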
Summary
It was possible to improve upon the well-known grade school algorithm by
• reducing the adder from 64 to 32 bits
• keeping the multiplicand fixed
• shifting the product register
• omitting the multiplier register
The Booth Multiplier
Let’s kick it up a notch!
Runs of 1’s
• 01110_2 = 14 = 8+4+2 = 16 – 2
• Runs of 1s (current bit, bit to the right):
  • 10 beginning of run
  • 11 middle of a run
  • 01 end of a run of 1s
  • 00 middle of a run of 0s
Runs of 1’s
• 0111 1111 1100_2 = 2044
• How do you get this conversion quickly?
  • 0111 1111_2 = 128 – 1 = 127
  • 0111 1111 1111_2 = 2048 – 1
  • 0111 1111 1100_2 = 2048 – 1 – 3 = 2048 – 4
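The shortcut is that a run of 1s from bit position low up to bit position high equals 2^(high+1) – 2^low. A small sketch, with hypothetical helper names, that computes this and labels each bit pair as on the earlier slide:

```python
def run_of_ones(high, low):
    """Value of a binary number with 1s exactly in bit positions low..high."""
    return (1 << (high + 1)) - (1 << low)


RUN_LABELS = {
    (1, 0): "beginning of run",
    (1, 1): "middle of a run",
    (0, 1): "end of a run of 1s",
    (0, 0): "middle of a run of 0s",
}


def classify_bits(x, n):
    """Label each bit of x (LSB first) using (current bit, bit to the right)."""
    labels = []
    right = 0  # the bit to the right of bit 0 is taken to be 0
    for i in range(n):
        current = (x >> i) & 1
        labels.append(RUN_LABELS[(current, right)])
        right = current
    return labels
```

For example, `run_of_ones(10, 2)` gives 2048 – 4 = 2044, matching 0111 1111 1100_2.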
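Example
With subtraction at the beginning of a run of 1s and addition at its end:
        0010
      x 0110
      + 0000   shift
     - 0010    sub
    + 0000     shift
   + 0010      add
    00001100
Grade school algorithm:
        0010
      x 0110
      + 0000   shift
     + 0010    add
    + 0010     add
   + 0000      shift
    00001100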
Booth Multiplication
Current and previous bit
• 00: middle of run of 0s, no action
• 01: end of a run of 1s, add multiplicand
• 10: beginning of a run of 1s, subtract multiplicand
• 11: middle of string of 1s, no action
Example: 0010 x 0110
Iteration  Mcand  Step                Product
0          0010   Initial values      0000 0110,0
1          0010   00: no op           0000 0110,0
           0010   arith >> 1          0000 0011,0
2          0010   10: prod -= Mcand   1110 0011,0
           0010   arith >> 1          1111 0001,1
3          0010   11: no op           1111 0001,1
           0010   arith >> 1          1111 1000,1
4          0010   01: prod += Mcand   0001 1000,1
           0010   arith >> 1          0000 1100,0
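The table can be reproduced with a short Python model. This is a sketch under stated assumptions: the register is modeled as one (2n+1)-bit integer (product half, multiplier half, extra bit), operands are n-bit two's complement values, and the most negative multiplicand (for example -8 when n = 4) is not handled because its negation overflows the n-bit left half. The name `booth_multiply` is made up here.

```python
def booth_multiply(multiplicand, multiplier, n=4):
    """Booth multiplication of two n-bit two's complement numbers."""
    mask_n = (1 << n) - 1
    total = 2 * n + 1                    # product half, multiplier half, extra bit
    mask_total = (1 << total) - 1
    mcand = multiplicand & mask_n
    reg = (multiplier & mask_n) << 1     # left half 0, extra bit 0
    for _ in range(n):
        pair = reg & 0b11                # (current bit, previous bit)
        if pair == 0b01:                 # end of a run of 1s: add
            reg = (reg + (mcand << (n + 1))) & mask_total
        elif pair == 0b10:               # beginning of a run of 1s: subtract
            reg = (reg - (mcand << (n + 1))) & mask_total
        sign = reg >> (total - 1)        # arithmetic shift right by 1
        reg = (reg >> 1) | (sign << (total - 1))
    result = reg >> 1                    # drop the extra bit
    if result >> (2 * n - 1):            # interpret as signed 2n-bit value
        result -= 1 << (2 * n)
    return result
```

Calling `booth_multiply(2, 6)` walks through the same register states as the table and returns 12.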
Negative numbers
Booth’s multiplication also works with negative numbers:
2 x -3 = -6
0010_2 x 1101_2 = 1111 1010_2
Negative Numbers
0010_2 x 1101_2 = 1111 1010_2
0) Mcnd 0010 Prod 0000 1101,0
1) Mcnd 0010 Prod 1110 1101,0 sub
1) Mcnd 0010 Prod 1111 0110,1 >>
2) Mcnd 0010 Prod 0001 0110,1 add
2) Mcnd 0010 Prod 0000 1011,0 >>
3) Mcnd 0010 Prod 1110 1011,0 sub
3) Mcnd 0010 Prod 1111 0101,1 >>
4) Mcnd 0010 Prod 1111 0101,1 nop
4) Mcnd 0010 Prod 1111 1010,1 >>
Summary
• Extends the final version of the grade school algorithm
• Simple change: inspect the current and the previous bit; add if they are 0,1; subtract if they are 1,0; do nothing if they are 0,0 or 1,1
• 0111 1100_2 = 128 – 4 = 1000 0000_2 – 0000 0100_2
Floating Point Numbers
We often use calculations based on real numbers, such as
• e = 2.71828…
• Pi = 3.14159…
We represent approximations to such numbers by floating point numbers
• 1.xxxxxxxxxx_2 x 2^yyyy
Floating-Point Representation: float
We need to distribute the 32 bits among sign, exponent, and significand
• seeeeeeeexxxxxxxxxxxxxxxxxxxxxxx
The general form of such a number is
• (-1)^s x F x 2^E
• s is the sign, F is derived from the significand field, and E is derived from the exponent field
Floating Point Representation: double
• 1 bit sign, 11 bits for exponent, 52 bits for significand
• seeeeeeeeeeexxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Range of float: 2.0 x 10^-38 … 2.0 x 10^38
Range of double: 2.0 x 10^-308 … 2.0 x 10^308
IEEE 754 Floating-Point Standard
• Makes the leading bit of a normalized binary number implicit: 1 + significand
• If the significand is s1 s2 s3 s4 s5 s6 … then the value is
  (-1)^s x (1 + s1/2 + s2/4 + s3/8 + …) x 2^E
• Design goal of IEEE 754: Integer comparisons should yield meaningful comparisons for floating point numbers
IEEE 754 Standard
• Negative exponents are a difficulty for sorting
• Idea: most positive … most negative exponent maps to 1111 1111 … 0000 0000
• IEEE 754 uses a bias of 127 for single precision.
• Exponent -1 is represented by -1 + 127 = 126
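A tiny sketch of why the bias helps: after adding 127, larger exponents get larger unsigned bit patterns, so they sort like plain integers. The names `BIAS` and `biased_exponent` are chosen for illustration, and the sample exponents are arbitrary values from the normalized single precision range.

```python
BIAS = 127  # single precision bias


def biased_exponent(e):
    """Stored 8-bit exponent field for a true exponent e."""
    return e + BIAS


# True exponents in increasing order, then their stored 8-bit patterns.
exponents = [-126, -1, 0, 1, 127]
stored = [format(biased_exponent(e), "08b") for e in exponents]
```

The patterns in `stored` come out in increasing order as well, which a two's complement exponent field would not achieve for negative exponents.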
IEEE 754 Example
Represent -0.75 in single precision format.
-0.75 = -3/4 = -11_2 / 4 = -0.11_2
In scientific notation: -0.11_2 x 2^0 = -1.1_2 x 2^-1
The latter form is normalized scientific notation.
Value: (-1)^s x (1 + significand) x 2^(Exponent – 127)
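The value formula can be applied to a raw 32-bit pattern as sketched below. The sketch covers normalized numbers only (exponent field 1 to 254); zero, denormalized numbers, infinities, and NaN are deliberately left out. The name `decode_single` is hypothetical.

```python
def decode_single(bits):
    """Value of a normalized IEEE 754 single precision bit pattern."""
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF         # 8-bit biased exponent
    fraction = bits & 0x7FFFFF             # 23-bit significand field
    significand = fraction / (1 << 23)     # s1/2 + s2/4 + s3/8 + ...
    return (-1) ** sign * (1 + significand) * 2.0 ** (exponent - 127)
```

For example, the pattern `0x3F800000` (sign 0, exponent field 127, significand 0) decodes to 1.0.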
Example (cont’d)
• -1.1_2 x 2^-1 = (-1)^1 x (1 + .1000 0000 0000 0000 0000 000_2) x 2^(126 – 127)
The single precision representation is
1 0111 1110 1000 0000 0000 0000 0000 000
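The result can be cross-checked with Python's standard `struct` module, which packs a float into IEEE 754 single precision. The helper name `single_fields` is made up for this sketch.

```python
import struct


def single_fields(x):
    """Return (sign, exponent, significand) bit strings of x as a float32."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = format(bits, "032b")
    return s[0], s[1:9], s[9:]
```

`single_fields(-0.75)` returns the sign bit `1`, the exponent field `01111110` (126), and a significand of `1` followed by 22 zeros, in agreement with the slide.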
BAM!
Conclusion
• We learned how to multiply
  • Three variations on the grade school algorithm
  • Booth multiplication
• Floating point representation a la IEEE 754
(Photos are courtesy of www.emerils.com, some graphs are due to Patterson and Hennessy)