Chapter - University of Isfahanengold.ui.ac.ir/~nikmehr/Chapter_03_animated.pdf< 0 ≥ 0 A - B...

Preview:

Citation preview

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface

5th Edition

Chapter 3 Arithmetic for Computers

Chapter 3 — Arithmetic for Computers — 2

Arithmetic for Computers Operations on integers

Addition and subtraction Multiplication and division Dealing with overflow

Floating-point real numbers Representation and operations

§3.1 Introduction

Chapter 3 — Arithmetic for Computers — 3

Integer Addition Example: 7 + 6

§3.2 Addition and Subtraction

Overflow if result out of range Adding +ve and –ve operands, no overflow Adding two +ve operands

Overflow if result sign is 1 Adding two –ve operands

Overflow if result sign is 0

Chapter 3 — Arithmetic for Computers — 4

Integer Subtraction Add negation of second operand Example: 7 – 6 = 7 + (–6)

+7: 0000 0000 … 0000 0111 –6: 1111 1111 … 1111 1010 +1: 0000 0000 … 0000 0001

Overflow if result out of range Subtracting two +ve or two –ve operands, no overflow Subtracting +ve from –ve operand

Overflow if result sign is 0 Subtracting –ve from +ve operand

Overflow if result sign is 1

Dealing with Overflow

Operation Operand A Operand B Result indicating overflow

A + B ≥ 0 ≥ 0 < 0 A + B < 0 < 0 ≥ 0 A - B ≥ 0 < 0 < 0 A - B < 0 ≥ 0 ≥ 0

Overflow occurs when the result of an operation cannot be represented in 32-bits, i.e., when the sign bit contains a value bit of the result and not the proper sign bit

When adding operands with different signs or when subtracting operands with the same sign, overflow can never occur

Dealing with Overflow

Operation Operand A Operand B Result indicating overflow

A + B ≥ 0 ≥ 0 < 0 A + B < 0 < 0 ≥ 0 A - B ≥ 0 < 0 < 0 A - B < 0 ≥ 0 ≥ 0

Overflow occurs when the result of an operation cannot be represented in 32-bits, i.e., when the sign bit contains a value bit of the result and not the proper sign bit

When adding operands with different signs or when subtracting operands with the same sign, overflow can never occur

MIPS signals overflow with an exception (aka interrupt) – an unscheduled procedure call where the EPC contains the address of the instruction that caused the exception

2’s Comp - Detecting Overflow When adding two's complement numbers, overflow will only

occur if the numbers being added have the same sign the sign of the result is different

If we perform the addition an-1 an-2 ... a1 a0

+bn-1bn-2… b1 b0 ----------------------------------

sn-1sn-2… s1 s0

Overflow can be detected as

Overflow can also be detected as

where cn-1and cn are the carry in and carry out of the most significant bit

111111 −⋅−⋅−+−⋅−⋅−= nnnnnn sbasbaV

1−⊗= nn ccV

Unsigned - Detecting Overflow For unsigned numbers, overflow occurs if there is carry out

of the most significant bit. For example, 1001 = 9 + 1000 = 8 0001 = 1 With the MIPS architecture

Overflow exceptions occur for two’s complement arithmetic add, sub, addi

Overflow exceptions do not occur for unsigned arithmetic addu, subu, addiu

ncV =

Chapter 3 — Arithmetic for Computers — 8

Dealing with Overflow Some languages (e.g., C) ignore overflow

Use MIPS addu, addui, subu instructions

Other languages (e.g., Ada, Fortran) require raising an exception Use MIPS add, addi, sub instructions On overflow, invoke exception handler

Save PC in exception program counter (EPC) register Jump to predefined handler address mfc0 (move from coprocessor reg) instruction can

retrieve EPC value, to return after corrective action

MIPS Arithmetic Logic Unit (ALU) Must support the Arithmetic/Logic

operations of the ISA add, addi, addiu, addu sub, subu mult, multu, div, divu sqrt and, andi, nor, or, ori, xor, xori beq, bne, slt, slti, sltiu, sltu

32

32

32

m (operation)

result

A

B

ALU

4

zero ovf

1 1

MIPS Arithmetic Logic Unit (ALU) Must support the Arithmetic/Logic

operations of the ISA add, addi, addiu, addu sub, subu mult, multu, div, divu sqrt and, andi, nor, or, ori, xor, xori beq, bne, slt, slti, sltiu, sltu

32

32

32

m (operation)

result

A

B

ALU

4

zero ovf

1 1

With special handling for sign extend – addi, addiu, slti, sltiu zero extend – andi, ori, xori overflow detection – add, addi, sub

Chapter 3 — Arithmetic for Computers — 10

Arithmetic for Multimedia Graphics and media processing operates on

vectors of 8-bit and 16-bit data Use 64-bit adder, with partitioned carry chain

Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors SIMD (single-instruction, multiple-data)

Saturating operations On overflow, result is largest representable value eg., clipping in audio, saturation in video

Multiply Binary multiplication is just a bunch of right

shifts and adds

multiplicand

multiplier

partial product array

double precision product

n

2n

n

Multiply Binary multiplication is just a bunch of right

shifts and adds

multiplicand

multiplier

partial product array

double precision product

n

2n

n can be formed in parallel and added in parallel for faster multiplication

Chapter 3 — Arithmetic for Computers — 12

Multiplication Start with long-multiplication approach

1000 × 1001 1000 0000 0000 1000 1001000

Length of product is the sum of operand lengths

multiplicand

multiplier

product

§3.3 Multiplication

Chapter 3 — Arithmetic for Computers — 13

Multiplication Hardware

Initially 0

Multiplicand is zero extended (unsigned mul)

Multiplication Example (Version 1) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example Multiplicand is zero extended because it is unsigned

2

00001100 00000000 1101 Initialize 0

1

3

4

Multiplicand Product Multiplier Iteration

Multiplication Example (Version 1) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example Multiplicand is zero extended because it is unsigned

00001100 Multiplier[0] = 1 => ADD +

2

00001100 00000000 1101 Initialize 0

1

3

4

Multiplicand Product Multiplier Iteration

Multiplication Example (Version 1) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example Multiplicand is zero extended because it is unsigned

00011000 0110 SLL Multiplicand and SRL Multiplier 00001100 Multiplier[0] = 1 => ADD +

2

00001100 00000000 1101 Initialize 0

1

3

4

Multiplicand Product Multiplier Iteration

Multiplication Example (Version 1) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example Multiplicand is zero extended because it is unsigned

00001100 Multiplier[0] = 0 => Do Nothing 00011000 0110 SLL Multiplicand and SRL Multiplier

00001100 Multiplier[0] = 1 => ADD +

2

00001100 00000000 1101 Initialize 0

1

3

4

Multiplicand Product Multiplier Iteration

Multiplication Example (Version 1) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example Multiplicand is zero extended because it is unsigned

00001100 Multiplier[0] = 0 => Do Nothing 0011 00110000 SLL Multiplicand and SRL Multiplier

00011000 0110 SLL Multiplicand and SRL Multiplier 00001100 Multiplier[0] = 1 => ADD +

2

00001100 00000000 1101 Initialize 0

1

3

4

Multiplicand Product Multiplier Iteration

Multiplication Example (Version 1) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example Multiplicand is zero extended because it is unsigned

00001100 Multiplier[0] = 0 => Do Nothing 0011 00110000 SLL Multiplicand and SRL Multiplier

00011000 0110 SLL Multiplicand and SRL Multiplier 00001100 Multiplier[0] = 1 => ADD +

00111100 Multiplier[0] = 1 => ADD +

2

00001100 00000000 1101 Initialize 0

1

3

4

Multiplicand Product Multiplier Iteration

Multiplication Example (Version 1) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example Multiplicand is zero extended because it is unsigned

0001 01100000 SLL Multiplicand and SRL Multiplier

00001100 Multiplier[0] = 0 => Do Nothing 0011 00110000 SLL Multiplicand and SRL Multiplier

00011000 0110 SLL Multiplicand and SRL Multiplier 00001100 Multiplier[0] = 1 => ADD +

00111100 Multiplier[0] = 1 => ADD +

2

00001100 00000000 1101 Initialize 0

1

3

4

Multiplicand Product Multiplier Iteration

Multiplication Example (Version 1) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example Multiplicand is zero extended because it is unsigned

0001 01100000 SLL Multiplicand and SRL Multiplier

00001100 Multiplier[0] = 0 => Do Nothing 0011 00110000 SLL Multiplicand and SRL Multiplier

00011000 0110 SLL Multiplicand and SRL Multiplier 00001100 Multiplier[0] = 1 => ADD +

00111100 Multiplier[0] = 1 => ADD +

10011100 Multiplier[0] = 1 => ADD +

2

00001100 00000000 1101 Initialize 0

1

3

4

Multiplicand Product Multiplier Iteration

Multiplication Example (Version 1) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example Multiplicand is zero extended because it is unsigned

0001 01100000 SLL Multiplicand and SRL Multiplier

00001100 Multiplier[0] = 0 => Do Nothing 0011 00110000 SLL Multiplicand and SRL Multiplier

00011000 0110 SLL Multiplicand and SRL Multiplier

0000 11000000 SLL Multiplicand and SRL Multiplier

00001100 Multiplier[0] = 1 => ADD +

00111100 Multiplier[0] = 1 => ADD +

10011100 Multiplier[0] = 1 => ADD +

2

00001100 00000000 1101 Initialize 0

1

3

4

Multiplicand Product Multiplier Iteration

Observation on Version 1 of Multiply Hardware in version 1 can be optimized Rather than shifting the multiplicand to the

left Instead, shift the product to the right Has same net effect and produces same results

Reduce Hardware Multiplicand register can be reduced to 32 bits We can also reduce the adder size to 32 bits

One cycle per iteration Shifting and addition can be done

simultaneously

Product = HI and LO registers, HI=0 Product is shifted right Reduced 32-bit Multiplicand & Adder

Version 2 of Multiplication Hardware

= 0

Start

Multiplier[0]?

HI = HI + Multiplicand

32nd Repetition?

Done

= 1

No

Yes

Shift Product = (HI,LO) Right 1 bit Shift Multiplier Right 1 bit

32-bit ALU

Control 64 bits

32 bits

write

add

Multiplicand

shift right

32 bits

33 bits

HI LO

32 bits carry

Multiplier

shift right

Multiplier[0] 32 bits

Eliminate Multiplier Register Initialize LO = Multiplier, HI=0 Product = HI and LO registers

Refined Version of Multiply Hardware

= 0

Start

LO[0]?

HI = HI + Multiplicand

32nd Repetition?

Done

= 1

No

Yes

LO=Multiplier, HI=0

Shift Product = (HI,LO) Right 1 bit 32-bit ALU

Control 64 bits

32 bits

write

add

LO[0]

Multiplicand

shift right

32 bits

33 bits

HI LO

32 bits carry

Multiply Example (Refined Version) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example 4-bit adder produces a 5-bit sum (with carry)

2

1 1 0 0 0 0 0 0 1 1 0 1 Initialize (LO = Multiplier, HI=0) 0

1

3

4

Multiplicand Product = HI, LO Carry Iteration

Multiply Example (Refined Version) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example 4-bit adder produces a 5-bit sum (with carry)

2

1 1 0 0 0 0 0 0 1 1 0 1 Initialize (LO = Multiplier, HI=0) 0

1

3

4

Multiplicand Product = HI, LO Carry Iteration

0 1 1 0 0 1 1 0 1 LO[0] = 1 => ADD +

Multiply Example (Refined Version) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example 4-bit adder produces a 5-bit sum (with carry)

1 1 0 0 Shift Right Product = (HI, LO) 0 1 1 0 0 1 1 0

2

1 1 0 0 0 0 0 0 1 1 0 1 Initialize (LO = Multiplier, HI=0) 0

1

3

4

Multiplicand Product = HI, LO Carry Iteration

0 1 1 0 0 1 1 0 1 LO[0] = 1 => ADD +

Multiply Example (Refined Version) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example 4-bit adder produces a 5-bit sum (with carry)

LO[0] = 0 => Do Nothing 1 1 0 0 Shift Right Product = (HI, LO) 0 1 1 0 0 1 1 0

2

1 1 0 0 0 0 0 0 1 1 0 1 Initialize (LO = Multiplier, HI=0) 0

1

3

4

Multiplicand Product = HI, LO Carry Iteration

0 1 1 0 0 1 1 0 1 LO[0] = 1 => ADD +

Multiply Example (Refined Version) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example 4-bit adder produces a 5-bit sum (with carry)

LO[0] = 0 => Do Nothing 1 1 0 0 Shift Right Product = (HI, LO) 0 0 1 1 0 0 1 1

1 1 0 0 Shift Right Product = (HI, LO) 0 1 1 0 0 1 1 0

2

1 1 0 0 0 0 0 0 1 1 0 1 Initialize (LO = Multiplier, HI=0) 0

1

3

4

Multiplicand Product = HI, LO Carry Iteration

0 1 1 0 0 1 1 0 1 LO[0] = 1 => ADD +

Multiply Example (Refined Version) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example 4-bit adder produces a 5-bit sum (with carry)

LO[0] = 0 => Do Nothing 1 1 0 0 Shift Right Product = (HI, LO) 0 0 1 1 0 0 1 1

1 1 0 0 Shift Right Product = (HI, LO) 0 1 1 0 0 1 1 0

2

1 1 0 0 0 0 0 0 1 1 0 1 Initialize (LO = Multiplier, HI=0) 0

1

3

4

Multiplicand Product = HI, LO Carry Iteration

0 1 1 0 0 1 1 0 1 LO[0] = 1 => ADD +

0 1 1 1 1 0 0 1 1 LO[0] = 1 => ADD +

Multiply Example (Refined Version) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example 4-bit adder produces a 5-bit sum (with carry)

1 1 0 0 Shift Right Product = (HI, LO) 0 1 1 1 1 0 0 1

LO[0] = 0 => Do Nothing 1 1 0 0 Shift Right Product = (HI, LO) 0 0 1 1 0 0 1 1

1 1 0 0 Shift Right Product = (HI, LO) 0 1 1 0 0 1 1 0

2

1 1 0 0 0 0 0 0 1 1 0 1 Initialize (LO = Multiplier, HI=0) 0

1

3

4

Multiplicand Product = HI, LO Carry Iteration

0 1 1 0 0 1 1 0 1 LO[0] = 1 => ADD +

0 1 1 1 1 0 0 1 1 LO[0] = 1 => ADD +

Multiply Example (Refined Version) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example 4-bit adder produces a 5-bit sum (with carry)

1 1 0 0 Shift Right Product = (HI, LO) 0 1 1 1 1 0 0 1

LO[0] = 0 => Do Nothing 1 1 0 0 Shift Right Product = (HI, LO) 0 0 1 1 0 0 1 1

1 1 0 0 Shift Right Product = (HI, LO) 0 1 1 0 0 1 1 0

2

1 1 0 0 0 0 0 0 1 1 0 1 Initialize (LO = Multiplier, HI=0) 0

1

3

4

Multiplicand Product = HI, LO Carry Iteration

0 1 1 0 0 1 1 0 1 LO[0] = 1 => ADD +

0 1 1 1 1 0 0 1 1 LO[0] = 1 => ADD +

1 0 0 1 1 1 0 0 1 LO[0] = 1 => ADD +

Multiply Example (Refined Version) Consider: 11002 × 11012 , Product = 100111002 4-bit multiplicand and multiplier are used in this example 4-bit adder produces a 5-bit sum (with carry)

1 1 0 0 Shift Right Product = (HI, LO) 0 1 1 1 1 0 0 1

LO[0] = 0 => Do Nothing 1 1 0 0 Shift Right Product = (HI, LO) 0 0 1 1 0 0 1 1

1 1 0 0 Shift Right Product = (HI, LO) 0 1 1 0 0 1 1 0

1 1 0 0 Shift Right Product = (HI, LO) 1 0 0 1 1 1 0 0

2

1 1 0 0 0 0 0 0 1 1 0 1 Initialize (LO = Multiplier, HI=0) 0

1

3

4

Multiplicand Product = HI, LO Carry Iteration

0 1 1 0 0 1 1 0 1 LO[0] = 1 => ADD +

0 1 1 1 1 0 0 1 1 LO[0] = 1 => ADD +

1 0 0 1 1 1 0 0 1 LO[0] = 1 => ADD +

Chapter 3 — Arithmetic for Computers — 20

Faster Multiplier Uses multiple adders

Cost/performance tradeoff

Can be pipelined Several multiplication performed in parallel

Chapter 3 — Arithmetic for Computers — 21

MIPS Multiplication Two 32-bit registers for product

HI: most-significant 32 bits LO: least-significant 32-bits

Instructions mult rs, rt / multu rs, rt

64-bit product in HI/LO mfhi rd / mflo rd

Move from HI/LO to rd Can test HI value to see if product overflows 32 bits

mul rd, rs, rt

Least-significant 32 bits of product –> rd

Unsigned Division (Paper & Pencil)

=

11

Chapter 3 — Arithmetic for Computers — 23

Division Check for 0 divisor Long division approach

If divisor ≤ dividend bits 1 bit in quotient, subtract

Otherwise 0 bit in quotient, bring down next

dividend bit Restoring division

Do the subtract, and if remainder goes < 0, add divisor back

Signed division Divide using absolute values Adjust sign of quotient and remainder

as required

1001 1000 1001010 -1000 10 101 1010 -1000 10

n-bit operands yield n-bit quotient and remainder

quotient

dividend

remainder

divisor

§3.4 Division

Chapter 3 — Arithmetic for Computers — 24

Division Hardware

Initially dividend

Initially divisor in left half

Division Example

Dividend = 0111 Divisor = 0010

Optimized Divider Uses two registers: HI and LO Initialize: HI = Remainder = 0 and LO = Dividend Shift (HI, LO) LEFT by 1 bit (also Shift Quotient LEFT)

Shift the remainder and dividend registers together LEFT

Compute: Difference = Remainder – Divisor If (Difference ≥ 0) then

Remainder = Difference Set Least significant Bit of Quotient

Observation to Reduce Hardware: LO register can be also used to store the computed

Quotient

Chapter 3 — Arithmetic for Computers — 27

Optimized Divider Initialize:

HI = 0, LO = Dividend

Results: HI = Remainder LO = Quotient

Unsigned Integer Division Example Example: 11102 / 00112 (4-bit dividend & divisor) Result Quotient = 01002 and Remainder = 00102 4-bit registers for Remainder and Divisor (4-bit ALU)

2

0 0 0 0 1 1 1 0 Initialize 0

1

3

4

HI Difference LO Iteration 0 0 1 1 Divisor

Unsigned Integer Division Example Example: 11102 / 00112 (4-bit dividend & divisor) Result Quotient = 01002 and Remainder = 00102 4-bit registers for Remainder and Divisor (4-bit ALU)

2

0 0 0 0 1 1 1 0 Initialize 0

1

3

4

HI Difference LO Iteration 0 0 1 1 Divisor

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 1 1 0 0 0 0 1 1

Unsigned Integer Division Example Example: 11102 / 00112 (4-bit dividend & divisor) Result Quotient = 01002 and Remainder = 00102 4-bit registers for Remainder and Divisor (4-bit ALU)

2: Diff < 0 => Do Nothing

2

0 0 0 0 1 1 1 0 Initialize 0

1

3

4

HI Difference LO Iteration 0 0 1 1 Divisor

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 1 1 0 0 0 0 1 1

Unsigned Integer Division Example Example: 11102 / 00112 (4-bit dividend & divisor) Result Quotient = 01002 and Remainder = 00102 4-bit registers for Remainder and Divisor (4-bit ALU)

2: Diff < 0 => Do Nothing

2

0 0 0 0 1 1 1 0 Initialize 0

1

3

4

HI Difference LO Iteration 0 0 1 1 Divisor

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 1 1 0 0 0 0 1 1

0 0 0 0 1: Shift Left, Diff = HI - Divisor 0 0 1 1 1 0 0 0 0 0 1 1

Unsigned Integer Division Example Example: 11102 / 00112 (4-bit dividend & divisor) Result Quotient = 01002 and Remainder = 00102 4-bit registers for Remainder and Divisor (4-bit ALU)

2: Rem = Diff, set lsb of LO 1 0 0 1 0 0 0 0

2: Diff < 0 => Do Nothing

2

0 0 0 0 1 1 1 0 Initialize 0

1

3

4

HI Difference LO Iteration 0 0 1 1 Divisor

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 1 1 0 0 0 0 1 1

0 0 0 0 1: Shift Left, Diff = HI - Divisor 0 0 1 1 1 0 0 0 0 0 1 1

Unsigned Integer Division Example Example: 11102 / 00112 (4-bit dividend & divisor) Result Quotient = 01002 and Remainder = 00102 4-bit registers for Remainder and Divisor (4-bit ALU)

2: Rem = Diff, set lsb of LO 1 0 0 1 0 0 0 0

2: Diff < 0 => Do Nothing

2

0 0 0 0 1 1 1 0 Initialize 0

1

3

4

HI Difference LO Iteration 0 0 1 1 Divisor

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 1 1 0 0 0 0 1 1

0 0 0 0 1: Shift Left, Diff = HI - Divisor 0 0 1 1 1 0 0 0 0 0 1 1

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 0 0 1 0 0 0 1 1

Unsigned Integer Division Example Example: 11102 / 00112 (4-bit dividend & divisor) Result Quotient = 01002 and Remainder = 00102 4-bit registers for Remainder and Divisor (4-bit ALU)

2: Rem = Diff, set lsb of LO 1 0 0 1 0 0 0 0

2: Diff < 0 => Do Nothing

2: Diff < 0 => Do Nothing

2

0 0 0 0 1 1 1 0 Initialize 0

1

3

4

HI Difference LO Iteration 0 0 1 1 Divisor

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 1 1 0 0 0 0 1 1

0 0 0 0 1: Shift Left, Diff = HI - Divisor 0 0 1 1 1 0 0 0 0 0 1 1

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 0 0 1 0 0 0 1 1

Unsigned Integer Division Example Example: 11102 / 00112 (4-bit dividend & divisor) Result Quotient = 01002 and Remainder = 00102 4-bit registers for Remainder and Divisor (4-bit ALU)

2: Rem = Diff, set lsb of LO 1 0 0 1 0 0 0 0

2: Diff < 0 => Do Nothing

2: Diff < 0 => Do Nothing

2

0 0 0 0 1 1 1 0 Initialize 0

1

3

4

HI Difference LO Iteration 0 0 1 1 Divisor

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 1 1 0 0 0 0 1 1

0 0 0 0 1: Shift Left, Diff = HI - Divisor 0 0 1 1 1 0 0 0 0 0 1 1

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 0 0 1 0 0 0 1 1

1 1 1 1 1: Shift Left, Diff = HI - Divisor 0 0 1 0 0 1 0 0 0 0 1 1

2: Diff < 0 => Do Nothing

Unsigned Integer Division Example Example: 11102 / 00112 (4-bit dividend & divisor) Result Quotient = 01002 and Remainder = 00102 4-bit registers for Remainder and Divisor (4-bit ALU)

2: Rem = Diff, set lsb of LO 1 0 0 1 0 0 0 0

2: Diff < 0 => Do Nothing

2: Diff < 0 => Do Nothing

2

0 0 0 0 1 1 1 0 Initialize 0

1

3

4

HI Difference LO Iteration 0 0 1 1 Divisor

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 1 1 0 0 0 0 1 1

0 0 0 0 1: Shift Left, Diff = HI - Divisor 0 0 1 1 1 0 0 0 0 0 1 1

1 1 1 0 1: Shift Left, Diff = HI - Divisor 0 0 0 1 0 0 1 0 0 0 1 1

1 1 1 1 1: Shift Left, Diff = HI - Divisor 0 0 1 0 0 1 0 0 0 0 1 1

Chapter 3 — Arithmetic for Computers — 29

Combined Multiplier and Divider

One cycle per partial-remainder subtraction Looks a lot like a multiplier!

Same hardware can be used for both

Chapter 3 — Arithmetic for Computers — 30

Faster Division Can’t use parallel hardware as in multiplier

Subtraction is conditional on sign of remainder Faster dividers (e.g. SRT devision)

generate multiple quotient bits per step Still require multiple steps

Chapter 3 — Arithmetic for Computers — 31

MIPS Division Use HI/LO registers for result

HI: 32-bit remainder LO: 32-bit quotient

Instructions div rs, rt / divu rs, rt

No overflow or divide-by-0 checking Software must perform checks if required

Use mfhi, mflo to access result

Representing Big (and Small) Numbers What if we want to encode

approx. age of the earth? 4,600,000,000 or 4.6 x 109

weight in kg of one a.m.u. (atomic mass unit) 0.0000000000000000000000000166 or 1.6 x 10-27

There is no way we can encode either of them in a 32-bit integer

Floating point representation (-1)sign x F x 2E

Still have to fit everything in 32 bits (single precision)

Chapter 3 — Arithmetic for Computers — 33

Floating Point Representation for non-integral numbers

Including very small and very large numbers Like scientific notation

–2.34 × 1056 +0.002 × 10–4 +987.02 × 109

In binary ±1.xxxxxxx2 × 2yyyy

Types float and double in C

normalized

not normalized

§3.5 Floating Point

Chapter 3 — Arithmetic for Computers — 34

Floating Point Standard Defined by IEEE Std 754-1985 Developed in response to divergence of

representations Portability issues for scientific code

Now almost universally adopted Two representations

Single precision (32-bit) Double precision (64-bit)

Chapter 3 — Arithmetic for Computers — 35

IEEE Floating-Point Format

S: sign bit (0 ⇒ non-negative, 1 ⇒ negative) Normalized significand: 1.0 ≤ |significand| < 2.0

Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)

Significand is Fraction with the “1.” restored

Exponent: excess representation: actual exponent + Bias Ensures exponent is unsigned Single: Bias = 127; Double: Bias = 1023

S Exponent Fraction

single: 8 bits double: 11 bits

single: 23 bits double: 52 bits

Bias)(ExponentS 2Fraction)(11)(x −×+×−=

Chapter 3 — Arithmetic for Computers — 36

IEEE Floating-Point Format

All 0

All 1

All 0

All 1

All 0

All 1

All 0

All 1

Chapter 3 — Arithmetic for Computers — 37

Single-Precision Range Exponents 00000000 and 11111111 reserved Smallest value

Exponent: 00000001 ⇒ actual exponent = 1 – 127 = –126

Fraction: 000…00 ⇒ significand = 1.0 ±1.0 × 2–126 ≈ ±1.2 × 10–38

Largest value exponent: 11111110 ⇒ actual exponent = 254 – 127 = +127

Fraction: 111…11 ⇒ significand ≈ 2.0 ±2.0 × 2+127 ≈ ±3.4 × 10+38

Chapter 3 — Arithmetic for Computers — 38

Double-Precision Range Exponents 0000…00 and 1111…11 reserved Smallest value

Exponent: 00000000001 ⇒ actual exponent = 1 – 1023 = –1022

Fraction: 000…00 ⇒ significand = 1.0 ±1.0 × 2–1022 ≈ ±2.2 × 10–308

Largest value Exponent: 11111111110 ⇒ actual exponent = 2046 – 1023 = +1023

Fraction: 111…11 ⇒ significand ≈ 2.0 ±2.0 × 2+1023 ≈ ±1.8 × 10+308

Chapter 3 — Arithmetic for Computers — 39

Floating-Point Precision Relative precision

all fraction bits are significant Single: approx 2–23

Equivalent to 23 × log102 ≈ 23 × 0.3 ≈ 6 decimal digits of precision

Double: approx 2–52

Equivalent to 52 × log102 ≈ 52 × 0.3 ≈ 16 decimal digits of precision

Chapter 3 — Arithmetic for Computers — 40

Floating-Point Example Represent –0.75

–0.75 = (–1)1 × 1.12 × 2–1

S = 1 Fraction = 1000…002 Exponent = –1 + Bias

Single: –1 + 127 = 126 = 011111102 Double: –1 + 1023 = 1022 = 011111111102

Single: 1011111101000…00 Double: 1011111111101000…00

Chapter 3 — Arithmetic for Computers — 41

Floating-Point Example What number is represented by the single-

precision float 11000000101000…00

S = 1 Fraction = 01000…002 Exponent = 100000012 = 129

x = (–1)1 × (1 + 0.012) × 2(129 – 127) = (–1) × 1.25 × 22 = –5.0

Chapter 3 — Arithmetic for Computers — 42

Representable FP Numbers

Denser Denser Sparser Sparser

Negative numbers FLP FLP ±0 +∞

–∞

Overflow region

Overflow region

Underflow regions

Positive numbers

Underflow example

Overflow example

Midway example

Typical example

min max min max + + – – – +

+1.0×2–126 -1.0×2–126 +2.0×2+127 -2.0×2+127

Chapter 3 — Arithmetic for Computers — 43

Zero Biased exponent=0 Fraction = 0 Hidden bit = 0.

Two representations for 0 (single precision)

+0.0 0x00000000

-0.0 0x80000000

0.020)(01)(x BiasS ±=×+×−= −

Chapter 3 — Arithmetic for Computers — 44

Denormal Numbers Exponent = 000...0 ⇒ hidden bit is 0

Smaller than normal numbers allow for gradual underflow, with diminishing

precision

1)BiasS 2Fraction)(01)(x −−×+×−= (

Denormalized Example Find the decimal equivalent of IEEE 754 number

0 00000000 11011000000000000000000 Determine the sign: 0, the number is positive Calculate the exponent

exponent of all 0s means the number is denormalized exponent is therefore -126

Convert the significand to decimal remember to add the implicit bit (this is 0. because the

number is denormalized) (0.11011)2 = (0.84375)10

0 00000000 11011000000000000000000 = 0.84375 x 2-126

Chapter 3 — Arithmetic for Computers — 45

Chapter 3 — Arithmetic for Computers — 46

Infinities and NaNs Exponent = 111...1, Fraction = 000...0

±Infinity Can be used in subsequent calculations,

avoiding need for overflow check Exponent = 111...1, Fraction ≠ 000...0

Not-a-Number (NaN) Indicates illegal or undefined result

e.g., 0.0 / 0.0 Can be used in subsequent calculations

Chapter 3 — Arithmetic for Computers — 47

NaNs 1. Operations with at least one NaN operand 2. Indeterminate forms

divisions 0/0 and ±∞/±∞ multiplications 0×±∞ and ±∞×0 additions ∞ + (−∞), (−∞) + ∞ and equivalent

subtractions 3. Real operations with complex results

square root of a negative number. logarithm of a negative number inverse sine or cosine of a number that is less than

−1 or greater than +1.

Chapter 3 — Arithmetic for Computers — 48

Floating-Point Addition Consider a 4-digit decimal example

9.999 × 101 + 1.610 × 10–1

1. Align decimal points Shift number with smaller exponent 9.999 × 101 + 0.016 × 101

2. Add significands 9.999 × 101 + 0.016 × 101 = 10.015 × 101

3. Normalize result & check for over/underflow 1.0015 × 102

4. Round and renormalize if necessary 1.002 × 102

Chapter 3 — Arithmetic for Computers — 49

Floating-Point Addition Now consider a 4-digit binary example

1.0002 × 2–1 + –1.1102 × 2–2 (0.5 + –0.4375) 1. Align binary points

Shift number with smaller exponent 1.0002 × 2–1 + –0.1112 × 2–1

2. Add significands 1.0002 × 2–1 + –0.1112 × 2–1 = 0.0012 × 2–1

3. Normalize result & check for over/underflow 1.0002 × 2–4, with no over/underflow

4. Round and renormalize if necessary 1.0002 × 2–4 (no change) = 0.0625

Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3

For simplicity, we assume 4 bits of precision (or 3 bits of fraction)

Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3

For simplicity, we assume 4 bits of precision (or 3 bits of fraction)

Exponents are not equal

Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3

For simplicity, we assume 4 bits of precision (or 3 bits of fraction)

Exponents are not equal

Shift the significand of the lesser exponent right until its exponent matches the larger number

Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3

For simplicity, we assume 4 bits of precision (or 3 bits of fraction)

Exponents are not equal

Shift the significand of the lesser exponent right until its exponent matches the larger number (1.011)2 × 2–3 = (0.1011)2 × 2–2 = (0.01011)2 × 2–1

Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3

For simplicity, we assume 4 bits of precision (or 3 bits of fraction)

Exponents are not equal

Shift the significand of the lesser exponent right until its exponent matches the larger number (1.011)2 × 2–3 = (0.1011)2 × 2–2 = (0.01011)2 × 2–1

Difference between the two exponents = –1 – (–3) = 2

Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3

For simplicity, we assume 4 bits of precision (or 3 bits of fraction)

Exponents are not equal

Shift the significand of the lesser exponent right until its exponent matches the larger number (1.011)2 × 2–3 = (0.1011)2 × 2–2 = (0.01011)2 × 2–1

Difference between the two exponents = –1 – (–3) = 2

So, shift right by 2 bits

Floating Point Addition Example Consider adding: (1.111)2 × 2–1 + (1.011)2 × 2–3

For simplicity, we assume 4 bits of precision (or 3 bits of fraction)

Exponents are not equal

Shift the significand of the lesser exponent right until its exponent matches the larger number (1.011)2 × 2–3 = (0.1011)2 × 2–2 = (0.01011)2 × 2–1

Difference between the two exponents = –1 – (–3) = 2

So, shift right by 2 bits Now, add the significands:

Carry

1.111 0.01011

10.00111

+

Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1

Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1

However, result (10.00111)2 × 2–1 is NOT normalized

Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1

However, result (10.00111)2 × 2–1 is NOT normalized

Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20

Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1

However, result (10.00111)2 × 2–1 is NOT normalized

Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20

In this example, we have a carry

So, shift right by 1 bit and increment the exponent

Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1

However, result (10.00111)2 × 2–1 is NOT normalized

Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20

In this example, we have a carry

So, shift right by 1 bit and increment the exponent

Round the significand to fit in appropriate number of bits

We assumed 4 bits of precision or 3 bits of fraction

Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1

However, result (10.00111)2 × 2–1 is NOT normalized

Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20

In this example, we have a carry

So, shift right by 1 bit and increment the exponent

Round the significand to fit in appropriate number of bits

We assumed 4 bits of precision or 3 bits of fraction

Round to nearest: (1.000111)2 ≈ (1.001)2

1.000 111 1

1.001

+

Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1

However, result (10.00111)2 × 2–1 is NOT normalized

Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20

In this example, we have a carry

So, shift right by 1 bit and increment the exponent

Round the significand to fit in appropriate number of bits

We assumed 4 bits of precision or 3 bits of fraction

Round to nearest: (1.000111)2 ≈ (1.001)2

Renormalize if rounding generates a carry

1.000 111 1

1.001

+

Addition Example – cont’d So, (1.111)2 × 2–1 + (1.011)2 × 2–3 = (10.00111)2 × 2–1

However, result (10.00111)2 × 2–1 is NOT normalized

Normalize result: (10.00111)2 × 2–1 = (1.000111)2 × 20

In this example, we have a carry

So, shift right by 1 bit and increment the exponent

Round the significand to fit in appropriate number of bits

We assumed 4 bits of precision or 3 bits of fraction

Round to nearest: (1.000111)2 ≈ (1.001)2

Renormalize if rounding generates a carry

Detect overflow / underflow

If exponent becomes too large (overflow) or too small (underflow) Represent overflow/underflow accordingly

1.000 111 1

1.001

+

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.000)2 × 22

We assume again: 4 bits of precision (or 3 bits of fraction)

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.000)2 × 22

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.000)2 × 22

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = 2 – (–3) = 5

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.000)2 × 22

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = 2 – (–3) = 5 Shift right by 5 bits: (1.000)2 × 2–3 = (0.00001000)2 × 22

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.000)2 × 22

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = 2 – (–3) = 5 Shift right by 5 bits: (1.000)2 × 2–3 = (0.00001000)2 × 22

Convert subtraction into addition to 2's complement

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.000)2 × 22

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = 2 – (–3) = 5 Shift right by 5 bits: (1.000)2 × 2–3 = (0.00001000)2 × 22

Convert subtraction into addition to 2's complement

+ 0.00001 × 22

– 1.00000 × 22

Sign

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.000)2 × 22

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = 2 – (–3) = 5 Shift right by 5 bits: (1.000)2 × 2–3 = (0.00001000)2 × 22

Convert subtraction into addition to 2's complement

+ 0.00001 × 22

– 1.00000 × 22

0 0.00001 × 22

1 1.00000 × 22

Sign

2’s

Com

plem

ent

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.000)2 × 22

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = 2 – (–3) = 5 Shift right by 5 bits: (1.000)2 × 2–3 = (0.00001000)2 × 22

Convert subtraction into addition to 2's complement

+ 0.00001 × 22

– 1.00000 × 22

0 0.00001 × 22

1 1.00000 × 22

1 1.00001 × 22

Sign

2’s

Com

plem

ent

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.000)2 × 22

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = 2 – (–3) = 5 Shift right by 5 bits: (1.000)2 × 2–3 = (0.00001000)2 × 22

Convert subtraction into addition to 2's complement

+ 0.00001 × 22

– 1.00000 × 22

0 0.00001 × 22

1 1.00000 × 22

1 1.00001 × 22

Sign

Since result is negative, convert result from 2's complement to sign-magnitude

2’s

Com

plem

ent

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.000)2 × 22

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = 2 – (–3) = 5 Shift right by 5 bits: (1.000)2 × 2–3 = (0.00001000)2 × 22

Convert subtraction into addition to 2's complement

+ 0.00001 × 22

– 1.00000 × 22

0 0.00001 × 22

1 1.00000 × 22

1 1.00001 × 22

Sign

Since result is negative, convert result from 2's complement to sign-magnitude

2’s

Com

plem

ent

– 0.11111 × 22 2’s Complement

Subtraction Example – cont’d So, (1.000)2 × 2–3 – (1.000)2 × 22 = – 0.111112 × 22

Subtraction Example – cont’d So, (1.000)2 × 2–3 – (1.000)2 × 22 = – 0.111112 × 22 Normalize result: – 0.111112 × 22 = – 1.11112 × 21

Subtraction Example – cont’d So, (1.000)2 × 2–3 – (1.000)2 × 22 = – 0.111112 × 22 Normalize result: – 0.111112 × 22 = – 1.11112 × 21 Round the significand to fit in appropriate number of bits

We assumed 4 bits of precision or 3 bits of fraction

Subtraction Example – cont’d So, (1.000)2 × 2–3 – (1.000)2 × 22 = – 0.111112 × 22 Normalize result: – 0.111112 × 22 = – 1.11112 × 21 Round the significand to fit in appropriate number of bits

We assumed 4 bits of precision or 3 bits of fraction

Round to nearest: (1.1111)2 ≈ (10.000)2

1.111 1 1

10.000

+

Subtraction Example – cont’d So, (1.000)2 × 2–3 – (1.000)2 × 22 = – 0.111112 × 22 Normalize result: – 0.111112 × 22 = – 1.11112 × 21 Round the significand to fit in appropriate number of bits

We assumed 4 bits of precision or 3 bits of fraction

Round to nearest: (1.1111)2 ≈ (10.000)2 Renormalize: rounding generated a carry –1.11112 × 21 ≈ –10.0002 × 21 = –1.0002 × 22

1.111 1 1

10.000

+

Subtraction Example – cont’d So, (1.000)2 × 2–3 – (1.000)2 × 22 = – 0.111112 × 22 Normalize result: – 0.111112 × 22 = – 1.11112 × 21 Round the significand to fit in appropriate number of bits

We assumed 4 bits of precision or 3 bits of fraction

Round to nearest: (1.1111)2 ≈ (10.000)2 Renormalize: rounding generated a carry –1.11112 × 21 ≈ –10.0002 × 21 = –1.0002 × 22

Result would have been accurate if more fraction bits are used

1.111 1 1

10.000

+

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.100)2 × 2-4

We assume again: 4 bits of precision (or 3 bits of fraction)

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.100)2 × 2-4

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.100)2 × 2-4

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = -3 – (–4) = 1

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.100)2 × 2-4

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = -3 – (–4) = 1 Shift right by 1 bits: (1.100)2 × 2–4 = (0.1100)2 × 2-3

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.100)2 × 2-4

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = -3 – (–4) = 1 Shift right by 1 bits: (1.100)2 × 2–4 = (0.1100)2 × 2-3

Convert subtraction into addition to 2's complement

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.100)2 × 2-4

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = -3 – (–4) = 1 Shift right by 1 bits: (1.100)2 × 2–4 = (0.1100)2 × 2-3

Convert subtraction into addition to 2's complement

+ 1.0000 × 2-3

– 0.1100 × 2-3

Sign

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.100)2 × 2-4

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = -3 – (–4) = 1 Shift right by 1 bits: (1.100)2 × 2–4 = (0.1100)2 × 2-3

Convert subtraction into addition to 2's complement

+ 1.0000 × 2-3

– 0.1100 × 2-3

0 1.0000 × 2-3

1 1.0100 × 2-3

Sign

2’s

Com

plem

ent

Floating Point Subtraction Example Consider: (1.000)2 × 2–3 – (1.100)2 × 2-4

We assume again: 4 bits of precision (or 3 bits of fraction)

Shift right significand of the lesser exponent Difference between the two exponents = -3 – (–4) = 1 Shift right by 1 bits: (1.100)2 × 2–4 = (0.1100)2 × 2-3

Convert subtraction into addition to 2's complement

+ 1.0000 × 2-3

– 0.1100 × 2-3

0 1.0000 × 2-3

1 1.0100 × 2-3

0 0.0100 × 2-3

Sign

2’s

Com

plem

ent

Subtraction Example – cont’d So, (1.000)2 × 2–3 – (1.100)2 × 2-4 = 0.01002 × 2-3

Subtraction Example – cont’d So, (1.000)2 × 2–3 – (1.100)2 × 2-4 = 0.01002 × 2-3 Normalize result: 0.01002 × 2-3 = 1.00002 × 2-5

Subtraction Example – cont’d So, (1.000)2 × 2–3 – (1.100)2 × 2-4 = 0.01002 × 2-3 Normalize result: 0.01002 × 2-3 = 1.00002 × 2-5

For subtraction, we can have leading zeros

Subtraction Example – cont’d So, (1.000)2 × 2–3 – (1.100)2 × 2-4 = 0.01002 × 2-3 Normalize result: 0.01002 × 2-3 = 1.00002 × 2-5

For subtraction, we can have leading zeros

Round the significand to fit in appropriate number of bits We assumed 4 bits of precision or 3 bits of fraction

Answer: 1.0002 × 2-5

Chapter 3 — Arithmetic for Computers — 57

FP Adder Hardware Much more complex than integer adder Doing it in one clock cycle would take too

long Much longer than integer operations Slower clock would penalize all instructions

FP adder usually takes several cycles Can be pipelined

Chapter 3 — Arithmetic for Computers — 58

FP Adder Hardware

Step 1

Step 2

Step 3

Step 4

Chapter 3 — Arithmetic for Computers — 59

FP Adder Hardware

Chapter 3 — Arithmetic for Computers — 61

Floating-Point Multiplication Consider a 4-digit decimal example

1.110 × 1010 × 9.200 × 10–5

1. Add exponents For biased exponents, subtract bias from sum New exponent = 10 + –5 = 5

2. Multiply significands 1.110 × 9.200 = 10.212 ⇒ 10.212 × 105

3. Normalize result & check for over/underflow 1.0212 × 106

4. Round and renormalize if necessary 1.021 × 106

5. Determine sign of result from signs of operands +1.021 × 106

Chapter 3 — Arithmetic for Computers — 62

Floating-Point Multiplication Now consider a 4-digit binary example

1.0002 × 2–1 × –1.1102 × 2–2 (0.5 × –0.4375) 1. Add exponents

Unbiased: –1 + –2 = –3 Biased: (–1 + 127) + (–2 + 127) = –3 + 254 – 127 = –3 + 127

2. Multiply significands 1.0002 × 1.1102 = 1.1102 ⇒ 1.1102 × 2–3

3. Normalize result & check for over/underflow 1.1102 × 2–3 (no change) with no over/underflow

4. Round and renormalize if necessary 1.1102 × 2–3 (no change)

5. Determine sign: +ve × –ve ⇒ –ve –1.1102 × 2–3 = –0.21875

Chapter 3 — Arithmetic for Computers — 63

FP Arithmetic Hardware FP multiplier is of similar complexity to FP

adder But uses a multiplier for significands instead of

an adder FP arithmetic hardware usually does

Addition, subtraction, multiplication, division, reciprocal, square-root

FP ↔ integer conversion Operations usually takes several cycles

Can be pipelined

Chapter 3 — Arithmetic for Computers — 65

FP Instructions in MIPS FP hardware is coprocessor 1

Adjunct processor that extends the ISA Separate FP registers

32 single-precision: $f0, $f1, … $f31 Paired for double-precision: $f0/$f1, $f2/$f3, …

Release 2 of MIPs ISA supports 32 × 64-bit FP reg’s

FP instructions operate only on FP registers Programs generally don’t do integer ops on FP data, or

vice versa More registers with minimal code-size impact

FP load and store instructions lwc1, ldc1, swc1, sdc1

e.g., ldc1 $f8, 32($sp)

Chapter 3 — Arithmetic for Computers — 66

FP Instructions in MIPS Single-precision arithmetic

add.s, sub.s, mul.s, div.s add.s $f0, $f1, $f6 #$f0 = $f1 + $f6

Double-precision arithmetic add.d, sub.d, mul.d, div.d

add.d $f0, $f2, $f6 #$f0:$f1 = $f2:$f3 + $f6:$f7

mul.d $f4, $f4, $f6

Single- and double-precision comparison c.xx.s, c.xx.d (xx is eq, lt, le, …) Sets or clears FP condition-code bit (0 to 7)

c.lt.s $f3, $f4 #if($f3 < $f4) cond=1; else cond=0

Branch on FP condition code true or false bc1t, bc1f

bc1t TargetLabel #if(cond==1) go to TargetLabel

Chapter 3 — Arithmetic for Computers — 67

FP Example: °F to °C C code: float f2c (float fahr) { return ((5.0/9.0)*(fahr - 32.0)); }

fahr in $f12, result in $f0, literals in global memory space

Compiled MIPS code: f2c: lwc1 $f16, const5($gp) lwc1 $f18, const9($gp) div.s $f16, $f16, $f18 lwc1 $f18, const32($gp) sub.s $f18, $f12, $f18 mul.s $f0, $f16, $f18 jr $ra

Chapter 3 — Arithmetic for Computers — 71

Accurate Arithmetic IEEE Std 754 specifies additional rounding

control Extra bits of precision (guard, round, sticky) Choice of rounding modes Allows programmer to fine-tune numerical behavior of

a computation Not all FP units implement all options

Most programming languages and FP libraries just use defaults

Trade-off between hardware complexity, performance, and market requirements

Support for Accurate Arithmetic

IEEE 754 FP rounding modes Always round up (toward +∞) Always round down (toward -∞) Truncate (toward zero) Round to nearest even (RTNE)

Support for Accurate Arithmetic

Rounding (except for truncation) requires the hardware to include extra F bits during calculations Guard bit – used to provide one F bit when shifting left to normalize

a result (e.g., when normalizing F after division or subtraction) Round bit – used to improve rounding accuracy Sticky bit – used to support Round to nearest even; is set to a 1

whenever a 1 bit shifts (right) through it (e.g., when aligning F during addition/subtraction) sticky OR of the discarded bits

IEEE 754 FP rounding modes Always round up (toward +∞) Always round down (toward -∞) Truncate (toward zero) Round to nearest even (RTNE)

F = 1 . xxxxxxxxxxxxxxxxxxxxxxx G R S

Rounding Examples

Chapter 3 — Arithmetic for Computers — 73

Number To +∞ To −∞ To Zero +1.000101 +1.0010 +1.0001 +1.0001 - 1.000111 - 1.0001 - 1.0010 - 1.0001 +1.010110 +1.0110 +1.0101 +1.0101 +1.010010 +1.0101 +1.0100 +1.0100 - 1.001110 - 1.0011 - 1.0100 - 1.0011

RTNE Examples GRS - Action (after first normalization – if required) 0xx - round down = do nothing (x means any bit value, 0 or 1) 100 - this is a tie: round up if the bit just before G is 1, else

round down = do nothing 101 - round up (increment by 1) 110 - round up (increment by 1) 111 - round up (increment by 1) Sig GRS Increment? Rounded 1.000000 N 1.000 1.101000 N 1.101 1.000100 N 1.000 1.001100 Y 1.010 1.000101 Y 1.001 1.111110 Y 10.000

Chapter 3 — Arithmetic for Computers — 74

Guard Bit Guard bit: guards against loss of a significant bit

Only one guard bit is needed to maintain accuracy of result Shifted left (if needed) during normalization as last fraction bit

Guard Bit Guard bit: guards against loss of a significant bit

Only one guard bit is needed to maintain accuracy of result Shifted left (if needed) during normalization as last fraction bit

Example on the need of a guard bit:

Guard Bit Guard bit: guards against loss of a significant bit

Only one guard bit is needed to maintain accuracy of result Shifted left (if needed) during normalization as last fraction bit

Example on the need of a guard bit:

1.00000000101100010001101 × 25

– 1.00000000000000011011010 × 2-2 (subtraction)

Guard Bit Guard bit: guards against loss of a significant bit

Only one guard bit is needed to maintain accuracy of result Shifted left (if needed) during normalization as last fraction bit

Example on the need of a guard bit:

1.00000000101100010001101 × 25

– 1.00000000000000011011010 × 2-2 (subtraction)

1.00000000101100010001101 × 25

– 0.00000010000000000000001 1011010 × 25 (shift right 7 bits)

Guard Bit Guard bit: guards against loss of a significant bit

Only one guard bit is needed to maintain accuracy of result Shifted left (if needed) during normalization as last fraction bit

Example on the need of a guard bit:

1.00000000101100010001101 × 25

– 1.00000000000000011011010 × 2-2 (subtraction)

1.00000000101100010001101 × 25

– 0.00000010000000000000001 1011010 × 25 (shift right 7 bits)

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 100110 × 25 (2's complement)

Guard Bit Guard bit: guards against loss of a significant bit

Only one guard bit is needed to maintain accuracy of result Shifted left (if needed) during normalization as last fraction bit

Example on the need of a guard bit:

1.00000000101100010001101 × 25

– 1.00000000000000011011010 × 2-2 (subtraction)

1.00000000101100010001101 × 25

– 0.00000010000000000000001 1011010 × 25 (shift right 7 bits)

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 100110 × 25 (2's complement)

Guard bit – do not discard

Guard Bit Guard bit: guards against loss of a significant bit

Only one guard bit is needed to maintain accuracy of result Shifted left (if needed) during normalization as last fraction bit

Example on the need of a guard bit:

1.00000000101100010001101 × 25

– 1.00000000000000011011010 × 2-2 (subtraction)

1.00000000101100010001101 × 25

– 0.00000010000000000000001 1011010 × 25 (shift right 7 bits)

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 100110 × 25 (2's complement)

0 0.11111110101100010001011 0 100110 × 25 (add significands)

Guard bit – do not discard

Guard Bit Guard bit: guards against loss of a significant bit

Only one guard bit is needed to maintain accuracy of result Shifted left (if needed) during normalization as last fraction bit

Example on the need of a guard bit:

1.00000000101100010001101 × 25

– 1.00000000000000011011010 × 2-2 (subtraction)

1.00000000101100010001101 × 25

– 0.00000010000000000000001 1011010 × 25 (shift right 7 bits)

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 100110 × 25 (2's complement)

0 0.11111110101100010001011 0 100110 × 25 (add significands)

+ 1.11111101011000100010110 1 001100 × 24 (normalized)

Guard bit – do not discard

Guard Bit Guard bit: guards against loss of a significant bit

Only one guard bit is needed to maintain accuracy of result Shifted left (if needed) during normalization as last fraction bit

Example on the need of a guard bit:

1.00000000101100010001101 × 25

– 1.00000000000000011011010 × 2-2 (subtraction)

1.00000000101100010001101 × 25

– 0.00000010000000000000001 1011010 × 25 (shift right 7 bits)

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 100110 × 25 (2's complement)

0 0.11111110101100010001011 0 100110 × 25 (add significands)

+ 1.11111101011000100010110 1 001100 × 24 (normalized)

Guard bit – do not discard

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 1 00110 × 25 (2's complement)

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 1 00110 × 25 (2's complement)

0 0.11111110101100010001011 0 1 00110 × 25 (sum)

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 1 00110 × 25 (2's complement)

0 0.11111110101100010001011 0 1 00110 × 25 (sum)

+ 1.11111101011000100010110 1 0 1 × 24 (normalized)

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 1 00110 × 25 (2's complement)

0 0.11111110101100010001011 0 1 00110 × 25 (sum)

+ 1.11111101011000100010110 1 0 1 × 24 (normalized)

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 1 00110 × 25 (2's complement)

0 0.11111110101100010001011 0 1 00110 × 25 (sum)

+ 1.11111101011000100010110 1 0 1 × 24 (normalized) Round bit

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 1 00110 × 25 (2's complement)

0 0.11111110101100010001011 0 1 00110 × 25 (sum)

+ 1.11111101011000100010110 1 0 1 × 24 (normalized) Round bit Sticky bit

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 1 00110 × 25 (2's complement)

0 0.11111110101100010001011 0 1 00110 × 25 (sum)

+ 1.11111101011000100010110 1 0 1 × 24 (normalized) Round bit Sticky bit

OR-reduce

Guard bit

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 1 00110 × 25 (2's complement)

0 0.11111110101100010001011 0 1 00110 × 25 (sum)

+ 1.11111101011000100010110 1 0 1 × 24 (normalized)

Round bit Sticky bit

OR-reduce

Guard bit

Guard bit

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 1 00110 × 25 (2's complement)

0 0.11111110101100010001011 0 1 00110 × 25 (sum)

+ 1.11111101011000100010110 1 0 1 × 24 (normalized)

Round bit Sticky bit

OR-reduce

Guard bit

Guard bit

Round and Sticky Bits Two extra bits are needed for rounding

Rounding performed after normalizing a result significand Round bit: appears after the guard bit Sticky bit: appears after the round bit (OR of all additional bits)

Consider the same example of previous slide:

0 1.00000000101100010001101 × 25

1 1.11111101111111111111110 0 1 00110 × 25 (2's complement)

0 0.11111110101100010001011 0 1 00110 × 25 (sum)

+ 1.11111101011000100010110 1 0 1 × 24 (normalized)

Round bit Sticky bit

OR-reduce

Guard bit

Guard bit

Recommended