ALGORITHMS AND DESIGN METHODS FOR DIGITAL COMPUTER …

ALGORITHMS AND DESIGN METHODS FOR

DIGITAL COMPUTER ARITHMETIC

INTERNATIONAL SECOND EDITION

Behrooz Parhami

Department of Electrical and Computer Engineering

University of California, Santa Barbara

NEW YORK, OXFORD

OXFORD UNIVERSITY PRESS

CONTENTS

Preface to the International Edition

PART I NUMBER REPRESENTATION

1 Numbers and Arithmetic
    1.1 What is Computer Arithmetic?
    1.2 Motivating Examples
    1.3 Numbers and Their Encodings
    1.4 Fixed-Radix Positional Number Systems
    1.5 Number Radix Conversion
    1.6 Classes of Number Representations
    Problems
    References and Further Readings

2 Representing Signed Numbers
    2.1 Signed-Magnitude Representation
    2.2 Biased Representations
    2.3 Complement Representations
    2.4 2's- and 1's-Complement Numbers
    2.5 Direct and Indirect Signed Arithmetic
    2.6 Using Signed Positions or Signed Digits


3 Redundant Number Systems
    3.1 Coping with the Carry Problem
    3.2 Redundancy in Computer Arithmetic
    3.3 Digit Sets and Digit-Set Conversions
    3.4 Generalized Signed-Digit Numbers
    3.5 Carry-Free Addition Algorithms
    3.6 Conversions and Support Functions


4 Residue Number Systems
    4.1 RNS Representation and Arithmetic
    4.2 Choosing the RNS Moduli
    4.3 Encoding and Decoding of Numbers
    4.4 Difficult RNS Arithmetic Operations
    4.5 Redundant RNS Representations
    4.6 Limits of Fast Arithmetic in RNS


PART II ADDITION/SUBTRACTION

5 Basic Addition and Counting
    5.1 Bit-Serial and Ripple-Carry Adders
    5.2 Conditions and Exceptions
    5.3 Analysis of Carry Propagation
    5.4 Carry-Completion Detection
    5.5 Addition of a Constant: Counters
    5.6 Manchester Carry Chains and Adders


6 Carry-Lookahead Adders
    6.1 Unrolling the Carry Recurrence
    6.2 Carry-Lookahead Adder Design
    6.3 Ling Adder and Related Designs
    6.4 Carry Determination as Prefix Computation
    6.5 Alternative Parallel Prefix Networks
    6.6 VLSI Implementation Aspects


7 Variations in Fast Adders
    7.1 Simple Carry-Skip Adders
    7.2 Multilevel Carry-Skip Adders
    7.3 Carry-Select Adders
    7.4 Conditional-Sum Adder
    7.5 Hybrid Designs and Optimizations
    7.6 Modular Two-Operand Adders



8 Multioperand Addition
    8.1 Using Two-Operand Adders
    8.2 Carry-Save Adders
    8.3 Wallace and Dadda Trees
    8.4 Parallel Counters and Compressors
    8.5 Adding Multiple Signed Numbers
    8.6 Modular Multioperand Adders


PART III MULTIPLICATION

9 Basic Multiplication Schemes
    9.1 Shift/Add Multiplication Algorithms
    9.2 Programmed Multiplication
    9.3 Basic Hardware Multipliers
    9.4 Multiplication of Signed Numbers
    9.5 Multiplication by Constants
    9.6 Preview of Fast Multipliers


10 High-Radix Multipliers
    10.1 Radix-4 Multiplication
    10.2 Modified Booth's Recoding
    10.3 Using Carry-Save Adders
    10.4 Radix-8 and Radix-16 Multipliers
    10.5 Multibeat Multipliers
    10.6 VLSI Complexity Issues


11 Tree and Array Multipliers
    11.1 Full-Tree Multipliers
    11.2 Alternative Reduction Trees
    11.3 Tree Multipliers for Signed Numbers
    11.4 Partial-Tree and Truncated Multipliers
    11.5 Array Multipliers
    11.6 Pipelined Tree and Array Multipliers



12 Variations in Multipliers
    12.1 Divide-and-Conquer Designs
    12.2 Additive Multiply Modules
    12.3 Bit-Serial Multipliers
    12.4 Modular Multipliers
    12.5 The Special Case of Squaring
    12.6 Combined Multiply-Add Units


PART IV DIVISION

13 Basic Division Schemes
    13.1 Shift/Subtract Division Algorithms
    13.2 Programmed Division
    13.3 Restoring Hardware Dividers
    13.4 Nonrestoring and Signed Division
    13.5 Division by Constants
    13.6 Radix-2 SRT Division


14 High-Radix Dividers
    14.1 Basics of High-Radix Division
    14.2 Using Carry-Save Adders
    14.3 Radix-4 SRT Division
    14.4 General High-Radix Dividers
    14.5 Quotient Digit Selection
    14.6 Using p-d Plots in Practice


15 Variations in Dividers
    15.1 Division with Prescaling
    15.2 Overlapped Quotient Digit Selection
    15.3 Combinational and Array Dividers
    15.4 Modular Dividers and Reducers
    15.5 The Special Case of Reciprocation
    15.6 Combined Multiply/Divide Units



16 Division by Convergence
    16.1 General Convergence Methods
    16.2 Division by Repeated Multiplications
    16.3 Division by Reciprocation
    16.4 Speedup of Convergence Division
    16.5 Hardware Implementation
    16.6 Analysis of Lookup Table Size


PART V REAL ARITHMETIC

17 Floating-Point Representations
    17.1 Floating-Point Numbers
    17.2 The IEEE Floating-Point Standard
    17.3 Basic Floating-Point Algorithms
    17.4 Conversions and Exceptions
    17.5 Rounding Schemes
    17.6 Logarithmic Number Systems


18 Floating-Point Operations
    18.1 Floating-Point Adders/Subtractors
    18.2 Pre- and Postshifting
    18.3 Rounding and Exceptions
    18.4 Floating-Point Multipliers and Dividers
    18.5 Fused-Multiply-Add Units
    18.6 Logarithmic Arithmetic Unit


19 Errors and Error Control
    19.1 Sources of Computational Errors
    19.2 Invalidated Laws of Algebra
    19.3 Worst-Case Error Accumulation
    19.4 Error Distribution and Expected Errors
    19.5 Forward Error Analysis
    19.6 Backward Error Analysis


20 Precise and Certifiable Arithmetic
    20.1 High Precision and Certifiability
    20.2 Exact Arithmetic
    20.3 Multiprecision Arithmetic
    20.4 Variable-Precision Arithmetic
    20.5 Error Bounding via Interval Arithmetic
    20.6 Adaptive and Lazy Arithmetic


PART VI FUNCTION EVALUATION

21 Square-Rooting Methods
    21.1 The Pencil-and-Paper Algorithm
    21.2 Restoring Shift/Subtract Algorithm
    21.3 Binary Nonrestoring Algorithm
    21.4 High-Radix Square-Rooting
    21.5 Square-Rooting by Convergence
    21.6 Fast Hardware Square-Rooters


22 The CORDIC Algorithms
    22.1 Rotations and Pseudorotations
    22.2 Basic CORDIC Iterations
    22.3 CORDIC Hardware
    22.4 Generalized CORDIC
    22.5 Using the CORDIC Method
    22.6 An Algebraic Formulation


23 Variations in Function Evaluation
    23.1 Normalization and Range Reduction
    23.2 Computing Logarithms
    23.3 Exponentiation
    23.4 Division and Square-Rooting, Again
    23.5 Use of Approximating Functions
    23.6 Merged Arithmetic



24 Arithmetic by Table Lookup
    24.1 Direct and Indirect Table Lookup
    24.2 Binary-to-Unary Reduction
    24.3 Tables in Bit-Serial Arithmetic
    24.4 Interpolating Memory
    24.5 Piecewise Lookup Tables
    24.6 Multipartite Table Methods


PART VII IMPLEMENTATION TOPICS

25 High-Throughput Arithmetic
    25.1 Pipelining of Arithmetic Functions
    25.2 Clock Rate and Throughput
    25.3 The Earle Latch
    25.4 Parallel and Digit-Serial Pipelines
    25.5 On-Line or Digit-Pipelined Arithmetic
    25.6 Systolic Arithmetic Units


26 Low-Power Arithmetic
    26.1 The Need for Low-Power Design
    26.2 Sources of Power Consumption
    26.3 Reduction of Power Waste
    26.4 Reduction of Activity
    26.5 Transformations and Trade-offs
    26.6 New and Emerging Methods


27 Fault-Tolerant Arithmetic
    27.1 Faults, Errors, and Error Codes
    27.2 Arithmetic Error-Detecting Codes
    27.3 Arithmetic Error-Correcting Codes
    27.4 Self-Checking Function Units
    27.5 Algorithm-Based Fault Tolerance
    27.6 Fault-Tolerant RNS Arithmetic


28 Reconfigurable Arithmetic
    28.1 Programmable Logic Devices
    28.2 Adder Designs for FPGAs
    28.3 Multiplier and Divider Designs
    28.4 Tabular and Distributed Arithmetic
    28.5 Function Evaluation on FPGAs
    28.6 Beyond Fine-Grained Devices


Appendix: Research Topics

Index
