Fundamentals of Computer Organization and Design
Sivarama P. Dandamudi
School of Computer Science
Carleton University
September 22, 2002
To
my parents, Subba Rao and Prameela Rani,
my wife, Sobha,
and
my daughter, Veda
Preface
Computer science and engineering curricula have been evolving at a fast pace to keep up with
the developments in the area. This often dictates that traditional courses will have to be
compressed to accommodate new courses. In particular, it is no longer possible in these curricula
to include separate courses on digital logic, assembly language programming, and computer
organization. Often, these three topics are combined into a single course. The current textbooks
on the market cater to the old-style curricula in these disciplines, with separate books available
on each of these subjects. Most computer organization books do not cover assembly language
programming in sufficient detail. There is a definite need to support the courses that combine
assembly language programming and computer organization. This is the main motivation for
writing this book. It provides comprehensive coverage of digital logic, assembly language
programming, and computer organization.
Intended Use
This book is intended as an undergraduate textbook for computer organization courses offered
by computer science and computer engineering/electrical engineering departments. Unlike
other textbooks in this area, this book provides extensive coverage of assembly language
programming and digital logic. Thus, the book serves the needs of compressed courses.
In addition, it can be used as a text in vocational training courses offered by community
colleges. Because of the teach-by-example style used in the book, it is also suitable for
self-study by computer professionals and engineers.
Prerequisites
The objective is to support a variety of courses on computer organization in computer science
and engineering departments. To satisfy this objective, we assume very little background on
the part of the student. The student is assumed to have had some programming experience in a
structured, high-level language such as C or Java™. This is the background almost all students
in computer science and computer engineering programs typically acquire in their first year
of study. This prerequisite also implies that the student has been exposed to the basics of the
software-development cycle.
Features
Here is a summary of the special features that set this book apart:
• Most computer organization books assume that the students have done a separate digital
logic course before taking the computer organization course. As a result, digital logic
is covered in an appendix to provide an overview. This book provides detailed coverage
of digital logic, including sequential logic circuit design. Three complete chapters
are devoted to digital logic topics, where students are exposed to the practical side with
details on several example digital logic chips. There is also information on digital logic
simulators. Students can conveniently use these simulators to test their designs.
• This book provides extensive coverage of assembly language programming, comprising
assembly language of both CISC and RISC processors. We use the Pentium as the representative
of the CISC category and devote more than five chapters to introducing the
Pentium assembly language. The MIPS processor is used for RISC assembly language
programming. In both cases, students actually write and test working assembly language
programs. The book’s homepage has instructions on downloading assemblers for both
Pentium and MIPS processors.
• We introduce concepts first in simple terms to motivate the reader. Later, we relate these
concepts to practical implementations. In the digital logic part, we use several chips to
show the type of implementations done in practice. For the other topics, we consistently
use three processors—the Pentium, PowerPC, and MIPS—to cover the CISC to RISC
range. In addition, we provide details on the Itanium and SPARC processors.
• Most textbooks in the area treat I/O and interrupts as an appendage. As a result, this
topic is discussed very briefly. Consequently, students do not get any practical experience
on how interrupts work. In contrast, we use the Pentium to illustrate their operation.
Several assembly language programs are used to explain the interrupt concepts. We also
show how interrupt service routines can be written. For instance, one example in the
chapter on interrupts replaces the system-supplied keyboard service routine by our own.
By understanding the practical aspects of interrupt processing, students can write their
own programs to experiment with interrupts.
• Our coverage of system buses is comprehensive and up-to-date. We divide our coverage
into internal and external buses. Internal buses discussed include the ISA, PCI, PCI-X,
AGP, and PCMCIA buses. Our external bus coverage includes the EIA-232, SCSI, USB,
and IEEE 1394 (FireWire) serial buses.
• Extensive assembly programming examples are used to illustrate the points. A set of
input and output routines is provided so that the reader can focus on developing assembly
language programs rather than spending time in understanding how input and output can
be done using the basic I/O functions provided by the operating system.
• We do not use fragments of assembly language code in examples. All examples are
complete in the sense that they can be assembled and run to give a better feeling as to
how these programs work.
• All examples used in the textbook and other proprietary I/O software are available from
the book’s homepage (www.scs.carleton.ca/~sivarama/org_book). In addition, this Web site
also has instructions on downloading the Pentium and MIPS assemblers
to give opportunities for students to perform hands-on assembly programming.
• Most chapters are written in such a way that each chapter can be covered in two or three
60-minute lectures by giving proper reading assignments. Typically, important concepts
are emphasized in the lectures while leaving the other material in the book as a reading
assignment. Our emphasis on extensive examples facilitates this pedagogical approach.
• Interchapter dependencies are kept to a minimum to offer maximum flexibility to instructors
in organizing the material. Each chapter clearly indicates the objectives and provides
an overview at the beginning and a summary and key terms at the end.
Instructional Support
The book’s Web site has complete chapter-by-chapter slides for instructors. Instructors can use
these slides directly in their classes or can modify them to suit their needs. Please contact the
author if you want the PowerPoint source of the slides. Copies of these slides (four per page)
are also available for distribution to students. In addition, instructors can obtain the solutions
manual by contacting the publisher. For more up-to-date details, please see the book’s Web
page at www.scs.carleton.ca/~sivarama/org_book.
Overview and Organization
The book is divided into eight parts. In addition, the appendices provide useful reference material.
Part I consists of a single chapter and gives an overview of basic computer organization and
design.
Part II presents digital logic design in three chapters—Chapters 2, 3, and 4. Chapter 2
covers the digital logic basics. We introduce the basic concepts and building blocks that we
use in the later chapters to build more complex digital circuits such as adders and arithmetic
logic units (ALUs). This chapter also discusses the principles of digital logic design using
Boolean algebra, Karnaugh maps, and Quine–McCluskey methods. The next chapter deals
with combinational circuits. We present the design of adders, comparators, and ALUs. We
also show how programmable logic devices can be used to implement combinational logic
circuits. Chapter 4 covers sequential logic circuits. We introduce the concept of time through
clock signals. We discuss both latches and flip-flops, including master–slave JK flip-flops.
These elements form the basis for designing memories in a later chapter. After presenting some
example sequential circuits such as shift registers and counters, we discuss sequential circuit
design in detail. These three chapters together cover the digital logic topic comprehensively.
The amount of time spent on this part depends on the background of the students.
Part III deals with system interconnection structures. We divide the system buses into
internal and external buses. Our classification is based on whether the bus interconnects
components that are typically inside a system. Part III consists of Chapter 5 and covers internal system
buses. We start this chapter with a discussion of system bus design issues. We discuss both
synchronous and asynchronous buses. We also introduce block transfer bus cycles as well as wait
states. Bus arbitration schemes are described next. We present five example buses: the
ISA, PCI, PCI-X, AGP, and PCMCIA buses. The external buses are covered in Part VIII, which
discusses the I/O issues.
Part IV consists of three chapters and discusses processor design issues. Chapter 6 presents
the basics of processor organization and performance. We discuss instruction set architectures
and instruction set design issues. This chapter also covers microprogrammed control. In
addition, processor performance issues, including the SPEC benchmarks, are discussed. The next
chapter gives details about the Pentium processor. The information presented in this chapter
is useful when we discuss Pentium assembly language programming in Part V. Pipelining and
vector processors are discussed in the last chapter of this part. We use the Cray X-MP system
to look at the practical side of vector processors. After covering the material in Chapter 6,
instructors can choose the material from Chapters 7 and 8 to suit their course requirements.
Part V covers Pentium assembly language programming in detail. There are five chapters
in this part. Chapter 9 provides an overview of the Pentium assembly language. All necessary
basic features are covered in this chapter. After reading this chapter, students can write simple
Pentium assembly programs without needing the information presented in the later four
chapters. Chapter 10 describes the Pentium addressing modes in detail. This chapter gives enough
information for the student to understand why CISC processors provide complex addressing
modes. The next chapter deals with procedures. Our intent is to expose the student to the
underlying mechanics involved in procedure calls, parameter passing, and local variable storage.
In addition, recursive procedures are used to explore the principles involved in handling
recursion. In all these activities, the important role played by the stack is illustrated. Chapter 12
describes the Pentium instruction set. Our goal is not to present the complete Pentium instruction
set, but a representative sample. Chapter 13 deals with the high-level language interface,
which allows mixed-mode programming in more than one language. We use C and assembly
language to illustrate the principles involved in mixed-mode programming. Each chapter uses
several examples to show how various Pentium instructions are used.
Part VI covers RISC processors in two chapters. The first chapter introduces the general
RISC design principles. It also presents details about two RISC processors: the PowerPC and
Intel Itanium. Although both are considered RISC processors, they also have some CISC
features. We discuss a pure RISC processor in the next chapter. The Itanium is Intel’s 64-bit
processor that incorporates not only RISC characteristics but also several advanced architectural
features. These features include instruction-level parallelism, predication, and speculative
loads. The second chapter in this part describes the MIPS R2000 processor. The MIPS
simulator SPIM runs the programs written for the R2000 processor. We present MIPS assembly
language programs that are complete and run on the SPIM. The programs we present here are
the same programs we have written in the Pentium assembly language (in Part V). Thus, the
reader has an opportunity to contrast the two assembly languages.
Part VII consists of Chapters 16 through 18 and covers memory design issues. Chapter 16
builds on the digital logic material presented in Part II. It describes how memory units can be
constructed using the basic latches and flip-flops presented in Chapter 4. Memory mapping
schemes, both full- and partial-mapping, are also discussed. In addition, we discuss how
interleaved memories are designed. The next chapter covers cache memory principles and design
issues. We use an extensive set of examples to illustrate the cache principles. Toward the end
of the chapter, we look at example cache implementations in the Pentium, PowerPC, and MIPS
processors. Chapter 18 discusses virtual memory systems. Note that our coverage of virtual
memory is from the computer organization viewpoint. As a result, we do not cover those
aspects that are of interest from the operating-system point of view. As with the cache memory, we
look at the virtual memory implementations of the Pentium, PowerPC, and MIPS processors.
The last part covers the I/O issues. We cover the basic I/O interface issues in Chapter 19.
We start with I/O address mapping and then discuss three techniques often used to interface
with I/O devices: programmed I/O, interrupt-driven I/O, and DMA. We discuss interrupt-driven
I/O in detail in the next chapter. In addition, Chapter 19 presents details about external
buses. In particular, we cover the EIA-232, USB, and IEEE 1394 serial interfaces and the SCSI
parallel interface. The last chapter covers Pentium interrupts in detail. We use programming
examples to illustrate interrupt-driven access to I/O devices. We also present an example to
show how user-defined interrupt service routines can be written.
The appendices provide a wealth of reference material needed by the student. Appendix A
primarily discusses computer arithmetic. Character representation is discussed in Appendix B.
Appendix C gives information on the use of I/O routines provided with this book and the
Pentium assembler software. The debugging aspect of assembly language programming is
discussed in Appendix D. Appendix E gives details on running the Pentium assembly programs
on a Linux system using the NASM assembler. Appendix F gives details on digital logic
simulators. Details on the MIPS simulator SPIM are in Appendix G. Appendix H describes the
SPARC processor architecture. Finally, selected Pentium instructions are given in Appendix I.
Acknowledgments
Several people have contributed to the writing of this book. First and foremost, I would like to
thank my wife, Sobha, and my daughter, Veda, for enduring my preoccupation with this project.
I thank Wayne Yuhasz, Executive Editor at Springer-Verlag, for his input and feedback in
developing this project. His guidance and continued support for the project are greatly
appreciated. I also want to thank Wayne Wheeler, Assistant Editor, for keeping track of the progress.
He has always been prompt in responding to my queries. Thanks are also due to the staff at
Springer-Verlag New York, Inc., particularly Francine McNeill, for its efforts in producing this
book. I would also like to thank Valerie Greco for doing an excellent job of copyediting the
text.
My sincere appreciation goes to the School of Computer Science at Carleton University for
allowing me to use part of my sabbatical leave to complete this book.
Feedback
Works of this nature are never error-free, despite the best efforts of the authors and others
involved in the project. I welcome your comments, suggestions, and corrections by electronic
mail.
Ottawa, Ontario, Canada Sivarama P. Dandamudi
December 2001 [email protected]
http://www.scs.carleton.ca/~sivarama
Contents
Preface vii
PART I: Overview 1
1 Overview of Computer Organization 3
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.1 Basic Terms and Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Programmer’s View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 Advantages of High-Level Languages . . . . . . . . . . . . . . . . . . . . . 10
1.2.2 Why Program in Assembly Language? . . . . . . . . . . . . . . . . . . . . . 11
1.3 Architect’s View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.4 Implementer’s View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 The Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.5.1 Pipelining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5.2 RISC and CISC Designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.6 Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.6.1 Basic Memory Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.6.2 Byte Ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.6.3 Two Important Memory Design Issues . . . . . . . . . . . . . . . . . . . . . 24
1.7 Input/Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.8 Interconnection: The Glue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
1.9 Historical Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.9.1 The Early Generations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.9.2 Vacuum Tube Generation: Around the 1940s and 1950s . . . . . . . . . . . 31
1.9.3 Transistor Generation: Around the 1950s and 1960s . . . . . . . . . . . . . 32
1.9.4 IC Generation: Around the 1960s and 1970s . . . . . . . . . . . . . . . . . 32
1.9.5 VLSI Generations: Since the Mid-1970s . . . . . . . . . . . . . . . . . . . . 32
1.10 Technological Advances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.11 Summary and Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
PART II: Digital Logic Design 39
2 Digital Logic Basics 41
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.2 Basic Concepts and Building Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.2.1 Simple Gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.2.2 Completeness and Universality . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.2.3 Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.3 Logic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.3.1 Expressing Logic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.3.2 Logical Circuit Equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.4 Boolean Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.4.1 Boolean Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.4.2 Using Boolean Algebra for Logical Equivalence . . . . . . . . . . . . . . . 54
2.5 Logic Circuit Design Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.6 Deriving Logical Expressions from Truth Tables . . . . . . . . . . . . . . . . . . . . 56
2.6.1 Sum-of-Products Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.6.2 Product-of-Sums Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.6.3 Brute Force Method of Implementation . . . . . . . . . . . . . . . . . . . . 58
2.7 Simplifying Logical Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.7.1 Algebraic Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.7.2 Karnaugh Map Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.7.3 Quine–McCluskey Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
2.8 Generalized Gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
2.9 Multiple Outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.10 Implementation Using Other Gates . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
2.10.1 Implementation Using NAND and NOR Gates . . . . . . . . . . . . . . . . 75
2.10.2 Implementation Using XOR Gates . . . . . . . . . . . . . . . . . . . . . . . 77
2.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
2.12 Web Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3 Combinational Circuits 83
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
3.2 Multiplexers and Demultiplexers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.2.1 Implementation: A Multiplexer Chip . . . . . . . . . . . . . . . . . . . . . . 86
3.2.2 Efficient Multiplexer Designs . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.2.3 Implementation: A 4-to-1 Multiplexer Chip . . . . . . . . . . . . . . . . . . 87
3.2.4 Demultiplexers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.3 Decoders and Encoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.3.1 Decoder Chips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.3.2 Encoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.4 Comparators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.4.1 A Comparator Chip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.5 Adders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.5.1 An Example Adder Chip . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
3.6 Programmable Logic Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
3.6.1 Programmable Logic Arrays (PLAs) . . . . . . . . . . . . . . . . . . . . . . 98
3.6.2 Programmable Array Logic Devices (PALs) . . . . . . . . . . . . . . . . . . 100
3.7 Arithmetic and Logic Units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
3.7.1 An Example ALU Chip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
3.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
3.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4 Sequential Logic Circuits 109
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.2 Clock Signal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.3 Latches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
4.3.1 SR Latch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.3.2 Clocked SR Latch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.3.3 D Latch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.4 Flip-Flops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.4.1 D Flip-Flops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.4.2 JK Flip-Flops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.4.3 Example Chips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.5 Example Sequential Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.5.1 Shift Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.5.2 Counters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
4.6 Sequential Circuit Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.6.1 Binary Counter Design with JK Flip-Flops . . . . . . . . . . . . . . . . . . 127
4.6.2 General Design Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
4.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
PART III: Interconnection 145
5 System Buses 147
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.2 Bus Design Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.2.1 Bus Width . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.2.2 Bus Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.2.3 Bus Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.3 Synchronous Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.3.1 Basic Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.3.2 Wait States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.3.3 Block Transfer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
5.4 Asynchronous Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.5 Bus Arbitration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
5.5.1 Dynamic Bus Arbitration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
5.5.2 Implementation of Dynamic Arbitration . . . . . . . . . . . . . . . . . . . . 161
5.6 Example Buses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
5.6.1 The ISA Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
5.6.2 The PCI Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
5.6.3 Accelerated Graphics Port (AGP) . . . . . . . . . . . . . . . . . . . . . . . 180
5.6.4 The PCI-X Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
5.6.5 The PCMCIA Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
5.8 Web Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
5.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
PART IV: Processors 195
6 Processor Organization and Performance 197
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
6.2 Number of Addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
6.2.1 Three-Address Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
6.2.2 Two-Address Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
6.2.3 One-Address Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
6.2.4 Zero-Address Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
6.2.5 A Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
6.2.6 The Load/Store Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 206
6.2.7 Processor Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
6.3 Flow of Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
6.3.1 Branching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
6.3.2 Procedure Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
6.4 Instruction Set Design Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
6.4.1 Operand Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
6.4.2 Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
6.4.3 Instruction Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
6.4.4 Instruction Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
6.5 Microprogrammed Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
6.5.1 Hardware Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
6.5.2 Software Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
6.6 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
6.6.1 Performance Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
6.6.2 Execution Time Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . 238
6.6.3 Means of Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
6.6.4 The SPEC Benchmarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
6.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
6.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
7 The Pentium Processor 251
7.1 The Pentium Processor Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
7.2 The Pentium Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
7.3 The Pentium Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
7.3.1 Data Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
7.3.2 Pointer and Index Registers . . . . . . . . . . . . . . . . . . . . . . . . . . 257
7.3.3 Control Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
7.3.4 Segment Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
7.4 Real Mode Memory Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
7.5 Protected Mode Memory Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 265
7.5.1 Segment Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
7.5.2 Segment Descriptors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
7.5.3 Segment Descriptor Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
7.5.4 Segmentation Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
7.5.5 Mixed-Mode Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
7.5.6 Which Segment Register to Use . . . . . . . . . . . . . . . . . . . . . . . . 270
7.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
7.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
8 Pipelining and Vector Processing 273
8.1 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
8.2 Handling Resource Conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
8.3 Data Hazards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
8.3.1 Register Forwarding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
8.3.2 Register Interlocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
8.4 Handling Branches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
8.4.1 Delayed Branch Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
8.4.2 Branch Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
8.5 Performance Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
8.5.1 Superscalar Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
8.5.2 Superpipelined Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
8.5.3 Very Long Instruction Word Architectures . . . . . . . . . . . . . . . . . . . 290
8.6 Example Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
8.6.1 Pentium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
8.6.2 PowerPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
8.6.3 SPARC Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
8.6.4 MIPS Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
8.7 Vector Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
8.7.1 What Is Vector Processing? . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
8.7.2 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
8.7.3 Advantages of Vector Processing . . . . . . . . . . . . . . . . . . . . . . . . 303
8.7.4 The Cray X-MP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
8.7.5 Vector Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
8.7.6 Vector Stride . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
8.7.7 Vector Operations on the Cray X-MP . . . . . . . . . . . . . . . . . . . . . 309
8.7.8 Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
8.8 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
8.8.1 Pipeline Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
8.8.2 Vector Processing Performance . . . . . . . . . . . . . . . . . . . . . . . . 314
8.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
8.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
PART V: Pentium Assembly Language 319
9 Overview of Assembly Language 321
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
9.2 Assembly Language Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
9.3 Data Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
9.3.1 Range of Numeric Operands . . . . . . . . . . . . . . . . . . . . . . . . . . 326
9.3.2 Multiple Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
9.3.3 Multiple Initializations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
9.3.4 Correspondence to C Data Types . . . . . . . . . . . . . . . . . . . . . . . . 330
9.3.5 LABEL Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
9.4 Where Are the Operands? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
9.4.1 Register Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
9.4.2 Immediate Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . 333
9.4.3 Direct Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
9.4.4 Indirect Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
9.5 Data Transfer Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
9.5.1 The mov Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
9.5.2 The xchg Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
9.5.3 The xlat Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
9.6 Pentium Assembly Language Instructions . . . . . . . . . . . . . . . . . . . . . . . 340
9.6.1 Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
9.6.2 Conditional Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
9.6.3 Iteration Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
9.6.4 Logical Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
9.6.5 Shift Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
9.6.6 Rotate Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
9.7 Defining Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
9.7.1 The EQU Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
9.7.2 The = Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
9.8 Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
9.9 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
9.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
9.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
9.12 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
10 Procedures and the Stack 387
10.1 What Is a Stack? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
10.2 Pentium Implementation of the Stack . . . . . . . . . . . . . . . . . . . . . . . . . 388
10.3 Stack Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
10.3.1 Basic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
10.3.2 Additional Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
10.4 Uses of the Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
10.4.1 Temporary Storage of Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
10.4.2 Transfer of Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
10.4.3 Parameter Passing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
10.5 Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
10.6 Assembler Directives for Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . 396
10.7 Pentium Instructions for Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . 397
10.7.1 How Is Program Control Transferred? . . . . . . . . . . . . . . . . . . . . . 397
10.7.2 The ret Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
10.8 Parameter Passing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
10.8.1 Register Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
10.8.2 Stack Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
10.8.3 Preserving Calling Procedure State . . . . . . . . . . . . . . . . . . . . . . . 406
10.8.4 Which Registers Should Be Saved? . . . . . . . . . . . . . . . . . . . . . . 406
10.8.5 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
10.9 Handling a Variable Number of Parameters . . . . . . . . . . . . . . . . . . . . . . 417
10.10 Local Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
10.11 Multiple Source Program Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
10.11.1 PUBLIC Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
10.11.2 EXTRN Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
10.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
10.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
10.14 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
11 Addressing Modes 435
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435
11.2 Memory Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
11.2.1 Based Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
11.2.2 Indexed Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
11.2.3 Based-Indexed Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
11.3 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
11.4 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
11.4.1 One-Dimensional Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
11.4.2 Multidimensional Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
11.4.3 Examples of Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
11.5 Recursion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
11.5.1 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
11.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464
11.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464
11.8 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
12 Selected Pentium Instructions 471
12.1 Status Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
12.1.1 The Zero Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
12.1.2 The Carry Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
12.1.3 The Overflow Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
12.1.4 The Sign Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
12.1.5 The Auxiliary Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
12.1.6 The Parity Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481
12.1.7 Flag Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
12.2 Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
12.2.1 Multiplication Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
12.2.2 Division Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
12.2.3 Application Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
12.3 Conditional Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
12.3.1 Indirect Jumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
12.3.2 Conditional Jumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
12.4 Implementing High-Level Language Decision Structures . . . . . . . . . . . . . . . 504
12.4.1 Selective Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
12.4.2 Iterative Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 508
12.5 Logical Expressions in High-Level Languages . . . . . . . . . . . . . . . . . . . . 510
12.5.1 Representation of Boolean Data . . . . . . . . . . . . . . . . . . . . . . . . 510
12.5.2 Logical Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
12.5.3 Bit Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
12.5.4 Evaluation of Logical Expressions . . . . . . . . . . . . . . . . . . . . . . . 511
12.6 Bit Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
12.6.1 Bit Test and Modify Instructions . . . . . . . . . . . . . . . . . . . . . . . . 515
12.6.2 Bit Scan Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516
12.7 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516
12.8 String Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
12.8.1 String Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
12.8.2 String Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
12.8.3 String Processing Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 536
12.8.4 Testing String Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . 540
12.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
12.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
12.11 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
13 High-Level Language Interface 551
13.1 Why Program in Mixed-Mode? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552
13.2 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552
13.3 Calling Assembly Procedures from C . . . . . . . . . . . . . . . . . . . . . . . . . 554
13.3.1 Parameter Passing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554
13.3.2 Returning Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
13.3.3 Preserving Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
13.3.4 Publics and Externals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
13.3.5 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
13.4 Calling C Functions from Assembly . . . . . . . . . . . . . . . . . . . . . . . . . . 562
13.5 Inline Assembly Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565
13.5.1 Compiling Inline Assembly Programs . . . . . . . . . . . . . . . . . . . . . 565
13.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566
13.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567
13.8 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567
PART VI: RISC Processors 569
14 RISC Processors 571
14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
14.2 Evolution of CISC Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
14.3 RISC Design Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575
14.3.1 Simple Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575
14.3.2 Register-to-Register Operations . . . . . . . . . . . . . . . . . . . . . . . . 576
14.3.3 Simple Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . 576
14.3.4 Large Number of Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . 576
14.3.5 Fixed-Length, Simple Instruction Format . . . . . . . . . . . . . . . . . . . 577
14.4 PowerPC Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578
14.4.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578
14.4.2 PowerPC Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581
14.5 Itanium Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
14.5.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591
14.5.2 Itanium Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594
14.5.3 Handling Branches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604
14.5.4 Predication to Eliminate Branches . . . . . . . . . . . . . . . . . . . . . . . 605
14.5.5 Speculative Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
14.5.6 Branch Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 610
14.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611
14.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 612
15 MIPS Assembly Language 615
15.1 MIPS Processor Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
15.1.1 Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
15.1.2 General-Purpose Register Usage Convention . . . . . . . . . . . . . . . . . 617
15.1.3 Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 618
15.1.4 Memory Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619
15.2 MIPS Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619
15.2.1 Instruction Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620
15.2.2 Data Transfer Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . 621
15.2.3 Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
15.2.4 Logical Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627
15.2.5 Shift Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627
15.2.6 Rotate Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
15.2.7 Comparison Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
15.2.8 Branch and Jump Instructions . . . . . . . . . . . . . . . . . . . . . . . . . 630
15.3 SPIM System Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
15.4 SPIM Assembler Directives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634
15.5 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636
15.6 Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
15.7 Stack Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
15.7.1 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
15.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657
15.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658
15.10 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659
PART VII: Memory 663
16 Memory System Design 665
16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
16.2 A Simple Memory Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
16.2.1 Memory Design with D Flip-Flops . . . . . . . . . . . . . . . . . . . . . . . 667
16.2.2 Problems with the Design . . . . . . . . . . . . . . . . . . . . . . . . . . . 667
16.3 Techniques to Connect to a Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669
16.3.1 Using Multiplexers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669
16.3.2 Using Open Collector Outputs . . . . . . . . . . . . . . . . . . . . . . . . . 669
16.3.3 Using Tristate Buffers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671
16.4 Building a Memory Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
16.5 Building Larger Memories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
16.5.1 Designing Independent Memory Modules . . . . . . . . . . . . . . . . . . . 676
16.5.2 Designing Larger Memories Using Memory Chips . . . . . . . . . . . . . . 678
16.6 Mapping Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681
16.6.1 Full Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681
16.6.2 Partial Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 682
16.7 Alignment of Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683
16.8 Interleaved Memories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684
16.8.1 The Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 685
16.8.2 Synchronized Access Organization . . . . . . . . . . . . . . . . . . . . . . . 686
16.8.3 Independent Access Organization . . . . . . . . . . . . . . . . . . . . . . . 687
16.8.4 Number of Banks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 688
16.8.5 Drawbacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 689
16.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 689
16.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 690
17 Cache Memory 693
17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 694
17.2 How Cache Memory Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 695
17.3 Why Cache Memory Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 697
17.4 Cache Design Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699
17.5 Mapping Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 700
17.5.1 Direct Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 703
17.5.2 Associative Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 707
17.5.3 Set-Associative Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . 708
17.6 Replacement Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 711
17.7 Write Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
17.8 Space Overhead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715
17.9 Mapping Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 717
17.10 Types of Cache Misses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718
17.11 Types of Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 719
17.11.1 Separate Instruction and Data Caches . . . . . . . . . . . . . . . . . . . . . 719
17.11.2 Number of Cache Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . 720
17.11.3 Virtual and Physical Caches . . . . . . . . . . . . . . . . . . . . . . . . . . 722
17.12 Example Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 722
17.12.1 Pentium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 722
17.12.2 PowerPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724
17.12.3 MIPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
17.13 Cache Operation: A Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 727
17.13.1 Placement of a Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 727
17.13.2 Location of a Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 728
17.13.3 Replacement Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 728
17.13.4 Write Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 728
17.14 Design Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 729
17.14.1 Cache Capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 729
17.14.2 Cache Line Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 729
17.14.3 Degree of Associativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 731
17.15 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 731
17.16 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733
18 Virtual Memory 735
18.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736
18.2 Virtual Memory Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
18.2.1 Page Replacement Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . 738
18.2.2 Write Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 739
18.2.3 Page Size Tradeoff . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 740
18.2.4 Page Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 741
18.3 Page Table Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 741
18.3.1 Page Table Entries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 742
18.4 The Translation Lookaside Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . 743
18.5 Page Table Placement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 744
18.5.1 Searching Hierarchical Page Tables . . . . . . . . . . . . . . . . . . . . . . 745
18.6 Inverted Page Table Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 746
18.7 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748
18.8 Example Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 750
18.8.1 Pentium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 750
18.8.2 PowerPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 754
18.8.3 MIPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756
18.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760
18.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 761
PART VIII: Input and Output 765
19 Input/Output Organization 767
19.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 768
19.2 Accessing I/O Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 770
19.2.1 I/O Address Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 770
19.2.2 Accessing I/O Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 770
19.3 An Example I/O Device: Keyboard . . . . . . . . . . . . . . . . . . . . . . . . . . . 772
19.3.1 Keyboard Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 772
19.3.2 8255 Programmable Peripheral Interface Chip . . . . . . . . . . . . . . . . . 772
19.4 I/O Data Transfer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 774
19.4.1 Programmed I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 775
19.4.2 DMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 777
19.5 Error Detection and Correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 784
19.5.1 Parity Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 784
19.5.2 Error Correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 785
19.5.3 Cyclic Redundancy Check . . . . . . . . . . . . . . . . . . . . . . . . . . . 787
19.6 External Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 791
19.6.1 Serial Transmission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 794
19.6.2 Parallel Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 797
19.7 Universal Serial Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 801
19.7.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 801
19.7.2 Additional USB Advantages . . . . . . . . . . . . . . . . . . . . . . . . . . 802
19.7.3 USB Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 803
19.7.4 Transfer Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 803
19.7.5 USB Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 805
19.7.6 USB Transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 807
19.8 IEEE 1394 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 810
19.8.1 Advantages of IEEE 1394 . . . . . . . . . . . . . . . . . . . . . . . . . . . 810
19.8.2 Power Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 811
19.8.3 Transfer Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812
19.8.4 Transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 813
19.8.5 Bus Arbitration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 815
19.8.6 Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 815
19.9 The Bus Wars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 820
19.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 821
19.11 Web Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 823
19.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 823
20 Interrupts 825
20.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 826
20.2 A Taxonomy of Pentium Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . 827
20.3 Pentium Interrupt Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 829
20.3.1 Interrupt Processing in Protected Mode . . . . . . . . . . . . . . . . . . . . 829
20.3.2 Interrupt Processing in Real Mode . . . . . . . . . . . . . . . . . . . . . . . 829
20.4 Pentium Software Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 831
20.4.1 DOS Keyboard Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . 832
20.4.2 BIOS Keyboard Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . 837
20.5 Pentium Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 842
20.6 Pentium Hardware Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 847
20.6.1 How Does the CPU Know the Interrupt Type? . . . . . . . . . . . . . . . . . 847
20.6.2 How Can More Than One Device Interrupt? . . . . . . . . . . . . . . . . . . 848
20.6.3 8259 Programmable Interrupt Controller . . . . . . . . . . . . . . . . . . . 848
20.6.4 A Pentium Hardware Interrupt Example . . . . . . . . . . . . . . . . . . . . 850
20.7 Interrupt Processing in the PowerPC . . . . . . . . . . . . . . . . . . . . . . . . . . 855
20.8 Interrupt Processing in the MIPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 857
20.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 859
20.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 860
20.11 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 862
APPENDICES 863
A Computer Arithmetic 865
A.1 Positional Number Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 865
A.1.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 867
A.2 Number Systems Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 868
A.2.1 Conversion to Decimal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 868
A.2.2 Conversion from Decimal . . . . . . . . . . . . . . . . . . . . . . . . . . . 870
A.2.3 Conversion Among Binary, Octal, and Hexadecimal . . . . . . . . . . . . . 871
A.3 Unsigned Integer Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 874
A.3.1 Arithmetic on Unsigned Integers . . . . . . . . . . . . . . . . . . . . . . . 875
A.4 Signed Integer Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 881
A.4.1 Signed Magnitude Representation . . . . . . . . . . . . . . . . . . . . . . . 882
A.4.2 Excess-M Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 882
A.4.3 1’s Complement Representation . . . . . . . . . . . . . . . . . . . . . . . . 883
A.4.4 2’s Complement Representation . . . . . . . . . . . . . . . . . . . . . . . . 886
A.5 Floating-Point Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 887
A.5.1 Fractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 887
A.5.2 Representing Floating-Point Numbers . . . . . . . . . . . . . . . . . . . . . 890
A.5.3 Floating-Point Representation . . . . . . . . . . . . . . . . . . . . . . . . . 891
A.5.4 Floating-Point Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 896
A.5.5 Floating-Point Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . 896
A.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 897
A.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 898
A.8 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 900
B Character Representation 901
B.1 Character Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 901
B.2 Universal Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 903
B.3 Unicode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 903
B.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 904
C Assembling and Linking Pentium Assembly Language Programs 907
C.1 Structure of Assembly Language Programs . . . . . . . . . . . . . . . . . . . . . . 908
C.2 Input/Output Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 910
C.2.1 Character I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 912
C.2.2 String I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 912
C.2.3 Numeric I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 913
C.3 Assembling and Linking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 915
C.3.1 The Assembly Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 915
C.3.2 Linking Object Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 924
C.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 924
C.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 925
C.6 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 925
D Debugging Assembly Language Programs 927
D.1 Strategies to Debug Assembly Language Programs . . . . . . . . . . . . . . . . . . 928
D.2 DEBUG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 930
D.2.1 Display Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 930
D.2.2 Execution Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 933
D.2.3 Miscellaneous Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 934
D.2.4 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 934
D.3 Turbo Debugger TD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 938
D.4 CodeView . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 943
D.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 944
D.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 944
D.7 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 945
E Running Pentium Assembly Language Programs on a Linux System 947
E.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 948
E.2 NASM Assembly Language Program Template . . . . . . . . . . . . . . . . . . . . 948
E.3 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 950
E.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 955
E.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 955
E.6 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 955
F Digital Logic Simulators 957
F.1 Testing Digital Logic Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 957
F.2 Digital Logic Simulators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 958
F.2.1 DIGSim Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 958
F.2.2 Digital Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 959
F.2.3 Multimedia Logic Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . 961
F.2.4 Logikad Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 962
F.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 966
F.4 Web Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 966
F.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 967
G SPIM Simulator and Debugger 969
G.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 969
G.2 Simulator Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 972
G.3 Running and Debugging a Program . . . . . . . . . . . . . . . . . . . . . . . . . . . 973
G.3.1 Loading and Running . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 973
G.3.2 Debugging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 974
G.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 977
G.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 977
G.6 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 977
H The SPARC Architecture 979
H.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 979
H.2 Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 980
H.3 Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 982
H.4 Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 984
H.4.1 Instruction Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 984
H.4.2 Data Transfer Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . 984
H.4.3 Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 986
H.4.4 Logical Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 987
H.4.5 Shift Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 988
H.4.6 Compare Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 988
H.4.7 Branch Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 989
H.5 Procedures and Parameter Passing . . . . . . . . . . . . . . . . . . . . . . . . . . . 993
H.5.1 Procedure Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 993
H.5.2 Parameter Passing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 994
H.5.3 Stack Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 995
H.5.4 Window Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 996
H.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1000
H.7 Web Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1000
H.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1000
I Pentium Instruction Set 1001
I.1 Pentium Instruction Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1001
I.1.1 Instruction Prefixes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1001
I.1.2 General Instruction Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 1002
I.2 Selected Pentium Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1004
Bibliography 1033
Index 1037
Chapter 1
Overview of Computer Organization
Objectives
• To provide a high-level overview of computer organization;
• To discuss how architects, implementers, programmers, and users view the computer
system;
• To describe the three main components: processor, memory, and I/O;
• To give a brief historical perspective of computers.
We begin each chapter with an overview of what you can expect in the chapter. This is our first
overview. The main purpose of this chapter is to provide an overview of computer systems.
We start off with a brief introduction to computer systems from the user’s viewpoint.
Computer systems are complex. To manage this complexity, we use a series of abstractions.
The kind of abstraction used depends on what you want to do with the system. We present
the material in this book from three perspectives: from the computer architect’s view, from the
programmer’s view, and from the implementer’s view. We give details about these three views
in Sections 1.2 through 1.4.
A computer system consists of three major components: a processor, a memory unit, and
an input/output (I/O) subsystem. A system bus interconnects these three components. The next
three sections discuss these three components in detail. Section 1.5 provides an overview of
the processor component. The processors we cover in this book include the Pentium, MIPS,
PowerPC, Itanium, and SPARC. Section 1.6 presents some basic concepts about the memory
system. Later chapters describe in detail cache and virtual memories. Section 1.7 gives a brief
overview of how input/output devices such as the keyboard are interfaced to the system. A more
detailed description of I/O interfacing can be found in the last two chapters. We conclude the
chapter by providing a perspective on the history of computers.
Figure 1.1 A user’s view of a computer system (layers: system hardware, system software, applications software).
1.1 Introduction
This book is about digital computer systems, which have been revolutionizing our society. Most
of us use computers for a variety of tasks, from serious scientific computations to entertainment.
You are reading this book because you are interested in learning more about these magnificent
machines.
As with any complex project, several stages and players are involved in designing, implementing,
and realizing a computer system. This book deals with the internal details of a computer
system, focusing on both hardware and software.
Computer hardware is the electronic circuitry that performs the actual work. Hardware
includes things with which you are already familiar such as the processor, memory, keyboard,
CD burner, and so on. Miniaturization of hardware is the most recent advance in the computer
hardware area. This miniaturization gives us such compact things as PocketPCs and Flash
memories.
Computer software can be divided into application software and system software. A user
interacts with the system through an application program. For the user, the application is the
computer! For example, if you are interested in browsing the Internet, you interact with the
system through a Web browser such as the Netscape™ Communicator or Internet Explorer. For
you, the system appears as though it is executing the application program (i.e., Web browser),
as shown in Figure 1.1.
At the core is the basic hardware, over which a layer of system software hides the gory
details about the hardware. Early ancestors of the Pentium and other processors were called
microprocessors because they were less powerful than the processors used in the computers at
that time.
The system software manages the hardware resources efficiently and also provides nice
services to the application software layer. What is the system software? Operating systems
such as Windows™, UNIX™, and Linux are the most familiar examples. System software also
includes compilers, assemblers, and linkers that we discuss later in this book. You are probably
more familiar with application software, which includes Web browsers, word processors, music
players, and so on.
This book presents details on various aspects of computer system design and programming.
We discuss organization and architecture of computer systems, how they are designed, and how
they are programmed. In order to clarify the scope of this book, we need to explain these terms:
computer architecture, computer organization, computer design, and computer programming.
Computer architecture refers to the aspects with which a programmer is concerned. The
most obvious one is the design of an instruction set for the computer. For example, should
the processor understand instructions to process multimedia data? The answer depends on
the intended use of the system. Clearly, if the target applications involve multimedia, adding
multimedia instructions will help improve the performance. Computer architecture, in a sense,
describes the computer system at a logical level, from the programmer’s viewpoint. It deals
with the selection of the basic functional units such as the processor and memory, and how they
should be interconnected into a computer system.
Computer organization is concerned with how the various hardware components operate
and how they are interconnected to implement the architectural specifications. For example, if
the architecture specifies a divide instruction, we will have a choice to implement this instruc-
tion either in hardware or in software. In a high-performance model, we may implement the
division operation in hardware to provide improved performance at a higher price. In cheaper
models, we may implement it in software. But cost need not be the only deciding criterion.
For example, the Pentium processor implements the divide operation in hardware whereas the
next generation Itanium processor implements division in software. If the next version of Ita-
nium uses a hardware implementation of division, that does not change the architecture, only
its organization.
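To make the hardware-versus-software choice concrete, here is a minimal Python sketch of how a divide operation could be carried out in software using only shifts, comparisons, and subtractions. The function name divide_unsigned and the 32-bit default width are assumptions made for this illustration. The classic shift-and-subtract method is used because it is easy to follow; it is not meant to reflect how any particular processor implements division.

```python
def divide_unsigned(dividend, divisor, width=32):
    """Return (quotient, remainder) using shift-and-subtract long division.

    Hypothetical helper for illustration: it works on unsigned values that
    fit in `width` bits and uses no division operator at all.
    """
    if divisor <= 0:
        raise ZeroDivisionError("divisor must be a positive integer")
    if not 0 <= dividend < (1 << width):
        raise ValueError("dividend does not fit in the given width")

    quotient = 0
    remainder = 0
    # Walk the dividend from its most significant bit down to bit 0.
    for bit in range(width - 1, -1, -1):
        # Bring down the next dividend bit into the running remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        # If the divisor fits, subtract it and record a 1 in the quotient.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


print(divide_unsigned(100, 7))
```

A program that calls divide_unsigned sees the same result whether the work is done by a loop like this one or by a dedicated hardware divider, which is the sense in which the choice belongs to organization rather than architecture.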
Computer design is an activity that translates architectural specifications of a system into
an implementation using a particular organization. As a result, computer design is sometimes
referred to as computer implementation. A computer designer is concerned with the hardware
design of the computer.
Computer programming involves expressing the problem at hand in a language that the com-
puter can understand. As we show later, the native language that a computer can understand is
called the machine language. But this is not a language with which we humans are comfort-
able. So we use a language that we can easily read and understand. These languages are called
high-level languages, and include languages such as Java™ and C. We do not devote any space
to these high-level languages as they are beyond the scope of this book. Instead, we discuss
in detail languages that are close to the architecture of a machine. This allows us to study the
internal details of computer systems.
Computers are complex systems. How do we manage complexity of these systems? We
can get clues from looking at how we manage complex systems in life. Think of how a large
corporation is managed. We use a hierarchical structure to simplify the management: president
at the top and employees at the bottom. Each level of management filters out unnecessary details
on the lower levels and presents only an abstracted version to the higher-level management. This
is what we refer to as abstraction. We study computer systems by using layers of abstraction.
Different people view computer systems differently depending on the type of their interac-
tion. We use the concept of abstraction to look at only the details that are necessary from a
particular viewpoint. For example, if you are a computer architect, you are interested in the in-
ternal details that do not interest a normal user of the system. One can look at computer systems
from several different perspectives. We have already talked about the user’s view. Our interest
in this book is not at this level. Instead, we concentrate on the following views: (i) a program-
mer’s view, (ii) an architect’s view, and (iii) an implementer’s view. The next three sections
briefly discuss these perspectives.
1.1.1 Basic Terms and Notation
The alphabet of computers, more precisely digital computers, consists of 0 and 1. Each is
called a bit, which stands for binary digit. The term byte is used to represent a group of
8 bits. The term word is used to refer to a group of bytes that is processed simultaneously.
The exact number of bytes that constitute a word depends on the system. For example, in the
Pentium, a word refers to four bytes or 32 bits. On the other hand, eight bytes are grouped into
a word in the Itanium processor. The reasons for this difference are explained later. We use the
abbreviation “b” for bits, “B” for bytes, and “W” for words. Sometimes we also use doubleword
and quadword. A doubleword has twice the number of bits as the word and the quadword has
four times the number of bits in a word.
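The relationships among these units can be sketched in a few lines of Python. The function name unit_sizes_in_bits is an assumption made for this example, and the sizes follow the definitions given in this section, with the word size as the only input. Vendor documentation may use the terms word and doubleword differently.

```python
BITS_PER_BYTE = 8


def unit_sizes_in_bits(word_bytes):
    """Sizes of a word, doubleword, and quadword, given the word size in bytes.

    Hypothetical helper: it follows this section's definitions, in which a
    doubleword is twice and a quadword four times the size of a word.
    """
    word_bits = word_bytes * BITS_PER_BYTE
    return {
        "word": word_bits,
        "doubleword": 2 * word_bits,
        "quadword": 4 * word_bits,
    }


# Word sizes taken from the text: four bytes (Pentium), eight bytes (Itanium).
pentium = unit_sizes_in_bits(4)
itanium = unit_sizes_in_bits(8)
print(pentium)
print(itanium)
```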
Bits in a word are usually ordered from right to left, as you would write digits in a decimal
number. The rightmost bit is called the least significant bit (LSB), and the leftmost bit is called
the most significant bit (MSB). However, some manufacturers use the opposite notation; the
PowerPC manuals are one example. In this book, we consistently write bits of a
word from right to left, with the LSB as the rightmost bit.
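The right-to-left convention can be illustrated with a short Python sketch in which position 0 denotes the LSB. The helper names bit_at, lsb, msb, and to_bit_string are assumptions introduced for this example.

```python
def bit_at(value, position):
    """Return the bit at `position`, counting from 0 at the rightmost bit (LSB)."""
    return (value >> position) & 1


def lsb(value):
    """Least significant (rightmost) bit of the value."""
    return bit_at(value, 0)


def msb(value, width):
    """Most significant (leftmost) bit of a value that is `width` bits wide."""
    return bit_at(value, width - 1)


def to_bit_string(value, width):
    """Write the value as `width` binary digits, MSB first, LSB last."""
    return format(value, "0{}b".format(width))


value = 0b10110100
print(to_bit_string(value, 8))
print("LSB:", lsb(value), "MSB:", msb(value, 8))
```

Under the opposite convention, position 0 would name the leftmost bit instead, so the same bit pattern would be indexed in reverse.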
We use standard terms such as kilo (K), mega (M), giga (G), and so on to represent large
integers. Unfortunately, we use two different versions of each, depending on the number system,
decimal or binary. Table 1.1 summarizes the differences between the two systems. Typically,
computer-related attributes use the binary version. For example, when we say 128 megabyte
(MB) memory, we mean $128 \times 2^{20}$ bytes. Usually, communication-related quantities and time
units are expressed using the decimal system. For example, when we say that the data transfer
rate is 100 megabits/second (Mb/s), we mean $100 \times 10^{6}$ bits/second.
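The two examples above can be checked with a few lines of Python. The dictionary names BINARY_PREFIX and DECIMAL_PREFIX are assumptions made for this sketch; the multipliers are the standard powers of 2 and 10 for each prefix.

```python
# Multipliers for each prefix in the two systems.
BINARY_PREFIX = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40, "P": 2**50}
DECIMAL_PREFIX = {"K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15}

# Memory sizes use the binary version: 128 MB = 128 x 2^20 bytes.
memory_bytes = 128 * BINARY_PREFIX["M"]

# Transfer rates use the decimal version: 100 Mb/s = 100 x 10^6 bits/second.
rate_bits_per_second = 100 * DECIMAL_PREFIX["M"]

print(memory_bytes)
print(rate_bits_per_second)
```

The gap between the two versions widens with each prefix: a binary kilo is 2.4% larger than a decimal kilo, whereas a binary giga is about 7.4% larger than a decimal giga.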
Table 1.1 Terms to represent large integer values
Term        Decimal (base 10)    Binary (base 2)
K (kilo)    $10^{3}$             $2^{10}$
M (mega)    $10^{6}$             $2^{20}$
G (giga)    $10^{9}$             $2^{30}$
T (tera)    $10^{12}$            $2^{40}$
P (peta)    $10^{15}$            $2^{50}$
Throughout the text, we use various number systems: binary, octal, and hexadecimal. Now
is a good time to refresh your memory by reviewing the material on number systems presented
in Appendix A. If the number system used is not clear from the context, we use a trailing
letter to specify the number system. We use “D” for decimal numbers, “B” for binary numbers,
“Q” for octal numbers, and “H” for hexadecimal (or hex for short) numbers. For example,
10110101B is an 8-bit binary number whereas 10ABH is a hex number.
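As an illustration of the trailing-letter notation, the following Python sketch converts such a string to its integer value. The function name parse_number is an assumption made for this example, and the sketch assumes the suffix is always present, since a final “B” or “D” would otherwise be ambiguous with a hex digit.

```python
RADIX_BY_SUFFIX = {"B": 2, "Q": 8, "D": 10, "H": 16}


def parse_number(text):
    """Convert a number written with a trailing radix letter to an integer.

    Hypothetical helper: the last character is always treated as the suffix
    (B, Q, D, or H) and everything before it as the digits.
    """
    text = text.strip().upper()
    if len(text) < 2 or text[-1] not in RADIX_BY_SUFFIX:
        raise ValueError("expected digits followed by B, Q, D, or H")
    digits, suffix = text[:-1], text[-1]
    # int() raises ValueError if a digit is not valid in the chosen radix.
    return int(digits, RADIX_BY_SUFFIX[suffix])


print(parse_number("10110101B"))
print(parse_number("10ABH"))
```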
1.2 Programmer’s View
A programmer’s view of a computer system depends on the type and level of language she
intends to use. From the programmer’s viewpoint, there exists a hierarchy from low-level lan-
guages to high-level languages. As we move up in this hierarchy, the level of abstraction in-
creases. At the lowest level, we have the machine language that is the native language of the
machine. This is the language understood by the machine hardware. Since digital computers use
0 and 1 as their alphabet, machine language naturally uses 1s and 0s to encode the instructions.
One level up, t