Fundamentals of Computer Organization and Design Sivarama P. Dandamudi School of Computer Science Carleton University September 22, 2002

  • To

    my parents, Subba Rao and Prameela Rani,

    my wife, Sobha,

    and

    my daughter, Veda

  • Preface

    Computer science and engineering curricula have been evolving at a fast pace to keep up with the developments in the area. This often dictates that traditional courses will have to be compressed to accommodate new courses. In particular, it is no longer possible in these curricula to include separate courses on digital logic, assembly language programming, and computer organization. Often, these three topics are combined into a single course. The current textbooks on the market cater to the old-style curricula in these disciplines, with separate books available on each of these subjects. Most computer organization books do not cover assembly language programming in sufficient detail. There is a definite need to support the courses that combine assembly language programming and computer organization. This is the main motivation for writing this book. It provides comprehensive coverage of digital logic, assembly language programming, and computer organization.

    Intended Use

    This book is intended as an undergraduate textbook for computer organization courses offered by computer science and computer engineering/electrical engineering departments. Unlike other textbooks in this area, this book provides extensive coverage of assembly language programming and digital logic. Thus, the book serves the needs of compressed courses.

    In addition, it can be used as a text in vocational training courses offered by community colleges. Because of the teach-by-example style used in the book, it is also suitable for self-study by computer professionals and engineers.

    Prerequisites

    The objective is to support a variety of courses on computer organization in computer science and engineering departments. To satisfy this objective, we assume very little background on the part of the student. The student is assumed to have had some programming experience in a structured, high-level language such as C or Java™. This is the background almost all students in computer science and computer engineering programs typically acquire in their first year of study. This prerequisite also implies that the student has been exposed to the basics of the software-development cycle.

    Features

    Here is a summary of the special features that set this book apart:

    • Most computer organization books assume that the students have done a separate digital logic course before taking the computer organization course. As a result, digital logic is covered in an appendix to provide an overview. This book provides detailed coverage of digital logic, including sequential logic circuit design. Three complete chapters are devoted to digital logic topics, where students are exposed to the practical side with details on several example digital logic chips. There is also information on digital logic simulators. Students can conveniently use these simulators to test their designs.

    • This book provides extensive coverage of assembly language programming, comprising assembly language of both CISC and RISC processors. We use the Pentium as the representative of the CISC category and devote more than five chapters to introducing the Pentium assembly language. The MIPS processor is used for RISC assembly language programming. In both cases, students actually write and test working assembly language programs. The book’s homepage has instructions on downloading assemblers for both Pentium and MIPS processors.

    • We introduce concepts first in simple terms to motivate the reader. Later, we relate these concepts to practical implementations. In the digital logic part, we use several chips to show the type of implementations done in practice. For the other topics, we consistently use three processors—the Pentium, PowerPC, and MIPS—to cover the CISC to RISC range. In addition, we provide details on the Itanium and SPARC processors.

    • Most textbooks in the area treat I/O and interrupts as an appendage. As a result, this topic is discussed very briefly. Consequently, students do not get any practical experience on how interrupts work. In contrast, we use the Pentium to illustrate their operation. Several assembly language programs are used to explain the interrupt concepts. We also show how interrupt service routines can be written. For instance, one example in the chapter on interrupts replaces the system-supplied keyboard service routine with our own. By understanding the practical aspects of interrupt processing, students can write their own programs to experiment with interrupts.

    • Our coverage of system buses is comprehensive and up-to-date. We divide our coverage into internal and external buses. Internal buses discussed include the ISA, PCI, PCI-X, AGP, and PCMCIA buses. Our external bus coverage includes the EIA-232, SCSI, USB, and IEEE 1394 (FireWire) serial buses.

    • Extensive assembly programming examples are used to illustrate the points. A set of input and output routines is provided so that the reader can focus on developing assembly language programs rather than spending time in understanding how input and output can be done using the basic I/O functions provided by the operating system.

    • We do not use fragments of assembly language code in examples. All examples are complete in the sense that they can be assembled and run to give a better feeling as to how these programs work.

    • All examples used in the textbook and other proprietary I/O software are available from the book’s homepage (www.scs.carleton.ca/~sivarama/org_book). In addition, this Web site has instructions on downloading the Pentium and MIPS assemblers to give opportunities for students to perform hands-on assembly programming.

    • Most chapters are written in such a way that each chapter can be covered in two or three 60-minute lectures by giving proper reading assignments. Typically, important concepts are emphasized in the lectures while leaving the other material in the book as a reading assignment. Our emphasis on extensive examples facilitates this pedagogical approach.

    • Interchapter dependencies are kept to a minimum to offer maximum flexibility to instructors in organizing the material. Each chapter clearly indicates the objectives and provides an overview at the beginning and a summary and key terms at the end.

    Instructional Support

    The book’s Web site has complete chapter-by-chapter slides for instructors. Instructors can use these slides directly in their classes or can modify them to suit their needs. Please contact the author if you want the PowerPoint source of the slides. Copies of these slides (four per page) are also available for distribution to students. In addition, instructors can obtain the solutions manual by contacting the publisher. For more up-to-date details, please see the book’s Web page at www.scs.carleton.ca/~sivarama/org_book.

    Overview and Organization

    The book is divided into eight parts. In addition, the appendices provide useful reference material.

    Part I consists of a single chapter and gives an overview of basic computer organization and design.

    Part II presents digital logic design in three chapters—Chapters 2, 3, and 4. Chapter 2 covers the digital logic basics. We introduce the basic concepts and building blocks that we use in the later chapters to build more complex digital circuits such as adders and arithmetic logic units (ALUs). This chapter also discusses the principles of digital logic design using Boolean algebra, Karnaugh maps, and Quine–McCluskey methods. The next chapter deals with combinational circuits. We present the design of adders, comparators, and ALUs. We also show how programmable logic devices can be used to implement combinational logic circuits. Chapter 4 covers sequential logic circuits. We introduce the concept of time through clock signals. We discuss both latches and flip-flops, including master–slave JK flip-flops. These elements form the basis for designing memories in a later chapter. After presenting some example sequential circuits such as shift registers and counters, we discuss sequential circuit design in detail. These three chapters together cover the digital logic topic comprehensively. The amount of time spent on this part depends on the background of the students.

    Part III deals with system interconnection structures. We divide the system buses into internal and external buses. Our classification is based on whether the bus interconnects components that are typically inside a system. Part III consists of Chapter 5 and covers internal system buses. We start this chapter with a discussion of system bus design issues. We discuss both synchronous and asynchronous buses. We also introduce block transfer bus cycles as well as wait states. Bus arbitration schemes are described next. We present five example buses including the ISA, PCI, PCI-X, AGP, and PCMCIA buses. The external buses are covered in Part VIII, which discusses the I/O issues.

    Part IV consists of three chapters and discusses processor design issues. Chapter 6 presents the basics of processor organization and performance. We discuss instruction set architectures and instruction set design issues. This chapter also covers microprogrammed control. In addition, processor performance issues, including the SPEC benchmarks, are discussed. The next chapter gives details about the Pentium processor. The information presented in this chapter is useful when we discuss Pentium assembly language programming in Part V. Pipelining and vector processors are discussed in the last chapter of this part. We use the Cray X-MP system to look at the practical side of vector processors. After covering the material in Chapter 6, instructors can choose the material from Chapters 7 and 8 to suit their course requirements.

    Part V covers Pentium assembly language programming in detail. There are five chapters in this part. Chapter 9 provides an overview of the Pentium assembly language. All necessary basic features are covered in this chapter. After reading this chapter, students can write simple Pentium assembly programs without needing the information presented in the later four chapters. Chapter 10 describes the Pentium addressing modes in detail. This chapter gives enough information for the student to understand why CISC processors provide complex addressing modes. The next chapter deals with procedures. Our intent is to expose the student to the underlying mechanics involved in procedure calls, parameter passing, and local variable storage. In addition, recursive procedures are used to explore the principles involved in handling recursion. In all these activities, the important role played by the stack is illustrated. Chapter 12 describes the Pentium instruction set. Our goal is not to present the complete Pentium instruction set but a representative sample. Chapter 13 deals with the high-level language interface, which allows mixed-mode programming in more than one language. We use C and assembly language to illustrate the principles involved in mixed-mode programming. Each chapter uses several examples to show how various Pentium instructions are used.

    Part VI covers RISC processors in two chapters. The first chapter introduces the general RISC design principles. It also presents details about two RISC processors: the PowerPC and Intel Itanium. Although both are considered RISC processors, they also have some CISC features. We discuss a pure RISC processor in the next chapter. The Itanium is Intel’s 64-bit processor that not only incorporates RISC characteristics but also several advanced architectural features. These features include instruction-level parallelism, predication, and speculative loads. The second chapter in this part describes the MIPS R2000 processor. The MIPS simulator SPIM runs the programs written for the R2000 processor. We present MIPS assembly language programs that are complete and run on the SPIM. The programs we present here are the same programs we have written in the Pentium assembly language (in Part V). Thus, the reader has an opportunity to contrast the two assembly languages.

    Part VII consists of Chapters 16 through 18 and covers memory design issues. Chapter 16 builds on the digital logic material presented in Part II. It describes how memory units can be constructed using the basic latches and flip-flops presented in Chapter 4. Memory mapping schemes, both full- and partial-mapping, are also discussed. In addition, we discuss how interleaved memories are designed. The next chapter covers cache memory principles and design issues. We use an extensive set of examples to illustrate the cache principles. Toward the end of the chapter, we look at example cache implementations in the Pentium, PowerPC, and MIPS processors. Chapter 18 discusses virtual memory systems. Note that our coverage of virtual memory is from the computer organization viewpoint. As a result, we do not cover those aspects that are of interest from the operating-system point of view. As with the cache memory, we look at the virtual memory implementations of the Pentium, PowerPC, and MIPS processors.

    The last part covers the I/O issues. We cover the basic I/O interface issues in Chapter 19. We start with I/O address mapping and then discuss three techniques often used to interface with I/O devices: programmed I/O, interrupt-driven I/O, and DMA. We discuss interrupt-driven I/O in detail in the next chapter. In addition, this chapter presents details about external buses. In particular, we cover the EIA-232, USB, and IEEE 1394 serial interfaces and the SCSI parallel interface. The last chapter covers Pentium interrupts in detail. We use programming examples to illustrate interrupt-driven access to I/O devices. We also present an example to show how user-defined interrupt service routines can be written.

    The appendices provide a wealth of reference material needed by the student. Appendix A primarily discusses computer arithmetic. Character representation is discussed in Appendix B. Appendix C gives information on the use of I/O routines provided with this book and the Pentium assembler software. The debugging aspect of assembly language programming is discussed in Appendix D. Appendix E gives details on running the Pentium assembly programs on a Linux system using the NASM assembler. Appendix F gives details on digital logic simulators. Details on the MIPS simulator SPIM are in Appendix G. Appendix H describes the SPARC processor architecture. Finally, selected Pentium instructions are given in Appendix I.

    Acknowledgments

    Several people have contributed to the writing of this book. First and foremost, I would like to thank my wife, Sobha, and my daughter, Veda, for enduring my preoccupation with this project. I thank Wayne Yuhasz, Executive Editor at Springer-Verlag, for his input and feedback in developing this project. His guidance and continued support for the project are greatly appreciated. I also want to thank Wayne Wheeler, Assistant Editor, for keeping track of the progress. He has always been prompt in responding to my queries. Thanks are also due to the staff at Springer-Verlag New York, Inc., particularly Francine McNeill, for their efforts in producing this book. I would also like to thank Valerie Greco for doing an excellent job of copyediting the text.

    My sincere appreciation goes to the School of Computer Science at Carleton University for allowing me to use part of my sabbatical leave to complete this book.

    Feedback

    Works of this nature are never error-free, despite the best efforts of the authors and others involved in the project. I welcome your comments, suggestions, and corrections by electronic mail.

    Ottawa, Ontario, Canada

    December 2001

    Sivarama P. Dandamudi

    [email protected]

    http://www.scs.carleton.ca/~sivarama

  • Contents

    Preface

    PART I: Overview

    1 Overview of Computer Organization

    1.1 Introduction

    1.1.1 Basic Terms and Notation

    1.2 Programmer’s View

    1.2.1 Advantages of High-Level Languages

    1.2.2 Why Program in Assembly Language?

    1.3 Architect’s View

    1.4 Implementer’s View

    1.5 The Processor

    1.5.1 Pipelining

    1.5.2 RISC and CISC Designs

    1.6 Memory

    1.6.1 Basic Memory Operations

    1.6.2 Byte Ordering

    1.6.3 Two Important Memory Design Issues

    1.7 Input/Output

    1.8 Interconnection: The Glue

    1.9 Historical Perspective

    1.9.1 The Early Generations

    1.9.2 Vacuum Tube Generation: Around the 1940s and 1950s

    1.9.3 Transistor Generation: Around the 1950s and 1960s

    1.9.4 IC Generation: Around the 1960s and 1970s

    1.9.5 VLSI Generations: Since the Mid-1970s

    1.10 Technological Advances

    1.11 Summary and Outline

    1.12 Exercises

    PART II: Digital Logic Design

    2 Digital Logic Basics

    2.1 Introduction

    2.2 Basic Concepts and Building Blocks

    2.2.1 Simple Gates

    2.2.2 Completeness and Universality

    2.2.3 Implementation Details

    2.3 Logic Functions

    2.3.1 Expressing Logic Functions

    2.3.2 Logical Circuit Equivalence

    2.4 Boolean Algebra

    2.4.1 Boolean Identities

    2.4.2 Using Boolean Algebra for Logical Equivalence

    2.5 Logic Circuit Design Process

    2.6 Deriving Logical Expressions from Truth Tables

    2.6.1 Sum-of-Products Form

    2.6.2 Product-of-Sums Form

    2.6.3 Brute Force Method of Implementation

    2.7 Simplifying Logical Expressions

    2.7.1 Algebraic Manipulation

    2.7.2 Karnaugh Map Method

    2.7.3 Quine–McCluskey Method

    2.8 Generalized Gates

    2.9 Multiple Outputs

    2.10 Implementation Using Other Gates

    2.10.1 Implementation Using NAND and NOR Gates

    2.10.2 Implementation Using XOR Gates

    2.11 Summary

    2.12 Web Resources

    2.13 Exercises

    3 Combinational Circuits

    3.1 Introduction

    3.2 Multiplexers and Demultiplexers

    3.2.1 Implementation: A Multiplexer Chip

    3.2.2 Efficient Multiplexer Designs

    3.2.3 Implementation: A 4-to-1 Multiplexer Chip

    3.2.4 Demultiplexers

    3.3 Decoders and Encoders

    3.3.1 Decoder Chips

    3.3.2 Encoders

    3.4 Comparators

    3.4.1 A Comparator Chip

    3.5 Adders

    3.5.1 An Example Adder Chip

    3.6 Programmable Logic Devices

    3.6.1 Programmable Logic Arrays (PLAs)

    3.6.2 Programmable Array Logic Devices (PALs)

    3.7 Arithmetic and Logic Units

    3.7.1 An Example ALU Chip

    3.8 Summary

    3.9 Exercises

    4 Sequential Logic Circuits

    4.1 Introduction

    4.2 Clock Signal

    4.3 Latches

    4.3.1 SR Latch

    4.3.2 Clocked SR Latch

    4.3.3 D Latch

    4.4 Flip-Flops

    4.4.1 D Flip-Flops

    4.4.2 JK Flip-Flops

    4.4.3 Example Chips

    4.5 Example Sequential Circuits

    4.5.1 Shift Registers

    4.5.2 Counters

    4.6 Sequential Circuit Design

    4.6.1 Binary Counter Design with JK Flip-Flops

    4.6.2 General Design Process

    4.7 Summary

    4.8 Exercises

    PART III: Interconnection

    5 System Buses

    5.1 Introduction

    5.2 Bus Design Issues

    5.2.1 Bus Width

    5.2.2 Bus Type

    5.2.3 Bus Operations

    5.3 Synchronous Bus

    5.3.1 Basic Operation

    5.3.2 Wait States

    5.3.3 Block Transfer

    5.4 Asynchronous Bus

    5.5 Bus Arbitration

    5.5.1 Dynamic Bus Arbitration

    5.5.2 Implementation of Dynamic Arbitration

    5.6 Example Buses

    5.6.1 The ISA Bus

    5.6.2 The PCI Bus

    5.6.3 Accelerated Graphics Port (AGP)

    5.6.4 The PCI-X Bus

    5.6.5 The PCMCIA Bus

    5.7 Summary

    5.8 Web Resources

    5.9 Exercises

    PART IV: Processors

    6 Processor Organization and Performance

    6.1 Introduction

    6.2 Number of Addresses

    6.2.1 Three-Address Machines

    6.2.2 Two-Address Machines

    6.2.3 One-Address Machines

    6.2.4 Zero-Address Machines

    6.2.5 A Comparison

    6.2.6 The Load/Store Architecture

    6.2.7 Processor Registers

    6.3 Flow of Control

    6.3.1 Branching

    6.3.2 Procedure Calls

    6.4 Instruction Set Design Issues

    6.4.1 Operand Types

    6.4.2 Addressing Modes

    6.4.3 Instruction Types

    6.4.4 Instruction Formats

    6.5 Microprogrammed Control

    6.5.1 Hardware Implementation

    6.5.2 Software Implementation

    6.6 Performance

    6.6.1 Performance Metrics

    6.6.2 Execution Time Calculation

    6.6.3 Means of Performance

    6.6.4 The SPEC Benchmarks

    6.7 Summary

    6.8 Exercises

    7 The Pentium Processor

    7.1 The Pentium Processor Family

    7.2 The Pentium Processor

    7.3 The Pentium Registers

    7.3.1 Data Registers

    7.3.2 Pointer and Index Registers

    7.3.3 Control Registers

    7.3.4 Segment Registers

    7.4 Real Mode Memory Architecture

    7.5 Protected Mode Memory Architecture

    7.5.1 Segment Registers

    7.5.2 Segment Descriptors

    7.5.3 Segment Descriptor Tables

    7.5.4 Segmentation Models

    7.5.5 Mixed-Mode Operation

    7.5.6 Which Segment Register to Use

    7.6 Summary

    7.7 Exercises

    8 Pipelining and Vector Processing

    8.1 Basic Concepts

    8.2 Handling Resource Conflicts

    8.3 Data Hazards

    8.3.1 Register Forwarding

    8.3.2 Register Interlocking

    8.4 Handling Branches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282

    8.4.1 Delayed Branch Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . 283

    8.4.2 Branch Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283

    8.5 Performance Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286

    8.5.1 Superscalar Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287

    8.5.2 Superpipelined Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . 288

    8.5.3 Very Long Instruction Word Architectures . . . . . . . . . . . . . . . . . . . 290

    8.6 Example Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291

    8.6.1 Pentium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291

    8.6.2 PowerPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294

    8.6.3 SPARC Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297

    8.6.4 MIPS Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299

    8.7 Vector Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299

    8.7.1 What Is Vector Processing? . . . . . . . . . . . . . . . . . . . . . . . . . . . 300

    8.7.2 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301

    8.7.3 Advantages of Vector Processing . . . . . . . . . . . . . . . . . . . . . . . . 303

    8.7.4 The Cray X-MP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304

    8.7.5 Vector Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306

    8.7.6 Vector Stride . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308

    8.7.7 Vector Operations on the Cray X-MP . . . . . . . . . . . . . . . . . . . . . 309

    8.7.8 Chaining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311

    8.8 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312

    8.8.1 Pipeline Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312

    8.8.2 Vector Processing Performance . . . . . . . . . . . . . . . . . . . . . . . . 314

    8.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315

    8.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317

    PART V: Pentium Assembly Language 319

    9 Overview of Assembly Language 321

    9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322

    9.2 Assembly Language Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322

    9.3 Data Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324

    9.3.1 Range of Numeric Operands . . . . . . . . . . . . . . . . . . . . . . . . . . 326

    9.3.2 Multiple Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327

    9.3.3 Multiple Initializations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329

    9.3.4 Correspondence to C Data Types . . . . . . . . . . . . . . . . . . . . . . . . 330

    9.3.5 LABEL Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331

    9.4 Where Are the Operands? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332

    9.4.1 Register Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 332

    9.4.2 Immediate Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . 333

    9.4.3 Direct Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334

    9.4.4 Indirect Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 335

    9.5 Data Transfer Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338

    9.5.1 The mov Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338

    9.5.2 The xchg Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339

    9.5.3 The xlat Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340

    9.6 Pentium Assembly Language Instructions . . . . . . . . . . . . . . . . . . . . . . . 340

    9.6.1 Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340

    9.6.2 Conditional Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345

    9.6.3 Iteration Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352

    9.6.4 Logical Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354

    9.6.5 Shift Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357

    9.6.6 Rotate Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361

    9.7 Defining Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364

    9.7.1 The EQU Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364

    9.7.2 The = Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366

    9.8 Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366

    9.9 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368

    9.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379

    9.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380

    9.12 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383

    10 Procedures and the Stack 387

    10.1 What Is a Stack? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388

    10.2 Pentium Implementation of the Stack . . . . . . . . . . . . . . . . . . . . . . . . . 388

    10.3 Stack Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390

    10.3.1 Basic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390

    10.3.2 Additional Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391

    10.4 Uses of the Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393

    10.4.1 Temporary Storage of Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 393

    10.4.2 Transfer of Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394

    10.4.3 Parameter Passing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394

    10.5 Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394

    10.6 Assembler Directives for Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . 396

    10.7 Pentium Instructions for Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . 397

    10.7.1 How Is Program Control Transferred? . . . . . . . . . . . . . . . . . . . . . 397

    10.7.2 The ret Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398

    10.8 Parameter Passing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399

    10.8.1 Register Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399

    10.8.2 Stack Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402

    10.8.3 Preserving Calling Procedure State . . . . . . . . . . . . . . . . . . . . . . . 406

    10.8.4 Which Registers Should Be Saved? . . . . . . . . . . . . . . . . . . . . . . 406

    10.8.5 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409

    10.9 Handling a Variable Number of Parameters . . . . . . . . . . . . . . . . . . . . . . 417

    10.10 Local Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420

    10.11 Multiple Source Program Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . 426

    10.11.1 PUBLIC Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427

    10.11.2 EXTRN Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427

    10.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430

    10.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431

    10.14 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433

    11 Addressing Modes 435

    11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435

    11.2 Memory Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437

    11.2.1 Based Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439

    11.2.2 Indexed Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439

    11.2.3 Based-Indexed Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . 441

    11.3 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441

    11.4 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448

    11.4.1 One-Dimensional Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . 449

    11.4.2 Multidimensional Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450

    11.4.3 Examples of Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452

    11.5 Recursion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455

    11.5.1 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456

    11.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464

    11.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464

    11.8 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465

    12 Selected Pentium Instructions 471

    12.1 Status Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472

    12.1.1 The Zero Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472

    12.1.2 The Carry Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474

    12.1.3 The Overflow Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477

    12.1.4 The Sign Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479

    12.1.5 The Auxiliary Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480

    12.1.6 The Parity Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481

    12.1.7 Flag Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483

    12.2 Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484

    12.2.1 Multiplication Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . 485

    12.2.2 Division Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488

    12.2.3 Application Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491

    12.3 Conditional Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497

    12.3.1 Indirect Jumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497

    12.3.2 Conditional Jumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500

    12.4 Implementing High-Level Language Decision Structures . . . . . . . . . . . . . . . 504

    12.4.1 Selective Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504

    12.4.2 Iterative Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 508

    12.5 Logical Expressions in High-Level Languages . . . . . . . . . . . . . . . . . . . . 510

    12.5.1 Representation of Boolean Data . . . . . . . . . . . . . . . . . . . . . . . . 510

    12.5.2 Logical Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511

    12.5.3 Bit Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511

    12.5.4 Evaluation of Logical Expressions . . . . . . . . . . . . . . . . . . . . . . . 511

    12.6 Bit Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515

    12.6.1 Bit Test and Modify Instructions . . . . . . . . . . . . . . . . . . . . . . . . 515

    12.6.2 Bit Scan Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516

    12.7 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516

    12.8 String Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526

    12.8.1 String Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526

    12.8.2 String Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527

    12.8.3 String Processing Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 536

    12.8.4 Testing String Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . 540

    12.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542

    12.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543

    12.11 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545

    13 High-Level Language Interface 551

    13.1 Why Program in Mixed-Mode? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552

    13.2 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552

    13.3 Calling Assembly Procedures from C . . . . . . . . . . . . . . . . . . . . . . . . . 554

    13.3.1 Parameter Passing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554

    13.3.2 Returning Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556

    13.3.3 Preserving Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556

    13.3.4 Publics and Externals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557

    13.3.5 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557

    13.4 Calling C Functions from Assembly . . . . . . . . . . . . . . . . . . . . . . . . . . 562

    13.5 Inline Assembly Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565

    13.5.1 Compiling Inline Assembly Programs . . . . . . . . . . . . . . . . . . . . . 565

    13.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566

    13.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567

    13.8 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567

    PART VI: RISC Processors 569

    14 RISC Processors 571

    14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572

    14.2 Evolution of CISC Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572

    14.3 RISC Design Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575

    14.3.1 Simple Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575

    14.3.2 Register-to-Register Operations . . . . . . . . . . . . . . . . . . . . . . . . 576

    14.3.3 Simple Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . 576

    14.3.4 Large Number of Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . 576

    14.3.5 Fixed-Length, Simple Instruction Format . . . . . . . . . . . . . . . . . . . 577

    14.4 PowerPC Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578

    14.4.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578

    14.4.2 PowerPC Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581

    14.5 Itanium Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590

    14.5.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591

    14.5.2 Itanium Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594

    14.5.3 Handling Branches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604

    14.5.4 Predication to Eliminate Branches . . . . . . . . . . . . . . . . . . . . . . . 605

    14.5.5 Speculative Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606

    14.5.6 Branch Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 610

    14.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611

    14.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 612

    15 MIPS Assembly Language 615

    15.1 MIPS Processor Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616

    15.1.1 Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616

    15.1.2 General-Purpose Register Usage Convention . . . . . . . . . . . . . . . . . 617

    15.1.3 Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 618

    15.1.4 Memory Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619

    15.2 MIPS Instruction Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619

    15.2.1 Instruction Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620

    15.2.2 Data Transfer Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . 621

    15.2.3 Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623

    15.2.4 Logical Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627

    15.2.5 Shift Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627

    15.2.6 Rotate Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628

    15.2.7 Comparison Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628

    15.2.8 Branch and Jump Instructions . . . . . . . . . . . . . . . . . . . . . . . . . 630

    15.3 SPIM System Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632

    15.4 SPIM Assembler Directives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634

    15.5 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636

    15.6 Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643

    15.7 Stack Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648

    15.7.1 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649

    15.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657

    15.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658

    15.10 Programming Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659

    PART VII: Memory 663

    16 Memory System Design 665

    16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666

    16.2 A Simple Memory Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666

    16.2.1 Memory Design with D Flip-Flops . . . . . . . . . . . . . . . . . . . . . . . 667

    16.2.2 Problems with the Design . . . . . . . . . . . . . . . . . . . . . . . . . . . 667

    16.3 Techniques to Connect to a Bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669

    16.3.1 Using Multiplexers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669

    16.3.2 Using Open Collector Outputs . . . . . . . . . . . . . . . . . . . . . . . . . 669

    16.3.3 Using Tristate Buffers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671

    16.4 Building a Memory Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673

    16.5 Building Larger Memories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674

    16.5.1 Designing Independent Memory Modules . . . . . . . . . . . . . . . . . . . 676

    16.5.2 Designing Larger Memories Using Memory Chips . . . . . . . . . . . . . . 678

    16.6 Mapping Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681

    16.6.1 Full Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681

    16.6.2 Partial Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 682

    16.7 Alignment of Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683

    16.8 Interleaved Memories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684

    16.8.1 The Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 685

    16.8.2 Synchronized Access Organization . . . . . . . . . . . . . . . . . . . . . . . 686

    16.8.3 Independent Access Organization . . . . . . . . . . . . . . . . . . . . . . . 687

    16.8.4 Number of Banks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 688

    16.8.5 Drawbacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 689

    16.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 689

    16.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 690

    17 Cache Memory 693

    17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 694

    17.2 How Cache Memory Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 695

    17.3 Why Cache Memory Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 697

    17.4 Cache Design Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699

    17.5 Mapping Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 700

    17.5.1 Direct Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 703

    17.5.2 Associative Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 707

    17.5.3 Set-Associative Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . 708

    17.6 Replacement Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 711

    17.7 Write Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713

    17.8 Space Overhead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715

    17.9 Mapping Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 717

    17.10 Types of Cache Misses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718

    17.11 Types of Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 719

    17.11.1 Separate Instruction and Data Caches . . . . . . . . . . . . . . . . . . . . . 719

    17.11.2 Number of Cache Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . 720

    17.11.3 Virtual and Physical Caches . . . . . . . . . . . . . . . . . . . . . . . . . . 722

    17.12 Example Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 722

    17.12.1 Pentium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 722

    17.12.2 PowerPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724

    17.12.3 MIPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726

    17.13 Cache Operation: A Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 727

    17.13.1 Placement of a Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 727

    17.13.2 Location of a Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 728

    17.13.3 Replacement Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 728

    17.13.4 Write Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 728

    17.14 Design Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 729

    17.14.1 Cache Capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 729

    17.14.2 Cache Line Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 729

    17.14.3 Degree of Associativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 731

    17.15 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 731

    17.16 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733

    18 Virtual Memory 735

    18.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736

    18.2 Virtual Memory Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737

    18.2.1 Page Replacement Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . 738

    18.2.2 Write Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 739

    18.2.3 Page Size Tradeoff . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 740

    18.2.4 Page Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 741

    18.3 Page Table Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 741

    18.3.1 Page Table Entries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 742

    18.4 The Translation Lookaside Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . 743

    18.5 Page Table Placement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 744

    18.5.1 Searching Hierarchical Page Tables . . . . . . . . . . . . . . . . . . . . . . 745

    18.6 Inverted Page Table Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 746

    18.7 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748

    18.8 Example Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 750

    18.8.1 Pentium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 750

    18.8.2 PowerPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 754

    18.8.3 MIPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756

    18.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760

    18.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 761

    PART VIII: Input and Output 765

    19 Input/Output Organization 767

    19.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 768

    19.2 Accessing I/O Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 770

    19.2.1 I/O Address Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 770

    19.2.2 Accessing I/O Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 770

    19.3 An Example I/O Device: Keyboard . . . . . . . . . . . . . . . . . . . . . . . . . . . 772

    19.3.1 Keyboard Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 772

    19.3.2 8255 Programmable Peripheral Interface Chip . . . . . . . . . . . . . . . . . 772

    19.4 I/O Data Transfer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 774

    19.4.1 Programmed I/O

    19.4.2 DMA

    19.5 Error Detection and Correction

    19.5.1 Parity Encoding

    19.5.2 Error Correction

    19.5.3 Cyclic Redundancy Check

    19.6 External Interface

    19.6.1 Serial Transmission

    19.6.2 Parallel Interface

    19.7 Universal Serial Bus

    19.7.1 Motivation

    19.7.2 Additional USB Advantages

    19.7.3 USB Encoding

    19.7.4 Transfer Types

    19.7.5 USB Architecture

    19.7.6 USB Transactions

    19.8 IEEE 1394

    19.8.1 Advantages of IEEE 1394

    19.8.2 Power Distribution

    19.8.3 Transfer Types

    19.8.4 Transactions

    19.8.5 Bus Arbitration

    19.8.6 Configuration

    19.9 The Bus Wars

    19.10 Summary

    19.11 Web Resources

    19.12 Exercises

    20 Interrupts

    20.1 Introduction

    20.2 A Taxonomy of Pentium Interrupts

    20.3 Pentium Interrupt Processing

    20.3.1 Interrupt Processing in Protected Mode

    20.3.2 Interrupt Processing in Real Mode

    20.4 Pentium Software Interrupts

    20.4.1 DOS Keyboard Services

    20.4.2 BIOS Keyboard Services

    20.5 Pentium Exceptions

    20.6 Pentium Hardware Interrupts

    20.6.1 How Does the CPU Know the Interrupt Type?

    20.6.2 How Can More Than One Device Interrupt?

    20.6.3 8259 Programmable Interrupt Controller

    20.6.4 A Pentium Hardware Interrupt Example

    20.7 Interrupt Processing in the PowerPC

    20.8 Interrupt Processing in the MIPS

    20.9 Summary

    20.10 Exercises

    20.11 Programming Exercises

    APPENDICES

    A Computer Arithmetic

    A.1 Positional Number Systems

    A.1.1 Notation

    A.2 Number Systems Conversion

    A.2.1 Conversion to Decimal

    A.2.2 Conversion from Decimal

    A.2.3 Conversion Among Binary, Octal, and Hexadecimal

    A.3 Unsigned Integer Representation

    A.3.1 Arithmetic on Unsigned Integers

    A.4 Signed Integer Representation

    A.4.1 Signed Magnitude Representation

    A.4.2 Excess-M Representation

    A.4.3 1’s Complement Representation

    A.4.4 2’s Complement Representation

    A.5 Floating-Point Representation

    A.5.1 Fractions

    A.5.2 Representing Floating-Point Numbers

    A.5.3 Floating-Point Representation

    A.5.4 Floating-Point Addition

    A.5.5 Floating-Point Multiplication

    A.6 Summary

    A.7 Exercises

    A.8 Programming Exercises

    B Character Representation

    B.1 Character Sets

    B.2 Universal Character Set

    B.3 Unicode

    B.4 Summary

    C Assembling and Linking Pentium Assembly Language Programs

    C.1 Structure of Assembly Language Programs

    C.2 Input/Output Routines

    C.2.1 Character I/O

    C.2.2 String I/O

    C.2.3 Numeric I/O

    C.3 Assembling and Linking

    C.3.1 The Assembly Process

    C.3.2 Linking Object Files

    C.4 Summary

    C.5 Exercises

    C.6 Programming Exercises

    D Debugging Assembly Language Programs

    D.1 Strategies to Debug Assembly Language Programs

    D.2 DEBUG

    D.2.1 Display Group

    D.2.2 Execution Group

    D.2.3 Miscellaneous Group

    D.2.4 An Example

    D.3 Turbo Debugger TD

    D.4 CodeView

    D.5 Summary

    D.6 Exercises

    D.7 Programming Exercises

    E Running Pentium Assembly Language Programs on a Linux System

    E.1 Introduction

    E.2 NASM Assembly Language Program Template

    E.3 Illustrative Examples

    E.4 Summary

    E.5 Exercises

    E.6 Programming Exercises

    F Digital Logic Simulators

    F.1 Testing Digital Logic Circuits

    F.2 Digital Logic Simulators

    F.2.1 DIGSim Simulator

    F.2.2 Digital Simulator

    F.2.3 Multimedia Logic Simulator

    F.2.4 Logikad Simulator

    F.3 Summary

    F.4 Web Resources

    F.5 Exercises

    G SPIM Simulator and Debugger

    G.1 Introduction

    G.2 Simulator Settings

    G.3 Running and Debugging a Program

    G.3.1 Loading and Running

    G.3.2 Debugging

    G.4 Summary

    G.5 Exercises

    G.6 Programming Exercises

    H The SPARC Architecture

    H.1 Introduction

    H.2 Registers

    H.3 Addressing Modes

    H.4 Instruction Set

    H.4.1 Instruction Format

    H.4.2 Data Transfer Instructions

    H.4.3 Arithmetic Instructions

    H.4.4 Logical Instructions

    H.4.5 Shift Instructions

    H.4.6 Compare Instructions

    H.4.7 Branch Instructions

    H.5 Procedures and Parameter Passing

    H.5.1 Procedure Instructions

    H.5.2 Parameter Passing

    H.5.3 Stack Implementation

    H.5.4 Window Management

    H.6 Summary

    H.7 Web Resources

    H.8 Exercises

    I Pentium Instruction Set

    I.1 Pentium Instruction Format

    I.1.1 Instruction Prefixes

    I.1.2 General Instruction Format

    I.2 Selected Pentium Instructions

    Bibliography

    Index

    Chapter 1

    Overview of Computer Organization

    Objectives

    • To provide a high-level overview of computer organization;

    • To discuss how architects, implementers, programmers, and users view the computer system;

    • To describe the three main components: processor, memory, and I/O;

    • To give a brief historical perspective of computers.

    We begin each chapter with an overview of what you can expect in the chapter. This is our first

    overview. The main purpose of this chapter is to provide an overview of computer systems.

    We start off with a brief introduction to computer systems from the user’s viewpoint.

    Computer systems are complex. To manage this complexity, we use a series of abstractions.

    The kind of abstraction used depends on what you want to do with the system. We present

    the material in this book from three perspectives: from the computer architect’s view, from the

    programmer’s view, and from the implementer’s view. We give details about these three views

    in Sections 1.2 through 1.4.

    A computer system consists of three major components: a processor, a memory unit, and

    an input/output (I/O) subsystem. A system bus interconnects these three components. The next

    three sections discuss these three components in detail. Section 1.5 provides an overview of

    the processor component. The processors we cover in this book include the Pentium, MIPS,

    PowerPC, Itanium, and SPARC. Section 1.6 presents some basic concepts about the memory

    system. Later chapters describe cache and virtual memories in detail. Section 1.7 gives a brief

    overview of how input/output devices such as the keyboard are interfaced to the system. A more

    detailed description of I/O interfacing can be found in the last two chapters. We conclude the

    chapter by providing a perspective on the history of computers.

    Figure 1.1 A user’s view of a computer system: system hardware at the core, surrounded by a layer of system software, with applications software as the outermost layer.

    1.1 Introduction

    This book is about digital computer systems, which have been revolutionizing our society. Most

    of us use computers for a variety of tasks, from serious scientific computations to entertainment.

    You are reading this book because you are interested in learning more about these magnificent

    machines.

    As with any complex project, several stages and players are involved in designing,

    implementing, and realizing a computer system. This book deals with the inner details of a computer

    system, focusing on both hardware and software.

    Computer hardware is the electronic circuitry that performs the actual work. Hardware

    includes things with which you are already familiar such as the processor, memory, keyboard,

    CD burner, and so on. Miniaturization of hardware is the most recent advance in the computer

    hardware area. This miniaturization gives us such compact things as PocketPCs and Flash

    memories.

    Computer software can be divided into application software and system software. A user

    interacts with the system through an application program. For the user, the application is the

    computer! For example, if you are interested in browsing the Internet, you interact with the

    system through a Web browser such as the Netscape™ Communicator or Internet Explorer. For

    you, the system appears as though it is executing the application program (i.e., Web browser),

    as shown in Figure 1.1.

    At the core is the basic hardware, over which a layer of system software hides the gory

    details about the hardware. Early ancestors of the Pentium and other processors were called

    microprocessors because they were less powerful than the processors used in the computers at

    that time.

    The system software manages the hardware resources efficiently and also provides nice

    services to the application software layer. What is the system software? Operating systems

    such as Windows™, UNIX™, and Linux are the most familiar examples. System software also

    includes compilers, assemblers, and linkers that we discuss later in this book. You are probably

    more familiar with application software, which includes Web browsers, word processors, music

    players, and so on.

    This book presents details on various aspects of computer system design and programming.

    We discuss organization and architecture of computer systems, how they are designed, and how

    they are programmed. In order to clarify the scope of this book, we need to explain these terms:

    computer architecture, computer organization, computer design, and computer programming.

    Computer architecture refers to the aspects with which a programmer is concerned. The

    most obvious one is the design of an instruction set for the computer. For example, should

    the processor understand instructions to process multimedia data? The answer depends on

    the intended use of the system. Clearly, if the target applications involve multimedia, adding

    multimedia instructions will help improve the performance. Computer architecture, in a sense,

    describes the computer system at a logical level, from the programmer’s viewpoint. It deals

    with the selection of the basic functional units such as the processor and memory, and how they

    should be interconnected into a computer system.

    Computer organization is concerned with how the various hardware components operate

    and how they are interconnected to implement the architectural specifications. For example, if

    the architecture specifies a divide instruction, we can choose to implement this

    instruction either in hardware or in software. In a high-performance model, we may implement the

    division operation in hardware to provide improved performance at a higher price. In cheaper

    models, we may implement it in software. But cost need not be the only deciding criterion.

    For example, the Pentium processor implements the divide operation in hardware whereas the

    next generation Itanium processor implements division in software. If the next version of

    Itanium uses a hardware implementation of division, that does not change the architecture, only

    its organization.
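To make the hardware-versus-software choice concrete, here is a minimal sketch of how a divide operation could be carried out in software using only shifts, comparisons, and subtractions. The function name `soft_divide` and the 32-bit default width are assumptions made for illustration. This is the classic shift-and-subtract (long division) scheme, not the method used by the Itanium or any other particular processor.

```python
from typing import Tuple


def soft_divide(dividend: int, divisor: int, width: int = 32) -> Tuple[int, int]:
    """Unsigned division built from shift, compare, and subtract only.

    Returns (quotient, remainder). `width` is the assumed word size in bits.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    limit = 1 << width
    if not (0 <= dividend < limit and 0 < divisor < limit):
        raise ValueError("operands must be unsigned and fit in the given width")

    quotient = 0
    remainder = 0
    # Walk through the dividend one bit at a time, from the MSB down to the LSB.
    for bit in range(width - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        # If the divisor fits, subtract it and record a 1 in the quotient.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder


print(soft_divide(100, 7))  # (14, 2)
```

A program that calls such a routine gets the same quotient and remainder that a hardware divider would produce. Only the cost differs, which is why the choice belongs to the organization and not to the architecture.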

    Computer design is an activity that translates architectural specifications of a system into

    an implementation using a particular organization. As a result, computer design is sometimes

    referred to as computer implementation. A computer designer is concerned with the hardware

    design of the computer.

    Computer programming involves expressing the problem at hand in a language that the

    computer can understand. As we show later, the native language that a computer can understand is

    called the machine language. But this is not a language with which we humans are

    comfortable. So we use a language that we can easily read and understand. These languages are called

    high-level languages, and include languages such as Java™ and C. We do not devote any space

    to these high-level languages as they are beyond the scope of this book. Instead, we discuss

    in detail languages that are close to the architecture of a machine. This allows us to study the

    internal details of computer systems.

    Computers are complex systems. How do we manage the complexity of these systems? We

    can get clues from looking at how we manage complex systems in life. Think of how a large

    corporation is managed. We use a hierarchical structure to simplify the management: president

    at the top and employees at the bottom. Each level of management filters out unnecessary details

    on the lower levels and presents only an abstracted version to the higher-level management. This

    is what we refer to as abstraction. We study computer systems by using layers of abstraction.

    Different people view computer systems differently depending on the type of their

    interaction. We use the concept of abstraction to look at only the details that are necessary from a

    particular viewpoint. For example, if you are a computer architect, you are interested in the

    internal details that do not interest a normal user of the system. One can look at computer systems

    from several different perspectives. We have already talked about the user’s view. Our interest

    in this book is not at this level. Instead, we concentrate on the following views: (i) a

    programmer’s view, (ii) an architect’s view, and (iii) an implementer’s view. The next three sections

    briefly discuss these perspectives.

    1.1.1 Basic Terms and Notation

    The alphabet of computers, more precisely digital computers, consists of 0 and 1. Each is

    called a bit, which stands for binary digit. The term byte is used to represent a group of

    8 bits. The term word is used to refer to a group of bytes that is processed simultaneously.

    The exact number of bytes that constitute a word depends on the system. For example, in the

    Pentium, a word refers to four bytes or 32 bits. On the other hand, eight bytes are grouped into

    a word in the Itanium processor. The reasons for this difference are explained later. We use the

    abbreviation “b” for bits, “B” for bytes, and “W” for words. Sometimes we also use doubleword

    and quadword. A doubleword has twice the number of bits as the word and the quadword has

    four times the number of bits in a word.
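The relationships among these units can be checked with a few lines of arithmetic. The sketch below uses the word sizes stated in the text (four bytes for the Pentium, eight bytes for the Itanium). The function name `unit_sizes_in_bits` is an assumption made for illustration.

```python
BITS_PER_BYTE = 8


def unit_sizes_in_bits(word_bytes: int) -> dict:
    """Return the size in bits of each unit for a given word size in bytes."""
    word_bits = word_bytes * BITS_PER_BYTE
    return {
        "byte": BITS_PER_BYTE,
        "word": word_bits,
        "doubleword": 2 * word_bits,  # twice the bits of a word
        "quadword": 4 * word_bits,    # four times the bits of a word
    }


# Word sizes as stated in the text: 4 bytes (Pentium), 8 bytes (Itanium).
for processor, word_bytes in (("Pentium", 4), ("Itanium", 8)):
    print(processor, unit_sizes_in_bits(word_bytes))
```

Under these definitions, a doubleword on a system with a four-byte word has the same number of bits as a word on a system with an eight-byte word, so the terms are meaningful only relative to a given system.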

    Bits in a word are usually ordered from right to left, as you would write digits in a decimal

    number. The rightmost bit is called the least significant bit (LSB), and the leftmost bit is called

    the most significant bit (MSB). However, some manufacturers use the opposite notation; the

    PowerPC manuals are one example. In this book, we consistently write bits of a

    word from right to left, with the LSB as the rightmost bit.
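The right-to-left convention can be illustrated with a short sketch that picks out individual bits of a value. The helper names are hypothetical, and numbering bit positions from 0 at the rightmost bit is an assumption (a common convention that the text above does not state explicitly).

```python
def bit_at(value: int, position: int) -> int:
    """Return the bit at `position`, counting from 0 at the rightmost bit (assumed)."""
    return (value >> position) & 1


def lsb(value: int) -> int:
    """Least significant bit: the rightmost bit."""
    return bit_at(value, 0)


def msb(value: int, width: int) -> int:
    """Most significant bit: the leftmost bit of a `width`-bit word."""
    return bit_at(value, width - 1)


value = 0b10110100               # an 8-bit example value
print(format(value, "08b"))      # 10110100
print(msb(value, 8), lsb(value)) # 1 0
```

Note that `msb` needs the word width, because the leftmost bit is defined only once the size of the word is fixed.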

    We use standard terms such as kilo (K), mega (M), giga (G), and so on to represent large

    integers. Unfortunately, we use two different versions of each, depending on the number system,

    decimal or binary. Table 1.1 summarizes the differences between the two systems. Typically,

    computer-related attributes use the binary version. For example, when we say 128 megabyte

    (MB) memory, we mean $128 \times 2^{20}$ bytes. Usually, communication-related quantities and time

    units are expressed using the decimal system. For example, when we say that the data transfer

    rate is 100 megabits/second (Mb/s), we mean $100 \times 10^{6}$ bits/second.
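The two examples above can be worked out numerically. The following sketch encodes the decimal and binary versions of each prefix as listed in Table 1.1; the dictionary and variable names are assumptions made for illustration.

```python
# Decimal (base 10) and binary (base 2) meanings of each prefix.
DECIMAL_PREFIX = {"K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15}
BINARY_PREFIX = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40, "P": 2**50}

# Memory sizes use the binary version: 128 MB of memory.
memory_bytes = 128 * BINARY_PREFIX["M"]
print(memory_bytes)          # 134217728

# Data transfer rates use the decimal version: 100 Mb/s.
rate_bits_per_second = 100 * DECIMAL_PREFIX["M"]
print(rate_bits_per_second)  # 100000000

# The gap between the two versions widens as the prefix grows.
for prefix in "KMGTP":
    ratio = BINARY_PREFIX[prefix] / DECIMAL_PREFIX[prefix]
    print(prefix, f"{ratio:.3f}")
```

The loop at the end shows why the distinction matters more for large quantities: the binary value exceeds the decimal one by about 2.4% at kilo, and the difference keeps growing with each larger prefix.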

    Table 1.1 Terms to represent large integer values

    Term        Decimal (base 10)    Binary (base 2)
    K (kilo)    $10^{3}$             $2^{10}$
    M (mega)    $10^{6}$             $2^{20}$
    G (giga)    $10^{9}$             $2^{30}$
    T (tera)    $10^{12}$            $2^{40}$
    P (peta)    $10^{15}$            $2^{50}$

    Throughout the text, we use various number systems: binary, octal, and hexadecimal. Now

    is a good time to refresh your memory by reviewing the material on number systems presented

    in Appendix A. If the number system used is not clear from the context, we use a trailing

    letter to specify the number system. We use “D” for decimal numbers, “B” for binary numbers,

    “Q” for octal numbers, and “H” for hexadecimal (or hex for short) numbers. For example,

    10110101B is an 8-bit binary number whereas 10ABH is a hex number.
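As an illustration of the trailing-letter notation, the sketch below converts such a string to its integer value. The function name `parse_number` is hypothetical, and treating a string with no trailing letter as decimal is an assumption, since the text leaves that case to context.

```python
# Trailing letter -> base, as described in the text.
SUFFIX_BASE = {"B": 2, "Q": 8, "D": 10, "H": 16}


def parse_number(text: str) -> int:
    """Convert a number written with a trailing base letter to an integer.

    A string with no trailing letter is treated as decimal (an assumption).
    """
    text = text.strip().upper()
    if not text:
        raise ValueError("empty number")
    suffix = text[-1]
    if suffix in SUFFIX_BASE:
        # int() raises ValueError if a digit is not valid in the given base.
        return int(text[:-1], SUFFIX_BASE[suffix])
    return int(text, 10)


print(parse_number("10110101B"))  # 181
print(parse_number("10ABH"))      # 4267
print(parse_number("17Q"))        # 15
print(parse_number("255D"))       # 255
```

Because “B” and “D” are also valid hex digits, the trailing letter is always read as the base marker here. A hex value must therefore carry its “H”, as 10ABH does in the example above.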

    1.2 Programmer’s View

    A programmer’s view of a computer system depends on the type and level of language she

    intends to use. From the programmer’s viewpoint, there exists a hierarchy from low-level

    languages to high-level languages. As we move up in this hierarchy, the level of abstraction

    increases. At the lowest level, we have the machine language that is the native language of the

    machine. This is the language understood by the machine hardware. Since digital computers use

    0 and 1 as their alphabet, machine language naturally uses 1s and 0s to encode the instructions.

    One level up, t