Basic Concepts1

Embed Size (px)

Citation preview

  • 7/31/2019 Basic Concepts1

    1/18

    1

    Unit 3Pipelining

  • 7/31/2019 Basic Concepts1

    2/18

    2

    Overview

    Pipelining is widely used in modern processors.

    Pipelining improves system performance in

    terms of throughput.

    Pipelined organization requires sophisticated

    compilation techniques.

  • 7/31/2019 Basic Concepts1

    3/18

    3

    Basic Concepts

  • 7/31/2019 Basic Concepts1

    4/18

    4

    Making the Execution of

    Programs Faster

    Use faster circuit technology to build the

    processor and the main memory.

    Arrange the hardware so that more than one

    operation can be performed at the same time.

    In the latter way, the number of operations

    performed per second is increased even

    though the elapsed time needed to perform anyone operation is not changed.

  • 7/31/2019 Basic Concepts1

    5/18

    5

    Traditional Pipeline Concept

    Traditional Pipeline Concept

    A B C D

  • 7/31/2019 Basic Concepts1

    6/18

    6

    Traditional Pipeline Concept

    Traditional

    Pipeline

    Concept

    A

    B

    C

    D

    30 40 20 30 40 20 30 40 20 30 40 20

    6 PM 7 8 9 10 11 Midnight

    Time

  • 7/31/2019 Basic Concepts1

    7/18

    7

    Traditional Pipeline Concept

    Traditional

    PipelineConcept

    A

    B

    C

    D

    6 PM 7 8 9 10 11 Midnight

    T

    a

    s

    k

    O

    rd

    e

    r

    Time

    30 40 40 40 40 20

  • 7/31/2019 Basic Concepts1

    8/18

    8

    Traditional Pipeline Concept

    TraditionalPipeline

    ConceptA

    B

    C

    D

    6 PM 7 8 9

    T

    a

    s

    k

    O

    r

    d

    e

    r

    Time

    30 40 40 40 40 20

  • 7/31/2019 Basic Concepts1

    9/18

    9

    Use the Idea of Pipelining in a

    Computer

    F1

    E1

    F2

    E2

    F3

    E3

    I1

    I2

    I3

    (a) Sequential execution

    Instructionfetchunit

    Ex ecutionunit

    Interstage buffer

    B1

    (b) Hardware organization

    T ime

    F1

    E1

    F2

    E2

    F3

    E3

    I1

    I2

    I3

    Instruction

    (c) Pipelined execution

    Figure 8.1. Basic idea of instruction pipelining.

    Clock cycle 1 2 3 4

    Time

    Fetch + Execution

  • 7/31/2019 Basic Concepts1

    10/18

    10

    Use the Idea of Pipelining in a

    Computer

    F4I4

    F1

    F2

    F3

    I1

    I2

    I3

    D1

    D2

    D3

    D4

    E1

    E2

    E3

    E4

    W1

    W2

    W3

    W4

    Instruction

    Figure 8.2. A 4-stage pipeline.

    Clock cycle 1 2 3 4 5 6 7

    (a) Instruction execution div ided into f our steps

    F : Fetch

    instruction

    D : Decodeinstruction

    and fetchoperands

    E: Execute

    operation

    W : Write

    results

    Interstage buffers

    (b) Hardware organization

    B1 B2 B3

    Time

    Fetch + Decode

    + Execution + Write

    Textbook page: 457

  • 7/31/2019 Basic Concepts1

    11/18

    11

    Role of Cache Memory

    Each pipeline stage is expected to complete in one clock

    cycle.

    The clock period should be long enough to let the slowest

    pipeline stage to complete. Faster stages can only wait for the slowest one to

    complete.

    Since main memory is very slow compared to the

    execution, if each instruction needs to be fetched frommain memory, pipeline is almost useless.

    Fortunately, we have cache.

  • 7/31/2019 Basic Concepts1

    12/18

    12

    Pipeline Performance

    The potential increase in performance resulting

    from pipelining is proportional to the number of

    pipeline stages.

    However, this increase would be achieved only

    if all pipeline stages require the same time to

    complete, and there is no interruption

    throughout program execution. Unfortunately, this is not true.

  • 7/31/2019 Basic Concepts1

    13/18

    13

    Pipeline Performance

    F1

    F2

    F3

    I1

    I2

    I3

    E1

    E2

    E3

    D1

    D2

    D3

    W1

    W2

    W3

    Instruction

    F4

    D4

    I4

    Clock cycle 1 2 3 4 5 6 7 8 9

    Figure 8.3. Effect of an execution operation taking more than one clock cycle.

    E4

    F5I5 D5

    Time

    E5

    W4

  • 7/31/2019 Basic Concepts1

    14/18

    14

    Pipeline Performance

    The previous pipeline is said to have been stalled for two clock cycles. Any condition that causes a pipeline to stall is called a hazard. Data hazard any condition in which either the source or the destination

    operands of an instruction are not available at the time expected in thepipeline. So some operation has to be delayed, and the pipeline stalls.

    Instruction (control) hazard a delay in the availability of an instructioncauses the pipeline to stall.

    Structural hazard the situation when two instructions require the use ofa given hardware resource at the same time.

  • 7/31/2019 Basic Concepts1

    15/18

    15

    Pipeline Performance

    F1

    F2

    F3

    I1

    I2

    I3

    D1

    D2

    D3

    E1

    E2

    E3

    W1

    W2

    W3

    Instruction

    Figure 8.4. Pipeline stall caused by a cache miss in F2.

    1 2 3 4 5 6 7 8 9Clock cycle

    (a) Instruction execution steps in successiv e clock cy cles

    1 2 3 4 5 6 7 8Clock cycle

    Stage

    F: Fetch

    D: Decode

    E: Execute

    W: Write

    F1 F2 F3

    D1 D2 D3idle idle idle

    E1 E2 E3idle idle idle

    W1 W2idle idle idle

    (b) Function perf ormed by each processor stage in success iv e clock cy cles

    9

    W3

    F2 F2 F2

    Time

    Time

    Idle periods

    stalls (bubbles)

    Instruction

    hazard

  • 7/31/2019 Basic Concepts1

    16/18

    16

    Pipeline Performance

    F1

    F2

    F3

    I1

    I2 (Load)

    I3

    E1

    M2

    D1

    D2

    D3

    W1

    W2

    Instruction

    F4I4

    Clock cycle 1 2 3 4 5 6 7

    Figure 8.5. Effect of a Load instruction on pipeline timing.

    F5I5 D5

    Time

    E2

    E3 W3

    E4D4

    Load X(R1), R2Structural

    hazard

  • 7/31/2019 Basic Concepts1

    17/18

    17

    Pipeline Performance

    Again, pipelining does not result in individual instructions

    being executed faster; rather, it is the throughput that

    increases.

    Throughput is measured by the rate at which instructionexecution is completed.

    Pipeline stall causes degradation in pipeline performance.

    We need to identify all hazards that may cause the

    pipeline to stall and to find ways to minimize their impact.

  • 7/31/2019 Basic Concepts1

    18/18

    18

    Quiz

    Four instructions, the I2 takes two clock cycles

    for execution. Pls draw the figure for 4-stage

    pipeline, and figure out the total cycles needed

    for the four instructions to complete.