ECE 456 Computer Architecture Lecture #14 CPU (III) Instruction Cycle & Pipelining Instructor: Dr. Honggang Wang Fall 2013


ECE 456 Computer Architecture

Lecture #14 – CPU (III)

Instruction Cycle & Pipelining

Instructor: Dr. Honggang Wang

Fall 2013

Administrative Issues (Wednesday, Dec 4)

• Project

– Report due Dec 9

– Presentation due 2:00 pm, Dec 9

– Order: Group 1, Group 2, Group 3, Group 4

• Exam 2 review

Review of Lecture #12 & 13

• Machine instruction characteristics

– Constituent elements, instruction representation, instruction types, and number of addresses

• Instruction set design

– Types of operands

– Operation repertoire

– Addressing modes (how is the operand address specified?): immediate, direct, indirect, register, register indirect, displacement (relative, base-register, indexing), stack

– Instruction formats

• Little-, big-, and bi-endian (byte ordering, bit ordering)

Topics

• Instruction cycle

• Instruction pipelining

– Principle

– Performance

– Problems (L15)

– Examples (L15)

Instruction Cycle

+ Indirect Cycle (for indirect addressing operands)

Instruction Cycle with Indirect Sub-Cycle

Instruction Cycle State Diagram

Data Flow in Each Cycle

Data Flow (1: Fetch Cycle)

– PC contains address of next instruction

– Address moved to MAR

– Address placed on address bus

– Control unit requests a memory read

– Result placed on data bus, copied to MBR, then to IR

– Meanwhile PC incremented by 1
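
The fetch-cycle steps above can be sketched as a toy register-transfer model. The dictionary-based CPU state, the memory contents, and the instruction strings are illustrative assumptions, not part of the lecture.

```python
# Toy model of the fetch cycle; register names follow the slide (PC, MAR, MBR, IR).
def fetch_cycle(cpu, memory):
    cpu["MAR"] = cpu["PC"]            # address of the next instruction moved to MAR
    cpu["MBR"] = memory[cpu["MAR"]]   # memory read: result arrives in MBR
    cpu["IR"] = cpu["MBR"]            # instruction copied from MBR to IR
    cpu["PC"] += 1                    # PC incremented (conceptually in parallel)
    return cpu

# Hypothetical memory image: two instructions at addresses 100 and 101.
memory = {100: "LOAD 940", 101: "ADD 941"}
cpu = {"PC": 100, "MAR": None, "MBR": None, "IR": None}
fetch_cycle(cpu, memory)
print(cpu)
```

In hardware the PC increment overlaps with the memory read; the sketch runs the steps one after another only because Python is sequential.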

Data Flow (2: Indirect Cycle)

• IR is examined

• If indirect addressing is used, an indirect cycle is performed

– Rightmost N bits of MBR transferred to MAR

– Control unit requests a memory read

– Result (address of operand) moved to MBR
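
A minimal sketch of the indirect cycle, assuming a hypothetical 16-bit instruction word whose rightmost N = 12 bits form the address field. The word layout, the value of N, and the memory contents are assumptions chosen for illustration.

```python
ADDRESS_BITS = 12  # N: assumed width of the address field

def indirect_cycle(cpu, memory):
    mask = (1 << ADDRESS_BITS) - 1
    cpu["MAR"] = cpu["MBR"] & mask    # rightmost N bits of MBR transferred to MAR
    cpu["MBR"] = memory[cpu["MAR"]]   # memory read: address of operand moved to MBR
    return cpu

# After the fetch cycle, MBR still holds the instruction word:
# assumed 4-bit opcode 0x1 followed by the address field 0x3A4.
cpu = {"MAR": None, "MBR": 0x13A4}
memory = {0x3A4: 0x0940}              # location 0x3A4 holds the operand's address
indirect_cycle(cpu, memory)
print(hex(cpu["MAR"]), hex(cpu["MBR"]))
```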

Data Flow (3: Execute Cycle)

• May take many forms

• Depends on instruction being executed

• May include

– Register transfers

– Memory read/write

– Input/Output

– ALU operations

Data Flow (4: Interrupt Cycle)

• Simple & predictable

• Current PC saved to allow resumption after interrupt

– Contents of PC copied to MBR

– Special memory location (e.g. stack pointer) loaded to MAR

– MBR written to memory

• PC loaded with address of ISR

• Next instruction (first of ISR) can be fetched
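
The interrupt-cycle data flow can be sketched in the same toy style. The `SP` register, the downward-growing stack, and the addresses used are assumptions for illustration; the slide only says that a special memory location such as the stack pointer supplies the save address.

```python
def interrupt_cycle(cpu, memory, isr_address):
    cpu["MBR"] = cpu["PC"]            # contents of PC copied to MBR
    cpu["SP"] -= 1                    # assumed: stack grows toward lower addresses
    cpu["MAR"] = cpu["SP"]            # save location loaded into MAR
    memory[cpu["MAR"]] = cpu["MBR"]   # MBR written to memory
    cpu["PC"] = isr_address           # PC loaded with the address of the ISR
    return cpu

cpu = {"PC": 205, "SP": 1000, "MAR": None, "MBR": None}
memory = {}
interrupt_cycle(cpu, memory, isr_address=5000)
print(cpu["PC"], memory)
```

After the call, the next fetch cycle reads the first instruction of the ISR, and the saved PC value in memory allows the interrupted program to resume later.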

Agenda

• Instruction cycle

– Fetch, indirect, execute, interrupt cycle

– Data flow

• Instruction pipelining

– Principle

– Performance

– Problems

– Examples

A Laundry Example

• Assume the weekly (or monthly) laundry takes four steps per load, and there are 4 loads

Do the Laundry

• Sequential, 4 loads: 16 cycles

• Pipelined, 4 loads: 7 cycles
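
The two cycle counts for the laundry example (4 loads, 4 steps each) can be checked with a short sketch. It assumes every step takes exactly one cycle; the function names are illustrative.

```python
def sequential_cycles(loads, steps):
    # Each load runs all of its steps before the next load starts.
    return loads * steps

def pipelined_cycles(loads, steps):
    # The first load needs `steps` cycles; each later load finishes one cycle after the previous one.
    return steps + (loads - 1)

print(sequential_cycles(4, 4))
print(pipelined_cycles(4, 4))
```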

Principles of Pipelining

• Tasks are subdivided into successive subtasks

• A pipeline stage is associated with each subtask

• The same amount of time is allocated to each subtask

• All pipeline stages operate like an assembly line; the 1st stage accepts input, the last stage delivers the output

• Basic pipeline is synchronous

Instruction Pipelining

• A key, powerful technique for building fast CPUs

• An ‘assembly line’ in computing used for instruction processing; 6 stages of (nearly) equal duration

– Fetch instruction (FI)

– Decode instruction (DI)

– Calculate operands, i.e. effective addresses (CO)

– Fetch operands (FO)

– Execute instruction (EI)

– Write operand / result (WO)

• Multiple instructions are overlapped in execution

Timing of Instruction Pipeline (1)

• Sequential execution: 54 cycles

• Pipelined execution: 14 cycles

Timing of Instruction Pipeline (2)

• Time progresses vertically down the figure

• Each row shows the state of the pipeline at a given point in time

• Pipeline is full at time 6 through 9 with different instructions in different stages
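
The timing diagram described above can be regenerated with a short sketch: one row per time unit, one column per stage. It assumes 9 instructions (the count implied by the 54- and 14-cycle totals), one cycle per stage, and no hazards; the function name and layout are illustrative.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]

def pipeline_rows(n_instructions, stages=STAGES):
    """One row per time unit; each entry is the 1-based instruction number
    occupying that stage, or None if the stage is empty."""
    k = len(stages)
    rows = []
    for t in range(1, k + n_instructions):   # k + (n - 1) time units in total
        row = []
        for s in range(k):                   # s = 0 for FI ... k - 1 for WO
            instr = t - s                    # instruction i is in stage s at time i + s
            row.append(instr if 1 <= instr <= n_instructions else None)
        rows.append(row)
    return rows

rows = pipeline_rows(9)
print("time  " + "  ".join(f"{s:>3}" for s in STAGES))
for t, row in enumerate(rows, start=1):
    labels = [f"I{i}" if i else "-" for i in row]
    print(f"{t:>4}  " + "  ".join(f"{lab:>3}" for lab in labels))

# Time units at which every stage holds an instruction.
full_times = [t for t, row in enumerate(rows, start=1) if all(row)]
print(full_times)
```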

Comments (1)

• Each instruction is assumed to go through all 6 stages of the pipeline

– Not always the case, e.g., no WO stage for ‘LOAD’

– The timing is set up this way to simplify the pipeline hardware

• Assume no potential hazards

– Data dependency, branch, interrupt

Comments (2)

• Assumes no memory conflicts

– Most memory systems don’t permit simultaneous accesses

– However, the desired value may be in cache, the FO or WO stage may be null, or separate instruction and data memories may be used, so the pipeline is not slowed down much of the time

Pipeline Performance (1)

• Cycle time: the time available for each stage to accomplish the required operations

– Determined by the worst-case processing time of the longest stage

– Current pipelined processors: 2-20 ns
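
To make the "longest stage sets the cycle time" rule concrete, here is a small sketch. The per-stage delays are invented for illustration and are not from the lecture.

```python
# Hypothetical worst-case delay of each stage, in ns (illustrative values only).
stage_delays_ns = {"FI": 8, "DI": 4, "CO": 5, "FO": 10, "EI": 7, "WO": 6}

# The slowest stage determines the pipeline cycle time.
cycle_time_ns = max(stage_delays_ns.values())

# Every faster stage sits idle for the remainder of each cycle.
idle_ns = {stage: cycle_time_ns - delay for stage, delay in stage_delays_ns.items()}

print(cycle_time_ns)
print(idle_ns)
```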

Pipeline Performance (2)

• Total time to execute n instructions

– k: number of stages in the pipeline

– c: cycle time

– To complete the execution of the 1st instruction: k cycles

– The remaining n-1 instructions require n-1 cycles

$$T_k = [k + (n - 1)]\,c$$
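
As a quick check against the timing diagram earlier in the lecture, take k = 6 stages and assume n = 9 instructions (the count implied by the 54- and 14-cycle totals), writing c for the cycle time:

$$T_k = [k + (n - 1)]\,c = [6 + (9 - 1)]\,c = 14\,c$$

$$T_1 = n\,k\,c = 9 \cdot 6 \cdot c = 54\,c$$

Here $T_1$ denotes the time without pipelining, where each instruction occupies all k stages before the next one starts.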

Pipeline Performance (3)

• Speedup factor

– Compared to execution without pipeline:

$$S_k = \frac{T_1}{T_k} = \frac{n\,k\,c}{[k + (n - 1)]\,c} = \frac{n\,k}{k + (n - 1)}$$

– The larger the # of pipeline stages, the larger the potential for speedup

Pipeline Performance (3): Speedup Factor Illustration
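
Since the illustration itself is not reproduced here, the following sketch tabulates the speedup factor nk / (k + (n - 1)) for a few stage counts and instruction counts. The chosen values of k and n are arbitrary examples.

```python
def speedup(k, n):
    """Speedup of a k-stage pipeline over unpipelined execution of n instructions."""
    return n * k / (k + (n - 1))

for k in (6, 9, 12):
    values = [round(speedup(k, n), 2) for n in (1, 8, 64, 1024)]
    print(f"k={k}: {values}")
```

For a single instruction the speedup is 1 (no overlap is possible), and as n grows the speedup approaches k, which is why a deeper pipeline has a larger potential speedup.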

Pipeline Performance (4)

• Throughput

– Also called “repetition rate”

– The shortest possible time interval between subsequent independent instructions in the pipeline

– When the basic pipeline is full, the repetition rate is 1 cycle (one instruction completes per cycle)

Hands-On Problem

Suppose you have a simple 6-stage pipeline executing a basic code block containing 10 instructions. Assume the pipeline clock cycle time is 10 ns and there is no potential hazard (data / branch / interrupt).

1. What is the total time to execute this block of code?

2. What is the repetition rate of this pipeline for this basic block?

3. What is the speedup factor?

Difficulties with Pipelining

• The stages are not of equal duration

– Use the worst-case processing time of the longest stage

– Waiting must be involved

• Data hazard due to Read-After-Write dependency

• Conditional branch instructions could invalidate the fetched instructions behind them

• Interrupt could invalidate the fetched instructions
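
A Read-After-Write dependency arises when an instruction reads a register that an earlier instruction still in the pipeline has not yet written. The sketch below flags such pairs in a tiny made-up program. The three-operand instruction format and the `window` parameter (how many following instructions are checked) are assumptions; the real distance depends on the pipeline design.

```python
# Each instruction is (destination register, source registers); format is hypothetical.
program = [
    ("R1", ("R2", "R3")),   # I1: R1 <- R2 + R3
    ("R4", ("R1", "R5")),   # I2: R4 <- R1 - R5  (reads R1, which I1 writes)
    ("R6", ("R7", "R8")),   # I3: independent of I1 and I2
]

def raw_hazards(program, window=1):
    """Return (producer, consumer) index pairs where a later instruction, at most
    `window` positions behind, reads a register written by an earlier one."""
    pairs = []
    for i, (dest, _sources) in enumerate(program):
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if dest in program[j][1]:
                pairs.append((i, j))
    return pairs

print(raw_hazards(program))
```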

Summary of Lecture #14

• Instruction cycle (elaborated version)

– Fetch, indirect, execute, interrupt cycle

– Data flow

• Instruction pipelining

– Principle: assembly line

– Performance measures

– Problems / difficulties introduction

Things To Do

• Work on the project

• Check the class website for lecture notes

Solution

• T = (k + (n-1)) * c, where k = 6 is the number of stages in the pipeline, n = 10 is the number of instructions to be executed, and c = 10 ns is the clock cycle time. The total time to execute the code is (6 + 9) * 10 ns = 150 ns.

• The repetition rate is also known as throughput. For this pipeline, the throughput is 1 cycle, i.e., 10 ns between successive instructions.

• The speedup factor is the ratio of the total execution time without pipelining to the total execution time with pipelining. The total time without pipelining is n*k*c = 600 ns, so the speedup factor is s = 600/150 = 4.
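
The arithmetic in this solution can be re-checked with a few lines of code. The function and variable names are illustrative; the formulas are the ones stated above.

```python
def pipelined_time(k, n, c):
    # k cycles for the first instruction, one more cycle for each of the remaining n - 1.
    return (k + (n - 1)) * c

def unpipelined_time(k, n, c):
    # Every instruction passes through all k stages before the next one starts.
    return n * k * c

k, n, c_ns = 6, 10, 10
t_pipe = pipelined_time(k, n, c_ns)
t_seq = unpipelined_time(k, n, c_ns)
s = t_seq / t_pipe
print(t_pipe, t_seq, s)
```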