ECE 456 Computer Architecture
Lecture #14 – CPU (III)
Instruction Cycle & Pipelining
Instructor: Dr. Honggang Wang
Fall 2013
Dr. Wang Lecture #13 2
Administrative Issues (Wednesday, Dec 4)
• Project
– Report due Dec 9
– Presentation due 2:00 pm, Dec 9
– Order: Group 1, Group 2, Group 3, Group 4
• Exam 2 review
Review of Lecture #12 & 13
• Machine instruction characteristics
– Constituent elements, instruction representation, instruction types, and number of addresses
• Instruction set design
– Types of operands
– Operation repertoire
– Addressing modes (how is the operand address specified?): immediate, direct, indirect, register, register indirect, displacement (relative, base-register, indexing), stack
– Instruction formats
• Little-, big-, and bi-endian (byte ordering, bit ordering)
Topics
• Instruction cycle
• Instruction pipelining
– Principle
– Performance
– Problems (L15)
– Examples (L15)
Instruction Cycle
+ Indirect Cycle (for indirect addressing operands)
Instruction Cycle with Indirect Sub-Cycle
Instruction Cycle State Diagram
Data Flow in Each Cycle
Data Flow (1: Fetch Cycle)
– PC contains address of next instruction
– Address moved to MAR
– Address placed on address bus
– Control unit requests a memory read
– Result placed on data bus, copied to MBR, then to IR
– Meanwhile PC incremented by 1
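The register transfers listed above can be sketched in a few lines of Python. This is a minimal illustration only: the `Cpu` class, the `fetch` function, the 16-word memory, and the instruction word `0x1A05` are all assumptions made for the example, not part of the lecture.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    """Hypothetical toy CPU state (names are assumptions for illustration)."""
    pc: int = 0      # program counter: address of the next instruction
    mar: int = 0     # memory address register
    mbr: int = 0     # memory buffer register
    ir: int = 0      # instruction register
    memory: list = field(default_factory=lambda: [0] * 16)


def fetch(cpu: Cpu) -> None:
    """Perform one fetch cycle as a sequence of register transfers."""
    cpu.mar = cpu.pc                # address moved to MAR (and onto the address bus)
    cpu.mbr = cpu.memory[cpu.mar]   # memory read: result arrives in MBR via the data bus
    cpu.pc += 1                     # PC incremented (in hardware this overlaps the read)
    cpu.ir = cpu.mbr                # instruction copied from MBR to IR


cpu = Cpu(pc=3)
cpu.memory[3] = 0x1A05              # arbitrary instruction word
fetch(cpu)
print(hex(cpu.ir), cpu.pc)          # 0x1a05 4
```

The sketch runs the steps one after another; real hardware performs the PC increment in parallel with the memory read.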
Data Flow (2: Indirect Cycle)
• IR is examined
• If indirect addressing, indirect cycle is performed
– Rightmost N bits of MBR transferred to MAR
– Control unit requests a memory read
– Result (address of operand) moved to MBR
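As a rough sketch of the indirect cycle, the following Python models the steps above. The 12-bit address field (N = 12), the opcode value, and the memory contents are assumptions chosen only to make the example concrete.

```python
# Hypothetical instruction layout: the rightmost ADDRESS_BITS bits of the
# instruction word hold the address field. N = 12 is an assumption.
ADDRESS_BITS = 12
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def indirect_cycle(mbr: int, memory: list) -> tuple:
    """Return (MAR, MBR) after one indirect cycle.

    On entry, MBR still holds the instruction read during the fetch cycle.
    """
    mar = mbr & ADDRESS_MASK   # rightmost N bits of MBR transferred to MAR
    mbr = memory[mar]          # memory read: MBR now holds the operand's address
    return mar, mbr


memory = [0] * 32
memory[5] = 20                            # location 5 holds the address of the operand
instruction = (0x3 << ADDRESS_BITS) | 5   # arbitrary opcode 3, address field 5
mar, mbr = indirect_cycle(instruction, memory)
print(mar, mbr)                           # 5 20
```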
Data Flow (3: Execute Cycle)
• May take many forms
• Depends on instruction being executed
• May include
– Register transfers
– Memory read/write
– Input/Output
– ALU operations
Data Flow (4: Interrupt Cycle)
• Simple & predictable
• Current PC saved to allow resumption after interrupt
– Contents of PC copied to MBR
– Special memory location (e.g. stack pointer) loaded to MAR
– MBR written to memory
• PC loaded with address of ISR
• Next instruction (first of ISR) can be fetched
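The interrupt cycle can be sketched the same way. The addresses `SAVE_LOCATION` and `ISR_ADDRESS` below are made-up values for illustration; a real machine would take them from a stack pointer and an interrupt vector.

```python
# Hypothetical addresses, chosen only for this example.
SAVE_LOCATION = 15   # special memory location (e.g. taken from a stack pointer)
ISR_ADDRESS = 8      # address of the interrupt service routine (ISR)


def interrupt_cycle(pc: int, memory: list) -> int:
    """Save the current PC to memory and return the new PC (start of the ISR)."""
    mbr = pc              # contents of PC copied to MBR
    mar = SAVE_LOCATION   # special memory location loaded to MAR
    memory[mar] = mbr     # MBR written to memory
    return ISR_ADDRESS    # PC loaded with address of ISR


memory = [0] * 16
new_pc = interrupt_cycle(pc=4, memory=memory)
print(new_pc, memory[SAVE_LOCATION])   # 8 4
```

After the cycle, the saved value in memory is what allows the interrupted program to resume.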
Agenda
• Instruction cycle
– Fetch, indirect, execute, interrupt cycle
– Data flow
• Instruction pipelining
– Principle
– Performance
– Problems
– Examples
A Laundry Example
• Let us assume there are four steps to the weekly (monthly) laundry: 4 loads
Do the Laundry
• Sequential 4 loads: 16 cycles
• Pipelined 4 loads: 7 cycles
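The two cycle counts can be checked with a small sketch. It assumes each of the four steps takes exactly one cycle and that a step can hold only one load at a time; the function name `finish_times` is hypothetical.

```python
def finish_times(loads: int, steps: int, overlap: bool) -> list:
    """Cycle at which each load leaves the last step (cycles counted from 1)."""
    times = []
    for i in range(loads):
        if overlap:
            start = i           # pipelined: a new load starts every cycle
        else:
            start = i * steps   # sequential: wait until the previous load is done
        times.append(start + steps)
    return times


print(finish_times(4, 4, overlap=False))   # [4, 8, 12, 16]
print(finish_times(4, 4, overlap=True))    # [4, 5, 6, 7]
```

The last entry of each list is the total: 16 cycles sequentially, 7 cycles pipelined.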
Principles of Pipelining
• Tasks are subdivided into successive subtasks
• A pipeline stage is associated with each subtask
• The same amount of time is allocated to each subtask
• All pipeline stages operate like an assembly line; the 1st stage accepts input, the last stage delivers the output
• Basic pipeline is synchronous
Instruction Pipelining
• A key, powerful technique for making a fast CPU
• An ‘assembly line’ in computing used for instruction processing; 6 stages of (nearly) equal duration
– Fetch instruction (FI)
– Decode instruction (DI)
– Calculate operands, i.e. effective addresses (CO)
– Fetch operands (FO)
– Execute instruction (EI)
– Write operand / result (WO)
• Multiple instructions are overlapped in execution
Timing of Instruction Pipeline (1)
[Figure: pipeline timing diagram. Without pipelining: 54 cycles; with pipelining: 14 cycles]
Timing of Instruction Pipeline (2)
• Time progresses vertically down the figure
• Each row shows the state of the pipeline at a given point in time
• The pipeline is full from time 6 through 9, with a different instruction in each stage
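Since the figure itself is not reproduced here, the following sketch regenerates a table in the layout described above: one row per time unit, one column per stage. It assumes 9 instructions and no hazards (an assumption consistent with the pipeline being full from time 6 through 9); `pipeline_states` is a hypothetical helper name.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def pipeline_states(n_instructions: int) -> list:
    """Return one row per time unit; row[s] is the 1-based number of the
    instruction in stage s at that time, or None if the stage is idle."""
    k = len(STAGES)
    rows = []
    for t in range(k + n_instructions - 1):   # t = 0 corresponds to time unit 1
        row = []
        for s in range(k):
            i = t - s                         # index of the instruction in stage s
            row.append(i + 1 if 0 <= i < n_instructions else None)
        rows.append(row)
    return rows


rows = pipeline_states(9)                     # 9 instructions: an assumption
print("time  " + "  ".join(STAGES))
for t, row in enumerate(rows, start=1):
    cells = "  ".join(f"I{i}" if i is not None else "--" for i in row)
    print(f"{t:>4}  {cells}")

# Time units at which every stage holds an instruction.
full = [t for t, row in enumerate(rows, start=1) if None not in row]
print(full)                                   # [6, 7, 8, 9]
```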
Comments (1)
• Each instruction is assumed to go through all 6 stages of the pipeline
– Not always the case, e.g., no WO stage for ‘LOAD’
– Timing is set up this way to simplify the pipeline hardware
• Assume no potential hazard
– Data dependency, branch, interrupt
Comments (2)
• Assumes no memory conflicts
– Most memory systems don’t permit simultaneous accesses
– The desired value may be in cache, FO or WO may be a null stage, or separate instruction and data memories may be used, so the pipeline is not slowed down by memory conflicts much of the time
Pipeline Performance (1)
• Cycle time – the time available for each stage to accomplish the required operations
– Determined by the worst-case processing time of the longest stage
– In current pipelined processors: 2-20 ns
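A short sketch of how the cycle time follows from the slowest stage. The per-stage latencies below are invented numbers for illustration, not measurements from any real processor.

```python
# Hypothetical worst-case stage latencies in ns (made-up values).
stage_latency_ns = {"FI": 8, "DI": 4, "CO": 5, "FO": 10, "EI": 7, "WO": 6}

# The cycle time is set by the slowest stage.
cycle_time_ns = max(stage_latency_ns.values())

# Every faster stage sits idle for the remainder of each cycle.
idle_ns = {name: cycle_time_ns - t for name, t in stage_latency_ns.items()}

print(cycle_time_ns)   # 10
print(idle_ns["DI"])   # 6
```

With these numbers, FO fixes the cycle at 10 ns, and DI finishes its work 6 ns early in every cycle.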
Pipeline Performance (2)
• Total time to execute n instructions
– k: number of stages in the pipeline
– To complete the execution of the 1st instruction: k cycles
– The remaining n-1 instructions require n-1 cycles
$T_k = [k + (n - 1)]\,\tau$, where $\tau$ is the cycle time
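The cycle count above translates directly into code. The function names are hypothetical, and the example values (6 stages, 9 instructions) are an assumption chosen because they reproduce the 14 and 54 cycles quoted for the timing figure.

```python
def pipelined_cycles(k: int, n: int) -> int:
    """Cycles to run n instructions through a k-stage pipeline (no hazards)."""
    return k + (n - 1)   # k cycles for the 1st instruction, 1 more for each of the rest


def unpipelined_cycles(k: int, n: int) -> int:
    """Cycles when each instruction finishes all k stages before the next starts."""
    return n * k


def total_time(cycles: int, cycle_time: float) -> float:
    """Convert a cycle count into time, given the cycle time."""
    return cycles * cycle_time


print(pipelined_cycles(6, 9), unpipelined_cycles(6, 9))   # 14 54
```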
Pipeline Performance (3)
• Speedup factor
– Compared to execution without pipeline (the cycle time $\tau$ cancels):
$S_k = \dfrac{T_1}{T_k} = \dfrac{nk\tau}{[k + (n - 1)]\,\tau} = \dfrac{nk}{k + (n - 1)}$
– The larger the # of pipeline stages, the larger the potential for speedup
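To see how the speedup factor behaves as the number of instructions grows, here is a small sketch (the function name `speedup` is an assumption). For a fixed k, the value approaches k as n becomes large, which is why more stages give more potential speedup.

```python
def speedup(k: int, n: int) -> float:
    """Speedup of a k-stage pipeline over unpipelined execution for n instructions."""
    return (n * k) / (k + (n - 1))


for n in (1, 10, 100, 10_000):
    print(n, round(speedup(6, n), 3))
# 1 1.0
# 10 4.0
# 100 5.714
# 10000 5.997
```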
[Figure: speedup factor illustration]
Pipeline Performance (4)
• Throughput
– Also called “repetition rate”
– The shortest possible time interval between subsequent independent instructions in the pipeline
– When the basic pipe is full, throughput is 1 cycle, i.e. one instruction completes every cycle
Hands-On Problem
Suppose you have a simple 6-stage pipeline executing a basic code block containing 10 instructions. Assume the pipeline clock cycle time is 10 ns and there is no potential hazard (data / branch / interrupt).
1. What is the total time to execute this block of code?
2. What is the repetition rate of this pipeline for this basic block?
3. What is the speedup factor?
Difficulties with Pipelining
• The stages are not of equal duration
– Use the worst-case processing time of the longest stage
– Waiting must be involved
• Data hazard due to Read-After-Write dependency
• Conditional branch instructions could invalidate the fetched instructions behind them
• Interrupts could invalidate the fetched instructions
Summary of Lecture #14
• Instruction cycle (elaborated version)
– Fetch, indirect, execute, interrupt cycle
– Data flow
• Instruction pipelining
– Principle: assembly line
– Performance measures
– Problems / difficulties introduction
Things To Do
• Work on the project
• Check the class website for lecture notes
Solution
• T = (k + (n - 1)) * c, where k = 6 is the number of stages in the pipeline, n = 10 is the number of instructions to be executed, and c = 10 ns is the clock cycle time. So the total time to execute the code is (6 + 9) * 10 ns = 150 ns.
• The repetition rate is also known as throughput; for this pipeline, the throughput is 1 cycle, i.e. 10 ns.
• The speedup factor is the ratio of total execution time without pipelining to total execution time with pipelining. The total time without pipelining is n * k * c = 600 ns, so the speedup factor is S = 600 / 150 = 4.