PROCESSOR STRUCTURE AND FUNCTION
Members: Zhe Geng, Jorge Montenegro, Carlos Garrido, Walli Butt, Adrian Suarez, Diego Arias


Page 1: Processor structure and function

PROCESSOR STRUCTURE AND FUNCTION

Members: Zhe Geng, Jorge Montenegro, Carlos Garrido, Walli Butt, Adrian Suarez, Diego Arias

Page 2: Processor structure and function

CONTENTS

• Processor Organization

• Register Organization: User-visible registers, Control and status registers, Example microprocessor register organizations

• Instruction Cycle: The indirect cycle, Data flow

• Instruction Pipeline: Strategy, Branch prediction

Page 3: Processor structure and function

PROCESSOR ORGANIZATION

Zhe Geng

Page 4: Processor structure and function

CPU MUST DO THE FOLLOWING THINGS:

• Fetch instruction: Read the instruction from memory

• Interpret instruction: The instruction is decoded

• Fetch data: Read data from memory or an I/O module

• Process data: Perform an arithmetic or logical operation

• Write data: Write data to memory or an I/O module

Page 5: Processor structure and function

CPU WITH SYSTEM BUS

Page 6: Processor structure and function

CPU INTERNAL STRUCTURE

Page 7: Processor structure and function

REGISTER ORGANIZATION

Carlos Garrido & Jorge Montenegro

Page 8: Processor structure and function

REGISTERS

• CPU must have some working space (temporary storage)

• Called registers

• Number and function vary between processor designs

• One of the major design decisions

• Top level of the memory hierarchy

Page 9: Processor structure and function

USER VISIBLE REGISTERS

• General Purpose

• Data

• Address

• Condition Codes

Page 10: Processor structure and function

HOW MANY GP REGISTERS?

• Between 8 and 32

• Fewer = more memory references

• More registers do not noticeably reduce memory references and take up processor real estate

• See also RISC:

  • One cycle execution time

  • Pipelining

  • Large number of registers

Page 11: Processor structure and function

HOW BIG?

• Large enough to hold full address

• Large enough to hold full word

• Often possible to combine two data registers

• C programming

• long long int a;

• long int a;

Page 12: Processor structure and function

CONDITION CODE REGISTERS

ADVANTAGES:

• Because condition codes are set by normal arithmetic and data movement instructions, they should reduce the number of COMPARE and TEST instructions needed.

• Conditional instructions, such as BRANCH, are simplified relative to composite instructions, such as TEST AND BRANCH.

• Condition codes facilitate multi-way branches. For example, a TEST instruction can be followed by two branches, one on less than or equal to zero and one on greater than zero.

Page 13: Processor structure and function

CONDITION CODE REGISTERS

DISADVANTAGES:

• Condition codes add complexity, both to the hardware and to the software. Condition code bits are often modified in different ways by different instructions.

• Condition codes are irregular; they are typically not part of the main data path, so they require extra hardware connections.

• Often, condition code machines must add special non-condition-code instructions for special situations anyway.

• In a pipelined implementation, condition codes require special synchronization to avoid conflicts.

Page 14: Processor structure and function

CONTROL AND STATUS REGISTERS

• Program Counter (PC): Contains the address of an instruction to be fetched.

• Instruction Register (IR): Contains the instruction most recently fetched.

• Memory Address Register (MAR): Contains the address of a location in memory.

• Memory Buffer Register (MBR): Contains a word of data to be written to memory or the word most recently read.

Page 15: Processor structure and function

PROGRAM STATUS WORD (PSW)

• Sign: Contains the sign bit of the result of the last arithmetic operation.

• Zero: Set when the result is 0.

• Carry: Set if an operation resulted in a carry (addition) into or a borrow (subtraction) out of a high-order bit.

• Equal: Set if a logical compare result is equality.

• Overflow: Used to indicate arithmetic overflow.

• Interrupt enable/disable: Used to enable or disable interrupts.
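As a rough illustration of how these flag bits relate to an arithmetic result, the following Python sketch derives four of them for an unsigned addition. The function name `add_with_flags`, the 8-bit default width, and the dictionary layout are assumptions made for this example; real processors differ in which flags exist and which instructions set them.

```python
def add_with_flags(a, b, width=8):
    """Add two `width`-bit values and derive PSW-style flags (illustrative only)."""
    mask = (1 << width) - 1           # e.g. 0xFF for 8 bits
    sign_bit = 1 << (width - 1)       # e.g. 0x80 for 8 bits
    a &= mask
    b &= mask
    raw = a + b                       # unbounded Python integer sum
    result = raw & mask               # what fits in the register
    flags = {
        "sign": bool(result & sign_bit),   # top bit of the result
        "zero": result == 0,               # the result, not a register, is zero
        "carry": raw > mask,               # carry out of the high-order bit
        # Signed overflow: both operands share a sign that the result lacks.
        "overflow": bool(~(a ^ b) & (a ^ result) & sign_bit),
    }
    return result, flags
```

For instance, `add_with_flags(0x7F, 0x01)` yields 0x80 with the sign and overflow flags set, while `add_with_flags(0xFF, 0x01)` yields 0 with the zero and carry flags set.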

Page 16: Processor structure and function

SUPERVISOR MODE

Supervisor: Indicates whether the processor is executing in supervisor mode or user mode.

• Privileged instructions

• Address space

• Memory management

Protection domain or protection ring

• Ring

• Kernel

Page 17: Processor structure and function

MICROPROCESSOR REGISTER ORGANIZATION

Page 18: Processor structure and function

INSTRUCTION CYCLE

Walli Butt

Page 19: Processor structure and function

SECTION 12.3: INSTRUCTION CYCLE

• An instruction cycle (sometimes called the fetch-and-execute cycle, fetch-decode-execute cycle, or FDX) is the basic operation cycle of a computer.

• It is the process by which a computer retrieves a program instruction from its memory, determines what actions the instruction requires, and carries out those actions.

• This cycle is repeated continuously by the central processing unit (CPU), from bootup to when the computer is shut down.

Page 20: Processor structure and function

The circuits used in the CPU during the cycle are:

• Program Counter (PC)

• Memory Address Register (MAR)

• Memory Data Register (MDR), also called the Memory Buffer Register (MBR)

• Instruction Register (IR)

• Control Unit (CU)

• Arithmetic Logic Unit (ALU)

There are typically four stages of an instruction cycle that the CPU carries out:

1) Fetch the instruction from memory.

2) "Decode" the instruction.

3) "Read the effective address" from memory if the instruction has an indirect address.

4) "Execute" the instruction.

Page 21: Processor structure and function

The instruction cycle is the time in which a single instruction is fetched from memory, decoded, and executed.

THE FOUR SUB-CYCLES:

• Fetch: Reads the next instruction from memory into the processor.

• Indirect: Operands may have to be fetched through an indirect address, which requires additional memory accesses.

• Interrupt: Saves the state of the current process and services the interrupt.

• Execute: Interprets the opcode and performs the indicated operation.

Page 22: Processor structure and function

There are six fundamental phases of the instruction cycle:

1.) fetch instruction (aka pre-fetch)

2.) decode instruction

3.) evaluate address (address generation)

4.) fetch operands (read memory data)

5.) execute (ALU access)

6.) store result (writeback memory data)
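The phases listed here can be sketched as a small interpreter loop. This is a toy model under stated assumptions, not any real instruction set: the accumulator machine, the `(opcode, mode, address)` tuple encoding, the four opcodes, and the function name `run` are all invented for illustration.

```python
def run(memory):
    """Run a toy accumulator machine until HALT; returns the accumulator.

    Hypothetical encoding: a memory word is either an int (data) or a
    tuple (opcode, mode, address) standing in for an encoded instruction.
    """
    pc, acc = 0, 0
    while True:
        # Fetch: PC -> MAR, memory[MAR] -> MBR, MBR -> IR, then bump the PC.
        mar = pc
        mbr = memory[mar]
        ir = mbr
        pc += 1
        # Decode: split the instruction into its fields.
        opcode, mode, address = ir
        # Indirect cycle: one extra memory read to obtain the effective address.
        if mode == "indirect":
            address = memory[address]
        # Fetch operands, execute, and store the result.
        if opcode == "LOAD":
            acc = memory[address]
        elif opcode == "ADD":
            acc += memory[address]
        elif opcode == "STORE":
            memory[address] = acc
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")


program = [
    ("LOAD", "direct", 5),     # acc = memory[5]
    ("ADD", "indirect", 6),    # acc += memory[memory[6]]
    ("STORE", "direct", 7),    # memory[7] = acc
    ("HALT", "direct", 0),
    0,                         # unused word
    20,                        # address 5: first operand
    8,                         # address 6: pointer to the second operand
    0,                         # address 7: result goes here
    22,                        # address 8: second operand
]
```

Calling `run(program)` returns 42 and leaves 42 in `program[7]`; the `ADD` instruction is the only one that goes through the indirect cycle.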

Page 23: Processor structure and function

DECODE, EVALUATE AND FETCH

• Decoding the instruction?

• The decoder interprets what?

• What is being fetched from memory?

• What decision is made next?

• Based on the decision, what are the options?

• What if the decision is a direct memory operation?

• What if the decision is an indirect memory operation?

Page 24: Processor structure and function

A NOTE ON ADDRESSING MODES

Page 25: Processor structure and function

INSTRUCTION CYCLE WITH AND WITHOUT INDIRECT CYCLE…

Page 26: Processor structure and function

SAMPLE QUESTION:

Given that the instruction cycle is the time in which a single instruction is fetched from memory, decoded, and executed:

A microprocessor provides an instruction capable of moving a string of bytes from one area of memory to another. The fetching and initial decoding of the instruction takes 10 clock cycles. Thereafter, it takes 15 clock cycles to transfer each byte. The microprocessor is clocked at a rate of 10 GHz. Determine the length of the instruction cycle for the case of a string of 64 bytes.

ANSWER:

The length of a clock cycle is 1 / (10 GHz) = 0.1 ns. The length of the instruction cycle for this case is [10 + (15 × 64)] × 0.1 ns = 970 × 0.1 ns = 97 ns.
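The arithmetic in this answer can be checked with a few lines of Python. The function name and parameter names below are assumptions chosen for this sketch; the numbers come from the question itself.

```python
def instruction_time_ns(setup_cycles, cycles_per_byte, n_bytes, clock_hz):
    """Return (total clock cycles, duration in nanoseconds) for a string move."""
    total_cycles = setup_cycles + cycles_per_byte * n_bytes
    duration_ns = total_cycles * 1e9 / clock_hz   # cycles * (ns per cycle)
    return total_cycles, duration_ns


# 10 cycles of fetch/decode, 15 cycles per byte, 64 bytes, 10 GHz clock.
cycles, duration_ns = instruction_time_ns(10, 15, 64, 10e9)
```

With the values from the question this gives 970 cycles and 97 ns, since 10 + 15 × 64 = 970 and each cycle lasts 0.1 ns.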

Page 27: Processor structure and function

ANOTHER EXAMPLE: TOTAL NUMBER OF CYCLES REQUIRED

To execute the SAL instruction: add A, B, C

1.) Fetch instruction (add) from memory address PC.

2.) Increment PC to address of next instruction.

3.) Decode the instruction and operands.

4.) Load the operands B and C from memory.

5.) Execute the add operation.

6.) Store the result into memory location A.

Execution Time

Suppose each memory access (fetch, load, store) requires 10 clock cycles and that the PC update, instruction decode, and execution each require 1 clock cycle. The total number of cycles to execute the add instruction is:

10+1+1+10+10+1+10 = 43 cycles/instruction.

A CPU running at 100 MHz (100,000,000 cycles/sec) can execute add instructions at a rate of 100,000,000/43 = 2,325,581 instructions/sec, or ~2.3 MIPS (million instructions/sec).
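The same cycle count can be tallied step by step. In this sketch the step labels and variable names are assumptions; the costs (10 cycles per memory access, 1 cycle per internal step) are the ones stated in the example, with step 4 counted as two separate loads.

```python
MEMORY_ACCESS = 10   # cycles per fetch, load, or store
INTERNAL_STEP = 1    # cycles per PC update, decode, or execute

steps = [
    ("fetch instruction", MEMORY_ACCESS),
    ("increment PC", INTERNAL_STEP),
    ("decode", INTERNAL_STEP),
    ("load operand B", MEMORY_ACCESS),
    ("load operand C", MEMORY_ACCESS),
    ("execute add", INTERNAL_STEP),
    ("store result to A", MEMORY_ACCESS),
]

cycles_per_instruction = sum(cost for _, cost in steps)

clock_hz = 100_000_000                                    # 100 MHz
instructions_per_sec = clock_hz // cycles_per_instruction  # whole instructions
mips = instructions_per_sec / 1_000_000
```

The four memory accesses account for 40 of the 43 cycles, which is why the later slides on pipelining and buffering concentrate on hiding memory latency.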

Page 28: Processor structure and function

DATA FLOW: FETCH CYCLE

[Figure: fetch-cycle data flow diagram; labeled elements include IR and MBR]

Page 29: Processor structure and function

DATA FLOW: INDIRECT CYCLE

[Figure: indirect-cycle data flow diagram; labeled elements include MBR and Memory]

Page 30: Processor structure and function

DATA FLOW: INTERRUPT CYCLE

[Figure: interrupt-cycle data flow diagram; labeled elements include Control Unit and PC]

Page 31: Processor structure and function

DATA FLOW: EXECUTE CYCLE

The execute cycle takes many forms; the form depends on which of the various machine instructions is in the IR.

This cycle may involve:

• transferring data among registers

• a read from or write to memory or I/O

• invocation of the ALU

Page 32: Processor structure and function

INSTRUCTION PIPELINE

Adrian Suarez & Diego Arias

Page 33: Processor structure and function

INSTRUCTION PIPELINING

By separating an instruction cycle into stages, multiple instructions at different stages can be worked on at the same time.

For example, Stage 2 of the current instruction can be overlapped with Stage 1 of the next instruction.

Page 34: Processor structure and function

A TWO-STAGE PIPELINE

An instruction cycle can be divided into two stages:

• Fetch: get an op-code from main memory and put it in a register

• Execute: decode an op-code and execute the instruction

The execute stage of the current instruction would overlap with the fetch stage of the next instruction.

Assuming that fetch and execute use the same number of clock cycles, this would double the speed (in reality, execute takes longer).

Page 35: Processor structure and function

A TWO-STAGE PIPELINE

Page 36: Processor structure and function

A SIX-STAGE PIPELINE

• Fetch instruction (FI): get op-code from memory and put it in a register

• Decode instruction (DI): decode op-code and determine addressing mode

• Calculate operand (CO): get effective address of source operands

• Fetch operands (FO): get operands from memory and put them in registers

• Execute instruction (EI): execute instruction and write result to a register

• Write operand (WO): store the result in memory

This pipeline is more typical of modern computers, especially RISC computers (e.g. MIPS, SPARC, and DLX).

Each stage occupies about the same number of clock cycles.
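To make the overlap concrete, the sketch below builds the timing diagram for this six-stage pipeline. It assumes an ideal pipeline in which every stage takes one time unit and no hazards or branches occur; under that assumption n instructions finish in k + (n - 1) time units on a k-stage pipeline, compared with n × k without pipelining. The function names are invented for this example.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def pipeline_schedule(n_instructions, stages=STAGES):
    """Return one row per instruction; row[t] is the stage active at time t."""
    k = len(stages)
    total_time = k + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        row = ["--"] * total_time
        for s, name in enumerate(stages):
            row[i + s] = name       # instruction i enters stage s at time i + s
        rows.append(row)
    return rows


def ideal_speedup(n_instructions, k):
    """Unpipelined time divided by pipelined time, ignoring all hazards."""
    return (n_instructions * k) / (k + n_instructions - 1)


def format_schedule(rows):
    """Render the schedule as text, one instruction per line."""
    return "\n".join(" ".join(row) for row in rows)
```

For nine instructions the schedule is 14 time units wide instead of 54, an ideal speedup of about 3.9. The speedup approaches the stage count k only as the number of instructions grows, and real hazards reduce it further.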

Page 37: Processor structure and function

A SIX-STAGE PIPELINE

Page 38: Processor structure and function

WHY NOT A 100-STAGE PIPELINE?

If 1 instruction per cycle can be achieved with a 5-stage pipeline, adding more stages would only increase the number of registers without increasing speed (it might actually make the computer less efficient).

Overlapping of instructions requires additional logic to account for dependencies between instructions (e.g. a memory read after a memory write to the same location).

Page 39: Processor structure and function

PIPELINE HAZARDS

• Resource hazards: when two instructions in the pipeline need to use the same resource

• Data hazards: when two instructions must be executed in sequence (e.g. a memory write followed by a memory read of the same location)

• Branch hazards: when a conditional branch occurs and the pipeline fetches the wrong instructions

Page 40: Processor structure and function

RESOURCE HAZARDS

A resource hazard occurs when two stages in the pipeline need to use the same resource at the same time. For example, two stages need to read from main memory (assuming that the data hasn’t been cached) when an operand fetch is overlapped with an instruction fetch.

In this case, the two stages must be executed in series rather than in parallel, and a delay must be introduced in the pipeline.

Page 41: Processor structure and function

RESOURCE HAZARDS

Page 42: Processor structure and function

RESOURCE HAZARDS

Page 43: Processor structure and function

DATA HAZARDS

A data hazard occurs whenever data is fetched from a location before it contains the correct value.

The “correct” value is whatever value it would contain if the instructions were executed in sequence.

Whenever the fetch stage must access data that hasn’t yet been written, the pipeline must be delayed at the fetch stage.

Page 44: Processor structure and function

DATA HAZARDS

ADD EAX, EBX

SUB ECX, EAX

I3

I4
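The ADD/SUB pair on this slide is a read-after-write dependency: SUB reads EAX, which ADD has not yet written back when SUB reaches its operand fetch. A simplified checker for this pattern might look as follows. The `Instr` record, the `window` parameter (how many following instructions can still be in flight), and the function name are assumptions for illustration, and the check ignores the case where an intervening instruction overwrites the register.

```python
from collections import namedtuple

# Hypothetical instruction record: opcode, destination register, source registers.
Instr = namedtuple("Instr", ["op", "dest", "srcs"])


def raw_hazards(program, window=2):
    """List (producer index, consumer index, register) read-after-write pairs.

    A pair is reported when an instruction reads a register written by one
    of the `window` instructions immediately before it.
    """
    hazards = []
    for i, producer in enumerate(program):
        last = min(i + window, len(program) - 1)
        for j in range(i + 1, last + 1):
            if producer.dest in program[j].srcs:
                hazards.append((i, j, producer.dest))
    return hazards


program = [
    Instr("ADD", "EAX", ("EAX", "EBX")),   # EAX = EAX + EBX
    Instr("SUB", "ECX", ("ECX", "EAX")),   # ECX = ECX - EAX, needs the new EAX
]
```

Here `raw_hazards(program)` reports `[(0, 1, "EAX")]`, the dependency that forces the pipeline to stall SUB until ADD has produced EAX.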

Page 45: Processor structure and function

BRANCH HAZARDS

Page 46: Processor structure and function

PIPELINE IMPLEMENTATION

A pipeline is implemented as a series of sequential circuits, with each stage taking its input from the output of the previous stage.

Page 47: Processor structure and function

DEALING WITH BRANCHES

The most difficult part in designing an instruction pipeline is assuring a steady flow of instructions to the initial stages of the pipeline. Several approaches have been taken for dealing with conditional branches.

• Multiple streams

• Prefetch branch target

• Loop buffer

• Branch prediction

• Delayed branch

Page 48: Processor structure and function

DEALING WITH BRANCHES

Page 49: Processor structure and function

MULTIPLE STREAMS

A simple pipeline suffers a penalty on a branch instruction because it must choose one of two instructions to fetch next and may make the wrong choice. One way of dealing with this is to allow the pipeline to fetch both instructions, making use of two streams. This approach has two problems:

• With multiple pipelines there are contention delays for access to the registers and to memory.

• Additional branch instructions may enter the pipeline before the original branch decision is resolved.

Page 50: Processor structure and function

PREFETCH BRANCH TARGET

When a conditional branch is recognized, the target of the branch is prefetched in addition to the instruction following the branch.

The target is saved until the branch instruction is executed. If the branch is taken, the target has already been prefetched.

Page 51: Processor structure and function

LOOP BUFFER

• A loop buffer is a small, high-speed memory maintained by the instruction fetch stage of the pipeline; it contains the most recently fetched instructions, in sequence.

• Instructions fetched in sequence will be available without the usual memory access time.

• If a branch occurs to a target just a few locations ahead of the address of the branch instruction, the target will already be in the buffer.

• If the loop buffer is large enough to contain all the instructions in a loop, then those instructions will only have to be fetched from memory once.
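The last point can be illustrated with a deliberately simplified model: a first-in, first-out buffer holding the addresses of the most recently fetched instructions, with a memory fetch counted only when the requested address is not in the buffer. The function name, the FIFO replacement policy, and the example addresses are assumptions; an actual loop buffer also prefetches sequentially ahead of the current instruction.

```python
from collections import deque


def count_memory_fetches(address_trace, buffer_size):
    """Count instruction fetches that must go to memory (simplified FIFO model)."""
    buffer = deque(maxlen=buffer_size)   # oldest address is dropped when full
    fetches = 0
    for addr in address_trace:
        if addr not in buffer:
            fetches += 1                 # miss: fetch from memory
            buffer.append(addr)          # and remember it in the buffer
    return fetches


# A 4-instruction loop body at hypothetical addresses 100..103, run 10 times.
loop_trace = [100, 101, 102, 103] * 10
```

With a buffer of 8 entries the whole loop fits, so `count_memory_fetches(loop_trace, 8)` is 4: each instruction is fetched from memory once. With only 2 entries every one of the 40 fetches goes to memory, because each instruction has been evicted by the time the loop comes back around.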

Page 52: Processor structure and function

LOOP BUFFER DIAGRAM

Page 53: Processor structure and function

BRANCH PREDICTION

There are several techniques to predict whether or not a branch will be taken.

• Predict never taken

• Predict always taken

• Predict by opcode

• Taken/not taken switch

• Branch history table
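One common way to realize a taken/not taken switch is a two-bit saturating counter, sketched below. This is a swapped-in textbook scheme chosen for illustration and may not match the state diagram shown in the slides; the class name, the state numbering, and the `accuracy` helper are assumptions.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state=0):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes, predictor):
    """Fraction of branch outcomes the predictor guessed correctly."""
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# A loop-closing branch: taken 9 times, then falls through; loop entered 3 times.
loop_branch = ([True] * 9 + [False]) * 3
```

Starting from state 3, this predictor misses only the single loop exit on each pass, 27 correct out of 30. A one-bit scheme that simply remembers the last outcome would also mispredict the first iteration each time the loop is re-entered.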

Page 54: Processor structure and function

BRANCH PREDICTION FLOWCHART

Page 55: Processor structure and function

BRANCH PREDICTION STATE DIAGRAM

Page 56: Processor structure and function

DELAYED BRANCH

It’s possible to improve the performance of a pipeline by rearranging instructions within a program, so that branch instructions occur later than actually desired. The branch will not take effect until after the execution of the following instruction.

Page 57: Processor structure and function

INTEL 80486 PIPELINING

The Intel 80486 implements a five-stage pipeline.

• Fetch

• Decode stage 1

• Decode stage 2

• Execute

• Write back

Page 58: Processor structure and function

80486 INSTRUCTION PIPELINE EXAMPLES

Page 59: Processor structure and function

QUESTIONS

1. What’s the function of the internal processor bus?

2. What’s the similarity between the internal structure of the computer as a whole and the internal structure of the CPU?

3. What is an instruction cycle?

4. What are the four sub-cycles of an instruction cycle?

5. Is the fetch or execute cycle the same for all CPUs?

6. What is the sequence of an interrupt cycle?

7. How does pipelining increase processor speed?

8. What are some pipeline hazards?

9. Which computers use a 5-stage pipeline?

10. What are the five ways to deal with conditional branches?

11. What happens in the fetch cycle inside an Intel 80486?

Page 60: Processor structure and function

REFERENCES

Stallings, William. Computer Organization and Architecture: Designing for Performance, 8/E.

Vahid, Frank, and Givargis, Tony. Embedded System Design: A Unified Hardware/Software Approach.

Wikipedia, “Instruction Cycle.” http://en.wikipedia.org/wiki/Instruction_cycle

CIS-77 Introduction to Computer Systems. http://www.c-jump.com/CIS77/CPU/InstrCycle/lecture.html