EE524/CptS561 Jose G. Delgado-Frias 1

Processor: Basic steps to process an instruction

IF: Instruction Fetch
ID/OF: Instruction Decode / Operand Fetch
EX: Execute
MEM: Memory Access
WB: Write Back

Datapath

[Figure: datapath divided into Instruction Fetch, Inst. Dec. / Op. Fetch, Execute, Memory Access, and Write Back, showing PC, +4, NPC, Inst. Mem., IR, Reg, imm, A, B, multiplexers (mux), ALU, zero, Data Mem.]

IR ← Mem[PC]
NPC ← PC + 4
A ← Reg[IR 6..10]
B ← Reg[IR 11..15]
Imm ← ((IR16)^16 ## IR 16..31)

Datapath (Arith/Logic Inst.)

[Figure: datapath showing PC, +4, Inst. Mem., IR, Reg, imm, A, B, ALU, zero, Data Mem.]

IR ← Mem[PC]
NPC ← PC + 4
A ← Reg[IR 6..10]
B ← Reg[IR 11..15]
Imm ← ((IR16)^16 ## IR 16..31)
ALUoutput ← A op B   or   ALUoutput ← A op Imm
Reg[IR 16..20] ← ALUoutput
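The register-transfer steps above can be traced in software. What follows is a minimal Python sketch, not the course's own implementation. It assumes a 32-bit instruction word whose bit 0 is the most significant bit, so IR 6..10 and IR 11..15 select the source registers and IR 16..20 the destination. It uses addition where the slide leaves `op` unspecified. The names `field` and `run_alu_rr` and the dictionary-based machine state are hypothetical.

```python
def field(word, first, last):
    """Return bits first..last of a 32-bit word, where bit 0 is the most significant bit."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


def run_alu_rr(state, op=lambda a, b: a + b):
    """Trace one register-register ALU instruction through the steps on the slide."""
    # IF: IR <- Mem[PC]; NPC <- PC + 4
    ir = state["imem"][state["pc"]]
    npc = state["pc"] + 4
    # ID/OF: A <- Reg[IR 6..10]; B <- Reg[IR 11..15]
    a = state["reg"][field(ir, 6, 10)]
    b = state["reg"][field(ir, 11, 15)]
    # EX: ALUoutput <- A op B (kept to 32 bits)
    alu_output = op(a, b) & 0xFFFFFFFF
    # WB: Reg[IR 16..20] <- ALUoutput
    state["reg"][field(ir, 16, 20)] = alu_output
    state["pc"] = npc
    return state


# Hypothetical encoding of "R1 <- R2 op R3": rs1 = 2, rs2 = 3, rd = 1, opcode bits left at 0.
word = (2 << 21) | (3 << 16) | (1 << 11)
state = {"pc": 0, "imem": {0: word}, "reg": [0] * 32}
state["reg"][2], state["reg"][3] = 5, 7
run_alu_rr(state)
print(state["reg"][1], state["pc"])  # prints: 12 4
```

Each comment names the stage that the corresponding register transfer belongs to, which makes it easy to see which values a pipeline register would have to carry forward.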

Datapath (Load Inst.)

[Figure: datapath showing PC, +4, Inst. Mem., IR, Reg, imm, A, B, ALU, zero, Data Mem.]

IR ← Mem[PC]
NPC ← PC + 4
A ← Reg[IR 6..10]
B ← Reg[IR 11..15]
Imm ← ((IR16)^16 ## IR 16..31)
ALUoutput ← A op Imm
LMD ← Mem[ALUoutput]
Reg[IR 11..15] ← LMD

Datapath (Store Inst.)

[Figure: datapath showing PC, +4, Inst. Mem., IR, Reg, imm, A, B, ALU, zero, Data Mem.]

IR ← Mem[PC]
NPC ← PC + 4
A ← Reg[IR 6..10]
B ← Reg[IR 11..15]
Imm ← ((IR16)^16 ## IR 16..31)
ALUoutput ← A op Imm
Mem[ALUoutput] ← B
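The load and store paths differ only in what happens after the address is computed. The sketch below illustrates that under stated assumptions: bit 0 is the most significant bit of a 32-bit word, the immediate is the low 16 bits of the word and is sign-extended, and `op` is addition (an effective-address calculation). The helper names `field`, `sign_extend16`, `run_load`, and `run_store` are hypothetical, and data memory is modeled as a plain dictionary.

```python
def field(word, first, last):
    """Return bits first..last of a 32-bit word, where bit 0 is the most significant bit."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


def sign_extend16(value):
    """Replicate the sign bit of a 16-bit field into the upper bits."""
    return value - 0x10000 if value & 0x8000 else value


def run_load(ir, reg, dmem):
    a = reg[field(ir, 6, 10)]                # A <- Reg[IR 6..10]
    imm = sign_extend16(field(ir, 16, 31))   # Imm <- sign-extended immediate
    alu_output = a + imm                     # ALUoutput <- A op Imm (assumed to be +)
    lmd = dmem[alu_output]                   # LMD <- Mem[ALUoutput]
    reg[field(ir, 11, 15)] = lmd             # Reg[IR 11..15] <- LMD


def run_store(ir, reg, dmem):
    a = reg[field(ir, 6, 10)]                # A <- Reg[IR 6..10]
    b = reg[field(ir, 11, 15)]               # B <- Reg[IR 11..15]
    imm = sign_extend16(field(ir, 16, 31))
    dmem[a + imm] = b                        # Mem[ALUoutput] <- B


reg = [0] * 32
reg[2] = 108
dmem = {100: 42}
load_word = (2 << 21) | (4 << 16) | 0xFFF8   # hypothetical: R4 <- Mem[R2 - 8]
store_word = (2 << 21) | (4 << 16) | 0x0004  # hypothetical: Mem[R2 + 4] <- R4
run_load(load_word, reg, dmem)
run_store(store_word, reg, dmem)
print(reg[4], dmem[112])  # prints: 42 42
```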

Datapath (Branch Inst.)

[Figure: datapath showing PC, +4, Inst. Mem., IR, Reg, imm, A, B, ALU, zero, Data Mem.]

IR ← Mem[PC]
NPC ← PC + 4
A ← Reg[IR 6..10]
B ← Reg[IR 11..15]
Imm ← ((IR16)^16 ## IR 16..31)
ALUoutput ← (PC + 4) op Imm

[Figure: instructions of a program (vertical axis) against time in clock cycles (horizontal axis)]

1: IF ID EX MEM WB
2: IF ID EX WB
3: IF ID

[Figure: instructions of a program (vertical axis) against clock cycles 1 to 8 (horizontal axis), with the IF, ID, EX, MEM, and WB stages of successive instructions overlapping in time]

Pipelining Lessons

• Pipelining doesn’t help the latency of a single task; it helps the throughput of the entire workload
• Pipeline rate is limited by the slowest pipeline stage
• Multiple tasks operate simultaneously
• Potential speedup = number of pipe stages
• Unbalanced lengths of pipe stages reduce speedup
• Time to “fill” the pipeline and time to “drain” it reduce speedup
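The lessons above can be checked with a little arithmetic. The sketch below assumes an ideal pipeline with no hazards, where the first instruction needs one cycle per stage and every later instruction completes one cycle after its predecessor. The function names and the stage delays are illustrative values, not figures from the course.

```python
def pipelined_cycles(n_instructions, n_stages):
    """Cycles to finish n instructions: n_stages to fill, then one completion per cycle."""
    return n_stages + n_instructions - 1


def speedup(n_instructions, stage_times):
    """Speedup over an unpipelined machine that takes sum(stage_times) per instruction."""
    unpipelined = n_instructions * sum(stage_times)
    clock = max(stage_times)  # the slowest stage sets the clock period
    pipelined = pipelined_cycles(n_instructions, len(stage_times)) * clock
    return unpipelined / pipelined


balanced = [10, 10, 10, 10, 10]    # illustrative stage delays
unbalanced = [10, 10, 20, 10, 10]  # one stage twice as slow as the others
for n in (1, 10, 1000):
    print(n, round(speedup(n, balanced), 2), round(speedup(n, unbalanced), 2))
```

With one instruction there is no gain, which is the latency point. As the instruction count grows, the balanced case approaches a speedup of 5 (the number of stages), while the unbalanced case stays below 3 because the 20-unit stage sets the clock. The gap between small and large instruction counts is the fill and drain cost.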

Datapath w/ pipeline

[Figure: pipelined datapath showing PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem., with pipeline registers driven by the clock]

Datapath w/ pipeline

[Figure: pipelined datapath showing PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.]

Pipeline

              CLOCK CYCLE
INSTRUCTION   1      2      3      4      5      6      7      8      9
1             IF     ID/OF  EX     MEM    WB
2                    IF     ID/OF  EX     MEM    WB
3                           IF     ID/OF  EX     MEM    WB
4                                  IF     ID/OF  EX     MEM    WB
5                                         IF     ID/OF  EX     MEM    WB
6                                                IF     ID/OF  EX     MEM
7                                                       IF     ID/OF  EX
8                                                              IF     ID/OF
9                                                                     IF
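The staggered pattern in the pipeline diagram follows one rule: instruction i enters IF in cycle i and advances one stage per cycle. A short sketch that generates such a diagram, assuming no stalls (the function name `pipeline_table` is hypothetical):

```python
STAGES = ["IF", "ID/OF", "EX", "MEM", "WB"]


def pipeline_table(n_instructions, n_cycles, stages=STAGES):
    """Row i lists what instruction i+1 does in each clock cycle ('' when idle or not shown)."""
    table = []
    for i in range(n_instructions):
        row = []
        for cycle in range(n_cycles):
            stage_index = cycle - i  # instruction i enters IF in cycle i (0-based)
            row.append(stages[stage_index] if 0 <= stage_index < len(stages) else "")
        table.append(row)
    return table


table = pipeline_table(9, 9)
for number, row in enumerate(table, start=1):
    print(f"{number:<4}" + "".join(f"{cell:<7}" for cell in row).rstrip())
```

Reading a column of the output shows which instructions occupy the five stages in that clock cycle. From cycle 5 onward every stage is busy.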

Pipeline Hazards

• Structural Hazards
  – Two or more instructions use the same hardware at the same time.
• Data Hazards
  – Data dependencies
  – Result from inst. j is needed by inst. k
• Control Hazards
  – A branch changes the flow; what happens with the following instruction(s)?

Resources

[Figure: overlapped instructions, each using Mem (IM), Reg, ALU, Mem (DM), Reg in successive clock cycles]

Data Hazards

[Figure: overlapped instructions, each using Mem (IM), Reg, ALU, Mem (DM), Reg in successive clock cycles]

R1 ← R2 + R3
R5 ← R1 + R3
R8 ← R1 - R6

Data Forwarding

[Figure: overlapped instructions, each using Mem (IM), Reg, ALU, Mem (DM), Reg in successive clock cycles]

R1 ← R2 + R3
R5 ← R1 + R3
R8 ← R1 - R6

Datapath w/ pipeline

[Figure: pipelined datapath showing PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem., and a forwarding unit]

Example

[Figure: pipelined datapath with forwarding unit (PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.); the instructions below enter the pipeline one after another]

ADD R1,R2,R3
SUB R4,R3,R1
XOR R7,R8,R1

Example

[Figure: pipelined datapath with forwarding unit (PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.)]

ADD R1..
SUB R4,R3,R1
XOR R7,R8,R1

Example

[Figure: pipelined datapath with forwarding unit (PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.)]

ADD R1..
SUB R8,R3,R1
XOR R7,R8,R1
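The decision a forwarding unit makes for this example can be sketched in a few lines. This is a simplified illustration rather than the datapath's actual control logic. It assumes the first register named in an instruction is the destination and the other two are sources. It also assumes results can be taken from two places, the instruction one ahead (labeled `EX/MEM` here) and the instruction two ahead (`MEM/WB`). The function names and labels are hypothetical.

```python
def forward_source(src, ex_mem_dest, mem_wb_dest):
    """Pick where an ALU source operand should come from."""
    if src == ex_mem_dest:
        return "EX/MEM"  # result of the instruction one ahead; the most recent value wins
    if src == mem_wb_dest:
        return "MEM/WB"  # result of the instruction two ahead
    return "REG"         # no dependence: use the value read from the register file


def forward_controls(sources, ex_mem_dest=None, mem_wb_dest=None):
    """Return the forwarding choice for each source operand of the instruction in EX."""
    return tuple(forward_source(s, ex_mem_dest, mem_wb_dest) for s in sources)


# SUB R8,R3,R1 in EX while ADD R1.. is one stage ahead
print(forward_controls(("R3", "R1"), ex_mem_dest="R1"))
# XOR R7,R8,R1 in EX while SUB (writes R8) is one ahead and ADD (writes R1) is two ahead
print(forward_controls(("R8", "R1"), ex_mem_dest="R8", mem_wb_dest="R1"))
```

The check against the nearer instruction comes first so that, when two earlier instructions write the same register, the newer value is the one forwarded.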

Data Hazard Classification

• RAW (Read After Write)
  – With forwarding, only a load presents a problem
  j: R1 ← ...
  k: RY ← R1
• WAW (Write After Write)
  j: R1 ← ...
  k: R1 ← ...
• WAR (Write After Read)
  j: ... ← R1
  k: R1 ← ...
• RAR (Read After Read)
  j: ... ← R1
  k: ... ← R1
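The four categories can be told apart mechanically from the registers each instruction reads and writes. A minimal sketch, assuming instruction j precedes instruction k in program order and each instruction is given as a hypothetical `(dest, sources)` pair:

```python
def classify_dependences(j, k):
    """Return the dependence types between an earlier instruction j and a later instruction k."""
    j_dest, j_sources = j
    k_dest, k_sources = k
    found = []
    if j_dest is not None and j_dest in k_sources:
        found.append("RAW")  # k reads what j writes
    if j_dest is not None and j_dest == k_dest:
        found.append("WAW")  # both write the same register
    if k_dest is not None and k_dest in j_sources:
        found.append("WAR")  # k writes what j reads
    if set(j_sources) & set(k_sources):
        found.append("RAR")  # both read the same register (not a hazard)
    return found


# j: R1 <- R2 + R3, k: R5 <- R1 + R3
print(classify_dependences(("R1", ("R2", "R3")), ("R5", ("R1", "R3"))))
# j: R4 <- R1 + R5, k: R1 <- R6 + R7
print(classify_dependences(("R4", ("R1", "R5")), ("R1", ("R6", "R7"))))
```

The first pair reports RAW on R1 and RAR on R3. The second reports only WAR on R1.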

Data Forwarding (load)

[Figure: overlapped instructions, each using Mem (IM), Reg, ALU, Mem (DM), Reg in successive clock cycles]

R1 ← LD[Mem]
R5 ← R1 + R3
R8 ← R1 - R6

Data hazard (load)

LW  R1,0(R1)    IF     ID     EX     MEM    WB
SUB R4,R1,R5           IF     ID     stall  EX     MEM    WB
AND R6,R1,R7                  IF     stall  ID     EX     MEM
OR  R8,R1,R9                         stall  IF     ID     EX

[Figure annotation: “R1”]
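The stall in this example arises because the instruction right after the load needs the loaded register before the load's MEM stage has produced it. A simplified detection sketch follows. It assumes a five-stage pipeline with full forwarding, so only a load followed immediately by a consumer costs a cycle. The tuple layout `(opcode, dest, sources)` and the function name are hypothetical.

```python
def count_load_stalls(program):
    """Count one-cycle stalls caused by a load whose result is used by the next instruction."""
    stalls = 0
    for previous, current in zip(program, program[1:]):
        previous_opcode, previous_dest, _ = previous
        _, _, current_sources = current
        if previous_opcode == "LW" and previous_dest in current_sources:
            stalls += 1
    return stalls


program = [
    ("LW", "R1", ("R1",)),        # LW  R1,0(R1)
    ("SUB", "R4", ("R1", "R5")),  # SUB R4,R1,R5
    ("AND", "R6", ("R1", "R7")),  # AND R6,R1,R7
    ("OR", "R8", ("R1", "R9")),   # OR  R8,R1,R9
]
print(count_load_stalls(program))  # prints: 1
```

Only SUB triggers a stall. By the time AND and OR reach EX, the loaded value is available through forwarding or the register file, and they are delayed only because they sit behind SUB.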

Branch

         BR  R1, LABEL_A
         ADD R2,R3,R7
         AND R5,R7,R11
         :
         :
LABEL_A: LD  R4,R2,005

Branch

[Figure: overlapped execution through Mem (IM), Reg, ALU, Mem (DM), Reg of the instructions below]

BR R1, LABEL_A
ADD R2,R3,R7
AND R5,R7,R11
LD R4,R2,005

Datapath w/ pipeline

[Figure: pipelined datapath showing PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem., and a forwarding unit]

What to do w/ branch

• Reduce the number of cycles to decide on a branch.
• Delayed branch (Software Solutions)
  – NO-OP
  – Move instructions
    • from before
    • from target
    • from fall through

Branch

[Figure: overlapped execution through Mem (IM), Reg, ALU, Mem (DM), Reg of the instructions below]

BR R1, LABEL_A
ADD R2,R3,R7
LD R4,R2,005

NO-OP

[Figure: Branch followed by a NO-OP]

From Before

[Figure: Branch]

From Target

[Figure: Branch]

From Fall Through

[Figure: Branch]
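Two of the delay-slot options above, NO-OP and "from before", can be illustrated with a toy scheduler. This is a deliberately narrow sketch: it looks only at the single instruction just before the branch, and it moves that instruction into the delay slot when the branch does not read the register it writes. Filling from the target or from the fall-through path needs extra safety analysis and is not attempted. The tuple layout `(text, dest, sources)` and the function name are hypothetical, and the first register of each instruction is assumed to be its destination.

```python
NOP = ("NO-OP", None, ())


def fill_delay_slot(block):
    """block: list of (text, dest, sources) whose last entry is the branch."""
    *body, branch = block
    if body:
        candidate = body[-1]
        _, dest, _ = candidate
        _, _, branch_sources = branch
        if dest not in branch_sources:
            # "From before": the branch does not depend on the candidate, so swap them.
            return body[:-1] + [branch, candidate]
    # Otherwise fall back to padding the slot.
    return body + [branch, NOP]


branch = ("BR R1, LABEL_A", None, ("R1",))
independent = ("ADD R2,R3,R7", "R2", ("R3", "R7"))
dependent = ("ADD R1,R3,R7", "R1", ("R3", "R7"))

print([text for text, _, _ in fill_delay_slot([independent, branch])])
print([text for text, _, _ in fill_delay_slot([dependent, branch])])
```

In the first case the ADD moves into the slot after the branch. In the second the branch needs R1 from the ADD, so the slot is filled with a NO-OP.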

Multicycle Operations

[Figure: pipeline with IF and ID, then the execution units listed below, then MEM and WB]

• EX inst. unit
• FP multiply
• FP adder
• FP divider

FP operations

• FP Add: 4 cycles
• FP Multiply: 7 cycles
• FP Divide: 25 cycles

Out of order completion

Execution starts in order.

Example

        1   2   3   4   5   6   7   8   9   10  11
MULTD   IF  ID  m1  m2  m3  m4  m5  m6  m7  M   W
ADDD        IF  ID  a1  a2  a3  a4  M   W
LD              IF  ID  X   M   W
SD                  IF  ID  X   M   W
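The completion order in this example follows directly from the execution latencies. The sketch below recomputes the write-back cycle of each instruction. It assumes one instruction is fetched per cycle, nothing stalls, and each opcode appears once. The FP latencies are the ones listed on the FP operations slide. The one-cycle latency for LD and SD reflects the single X stage in the example. The function name is hypothetical.

```python
# Execution latencies in cycles.
EX_CYCLES = {"MULTD": 7, "ADDD": 4, "LD": 1, "SD": 1}


def write_back_cycles(program):
    """Return {opcode: WB cycle} for instructions fetched one per cycle, starting at cycle 1."""
    result = {}
    for index, opcode in enumerate(program):
        fetch = index + 1                        # IF cycle
        last_ex = fetch + 1 + EX_CYCLES[opcode]  # one ID cycle, then the execution cycles
        result[opcode] = last_ex + 2             # MEM, then WB
    return result


cycles = write_back_cycles(["MULTD", "ADDD", "LD", "SD"])
print(cycles)
print(sorted(cycles, key=cycles.get))  # completion order
```

The instructions start in program order but finish as LD, SD, ADDD, MULTD, which is the out-of-order completion the example illustrates.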

MIPS R4000 (Superpipelining)

[Figure: eight-stage pipeline IF, IS, RF, EX, DF, DS, TC, WB laid over instruction memory, Reg, ALU, data memory, Reg]

IF: Instruction fetch First half

IS: Instruction fetch Second half

RF: Inst. Decode & Register Fetch

EX: Execution

DF: Data fetch First half

DS: Data fetch Second half

TC: Tag Check

WB: Write Back

Load

[Figure: R4000 pipeline (instruction memory, Reg, ALU, data memory, Reg) over clock cycles CC1 to CC7 for the instructions below]

LW R1
Instruction 1
Instruction 2
ADD R2,R1

Branch

[Figure: R4000 pipeline (instruction memory, Reg, ALU, data memory, Reg) for BEQZ and the instructions that follow it]

Branch (taken)

Branch inst     IF IS RF EX DF DS TC WB
Delay slot         IF IS RF EX DF DS TC WB
stall                 S  S  S  S  S  S  S  S
stall                    S  S  S  S  S  S  S  S
Branch target               IF IS RF EX DF DS TC WB

Branch (not taken)

Branch inst     IF IS RF EX DF DS TC WB
Delay slot         IF IS RF EX DF DS TC WB
Branch inst+2         IF IS RF EX DF DS TC WB
Branch inst+3            IF IS RF EX DF DS TC WB
Branch inst+4               IF IS RF EX DF DS TC WB
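The difference between the taken and not-taken cases comes down to when the instruction after the delay slot can be fetched. A small sketch of that bookkeeping follows. It assumes one fetch per cycle, exactly one delay slot, and the two stall rows shown for a taken branch. The instruction-kind labels and the function name are hypothetical, and the delay slot is assumed not to hold another taken branch.

```python
TAKEN_BRANCH_STALLS = 2  # stall rows between the delay slot and the branch target


def fetch_cycles(kinds):
    """kinds: executed instructions in order, each 'taken', 'not_taken' (branches) or 'other'."""
    cycles = []
    cycle = 1
    for index, kind in enumerate(kinds):
        cycles.append(cycle)
        cycle += 1
        # The instruction after a taken branch is its delay slot; the stalls follow it.
        if index >= 1 and kinds[index - 1] == "taken":
            cycle += TAKEN_BRANCH_STALLS
    return cycles


print(fetch_cycles(["taken", "other", "other"]))      # branch, delay slot, branch target
print(fetch_cycles(["not_taken", "other", "other"]))  # branch, delay slot, branch inst+2
```

The branch target is fetched in cycle 5 in the taken case, while branch inst+2 is fetched in cycle 3 in the not-taken case, a difference of two cycles.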
