Review Session
CS 465 - Fall 2015 Prof. Daniel A. Menasce
Department of Computer Science
Exercise 2.22 Write a minimal set of MIPS assembly instructions that does the identical operation as the C code below. Assume the base address of C is in $s1 and that A is in $s2. Use the minimum number of registers. Do not destroy the contents of $s1 or $s2.
    A = C[0] << 4;
Solution to Exercise 2.22
    lw  $t1, 0($s1)    # load C[0] into $t1
    sll $t1, $t1, 4    # shift the contents of $t1 left by 4 bits
    sw  $t1, 0($s2)    # store C[0] << 4 into A
The sw treats $s2 as holding the address of A, which is why both $s1 and $s2 are left unchanged and only one temporary register is used.
Exercise 2.26.1 Consider the following MIPS code with the following initial values: $t1 = 10 and $s2 = 0.
    LOOP: slt  $t2, $0, $t1
          beq  $t2, $0, DONE
          subi $t1, $t1, 1
          addi $s2, $s2, 2
          j    LOOP
    DONE:
What is the final value of $s2?
Solution to Exercise 2.26.1
    LOOP: slt  $t2, $0, $t1     # if $t1 > 0 then $t2 = 1 else $t2 = 0
          beq  $t2, $0, DONE    # if $t2 = 0 then go to DONE
          subi $t1, $t1, 1      # $t1 = $t1 - 1
          addi $s2, $s2, 2      # $s2 = $s2 + 2
          j    LOOP             # go to LOOP
    DONE:
Number of loop executions: $t1 at top = 10, $t1 at bottom = 9; ...; $t1 at top = 1, $t1 at bottom = 0. On the next pass $t1 = 0, so slt sets $t2 = 0 and beq exits to DONE. That gives 10 executions, so $s2 = 2 x 10 = 20.
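The iteration count above can be checked with a short simulation. This is a minimal Python sketch that mirrors the five instructions one for one; the function name `run_loop` and the use of plain Python integers for registers are illustrative assumptions, not part of the exercise.

```python
def run_loop(t1=10, s2=0):
    """Mirror the LOOP/DONE code; returns (final $s2, number of loop executions)."""
    iterations = 0
    while True:
        t2 = 1 if 0 < t1 else 0   # slt  $t2, $0, $t1
        if t2 == 0:               # beq  $t2, $0, DONE
            break
        t1 -= 1                   # subi $t1, $t1, 1
        s2 += 2                   # addi $s2, $s2, 2
        iterations += 1           # j    LOOP
    return s2, iterations


final_s2, iterations = run_loop()
print(final_s2, iterations)  # prints "20 10"
```

Calling `run_loop(t1=0)` returns `(0, 0)`, which shows that the test at the top of the loop lets the body be skipped entirely.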
Exercise 1 Describe what the following MIPS code does.
          addi $s2, $0, 0
          addi $t1, $0, 0
    LOOP: lw   $s1, 0($s0)
          add  $s2, $s2, $s1
          addi $s0, $s0, 4
          addi $t1, $t1, 1
          slti $t2, $t1, 100
          bne  $t2, $0, LOOP
    DONE:
Solution to Exercise 1
          addi $s2, $0, 0       # $s2 = 0
          addi $t1, $0, 0       # $t1 = 0
    LOOP: lw   $s1, 0($s0)      # $s1 = Mem[$s0]
          add  $s2, $s2, $s1    # $s2 = $s2 + Mem[$s0]
          addi $s0, $s0, 4      # $s0 = $s0 + 4
          addi $t1, $t1, 1      # $t1 = $t1 + 1
          slti $t2, $t1, 100    # $t2 = 1 if $t1 < 100; $t2 = 0 otherwise
          bne  $t2, $0, LOOP    # branch to LOOP if $t2 ≠ 0 ($t1 < 100)
    DONE:
Code meaning: store in $s2 the sum of the 100 words stored starting at the address initially in $s0.
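As a check on that reading, the loop can be mirrored in Python. This is a sketch under stated assumptions: memory is modelled as a dictionary from byte address to word, the base address `BASE` and the stored values 1..100 are made up for the example, and overflow in `add` is ignored.

```python
def run_sum_loop(memory, s0):
    """Mirror the MIPS loop; returns (final $s2, final $s0, final $t1)."""
    s2 = 0                          # addi $s2, $0, 0
    t1 = 0                          # addi $t1, $0, 0
    while True:
        s1 = memory[s0]             # lw   $s1, 0($s0)
        s2 += s1                    # add  $s2, $s2, $s1
        s0 += 4                     # addi $s0, $s0, 4
        t1 += 1                     # addi $t1, $t1, 1
        t2 = 1 if t1 < 100 else 0   # slti $t2, $t1, 100
        if t2 == 0:                 # bne  $t2, $0, LOOP falls through
            break
    return s2, s0, t1


BASE = 0x1000  # hypothetical starting address held in $s0
memory = {BASE + 4 * i: i + 1 for i in range(100)}  # words 1, 2, ..., 100
total, final_s0, count = run_sum_loop(memory, BASE)
print(total, count)  # prints "5050 100"
```

The test sits at the bottom of the loop, so the body always runs at least once, and $s0 ends 400 bytes past its starting value.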
Exercise 2 Consider a multiprocessor with p processors. Assume that 25% of the instructions of a program can be executed in parallel using all p processors. The remaining 75% of the instructions have to be executed sequentially. Assume that the time to execute the program sequentially (i.e., using only one processor) is Ts. Give an expression for S(p), the speedup obtained when using p processors. What is the maximum possible speedup, i.e., lim S(p) as p -> ∞?
Solution to Exercise 2
The sequential 75% takes 0.75 Ts regardless of p, and the parallel 25% takes 0.25 Ts / p, so
    S(p) = Ts / (0.75 Ts + 0.25 Ts/p) = 1 / (0.75 + 0.25/p)
    lim S(p) as p -> ∞ = 1 / 0.75 = 4/3 ≈ 1.33
Amdahl's Law! The speedup is dominated by the sequential portion of the program.
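To see how quickly the sequential portion caps the speedup, the expression can be evaluated numerically. This is a minimal Python sketch using the split stated in the exercise (25% parallel, 75% sequential); the function names and the default argument are illustrative choices.

```python
def amdahl_speedup(p, parallel_fraction=0.25):
    """Speedup on p processors when only parallel_fraction of the work scales."""
    if p < 1:
        raise ValueError("p must be at least 1")
    sequential_fraction = 1.0 - parallel_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / p)


def max_speedup(parallel_fraction=0.25):
    """Limit of the speedup as p grows without bound."""
    return 1.0 / (1.0 - parallel_fraction)


for p in (1, 2, 4, 16, 1024):
    print(p, round(amdahl_speedup(p), 4))
print("limit:", round(max_speedup(), 4))
```

With these fractions, no number of processors can push the speedup past 1 / 0.75 = 4/3.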
Exercise 3 Describe in detail the operation of a lw $t1,off($t2) for a single cycle data path.
Solution to Exercise 3
[Figure sequence: the single-cycle datapath annotated one step at a time. The 0/1 control-signal values shown on the slides could not be matched to their signals after extraction.]
1. The register file reads $t2, the base register.
2. The 16-bit off field is sign-extended to a 32-bit off.
3. The ALU performs an add of $t2 and the 32-bit off, producing $t2+off.
4. Data memory is read at address $t2+off, producing Mem[$t2+off].
5. Mem[$t2+off] is written to $t1 in the register file.
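The data flow of lw $t1,off($t2) through the datapath (register read, sign extension, ALU add, memory read, register write) can be sketched in Python. This is an illustration only: the dictionary models of the register file and data memory, the function names, and the example values are assumptions, and control signals are not modelled.

```python
def sign_extend_16(value):
    """Sign-extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_lw(registers, data_memory, rt, rs, offset16):
    """Carry out lw rt, offset16(rs) and return the effective address."""
    base = registers[rs]                       # register file read of the base register
    offset32 = sign_extend_16(offset16)        # 16-bit off -> 32-bit off
    address = (base + offset32) & 0xFFFFFFFF   # ALU add: base + off
    word = data_memory[address]                # data memory read: Mem[base + off]
    registers[rt] = word                       # write back to the destination register
    return address


registers = {"$t1": 0, "$t2": 0x1008}   # hypothetical register contents
data_memory = {0x1004: 42}              # hypothetical memory contents
address = execute_lw(registers, data_memory, "$t1", "$t2", 0xFFFC)  # off = -4
print(hex(address), registers["$t1"])  # prints "0x1004 42"
```

The example uses a negative offset to show why the 16-bit field must be sign-extended rather than zero-extended before the add.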
Exercise 4 Consider that the MIPS code below is executed in a 5-stage pipelined architecture as discussed in class. What is the minimum number of cycles needed to execute the code and why?
    lw   $t1,0($s0)
    addi $t1,$t1,10
    sw   $t1,0($s0)
Solution to Exercise 4
    Cycle                 1    2    3    4      5    6    7    8
    lw   $t1,0($s0)       IF   ID   EX   MEM    WB
    addi $t1,$t1,10            IF   ID   stall  EX   MEM  WB
    sw   $t1,0($s0)                 IF   stall  ID   EX   MEM  WB
Eight cycles are needed. The addi instruction needs $t1, which can be forwarded only from the end of the MEM stage of the lw instruction. This requires the pipeline to stall for one cycle. The sw instruction is held back by the same cycle, because it cannot advance until addi has moved on, which avoids a structural hazard. The sw then gets $t1 from addi by forwarding, with no further stall.
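The cycle count follows from the usual formula: number of stages, plus one cycle per additional instruction, plus stall cycles. The Python sketch below applies it under a simplifying assumption that is not spelled out in the slides: full forwarding, where the only stall is one cycle whenever an instruction uses the result of the lw immediately before it. The tuple encoding of instructions and the function names are hypothetical.

```python
def count_load_use_stalls(program):
    """program: list of (opcode, destination register or None, source registers)."""
    stalls = 0
    for previous, current in zip(program, program[1:]):
        prev_op, prev_dest, _ = previous
        _, _, current_sources = current
        # A load's value is ready only after MEM, one cycle too late for the next EX.
        if prev_op == "lw" and prev_dest in current_sources:
            stalls += 1
    return stalls


def pipeline_cycles(num_instructions, stall_cycles, num_stages=5):
    """Cycles to drain the pipeline: fill time plus one per extra instruction plus stalls."""
    return num_stages + (num_instructions - 1) + stall_cycles


program = [
    ("lw", "$t1", ["$s0"]),
    ("addi", "$t1", ["$t1"]),
    ("sw", None, ["$t1", "$s0"]),
]
stalls = count_load_use_stalls(program)
print(stalls, pipeline_cycles(len(program), stalls))  # prints "1 8"
```

Without the load-use dependence the same three instructions would take 5 + 2 = 7 cycles, so the single stall accounts for the eighth.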
Exercise 5 Which of the statements below regarding pipeline registers are correct?
(a) Pipeline registers are needed because each stage of the pipeline needs its own register file to execute instructions that need registers.
(b) A pipeline register at the interface of stages i and i+1 needs to store all the control unit values needed to execute an instruction executing at stage i+1 as well as the outputs of stage i that are needed as input to stage i+1.
(c) The contents of the pipeline registers are updated at each clock cycle.
(d) The pipeline registers store values to be forwarded to previous stages in the pipeline.
Solutions to Exercise 5
(a) Incorrect. There is only one register file, which holds all the registers of the CPU. The pipeline registers hold, among other things, values for rs, rt, and rd.
(b) Correct.
(c) Correct.
(d) A pipeline register at the interface of stages i and i+1 stores a value generated by an instruction at stage i. This value can be forwarded to an instruction at stage i-1, which was issued after the instruction at stage i.
Exercise 6 Consider the format of R-type instructions given below and the following sequence of instructions.
    I1  add ra,rb,rc
    I2  sub re,rf,rg

    Field:  0       rs      rt      rd      shamt   funct
    Bits:   31:26   25:21   20:16   15:11   10:6    5:0

Provide an expression, as a function of ra,rb,rc,re,rf,rg and the pipeline registers, that can be used to determine if that sequence causes a data hazard.
Solutions to Exercise 6
    ((EX/MEM.RegisterRd = ID/EX.RegisterRs = ra) or (EX/MEM.RegisterRd = ID/EX.RegisterRt = ra)) and EX/MEM.RegWrite and EX/MEM.RegisterRd ≠ 0
When I1 is in the MEM stage and I2 is in the EX stage, EX/MEM.RegisterRd = ra, ID/EX.RegisterRs = rf, and ID/EX.RegisterRt = rg. EX/MEM.RegWrite is asserted for add, so the condition reduces to (ra = rf or ra = rg) and ra ≠ 0.
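The hazard condition can be written as a small predicate over the two pipeline registers. This Python sketch is illustrative: the class and field names are hypothetical stand-ins for EX/MEM and ID/EX, and registers are represented by their numbers, with 0 denoting $0.

```python
from dataclasses import dataclass


@dataclass
class ExMem:
    """Fields of the EX/MEM pipeline register used by the check."""
    reg_write: bool
    register_rd: int


@dataclass
class IdEx:
    """Fields of the ID/EX pipeline register used by the check."""
    register_rs: int
    register_rt: int


def ex_hazard(ex_mem, id_ex):
    """True when the instruction in MEM writes a register that the one in EX reads."""
    if not ex_mem.reg_write or ex_mem.register_rd == 0:
        return False
    return ex_mem.register_rd in (id_ex.register_rs, id_ex.register_rt)


def sequence_has_hazard(ra, rf, rg):
    """I1 = add ra,rb,rc in MEM (RegWrite = 1); I2 = sub re,rf,rg in EX."""
    return ex_hazard(ExMem(reg_write=True, register_rd=ra), IdEx(rf, rg))


print(sequence_has_hazard(ra=8, rf=8, rg=9))   # prints "True"
print(sequence_has_hazard(ra=8, rf=9, rg=10))  # prints "False"
```

The check on register 0 matters because $0 always reads as zero, so a write to it never produces a value that a later instruction depends on.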
Exercise 7 Consider the following sequence of instructions. Assume that the register comparison in a branch instruction is done at the instruction decode stage of the pipeline. Can that sequence be executed without stalling?
    add $t1, $t2, $0
    add $t3, $t4, $t5
    beq $t1, $t3, LABEL
Solution to Exercise 7
    Cycle                  1    2    3    4      5    6    7    8
    add $t1, $t2, $0       IF   ID   EX   MEM    WB
    add $t3, $t4, $t5           IF   ID   EX     MEM  WB
    beq $t1, $t3, LABEL              IF   stall  ID   EX   MEM  WB
No. The pipeline needs to stall for one cycle because beq needs $t1 and $t3 at the beginning of its ID stage. The value of $t3 is only available at the end of the EX stage of the second add.
Exercise 8 Consider a 2-bit branch prediction and the following recent history of branch instructions.
    Branch instruction address    State
    1000                          Solid Branch Taken
    2500                          Temporary Branch Not Taken
    3657                          Solid Branch Not Taken
    4512                          Solid Branch Taken
What is the prediction for a branch at address 1000 and how would the table change if the branch is mispredicted?
Solution to Exercise 8
The entry for address 1000 is in the Solid Branch Taken state, so the prediction would be to take the branch. In case of a misprediction, the new state should be Temporary Branch Taken.
Exercise 9 Consider a 2-bit branch prediction and the following recent history of branch instructions.
    Branch instruction address    State
    1000                          Solid Branch Taken
    2500                          Temporary Branch Not Taken
    3657                          Solid Branch Not Taken
    4512                          Solid Branch Taken
What is the prediction for a branch at address 2500 and how would the table change if the branch is mispredicted?
Solution to Exercise 9
The entry for address 2500 is in the Temporary Branch Not Taken state, so the prediction would be to not take the branch. In case of a misprediction, the new state should be Temporary Branch Taken.
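The answers to Exercises 8 and 9 match a 2-bit saturating counter, in which each outcome moves the state one step toward the observed direction. The Python sketch below assumes that scheme; other 2-bit state machines exist and can give a different next state from the temporary states. The function names are illustrative, and the table contents are taken from the exercises.

```python
# States ordered from most strongly "not taken" to most strongly "taken".
STATES = [
    "Solid Branch Not Taken",
    "Temporary Branch Not Taken",
    "Temporary Branch Taken",
    "Solid Branch Taken",
]


def predict_taken(state):
    """The two upper states predict taken; the two lower predict not taken."""
    return STATES.index(state) >= 2


def next_state(state, taken):
    """Move one step toward the actual outcome, saturating at either end."""
    index = STATES.index(state)
    index = min(index + 1, 3) if taken else max(index - 1, 0)
    return STATES[index]


def state_after_misprediction(state):
    """A misprediction means the actual outcome was the opposite of the prediction."""
    actual_taken = not predict_taken(state)
    return next_state(state, actual_taken)


table = {
    1000: "Solid Branch Taken",
    2500: "Temporary Branch Not Taken",
    3657: "Solid Branch Not Taken",
    4512: "Solid Branch Taken",
}

for address in (1000, 2500):
    state = table[address]
    print(address, predict_taken(state), state_after_misprediction(state))
```

Under this scheme a branch in a solid state must be mispredicted twice in a row before its prediction flips, while one in a temporary state flips after a single misprediction.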