Review Session CS 465 Fall 2015 Prof. Daniel A. Menasce Department of Computer Science

Review Session - George Mason University


Review Session

CS 465 - Fall 2015
Prof. Daniel A. Menasce

Department of Computer Science

Exercise 2.22
Write a minimal set of MIPS assembly instructions that does the identical operation as the C code below. Assume the base address of C is in $s1 and that A is in $s2. Use the minimum number of registers. Do not destroy the contents of $s1 or $s2.

    A = C[0] << 4;

Solution to Exercise 2.22
Write a minimal set of MIPS assembly instructions that does the identical operation as the C code below. Assume the base address of C is in $s1 and that A is in $s2. Use the minimum number of registers. Do not destroy the contents of $s1 or $s2.

    A = C[0] << 4;

    lw  $t1, 0($s1)    # load C[0] into $t1
    sll $t1, $t1, 4    # shift the contents of $t1 left by 4 bits
    sw  $t1, 0($s2)    # store C[0] << 4 into A

Exercise 2.26.1
Consider the following MIPS code with the following initial values: $t1 = 10 and $s2 = 0.

    LOOP: slt  $t2, $0, $t1
          beq  $t2, $0, DONE
          subi $t1, $t1, 1
          addi $s2, $s2, 2
          j    LOOP
    DONE:

What is the final value of $s2?

Solution to Exercise 2.26.1
Consider the following MIPS code with the following initial values: $t1 = 10 and $s2 = 0.

    LOOP: slt  $t2, $0, $t1      # if $t1 > 0 then $t2 = 1 else $t2 = 0
          beq  $t2, $0, DONE     # if $t2 = 0 then go to DONE
          subi $t1, $t1, 1       # $t1 = $t1 - 1
          addi $s2, $s2, 2       # $s2 = $s2 + 2
          j    LOOP              # go to LOOP
    DONE:

Number of loop executions:
    $t1 at top = 10; $t1 at bottom = 9
    …
    $t1 at top = 1; $t1 at bottom = 0
→ 10 executions → $s2 = 2 x 10 = 20
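The iteration count can be double-checked by mirroring the loop one instruction at a time. The Python sketch below is only an illustration: the function name run_loop and its (iterations, $s2) return value are assumptions introduced here, not part of the exercise.

```python
def run_loop(t1: int = 10, s2: int = 0) -> tuple[int, int]:
    """Mirror the LOOP/DONE code; return (iterations, final $s2)."""
    iterations = 0
    while True:
        t2 = 1 if 0 < t1 else 0   # slt  $t2, $0, $t1
        if t2 == 0:               # beq  $t2, $0, DONE
            break
        t1 -= 1                   # subi $t1, $t1, 1
        s2 += 2                   # addi $s2, $s2, 2
        iterations += 1           # j    LOOP
    return iterations, s2


print(run_loop())  # (10, 20)
```

Starting with $t1 = 0 leaves the loop immediately, so $s2 keeps its initial value.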

Exercise 1
Describe what the following MIPS code does.

          addi $s2,$0,0
          addi $t1,$0,0
    LOOP: lw   $s1,0($s0)
          add  $s2,$s2,$s1
          addi $s0,$s0,4
          addi $t1,$t1,1
          slti $t2,$t1,100
          bne  $t2,$0,LOOP
    DONE:

Solution to Exercise 1
Describe what the following MIPS code does.

          addi $s2,$0,0       # $s2 = 0
          addi $t1,$0,0       # $t1 = 0
    LOOP: lw   $s1,0($s0)     # $s1 = Mem[$s0]
          add  $s2,$s2,$s1    # $s2 = $s2 + Mem[$s0]
          addi $s0,$s0,4      # $s0 = $s0 + 4
          addi $t1,$t1,1      # $t1 = $t1 + 1
          slti $t2,$t1,100    # $t2 = 1 if $t1 < 100; $t2 = 0 otherwise
          bne  $t2,$0,LOOP    # branch to LOOP if $t2 ≠ 0 ($t1 < 100)
    DONE:

Code meaning: store in $s2 the sum of the 100 words stored starting at address $s0.
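To see why exactly 100 words are summed, the loop can be replayed in Python. This is a minimal sketch under stated assumptions: memory is modeled as a hypothetical dict from byte address to word value, the name sum_words is invented here, and arithmetic overflow (on which a real add would trap) is not modeled.

```python
def sum_words(memory: dict[int, int], s0: int) -> int:
    """Replay the loop: add up the words at s0, s0+4, ... for 100 iterations."""
    s2 = 0                          # addi $s2,$0,0
    t1 = 0                          # addi $t1,$0,0
    while True:
        s1 = memory[s0]             # lw   $s1,0($s0)
        s2 += s1                    # add  $s2,$s2,$s1
        s0 += 4                     # addi $s0,$s0,4
        t1 += 1                     # addi $t1,$t1,1
        t2 = 1 if t1 < 100 else 0   # slti $t2,$t1,100
        if t2 == 0:                 # bne  $t2,$0,LOOP falls through
            break
    return s2


base = 0x1000
words = {base + 4 * i: i + 1 for i in range(100)}  # the values 1..100
print(sum_words(words, base))  # 5050
```

The loop body runs once before the first test, and the test fails when $t1 reaches 100, so the word at offset 400 is never read.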

Exercise 2
Consider a multiprocessor with p processors. Assume that 25% of the instructions of a program can be executed in parallel using all p processors. The remaining 75% of the instructions have to be executed sequentially. Assume that the time to execute the program sequentially (i.e., using only one processor) is Ts. Give an expression for S(p), the speedup obtained when using p processors.

What is the maximum possible speedup (i.e., lim S(p) when p -> ∞)?

Solution to Exercise 2
Consider a multiprocessor with p processors. Assume that 25% of the instructions of a program can be executed in parallel using all p processors. The remaining 75% of the instructions have to be executed sequentially. Assume that the time to execute the program sequentially (i.e., using only one processor) is Ts. Give an expression for S(p), the speedup obtained when using p processors.

The sequential 75% takes 0.75 Ts regardless of p; the parallel 25% takes 0.25 Ts/p.

    S(p) = Ts / (0.75 Ts + 0.25 Ts/p) = 1 / (0.75 + 0.25/p)

    lim S(p) when p -> ∞ = 1 / 0.75 = 100 / 75 = 4/3 ≈ 1.33

Solution to Exercise 2

Amdahl’s Law! The speedup is dominated by the sequential portion of the program.
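A quick numeric check shows how fast the bound is approached. The sketch below takes the split from the problem statement (25% parallel, 75% sequential); the function name speedup and its parameters are assumptions made for illustration.

```python
def speedup(p: float, parallel_fraction: float = 0.25) -> float:
    """Amdahl's Law: speedup on p processors for a given parallel fraction."""
    sequential_fraction = 1.0 - parallel_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / p)


for p in (1, 2, 4, 16, 1024):
    print(p, round(speedup(p), 4))
```

Under that split the sequential term 0.75 never shrinks, so no processor count pushes the speedup past 1/0.75.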

Exercise 3

Describe in detail the operation of a lw $t1,off($t2) for a single cycle data path.

Solution to Exercise 3
Describe in detail the operation of a lw $t1,off($t2) for a single cycle data path.

[Figure, built up over six slides: single-cycle datapath annotated step by step with the labels $t2, $t1, off, 32-bit off, add, $t2+off, Mem[$t2+off], and 0/1 control-signal values.]
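The labels on the datapath figure trace the usual sequence for a load: read the base register, sign-extend the 16-bit offset to 32 bits, add the two in the ALU, read data memory at that address, and write the result into the destination register. A minimal Python sketch of those steps follows; the register and memory dictionaries and the function names are hypothetical stand-ins, not part of the slides.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def lw(regs: dict[str, int], memory: dict[int, int],
       rt: str, off: int, rs: str) -> int:
    """Perform lw rt, off(rs); return the effective address that was used."""
    base = regs[rs]                          # register file read ($t2)
    offset = sign_extend16(off)              # 16-bit off -> 32-bit off
    address = (base + offset) & 0xFFFFFFFF   # ALU add: $t2 + off
    data = memory[address]                   # data memory read: Mem[$t2+off]
    regs[rt] = data                          # write back into $t1
    return address


regs = {"$t1": 0, "$t2": 0x1000}
memory = {0x0FFC: 42}
print(hex(lw(regs, memory, "$t1", 0xFFFC, "$t2")), regs["$t1"])  # 0xffc 42
```

The example uses the offset bit pattern 0xFFFC, which sign-extends to -4, to show why the extension step matters.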

Exercise 4
Consider that the MIPS code below is executed in a 5-stage pipelined architecture as discussed in class. What is the minimum number of cycles needed to execute the code and why?

    lw   $t1,0($s0)
    addi $t1,$t1,10
    sw   $t1,0($s0)

Solution to Exercise 4
Consider that the MIPS code below is executed in a 5-stage pipelined architecture as discussed in class. What is the minimum number of cycles needed to execute the code and why?

    Cycle              1      2      3      4      5      6      7      8
    lw   $t1,0($s0)    IF     ID     EX     MEM    WB
    addi $t1,$t1,10           IF     ID     stall  EX     MEM    WB
    sw   $t1,0($s0)                  IF     stall  ID     EX     MEM    WB

Eight cycles are needed. The addi instruction needs $t1, which can be forwarded from the end of the MEM stage of the lw instruction. This requires the pipeline to stall for one cycle. The sw instruction can only be fetched after the addi instruction is fetched to avoid a structural hazard.
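The count of eight follows a simple rule for a k-stage pipeline: k cycles for the first instruction, one more for each later instruction, plus one per stall cycle. A minimal sketch, with the hypothetical helper total_cycles introduced only for this check:

```python
def total_cycles(num_instructions: int, stall_cycles: int, stages: int = 5) -> int:
    """Cycles to drain num_instructions through a pipeline with the given stalls."""
    return stages + (num_instructions - 1) + stall_cycles


print(total_cycles(3, 1))  # 8: three instructions, one load-use stall
print(total_cycles(3, 0))  # 7: the same code if no stall were needed
```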

Exercise 5
Which of the statements below regarding pipeline registers are correct?

(a) Pipeline registers are needed because each stage of the pipeline needs its own register file to execute instructions that need registers.

(b) A pipeline register at the interface of stages i and i+1 needs to store all the control unit values needed to execute an instruction executing at stage i+1 as well as the outputs of stage i that are needed as input to stage i+1.

(c) The contents of the pipeline registers are updated at each clock cycle.

(d) The pipeline registers store values to be forwarded to previous stages in the pipeline.

Solutions to Exercise 5
Which of the statements below regarding pipeline registers are correct?

(a) Pipeline registers are needed because each stage of the pipeline needs its own register file to execute instructions that need registers.

Incorrect: There is only one register file, which has all registers of the CPU. The pipeline registers hold, among other things, values for rs, rt, and rd.

Solutions to Exercise 5
Which of the statements below regarding pipeline registers are correct?

(b) A pipeline register at the interface of stages i and i+1 needs to store all the control unit values needed to execute an instruction executing at stage i+1 as well as the outputs of stage i that are needed as input to stage i+1.

Correct.

Solutions to Exercise 5
Which of the statements below regarding pipeline registers are correct?

(c) The contents of the pipeline registers are updated at each clock cycle.

Correct.

(d) The pipeline registers store values to be forwarded to previous stages in the pipeline.

A pipeline register at the intersection of stages i and i+1 stores a value generated by an instruction at stage i. This value can be forwarded to an instruction at stage i-1. This instruction was issued after the instruction at stage i.

Exercise 6
Consider the format of R-type instructions given below and the following sequence of instructions.

    I1  add ra,rb,rc
    I2  sub re,rf,rg

    Field:  0      rs     rt     rd     shamt  funct
    Bits:   31:26  25:21  20:16  15:11  10:6   5:0

Provide an expression, as a function of ra,rb,rc,re,rf,rg and the pipeline registers, that can be used to determine if that sequence causes a data hazard.

Solutions to Exercise 6
Consider the format of R-type instructions given below and the following sequence of instructions.

    I1  add ra,rb,rc
    I2  sub re,rf,rg

    Field:  0      rs     rt     rd     shamt  funct
    Bits:   31:26  25:21  20:16  15:11  10:6   5:0

Provide an expression, as a function of ra,rb,rc,re,rf,rg and the pipeline registers, that can be used to determine if that sequence causes a data hazard.

    ((EX/MEM.RegisterRd = ID/EX.RegisterRs = ra) or
     (EX/MEM.RegisterRd = ID/EX.RegisterRt = ra))
    and EX/MEM.RegWrite
    and EX/MEM.RegisterRd ≠ 0
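The hazard expression can be read as a predicate over two pipeline registers: I1 sits in EX/MEM with RegisterRd = ra, while I2 sits in ID/EX with RegisterRs = rf and RegisterRt = rg. The Python sketch below encodes it directly; the dataclasses ExMem and IdEx and their field names are hypothetical stand-ins for the hardware registers, and registers are represented by their numbers.

```python
from dataclasses import dataclass


@dataclass
class ExMem:
    reg_write: bool    # EX/MEM.RegWrite
    register_rd: int   # EX/MEM.RegisterRd


@dataclass
class IdEx:
    register_rs: int   # ID/EX.RegisterRs
    register_rt: int   # ID/EX.RegisterRt


def data_hazard(ex_mem: ExMem, id_ex: IdEx) -> bool:
    """True when the instruction in ID/EX reads the register EX/MEM will write."""
    reads_rd = (ex_mem.register_rd == id_ex.register_rs
                or ex_mem.register_rd == id_ex.register_rt)
    return bool(reads_rd and ex_mem.reg_write and ex_mem.register_rd != 0)


# I1: add ra,rb,rc with ra = 8; I2: sub re,rf,rg with rf = 8, rg = 9.
print(data_hazard(ExMem(True, 8), IdEx(8, 9)))    # True
print(data_hazard(ExMem(True, 8), IdEx(10, 11)))  # False
```

The check against register 0 is there because $0 always reads as zero, so a write to it never creates a real dependence.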

   

Exercise 7
Consider the following sequence of instructions. Assume that the register comparison in a branch instruction is done at the instruction decode stage of the pipeline. Can that sequence be executed without stalling?

    add $t1, $t2, $0
    add $t3, $t4, $t5
    beq $t1, $t3, LABEL

 

Solution to Exercise 7
Consider the following sequence of instructions. Assume that the register comparison in a branch instruction is done at the instruction decode stage of the pipeline. Can that sequence be executed without stalling?

    Cycle                1      2      3      4      5      6      7      8
    add $t1,$t2,$0       IF     ID     EX     MEM    WB
    add $t3,$t4,$t5             IF     ID     EX     MEM    WB
    beq $t1,$t3,LABEL                  IF     stall  ID     EX     MEM    WB

No. The pipeline needs to stall for one cycle because $t1 and $t3 are needed at the beginning of the ID stage of the beq. The value of $t3 is only available at the end of the EX stage of the second add.

Exercise 8
Consider a 2-bit branch prediction and the following recent history of branch instructions.

    Branch instruction address    State
    1000                          Solid Branch Taken
    2500                          Temporary Branch Not Taken
    3657                          Solid Branch Not Taken
    4512                          Solid Branch Taken

What is the prediction for a branch at address 1000 and how would the table change if the branch is mispredicted?

Solution to Exercise 8
Consider a 2-bit branch prediction and the following recent history of branch instructions.

    Branch instruction address    State
    1000                          Solid Branch Taken
    2500                          Temporary Branch Not Taken
    3657                          Solid Branch Not Taken
    4512                          Solid Branch Taken

What is the prediction for a branch at address 1000 and how would the table change if the branch is mispredicted?

The prediction would be to take the branch. In case of a misprediction, the new state should be Temporary Branch Taken.

Exercise 9
Consider a 2-bit branch prediction and the following recent history of branch instructions.

    Branch instruction address    State
    1000                          Solid Branch Taken
    2500                          Temporary Branch Not Taken
    3657                          Solid Branch Not Taken
    4512                          Solid Branch Taken

What is the prediction for a branch at address 2500 and how would the table change if the branch is mispredicted?

Solution to Exercise 9
Consider a 2-bit branch prediction and the following recent history of branch instructions.

    Branch instruction address    State
    1000                          Solid Branch Taken
    2500                          Temporary Branch Not Taken
    3657                          Solid Branch Not Taken
    4512                          Solid Branch Taken

What is the prediction for a branch at address 2500 and how would the table change if the branch is mispredicted?

The prediction would be to not take the branch. In case of a misprediction, the new state should be Temporary Branch Taken.
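Both predictor answers can be reproduced with a small state machine. The sketch below assumes the 2-bit scheme behaves as a saturating counter over the four named states, which is consistent with the two transitions given in the solutions; the names STATES, predict_taken, update, and after_misprediction are introduced here for illustration only.

```python
# States ordered from most strongly not taken to most strongly taken.
STATES = [
    "Solid Branch Not Taken",
    "Temporary Branch Not Taken",
    "Temporary Branch Taken",
    "Solid Branch Taken",
]


def predict_taken(state: str) -> bool:
    """Predict taken in the two 'Taken' states, not taken otherwise."""
    return STATES.index(state) >= 2


def update(state: str, taken: bool) -> str:
    """Move one step toward the actual outcome, saturating at either end."""
    i = STATES.index(state)
    i = min(i + 1, 3) if taken else max(i - 1, 0)
    return STATES[i]


def after_misprediction(state: str) -> str:
    """New state when the actual outcome is the opposite of the prediction."""
    return update(state, taken=not predict_taken(state))


table = {
    1000: "Solid Branch Taken",
    2500: "Temporary Branch Not Taken",
    3657: "Solid Branch Not Taken",
    4512: "Solid Branch Taken",
}
for address in (1000, 2500):
    state = table[address]
    print(address, predict_taken(state), after_misprediction(state))
```

For address 1000 the sketch predicts taken, for address 2500 it predicts not taken, and in both cases a misprediction moves the entry to Temporary Branch Taken.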