nivitha
7/26/2019 QB Advanced Computer Architecture
QUESTION BANK
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
UNIT-I
PART - A
1. What are embedded computers? List their characteristics.
Embedded computers are computers that are lodged into other devices where the presence of
the computer is not immediately obvious. These devices range from everyday machines to handheld
digital devices. They have a wide range of processing power and cost.
2. Define Response time and Throughput.
Response time is the time between the start and the completion of an event. It is also referred to
as execution time or latency. Throughput is the total amount of work done in a given amount of time.
3. Mention the use of transaction processing benchmarks
It measures the ability of a system to handle transactions, which consist of database accesses
and updates. An airline reservation system and a bank ATM are examples of TP systems.
4. State Amdahl's law.
Amdahl's law states that the performance improvement to be gained from using some faster
mode of execution is limited by the fraction of time the faster mode can be used.
5. What are toy benchmarks?
Toy benchmarks are typically between 10 and 100 lines of code and produce a result the user
already knows before running the toy program. E.g. Puzzle.
6. What is profile based static modeling?
In this technique, a dynamic execution profile of the program, which indicates how often each
instruction is executed, is maintained.
7. Suppose that we are considering an enhancement to the processor of a server
system used for web serving. The new CPU is 10 times faster on computation in the web
serving application than the original processor. Assuming that the original CPU is busy with
computation 40% of the time and is waiting for I/O 60% of the time, what is the overall speedup
gained by incorporating the enhancement?
Fraction enhanced = 0.4
Speedup enhanced = 10
Speedup overall = 1 / (0.6 + 0.4/10) = 1 / 0.64 = 1.56
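The speedup in this problem follows Amdahl's law: overall speedup = 1 / ((1 - fraction enhanced) + fraction enhanced / speedup enhanced). A minimal Python sketch of that calculation is given here as a check on the arithmetic; the function name `amdahl_speedup` is an illustrative choice, not something taken from the question bank.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only a fraction of the execution time is sped up."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    # The unenhanced fraction runs at the old speed; the rest is divided
    # by the speedup of the enhanced mode.
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Web-server example: computation is 40% of the time and becomes 10x faster.
overall = amdahl_speedup(0.4, 10.0)
print(f"{overall:.2f}")  # 1.56
```

Even with a 10x faster CPU, the 60% of time spent waiting for I/O caps the overall gain at about 1.56.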
8. Explain the different types of locality.
Temporal locality states that recently accessed items are likely to be accessed in the near
future.
Spatial locality says that items whose addresses are near one another tend to be referenced
close together in time.
9. Name the addressing modes used for signal processing?
Modulo or circular addressing mode. Bit reverse addressing mode.
10. Specify the CPU performance equation.
CPU time = Instruction count x Clock cycle time x Cycles per instruction
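As a small illustration of the CPU performance equation, the Python sketch below multiplies the three factors together. The function name and the sample values (one billion instructions, a CPI of 2.0, a 1 ns clock cycle) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def cpu_time_seconds(instruction_count: int,
                     cycles_per_instruction: float,
                     clock_cycle_time_s: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cycles_per_instruction * clock_cycle_time_s


# Hypothetical program: 1e9 instructions, CPI of 2.0, 1 ns clock cycle.
seconds = cpu_time_seconds(1_000_000_000, 2.0, 1e-9)
print(f"{seconds:.2f} s")  # 2.00 s
```

Halving any one of the three factors halves the CPU time, which is why the equation is used to compare design alternatives.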
11. Explain the hybrid approach for encoding an instruction set?
The hybrid approach reduces the variability in size and work of the variable architecture but
provides multiple instruction lengths to reduce code size.
12. What are the registers used for MIPS processors.
MIPS has 32 64-bit general purpose registers (GPRs), named R0, R1, ..., R31. GPRs are
sometimes called integer registers. There is also a set of 32 floating point registers (FPRs), named
F0, F1, ..., F31, which can hold either 32 single precision values or 32 double precision values.
13. Explain the concept behind pipelining.
Pipelining is an implementation technique whereby multiple instructions are overlapped in
execution. It takes advantage of parallelism that exists among actions needed to execute an
instruction.
14. Write about pipe stages and processor cycle.
Different steps of different instructions are completed in parallel in different parts of the
pipeline. Each of these steps is called a pipe stage or pipe segment. The time required to move
an instruction one step down the pipeline is called a processor cycle.
15. Explain pipeline ha
a. Freeze or flush the pipeline
b. Treat every branch as not taken
c. Treat every branch as taken
d. Delayed branch
18. Consider an unpipelined processor. Assume that it has a 1 ns clock cycle and that it uses
4 cycles for ALU operations and branches and 5 cycles for memory operations. Assume that the
relative frequencies of these operations are 40%, 20% and 40% respectively. Suppose that due
to clock skew and setup, pipelining the processor adds 0.2 ns of overhead to the clock. Ignoring
any latency impact, how much speedup in the instruction execution rate will we gain from a
pipeline?
The average instruction execution time on an unpipelined processor is
= clock cycle x Average CPI = 1 ns x ((40% x 4) + (
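The printed solution breaks off partway through, so the following Python sketch works the numbers from the question statement instead. It is a hedged reconstruction rather than the original answer: it assumes the pipelined processor completes one instruction per clock cycle (an ideal CPI of 1), which the surviving text does not state, and all names are illustrative.

```python
# Operation mix from the question: (relative frequency, cycles per operation).
OPERATION_MIX = {
    "alu": (0.40, 4),
    "branch": (0.20, 4),
    "memory": (0.40, 5),
}
CLOCK_CYCLE_NS = 1.0
PIPELINE_OVERHEAD_NS = 0.2


def average_cpi(mix):
    """Frequency-weighted average of cycles per instruction."""
    return sum(freq * cycles for freq, cycles in mix.values())


# Unpipelined: every instruction takes its full cycle count.
unpipelined_ns = CLOCK_CYCLE_NS * average_cpi(OPERATION_MIX)

# Assumption: the pipeline finishes one instruction per (slower) clock cycle.
pipelined_ns = CLOCK_CYCLE_NS + PIPELINE_OVERHEAD_NS

speedup = unpipelined_ns / pipelined_ns
print(f"{unpipelined_ns:.1f} ns, {pipelined_ns:.1f} ns, {speedup:.2f}x")
# 4.4 ns, 1.2 ns, 3.67x
```

Under that assumption the 0.2 ns of overhead limits the gain to roughly 3.7 times rather than the 4.4 times an overhead-free pipeline would give.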
UNIT-II
PART - A
1. List the various data dependences.
Data dependence
Name dependence
Control dependence
2. What is Instruction Level Parallelism?
Pipelining is used to overlap the execution of instructions and improve performance. This
potential overlap among instructions is called instruction level parallelism (ILP) since the instructions
can be evaluated in parallel.
3. Give an example of control dependence?
if p1 {s1;}
if p2 {s
The pipeline may have already completed instructions that are later in program order than
the instruction causing the exception. The pipeline may have not yet completed some instructions that are
earlier in program order than the instruction causing the exception.
10. What are multilevel branch predictors?
These predictors use several levels of branch-prediction tables together with an algorithm for
choosing among the multiple predictors.
11. What are branch-target buffers?
To reduce the branch penalty we need to know from what address to fetch by the end of IF
(instruction fetch). A branch prediction cache that stores the predicted address for the next instruction
after a branch is called a branch-target buffer or branch target cache.
12. Briefly explain the goal of multiple-issue processor?
The goal of multiple issue processors is to allow multiple instructions to issue in a clock
cycle. They come in two flavors: superscalar processors and VLIW processors.
13. What is speculation?
Speculation allows execution of instructions before control dependences are resolved.
14. Mention the purpose of using Branch history table?
It is a small memory indexed by the lower portion of the address of the branch instruction.
The memory contains a bit that says whether the branch was recently taken or not.
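The one-bit scheme described in this answer can be sketched in Python. This is a toy illustration, not a model of any particular processor: the class name, the 4-bit index width, and the sample branch address are all assumptions made for the example.

```python
class OneBitBranchHistoryTable:
    """Toy branch history table: one 'recently taken' bit per entry."""

    def __init__(self, index_bits: int = 4):
        # index_bits is a hypothetical table size (2**index_bits entries).
        self.mask = (1 << index_bits) - 1
        self.taken_bits = [False] * (1 << index_bits)

    def predict(self, branch_address: int) -> bool:
        # Index the table with the lower portion of the branch address.
        return self.taken_bits[branch_address & self.mask]

    def update(self, branch_address: int, taken: bool) -> None:
        # Remember only the most recent outcome for this entry.
        self.taken_bits[branch_address & self.mask] = taken


def count_mispredictions(outcomes, branch_address=0x40):
    """Run one branch through the table and count wrong predictions."""
    table = OneBitBranchHistoryTable()
    misses = 0
    for taken in outcomes:
        if table.predict(branch_address) != taken:
            misses += 1
        table.update(branch_address, taken)
    return misses


# A loop branch that is taken nine times and then falls through once.
loop_outcomes = [True] * 9 + [False]
print(count_mispredictions(loop_outcomes))  # 2
```

The two mispredictions fall on the first and last iterations. Because only the low address bits are used, two different branches can share an entry, which is a known limitation of such a small table.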
15. What are super scalar processors?
Superscalar processors issue a varying number of instructions per clock and are either statically
scheduled or dynamically scheduled.
16. Mention the idea behind hardware-based speculation?
It combines three key ideas: dynamic branch prediction to choose which instruction to
execute, speculation to allow the execution of instructions before control dependences are resolved,
and dynamic scheduling to deal with the scheduling of different combinations of basic blocks.
17. What are the fields in the ROB?
Instruction type, Destination field, Value field, Ready field
18. How many branch selected entries are in a (2,2) predictor that has a total of 8K bits in a
prediction buffer?
2^2 x 2 x number of prediction entries selected by the branch = 8K, so the number of prediction
entries selected by the branch = 1K
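For an (m,n) correlating predictor, the total number of bits is 2^m x n x the number of prediction entries selected by the branch address. The Python sketch below rearranges that relation to recover the entry count; the function name is illustrative and 8K is taken as 8 x 1024 bits.

```python
def entries_selected_by_branch(total_bits: int, m: int, n: int) -> int:
    """Entries selected by the branch address in an (m, n) predictor.

    total_bits = 2**m * n * entries, so entries = total_bits / (2**m * n).
    """
    bits_per_entry = (2 ** m) * n
    if total_bits % bits_per_entry != 0:
        raise ValueError("total_bits is not a whole number of entries")
    return total_bits // bits_per_entry


# (2,2) predictor with 8K bits of prediction buffer.
print(entries_selected_by_branch(8 * 1024, m=2, n=2))  # 1024
```

The result, 1024 entries, is the 1K figure quoted in the answer.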
19. What is the advantage of using instruction type field in ROB?
The instruction field specifies whether the instruction is a branch, a store, or a register operation.
20. Mention the advantage of using tournament based predictors?
The advantage of a tournament predictor is its ability to select the right predictor for the right
branch.
PART - B
1. What is instruction-level parallelism? Explain in detail the various dependences caused in
ILP? Define ILP. Various dependences include - Data dependence, Name dependence and control
dependence.
Using encoding techniques. Compress the instructions in main memory and expand them when
they are read into the cache or are decoded.
6. Mention the advantage of using multiple issue processor?
They are less expensive. They have cache based memory system. More parallelism.
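7. What are loop carried dependence?
Loop-level analysis focuses on determining whether data accesses in later iterations are dependent on data
values produced in earlier iterations; such a dependence is called a loop carried dependence. E.g., in
for(i=1000;i>0;i=i-1) x[i]=x[i]+s;
each iteration reads and writes only its own x[i], so the dependence is within an iteration and is not loop carried.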
8. Mention the tasks involved in finding dependences in instructions?
Good scheduling of code. Determining which loops might contain parallelism. Eliminating
name dependence.
9. Use the G.C.D test to determine whether dependence exists in the following loop
for(i=1;i<=100;i=i+1) {x[2*i+3]=x[2*i]*5.0;}
Solution a=
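The printed solution is cut off, so here is a hedged Python sketch of the GCD test itself. It assumes the usual formulation: a loop stores to x[a*i+b] and loads from x[c*i+d], and a loop-carried dependence can exist only if GCD(c, a) divides (d - b). For a loop that writes x[2*i+3] and reads x[2*i], that gives a = 2, b = 3, c = 2, d = 0. The function name is illustrative.

```python
from math import gcd


def gcd_test_may_depend(a: int, b: int, c: int, d: int) -> bool:
    """GCD test for a store to x[a*i + b] and a load from x[c*i + d].

    Returns False when a dependence is impossible and True when the test
    cannot rule one out (it is a conservative test, so True is not proof).
    """
    g = gcd(a, c)
    if g == 0:
        # Both indices are constants: they alias only if they are equal.
        return b == d
    return (d - b) % g == 0


# Store index 2*i + 3, load index 2*i.
print(gcd_test_may_depend(a=2, b=3, c=2, d=0))  # False
```

Here GCD(2, 2) = 2 and d - b = -3; since 2 does not divide -3, the test reports that no dependence is possible. Intuitively, the store touches only odd indices and the load only even ones.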
16. Mention the limitations of predicated instructions?
They are useful only when the predicate can be evaluated early. Predicated instructions may
have a speed penalty.
17. What is poison bit?
Poison bits are a set of status bits that are attached to the result registers written by the
speculated instruction when the instruction causes exceptions. The poison bits cause a fault when a
normal instruction attempts to use the register.
18. What are the disadvantages of supporting speculation in hardware?
Complexity. Additional hardware resources required.
19. Mention the methods for preserving exception behavior?
Ignore exceptions. Instructions that never raise exceptions are used. Using poison bits. Using
hardware buffers.
20. What is an instruction group?
It is a sequence of consecutive instructions with no register data dependence among them. All
the instructions in the group could be executed in parallel. An instruction group can be arbitrarily
long.
PART - B
1. Explain loop unrolling with an example? Loop unrolling technique. Example.
2. What is write through and write back cache?
Write through - the information is written to both the block in the cache and the block in the
lower level memory. Write back - the information is written only to the block in the cache. The
modified cache block is written to main memory only when it is replaced.
3. What is miss rate and miss penalty?
Miss rate is the fraction of cache accesses that result in a miss. Miss penalty is the additional
time, usually counted in clock cycles, needed to service a miss from the lower level of memory.
4. Give the equation for average memory access time?
Average memory access time = Hit time + Miss rate x Miss penalty
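A short Python sketch of the average memory access time equation follows. The sample figures (a 1-cycle hit time, a 5% miss rate, a 100-cycle miss penalty) are hypothetical values chosen for illustration, and the function name is likewise an assumption.

```python
def average_memory_access_time(hit_time: float,
                               miss_rate: float,
                               miss_penalty: float) -> float:
    """AMAT = hit time + miss rate x miss penalty (all in the same time unit)."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical cache: 1-cycle hit, 5% miss rate, 100-cycle miss penalty.
amat = average_memory_access_time(1.0, 0.05, 100.0)
print(f"{amat:.1f} cycles")  # 6.0 cycles
```

With these numbers the misses, although rare, account for five of the six cycles, which is why reducing either miss rate or miss penalty matters so much.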
5. What is striping?
Spreading multiple data over multiple disks is called striping, which automatically forces
accesses to several disks.
6. Mention the problems with disk arrays?
When the number of devices increases, dependability decreases. Disk arrays become unusable after a single
failure.
7. What is hot spare?
Hot spares are extra disks that are not used in normal operation. When a failure occurs, an idle
hot spare is pressed into service. Thus, hot spares reduce the MTTR.
8. What is mirroring?
Disks in the configuration are mirrored or copied to another disk. With this arrangement data
on the failed disks can be replaced by reading it from the other mirrored disks.
9. Mention the drawbacks with mirroring?
Writing onto the disk is slower. Since the disks are not synchronized, seek time will be
different. Imposes 50% space penalty, hence expensive.
10. Mention the factors that measure I/O performance measures?
Diversity, Capacity, Response time, Throughput, Interference of I/O with CPU execution
11. What is transaction time?
The sum of entry time, response time and think time is called transaction time.
12. State Little's law?
Little's law relates the average number of tasks in the system, the average arrival rate of new
tasks, and the average time to perform a task.
13. Give the equation for mean number of tasks in the system?
Mean number of tasks in the system = Arrival rate x Mean response time.
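Little's law can be checked with a few lines of Python. The sample workload (40 I/O requests per second, each spending 0.05 seconds in the system) is hypothetical, as is the function name; the only requirement is that the arrival rate and the response time use the same time unit.

```python
def mean_tasks_in_system(arrival_rate_per_s: float,
                         mean_response_time_s: float) -> float:
    """Little's law: mean number in system = arrival rate x mean response time."""
    return arrival_rate_per_s * mean_response_time_s


# Hypothetical I/O system: 40 requests/s, 0.05 s mean response time.
tasks = mean_tasks_in_system(40.0, 0.05)
print(f"{tasks:.1f} tasks")  # 2.0 tasks
```

The law holds only for a system in equilibrium, where the long-run arrival rate equals the departure rate.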
14. What is server utili
8. Mention the information in the directory?
It keeps the state of each block that is cached. It keeps track of which caches have copies of
the block.
9. What are the operations that a directory based protocol handles?
Handling a read miss. Handling a write to a shared clean cache block.
10. What are the states of cache block?
Shared, Uncached, Exclusive
11. What are the uses of having a bit vector?
When a block is shared, the bit vector indicates whether each processor has a copy of the
block. When the block is in the exclusive state, the bit vector keeps track of the owner of the block.
12. When do we say that a cache block is exclusive?
When exactly one processor has the copy of the cached block, and it has written the block.
The processor is called the owner of the block.
13. Explain the types of messages that can be sent between the processors and directories?
Local node - Node where the request originates.
Home node - Node where the memory location and directory entry of the address reside.
Remote node - A third node that holds a copy of the block.
14. What is consistency?
Consistency says in what order a processor must observe the data writes of another processor.
15. Mention the models that are used for consistency?
Sequential consistency, Relaxed consistency model
16. What is sequential consistency?
It requires that the result of any execution be the same as if the memory accesses executed
by each processor were kept in order and the accesses among different processors were interleaved.
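To make the definition concrete, the Python sketch below enumerates every interleaving of two tiny programs while keeping each processor's own accesses in order. The two programs are a standard textbook-style litmus test chosen for illustration (they do not appear in this question bank): P1 writes A then reads B, and P2 writes B then reads A, with both locations initially 0.

```python
from itertools import combinations

# Each operation is (kind, location, value-or-register-name).
P1 = [("write", "A", 1), ("read", "B", "r1")]
P2 = [("write", "B", 1), ("read", "A", "r2")]


def interleavings(p1, p2):
    """Yield every merge of p1 and p2 that preserves each program's order."""
    total = len(p1) + len(p2)
    for p1_slots in combinations(range(total), len(p1)):
        slots = set(p1_slots)
        it1, it2 = iter(p1), iter(p2)
        yield [next(it1) if k in slots else next(it2) for k in range(total)]


def run(schedule):
    """Execute one interleaving against a single shared memory."""
    memory = {"A": 0, "B": 0}
    registers = {}
    for kind, location, arg in schedule:
        if kind == "write":
            memory[location] = arg
        else:
            registers[arg] = memory[location]
    return registers["r1"], registers["r2"]


outcomes = {run(schedule) for schedule in interleavings(P1, P2)}
print(sorted(outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

The result (r1, r2) = (0, 0) never appears: under sequential consistency at least one write must precede the other processor's read. A machine that lets a read bypass an earlier write could produce (0, 0), which is the kind of reordering that relaxed models permit.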
17. What is relaxed consistency model?
Relaxed consistency model allows reads and writes to be executed out of order. The three
sets of ordering are W->R ordering, W->W ordering, and R->W and R->R ordering.
18. What is multi threading?
Multithreading allows multiple threads to share the functional units of a single processor in
an overlapping fashion.
19. What is fine grained multithreading?
It switches between threads on each instruction, causing the execution of multiple threads to
be interleaved.
20. What is coarse grained multithreading?
It switches threads only on costly stalls. Thus it is much less likely to slow down the execution of an
individual thread.
PART - B
1. Explain the snooping protocol with a state diagram? Basic schemes for enforcing cache coherence.
Cache coherence - definition. Snooping protocol. An example protocol - state diagram.
Hardware and software implementation.
4. Discuss the different models for memory consistency? Sequential consistency model. Relaxed
consistency model.
5. How is multithreading used to exploit thread level parallelism within a processor? Multithreading
- definition. Fine grained multithreading. Coarse grained multithreading. Simultaneous multithreading.