February 13, 2020 10:04 am EE457 Quiz - Spring 2020 C Copyright 2020 Gandhi Puvvada
EE457 Quiz (~10%)
Closed-book, closed-notes exam; no cheat sheets.
Ordinary calculators may be used, but not smartphone calculators. Verilog guides are not needed and are not allowed. Smartphones, tablets, and any other computing or Internet devices are not allowed.
This is a Crowdmark exam. Please do not write in the margins or on the back side. Use an HB or 1H pencil.
Spring 2020
Instructor: Gandhi Puvvada
Thursday, 2/13/2020 (a 3-hour exam), 05:00 PM - 08:00 PM (180 min) in SGM123
Please do not write your student ID.
Student’s DEN D2L username: @usc.edu
Viterbi School of Engineering, University of Southern California
Ques#  Topic                                              Page#   Time      Points  Score
1      State Diagram, RTL Design                          2-4     60 min.   100
2      Unsigned and Signed numbers                        5-6     25 min.   54
3      CPU Performance                                    7-7     20 min.   30
4      MIPS processor ISA, Byte-addressable processors    8-9     30 min.   77
5      Single-Cycle CPU                                   10-11   25 min.   64
Total                                                     1+10+1  160 min.  325
Perfect Score                                                               300
1 ( 40 + 60 = 100 points) 60 min. State Diagram and RTL design
1.1 Mealy machine design: Reproduced below is a partial solution of Q#2 on Array Division, C[I] <= A[I]/B[I]; of ee354_MT_Spring2017, which you were asked to go through.
Here A[I] is divided by the constant 3, and we want to know whether the division is an even division (exact division with no remainder left) and, if so, whether the quotient is an even number.
Here you are given an array A[I] of 24 non-zero 8-bit unsigned numbers (A[0:23]). You are asked to consider the numbers in A[I] which are evenly divisible by 3 (= exactly divisible by 3 without leaving any remainder). From those, keep a count of the cases where the quotient is even. If the total number of such even quotients is even, then go to the ENEQ (Even Number of Even Quotients) state. Otherwise, go to the ONEQ (Odd Number of Even Quotients) state. Zero is an even number, so if none of the quotients are even, you go to the ENEQ state. Instead of maintaining a count of even quotients and checking whether that count is even, we can start with a flag called (say) ENEQ_F (Even Number of Even Quotients Flag) and set it to 1 (for True) in the INI state, to say that we have so far found 0 even quotients (i.e. an even number of even quotients). We flip it every time we find an even quotient.
As in the EE354L problem above, the access time of A[I] is nearly one clock. So you can only deposit A[I] into X (X <= A[I];) at the end of the clock, and you do not have any time left in that clock to start processing A[I] directly. There is no B[I] here, as the divisor is the constant 3. Unlike in the above EE354L problem, we do not need to store the quotient in C[I]. Actually, we do not even need a 7-bit or an 8-bit quotient register Q here! We can have a 1-bit Q that starts at zero and flips every time we are able to successfully subtract 3 from X. If A[I] is 6, then X becomes 6 and Q is zero in the first clock. X=3 and Q is scheduled to become 1 in the second clock. X=0 and Q goes back to zero in the third clock. You probably do not need to go through the third clock, as (X==3) indicates that A[I] is evenly divisible. Your grader will look for correctness and efficiency.
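The flag-and-flip idea described above can be tried out in software before drawing the state diagram. The following is a minimal behavioral sketch in Python, not the clock-by-clock state machine the question asks for; the function name `classify` and the loop structure are illustrative assumptions, and it does not model the early exit at (X==3).

```python
def classify(values):
    """Behavioral model (hypothetical helper, not the RTL solution).

    Returns "ENEQ" if an even number of elements are exactly divisible
    by 3 with an even quotient, otherwise "ONEQ".
    """
    eneq_f = True              # ENEQ_F starts at 1: zero even quotients so far
    for a in values:
        x = a                  # X <= A[I]
        q = 0                  # 1-bit Q: parity of the quotient so far
        while x >= 3:          # repeated subtraction of the constant 3
            x -= 3
            q ^= 1             # flip Q on every successful subtraction
        if x == 0 and q == 0:  # evenly divisible and quotient is even
            eneq_f = not eneq_f
    return "ENEQ" if eneq_f else "ONEQ"


print(classify([2, 6, 7, 6]))  # the two 6s each give an even quotient
```

The hardware version can stop one subtraction earlier by testing (X==3), as the text notes; in that case Q has been flipped one time fewer, so the even-quotient test on Q changes accordingly.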
Complete the following table for a 4-element array A[0:3] (instead of the 24-element array A[0:23]) containing the elements 2, 6, 7, and 6. Here, since 6 and 6 are not only evenly divisible by 3 but also generate even quotients (namely 2 and 2), we should go to the ENEQ state.
Show under each clock, for each variable, what the value is at the start of the clock and what will happen to the variable at the end of the clock. For example, we have already shown what happens in the LF state clock: X starts with the unknown value x and becomes 2 (though X becomes 2 at the end of the clock), and I starts with 0 and becomes 1.
State transition conditions shall be arrived at carefully by considering the following facts.
1. Terminal count value of I? Did we access the last element? Did we increment I after (or while) accessing that element? Do I expect to have (I==23) or (I==24) at that time?
2. If we accessed the last element, is it true that we are about to finish processing that last element? What indicates that? Is it (X < 3) or (X == 3) or (X > 3) or some combination of them? If this is an evenly-divisible-by-3 case, is it also an even-quotient case? Is Q a zero or a 1?
3. Previously, did we find an even number of (or an odd number of) even-quotient cases? What variable and what values of the variable indicate this? Does ENEQ_F help in this matter?
40pts
You can use these decision boxes to arrive at the state transition conditions, if you wish
[State diagram, partially complete: states INI, LF, DIV, ENEQ, and ONEQ; inputs Reset, Start, and ACK. RTL statements shown in the diagram: I <= 0; X <= A[I]; I <= I + 1; Q <= 1’b0; ENEQ_F <= 1’b1;]
Rough work area: for this question
60pts
2 ( 10+10+4+12+10+8 = 54 points) 25 min. Signed and unsigned numbers
2.1 Given below is part of the solution to Q#2 from Fall 2018 Quiz that you were asked to go through
2.1.1 Instead of the above 4XgtY (4X greater than Y), if we needed 4XleY (4X less than or equal to Y), i.e. treating the numbers as signed numbers represented in 2’s complement form, how would you produce it?
Student #1: Simply invert the above 4XgtY to produce 4XleY.
Student #2: But Y0 is being ignored above, and we are talking about equality here.
Student #3: Well, 740 is less than or equal to 74X in decimal, where X can be anything from 0 to 9. Also, 730 is less than or equal to 74X in decimal, whatever X is. So I agree with S#1.
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
2.1.2 How would you produce 4XlosY (4X lower than or same as Y, treating them as unsigned numbers)?
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
____________________________________________________________________________
2.2 We know that all ones except for the least significant bit (example: 11110 (=11101+1)) is a minus 2 in 2’s complement notation. So, to decrement by 2, would you add all ones except for the least significant bit in the case of _______ (A/B/C) (A = signed numbers represented in 2’s complement notation, B = unsigned numbers, C = both signed and unsigned). Similarly, to increment by 2, would you add a number like 00010 (size depending on the finite number system in use, for example, for 8-bit system, it is 00000010) in the case of _______ (A/B/C) (legend for A, B, C same as above).
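One way to experiment with the claim in 2.2 is to model a finite 5-bit number system in Python, where every addition is taken modulo 2^5. The sketch below is only an illustration under that assumption; the names `BITS`, `add_pattern`, and `to_signed` are hypothetical helpers, and overflow detection (UOV, SOV) is deliberately left out.

```python
BITS = 5
MASK = (1 << BITS) - 1          # 0b11111
MINUS_TWO = MASK ^ 1            # all ones except the LSB: 0b11110
PLUS_TWO = 0b00010


def add_pattern(x, k):
    """Add two BITS-wide patterns and discard the carry out."""
    return (x + k) & MASK


def to_signed(pattern):
    """Interpret a BITS-wide pattern as a 2's complement number."""
    if pattern & (1 << (BITS - 1)):
        return pattern - (1 << BITS)
    return pattern


x = 0b00101                                  # 5 as unsigned and as signed
print(format(add_pattern(x, MINUS_TWO), "05b"))   # same bits under either reading
print(to_signed(add_pattern(0b11111, MINUS_TWO))) # signed reading: -1 then minus 2
```

The adder produces the same bit pattern regardless of how the operands are interpreted; only the overflow conditions differ between the unsigned and signed readings.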
10pts
10pts
4pts
2.2.1 Your lab partner started the following 5-bit design, utilizing the standard adder/subtracter design and adding or subtracting the constant 00010. She calls it an "increment-by-2 or decrement-by-2" unit. She believes that it can be used for both (i) unsigned numbers and (ii) signed numbers represented in 2’s complement notation. She believes that the idea can be used for any larger number system, such as a 32-bit finite number system or a 64-bit number system. Of course, for a finite number system (of 5 bits here, or any number of bits), one needs to produce the overflow signals UOV (Unsigned Overflow) and SOV (Signed Overflow). If you agree with her, then produce UOV and SOV below. If you do not agree with her, say so with a brief explanation.
2.2.2 By the way, she is quite outspoken, and did not like the Fall 2018 Midterm Incrementer/Decrementer design (reproduced on the left side below) and offered her design on the right side below. Again, if you agree with her, then produce UOV and SOV below. If you do not agree with her, say so. You _____________ (agree/disagree) with her.
2.2.2.1 She asked you to help her simplify her incrementer/decrementer design (whether or not you agree with her on the correctness of her design). Look at the 5 XOR gates with constants and show how the design can be simplified on the left side below.
12pts
[Figure: 5-bit adder/subtracter built from five full adders (ports a, b, cin, s, cout) with inputs X4 X3 X2 X1 X0 and Y4 Y3 Y2 Y1 Y0, an Inc2/Dec2 control driving Add/Sub, and the signals C0, Raw Carry, Carry, and V.]
Explanation if you disagree with her:
10pts
[Figure: 5-bit incrementer/decrementer built from five full adders (ports a, b, cin, s, cout) with inputs X4 X3 X2 X1 X0 and Y4 Y3 Y2 Y1 Y0, an Inc1/Dec1 control driving Add/Sub, and the signals C0, Raw Carry, Carry, and V.]
Reason for disagreeing:
8pts
[Figure: second copy of the 5-bit incrementer/decrementer (five full adders with ports a, b, cin, s, cout; inputs X4 X3 X2 X1 X0 and Y4 Y3 Y2 Y1 Y0; Inc1/Dec1 control driving Add/Sub; signals C0, Raw Carry, Carry, and V), provided for the simplification.]
3 ( 12 + 10 + 8 = 30 points) 20 min. CPU Performance
3.1 CPI, IC, and performance: Our new CPU has only three types of instructions: A, B, and C. The hardware team did not tell us the CPI of category B. Two compiler designers, #1 and #2, designed totally different compilers with the following information.
Do you have adequate information to conclude (a) which compiler is better and (b) by what factor? If you do not have adequate information for either or both of them, state what minimal information you need to arrive at the answer.
Better compiler is _______________________________ (#1 / #2 / inadequate data). Explain: _______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
Better by a factor of ___________________________ (state value or inadequate data) Explain:________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
3.2 ABC and XYZ implement the same ISA, licensed from ARM, and use the same compiler. Suppose ABC claims a MIPS rating (Millions of Instructions Per Second rating) double that of XYZ. Do we have adequate data to arrive at the performance ratio (speed-up factor)? Yes / No
If yes, what is the speed-up, and if no, what data is lacking?
____________________________________________________________________________
____________________________________________________________________________
What are the possible ways in which ABC could have boosted their MIPS rating?
____________________________________________________________________________
____________________________________________________________________________
Category   CPI         Frequency #1   Frequency #2
A          CPIA = 5    fA1 = 10%      fA2 = 50%
B          CPIB = n    fB1 = 40%      fB2 = 20%
C          CPIC = 20   fC1 = 50%      fC2 = 30%
Total                  100%           100%

Instruction Count #1 = 100,000
Instruction Count #2 = 200,000
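The standard relation behind question 3.1 is: total cycles = instruction count x sum over categories of (frequency x CPI). The sketch below evaluates that relation in Python for the two instruction mixes in the table. Since the source leaves CPIB = n unknown, the value n = 10 used here is purely hypothetical, as are the names `total_cycles`, `MIX_1`, and `MIX_2`; both compilers are assumed to target the same clock.

```python
def total_cycles(instr_count, mix_percent, cpi):
    """Total clock cycles = IC * sum(frequency * CPI), frequencies in percent."""
    weighted = sum(mix_percent[k] * cpi[k] for k in mix_percent)
    return instr_count * weighted // 100


MIX_1 = {"A": 10, "B": 40, "C": 50}   # compiler #1, percent
MIX_2 = {"A": 50, "B": 20, "C": 30}   # compiler #2, percent

n = 10                                 # hypothetical CPI for category B
cpi = {"A": 5, "B": n, "C": 20}

cycles_1 = total_cycles(100_000, MIX_1, cpi)
cycles_2 = total_cycles(200_000, MIX_2, cpi)
print(cycles_1, cycles_2)
```

Re-running the sketch with other values of n shows how each total depends on the unknown CPI.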
12pts
10pts
8pts
4 ( 4+8+24+7+16+18 = 77 points) 30 min. MIPS Instructions and Memory addresses
4.1 MIPS ISA (a RISC ISA)
4.1.1 Stack Pointer is a ________ (GPR/SPR), where GPR stands for General Purpose Register and SPR stands for Special Purpose Register. MIPS ______________ (hardware/compiler) did not implement SP as an SPR.
4.1.2 The $31 is the __________ (link register/stack pointer) and is known to have that functionality to _____ (A/B/C).
The $29 is the __________ (link register/stack pointer) and is known to have that functionality to _____ (A/B/C). Here, A = hardware implementation team, B = compiler implementation team, C = both teams.
4.1.3 A representative assembly language instruction listing is given on the side, demonstrating nested subroutine calls and returns. Here A calls B, which in turn calls C.
3 pts Is the listing written assuming that the stack grows in the direction of __________________________ (decreasing/increasing) memory addresses?
9 pts How can you tell? ____________________________________________________________________________
____________________________________________________________________________
6 pts Label the two instruction(s) _______________ (preceding/following) the JAL instruction in execution (in execution = in the dynamic execution trace), together with the JAL Subroutine instruction in MIPS (that make up the CISC CALL instruction) as C1, C2, C3 (C1 = Call Part 1, so forth).
6 pts Similarly label the two instruction(s) _______________ (preceding/following) the JR $31 instruction in execution (in execution = in the dynamic execution trace), together with the JR $31 instruction in MIPS (that make up the CISC RTN (Return) instruction) R1, R2, R3 (R1 = Return Part1, so forth).
4.2 Intel follows the ___________ (Little Endian / Big Endian) system. In the Intel 80486 processor system address space, byte 0000_824CH is the ____________ (most / least) significant byte of the 32-bit word with system address ______________ (state in hexadecimal).
The 32-bit word 4000 consists of the four bytes 4000, 4001, 4002, and 4003 in ________________________________ (Little-Endian / Big-Endian / both kinds of / neither kind of) processor.
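The two byte-ordering conventions named in 4.2 can be made concrete with Python's built-in `int.to_bytes`. The sketch below lays out one 32-bit word in both orders; the word value 0x11223344 and the helper name `byte_layout` are hypothetical choices for illustration and do not come from the exam.

```python
def byte_layout(word, byteorder):
    """Return the four bytes of a 32-bit word, lowest address first."""
    return [f"{b:02X}" for b in word.to_bytes(4, byteorder)]


WORD = 0x11223344                     # hypothetical 32-bit value

little = byte_layout(WORD, "little")  # least significant byte at lowest address
big = byte_layout(WORD, "big")        # most significant byte at lowest address

for offset in range(4):
    print(f"base+{offset}: little={little[offset]} big={big[offset]}")
```

In both conventions the word occupies the same four consecutive byte addresses; only the significance assigned to each address differs.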
4pts
8pts
A:  ----
    ----
    jal B;
    ----
B:  ----
    addi $29, $29, -4;
    sw $31, 0($29);
    jal C;
    lw $31, 0($29);
    addi $29, $29, +4;
    ----
C:  ----
    jr $31
24pts
5+2pts
4.3 The Intel processors 80486 and i860 are both byte-addressable processors with 32-bit logical addresses. The 80486 is a 32-bit data processor, whereas the i860 is a 64-bit data processor. State the size of their address space(s): 80486: ___________, Intel i860: ________________________. If stacks of 16 MByte SRAM chips are placed in their byte-wide memory banks to fill up their entire address spaces, what are the lowest and highest system byte addresses which map to the bottom and the top of the specific 16 MByte chip to which the system byte address 2E45_94CB hex maps?
In the case of 80486, the bottom is _ _ _ _ _ _ _ _ _ hex and the top is _ _ _ _ _ _ _ _ _ hex.
And in the case of i860, the bottom is _ _ _ _ _ _ _ _ _ hex and the top is _ _ _ _ _ _ _ _ _ hex.
And if this chip goes bad, what is the total system address range in hex that needs to be declared as unusable?
(i) in the case of 80486 processor, it is _ _ _ _ _ _ _ _ _ hex to _ _ _ _ _ _ _ _ _ hex.
(ii) in the case of i860 processor, it is _ _ _ _ _ _ _ _ _ hex to _ _ _ _ _ _ _ _ _ hex.
4.3.1 Complete address decoding to generate Group-Selects (/GS_486 and /GS_860) for the row of chips and also show the rest of the labels for address, data, and byte-enable.
16 pts
18pts
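[Figure: two address-decoding diagrams, one labeled Intel 80486 and one labeled Intel i860. Each shows address lines A31, A30, A29, A28 leading to the group select (/GS_486 or /GS_860) and a 16 MByte chip with pins CS, WE, RD, A[ ], and D[7:0]. The labels D[ ], A[ : ], and BE are left blank to be completed.]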
Rough work box:
5 ( 15 + 30 + 10 + 10 + 9 = 64 points) 25 min. Single-cycle CPU:
You are familiar with the branch instruction, the ordinary jump instruction J (Jump with the 26-bit jump address field), and also the indirect jump instruction Jr rs, (Jump register rs).
5.1 The data path on the next page is nearly complete. Complete the connections to the 9 loose ends which are marked with numbered arrows.
5.2 Control Signal Table: Complete the three rows for addi, JR Rs, and J, the three columns for RegWrite, JR, and Jump, and a few other erased cells. Whenever possible, use don’t cares.
5.2.1 Occasionally it is possible to have two columns in the Control Signal Table with identical bits. T / F
Explain: ____________________________________________________________________________
____________________________________________________________________________
Occasionally it is possible to have two rows in the Control Signal Table with identical bits. T / F
Explain: ____________________________________________________________________________
____________________________________________________________________________
5.3 You save hardware components such as muxes/adder in the datapath if you do not have to support (circle your choices): (i) addi (ii) JR Rs (iii) J Explain the ones you did not circle: _______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
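[Control Signal Table. Columns, left to right: Instruction, MemRead, MemWrite, RegWrite, MemtoReg, RegDst, ALUSrc, ALUOp1, ALUOp0, Branch, JR, Jump. The given entries are listed below in their original order; erased cells are not shown, so the entries are not aligned to the columns.]
R-format  0 0 0 1 0 1 0 0
lw        1 0 1 0 1 0 0 0
sw        0 1 X 1 0 0 0
addi
beq       0 0 X 0 0 1 1
JR rs     1
J         1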
15pts
1
30pts
10pts
9pts
[Figure: single-cycle CPU datapath with nine numbered loose ends (arrows 1 to 9). Blocks shown: PC, Instruction memory (Read address, Instruction [31:0]), Registers (Read register 1, Read register 2, Write register, Write data, Read data 1, Read data 2), Sign extend (16 to 32), Shift left 2, ALU (Zero, ALU result), ALU control, Data memory (Address, Write data, Read data), two adders (one adding 4 to the PC), several 2-to-1 multiplexers, and Control. Instruction fields: [31:26], [25:21], [20:16], [15:11], [15:0], [5:0]. Control signals: RegDst, Jump, JR, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite, and PCSrc. Other labels: Jump Address [31:0], PC+4 [31:28].]
Blank page: Please write your name and email. Tear it off and use for rough work. Do not submit at the end.
Student’s Last Name:____________________ email: __________________
It is not difficult to get an A in EE457. You need to work for it and seek help from the 457 teaching team on whatever you do not understand. We are eager to help you. The next four topics, multi-cycle CPU, pipelined CPU, cache, and virtual memory, are interesting and challenging too. They are the focus of the midterm exam. Then we cover advanced topics. Best! Gandhi; TA: Kartik; Mentors: Sanjanai, Gengyu; HW Graders: Adithya, Gurucharan; Lab Graders: Guowei and Ting-Yu