ENCM 515 Review talk on 2001 Final

A. Wong, Electrical and Computer Engineering, University of Calgary, Canada

wona @ ucalgary.ca

To Be Tackled Today

Review important concepts of DSP

2001 ENCM 515 Final Exam: Question 1, Question 2

Disclaimer

The answers given in this presentation are the views of the presenter and not necessarily the answers accepted by Dr. Smith

Requirements for “perfect” DSP architecture - 1

Fast instruction cycle -- not clock speed

Fast hardware multiplier

Floating point for easier design -- avoids scaling and overflow

High precision -- wide buses for register, memory, processing units

Fast loop operation


Several data buses available to reduce memory bus conflict/transfer overhead

Harvard architecture and/or instruction caches to avoid instruction and data-fetch clashes

Duplicate resources for parallel computation

Dedicated address calculation hardware


Extensive temporary registers to avoid unnecessary fetches of continually used data

Architecture allows easy parallel operation in multiprocessor systems -- NEW

Cycle time adjustable by instruction -- UNCOMMON

Duplicate resources for parallel computation of real and imaginary components -- UNCOMMON -- SIMD?

2001 Final Exam - 1

Assume that non-volatile registers have been saved as needed and that the DAG registers I4, M4, B4, L4, I3, M3, I12, M12 have been set correctly.

A – circle the compute component of ONE 21k instruction

B – circle the first totally parallel instruction in code

C – circle the instructions that demonstrate filling the algorithm pipeline

1 F9 = F9 - F9 R2 = 256

2 F1 = dm(I4,M4) F5 = pm(I12,M12)

3 lcntr = R2, do (pc, END_DEMOD - 1)

4 F13 = F1 * F5 F9 = F9 + F13 F1 = dm(I4,M4) F5 = pm(I12,M12)

END_DEMOD:

5 F13 = F1 * F5 F9 = F9 + F13

6 dm(I3,M3)

2001 Final Exam – 1 -- DSA

A – circle the compute component of ONE 21k instruction -- OK

B – circle the first totally parallel instruction in code -- OK

C – circle the instructions that demonstrate filling the algorithm pipeline – the dm and pm in 2 and the + and * in 4

1 F9 = F9 - F9 R2 = 256

2 F1 = dm(I4,M4) F5 = pm(I12,M12)

3 lcntr = R2, do (pc, END_DEMOD - 1)

4 F13 = F1 * F5 F9 = F9 + F13 F1 = dm(I4,M4) F5 = pm(I12,M12)

END_DEMOD:

5 F13 = F1 * F5 F9 = F9 + F13

6 dm(I3,M3)
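As a rough illustration of what "filling the algorithm pipeline" means for this loop, the Python sketch below lists which data pair each step multiplies, adds and loads. It is a model of the listing, not 21k code. The function name `pipelined_mac_trace` and the use of integer indices to stand for data pairs are assumptions made for illustration.

```python
def pipelined_mac_trace(n):
    """Trace the overlapped load / multiply / add stages of the MAC loop.

    Every pass of instruction 4 does three things at once, each using the
    values left behind by the previous step: multiply the pair already
    loaded, add the product already computed, and load the next pair.
    Integers stand for "the k-th data pair" (hypothetical labelling).
    """
    trace = []
    loaded = 0      # pair held in F1/F5 after instruction 2
    product = None  # F13 holds no valid product yet
    for _ in range(n):  # instruction 4, executed n times
        trace.append({"multiply": loaded, "add": product, "load": loaded + 1})
        product, loaded = loaded, loaded + 1
    # instruction 5: multiply and add only, no further load
    trace.append({"multiply": loaded, "add": product, "load": None})
    return trace


if __name__ == "__main__":
    for step in pipelined_mac_trace(3):
        print(step)
```

In this model the first pass has nothing valid to add (`"add": None`), so the adder only starts doing useful work once the loads and the first multiply have run. The model also suggests that the last product computed by instruction 5 is never added to the sum.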

2001 Final Exam - 2

Briefly explain, using the context of this code, the concept of pipeline in parallel instruction processors.

Answer – Pipelines are necessary for parallelizing the above code, since it uses the same registers at different stages of the instruction cycle (Fetch, Decode, and Execute).

2001 Final Exam - 3

The code would be more understandable if the first instruction had been written as F9 = 0, R2 = 256 but that was not possible. Explain.

Answer – There is a set number of bits on the data bus; if the instruction uses too many constants, there may not be enough bits to store the numbers.

2001 Final Exam – 3 – D.S.A

The code would be more understandable if the first instruction had been written as F9 = 0, R2 = 256 but that was not possible. Explain.

Answer – There is a set number of bits on the data bus; if the instruction uses too many constants, there may not be enough bits to store the numbers.

Answer – Incomplete. Better: each constant takes 32 bits, so a total of 64 bits would be needed, but the program bus that carries instructions is only 48 bits wide.
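The bit-count argument can be checked with a few lines of Python. This is only an arithmetic sketch using the two widths quoted in the answer (48-bit instruction, 32-bit constant). The names `constants_fit` and `bits_left_for_opcode` are hypothetical helpers, not part of any 21k tool.

```python
INSTRUCTION_BITS = 48  # width of one instruction on the program bus
CONSTANT_BITS = 32     # width of one immediate constant


def bits_left_for_opcode(constant_count):
    """Bits remaining in one instruction after storing the constants."""
    return INSTRUCTION_BITS - constant_count * CONSTANT_BITS


def constants_fit(constant_count):
    """True if the constants fit inside a single instruction word."""
    return bits_left_for_opcode(constant_count) >= 0


if __name__ == "__main__":
    for count in (1, 2):
        print(count, constants_fit(count), bits_left_for_opcode(count))
```

One constant leaves 16 bits to describe the rest of the instruction. Two constants would need 16 bits more than the instruction has, which is why F9 = F9 - F9 is used to clear F9 without a second constant.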

2001 Final Exam - 4

The code will not provide the correct synchronous detection result. There are a number of ways of fixing the code. Would changing instruction 2 to F13=F13–F13, F1=dm(I4,M4), F5=pm(I12,M12); be one of them?

Answer – Yes. F13 is not set to 0 before the loop, so it may contain “garbage” the first time it is added to F9, resulting in an error.
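A small Python model of instructions 1 to 5 makes the effect of the uninitialised F13 visible. It is a sketch under stated assumptions: the name `demod_sum` is hypothetical, the loop count is taken from the array length rather than fixed at 256, and the extra load on the last pass is assumed to wrap to the start of the arrays purely to keep the model in bounds.

```python
def demod_sum(x, y, f13_initial):
    """Model instructions 1-5; f13_initial is whatever F13 held on entry."""
    n = len(x)
    f9 = 0.0                   # instruction 1: F9 = F9 - F9
    f13 = f13_initial          # never cleared by the original code
    f1, f5 = x[0], y[0]        # instruction 2: first loads
    for k in range(n):         # instructions 3 and 4: loop body
        nxt = (k + 1) % n      # assumption: the extra load wraps around
        # All right-hand sides use the old register values, as they do in
        # a single multifunction instruction.
        f13, f9, f1, f5 = f1 * f5, f9 + f13, x[nxt], y[nxt]
    f13, f9 = f1 * f5, f9 + f13  # instruction 5
    return f9


if __name__ == "__main__":
    x, y = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
    print(demod_sum(x, y, 0.0))  # F13 cleared first
    print(demod_sum(x, y, 7.5))  # F13 holding a leftover value
```

In this model the result differs from the true sum of products by exactly the value F13 held on entry, which is consistent with the answer that clearing F13 in instruction 2 is one way to fix the code.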

2001 Final Exam - 5

Explain the differences and relative advantages between processors with a von Neumann and Harvard architecture.

[Figure: block diagrams. Von Neumann: CPU and a single memory (ROM, Data) joined by one address bus and one data bus. Harvard: CPU joined to separate ROM and Data memories, each with its own address bus and data bus.]

2001 Final Exam – 5 – D.S.A.

Pictures are nice – but N. Q. A. – The question said “Relative advantages and disadvantages” and you never discussed these at all.

[Figure: the von Neumann and Harvard block diagrams from the answer, repeated.]

2001 Final Exam - 6

Using processors discussed in ENCM 515 provide examples of processors with a von Neumann and with a Harvard architecture.

Answer – von Neumann (68k), Harvard (29k)

2001 Final Exam - 7

The SHARC 21k does not have a Harvard architecture but a Super Harvard ARChitecture. What are the advantages of having a super Harvard over the normal type, and under what circumstances will these advantages disappear?

Answer – The 21k allows caching of instructions for fast access. The advantage disappears when the cache is full or when cache thrash occurs.
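The trade-off can be sketched with a simplified bus-occupancy count for one pass of a loop body that needs one instruction fetch and two operand reads. The Python below is a toy model, not a timing-accurate description of any processor: it assumes each bus carries one transfer per cycle, and the function name and architecture labels are made up for illustration.

```python
def cycles_per_pass(architecture, cache_hit=False):
    """Cycles for one pass needing 1 instruction fetch and 2 operand reads.

    Assumption: each bus moves one item per cycle, and the pass takes as
    long as its busiest bus.
    """
    if architecture == "von_neumann":
        # One shared bus carries the instruction and both operands.
        bus_load = {"shared": 3}
    elif architecture == "harvard":
        # The program bus carries the instruction and one operand; the
        # data bus carries the other operand in parallel.
        bus_load = {"program": 2, "data": 1}
    elif architecture == "super_harvard":
        # With the instruction already in the cache, the program bus is
        # free to carry an operand in the same cycle.
        bus_load = {"program": 1 if cache_hit else 2, "data": 1}
    else:
        raise ValueError("unknown architecture: " + architecture)
    return max(bus_load.values())


if __name__ == "__main__":
    print(cycles_per_pass("von_neumann"))
    print(cycles_per_pass("harvard"))
    print(cycles_per_pass("super_harvard", cache_hit=True))
    print(cycles_per_pass("super_harvard", cache_hit=False))
```

Under these assumptions the cached case is the only one that reaches one cycle per pass. On a cache miss the super Harvard count falls back to the plain Harvard figure, which matches the point that the advantage disappears when instructions are not found in the cache.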

2001 Final Exam - 8

Consider the code given earlier, will instruction 6 be cached? If it is, how do you know? If not, why?

Answer – No, caching only occurs when a data access on the PM bus conflicts with an instruction access on the PM bus

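2001 Final Exam – 8 – D.S.A

Answer – No, caching only occurs when a data access on the PM bus conflicts with an instruction access on the PM bus

ANSWER Yes -- instruction 4 inside the loop clashes with instruction 6 outside the loop (on the final pass, the PM data access of 4 coincides with the fetch of 6)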

1 F9 = F9 - F9 R2 = 256

2 F1 = dm(I4,M4) F5 = pm(I12,M12)

3 lcntr = R2, do (pc, END_DEMOD - 1)

4 F13 = F1 * F5 F9 = F9 + F13 F1 = dm(I4,M4) F5 = pm(I12,M12)

END_DEMOD:

5 F13 = F1 * F5 F9 = F9 + F13

6 dm(I3,M3)

Homework

Saturation arithmetic – Design, write and document a 21k assembly language code segment that accesses N points of a floating point array PMarray[] over the PM data bus, TRIPLES each value and sets all results above +25.0 to be equal to +25.0 before storing the result into a floating point array DMarray[] over the DM data bus.

.segment/pm seg_pmda;

.var PMarray[256]; // The initial array

.endseg;

.segment/dm seg_dmda;

.var DMarray[512]; // The final array

.var N; // The number of values to be converted

.endseg;
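Before writing the assembly, it can help to have a reference model to check results against. The Python below is such a sketch of the required arithmetic only. It is not a 21k solution, and the function name `triple_and_saturate` and its `limit` parameter are hypothetical.

```python
def triple_and_saturate(pm_array, n, limit=25.0):
    """Return the first n values tripled, with results above limit clipped.

    Only the upper bound is applied, as in the homework statement: large
    negative results are left unchanged.
    """
    return [min(3.0 * value, limit) for value in pm_array[:n]]


if __name__ == "__main__":
    print(triple_and_saturate([1.0, 8.0, 9.0, -20.0], 4))
```

Test data that straddles the limit (for example 8.0 and 9.0, which triple to 24.0 and 27.0) is a simple way to confirm that the comparison in the assembly version clips in the right direction.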
