Part A
1) A protein string machine code program
takes 4 days to execute on the current machine, spending 20% of the time on
integer instructions, 35% of the time on I/O instructions, and the remaining
time on other instructions. Which of the following two proposals is the better trade-off?
(i) A compiler optimization that reduces the number of integer
instructions by 25% (assume each integer instruction takes the same amount of time).
(ii) A hardware optimization that reduces the latency of each I/O operation from 6 µs to 5 µs.
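One way to compare the two proposals in question 1 is an Amdahl's-law style calculation. The sketch below is not part of the original paper. It assumes that time spent on integer instructions scales with their count, and that all of the I/O time scales with the per-operation latency. The helper name new_time_fraction is hypothetical.

```python
def new_time_fraction(fraction, scale):
    """Remaining runtime (as a fraction of the original) when the part of the
    program that takes `fraction` of the time is scaled by `scale`."""
    return (1.0 - fraction) + fraction * scale


TOTAL_DAYS = 4.0

# (i) 25% fewer integer instructions: the 20% integer share shrinks to 75% of itself.
int_opt = new_time_fraction(0.20, 0.75)

# (ii) I/O latency drops from 6 us to 5 us: the 35% I/O share shrinks to 5/6 of itself
# (assumption: I/O time is entirely per-operation latency).
io_opt = new_time_fraction(0.35, 5.0 / 6.0)

for label, frac in (("compiler (i)", int_opt), ("hardware (ii)", io_opt)):
    print(f"{label}: {frac * TOTAL_DAYS:.3f} days, speedup {1.0 / frac:.4f}")
```

Under these assumptions, proposal (i) leaves 95% of the original runtime (3.8 days) and proposal (ii) leaves about 94.2% (roughly 3.77 days), so the hardware change comes out slightly ahead.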
2) A computer architect needs
to design the pipeline of a new microprocessor. She has an example workload
program with one million (10^6) instructions. Each instruction takes
100 ps (1 ps = 10^-12 s) to complete. How long does it
take to execute this program on a non-pipelined processor? The current
state-of-the-art microprocessor has about 20 pipeline stages. Assume it is
perfectly pipelined. How much speedup will it achieve compared to
the non-pipelined processor?
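The arithmetic for question 2 can be sketched as follows. This is an illustrative calculation, not part of the original paper. It assumes that perfect pipelining splits the 100 ps instruction into 20 equal stages of 5 ps with no pipeline register overhead, and that the pipeline fill time is counted.

```python
N_INSTRUCTIONS = 10**6
T_INSTR_PS = 100      # time per instruction without pipelining, in picoseconds
STAGES = 20

# Non-pipelined: instructions run back to back, one at a time.
non_pipelined_ps = N_INSTRUCTIONS * T_INSTR_PS

# Perfect pipeline (assumption): equal stages, one instruction completes per
# stage time once the pipeline is full; the first one needs STAGES stage times.
stage_ps = T_INSTR_PS / STAGES
pipelined_ps = (STAGES + N_INSTRUCTIONS - 1) * stage_ps

speedup = non_pipelined_ps / pipelined_ps

print(f"non-pipelined: {non_pipelined_ps / 1e6:.1f} us")
print(f"pipelined:     {pipelined_ps / 1e6:.4f} us")
print(f"speedup:       {speedup:.4f}")
```

With these assumptions the non-pipelined run takes 100 µs, the pipelined run takes about 5 µs, and the speedup is just under 20, approaching the number of stages as the instruction count grows.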
3) How can a CPI < 1 be achieved?
List the approaches used for it.
Part B
4)
a) Explain Tomasulo's
algorithm to overcome data hazards using dynamic scheduling, with a neat diagram
and example code.
(OR)
b) Discuss the static and
dynamic branch prediction techniques with suitable examples and diagrams
(9)
5)
Analyse the data dependencies
among the following statements:
S1: Load R1, 1024          /R1 ← 1024/
S2: Load R2, M(10)         /R2 ← Memory(10)/
S3: Add R1, R2             /R1 ← (R1) + (R2)/
S4: Store M(1024), R1      /Memory(1024) ← (R1)/
S5: Store M((R2)), 1024    /Memory(64) ← 1024/
Note that (Ri) means the content of register Ri, and Memory(10) contains 64
initially. Draw a dependence graph to show all the dependencies. Are there any
resource dependencies if only one copy of each functional unit
is available in the CPU?
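A dependence analysis of this kind can be mechanised from per-statement read and write sets. The sketch below is one possible approach, not the paper's model answer. The read/write sets are an interpretation of S1 to S5 (1024 is treated as an immediate, and the S5 target is taken as M(64) because Memory(10) holds 64), and the function name find_dependences is hypothetical.

```python
# (name, locations read, locations written); an interpretation of S1..S5.
program = [
    ("S1", set(),        {"R1"}),
    ("S2", {"M(10)"},    {"R2"}),
    ("S3", {"R1", "R2"}, {"R1"}),
    ("S4", {"R1"},       {"M(1024)"}),
    ("S5", {"R2"},       {"M(64)"}),
]


def find_dependences(stmts):
    """Return (kind, source, sink, location) tuples for flow (RAW),
    anti (WAR) and output (WAW) dependences, in program order."""
    last_writer = {}   # location -> most recent statement that wrote it
    readers = {}       # location -> statements that read it since that write
    deps = []
    for name, reads, writes in stmts:
        for loc in sorted(reads):
            if loc in last_writer:
                deps.append(("flow", last_writer[loc], name, loc))
        for loc in sorted(writes):
            if loc in last_writer:
                deps.append(("output", last_writer[loc], name, loc))
            for reader in readers.get(loc, []):
                deps.append(("anti", reader, name, loc))
        # Record this statement's accesses only after checking them.
        for loc in reads:
            readers.setdefault(loc, []).append(name)
        for loc in writes:
            last_writer[loc] = name
            readers[loc] = []
    return deps


deps = find_dependences(program)
for kind, src, dst, loc in deps:
    print(f"{kind:6s} {src} -> {dst} on {loc}")
```

Under this interpretation the sketch reports flow dependences S1 → S3 and S3 → S4 on R1, S2 → S3 and S2 → S5 on R2, and an output dependence S1 → S3 on R1. Resource dependences (for example, S1 and S2 competing for a single load unit) are not modelled here.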
(OR)
Consider the following code. If all instructions have a 2-cycle latency,
unroll this loop (i) twice and (ii) four times, and show how a VLIW capable
of two loads and two adds per cycle can use the minimum number of registers in
the absence of pipeline interruptions and stalls.
Loop: LW   R1,0(R2)
      ADDI R5,R1,#1
      SW   R1,0(R2)
      ADDI R2,R2,#8
      SUB  R4,R3,R2
      BNZ  R4,Loop