printable version
Exam 2 Review A
[1]
[2]
[3]
[4]
Problem R2a.1.
[10 pts]
Complete the following.
a. Suppose we have a 1 GHz processor with a classic 5-stage pipeline,
where 30% of instructions stall for one clock cycle (and none for more
than one).
How many instructions will this processor complete per second?
(Show your calculations)
b. Suppose we split each stage of the pipeline, so that we now have a 10-stage pipeline, which allows us to run the processor at 1.8 GHz. Now 30% of instructions stall for two clock cycles, while the other 70% do not stall at all.
How many instructions will this processor complete per second?
(Show your calculations)
a. |
1.0 × 109 cycles | × |
instruction | × |
= 7.69 × 108 | instructions |
|
.3 × 2 + .7 × 1 cycles |
second |
second |
a. |
1.8 × 109 cycles | × |
instruction | × |
= 11.25 × 108 | instructions |
|
.3 × 3 + .7 × 1 cycles |
second |
second |
Problem R2a.2.
[10 pts]
Assume we have a vector processor with vector registers of length
n where n ≥ 16.
The processor has one load/store unit (latency 10),
one multiplication unit (latency 7), and one addition unit (latency 4).
Assuming the processor does not avoid latency between consecutive
instructions, but it does support chaining, complete the following table
showing in which clock cycles each indicated element is in an execution
unit.
instruction | first elt | last elt |
lv V1, ($s0) |
0… | |
mulvv.d V2, V1, V1 |
| |
mulvv.d V3, V2, V1 |
| |
sv V3, ($s0) |
| |
instruction | first elt | last elt |
lv V1, ($s0) |
0…9 | n − 1…n + 8 |
mulvv.d V2, V1, V1 |
10…16 | n + 9…n + 15 |
mulvv.d V3, V2, V1 |
n + 10…n + 16 | 2 n + 9…2 n + 15 |
sv V3, ($s0) |
n + 17…n + 26 | 2 n + 16…2 n + 25 |
Problem R2a.3.
[18 pts]
When executing the following program,
which happens to display the hailstone sequence starting from 3,
the first six branches occur as listed at right.
n = 3;
while (true) {
printf("%d", n)
if (n % 2 == 0) { // A
n /= 2;
} else if (n != 1) { // B
n = 3 * n + 1;
} else {
break;
}
}
|
1: | A | N |
2: | B | Y |
3: | A | Y |
4: | A | N |
5: | B | Y |
6: | A | Y |
|
a. Suppose we predict branches using a local prediction buffer, where each entry is a two-bit predictor initially configured to predict “weakly yes” (that is, predict “yes” though the last observation was “no”).
Which branches will be predicted incorrectly?
b. Suppose we predict branches using a
three-bit global branch history, where each entry is
again a two-bit predictor initially configured to predict
“weakly yes.” The previous several branches have
been “no.” Which branches will be predicted
incorrectly?
c. Suppose we use a correlating predictor using a
one-bit global branch history, where each entry is again
a two-bit predictor initially configured to predict
“weakly yes.” The previous several branches have
been “no.” Which branches will be predicted
incorrectly?
a. 1, 3, 6
b. 1, 2, 4
c. 1, 4
Problem R2a.4.
[8 pts]
What is the function of a “reservation station” in
Tomasulo's algorithm (without speculation)?
Whenever an instruction is issued, all information necessary
to executing the instruction in stored in its reservation
station. The reservation station's ID also identifies what
value should be used by subsequent instructions depending on
the instruction's result.