Final Review B: Questions
Suppose two cores execute the following threads having access
to the shared-memory variables x
, y
, and z
.
initially, x
is 0,y
is 1,z
is 2Thread A Thread B x = 10;
y = 11;
z = 12;z = 22;
y = 21;
x = 20;
Assuming our cache is coherent, which of the following are possible final values for the shared variables?
x
y
z
a. 10 11 12 b. 10 11 22 c. 10 21 12 d. 10 21 22 e. 20 11 12 f. 20 11 22 g. 20 21 12 h. 20 21 22
Suppose two cores execute the following threads having access
to the shared-memory variables x
and y
.
initially, x
is 0,y
is 10Thread A Thread B if (x != 0) {
y = 11;
} else if (x == 0) {
y = 12;
}x = 1;
Assuming our cache is coherent, which of the following are possible final values for the shared variables?
x
y
a. 1 10 b. 1 11 c. 1 12
Suppose two cores execute the following threads having access
to the shared-memory variables a
and b
.
initially, a
is 0,b
is 1Thread A Thread B x = y;
y = 2;y = 3;
x = 4;
Suppose we have a sequentially consistent policy for cache
coherency. Nonetheless, in executing these threads we find that
the final value for x
is 1 and y
is 2 — as
if, in fact, B did not even exist.
This seems to contradict a sequentially consistent policy:
Though only one thread can start before the other ends,
it seems that A started before B finished (since x
is 1)
and also that B started before A finished (since y
is 2).
Why is it possible that x
is 1 and y
is 2 at the
program's completion?
In warehouse-scale computing, the biggest cost category is purchasing the computing equipment. What is the second biggest cost category?
What is a “cold aisle” in warehouse-scale computing?
Distinguish between the terms “array” and “rack” as sets of servers in warehouse-scale computing.
Distinguish the terms “Platform as a Service” (PaaS) and “Infrastructure as a Service” (IaaS) in cloud computing.
What motivates the introduction of MapReduce/Hadoop to help with programming for warehouse-scale computing?
Suppose we have a set of files corresponding to days of the year; each file contains a list of location names followed by a tempature reading. How can we use MapReduce/Hadoop to determine the maximum temperature reading for each location?
Suppose we have a set of files corresponding to courses, each listing the students enrolled in that course. How can we use MapReduce/Hadoop to determine, for each student, the number of distinct students that this person has taken a class with?
Final Review B: Solutions
The possibities are (a.), (e.), (g.), and
(h.).
If x
ends up at 10, then that means that we execute
the first instruction of Thread A after completing
the last instruction of Thread B, so none of B's writes
have any effect; that disqualifies
(b.), (c.), and (d.).
If y
ends up at 11, that means that we execute the A's second
instruction after B's second instruction, so B's write to z
will be overwritten by A's write to z
; that disqualifies
(f.).
Any of the three are possibilities.
Obviously, if we do all of A before B, then y
will be 12;
and if we do all of B before A, then y
will be 11.
It's also possible that first A's first if
condition
fails (since x
is 0),
then B executes, and then A's second if
condition fails
(since x
is now 1), so y
will remain unchanged at 10.
With the MIPS architecture and other load-store architectures,
the instruction x = y
will translate into a load instruction (lw
on MIPS)
followed by a store instruction (sw
on MIPS).
It is possible that A will execute its load instruction first,
placing 1 into a register; then both instructions of B will
execute, then A will store the 1 still in its register into
x
and the 2 into y
.
Equipment for power and cooling.
For cooling, cool air is vented into one aisle between servers, which is blown across individual servers to cool them before entering the “hot aisle” where it is vented out of the room.
A rack is a vertical stack of servers, typically 24 to 48. An array is a collection of typically around 20 to 40 racks.
With PaaS, the cloud provider configures the operating system and installed software, and the user will manage an account within that configuration. With IaaS, the user is given control over a specific machine (typically a virtual machine), which allows control over all (or nearly all) aspects of the OS installation.
In programming solutions for warehouse-scale computing, any solution must deal with distributing tasks among servers, communicating results between servers, and recovering from server failure. MapReduce/Hadoop is a framework that applies to a wide variety of problems using a large amount of data processing, as is typical in warehouse-scale computing problems, and it manages these details (task assignment, communication, server failures).
In the Map phase, we'd have a Map task for each file that would break the file into (locationName, temperatureReading) pairs. In the Reduce phase, we'd have a Reduce task that takes a locationName and the list of temperatureReadings, computes the maximum temperatureReading from the list, and produces (locationName, maximumReading) as its result.
In the Map phase, we'd have a Map task for each file that would loop through each combination of (studentA, studentB) pairs and emit it as a key-value pair. In the Reduce phase, we'd have a Reduce task that takes a student name and the list of students who that person has taken a class with; it would sort the list, removing duplicates, and then produce (locationName, sortedListLength) as its result.