Test 1 Review B: Questions
In an earlier question, we implemented a ThreadHolder class, where threads could call the awaitRelease method to stall until another thread calls release with a number saying how many threads stuck in awaitRelease should be released. (You should assume that release is only called when there are enough threads stalling that all can be released immediately.)
public class ThreadHolder {
    private int released = 0;
    private Object lock = new Object();

    public void awaitRelease() {
        synchronized (lock) {
            while (released <= 0) {
                try { lock.wait(); } catch (InterruptedException e) {}
            }
            --released;
        }
    }

    public void release(int n) {
        synchronized (lock) {
            released += n;
            lock.notifyAll();
        }
    }
}
This solution, though, is prone to starvation even if a thread regularly calls release with a positive parameter n. Explain how starvation may occur, and repair the code so starvation will not occur.
Summarize what happens with a semaphore's DOWN operation.
Java's built-in Semaphore class implements semaphores, including a constructor taking an int parameter giving the initial value, a no-argument acquire method equivalent to the DOWN operation, and a no-argument release method equivalent to the UP operation.
Below is the skeleton of a Web server. Using the Semaphore class, how could you modify it so that no more than 5 Responders will be active at any time?
public class WebServer {
    private class Responder implements Runnable {
        public void run() {
            // (omitted) receive request
            // (omitted) send response to client
            // TODO: tell "main" we are done
        }
    }

    private void doMain() {
        // (omitted) start the server listening
        while (true) {
            // (omitted) accept connection from client
            // TODO: wait until less than 5 Responders are active
            // start thread to service client
            Responder responder = new Responder();
            // (omitted) initialize responder
            new Thread(responder).start();
        }
    }

    public static void main(String[] args) {
        new WebServer().doMain();
    }
}
Java's built-in Semaphore class implements semaphores, including a constructor taking an int parameter giving the initial value, a no-argument acquire method equivalent to the DOWN operation, and a no-argument release method equivalent to the UP operation.
Using no other primitives but the Semaphore class, complete the following Barrier class so that each thread calling awaitOthers stalls within the method until the tenth thread arrives, whereupon all ten threads return from awaitOthers.
public class Barrier {
    public void awaitOthers() {
    }
}
Many processors provide a test-and-set machine language instruction to help an OS provide synchronization. Given a register and a memory address as arguments, it atomically loads the current value at that memory address into the register and then stores 1 to the address. In C pseudocode, we can implement a critical section as follows.
while (test_and_set(lock_addr) != 0); // wait until lock changed from 0
// CRITICAL SECTION HERE
*lock_addr = 0; // reset lock to 0 to release it
Now suppose we need two locks lock0 and lock1 before entering a critical section. The straightforward way to accomplish this is the following.
while (test_and_set(lock0) != 0); // first acquire lock0
while (test_and_set(lock1) != 0); // then acquire lock1
// CRITICAL SECTION HERE
*lock1 = 0; // release lock1
*lock0 = 0; // release lock0
However, this holds on to the first lock for an indefinite period (until the second lock becomes available) before entering the critical section. It would be better to avoid this. How could we rewrite this so that no lock is held for a long period of time?
We discussed the relative merits of implementing threads as a kernel-independent user library (user-level threads) and of implementing threads via system calls to the kernel (kernel-level threads). Explain an advantage of one of the approaches over the other. Be sure to mention which approach your point supports.
Explain what a hybrid thread implementation is and how it can potentially combine the advantages of user-level threads and kernel-level threads (at the expense of more complexity).
Suppose we have four jobs arriving according to the following schedule (all units are in hours).
job | arrive | length |
A   | 0      | 6      |
B   | 2      | 5      |
C   | 4      | 4      |
D   | 6      | 3      |
For each of FIFO, SJF, and Round Robin (with a one-hour time slice), list the intervals in which each job is worked on, and compute the average delay.
Round-robin scheduling does poorly in minimizing the average delay. So what advantage does it have over FIFO or SJF?
In a system where some processes are I/O-bound while others are compute-bound, how can we improve our scheduling over a simple Round Robin approach to improve the overall system performance?
In a multiprocessor system, why can it be advantageous to try to schedule a multithreaded program's threads simultaneously across different processors (rather than simply letting each thread come onto a processor whenever one happens to be free)?
Test 1 Review B: Solutions
There is nothing dictating in which order threads will wake up from wait, and in fact it could be that the threads that most recently entered awaitRelease will be the ones that wake up and return. Thus, as long as more threads are in awaitRelease than are released by release, there could be a thread that never manages to exit awaitRelease simply because it never wins.

A solution would be for each thread entering awaitRelease to essentially “pick a number” and to ensure that those with the smallest numbers are those that are released.
public class ThreadHolder {
    private int lastReleased = 0;
    private int lastAllocated = 0;
    private Object lock = new Object();

    public void awaitRelease() {
        int myNumber;
        synchronized (lock) {
            lastAllocated++;
            myNumber = lastAllocated;
            while (lastReleased < myNumber) {
                try { lock.wait(); } catch (InterruptedException e) {}
            }
        }
    }

    public void release(int n) {
        synchronized (lock) {
            lastReleased += n;
            lock.notifyAll();
        }
    }
}
[There is a slight bug here with overflow. It's not particularly likely to be an issue, but it would be safer to use BigInteger rather than an int.]
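As a sanity check, the repaired class can be exercised with a small driver. The sketch below (ThreadHolderDemo is a made-up name) reproduces the solution's ThreadHolder, starts three threads that stall in awaitRelease, and releases all three at once:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadHolderDemo {
    // The starvation-free ThreadHolder from the solution above.
    static class ThreadHolder {
        private int lastReleased = 0;
        private int lastAllocated = 0;
        private final Object lock = new Object();

        public void awaitRelease() {
            int myNumber;
            synchronized (lock) {
                lastAllocated++;
                myNumber = lastAllocated;
                while (lastReleased < myNumber) {
                    try { lock.wait(); } catch (InterruptedException e) {}
                }
            }
        }

        public void release(int n) {
            synchronized (lock) {
                lastReleased += n;
                lock.notifyAll();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadHolder holder = new ThreadHolder();
        AtomicInteger exited = new AtomicInteger();
        Thread[] threads = new Thread[3];
        for (int i = 0; i < 3; i++) {
            threads[i] = new Thread(() -> {
                holder.awaitRelease();   // stalls until released
                exited.incrementAndGet();
            });
            threads[i].start();
        }
        Thread.sleep(100); // give the threads time to start stalling
        holder.release(3); // release all three at once
        for (Thread t : threads) t.join();
        System.out.println("threads released: " + exited.get());
    }
}
```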
We decrement the integer held as part of the semaphore — but if the semaphore had already reached zero, we first wait until the semaphore becomes positive.
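The same wait/notifyAll pattern used in ThreadHolder expresses this directly. Below is a minimal counting-semaphore sketch (SimpleSemaphore, down, and up are made-up names; the following questions use Java's real Semaphore class instead):

```java
public class SimpleSemaphore {
    private int value;

    public SimpleSemaphore(int initial) { value = initial; }

    // DOWN: if the count is zero, wait; otherwise decrement and proceed.
    public synchronized void down() throws InterruptedException {
        while (value <= 0) {
            wait(); // stall until an up() makes the count positive
        }
        value--;
    }

    // UP: increment the count and wake any stalled threads.
    public synchronized void up() {
        value++;
        notifyAll();
    }

    public static void main(String[] args) throws InterruptedException {
        SimpleSemaphore s = new SimpleSemaphore(1);
        s.down();                  // count 1 -> 0; does not block
        new Thread(s::up).start(); // will let the next down() proceed
        s.down();                  // blocks until the up() above runs
        System.out.println("done");
    }
}
```

The while loop (rather than an if) matters: a woken thread must re-check the count, since another thread may have taken the permit first.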
The lines marked “// ADDED” below are the only changes.
import java.util.concurrent.Semaphore; // ADDED

public class WebServer {
    private class Responder implements Runnable {
        public void run() {
            // (omitted) receive request
            // (omitted) send response to client
            // TODO: tell "main" we are done
            availableSlots.release(); // ADDED
        }
    }

    private Semaphore availableSlots = new Semaphore(5); // ADDED

    private void doMain() {
        // (omitted) start the server listening
        while (true) {
            // (omitted) accept connection from client
            // TODO: wait until less than 5 Responders are active
            // ADDED (the uninterruptible form avoids acquire()'s checked InterruptedException)
            availableSlots.acquireUninterruptibly();
            // start thread to service client
            Responder responder = new Responder();
            // (omitted) initialize responder
            new Thread(responder).start();
        }
    }

    public static void main(String[] args) {
        new WebServer().doMain();
    }
}
import java.util.concurrent.Semaphore;

public class Barrier {
    private int numWaiting = 0;
    private Semaphore numWaitingLock = new Semaphore(1);
    private Semaphore waitToReturn = new Semaphore(0);

    public void awaitOthers() {
        // (acquireUninterruptibly avoids acquire()'s checked InterruptedException)
        numWaitingLock.acquireUninterruptibly(); // lock for access to numWaiting
        ++numWaiting;
        boolean inFirstNine = numWaiting < 10;
        numWaitingLock.release(); // done accessing it, so release lock
        if (inFirstNine) { // I am in first 9; wait until tenth releases me
            waitToReturn.acquireUninterruptibly();
        } else { // I am tenth; release the other 9
            for (int i = 0; i < 9; i++) {
                waitToReturn.release();
            }
        }
    }
}
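A quick way to check the behavior is a driver that sends ten threads through the barrier. This sketch (BarrierDemo is a made-up name) reproduces the Barrier above and counts how many threads get past it:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BarrierDemo {
    // The Barrier from the solution above; acquireUninterruptibly is used
    // so that awaitOthers need not declare InterruptedException.
    static class Barrier {
        private int numWaiting = 0;
        private final Semaphore numWaitingLock = new Semaphore(1);
        private final Semaphore waitToReturn = new Semaphore(0);

        public void awaitOthers() {
            numWaitingLock.acquireUninterruptibly(); // guard numWaiting
            ++numWaiting;
            boolean inFirstNine = numWaiting < 10;
            numWaitingLock.release();
            if (inFirstNine) {
                waitToReturn.acquireUninterruptibly(); // stall until the tenth
            } else {
                for (int i = 0; i < 9; i++) {
                    waitToReturn.release(); // tenth thread frees the other nine
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Barrier barrier = new Barrier();
        AtomicInteger passed = new AtomicInteger();
        Thread[] threads = new Thread[10];
        for (int i = 0; i < 10; i++) {
            threads[i] = new Thread(() -> {
                barrier.awaitOthers();    // blocks until all ten have arrived
                passed.incrementAndGet(); // runs only after the barrier opens
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println("threads past barrier: " + passed.get());
    }
}
```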
while (1) {
    if (test_and_set(lock0) == 0) { // we acquired lock0
        if (test_and_set(lock1) == 0) {
            break; // both locks acquired, exit loop
        } else {
            *lock0 = 0; // lock1 was taken, so release lock0 and retry
        }
    }
}
// CRITICAL SECTION HERE
*lock0 = 0;
*lock1 = 0;
[By the way, a better approach would be for the else case to release lock0, then work on acquiring lock1 before it attempts to acquire lock0 again. But this is more complex.]
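That alternating idea can be sketched in Java, with AtomicBoolean.getAndSet(true) standing in for the test-and-set instruction, since it likewise returns the old value while setting the flag (TwoLockDemo, acquireBoth, and releaseBoth are made-up names):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class TwoLockDemo {
    // false = free, true = held; getAndSet(true) acts like test-and-set.
    static final AtomicBoolean lock0 = new AtomicBoolean(false);
    static final AtomicBoolean lock1 = new AtomicBoolean(false);

    // Acquire both locks without holding one while spinning on the other:
    // if the second lock is busy, release the first and swap the order,
    // so next time we work on acquiring the other lock first.
    static void acquireBoth() {
        AtomicBoolean first = lock0, second = lock1;
        while (true) {
            if (first.getAndSet(true)) continue;  // first is busy; retry
            if (!second.getAndSet(true)) return;  // got both locks
            first.set(false);                     // second busy: back off
            AtomicBoolean tmp = first;            // swap acquisition order
            first = second;
            second = tmp;
        }
    }

    static void releaseBoth() {
        lock1.set(false);
        lock0.set(false);
    }

    public static void main(String[] args) {
        acquireBoth();
        // CRITICAL SECTION would go here
        releaseBoth();
        System.out.println("acquired and released both locks");
    }
}
```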
Advantages of user-level threads:
- In switching contexts between threads, avoids switching to kernel mode, which adds quite a bit of time to the context switch.
- OS-level support is minimal, so threaded applications can be more portable across operating systems.
- Applications have more control over the scheduling policy used for switching between threads.
Advantages of kernel-level threads:
- On multicore/multiprocessor systems, threads can be scheduled simultaneously on different cores/processors.
- When one thread blocks for a page fault or device access, the OS knows about other live threads in the same process that can still use the CPU.
In a hybrid threading system, a process has a handful of kernel-level threads, each of which manages a large group of user-level threads. For a context switch between threads, we can usually switch between user-level threads, avoiding entry into the CPU's kernel mode. But different kernel-level threads can still be scheduled on different processors; and if a thread blocks for a page fault or device access, the OS can switch to another kernel-level thread.
- FIFO
  time  | job |
  0–6   | A   |
  6–11  | B   |
  11–15 | C   |
  15–18 | D   |

  job | arrive | complete | delay |
  A   | 0      | 6        | 6     |
  B   | 2      | 11       | 9     |
  C   | 4      | 15       | 11    |
  D   | 6      | 18       | 12    |

  The total delay is 6 + 9 + 11 + 12 = 38, so the average is 38 / 4 = 9.5.
- SJF
  time  | job |
  0–6   | A   |
  6–9   | D   |
  9–13  | C   |
  13–18 | B   |

  job | arrive | complete | delay |
  A   | 0      | 6        | 6     |
  B   | 2      | 18       | 16    |
  C   | 4      | 13       | 9     |
  D   | 6      | 9        | 3     |

  The total delay is 6 + 16 + 9 + 3 = 34, so the average is 34 / 4 = 8.5.
- Round Robin (time slice 1)
  time  | job |
  0–2   | A   |
  2–3   | B   |
  3–4   | A   |
  4–5   | C   |
  5–6   | B   |
  6–7   | D   |
  7–8   | A   |
  8–9   | C   |
  9–10  | B   |
  10–11 | D   |
  11–12 | A   |
  12–13 | C   |
  13–14 | B   |
  14–15 | D   |
  15–16 | A   |
  16–17 | C   |
  17–18 | B   |

  job | arrive | complete | delay |
  A   | 0      | 16       | 16    |
  B   | 2      | 18       | 16    |
  C   | 4      | 17       | 13    |
  D   | 6      | 15       | 9     |

  The total delay is 16 + 16 + 13 + 9 = 54, so the average is 54 / 4 = 13.5.
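The FIFO and SJF averages above can be checked mechanically. This Java sketch (SchedulingDemo is a made-up name) simulates both non-preemptive policies on the table's jobs (Round Robin, being preemptive, is omitted for brevity):

```java
import java.util.*;

public class SchedulingDemo {
    record Job(String name, int arrive, int length) {}

    // FIFO: run jobs to completion in arrival order.
    static double fifoAverageDelay(List<Job> jobs) {
        List<Job> sorted = new ArrayList<>(jobs);
        sorted.sort(Comparator.comparingInt(Job::arrive));
        int time = 0, totalDelay = 0;
        for (Job j : sorted) {
            time = Math.max(time, j.arrive()) + j.length(); // completion time
            totalDelay += time - j.arrive();                // delay = complete - arrive
        }
        return (double) totalDelay / jobs.size();
    }

    // Non-preemptive SJF: among the jobs that have arrived, run the shortest.
    static double sjfAverageDelay(List<Job> jobs) {
        List<Job> pending = new ArrayList<>(jobs);
        int time = 0, totalDelay = 0;
        while (!pending.isEmpty()) {
            final int now = time;
            Job next = pending.stream()
                .filter(j -> j.arrive() <= now)               // already arrived
                .min(Comparator.comparingInt(Job::length))    // pick shortest
                .orElseGet(() -> pending.stream()             // CPU idle: take
                    .min(Comparator.comparingInt(Job::arrive)).get()); // next arrival
            time = Math.max(time, next.arrive()) + next.length();
            totalDelay += time - next.arrive();
            pending.remove(next);
        }
        return (double) totalDelay / jobs.size();
    }

    public static void main(String[] args) {
        List<Job> jobs = List.of(new Job("A", 0, 6), new Job("B", 2, 5),
                                 new Job("C", 4, 4), new Job("D", 6, 3));
        System.out.println("FIFO average delay: " + fifoAverageDelay(jobs));
        System.out.println("SJF average delay: " + sjfAverageDelay(jobs));
    }
}
```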
Over SJF, it has the advantage that it works without any foreknowledge of job lengths, and it is not prone to starvation.
Over FIFO, Round Robin ensures that the processor regularly spends time on each process, so that short-running jobs will tend to complete faster. By contrast, FIFO can impose very long delays on short jobs while the CPU works through longer-running jobs ahead of them.
We should prioritize the I/O-bound processes, so that whenever one becomes ready for the CPU, we quickly give it the small amount of compute time it needs before it returns to waiting on I/O. This keeps the I/O devices more heavily used while having a negligible effect on CPU availability for the compute-bound processes.
When the threads are trying to communicate, having both scheduled simultaneously can allow them to communicate with smaller delays.
Also, the cache can potentially be used more effectively if it is only being used for one process's memory.