Session 10: Process scheduling

Textbook: Section 2.4

Semaphore applications

Readers and writers
Process scheduling
Round robin
Priority
Performance-based
Lottery
Differing time slices
Minix's scheme

Semaphore applications

Readers and writers

The reader-writer problem comes from databases: The idea is that we have some data on which processes might get a read lock if the process wants to read the data or a write lock if the process might possibly change the data. Any number of processes should be allowed to obtain a read lock on the data simultaneously, but a write lock should prevent anybody else from reading or writing the data until the write lock is released.

To solve this, we might do the following: We keep a regular variable count to count the number of readers, and we use two semaphores. The first semaphore count_lock is for locking the count variable, and the second semaphore db_lock is for locking the database. Both semaphores start at the value 1 (meaning up to 1 process can get the lock at a time).

To get the write lock, we can just get a lock using db_lock, which gives us complete access to the data.

void write_lock() {                void write_unlock() {
    down(db_lock);                     up(db_lock);
}                                  }

Getting the read lock is somewhat harder: If we're the first one to obtain a read lock, we get a lock on the data and increment count. Subsequent read-lockers can just increment the count. Similarly, when we want to release the read lock, we want to release the readers' joint lock on the data if we're the last one. (This is similar to the light switch in a room: The first one in turns it on, and the last one out turns it off.)

void read_lock() {                 void read_unlock() {
    down(count_lock);                  down(count_lock);
    if(count == 0) down(db_lock);      --count;
    ++count;                           if(count == 0) up(db_lock);
    up(count_lock);                    up(count_lock);
}                                  }
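The fragments above can be collected into a compilable sketch using POSIX semaphores, where down corresponds to sem_wait and up to sem_post. The rw_init helper and the use of POSIX sem_t are assumptions for illustration; the pseudocode above doesn't commit to any particular API.

```c
#include <semaphore.h>

/* Shared state for the reader-priority solution. */
static int count = 0;         /* number of active readers */
static sem_t count_lock;      /* protects count */
static sem_t db_lock;         /* exclusive lock on the data */

void rw_init(void) {
    sem_init(&count_lock, 0, 1);   /* both start at 1 */
    sem_init(&db_lock, 0, 1);
}

void read_lock(void) {
    sem_wait(&count_lock);
    if (count == 0) sem_wait(&db_lock);   /* first reader locks the data */
    ++count;
    sem_post(&count_lock);
}

void read_unlock(void) {
    sem_wait(&count_lock);
    --count;
    if (count == 0) sem_post(&db_lock);   /* last reader releases it */
    sem_post(&count_lock);
}

void write_lock(void)   { sem_wait(&db_lock); }
void write_unlock(void) { sem_post(&db_lock); }
```

A quick single-threaded check: after two read_lock calls, db_lock is held (a writer would block), and after the matching read_unlock calls it is free again.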

This solution has a problem in that it gives priority to readers over writers. If there is a steady stream of readers, the writer will starve. (This is the rush-hour street-crossing problem all over again.)

Fixing the starvation

So how can we fix this? In one solution, we'll force each reader to block when there's a writer waiting. Basically, we're giving priority to writers over readers.

To accomplish this, we'll add a new semaphore reads_lock, initially 1, which will hold 0 when no new readers are allowed to continue. A writer will acquire this lock before it vies for db_lock, effectively blocking out readers that come after it.

void write_lock() {                void write_unlock() {
    down(reads_lock);                  up(db_lock);
    down(db_lock);                     up(reads_lock);
}                                  }

A reader will also decrement reads_lock to make sure it's available, but it immediately increments it again to allow subsequent readers to get through also.

void read_lock() {                 void read_unlock() {
    down(reads_lock);
    up(reads_lock);
    down(count_lock);                  down(count_lock);
    if(count == 0) down(db_lock);      --count;
    ++count;                           if(count == 0) up(db_lock);
    up(count_lock);                    up(count_lock);
}                                  }

So, as soon as a writer requests a lock, it locks reads_lock. Subsequent readers will block on reads_lock until that writer finally finishes.
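This writer-priority variant can likewise be written as a compilable sketch with POSIX semaphores (down = sem_wait, up = sem_post); as before, rw_init and the sem_t types are assumptions for illustration.

```c
#include <semaphore.h>

static int count = 0;                        /* number of active readers */
static sem_t count_lock, db_lock, reads_lock;

void rw_init(void) {
    sem_init(&count_lock, 0, 1);
    sem_init(&db_lock, 0, 1);
    sem_init(&reads_lock, 0, 1);
}

void write_lock(void) {
    sem_wait(&reads_lock);   /* shut out readers that arrive after us */
    sem_wait(&db_lock);
}

void write_unlock(void) {
    sem_post(&db_lock);
    sem_post(&reads_lock);
}

void read_lock(void) {
    sem_wait(&reads_lock);   /* blocks here while a writer holds it */
    sem_post(&reads_lock);   /* let the next reader through too */
    sem_wait(&count_lock);
    if (count == 0) sem_wait(&db_lock);
    ++count;
    sem_post(&count_lock);
}

void read_unlock(void) {
    sem_wait(&count_lock);
    --count;
    if (count == 0) sem_post(&db_lock);
    sem_post(&count_lock);
}
```

Note that readers never hold reads_lock for long: they take it and release it immediately, so it only delays them while a writer is in progress.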

Does this give unfair priority to writers? Since both readers and writers have the same priority for reads_lock, this technique doesn't really prefer one over the other.

Process scheduling

Synchronization is one interesting issue that arises with multiple processes. Another interesting issue is the decision the OS must make about which process to select for the CPU at any time.

Round robin

The simplest technique is round-robin scheduling. Here, the OS maintains a queue of processes. When it wants to select something to run, it selects the process at the head of the queue. When its time slice expires, it goes to the back of the line. When a process becomes ready to use the CPU, it also goes to the back of the line.
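The queue manipulation just described can be sketched in C. The fixed capacity, the integer "pids," and the function names are simplifying assumptions for illustration:

```c
#include <stddef.h>

/* Minimal round-robin ready queue: the process at the head runs next;
   expired or newly-ready processes go to the back of the line. */
#define QCAP 8
static int queue[QCAP];
static size_t head = 0, len = 0;

void make_ready(int pid) {            /* joins the back of the line */
    queue[(head + len) % QCAP] = pid;
    ++len;
}

int pick_next(void) {                 /* head of the queue gets the CPU */
    int pid = queue[head];
    head = (head + 1) % QCAP;
    --len;
    return pid;
}

void quantum_expired(int pid) {       /* back to the end of the line */
    make_ready(pid);
}
```

So with processes 1, 2, 3 ready, process 1 runs first, and when its quantum expires it rejoins behind 2 and 3.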

The only issue here is how long the quantum should be - that is, how long is a time slice? This is a balancing act: It takes time to switch processes, so you don't want to switch processes often, and hence the quantum should be large; on the other hand, you don't want it to be so long that users become frustrated with how erratic the observed behavior is.

Minix uses a quantum of 100 milliseconds, which is good enough for its purposes. A larger delay (like half a second) would be within the perceptive range of humans, and you don't want users to perceive the quantum. For a system with a heavier process load, you might want a smaller quantum, as several delays of 100 milliseconds can accumulate.

Priority

In a priority-based scheduler, processes are assigned priorities, and the highest-priority ready process is chosen for the CPU at any time. This makes sense, for example, on multi-user computers in many corporate settings, where executives might feel they should get the CPU whenever they want it, while professionals should get it when the executives are done with their work, and interns get the spare pieces.

Even in a single-user system, a priority-based scheme can make sense: The printing daemon never desperately needs the CPU, so why not give it a lower priority than other processes? A process that sits in the background to send mail can also wait until it's convenient.

In serious priority systems, you will still want a queue for each priority level, as multiple same-level processes may be vying for the CPU.

Notice that a priority-based system gives complete priority to higher-priority processes. That is, a second-priority job can't finish as long as there are first-priority jobs to complete. This may make sense, but it is also risky, since a first-priority job can take a very long time. If an intern is trying to run an interactive job, and the vice-president happens to run a long calculation, the intern will perceive that the system has frozen in the meantime.

Performance-based

Because of this problem with priority-based systems, they frequently have some sort of adaptation process. For example, a job's priority might be demoted to the next priority level each time it completes its time quantum. Thus the vice-president's computation job quickly reaches the same level as the intern's interactive job, at which time both jobs are running round-robin. When a process blocks, it moves back to its original priority level. (This is so that the vice-president doesn't have to continue using the spreadsheet at intern priorities just because it did a long computation earlier.)
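The demote-on-expiry, restore-on-block rule can be sketched as follows. The struct and function names are made up for illustration, with level 0 as the highest priority:

```c
/* Adaptive priority: finishing a full quantum demotes a process one
   level; blocking restores its original (base) level. */
#define NLEVELS 4

struct proc { int base_level; int level; };

void on_quantum_expired(struct proc *p) {
    if (p->level < NLEVELS - 1) p->level++;   /* demote one level */
}

void on_block(struct proc *p) {
    p->level = p->base_level;                 /* restore original priority */
}
```

So the vice-president's compute-bound job drifts downward one level per expired quantum, and pops back up the moment it blocks.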

Of course, that only works as long as the vice-president doesn't wise up to the procedure and get the program to block for a short time every 99 milliseconds, so that the program gets nearly all of its quantum and then returns at the same priority.

Even in round-robin scheduling, it may be preferable to make some modifications based on performance. Usually, it's desirable to prefer processes that frequently block on user input, as those are processes where you know somebody is sitting there waiting for a response. Processes that frequently block on other I/O devices (like the disk) should also get the CPU quickly when they want it, since the system gains overall performance from being able to complete computation while such a process waits for the I/O device.

One nicely designed system, the XDS 940, automatically classified processes into four categories: terminal, I/O, short quantum, and long quantum. When a process blocked for terminal input, it was raised to the highest priority, so that it could respond immediately to the user's input. When a process blocked for another I/O device, it was raised to the second priority level. But when a process was removed because its quantum expired, it went down to the third priority level. And if that happened several times in a row, it went to the fourth priority level.
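A sketch of that classification rule, with level 0 as the terminal class (highest) and level 3 as the long-quantum class (lowest). The names are made up, and the threshold of three consecutive expirations is an assumption, since the text only says "several":

```c
/* XDS 940-style classification: blocking promotes a process according
   to what it blocked on; repeated quantum expirations demote it. */
struct xproc { int level; int expirations; };

void blocked_on_terminal(struct xproc *p) { p->level = 0; p->expirations = 0; }
void blocked_on_io(struct xproc *p)       { p->level = 1; p->expirations = 0; }

void quantum_expired(struct xproc *p) {
    p->expirations++;
    p->level = (p->expirations >= 3) ? 3 : 2;   /* threshold is an assumption */
}
```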

Lottery

Another way of handling the intern's problem is to use a lottery-based system. Here, the vice-president's computational process might get 10 tickets, while the intern's process gets a pathetic 1. Each time the CPU is to select a process, it chooses a random ticket. Thus the intern can still go on editing a document while the vice-president is doing computation (though at a much-reduced pace, since the vice-president gets a much larger share of the CPU).
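The ticket draw can be sketched by giving each process a contiguous range of tickets; drawing ticket t selects the owner of the range containing t. A real scheduler would pick t at random (say, rand() % total); taking t as a parameter here keeps the sketch deterministic. The struct and names are illustrative:

```c
/* Lottery scheduling: a process's chance of winning each draw is
   proportional to how many tickets it holds. */
struct lproc { const char *name; int tickets; };

/* Returns the index of the process owning ticket t (0 <= t < total),
   or -1 if t is out of range. */
int holder(const struct lproc *ps, int n, int t) {
    for (int i = 0; i < n; i++) {
        if (t < ps[i].tickets) return i;
        t -= ps[i].tickets;
    }
    return -1;
}
```

With the vice-president holding tickets 0 through 9 and the intern holding ticket 10, the intern wins roughly one draw in eleven.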

This also is open to its own abuses: It gives the intern an incentive to use many tiny processes, instead of one big process. You might get around this by assigning the intern 100 tickets, evenly split among the intern's processes, while the vice-president gets 1000 tickets, evenly split among the vice-president's processes.

Differing time slices

The book doesn't discuss this possible solution to the intern's problem, but it's also a significant alternative: You can still use round-robin, but assign different quanta to different priorities. Thus, while the intern gets a quantum of 20 milliseconds, the vice-president gets a quantum of 200 milliseconds.
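In this scheme the scheduler still cycles through the ready queue; only the length of the turn varies by priority. A trivial sketch, where the specific values echo the example above and the names are made up:

```c
/* Quantum per priority level, in milliseconds; level 0 is highest. */
static const int quantum_ms[4] = { 200, 100, 50, 20 };

int quantum_for_level(int level) {
    return quantum_ms[level];
}
```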

This scheme, too, is open to abuse by the intern, who now has an incentive to divide a process into many smaller processes running in parallel.

Minix's scheme

Minix uses a combination of the priority scheme and the round-robin scheme. Recall that Minix is built in four layers.

Layers 1 through 3 all consist of processes. Layer 0 is not really process-structured - it can't be, since it's implementing the concept of process itself. Layer 0 is just an interrupt-response system.

But Layers 1 through 3 constitute separate queues at separate priorities: Layer 1 (the I/O tasks) runs at the highest priority, Layer 2 (the servers) at the second priority, and Layer 3 (the user processes) at the lowest priority.

Minix uses round-robin for Layer 3, with a quantum of 100 milliseconds. For Layers 1 and 2, however, it simply lets a process run indefinitely, until it finally blocks. It might as well do this, since these processes block frequently, and the operating system trusts them highly.