Textbook: Section 2.1.3
Implementation kernel-level threads user-level threads comparison hybrid solutions Practice
An interesting issue arises when you talk about threads: How should they be implemented? The basic question is, should they be a part of the OS or not?
The simplest solution is to say: Yes, they will be. We'll just implement the required routines as system calls. Whenever the program wants to create a thread, it will generate a software interrupt requesting that the OS creates a new thread.
The OS will maintain a thread table (analogous to the process table), and it will switch between threads just as it switches between processes.
I'm calling this ``simple'' because it doesn't take a lot of creativity or new concepts - you just throw it all into the kernel.
An alternative is to place all the thread management into the lap of the process. In this situation, the OS knows nothing about threads, and the thread routines are just library functions.
The thread table will be part of the process's memory. The kernel is ignorant of threads, and so it treats every process equally regardless of threads.
The thread library will implement the thread routines. If it is to be preemptive, it will have to schedule a signal with the OS so that it can regain control while a computational thread is running. (Unix has such a system call, called alarm(), though it only works with a granularity of a second.)
Even with user-level threads, some kernel support is needed. The alarm() system call is one example. But, also, library would need some way of avoiding system calls blocking. For example, the read() system call blocks until information becomes available. You don't want all the threads blocking while this is going on - after all, being able to do something during this time is frequently a major motivation behind using threads. Luckily, industrial-strength Unix systems already include such a system call (usually select()) to test whether a file descriptor is ready.
To give the user the illusion of being able to use regular system calls, the threads library would have to define wrapper functions for any system calls that might block. For example, read() would need a wrapper function. This wrapper function might test whether the file descriptor is ready for reading; if so, it would read and immediately return. Otherwise, it would block the thread and start up into another thread.
So why go with one instead of the other?
kernel-level threads | user-level threads |
---|---|
requires more frequent switches to and from kernel mode (which takes a lot of time) | requires less such switching |
makes kernel larger and hence more error-prone | makes processes larger and hence could involve additional memory expense |
existing system calls require little overhead | handling blocking system requires wrapper functions |
can handle page faults | page fault by any thread blocks all |
programs are less portable across OSes | programs port to other OSes |
programs can't control scheduling | programs can control scheduling policy |
threads of same process can be scheduled on different processors | threads of same process cannot be scheduled on different processors |
type | time to create |
---|---|
user-level thread | 34 microseconds |
kernel-level thread | 948 microseconds |
process | 11,300 microseconds |
In practice, many operating systems opt for a hybrid approach. Windows 2000, for example, defines the concept of a thread, a kernel-level thread, and a fiber, a user-level thread. A process can have multiple threads, and a thread can have multiple fibers.
Using fibers, a program can get the efficiency of user-level threads. But by having several threads, the program can take advantage of multiple processors. Additionally, the scheme handles OS blocks by simply switching to the fibers of another thread as the OS blocks.
This scheme is certainly the most complicated you could have, but it wins out on efficiency.
Using the mutex and condition variables defined in Pthreads, give a solution to Problem 1 from Quiz 3 (the one with save_obs() and read_obs()).
My solution for Quiz 3 used three semaphores: block_test (initially 1), is_good_wait (initially 0), and latest_lock (initially 1).
My solution using the Pthreads concepts uses a single mutex and a condition variable.void save_obs(int value) { int read_obs() { down(block_test); if(!is_good) { blocked = true; up(block_test); down(is_good_wait); } else { up(block_test); } down(latest_lock); down(latest_lock); latest = value; int ret = latest; is_good = true; is_good = false; up(latest_lock); up(latest_lock); down(block_test); if(blocked) up(is_good_wait); up(block_test); return ret; } }