Session 24: Threads

Textbook: Section 2.1.3

Threads
  definition
  uses
Application
  pthreads
  mutexes
  conditions
  example

Threads

Definition

Two things distinguish processes: Each process has its own line of execution, and each process has its own set of resources. A thread is similar, but it permits different threads to use the same set of resources. You can think of a thread as being ``below'' processes - each thread runs within a process, and a process can have multiple threads.

Multiple threads running within the same process share all the same resources - most importantly, they share memory. Of course, each thread will need its own stack, but they share everything in the process's data segment - which holds all the global variables.

Uses

Threads have many advantages.

Threads allow a program to continue working even when the program has called a system call. For example, a Unix Web browser will need to call the read() system call in order to read information from the network. This call blocks for a long time, until the information is received. It would be nice if the browser could still go on about its normal work (permitting the user to scroll within other documents, for example). To do this, you could have a thread to load the page from the network, and another thread to handle the normal work. While the first thread is blocked, the second can still continue.
They enable the programmer to deal more easily with situations where the program should be doing many things simultaneously. For example, good Web servers will handle several page requests simultaneously, since handling a page request frequently involves large network delays, and more requests will come during that time. It's a huge pain to try to program this within the same line of execution (trust me - I've done it). It's much easier if you can somehow program each request separately.
Threads are ``lightweight'' - that is, the OS can create them efficiently, and communication between them (via the shared data segment) is very efficient. This makes them superior to processes in many situations.

Ok, that's just three advantages. But it's enough.

Application

Historically, threads are relatively modern. The OS textbook I used as an undergraduate, published 1991, spends only one page on threads. But today, threads are de rigeur in modern computers. Solaris, Linux, and Windows 2000 are just a few of the more prominent OSes that have some support for threads. (It doesn't have to be supported by the OS, but we'll look at that later.)

Pthreads

We're going to look at the POSIX version of threading, named Pthreads (no doubt in honor of Psmith, one of Wodehouse's favorite characters before he discovered Jeeves). Both Linux and Solaris (and many other Unices) implement this.

Pthreads defines several routines for managing threads. The most important among them are doubtless the one for creating a thread, pthread_create. It takes four arguments:

a pointer to a pthread_t, which is simply a thread ID number. The created thread's ID is placed into the pthread_t pointed to by the argument.
configuration parameters to control how the thread is created. Just use NULL here.
a function pointer, to a function that takes a void* as a parameter and returns a void*. The new thread will start up in this function.
a void*, which is the parameter that should be passed into the function in the new thread.

After creating the new thread and setting it off on its way, the thread that called pthread_create simply continues. The pthread_create function returns an integer code, specifying whether an error occurred. (A 0 indicates no error.)

The second most important Pthread function is pthread_join, which takes two parameters - a pthread_t and a pointer to a void*. This function waits for the thread mentioned in the first parameter, and the value it returns is placed into the void* location pointed to by the second pointer parameter.

Example

Here's an example that counts the primes less than a million by generating 10 new threads - one to handle each block of 100000 numbers. For the moment, just look at main(): The loop in lines 36-45 spawns off the 10 threads, setting each off into the run() function. The loop in lines 46-53 waits for each of these threads to complete.

(This is a somewhat contrived example. It's easier to write this program with a single thread; on a single-processor computer, the single-threaded version would be slower. There is a redeeming possibility, though: On a multiprocessor computer, however, the operating system might schedule different threads on different processors.)

For the moment, ignore the lines that talk about mutexes.

  1  // to compile: g++ count_primes.C -lpthread
  2  #include <stdio.h>
  3  #include <iostream.h>
  4  #include <errno.h>
  5  #include <pthread.h>
    
  6  const int NUM_THREADS = 10;
  7  const int BLOCK_SIZE = 100000;
    
  8  // Returns true if parameter n is a prime number
  9  bool isprime(int n) {
 10      if(n % 2 == 0) return false;
 11      for(int i = 3; i * i <= n; i += 2) {
 12          if(n % i == 0) return false;
 13      }
 14      return true;
 15  }
    
 16  pthread_mutex_t total_mutex = PTHREAD_MUTEX_INITIALIZER;
 17      // for getting a lock before adding to total
 18  int total = 0; // maintains the total number of primes found
    
 19  // routine for defining what a thread does
 20  void* run(void *data_p) {
 21      // determine the range in which we are to search
 22      int first = *((int*) data_p);
 23      int last = first + BLOCK_SIZE;
    
 24      // count up the primes
 25      int count = 0;
 26      for(int n = first; n < last; n++) {
 27          if(isprime(n)) ++count;
 28      }
    
 29      // add count into the global total variable
 30      pthread_mutex_lock(&total_mutex);
 31      total += count;
 32      pthread_mutex_unlock(&total_mutex);
 33      return NULL;
 34  }
    
 35  int main() {
 36      // spawn off threads to handle various blocks
 37      pthread_t threads[NUM_THREADS];
 38      int start[NUM_THREADS];
 39      for(int i = 0; i < NUM_THREADS; i++) {
 40          start[i] = i * BLOCK_SIZE;
 41          int err = pthread_create(&(threads[i]), NULL, run, &(start[i]));
 42          if(err != 0) {
 43              cerr << "pthread_create: " << err << endl;
 44          }
 45      }
    
 46      // wait for all threads to terminate
 47      for(int i = 0; i < NUM_THREADS; i++) {
 48          void *data_p; // just a dummy parameter for pthread_join
 49          int err = pthread_join(threads[i], &data_p);
 50          if(err != 0) {
 51              cerr << "pthread_join: " << err << endl;
 52          }
 53      }
    
 54      // print total
 55      cout << total << endl;
 56  }

Notice that the routine tells each thread which block it should work on by passing it an integer. But, since run() has to take a void* as a parameter, it actually passes into run() a pointer to an integer (the last argument of line 41). The first thing that run() does is to convert that void* parameter into an int*, and then dereference it to get an integer (line 22).

It's significant that, in line 38, a complete array of integers is created to hold these parameters. Suppose we were to instead make a single int variable named start, and use that instead?

 38      int start;
 39      for(int i = 0; i < NUM_THREADS; i++) {
 40          start = i * BLOCK_SIZE;
 41          int err = pthread_create(&(threads[i]), NULL, run, &start);

This would be problematic, because we don't know the order of execution. The main thread may create the first thread, but there's no guarantee that this is when the first thread would begin working. When the main thread continues, it changes the start variable. If the first thread begins working after this point, then it gets the wrong start value in line 22.

Mutexes

The above program has a global variable total that all the processes see. Each of the computing threads computes how many primes it finds in its block, and then it adds the total into total (line 31). The main thread, after ensuring that all the threads have completed, prints the value of this out.

There is a possibility of a collision here: Thread A might load the total value from memory, then thread B might load the total value from memory, then A will add its count into its loaded value and store it into total, then B will add its count into its loaded value and store it into total. The problem is that B added to the old value, and as a result A's computation was for naught!

This is a critical section. Pthreads gives a way of handling critical sections, called the mutex, which behaves essentially as a lock. In line 16, we declare a shared pthread_mutex_t variable to represent a mutex. Then, in line 30, we obtain a lock on the mutex. And, in line 32, we release the lock. This prevents A and B from both being at line 31 simultaneously.

Conditions

Pthreads also provides a concept called the condition variable. The idea is that a thread can block on the condition variable, and it will remain blocked until another thread signals the condition variable to wake up somebody waiting on it.

Sending a signal is done with the pthread_cond_signal function, which takes a pthread_cond_t* as a parameter. It will awake some thread that is waiting on that condition variable. If there is no such thread, nothing happens. (The signals aren't queued.)

This last parenthesized point is important: Signals aren't queued. In other respects, condition variables are similar to semaphores - sending a signal is like an UP, and waiting on the variable is like a DOWN. But, since signals are lost in the ether if nobody is waiting on the condition variable, they are crucially different.

To block on the condition variable, there is the pthread_cond_wait function, which takes a pthread_cond_t* to designate the condition variable, and a pthread_mutex_t* to designate a mutex variable. The thread should already be holding a lock on the mutex variable. The pthread_cond_wait function will release the mutex and wait on the condition variable. When it finally gets a signal (which would be indefinitely later), it reacquires the mutex lock and continues.

What's the point of the mutex? It's important that the thread blocking be sure the thing it's waiting on is not true. Thread A may check whether something is true, and if not then it would wait on a condition variable for another thread to signal that it's now true. But thread B may make the condition true and then signal in between the point where A checks the condition and the point where A waits on the condition variable. If so, then A has missed the signal.

You can use condition variables for any problem where we've been using semaphores earlier. For example, here's how I'd handle the baboon problem using condition variables. (The pthread_cond_broadcast function simply signals to everbody waiting on the signal, instead of just one thread.)

lock_east() {
    pthread_mutex_lock(&rope_check);
    while(west_count > 0) {
        pthread_cond_wait(&rope_empty, &rope_check);
    }
    ++east_count;
    pthread_mutex_unlock(&rope_check);
}

unlock_east() {
    pthread_mutex_lock(&rope_check);
    --east_count;
    if(east_count == 0) {
        pthread_cond_broadcast(&rope_empty);
    }
    pthread_mutex_unlock(&rope_check);
}

lock_west() {
    pthread_mutex_lock(&rope_check);
    while(east_count > 0) {
        pthread_cond_wait(&rope_empty, &rope_check);
    }
    ++west_count;
    pthread_mutex_unlock(&rope_check);
}

unlock_east() {
    pthread_mutex_lock(&rope_check);
    --west_count;
    if(east_count == 0) {
        pthread_cond_broadcast(&rope_empty);
    }
    pthread_mutex_unlock(&rope_check);
}