Using a barrier

When we talked about the synchronization of the main() function to the completion of the worker threads (in “Synchronizing to the termination of a thread,” above), we mentioned two methods: pthread_join(), which we've looked at, and a barrier.

Returning to our house analogy, suppose that the family wanted to take a trip somewhere. The driver gets in the minivan and starts the engine. And waits. The driver waits until all the family members have boarded, and only then does the van leave to go on the trip—we can't leave anyone behind!

This is exactly what happened with the graphics example. The main thread needs to wait until all the worker threads have completed, and only then can the next part of the program begin.

Note an important distinction, however. With pthread_join(), we're waiting for the termination of the threads. This means that the threads are no longer with us; they've exited.

With the barrier, we're waiting for a certain number of threads to rendezvous at the barrier. Then, when the requisite number are present, we unblock all of them. (Note that the threads continue to run.)

You first create a barrier with pthread_barrier_init():

#include <pthread.h>

int
pthread_barrier_init (pthread_barrier_t *barrier,
                      const pthread_barrierattr_t *attr,
                      unsigned int count);

This creates a barrier object at the passed address (pointer to the barrier object is in barrier), with the attributes as specified by attr (we'll just use NULL to get the defaults). The number of threads that must call pthread_barrier_wait() is passed in count.

Once the barrier is created, we then want each of the threads to call pthread_barrier_wait() to indicate that it has completed:

#include <pthread.h>

int
pthread_barrier_wait (pthread_barrier_t *barrier);

When a thread calls pthread_barrier_wait(), it will block until the number of threads specified initially in the pthread_barrier_init() have called pthread_barrier_wait() (and blocked too). When the correct number of threads have called pthread_barrier_wait(), all those threads will “simultaneously” unblock.

Here's an example:

/*
 *  barrier1.c
*/

#include <stdio.h>
#include <time.h>
#include <pthread.h>
#include <sys/neutrino.h>

pthread_barrier_t   barrier; // the barrier synchronization object

void *
thread1 (void *not_used)
{
    time_t  now;
    char    buf [27];

    time (&now);
    printf ("thread1 starting at %s", ctime_r (&now, buf));

    // do the computation
    // let's just do a sleep here...
    sleep (20);
    pthread_barrier_wait (&barrier);
    // after this point, all three threads have completed.
    time (&now);
    printf ("barrier in thread1() done at %s", ctime_r (&now, buf));
}

void *
thread2 (void *not_used)
{
    time_t  now;
    char    buf [27];

    time (&now);
    printf ("thread2 starting at %s", ctime_r (&now, buf));

    // do the computation
    // let's just do a sleep here...
    sleep (40);
    pthread_barrier_wait (&barrier);
    // after this point, all three threads have completed.
    time (&now);
    printf ("barrier in thread2() done at %s", ctime_r (&now, buf));
}

main () // ignore arguments
{
    time_t  now;
    char    buf [27];

    // create a barrier object with a count of 3
    pthread_barrier_init (&barrier, NULL, 3);

    // start up two threads, thread1 and thread2
    pthread_create (NULL, NULL, thread1, NULL);
    pthread_create (NULL, NULL, thread2, NULL);

    // at this point, thread1 and thread2 are running

    // now wait for completion
    time (&now);
    printf ("main () waiting for barrier at %s", ctime_r (&now, buf));
    pthread_barrier_wait (&barrier);

    // after this point, all three threads have completed.
    time (&now);
    printf ("barrier in main () done at %s", ctime_r (&now, buf));
}

The main thread created the barrier object and initialized it with a count of how many threads (including itself!) should be synchronized to the barrier before it “breaks through.” In our sample, this was a count of three: one for the main() thread, one for thread1(), and one for thread2(). Then the graphics computational threads (thread1() and thread2() in our case here) are started, as before. For illustration, instead of showing source for graphics computations, we just stuck in a sleep (20); and sleep (40); to cause a delay, as if computations were occurring. To synchronize, the main thread simply blocks itself on the barrier, knowing that the barrier will unblock only after the worker threads have joined it as well.

As mentioned earlier, with the pthread_join(), the worker threads are done and dead in order for the main thread to synchronize with them. But with the barrier, the threads are alive and well. In fact, they've just unblocked from the pthread_barrier_wait() when all have completed. The wrinkle introduced here is that you should be prepared to do something with these threads! In our graphics example, there's nothing for them to do (as we've written it). In real life, you may wish to start the next frame calculations.