November 2011

Volume 26 Number 11

Windows with C++ - Thread Pool Synchronization

By Kenny Kerr | November 2011

I’ve said it before: Blocking operations are bad news for concurrency. Often, however, you need to wait for some resource to become available, or perhaps you’re implementing a protocol that stipulates some amount of time needs to elapse before resending a network packet. What do you do then? You could use a critical section, or call functions such as Sleep and WaitForSingleObject. Of course, that means you’ll have threads just sitting around blocking again. What you need is a way for the thread pool to wait on your behalf without affecting its concurrency limits, which I discussed in my September 2011 column. The thread pool can then queue a callback once the resource is available or the time has elapsed.

In this month’s column, I’m going to show you how you can do just that. Along with work objects, which I introduced in my August 2011 column, the thread pool API provides a number of other callback-generating objects. This month, I’m going to show you how to use wait objects.

Wait Objects

The thread pool’s wait object is used for synchronization. Rather than block on a critical section—or slim reader/writer lock—you can wait for a kernel synchronization object, commonly an event or semaphore, to become signaled.

Although you can use WaitForSingleObject and friends, a wait object integrates nicely with the rest of the thread pool API. It does this quite efficiently by grouping together any wait objects that you submit, reducing the number of required threads and the amount of code you need to write and debug. This allows you to use a thread pool environment and cleanup groups, and also frees you from having to dedicate one or more threads to wait for objects to become signaled. Due to improvements in the kernel portion of the thread pool, it can in some cases even achieve this in a threadless manner.

The CreateThreadpoolWait function creates a wait object. If the function succeeds, it returns an opaque pointer representing the wait object. If it fails, it returns a null pointer value and provides more information via the GetLastError function. Given a wait object, the CloseThreadpoolWait function informs the thread pool that the object may be released. This function doesn’t return a value, and for efficiency assumes the wait object is valid.

The unique_handle class template I introduced in my July 2011 column takes care of these details.

Here’s a traits class that can be used with unique_handle, as well as a typedef for convenience:

struct wait_traits
{
  static PTP_WAIT invalid() throw() { return nullptr; }
  static void close(PTP_WAIT value) throw() { CloseThreadpoolWait(value); }
};
typedef unique_handle<PTP_WAIT, wait_traits> wait;

I can now use the typedef and create a wait object as follows:

void * context = ...
wait w(CreateThreadpoolWait(callback, context, nullptr));

As usual, the final parameter optionally accepts a pointer to an environment so that you can associate the wait object with an environment, as I described in my September column. The first parameter is the callback function that will be queued to the thread pool once the wait completes. The wait callback is declared as follows:

void CALLBACK callback(PTP_CALLBACK_INSTANCE, void * context, PTP_WAIT, TP_WAIT_RESULT);

The callback TP_WAIT_RESULT argument is just an unsigned integer providing the reason why the wait completed. A value of WAIT_OBJECT_0 indicates that the wait was satisfied as the synchronization object became signaled. Alternatively, a value of WAIT_TIMEOUT indicates that the timeout interval elapsed before the synchronization object was signaled. How would you indicate the timeout and synchronization object to wait for? That’s the job of the surprisingly complex SetThreadpoolWait function. This function is simple enough until you try to specify a timeout. Consider this example:

handle e(CreateEvent( ... ));
SetThreadpoolWait(w.get(), e.get(), nullptr);

First, I create an event object, using the unique_handle typedef from my July column. Not surprisingly, the SetThreadpoolWait function sets the synchronization object that the wait object is to wait for. The last parameter indicates an optional timeout, but in this example, I provide a null pointer value, indicating that the thread pool should wait indefinitely.
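To make the two TP_WAIT_RESULT values described earlier concrete, here is a minimal, portable sketch of how a callback might interpret its result argument. The type and constants below are stand-ins defined locally so the snippet builds without windows.h; their values match the Windows SDK definitions of WAIT_OBJECT_0 and WAIT_TIMEOUT. The describe_wait_result helper is hypothetical and is not part of the thread pool API.

```cpp
#include <cassert>
#include <string>

// Stand-ins so the sketch compiles without <windows.h>. The values match the
// Windows SDK definitions of WAIT_OBJECT_0 and WAIT_TIMEOUT.
typedef unsigned long wait_result;             // TP_WAIT_RESULT is a DWORD
const wait_result wait_object_0 = 0x00000000;  // WAIT_OBJECT_0
const wait_result wait_timeout  = 0x00000102;  // WAIT_TIMEOUT (258)

// Hypothetical helper: turn the result passed to a wait callback into text.
std::string describe_wait_result(wait_result result)
{
  switch (result)
  {
    case wait_object_0: return "signaled";    // the object became signaled
    case wait_timeout:  return "timed out";   // the timeout elapsed first
    default:            return "unexpected";  // anything else is not expected here
  }
}
```

In a real callback, the same switch would typically decide whether to consume the resource or to take a retry path.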

The FILETIME Structure

But what about a specific timeout? That’s where it gets tricky. Functions such as WaitForSingleObject let you set a timeout value in milliseconds as an unsigned integer. The SetThreadpoolWait function, however, expects a pointer to a FILETIME structure, which presents a few challenges to the developer. The FILETIME structure is a 64-bit value representing an absolute date and time since the beginning of the year 1601 in 100-nanosecond intervals (based on Coordinated Universal Time).

To accommodate relative time intervals, SetThreadpoolWait treats the FILETIME structure as a signed 64-bit value. If a negative value is provided, it takes the magnitude of that value as a time interval relative to the current time, again in 100-nanosecond intervals. It’s worth mentioning that the relative timer stops counting when the computer is sleeping or hibernating. Absolute timeout values are obviously not affected by this. Anyway, this use of FILETIME is not convenient for either absolute or relative timeout values.

Probably the simplest approach for absolute timeouts is to fill out a SYSTEMTIME structure and then use the SystemTimeToFileTime function to prepare a FILETIME structure for you:

SYSTEMTIME st = {};
st.wYear = ...
st.wMonth = ...
st.wDay = ...
st.wHour = ...
// etc.
FILETIME ft;
check_bool(SystemTimeToFileTime(&st, &ft));
SetThreadpoolWait(w.get(), e.get(), &ft);
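If the deadline is already available as seconds since the Unix epoch rather than as calendar fields, the absolute value can also be computed arithmetically. The following is a sketch of that alternative, not the approach used in the listing above. It relies on the fact that 1601-01-01 and 1970-01-01 (UTC) are 11,644,473,600 seconds apart. The file_time struct is a local stand-in for FILETIME so the snippet builds without windows.h, and to_absolute_file_time is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the Windows FILETIME layout (two 32-bit halves), defined here
// only so the sketch compiles without <windows.h>.
struct file_time
{
  std::uint32_t low;   // corresponds to dwLowDateTime
  std::uint32_t high;  // corresponds to dwHighDateTime
};

// Seconds between 1601-01-01 and 1970-01-01 (both UTC).
const std::int64_t epoch_difference_seconds = 11644473600LL;
// Number of 100-nanosecond ticks in one second.
const std::int64_t ticks_per_second = 10000000LL;

// Hypothetical helper: convert seconds since the Unix epoch into an absolute
// FILETIME-style value. Assumes unix_seconds is not negative.
file_time to_absolute_file_time(std::int64_t unix_seconds)
{
  assert(unix_seconds >= 0);
  const std::uint64_t ticks = static_cast<std::uint64_t>(
      (unix_seconds + epoch_difference_seconds) * ticks_per_second);
  file_time ft;
  ft.low  = static_cast<std::uint32_t>(ticks & 0xFFFFFFFFu);
  ft.high = static_cast<std::uint32_t>(ticks >> 32);
  return ft;
}
```

Because the result is positive, SetThreadpoolWait would treat it as an absolute time rather than as a relative interval.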

For relative timeout values, a bit more thinking is involved. First, you need to convert some relative time into 100-nanosecond intervals, and then convert it to a negative 64-bit value. The latter is trickier than it seems. Remember that computers represent signed integers using the two’s complement system, with the effect that a negative value must have its most significant bit set high. Added to this is the fact that FILETIME actually consists of two 32-bit values, dwLowDateTime and dwHighDateTime. This means you also need to handle machine alignment properly when treating it as a 64-bit value, otherwise an alignment fault may occur. Additionally, you can’t simply store the value in the lower 32 bits, as the most significant bit lives in the upper 32 bits.
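The point about the sign bit can be checked with a small, portable sketch. It splits a signed 64-bit tick count into two 32-bit halves, mirroring the two halves of FILETIME, and shows that only the upper half carries the sign. The helper names low_half, high_half and is_relative are hypothetical and exist purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the lower 32 bits of a signed 64-bit tick count.
std::uint32_t low_half(std::int64_t ticks)
{
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(ticks) & 0xFFFFFFFFu);
}

// Hypothetical helper: the upper 32 bits of a signed 64-bit tick count.
std::uint32_t high_half(std::int64_t ticks)
{
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(ticks) >> 32);
}

// True when the most significant bit of the upper half is set, which is what
// makes the 64-bit value negative and, per the text, a relative interval.
bool is_relative(std::int64_t ticks)
{
  return (high_half(ticks) & 0x80000000u) != 0;
}
```

For a one-millisecond relative timeout (-10000 ticks), the lower half is 0xFFFFD8F0 and the upper half is 0xFFFFFFFF. Writing only the lower half and leaving the upper half at zero would produce a small positive number, which would be read as an absolute time shortly after the year 1601.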

Relative Timeout Value Conversion

It’s common to express relative timeouts in milliseconds, so let me demonstrate that conversion here. Recall that a millisecond is a thousandth of a second and a nanosecond is a billionth of a second. Another way to look at it is that a millisecond is 1,000 microseconds and a microsecond is 1,000 nanoseconds. A millisecond is then 10,000 100-nanosecond units, the unit of measurement expected by SetThreadpoolWait. There are many ways to express this, but here’s one approach that works:

DWORD milliseconds = ...
auto ft64 = -static_cast<INT64>(milliseconds) * 10000;
FILETIME ft;
memcpy(&ft, &ft64, sizeof(INT64));
SetThreadpoolWait(w.get(), e.get(), &ft);

Notice that I’m careful to cast the DWORD before the multiplication to avoid integer overflow. I also use memcpy, because a reinterpret_cast would require the FILETIME to be aligned on an 8-byte boundary. You could, of course, do that instead, but this is a bit cleaner. An even simpler approach takes advantage of the fact that the Visual C++ compiler aligns a union with the largest alignment requirement of any of its members. In fact, if you order the union members correctly, you can do it in just one line, as follows:

union FILETIME64
{
  INT64 quad;
  FILETIME ft;
};
FILETIME64 ft = { -static_cast<INT64>(milliseconds) * 10000 };
SetThreadpoolWait(w.get(), e.get(), &ft.ft);
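The reason for casting before multiplying is easy to demonstrate in portable code. The sketch below contrasts a conversion that widens to 64 bits first with one that multiplies in 32 bits and silently wraps. It also repeats the memcpy step against a local stand-in for FILETIME. The names file_time, relative_ticks, wrapped_ticks and to_relative_file_time are hypothetical, and the byte copy assumes a little-endian machine, as is the case on Windows.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-in for FILETIME so the sketch compiles without <windows.h>.
struct file_time
{
  std::uint32_t low;   // corresponds to dwLowDateTime
  std::uint32_t high;  // corresponds to dwHighDateTime
};

// Widen to 64 bits first, then negate and scale to 100-nanosecond ticks.
std::int64_t relative_ticks(std::uint32_t milliseconds)
{
  return -static_cast<std::int64_t>(milliseconds) * 10000;
}

// What goes wrong without the cast: the multiplication is carried out in
// 32 bits and wraps modulo 2^32 for timeouts above roughly 429 seconds.
std::uint32_t wrapped_ticks(std::uint32_t milliseconds)
{
  return milliseconds * 10000u;
}

// Copy the 64-bit value byte for byte, which sidesteps alignment concerns.
// On a little-endian machine the low half lands in the first member.
file_time to_relative_file_time(std::uint32_t milliseconds)
{
  const std::int64_t ticks = relative_ticks(milliseconds);
  file_time ft;
  static_assert(sizeof(ft) == sizeof(ticks), "file_time must be 64 bits wide");
  std::memcpy(&ft, &ticks, sizeof(ticks));
  return ft;
}
```

With a 500-second timeout (500,000 milliseconds), the widened conversion yields -5,000,000,000 ticks, while the 32-bit multiplication wraps around to 705,032,704.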

Enough compiler tricks. Let’s get back to the thread pool. Another thing you might be tempted to try is a zero timeout. This is commonly done using WaitForSingleObject as a way to determine whether a synchronization object is signaled without actually blocking and waiting. However, this procedure isn’t supported by the thread pool, so you’re better off sticking with WaitForSingleObject.

If you want a particular wait object to cease waiting for its synchronization object, then simply call SetThreadpoolWait with a null pointer value as its second parameter. Just watch out for the obvious race condition.

The final function related to wait objects is WaitForThreadpoolWaitCallbacks. At first, it may appear similar to the WaitForThreadpoolWorkCallbacks function used with work objects, which I introduced in my August column. Don’t let it fool you. The WaitForThreadpoolWaitCallbacks function literally does what its name suggests. It waits for any callbacks from the particular wait object.

The catch is that the wait object will only queue a callback when either the associated synchronization object is signaled or the timeout expires. Until one of those events occurs, no callbacks are queued and there’s nothing for the Wait function to wait for. The solution is to first call SetThreadpoolWait with null pointer values, telling the wait object to cease waiting, and then call WaitForThreadpoolWaitCallbacks to avoid any race conditions:

SetThreadpoolWait(w.get(), nullptr, nullptr);
WaitForThreadpoolWaitCallbacks(w.get(), TRUE);

As you might expect, the second parameter determines whether any pending callbacks that may have slipped through but have not yet begun to execute will be canceled or not. Naturally, wait objects work well with cleanup groups. You can read my October 2011 column to find out how to use cleanup groups. On larger projects, they really do help simplify a lot of the trickier cancellation and cleanup that needs to be done.


Kenny Kerr is a software craftsman with a passion for native Windows development.

Thanks to the following technical expert for reviewing this article: Pedro Teixeira