February 2012

Volume 27 Number 02

Asynchronous Programming - Asynchronous Programming in C++ Using PPL

By Artur Laksberg | February 2012

Hollywood casting directors are often said to brush off aspiring performers with a dismissive “don’t call us; we’ll call you.” For developers, however, that phrase describes the way many software frameworks work—instead of letting the programmer drive the flow of control for the whole application, the framework controls the environment and invokes callbacks or event handlers provided by the programmer.

In asynchronous systems, this paradigm lets you decouple the start of the asynchronous operation from its completion. The programmer initiates the operation and then registers a callback that will be invoked when the results are available. Not having to wait for completion means you can do useful work while the operation is in progress—service the message loop or start other asynchronous operations, for example. The “frosted window,” the “spinning donut” and other such phenomena will become relics of the past if you follow this pattern rigorously for all potentially blocking operations. Your apps will become—you’ve heard this one before—fast and fluid.

In Windows 8, asynchronous operations are ubiquitous, and WinRT offers a new programming model for dealing with asynchrony in a consistent way.

Figure 1 demonstrates the basic pattern of working with asynchronous operations. In the code, a C++ function reads a string from a file.

Figure 1 Reading from a File

template<typename Callback>
void ReadString(String^ fileName, Callback func)
{
  StorageFolder^ item = KnownFolders::PicturesLibrary;

  auto getFileOp = item->GetFileAsync(fileName);
  getFileOp->Completed = ref new AsyncOperationCompletedHandler<StorageFile^>
    ([=](IAsyncOperation<StorageFile^>^ operation, AsyncStatus status)
  {
    auto storageFile = operation->GetResults();
    auto openOp = storageFile->OpenAsync(FileAccessMode::Read);
    openOp->Completed = 
      ref new AsyncOperationCompletedHandler<IRandomAccessStream^>
      ([=](IAsyncOperation<IRandomAccessStream^>^ operation, AsyncStatus status)
    {
      auto istream = operation->GetResults();
      auto reader = ref new DataReader(istream);
      auto loadOp = reader->LoadAsync(istream->Size);
      loadOp->Completed = ref new AsyncOperationCompletedHandler<UINT>
        ([=](IAsyncOperation<UINT>^ operation, AsyncStatus status)
      {
        auto bytesRead = operation->GetResults();
        auto str = reader->ReadString(bytesRead);
        func(str);
      });
    });
  });
}

The first thing to notice is that the return type of ReadString is void. That’s right: The function doesn’t return a value; instead it takes a user-provided callback and invokes it when the result is available. Welcome to the world of asynchronous programming—don’t call us; we’ll call you!

The Anatomy of a WinRT Asynchronous Operation

At the heart of the asynchrony in WinRT are the four interfaces defined in the Windows::Foundation namespace: IAsyncOperation, IAsyncAction, IAsyncOperationWithProgress and IAsyncActionWithProgress. All potentially blocking or long-running operations in WinRT are defined as asynchronous. By convention, the name of the method ends with “Async” and the return type is one of the four interfaces. Such is the method GetFileAsync in the example in Figure 1, returning an IAsyncOperation<StorageFile^>. Many asynchronous operations do not return a value and their type is IAsyncAction. The operations that can report progress are exposed via IAsyncOperationWithProgress and IAsyncActionWithProgress.

To specify the completion callback for an asynchronous operation, you set the Completed property. This property is a delegate that takes the asynchronous interface and the status of the completion. Though the delegate can be instantiated with a function pointer, most often you’d use a lambda (I expect that by now you’re familiar with this part of C++11).

To get the value of the operation, you call the GetResults method on the interface. Notice that though this is the same interface returned to you from the GetFileAsync call, you can only call GetResults on it when you’re within the completion handler.

The second parameter to the completion delegate is AsyncStatus, which reports the status of the operation. In a real-world application, you’d check its value before calling GetResults. In Figure 1, I omitted this part for brevity.
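To make the status check concrete, here is a minimal sketch in portable standard C++ rather than C++/CX. ToyStatus and ToyOperation are hypothetical stand-ins that only mirror the shape of AsyncStatus (Started, Completed, Canceled, Error) and of an operation with a GetResults method; they are not the WinRT types.

```cpp
#include <string>

// Stand-in for Windows::Foundation::AsyncStatus (same four states).
enum class ToyStatus { Started, Completed, Canceled, Error };

// Stand-in for an asynchronous operation that has already finished.
struct ToyOperation {
    std::string result;
    std::string get_results() const { return result; }
};

// A completion handler that looks at the status before touching the result.
std::string describe_completion(const ToyOperation& op, ToyStatus status) {
    switch (status) {
    case ToyStatus::Completed: return "read: " + op.get_results();
    case ToyStatus::Canceled:  return "canceled";
    case ToyStatus::Error:     return "failed";
    default:                   return "still running";
    }
}
```

In a real handler, the Canceled and Error branches are where you would report the failure to the caller instead of invoking the success callback.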

Very often, you’ll find yourself using multiple asynchronous operations together. In my example, I first get an instance of StorageFile (by calling GetFileAsync), then open it using OpenAsync to get an IRandomAccessStream. Next, I load the data (LoadAsync) and read it using the DataReader. Finally, I get the string and call the user-provided callback func.

Composition

Separating the start of the operation from the completion is essential for eliminating blocking calls. The problem is, composing multiple callback-based asynchronous operations is hard, and the resulting code is difficult to reason about and debug. Something has to be done to rein in the ensuing “callback soup.”

Let’s consider a concrete example. I want to use the ReadString function from the previous sample to read from two files sequentially and concatenate the result into a single string. I’m going to again implement it as a function taking a callback:

template<typename Callback>
void ConcatFiles1(String^ file1, String^ file2, Callback func)
{
  ReadString(file1, [func](String^ str1) {
    ReadString(file2, [func, str1](String^ str2) {
      func(str1+str2);
    });
  });
}

Not too bad, right?

If you don’t see a flaw in this solution, though, think about this: When will you start reading from file2? Do you really need to finish reading the first file before you start reading the second one? Of course not! It’s far better to start multiple asynchronous operations eagerly and deal with the data as it comes in.

Let’s give it a try. First, because I start two operations concurrently and return from the function before the operations complete, I’ll need a special heap-allocated object to hold the intermediate results. I call it the ResultHolder:

ref struct ResultHolder
{
  String^ str;
};

As Figure 2 shows, the first operation to succeed will set the results->str member. The second operation to complete will use that to form the final result.

Figure 2 Reading from Two Files Concurrently

template<typename Callback>
void ConcatFiles(String^ file1, String^ file2, Callback func)
{
  auto results = ref new ResultHolder();

  ReadString(file1, [=](String^ str) {
    if(results->str != nullptr) { // Beware of the race condition!
      func(str + results->str);
    }
    else{
      results->str = str;
    }
  });

  ReadString(file2, [=](String^ str) {
    if(results->str != nullptr) { // Beware of the race condition!
      func(results->str + str);
    }
    else{
      results->str = str;
    }
  }); 
}

This will work … most of the time. The code has an obvious race condition, and it doesn’t handle errors, so we still have a long way to go. For something as simple as joining two operations, that’s an awful lot of code—and it’s tricky to get right.
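One way to close the race in a hand-written join is to guard the shared state with a lock and to track arrival with explicit flags instead of a null check. The following is a sketch in portable standard C++, not C++/CX; TwoWayJoin is a hypothetical helper, and it assumes each setter is called exactly once.

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical helper (not part of PPL or WinRT): collects two results that
// may arrive in either order, possibly on different threads, and invokes the
// callback once with the values concatenated in a fixed order.
class TwoWayJoin {
public:
    explicit TwoWayJoin(std::function<void(const std::string&)> done)
        : done_(std::move(done)), has_first_(false), has_second_(false) {}

    void set_first(std::string value)  { set(first_, has_first_, std::move(value)); }
    void set_second(std::string value) { set(second_, has_second_, std::move(value)); }

private:
    void set(std::string& slot, bool& flag, std::string value) {
        bool ready = false;
        std::string joined;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            slot = std::move(value);
            flag = true;
            // Only the caller that stores the second value sees both flags set.
            ready = has_first_ && has_second_;
            if (ready) joined = first_ + second_;
        }
        // Call user code outside the lock so it cannot deadlock on mutex_.
        if (ready) done_(joined);
    }

    std::function<void(const std::string&)> done_;
    std::mutex mutex_;
    std::string first_, second_;
    bool has_first_, has_second_;
};
```

Because arrival is tracked by flags, an empty file no longer looks like a result that has not arrived yet. In asynchronous use the join object would be heap-allocated and shared by both callbacks, much like the ResultHolder above.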

Tasks in the Parallel Patterns Library

The Visual Studio Parallel Patterns Library (PPL) is designed to make writing parallel and asynchronous programs in C++ easy and productive. Instead of operating at the level of threads and thread pools, users of PPL get to use higher-level abstractions such as tasks, parallel algorithms such as parallel_for and parallel_sort, and concurrency-friendly containers such as concurrent_vector.

New in the next release of Visual Studio, the task class of the PPL allows you to succinctly represent an individual unit of work to be executed asynchronously. It allows you to express your program logic in terms of independent (or interdependent) tasks and let the runtime take care of scheduling these tasks in the optimal manner.

What makes tasks so useful is their composability. In its simplest form, two tasks can be composed sequentially by declaring one task to be a continuation of another. This seemingly trivial construct enables you to combine multiple tasks in interesting ways. Many higher-level PPL constructs such as join and choice (which I’ll talk about in a moment) are themselves built using this concept. Task continuations can also be used to represent completions of asynchronous operations in a more concise way. Let’s revisit the sample from Figure 1 and now write it using PPL tasks, as shown in Figure 3.

Figure 3 Reading from Files Using Nested PPL Tasks

task<String^> ReadStringTask(String^ fileName)
{
  StorageFolder^ item = KnownFolders::PicturesLibrary;
  task<StorageFile^> getFileTask(item->GetFileAsync(fileName));
  return getFileTask.then([](StorageFile^ storageFile) {
    task<IRandomAccessStream^> openTask(storageFile->OpenAsync(
      FileAccessMode::Read));
    return openTask.then([](IRandomAccessStream^ istream) {
      auto reader = ref new DataReader(istream);
      task<UINT> loadTask(reader->LoadAsync(istream->Size));
      return loadTask.then([reader](UINT bytesRead) {
        return reader->ReadString(bytesRead);
      });
    });
  });
}

Because I’m now using tasks instead of callbacks to represent asynchrony, the user-provided callback is gone. This incarnation of the function returns a task instead.

In the implementation, I created the getFileTask task from the asynchronous operation returned by GetFileAsync. I then set up the completion of that operation as a continuation of the task (the then method).

The then method deserves a closer look. The parameter to the method is a lambda expression. Actually, it could also be a function pointer, a function object, or an instance of std::function—but because lambda expressions are ubiquitous in PPL (and indeed in modern C++), from here on I’ll just say “the lambda” whenever I mean any type of a callable object.

The return type of the then method is a task of some type T. This type T is determined by the return type of the lambda passed to then. In its basic form, when the lambda returns an expression of type T, the then method returns a task<T>. For example, the lambda in the following continuation returns an int; therefore, the resulting type is a task<int>:

task<int> myTask = someOtherTask.then([]() { return 42; });

The type of the continuation used in Figure 3 is slightly different. It returns a task and performs the asynchronous unwrapping of that task so that the resulting type is not a task<task<int>> but a task<int>:

task<int> myTask = someOtherTask.then([]() {
  task<int> innerTask([]() {
    return 42; 
  });
  return innerTask;
});

If all this feels a bit dense, don’t let that slow you down. I promise after a few more motivating examples it will make more sense.
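The two typing rules for then can be modeled in a few lines of portable standard C++. This is only a sketch of the type mapping: ToyTask, Unwrap and ResultOf are hypothetical names, the toy task already holds its value and runs continuations immediately, and nothing here reflects how PPL implements its task class.

```cpp
#include <type_traits>
#include <utility>

template <typename T> class ToyTask;

// Maps a continuation's return type R to the task type that then() yields.
template <typename R>
struct Unwrap {
    typedef ToyTask<R> task_type;
    static task_type wrap(R value) { return task_type(std::move(value)); }
};

// A continuation that already returns a task is not wrapped a second time.
template <typename U>
struct Unwrap<ToyTask<U>> {
    typedef ToyTask<U> task_type;
    static task_type wrap(ToyTask<U> inner) { return inner; }
};

// Return type of calling continuation F with the antecedent's value.
template <typename F, typename T>
using ResultOf = decltype(std::declval<F&>()(std::declval<const T&>()));

// Toy stand-in for a task whose value is already available.
template <typename T>
class ToyTask {
public:
    explicit ToyTask(T value) : value_(std::move(value)) {}
    const T& get() const { return value_; }

    template <typename F>
    typename Unwrap<ResultOf<F, T>>::task_type then(F f) const {
        return Unwrap<ResultOf<F, T>>::wrap(f(value_));
    }

private:
    T value_;
};

// Compile-time statement of the two rules.
static_assert(std::is_same<Unwrap<double>::task_type, ToyTask<double>>::value,
              "a plain result is wrapped in a task");
static_assert(std::is_same<Unwrap<ToyTask<int>>::task_type, ToyTask<int>>::value,
              "a task result is unwrapped");
```

With this model, a continuation returning a double yields a ToyTask<double>, while a continuation returning a ToyTask<int> yields a ToyTask<int> rather than a task of a task.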

Task Composition

Armed with what was covered in the previous section, let’s continue to build on the file-reading example.

Recall that in C++ all local variables residing in functions and lambdas are lost on returning. To keep the state around, you must manually copy the variables into the heap or some other long-lived storage. That’s the reason I created the holder class earlier. In lambdas that run asynchronously, you need to be careful not to capture any state from the enclosing function by pointer or reference; otherwise, when the function finishes, you’ll end up with a pointer to an invalid memory location.
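The lifetime rule can be seen in a small portable example. This is a sketch in standard C++ with a hypothetical make_appender function: the callback it returns outlives the function that created it, so the state it needs lives on the heap and is captured by value through a shared_ptr.

```cpp
#include <functional>
#include <memory>
#include <string>

// Hypothetical example: returns a callback that keeps appending to its own
// private string long after make_appender() has returned.
std::function<std::string(const std::string&)> make_appender(const std::string& prefix) {
    // Heap-allocated, reference-counted state, similar in spirit to a ref class.
    auto state = std::make_shared<std::string>(prefix);

    // Capturing [&prefix] or [&state] would leave a dangling reference once
    // this function returns. Capturing the shared_ptr by value keeps the
    // string alive for as long as any copy of the lambda exists.
    return [state](const std::string& piece) -> std::string {
        *state += piece;
        return *state;
    };
}
```

Calling the returned callback with "b" and then "c" after constructing it with "a" produces "ab" and then "abc", because every call sees the same heap-allocated string.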

I will capitalize on the fact that the then method performs the unwrapping on the asynchronous interfaces, and rewrite our sample in a more succinct form—albeit at the cost of introducing another holder struct, shown in Figure 4.

Figure 4 Chaining Multiple Tasks

ref struct Holder
{
  IDataReader^ Reader;
};
task<String^> ReadStringTask(String^ fileName)
{
  StorageFolder^ item = KnownFolders::PicturesLibrary;

  auto holder = ref new Holder();

  task<StorageFile^> getFileTask(item->GetFileAsync(fileName));
  return getFileTask.then([](StorageFile^ storageFile) {
    return storageFile->OpenAsync(FileAccessMode::Read);
  }).then([holder](IRandomAccessStream^ istream) {
    holder->Reader = ref new DataReader(istream);
    return holder->Reader->LoadAsync(istream->Size);
  }).then([holder](UINT bytesRead) {
    return holder->Reader->ReadString(bytesRead);
  });
}

Compared with the sample in Figure 3, this code is easier to read because it resembles sequential steps as opposed to a “staircase” of nested operations.

In addition to the then method, PPL has several other compositional constructs. One is the join operation, implemented by the when_all method. The when_all method takes a sequence of tasks and returns the resulting task, which collects the output of all the constituent tasks into an std::vector. For the common case of two arguments, PPL has a convenient shorthand: the operator &&.

This is how I used the join operator to re-implement the file concatenation method:

task<String^> ConcatFiles(String^ file1, String^ file2)
{
  auto strings_task = ReadStringTask(file1) && ReadStringTask(file2);
  return strings_task.then([](std::vector<String^> strings) {
    return strings[0] + strings[1];
  });
}

The choice operation is also useful. Given a series of tasks, choice (implemented by the when_any method) completes when the first task in the sequence completes. Like join, choice has a two-argument shorthand in the form of the operator ||.

Choice is handy in scenarios such as redundant or speculative execution; you launch several tasks and the first one to complete delivers the required result. You could also add a timeout to an operation—start with an operation that returns a task and combine it with a task that sleeps for a given amount of time. If the sleeping task completes first, your operation has timed out and can therefore be discarded or canceled.
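The "first one wins" rule behind choice can be sketched in portable standard C++ with a single atomic flag. Choice below is a hypothetical class and is not how when_any is implemented; it only shows that later completions are ignored once a winner has been delivered.

```cpp
#include <atomic>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: several competing operations report to one Choice.
// Only the first report is delivered; all later reports are dropped.
class Choice {
public:
    explicit Choice(std::function<void(const std::string&)> on_first)
        : on_first_(std::move(on_first)), decided_(false) {}

    // Returns true if this call won the race and its value was delivered.
    bool complete(const std::string& value) {
        bool expected = false;
        // Atomically flip the flag from false to true; exactly one caller succeeds.
        if (!decided_.compare_exchange_strong(expected, true)) return false;
        on_first_(value);
        return true;
    }

private:
    std::function<void(const std::string&)> on_first_;
    std::atomic<bool> decided_;
};
```

For a timeout, the real operation and a timer would both call complete on the same object. If the timer gets there first, the operation's late result is discarded.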

PPL has another construct that helps with composability of tasks—the task_completion_event, which you can use for interoperability of tasks and non-PPL code. A task_completion_event can be passed to a thread or to an I/O completion callback that’s expected to eventually set it. A task created from the task_completion_event will be completed once the task_completion_event is set.

Authoring Asynchronous Operations with PPL

Whenever you need to extract the last ounce of performance from your hardware, C++ is the language of choice. Other languages have their place in Windows 8: The JavaScript/HTML5 combo is great for writing GUIs; C# offers a productive developer experience; and so on. To write a Metro style app, use what works for you; use what you know. In fact, you can use many languages in the same app.

Often, you’ll find yourself writing the front-end of the application in a language like JavaScript or C#, and the back-end component in C++ for maximum performance. If the operation exported by your C++ component is either compute-bound or I/O-bound, it’s a good idea to define it as an asynchronous operation.

To implement the four WinRT asynchronous interfaces mentioned earlier (IAsyncOperation, IAsyncAction, IAsyncOperationWithProgress and IAsyncActionWithProgress), PPL defines the create_async method and the progress_reporter class, both in the concurrency namespace.

In its simplest form, create_async takes a lambda or a function pointer that returns a value. The type of the lambda determines the type of the interface returned from create_async.

Given a lambda with no parameters that returns a non-void type T, create_async returns an implementation of IAsyncOperation<T>. For a lambda returning void, the resulting interface is IAsyncAction.

The lambda can take a parameter of type progress_reporter<P>. An instance of this type is used to post progress reports of type P back to the caller. For example, a lambda taking a progress_reporter<int> can report the percentage of completion as an integer value. The return type of the lambda in this case determines whether the resulting interface is IAsyncOperationWithProgress<T,P> or IAsyncActionWithProgress<P>. See Figure 5.

Figure 5 Authoring Asynchronous Operations in PPL

IAsyncOperation<float>^ operation = create_async([]() {
  return 42.0f;
});

IAsyncAction^ action = create_async([]() {
    // Do something, return nothing
});

IAsyncOperationWithProgress<float,int>^ operation_with_progress = 
  create_async([](progress_reporter<int> reporter) {
    for(int percent=0; percent<100; percent++) {
      reporter.report(percent);
    }
    return 42.0f;
  });

IAsyncActionWithProgress<int>^ action_with_progress = 
  create_async([](progress_reporter<int> reporter) {
    for(int percent=0; percent<100; percent++) {
      reporter.report(percent);
    }
  });

To expose an asynchronous operation to other WinRT languages, define a public ref class in your C++ component and have a function that returns one of the four asynchronous interfaces. You’ll find a concrete example of a hybrid C++/JavaScript application in the PPL Sample Pack (to get it, search online for “Asynchrony with PPL”). Here’s a snippet that exposes the image transformation routine as an asynchronous action with progress:

public ref class ImageTransformer sealed
{
public:
  //
  // Expose image transformation as an asynchronous action with progress
  //
  IAsyncActionWithProgress<int>^ GetTransformImageAsync(String^ inFile, String^ outFile);
};

As Figure 6 shows, the client part of the application is implemented in JavaScript using the promise object.

Figure 6 Consuming the Image Transformation Routine in JavaScript

var transformer = new ImageCartoonizerBackend.ImageTransformer();
...
transformer.getTransformImageAsync(copiedFile.path, dstImgPath).then(
  function () {
    // Handle completion…
  },
  function (error) {
    // Handle error…
  },
  function (progressPercent) {
    // Handle progress:
    UpdateProgress(progressPercent);
  }
);

Error Handling and Cancellation

Attentive readers might have noticed that this treatise on asynchrony so far completely lacks any notion of error handling and cancellation. This subject can be neglected no longer!

Inevitably, the file-reading routine will be presented with a file that doesn’t exist or can’t be opened for one reason or another. The dictionary-lookup function will encounter a word it doesn’t know. The image transformation won’t produce a result fast enough and will be canceled by the user. In these scenarios, an operation terminates prematurely, before its intended completion.

In modern C++, exceptions are used to indicate errors or other exceptional conditions. Exceptions work wonderfully within a single thread—when an exception is thrown, the stack is unwound until the appropriate catch block down the call stack is encountered. Things get messy when concurrency is thrown into the mix, because an exception originating from one thread can’t be easily caught in another thread.

Consider what happens with tasks and continuations: when the body of a task throws an exception, its flow of execution is interrupted and it can’t produce a value. If there’s no value that can be passed to the continuation, the continuation can’t run. Even for void tasks that yield no value, you need to be able to tell whether the antecedent task has completed successfully.

That’s why there’s an alternative form of continuation: For a task of type T, the lambda of the error-handling continuation takes a task<T>. To get the value produced by the antecedent task, you must call the get method on the parameter task. If the antecedent task completes successfully, so will the get method. Otherwise, get will throw an exception.

I want to emphasize an important point here. For any task in PPL, including a task created from an asynchronous operation, it is syntactically valid to call get on it. However, before the result is available, get would have to block the calling thread, and of course that would fly in the face of our “fast and fluid” mantra. Therefore, calling get on a task is discouraged in general and prohibited in an STA (the runtime will throw an “invalid operation” exception). The only time you can call get is when you’ve got the task as a parameter to a continuation. Figure 7 shows an example.

Figure 7 Error-handling Continuation

task<image> take_picture([]() {
  if (!init_camera())
    throw std::exception("can't init camera");
  return get_image();
});

take_picture.then([](task<image> antecedent) {
  try
  {
    image img = antecedent.get();
  }
  catch (const std::exception& ex)
  {
    // Handle exception here
  }
});

Every continuation in your program can be an error-handling one, and you may choose to handle exceptions in every continuation. However, in a program composed of multiple tasks, handling exceptions in every continuation can be overkill. Fortunately, this doesn’t have to happen. Similar to unhandled exceptions working their way down the call stack until the frame where they’re caught, exceptions thrown by tasks can “trickle down” to the next continuation in the chain to the point where they are eventually handled. And handled they must be, for if an exception remains unhandled past the lifetime of the tasks that could have handled it, the runtime throws the “unobserved exception” exception.
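The "trickle down" behavior can be modeled in portable standard C++ with std::exception_ptr. This is a simplified sketch, not PPL: Outcome is a hypothetical stand-in for a task that has already finished, continuations run immediately, every step keeps the same type T, and T is assumed to be default-constructible.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// Toy stand-in for a completed task: holds a value or a captured exception.
template <typename T>
class Outcome {
public:
    static Outcome success(T value) { Outcome o; o.value_ = std::move(value); return o; }
    static Outcome failure(std::exception_ptr error) { Outcome o; o.error_ = error; return o; }

    // Like task::get: returns the value or rethrows the stored exception.
    const T& get() const {
        if (error_) std::rethrow_exception(error_);
        return value_;
    }

    // Value-based continuation: skipped if the antecedent failed, in which
    // case the stored exception is passed along unchanged.
    template <typename F>
    Outcome then_value(F f) const {
        if (error_) return *this;
        try { return success(f(value_)); }
        catch (...) { return failure(std::current_exception()); }
    }

    // Task-based continuation: always runs and receives the antecedent itself.
    template <typename F>
    Outcome then_task(F f) const {
        try { return success(f(*this)); }
        catch (...) { return failure(std::current_exception()); }
    }

private:
    Outcome() : value_(), error_() {}
    T value_;
    std::exception_ptr error_;
};

// The first step throws, the second is skipped, the third observes the error.
std::string read_with_recovery() {
    return Outcome<std::string>::success("a")
        .then_value([](const std::string&) -> std::string {
            throw std::runtime_error("open failed");
        })
        .then_value([](const std::string& s) { return s + "b"; })
        .then_task([](const Outcome<std::string>& antecedent) -> std::string {
            try {
                return antecedent.get();
            } catch (const std::runtime_error& e) {
                return std::string("recovered: ") + e.what();
            }
        })
        .get();
}
```

The middle continuation never runs, and the exception surfaces only in the last step, which is the one place that calls get inside a try block.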

Let’s now return to our file-reading example and augment it with error handling. All the exceptions thrown by WinRT are of type Platform::Exception, so this is what I’m going to catch in my last continuation, as shown in Figure 8.

Figure 8 Read String from File with Error Handling

task<String^> ReadStringTaskWithErrorHandling(String^ fileName)
{
  StorageFolder^ item = KnownFolders::PicturesLibrary;

  auto holder = ref new Holder();

  task<StorageFile^> getFileTask(item->GetFileAsync(fileName));
  return getFileTask.then([](StorageFile^ storageFile) {
    return storageFile->OpenAsync(FileAccessMode::Read);
  }).then([holder](IRandomAccessStream^ istream) {
    holder->Reader = ref new DataReader(istream);
    return holder->Reader->LoadAsync(istream->Size);
  }).then([holder](task<UINT> bytesReadTask) {
    try
    {
      UINT bytesRead = bytesReadTask.get();
      return holder->Reader->ReadString(bytesRead);
    }
    catch (Exception^ ex)
    {
      String^ result = ""; // return empty string
      return result;
    }
  });
}

Once the exception has been caught in a continuation, it’s considered “handled,” and the continuation returns a task that completes successfully. So, in Figure 8, the caller of ReadStringTaskWithErrorHandling will have no way of knowing whether the file reading completed successfully. The point I’m trying to make here is that handling exceptions too early isn’t always a good thing.

Cancellation is another form of premature termination of a task. In WinRT, as in the PPL, cancellation requires the cooperation of two parties—the client of the operation and the operation itself. Their roles are distinct: The client requests the cancellation, and the operation acknowledges the request—or not. Because of a natural race between the client and the operation, the cancellation request isn’t guaranteed to succeed.

In PPL, these two roles are represented by two types, the cancellation_token_source and the cancellation_token. An instance of the former is used to request the cancellation by calling the cancel method on it. An instance of the latter is obtained from the cancellation_token_source and passed as the last parameter to the task constructor or to the then method, or as a parameter to the lambda of the create_async method.

Inside the task’s body, the implementation can poll the cancellation request by calling the is_task_cancellation_requested method, and acknowledge the request by calling the cancel_current_task method. Because the cancel_current_task method throws an exception under the covers, some resource cleanup is appropriate before calling cancel_current_task. Figure 9 shows an example.

Figure 9 Canceling and Reaction to the Cancellation Request in a Task

cancellation_token_source ct;

task<int> my_task([]() {
  // Do some work
  // Check if cancellation has been requested
  if(is_task_cancellation_requested())
  {
        // Clean up resources:
        // ...
        // Cancel task:
        cancel_current_task();
  }
  // Do some more work
  return 1;
}, ct.get_token());
...
ct.cancel(); // attempt to cancel

Notice that many tasks can be canceled by the same cancellation_token_source. This is very convenient when working with chains and graphs of tasks. Instead of canceling every task individually, you can cancel all the tasks governed by a given cancellation_token_source. Of course, there’s no guarantee that any of the tasks will actually respond to the cancellation request. Such tasks will complete, but their normal (value-based) continuations will not run. The error-handling continuations will run, but an attempt to get the value from the antecedent task will result in the task_canceled exception.
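The division of roles between the source and the token can be sketched in portable standard C++. ToyTokenSource, ToyToken, ToyCanceled and run_steps are hypothetical names that model the idea with a shared atomic flag; they are not the PPL classes and make no claim about how PPL implements cancellation.

```cpp
#include <atomic>
#include <functional>
#include <memory>
#include <stdexcept>
#include <utility>

// Read-only view of the shared flag, handed to the operation.
class ToyToken {
public:
    explicit ToyToken(std::shared_ptr<std::atomic<bool>> flag) : flag_(std::move(flag)) {}
    bool is_canceled() const { return flag_->load(); }
private:
    std::shared_ptr<std::atomic<bool>> flag_;
};

// Owned by the client; every token it hands out observes the same flag.
class ToyTokenSource {
public:
    ToyTokenSource() : flag_(std::make_shared<std::atomic<bool>>(false)) {}
    ToyToken get_token() const { return ToyToken(flag_); }
    void cancel() { flag_->store(true); }
private:
    std::shared_ptr<std::atomic<bool>> flag_;
};

// Thrown by the operation to acknowledge a cancellation request.
struct ToyCanceled : std::runtime_error {
    ToyCanceled() : std::runtime_error("canceled") {}
};

// Cooperative worker: polls the token before each step and stops by throwing.
void run_steps(int steps, const ToyToken& token,
               const std::function<void(int)>& on_step) {
    for (int step = 1; step <= steps; ++step) {
        if (token.is_canceled()) {
            // Release any resources here, then acknowledge the request.
            throw ToyCanceled();
        }
        on_step(step);
    }
}
```

Because all tokens from one source share the flag, a single cancel call reaches every operation that holds one of them. Each operation still decides for itself when, and whether, to poll.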

Finally, let’s look at using cancellation tokens on the production side. The lambda of the create_async method can take a cancellation_token parameter, poll it using the is_canceled method, and cancel the operation in response to the cancellation request:

IAsyncAction^ action = create_async( [](cancellation_token ct) {
  while (!ct.is_canceled()); // spin until canceled
  cancel_current_task();
});
...
action->Cancel();

Notice how in the case of the task continuation, it’s the then method that takes the cancellation token, whereas in the case of create_async, the cancellation token is passed into the lambda. In the latter case, cancellation is initiated by calling the Cancel method on the resulting asynchronous interface, and that gets plumbed by the PPL into a cancellation request through the cancellation token.

Wrapping Up

As Tony Hoare once quipped, we need to teach our programs to “wait faster.” And yet, wait-free asynchronous programming remains difficult to master and its benefits are not immediately obvious, so developers shun it.

In Windows 8, all blocking operations are asynchronous, and if you’re a C++ programmer, PPL makes asynchronous programming quite palatable. Embrace the world of asynchrony, and teach your programs to wait faster!


Artur Laksberg *leads a group of developers working on the Parallel Patterns Library and the Concurrency Runtime at Microsoft. In the past, he has worked on the C++ compiler front end and was involved in the implementation of the Axum programming language. Artur can be reached at arturl@microsoft.com.*

Thanks to the following technical experts for reviewing this article: Genevieve Fernandes and Krishnan Varadarajan