Async in depth

Writing I/O- and CPU-bound asynchronous code is straightforward using the .NET Task-based async model. The model is exposed by the Task and Task<T> types and the async and await keywords in C# and Visual Basic. (Language-specific resources are found in the See also section.) This article explains how to use .NET async and provides insight into the async framework used under the covers.

Task and Task<T>

Tasks are constructs used to implement what is known as the Promise Model of Concurrency. In short, they offer you a "promise" that work will be completed at a later point, letting you coordinate with the promise with a clean API.

  • Task represents a single operation that does not return a value.
  • Task<T> represents a single operation that returns a value of type T.

It’s important to reason about tasks as abstractions of work happening asynchronously, and not an abstraction over threading. By default, tasks execute on the current thread and delegate work to the Operating System, as appropriate. Optionally, tasks can be explicitly requested to run on a separate thread via the Task.Run API.

Tasks expose an API protocol for monitoring, waiting upon and accessing the result value (in the case of Task<T>) of a task. Language integration, with the await keyword, provides a higher-level abstraction for using tasks.

Using await allows your application or service to perform useful work while a task is running by yielding control to its caller until the task is done. Your code does not need to rely on callbacks or events to continue execution after the task has been completed. The language and task API integration does that for you. If you’re using Task<T>, the await keyword will additionally "unwrap" the value returned when the Task is complete. The details of how this works are explained further below.

You can learn more about tasks and the different ways to interact with them in the Task-based Asynchronous Pattern (TAP) topic.

Deeper Dive into Tasks for an I/O-Bound Operation

The following section describes a 10,000 foot view of what happens with a typical async I/O call. Let's start with a couple examples from the following class.

The first example method GetHtmlAsync() calls an async method and returns an active task, likely yet to complete. The second example method GetFirstCharactersCountAsync() adds the use of the async and await keywords to operate on the task.

class DotNetFoundationClient
    // HttpClient is intended to be instantiated once per application, rather than per-use.
    private static readonly HttpClient s_client = new HttpClient();

    public Task<string> GetHtmlAsync()
        // Execution is synchronous here
        var uri = new Uri("");

        return s_client.GetStringAsync(uri);

    public async Task<string> GetFirstCharactersCountAsync(int count)
        // Execution is synchronous here
        var uri = new Uri("");

        // Execution of GetFirstCharactersCountAsync() is yielded to the caller here
        // GetStringAsync returns a Task<string>, which is *awaited*
        var page = await s_client.GetStringAsync(uri);

        // Execution resumes when the client.GetStringAsync task completes,
        // becoming synchronous again.

        if (count > page.Length)
            return page;
            return page.Substring(0, count);

The call to GetStringAsync() calls through lower-level .NET libraries (perhaps calling other async methods) until it reaches a P/Invoke interop call into a native networking library. The native library may subsequently call into a System API call (such as write() to a socket on Linux). A task object will be created at the native/managed boundary, possibly using TaskCompletionSource. The task object will be passed up through the layers, possibly operated on or directly returned, eventually returned to the initial caller.

In the second example method GetFirstCharactersCountAsync() above, a Task<T> object will be returned from GetStringAsync. The use of the await keyword causes the method to return a newly created task object. Control returns to the caller from this location in the GetFirstCharactersCountAsync method. The methods and properties of the Task<T> object enable callers to monitor the progress of the task, which will complete when the remaining code in GetFirstCharactersCountAsync has executed.

After the System API call, the request is now in kernel space, making its way to the networking subsystem of the OS (such as /net in the Linux Kernel). Here the OS will handle the networking request asynchronously. Details may be different depending on the OS used (the device driver call may be scheduled as a signal sent back to the runtime, or a device driver call may be made and then a signal sent back), but eventually the runtime will be informed that the networking request is in progress. At this time, the work for the device driver will either be scheduled, in-progress, or already finished (the request is already out "over the wire") - but because this is all happening asynchronously, the device driver is able to immediately handle something else!

For example, in Windows an OS thread makes a call to the network device driver and asks it to perform the networking operation via an Interrupt Request Packet (IRP) which represents the operation. The device driver receives the IRP, makes the call to the network, marks the IRP as "pending", and returns back to the OS. Because the OS thread now knows that the IRP is "pending", it doesn't have any more work to do for this job and "returns" back so that it can be used to perform other work.

When the request is fulfilled and data comes back through the device driver, it notifies the CPU of new data received via an interrupt. How this interrupt gets handled will vary depending on the OS, but eventually the data will be passed through the OS until it reaches a system interop call (for example, in Linux an interrupt handler will schedule the bottom half of the IRQ to pass the data up through the OS asynchronously). This also happens asynchronously! The result is queued up until the next available thread is able to execute the async method and "unwrap" the result of the completed task.

Throughout this entire process, a key takeaway is that no thread is dedicated to running the task. Although work is executed in some context (that is, the OS does have to pass data to a device driver and respond to an interrupt), there is no thread dedicated to waiting for data from the request to come back. This allows the system to handle a much larger volume of work rather than waiting for some I/O call to finish.

Although that may seem like a lot of work to be done, when measured in terms of wall clock time, it’s minuscule compared to the time it takes to do the actual I/O work. Although not at all precise, a potential timeline for such a call would look like this:


  • Time spent from points 0 to 1 is everything up until an async method yields control to its caller.
  • Time spent from points 1 to 2 is the time spent on I/O, with no CPU cost.
  • Finally, time spent from points 2 to 3 is passing control back (and potentially a value) to the async method, at which point it is executing again.

What does this mean for a server scenario?

This model works well with a typical server scenario workload. Because there are no threads dedicated to blocking on unfinished tasks, the server thread pool can service a much higher volume of web requests.

Consider two servers: one that runs async code, and one that does not. For this example, each server only has five threads available to service requests. This number is unrealistically small and serves only in a demonstrative context.

Assume both servers receive six concurrent requests. Each request performs an I/O operation. The server without async code has to queue up the sixth request until one of the five threads have finished the I/O-bound work and written a response. At the point that the 20th request comes in, the server might start to slow down, because the queue is getting too long.

The server with async code running on it still queues up the sixth request, but because it uses async and await, each of its threads are freed up when the I/O-bound work starts, rather than when it finishes. By the time the 20th request comes in, the queue for incoming requests will be far smaller (if it has anything in it at all), and the server won't slow down.

Although this is a contrived example, it works in a similar fashion in the real world. In fact, you can expect a server to be able to handle an order of magnitude more requests using async and await than if it were dedicating a thread for each request it receives.

What does this mean for client scenario?

The biggest gain for using async and await for a client app is an increase in responsiveness. Although you can make an app responsive by spawning threads manually, the act of spawning a thread is an expensive operation relative to just using async and await. Especially for something like a mobile game, it's crucial to impact the UI thread as little as possible where I/O is concerned.

More importantly, because I/O-bound work spends virtually no time on the CPU, dedicating an entire CPU thread to perform barely any useful work would be a poor use of resources.

Additionally, dispatching work to the UI thread (such as updating a UI) is simple with async methods, and does not require extra work (such as calling a thread-safe delegate).

Deeper Dive into Task and Task<T> for a CPU-Bound Operation

CPU-bound async code is a bit different than I/O-bound async code. Because the work is done on the CPU, there's no way to get around dedicating a thread to the computation. The use of async and await provides you with a clean way to interact with a background thread and keep the caller of the async method responsive. Note that this does not provide any protection for shared data. If you are using shared data, you will still need to apply an appropriate synchronization strategy.

Here's a 10,000 foot view of a CPU-bound async call:

public async Task<int> CalculateResult(InputData data)
    // This queues up the work on the threadpool.
    var expensiveResultTask = Task.Run(() => DoExpensiveCalculation(data));

    // Note that at this point, you can do some other work concurrently,
    // as CalculateResult() is still executing!

    // Execution of CalculateResult is yielded here!
    var result = await expensiveResultTask;

    return result;

CalculateResult() executes on the thread it was called on. When it calls Task.Run, it queues the expensive CPU-bound operation, DoExpensiveCalculation(), on the thread pool and receives a Task<int> handle. DoExpensiveCalculation() is eventually run concurrently on the next available thread, likely on another CPU core. It's possible to do concurrent work while DoExpensiveCalculation() is busy on another thread, because the thread that called CalculateResult() is still executing.

Once await is encountered, the execution of CalculateResult() is yielded to its caller, allowing other work to be done with the current thread while DoExpensiveCalculation() is churning out a result. Once it has finished, the result is queued up to run on the main thread. Eventually, the main thread will return to executing CalculateResult(), at which point it will have the result of DoExpensiveCalculation().

Why does async help here?

async and await are the best practice for managing CPU-bound work when you need responsiveness. There are multiple patterns for using async with CPU-bound work. It's important to note that there is a small cost to using async and it's not recommended for tight loops. It's up to you to determine how you write your code around this new capability.

See also