Durable orchestrations

Durable Functions is an extension of Azure Functions. You can use an orchestrator function to orchestrate the execution of other Durable functions within a function app. Orchestrator functions have the following characteristics:

  • Orchestrator functions define function workflows using procedural code. No declarative schemas or designers are needed.
  • Orchestrator functions can call other durable functions synchronously and asynchronously. Output from called functions can be reliably saved to local variables.
  • Orchestrator functions are durable and reliable. Execution progress is automatically checkpointed when the function "awaits" or "yields". Local state is never lost when the process recycles or the VM reboots.
  • Orchestrator functions can be long-running. The total lifespan of an orchestration instance can be seconds, days, months, or never-ending.

This article gives you an overview of orchestrator functions and how they can help you solve various app development challenges. If you are not already familiar with the types of functions available in a Durable Functions app, read the Durable Function types article first.

Orchestration identity

Each instance of an orchestration has an instance identifier (also known as an instance ID). By default, each instance ID is an autogenerated GUID. However, instance IDs can also be any user-generated string value. Each orchestration instance ID must be unique within a task hub.

The following are some rules about instance IDs:

  • Instance IDs must be between 1 and 100 characters.
  • Instance IDs must not start with @.
  • Instance IDs must not contain /, \, #, or ? characters.
  • Instance IDs must not contain control characters.

Note

It is generally recommended to use autogenerated instance IDs whenever possible. User-generated instance IDs are intended for scenarios where there is a one-to-one mapping between an orchestration instance and some external application-specific entity, like a purchase order or a document.

Also, the actual enforcement of character restriction rules may vary depending on the storage provider being used by the app. To ensure correct behavior and compatibility, it's strongly recommended that you follow the instance ID rules listed previously.

An orchestration's instance ID is a required parameter for most instance management operations. They are also important for diagnostics, such as searching through orchestration tracking data in Application Insights for troubleshooting or analytics purposes. For this reason, it is recommended to save generated instance IDs to some external location (for example, a database or in application logs) where they can be easily referenced later.

Reliability

Orchestrator functions reliably maintain their execution state by using the event sourcing design pattern. Instead of directly storing the current state of an orchestration, the Durable Task Framework uses an append-only store to record the full series of actions the function orchestration takes. An append-only store has many benefits compared to "dumping" the full runtime state. Benefits include increased performance, scalability, and responsiveness. You also get eventual consistency for transactional data and full audit trails and history. The audit trails support reliable compensating actions.

Durable Functions uses event sourcing transparently. Behind the scenes, the await (C#) or yield (JavaScript/Python) operator in an orchestrator function yields control of the orchestrator thread back to the Durable Task Framework dispatcher. In the case of Java, there is no special language keyword. Instead, calling .await() on a task will yield control back to the dispatcher via a custom Throwable. The dispatcher then commits any new actions that the orchestrator function scheduled (such as calling one or more child functions or scheduling a durable timer) to storage. The transparent commit action updates the execution history of the orchestration instance by appending all new events into storage, much like an append-only log. Similarly, the commit action creates messages in storage to schedule the actual work. At this point, the orchestrator function can be unloaded from memory. By default, Durable Functions uses Azure Storage as its runtime state store, but other storage providers are also supported.

When an orchestration function is given more work to do (for example, a response message is received or a durable timer expires), the orchestrator wakes up and re-executes the entire function from the start to rebuild the local state. During the replay, if the code tries to call a function (or do any other async work), the Durable Task Framework consults the execution history of the current orchestration. If it finds that the activity function has already executed and yielded a result, it replays that function's result and the orchestrator code continues to run. Replay continues until the function code is finished or until it has scheduled new async work.

Note

In order for the replay pattern to work correctly and reliably, orchestrator function code must be deterministic. Non-deterministic orchestrator code may result in runtime errors or other unexpected behavior. For more information about code restrictions for orchestrator functions, see the orchestrator function code constraints documentation.

Note

If an orchestrator function emits log messages, the replay behavior may cause duplicate log messages to be emitted. See the Logging topic to learn more about why this behavior occurs and how to work around it.

Orchestration history

The event-sourcing behavior of the Durable Task Framework is closely coupled with the orchestrator function code you write. Suppose you have an activity-chaining orchestrator function, like the following orchestrator function:

Note

Version 4 of the Node.js programming model for Azure Functions is generally available. The new v4 model is designed to have a more flexible and intuitive experience for JavaScript and TypeScript developers. Learn more about the differences between v3 and v4 in the migration guide.

In the following code snippets, JavaScript (PM4) denotes programming model V4, the new experience.

[FunctionName("HelloCities")]
public static async Task<List<string>> Run(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    var outputs = new List<string>();

    outputs.Add(await context.CallActivityAsync<string>("SayHello", "Tokyo"));
    outputs.Add(await context.CallActivityAsync<string>("SayHello", "Seattle"));
    outputs.Add(await context.CallActivityAsync<string>("SayHello", "London"));

    // returns ["Hello Tokyo!", "Hello Seattle!", "Hello London!"]
    return outputs;
}

Whenever an activity function is scheduled, the Durable Task Framework checkpoints the execution state of the function into some durable storage backend (Azure Table storage by default). This state is what's referred to as the orchestration history.

History table

Generally speaking, the Durable Task Framework does the following at each checkpoint:

  1. Saves execution history into durable storage.
  2. Enqueues messages for functions the orchestrator wants to invoke.
  3. Enqueues messages for the orchestrator itself — for example, durable timer messages.

Once the checkpoint is complete, the orchestrator function is free to be removed from memory until there is more work for it to do.

Note

Azure Storage does not provide any transactional guarantees between saving data into table storage and queues. To handle failures, the Durable Functions Azure Storage provider uses eventual consistency patterns. These patterns ensure that no data is lost if there is a crash or loss of connectivity in the middle of a checkpoint. Alternate storage providers, such as the Durable Functions MSSQL storage provider, may provide stronger consistency guarantees.

Upon completion, the history of the function shown earlier looks something like the following table in Azure Table Storage (abbreviated for illustration purposes):

PartitionKey (InstanceId) EventType Timestamp Input Name Result Status
eaee885b ExecutionStarted 2021-05-05T18:45:28.852Z null HelloCities
eaee885b OrchestratorStarted 2021-05-05T18:45:32.362Z
eaee885b TaskScheduled 2021-05-05T18:45:32.670Z SayHello
eaee885b OrchestratorCompleted 2021-05-05T18:45:32.670Z
eaee885b TaskCompleted 2021-05-05T18:45:34.201Z """Hello Tokyo!"""
eaee885b OrchestratorStarted 2021-05-05T18:45:34.232Z
eaee885b TaskScheduled 2021-05-05T18:45:34.435Z SayHello
eaee885b OrchestratorCompleted 2021-05-05T18:45:34.435Z
eaee885b TaskCompleted 2021-05-05T18:45:34.763Z """Hello Seattle!"""
eaee885b OrchestratorStarted 2021-05-05T18:45:34.857Z
eaee885b TaskScheduled 2021-05-05T18:45:34.857Z SayHello
eaee885b OrchestratorCompleted 2021-05-05T18:45:34.857Z
eaee885b TaskCompleted 2021-05-05T18:45:34.919Z """Hello London!"""
eaee885b OrchestratorStarted 2021-05-05T18:45:35.032Z
eaee885b OrchestratorCompleted 2021-05-05T18:45:35.044Z
eaee885b ExecutionCompleted 2021-05-05T18:45:35.044Z "[""Hello Tokyo!"",""Hello Seattle!"",""Hello London!""]" Completed

A few notes on the column values:

  • PartitionKey: Contains the instance ID of the orchestration.
  • EventType: Represents the type of the event. You can find detailed descriptions of all the history event types here.
  • Timestamp: The UTC timestamp of the history event.
  • Name: The name of the function that was invoked.
  • Input: The JSON-formatted input of the function.
  • Result: The output of the function; that is, its return value.

Warning

While it's useful as a debugging tool, don't take any dependency on this table. It may change as the Durable Functions extension evolves.

Every time the function is resumed after waiting for a task to complete, the Durable Task Framework reruns the orchestrator function from scratch. On each rerun, it consults the execution history to determine whether the current async task has completed. If the execution history shows that the task has already completed, the framework replays the output of that task and moves on to the next task. This process continues until the entire execution history has been replayed. Once the current execution history has been replayed, the local variables will have been restored to their previous values.

Features and patterns

The next sections describe the features and patterns of orchestrator functions.

Sub-orchestrations

Orchestrator functions can call activity functions, but also other orchestrator functions. For example, you can build a larger orchestration out of a library of orchestrator functions. Or, you can run multiple instances of an orchestrator function in parallel.

For more information and for examples, see the Sub-orchestrations article.

Durable timers

Orchestrations can schedule durable timers to implement delays or to set up timeout handling on async actions. Use durable timers in orchestrator functions instead of language-native "sleep" APIs.

For more information and for examples, see the Durable timers article.

External events

Orchestrator functions can wait for external events to update an orchestration instance. This Durable Functions feature often is useful for handling a human interaction or other external callbacks.

For more information and for examples, see the External events article.

Error handling

Orchestrator functions can use the error-handling features of the programming language. Existing patterns like try/catch are supported in orchestration code.

Orchestrator functions can also add retry policies to the activity or sub-orchestrator functions that they call. If an activity or sub-orchestrator function fails with an exception, the specified retry policy can automatically delay and retry the execution up to a specified number of times.

Note

If there is an unhandled exception in an orchestrator function, the orchestration instance will complete in a Failed state. An orchestration instance cannot be retried once it has failed.

For more information and for examples, see the Error handling article.

Critical sections (Durable Functions 2.x, currently .NET only)

Orchestration instances are single-threaded so it isn't necessary to worry about race conditions within an orchestration. However, race conditions are possible when orchestrations interact with external systems. To mitigate race conditions when interacting with external systems, orchestrator functions can define critical sections using a LockAsync method in .NET.

The following sample code shows an orchestrator function that defines a critical section. It enters the critical section using the LockAsync method. This method requires passing one or more references to a Durable Entity, which durably manages the lock state. Only a single instance of this orchestration can execute the code in the critical section at a time.

[FunctionName("Synchronize")]
public static async Task Synchronize(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    var lockId = new EntityId("LockEntity", "MyLockIdentifier");
    using (await context.LockAsync(lockId))
    {
        // critical section - only one orchestration can enter at a time
    }
}

The LockAsync acquires the durable lock(s) and returns an IDisposable that ends the critical section when disposed. This IDisposable result can be used together with a using block to get a syntactic representation of the critical section. When an orchestrator function enters a critical section, only one instance can execute that block of code. Any other instances that try to enter the critical section will be blocked until the previous instance exits the critical section.

The critical section feature is also useful for coordinating changes to durable entities. For more information about critical sections, see the Durable entities "Entity coordination" topic.

Note

Critical sections are available in Durable Functions 2.0. Currently, only .NET in-proc orchestrations implement this feature. Entities and critical sections are not yet available in Durable Functions for dotnet-isolated worker.

Calling HTTP endpoints (Durable Functions 2.x)

Orchestrator functions aren't permitted to do I/O, as described in orchestrator function code constraints. The typical workaround for this limitation is to wrap any code that needs to do I/O in an activity function. Orchestrations that interact with external systems frequently use activity functions to make HTTP calls and return the result to the orchestration.

To simplify this common pattern, orchestrator functions can use the CallHttpAsync method to invoke HTTP APIs directly.

[FunctionName("CheckSiteAvailable")]
public static async Task CheckSiteAvailable(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    Uri url = context.GetInput<Uri>();

    // Makes an HTTP GET request to the specified endpoint
    DurableHttpResponse response = 
        await context.CallHttpAsync(HttpMethod.Get, url);

    if ((int)response.StatusCode == 400)
    {
        // handling of error codes goes here
    }
}

In addition to supporting basic request/response patterns, the method supports automatic handling of common async HTTP 202 polling patterns, and also supports authentication with external services using Managed Identities.

For more information and for detailed examples, see the HTTP features article.

Note

Calling HTTP endpoints directly from orchestrator functions is available in Durable Functions 2.0 and above.

Passing multiple parameters

It isn't possible to pass multiple parameters to an activity function directly. The recommendation is to pass in an array of objects or composite objects.

In .NET you can also use ValueTuple objects. The following sample is using new features of ValueTuple added with C# 7:

[FunctionName("GetCourseRecommendations")]
public static async Task<object> RunOrchestrator(
    [OrchestrationTrigger] IDurableOrchestrationContext context)
{
    string major = "ComputerScience";
    int universityYear = context.GetInput<int>();

    object courseRecommendations = await context.CallActivityAsync<object>(
        "CourseRecommendations",
        (major, universityYear));
    return courseRecommendations;
}

[FunctionName("CourseRecommendations")]
public static async Task<object> Mapper([ActivityTrigger] IDurableActivityContext inputs)
{
    // parse input for student's major and year in university
    (string Major, int UniversityYear) studentInfo = inputs.GetInput<(string, int)>();

    // retrieve and return course recommendations by major and university year
    return new
    {
        major = studentInfo.Major,
        universityYear = studentInfo.UniversityYear,
        recommendedCourses = new []
        {
            "Introduction to .NET Programming",
            "Introduction to Linux",
            "Becoming an Entrepreneur"
        }
    };
}

Next steps