Create queries to list Batch resources efficiently

Nearly all Batch applications need to perform some type of monitoring or other operation that queries the Batch service, often at regular intervals. For example, to determine whether there are any queued tasks remaining in a job, you must get data on every task in the job. To determine the status of nodes in your pool, you must get data on every node in the pool. This article explains how to execute such queries in the most efficient way.

You can increase your Azure Batch application's performance by reducing the amount of data that is returned by the service when you query jobs, tasks, compute nodes, and other resources with the Batch .NET library.

Note

The Batch service provides API support for the common scenarios of counting tasks in a job and counting compute nodes in a Batch pool. Instead of using a list query for these, you can call the Get Task Counts and List Pool Node Counts operations. These operations are more efficient than a list query, but they return more limited information that may not always be up to date. For more information, see Count tasks and compute nodes by state.

Specify a detail level

In a production Batch application, entities like jobs, tasks, and compute nodes can number in the thousands. When you request information on these resources, a potentially large amount of data must "cross the wire" from the Batch service to your application on each query. By limiting the number of items and type of information that is returned by a query, you can increase the speed of your queries, and therefore the performance of your application.

This Batch .NET API code snippet lists every task that is associated with a job, along with all of the properties of each task:

// Get a collection of all of the tasks and all of their properties for job-001
IPagedEnumerable<CloudTask> allTasks =
    batchClient.JobOperations.ListTasks("job-001");

You can perform a much more efficient list query, however, by applying a "detail level" to your query. You do this by supplying an ODATADetailLevel object to the JobOperations.ListTasks method. This snippet returns only the ID, command line, and compute node information properties of completed tasks:

// Configure an ODATADetailLevel specifying a subset of tasks and
// their properties to return
ODATADetailLevel detailLevel = new ODATADetailLevel();
detailLevel.FilterClause = "state eq 'completed'";
detailLevel.SelectClause = "id,commandLine,nodeInfo";

// Supply the ODATADetailLevel to the ListTasks method
IPagedEnumerable<CloudTask> completedTasks =
    batchClient.JobOperations.ListTasks("job-001", detailLevel);

In this example scenario, if there are thousands of tasks in the job, the results from the second query will typically be returned much more quickly than those from the first. More information about using ODATADetailLevel when you list items with the Batch .NET API is included below.

Important

We highly recommend that you always supply an ODATADetailLevel object to your .NET API list calls to ensure maximum efficiency and performance of your application. By specifying a detail level, you can help to lower Batch service response times, improve network utilization, and minimize memory usage by client applications.

Filter, select, and expand

The Batch .NET and Batch REST APIs provide the ability to reduce both the number of items that are returned in a list and the amount of information that is returned for each item. You do so by specifying filter, select, and expand strings when performing list queries.

Filter

The filter string is an expression that reduces the number of items that are returned. For example, you can list only the running tasks for a job, or list only compute nodes that are ready to run tasks.

The filter string consists of one or more expressions, each of which consists of a property name, an operator, and a value. The properties that can be specified are specific to each entity type that you query, as are the operators that are supported for each property. Multiple expressions can be combined by using the logical operators and and or.

This example filter string lists only the running "render" tasks: (state eq 'running') and startswith(id, 'renderTask').
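
As a minimal sketch of how such filter strings could be assembled in code, the hypothetical FilterBuilder helper below (not part of the Batch .NET library) wraps each expression in parentheses and joins the results with and or or. It assumes that the service accepts the extra parentheses around a function call such as startswith(...); verify this against your own queries before relying on it.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of the Batch .NET library.
public static class FilterBuilder
{
    // Joins the expressions with the logical operator "and".
    public static string And(params string[] expressions) => Join("and", expressions);

    // Joins the expressions with the logical operator "or".
    public static string Or(params string[] expressions) => Join("or", expressions);

    // Wraps every expression in parentheses so that nested and/or groups keep
    // their intended grouping when they are combined further.
    private static string Join(string op, string[] expressions)
    {
        if (expressions == null || expressions.Length == 0)
        {
            throw new ArgumentException("At least one expression is required.", nameof(expressions));
        }

        return string.Join(" " + op + " ", expressions.Select(e => "(" + e + ")"));
    }
}
```

The string returned by a call such as FilterBuilder.And("state eq 'running'", "startswith(id, 'renderTask')") could then be assigned to ODATADetailLevel.FilterClause.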

Select

The select string limits the property values that are returned for each item. You specify a list of comma-separated property names, and only those property values are returned for the items in the query results. You can specify any of the properties for the entity type you are querying.

This example select string specifies that only three property values should be returned for each task: id, state, stateTransitionTime.

Expand

The expand string reduces the number of API calls that are required to obtain certain information. Rather than first obtaining the list of entities and then requesting information for each item in the list, you use an expand string to obtain the same information in a single API call.

Similar to the select string, the expand string controls whether certain data is included in list query results. When all properties are required and no select string is specified, the expand string must be used to get statistics information. If a select string is used to obtain a subset of properties, then stats can be specified in the select string, and the expand string does not need to be specified.

The expand string is only supported when it is used in listing jobs, job schedules, tasks, and pools. Currently, it only supports statistics information.

This example expand string specifies that statistics information should be returned for each item in the list: stats.

Note

When constructing any of the three query string types (filter, select, and expand), you must ensure that the property names and their case match those of their REST API element counterparts. For example, when working with the .NET CloudTask class, you must specify state instead of State, even though the .NET property is CloudTask.State. See the tables below for property mappings between the .NET and REST APIs.

Rules for filter, select, and expand strings

  • Property names in filter, select, and expand strings should appear as they do in the Batch REST API, even when you use Batch .NET or one of the other Batch SDKs.

  • All property names are case-sensitive, but property values are case-insensitive.

  • Date/time strings can be in one of two formats, and must be preceded by DateTime.

    • W3C-DTF format example: creationTime gt DateTime'2011-05-08T08:49:37Z'
    • RFC 1123 format example: creationTime gt DateTime'Sun, 08 May 2011 08:49:37 GMT'
  • Boolean strings are either true or false.

  • If an invalid property or operator is specified, a 400 (Bad Request) error will result.
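
The date/time and Boolean rules above can be sketched as two small formatting helpers. FilterValues and its method names are hypothetical and not part of any Batch SDK. The sketch assumes that you want the W3C-DTF form in UTC; note that bool.ToString() in .NET returns "True" or "False", so the Boolean helper writes the lowercase literals explicitly.

```csharp
using System;
using System.Globalization;

// Hypothetical helpers for illustration only; not part of the Batch .NET library.
public static class FilterValues
{
    // Formats a timestamp in the W3C-DTF form shown above, for example
    // DateTime'2011-05-08T08:49:37Z'. The value is converted to UTC first;
    // a DateTime whose Kind is Unspecified is treated as local time by
    // ToUniversalTime().
    public static string DateTimeLiteral(DateTime time)
    {
        string formatted = time.ToUniversalTime().ToString(
            "yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
        return "DateTime'" + formatted + "'";
    }

    // Returns the lowercase literals "true" or "false".
    public static string BooleanLiteral(bool value)
    {
        return value ? "true" : "false";
    }
}
```

A filter expression could then be built as "creationTime gt " + FilterValues.DateTimeLiteral(someUtcTime), where someUtcTime is a DateTime value of your choosing.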

Efficient querying in Batch .NET

Within the Batch .NET API, the ODATADetailLevel class is used for supplying filter, select, and expand strings to list operations. The ODATADetailLevel class has three public string properties that can be specified in the constructor or set directly on the object. You then pass the ODATADetailLevel object as a parameter to the various list operations such as ListPools, ListJobs, and ListTasks.

The following code snippet uses the Batch .NET API to efficiently query the Batch service for the statistics of a specific set of pools. In this scenario, the Batch user has both test and production pools. The test pool IDs are prefixed with "test", and the production pool IDs are prefixed with "prod". In the snippet, myBatchClient is a properly initialized instance of the BatchClient class.

// First we need an ODATADetailLevel instance on which to set the filter, select,
// and expand clause strings
ODATADetailLevel detailLevel = new ODATADetailLevel();

// We want to pull only the "test" pools, so we limit the number of items returned
// by using a FilterClause and specifying that the pool IDs must start with "test"
detailLevel.FilterClause = "startswith(id, 'test')";

// To further limit the data that crosses the wire, configure the SelectClause to
// limit the properties that are returned on each CloudPool object to only
// CloudPool.Id and CloudPool.Statistics
detailLevel.SelectClause = "id, stats";

// Specify the ExpandClause so that the .NET API pulls the statistics for the
// CloudPools in a single underlying REST API call. Note that we use the pool's
// REST API element name "stats" here as opposed to "Statistics" as it appears in
// the .NET API (CloudPool.Statistics)
detailLevel.ExpandClause = "stats";

// Now get our collection of pools, minimizing the amount of data that is returned
// by specifying the detail level that we configured above
List<CloudPool> testPools =
    await myBatchClient.PoolOperations.ListPools(detailLevel).ToListAsync();

Tip

An instance of ODATADetailLevel that is configured with Select and Expand clauses can also be passed to appropriate Get methods, such as PoolOperations.GetPool, to limit the amount of data that is returned.

Batch REST to .NET API mappings

Property names in filter, select, and expand strings must reflect their REST API counterparts, both in name and case. The tables below provide mappings between the .NET and REST API counterparts.

Mappings for filter strings

  • .NET list methods: Each of the .NET API methods in this column accepts an ODATADetailLevel object as a parameter.
  • REST list requests: Each REST API page linked to in this column contains a table that specifies the properties and operations that are allowed in filter strings. You use these property names and operations when you construct an ODATADetailLevel.FilterClause string.

.NET list methods                                     REST list requests
----------------------------------------------------  -----------------------------------------------------------------------
CertificateOperations.ListCertificates                List the certificates in an account
CloudTask.ListNodeFiles                               List the files associated with a task
JobOperations.ListJobPreparationAndReleaseTaskStatus  List the status of the job preparation and job release tasks for a job
JobOperations.ListJobs                                List the jobs in an account
JobOperations.ListNodeFiles                           List the files on a node
JobOperations.ListTasks                               List the tasks associated with a job
JobScheduleOperations.ListJobSchedules                List the job schedules in an account
JobScheduleOperations.ListJobs                        List the jobs associated with a job schedule
PoolOperations.ListComputeNodes                       List the compute nodes in a pool
PoolOperations.ListPools                              List the pools in an account

Mappings for select strings

  • Batch .NET types: Batch .NET API types.
  • REST API entities: Each page in this column contains one or more tables that list the REST API property names for the type. You use these property names when you construct an ODATADetailLevel.SelectClause string.

Batch .NET types  REST API entities
----------------  ----------------------------------
Certificate       Get information about a certificate
CloudJob          Get information about a job
CloudJobSchedule  Get information about a job schedule
ComputeNode       Get information about a node
CloudPool         Get information about a pool
CloudTask         Get information about a task

Example: construct a filter string

When you construct a filter string for ODATADetailLevel.FilterClause, consult the table above under "Mappings for filter strings" to find the REST API documentation page that corresponds to the list operation that you wish to perform. You will find the filterable properties and their supported operators in the first multirow table on that page. If you wish to retrieve all tasks whose exit code was nonzero, for example, this row on List the tasks associated with a job specifies the applicable property string and allowable operators:

Property                Operations allowed  Type
----------------------  ------------------  ----
executionInfo/exitCode  eq, ge, gt, le, lt  Int

Thus, the filter string for listing all tasks with a nonzero exit code would be:

(executionInfo/exitCode lt 0) or (executionInfo/exitCode gt 0)

Example: construct a select string

To construct ODATADetailLevel.SelectClause, consult the table above under "Mappings for select strings" and navigate to the REST API page that corresponds to the type of entity that you are listing. You will find the selectable properties in the first multirow table on that page. If you wish to retrieve only the ID and command line for each task in a list, for example, you will find these rows in the applicable table on Get information about a task:

Property     Type    Notes
-----------  ------  -----------------------------
id           String  The ID of the task.
commandLine  String  The command line of the task.

The select string for including only the ID and command line with each listed task would then be:

id, commandLine

Code samples

Efficient list queries code sample

The EfficientListQueries sample project on GitHub shows how efficient list querying can affect performance in an application. This C# console application creates and adds a large number of tasks to a job. Then, it makes multiple calls to the JobOperations.ListTasks method and passes ODATADetailLevel objects that are configured with different property values to vary the amount of data to be returned. It produces output similar to the following:

Adding 5000 tasks to job jobEffQuery...
5000 tasks added in 00:00:47.3467587, hit ENTER to query tasks...

4943 tasks retrieved in 00:00:04.3408081 (ExpandClause:  | FilterClause: state eq 'active' | SelectClause: id,state)
0 tasks retrieved in 00:00:00.2662920 (ExpandClause:  | FilterClause: state eq 'running' | SelectClause: id,state)
59 tasks retrieved in 00:00:00.3337760 (ExpandClause:  | FilterClause: state eq 'completed' | SelectClause: id,state)
5000 tasks retrieved in 00:00:04.1429881 (ExpandClause:  | FilterClause:  | SelectClause: id,state)
5000 tasks retrieved in 00:00:15.1016127 (ExpandClause:  | FilterClause:  | SelectClause: id,state,environmentSettings)
5000 tasks retrieved in 00:00:17.0548145 (ExpandClause: stats | FilterClause:  | SelectClause: )

Sample complete, hit ENTER to continue...

As shown in the elapsed times, you can greatly lower query response times by limiting the properties and the number of items that are returned. You can find this and other sample projects in the azure-batch-samples repository on GitHub.

BatchMetrics library and code sample

In addition to the EfficientListQueries code sample above, the BatchMetrics sample project demonstrates how to efficiently monitor Azure Batch job progress using the Batch API.

The BatchMetrics sample includes a .NET class library project that you can incorporate into your own projects, and a simple command-line program that exercises and demonstrates the library.

The sample application within the project demonstrates the following operations:

  1. Selecting specific attributes in order to download only the properties you need
  2. Filtering on state transition times in order to download only changes since the last query

For example, the following method appears in the BatchMetrics library. It returns an ODATADetailLevel that specifies that only the id and state properties should be obtained for the entities that are queried. It also specifies that only entities whose state has changed since the specified DateTime parameter should be returned.

internal static ODATADetailLevel OnlyChangedAfter(DateTime time)
{
    return new ODATADetailLevel(
        selectClause: "id, state",
        filterClause: string.Format("stateTransitionTime gt DateTime'{0:o}'", time)
    );
}
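
The idea of downloading only the changes since the last query could be sketched as follows. StateCache and its fetchChangedSince delegate are hypothetical names and are not taken from the BatchMetrics library; the delegate stands in for a list call that passes a detail level such as OnlyChangedAfter(since). The sketch also assumes that the caller captures the timestamp before issuing the query, so that changes occurring during the query are picked up by the next refresh, and it ignores any clock difference between the client and the service.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical client-side cache for illustration only; not part of BatchMetrics.
public sealed class StateCache
{
    // Last known state for each entity ID.
    private readonly Dictionary<string, string> _states = new Dictionary<string, string>();

    // Timestamp passed to the next incremental query.
    private DateTime _lastQuery = DateTime.MinValue;

    public IReadOnlyDictionary<string, string> States => _states;

    // Merges the entities whose state changed since the previous refresh.
    // fetchChangedSince is an assumed stand-in for a filtered list query that
    // returns (id, state) pairs; queryStartTime should be captured by the
    // caller just before the refresh begins.
    public int Refresh(
        Func<DateTime, IEnumerable<KeyValuePair<string, string>>> fetchChangedSince,
        DateTime queryStartTime)
    {
        int updated = 0;
        foreach (KeyValuePair<string, string> item in fetchChangedSince(_lastQuery))
        {
            _states[item.Key] = item.Value; // add new entities, overwrite changed ones
            updated++;
        }

        _lastQuery = queryStartTime;
        return updated;
    }
}
```

With this approach, the first refresh downloads every entity, and each later refresh transfers only the id and state of entities whose state changed in the meantime.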

Next steps