Fault-tolerant execution

Important

This feature is currently in preview. The Supplemental Terms of Use for Microsoft Azure Previews include more legal terms that apply to Azure features that are in beta, in preview, or otherwise not yet released into general availability. For information about this specific preview, see Azure HDInsight on AKS preview information. For questions or feature suggestions, please submit a request on AskHDInsight with the details and follow us for more updates on Azure HDInsight Community.

Trino supports fault-tolerant execution to mitigate query failures and increase resilience. This article describes how you can enable fault tolerance for your Trino cluster with HDInsight on AKS.

Configuration

Fault-tolerant execution is disabled by default. It can be enabled by adding retry-policy property in config.properties settings. Learn how to manage configurations of your cluster.

Property Allowed Values Description
retry-policy QUERY or TASK Setting determines whether Trino retries failing tasks or entire queries if there's a failure.

For more details, refer Trino documentation.

To enable fault-tolerant execution on queries/tasks with a larger result set, configure an exchange manager that utilizes external storage for spooled data beyond the in-memory buffer size.

Exchange manager

Exchange manager is responsible for managing spooled data to back fault-tolerant execution. For more details, refer Trino documentation.
Trino with HDInsight on AKS supports filesystem based exchange managers that can store the data in Azure Blob Storage (ADLS Gen 2). This section describes how to configure exchange manager with Azure Blob Storage.

To set up exchange manager with Azure Blob Storage as spooling destination, you need three required properties in exchange-manager.properties file.

Property Description
exchange-manager.name Kind of storage that is used for spooled data.
exchange.base-directories Comma-separated list of URI locations that are used by exchange manager to store spooled data.
exchange.azure.connection-string Connection string property used to access the directories specified in exchange.base-directories.

Tip

You need to add exchange-manager.properties file in common component inside serviceConfigsProfiles.serviceName[“trino”] section in the cluster ARM template. Refer to manage confgurations on how to add configuration files to your cluster.

Example:

exchange-manager.name=filesystem
exchange.base-directories=abfs://container_name@account_name.dfs.core.windows.net
exchange.azure.connection-string=connection-string

The connection string takes the following form:

DefaultEndpointsProtocol=https;AccountName=<account-name>;AccountKey=<account-key>;EndpointSuffix=core.windows.net

You can find the connection string in Security + Networking -> Access keys section in the Azure portal page for your storage account, as shown in the following example:

Screenshot showing storage account connection string.

Note

Trino with HDInsight on AKS currently does not support MSI authentication in exchange manager set up backed by Azure Blob Storage.