Preview - Add a spot node pool to an Azure Kubernetes Service (AKS) cluster

A spot node pool is a node pool backed by a spot virtual machine scale set. Using spot VMs for nodes with your AKS cluster allows you to take advantage of unutilized capacity in Azure at a significant cost savings. The amount of available unutilized capacity will vary based on many factors, including node size, region, and time of day.

When you deploy a spot node pool, Azure allocates the spot nodes if capacity is available, but there's no SLA for the spot nodes. The spot scale set that backs the spot node pool is deployed in a single fault domain and offers no high availability guarantees. Whenever Azure needs the capacity back, the Azure infrastructure evicts spot nodes.

Spot nodes are great for workloads that can handle interruptions, early terminations, or evictions. For example, workloads such as batch processing jobs, development and testing environments, and large compute workloads may be good candidates to be scheduled on a spot node pool.

In this article, you add a secondary spot node pool to an existing Azure Kubernetes Service (AKS) cluster.

This article assumes a basic understanding of Kubernetes and Azure Load Balancer concepts. For more information, see Kubernetes core concepts for Azure Kubernetes Service (AKS).

This feature is currently in preview.

If you don't have an Azure subscription, create a free account before you begin.

Before you begin

A cluster that uses a spot node pool must also use Virtual Machine Scale Sets for node pools and the Standard SKU load balancer. Because a spot node pool can't be the default node pool, you add it as an additional node pool after you create the cluster. Adding the node pool is covered in a later step, but you first need to enable a preview feature.

Important

AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer support on a best-effort basis. As such, these features aren't meant for production use. AKS preview features aren't available in Azure Government or Azure China 21Vianet clouds. For more information, see the AKS support articles.

Register spotpoolpreview preview feature

To create an AKS cluster that uses a spot node pool, you must enable the spotpoolpreview feature flag on your subscription. This feature provides the latest set of service enhancements when configuring a cluster.

Register the spotpoolpreview feature flag using the az feature register command as shown in the following example:

az feature register --namespace "Microsoft.ContainerService" --name "spotpoolpreview"

It takes a few minutes for the status to show Registered. You can check on the registration status using the az feature list command:

az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/spotpoolpreview')].{Name:name,State:properties.state}"

When ready, refresh the registration of the Microsoft.ContainerService resource provider using the az provider register command:

az provider register --namespace Microsoft.ContainerService
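Because registration takes a few minutes, it can be convenient to poll the state instead of rerunning the list command by hand. The following is a minimal sketch, not part of the Azure CLI: the function name wait_for_feature, the default attempt count, and the sleep interval are illustrative assumptions. It relies on az feature show to read the state of a single feature.

```shell
# Poll until the spotpoolpreview feature flag reports "Registered".
# Assumption: wait_for_feature and its defaults are hypothetical helpers,
# not part of the Azure CLI.
wait_for_feature() {
    attempts="${1:-30}"   # maximum number of polls
    interval="${2:-20}"   # seconds to sleep between polls
    while [ "$attempts" -gt 0 ]; do
        state=$(az feature show \
            --namespace Microsoft.ContainerService \
            --name spotpoolpreview \
            --query properties.state \
            --output tsv)
        if [ "$state" = "Registered" ]; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$interval"
    done
    # Gave up before the feature reached the Registered state.
    return 1
}
```

One possible use is to chain the provider refresh onto it: wait_for_feature && az provider register --namespace Microsoft.ContainerService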

Install aks-preview CLI extension

To create an AKS cluster that uses a spot node pool, you need the aks-preview CLI extension version 0.4.32 or higher. Install the aks-preview Azure CLI extension using the az extension add command, then check for any available updates using the az extension update command:

# Install the aks-preview extension
az extension add --name aks-preview
 
# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
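To confirm that the installed extension meets the 0.4.32 minimum, you could compare version strings in the shell. This sketch assumes a sort implementation that supports the -V (version sort) option, such as GNU coreutils. The helper names version_at_least and check_aks_preview are hypothetical.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Assumption: sort supports -V (version sort), as in GNU coreutils.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Reads the installed aks-preview version and checks it against the
# minimum version named in this article (0.4.32).
check_aks_preview() {
    installed=$(az extension show --name aks-preview --query version --output tsv)
    version_at_least "$installed" "0.4.32"
}
```

For example: check_aks_preview || az extension update --name aks-preview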

Limitations

The following limitations apply when you create and manage AKS clusters with a spot node pool:

  • A spot node pool can't be the cluster's default node pool. A spot node pool can only be used for a secondary pool.
  • You can't upgrade a spot node pool since spot node pools can't guarantee cordon and drain. You must replace your existing spot node pool with a new one to do operations such as upgrading the Kubernetes version. To replace a spot node pool, create a new spot node pool with a different version of Kubernetes, wait until its status is Ready, then remove the old node pool.
  • The control plane and node pools cannot be upgraded at the same time. You must upgrade them separately or remove the spot node pool to upgrade the control plane and remaining node pools at the same time.
  • A spot node pool must use Virtual Machine Scale Sets.
  • You cannot change ScaleSetPriority or SpotMaxPrice after creation.
  • When setting SpotMaxPrice, the value must be -1 or a positive value with up to five decimal places.
  • A spot node pool will have the label kubernetes.azure.com/scalesetpriority:spot, the taint kubernetes.azure.com/scalesetpriority=spot:NoSchedule, and system pods will have anti-affinity.
  • You must add a corresponding toleration to schedule workloads on a spot node pool.
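Because a spot node pool can't be upgraded in place, the replacement procedure described in the list above can be scripted. The following is a rough sketch under several assumptions: the function name replace_spot_pool is hypothetical, the autoscaler settings are copied from the example later in this article, and a provisioningState of Succeeded is treated as the signal that the new pool is ready.

```shell
# Replace a spot node pool with a new one on a different Kubernetes version.
# Arguments: resource group, cluster, old pool name, new pool name, version.
# Assumption: replace_spot_pool is an illustrative helper, and
# provisioningState "Succeeded" stands in for "status is Ready".
replace_spot_pool() {
    rg="$1"; cluster="$2"; old_pool="$3"; new_pool="$4"; k8s_version="$5"

    # Without --no-wait, this call returns once the operation finishes.
    az aks nodepool add \
        --resource-group "$rg" \
        --cluster-name "$cluster" \
        --name "$new_pool" \
        --kubernetes-version "$k8s_version" \
        --priority Spot \
        --eviction-policy Delete \
        --spot-max-price -1 \
        --enable-cluster-autoscaler \
        --min-count 1 \
        --max-count 3 || return 1

    state=$(az aks nodepool show \
        --resource-group "$rg" \
        --cluster-name "$cluster" \
        --name "$new_pool" \
        --query provisioningState \
        --output tsv)
    if [ "$state" != "Succeeded" ]; then
        echo "Pool $new_pool is in state '$state'; keeping $old_pool" >&2
        return 1
    fi

    # Remove the old pool only after the new one is in place.
    az aks nodepool delete \
        --resource-group "$rg" \
        --cluster-name "$cluster" \
        --name "$old_pool"
}
```

A call might look like replace_spot_pool myResourceGroup myAKSCluster spotnodepool spotpoolv2 followed by the target Kubernetes version, where spotpoolv2 is a placeholder name.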

Add a spot node pool to an AKS cluster

You must add a spot node pool to an existing cluster that has multiple node pools enabled. For more details, see the AKS documentation on creating a cluster with multiple node pools.

Create a node pool using the az aks nodepool add command:

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name spotnodepool \
    --priority Spot \
    --eviction-policy Delete \
    --spot-max-price -1 \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 3 \
    --no-wait

By default, node pools in an AKS cluster are created with a priority of Regular. The above command adds a secondary node pool with a priority of Spot to an existing AKS cluster, which makes it a spot node pool.

The eviction-policy parameter is set to Delete in the above example, which is the default value. With an eviction policy of Delete, nodes in the underlying scale set of the node pool are deleted when they're evicted. You can also set the eviction policy to Deallocate, in which case evicted nodes in the underlying scale set are set to the stopped-deallocated state. Nodes in the stopped-deallocated state count against your compute quota and can cause issues with cluster scaling or upgrading.

The priority and eviction-policy values can only be set during node pool creation and can't be updated later.

The command also enables the cluster autoscaler, which is recommended for spot node pools. Based on the workloads running in your cluster, the cluster autoscaler scales the number of nodes in the node pool up and down. For spot node pools, the cluster autoscaler scales up the number of nodes after an eviction if additional nodes are still needed. If you change the maximum number of nodes a node pool can have, you also need to adjust the maxCount value associated with the cluster autoscaler. If you don't use the cluster autoscaler, evictions will eventually reduce the spot pool to zero nodes, and a manual operation is required to receive any additional spot nodes.
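Adjusting the autoscaler range on an existing pool could look like the following sketch. It uses az aks nodepool update with --update-cluster-autoscaler. The wrapper function set_spot_autoscaler_range and its sanity check are illustrative assumptions, and the resource names are the placeholders used elsewhere in this article.

```shell
# Update the cluster autoscaler's min and max node counts for the spot pool.
# Assumption: set_spot_autoscaler_range is a hypothetical wrapper, and the
# resource group, cluster, and pool names are this article's placeholders.
set_spot_autoscaler_range() {
    min_count="$1"; max_count="$2"
    if [ "$min_count" -gt "$max_count" ]; then
        echo "min-count must not exceed max-count" >&2
        return 1
    fi
    az aks nodepool update \
        --resource-group myResourceGroup \
        --cluster-name myAKSCluster \
        --name spotnodepool \
        --update-cluster-autoscaler \
        --min-count "$min_count" \
        --max-count "$max_count"
}
```

For example, set_spot_autoscaler_range 1 5 would raise the upper bound from 3 to 5 nodes.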

Important

Only schedule workloads on spot node pools that can handle interruptions, such as batch processing jobs and testing environments. It is recommended that you set up taints and tolerations on your spot node pool to ensure that only workloads that can handle node evictions are scheduled on a spot node pool. For example, the above command by default adds a taint of kubernetes.azure.com/scalesetpriority=spot:NoSchedule so only pods with a corresponding toleration are scheduled on this node.

Verify the spot node pool

To verify your node pool has been added as a spot node pool:

az aks nodepool show --resource-group myResourceGroup --cluster-name myAKSCluster --name spotnodepool

Confirm scaleSetPriority is Spot.
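Rather than scanning the full JSON output, you could ask the CLI for just that one field with a JMESPath query. In this sketch the function name is_spot_pool is hypothetical; the --query and --output options are standard Azure CLI global arguments.

```shell
# Succeeds when the named node pool reports a scaleSetPriority of "Spot".
# Arguments: resource group, cluster name, node pool name.
# Assumption: is_spot_pool is an illustrative helper, not an Azure CLI command.
is_spot_pool() {
    priority=$(az aks nodepool show \
        --resource-group "$1" \
        --cluster-name "$2" \
        --name "$3" \
        --query scaleSetPriority \
        --output tsv)
    [ "$priority" = "Spot" ]
}
```

For example: is_spot_pool myResourceGroup myAKSCluster spotnodepool && echo "spot node pool confirmed"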

To schedule a pod to run on a spot node, add a toleration that corresponds to the taint applied to your spot node. The following example shows a portion of a YAML file that defines a toleration corresponding to the kubernetes.azure.com/scalesetpriority=spot:NoSchedule taint applied when the node pool was created.

spec:
  containers:
  - name: spot-example
  tolerations:
  - key: "kubernetes.azure.com/scalesetpriority"
    operator: "Equal"
    value: "spot"
    effect: "NoSchedule"
   ...

When a pod with this toleration is deployed, Kubernetes can successfully schedule the pod on the nodes with the taint applied.
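For a complete manifest to experiment with, the fragment can be expanded as in the following sketch. Several parts are assumptions rather than content from this article: the function name print_spot_pod_manifest, the pod name, and the nginx image are placeholders. The nodeSelector is an optional addition based on the spot node pool label listed under Limitations. A toleration only permits scheduling on tainted nodes, while the selector restricts the pod to them.

```shell
# Print a full pod manifest that tolerates the spot taint.
# Assumption: the pod name and the nginx image are placeholders, and the
# nodeSelector is an optional extra that pins the pod to spot nodes.
print_spot_pod_manifest() {
    cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: spot-example
spec:
  containers:
  - name: spot-example
    image: nginx
  tolerations:
  - key: "kubernetes.azure.com/scalesetpriority"
    operator: "Equal"
    value: "spot"
    effect: "NoSchedule"
  nodeSelector:
    kubernetes.azure.com/scalesetpriority: spot
EOF
}
```

The output can be piped straight to the cluster: print_spot_pod_manifest | kubectl apply -f -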

Max price for a spot pool

Pricing for spot instances is variable, based on region and SKU. For more information, see pricing for Linux and Windows.

With variable pricing, you have the option to set a max price, in US dollars (USD), using up to five decimal places. For example, the value 0.98765 would be a max price of $0.98765 USD per hour. If you set the max price to -1, the instance won't be evicted based on price. The price for the instance will be the current price for Spot or the price for a standard instance, whichever is less, as long as there is capacity and quota available.
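The format rule for the max price (either -1 or a positive value with up to five decimal places) can be checked locally before calling the CLI. This is a minimal sketch of such a check. The function name is_valid_spot_max_price is hypothetical, and the check covers only the format described in this article, not any additional validation the service may perform.

```shell
# Succeeds when $1 is -1 or a positive number with at most five decimals.
# Assumption: is_valid_spot_max_price is an illustrative helper; it mirrors
# only the format rule stated in this article.
is_valid_spot_max_price() {
    if [ "$1" = "-1" ]; then
        return 0
    fi
    # Digits, optionally followed by a dot and one to five more digits.
    if ! printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]{1,5})?$'; then
        return 1
    fi
    # Reject zero: at least one digit must be non-zero.
    printf '%s\n' "$1" | grep -q '[1-9]'
}
```

For example, is_valid_spot_max_price 0.98765 succeeds, while is_valid_spot_max_price 0.987654 fails because it has six decimal places.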

Next steps

In this article, you learned how to add a spot node pool to an AKS cluster. For more information about how to control pods across node pools, see Best practices for advanced scheduler features in AKS.