Add a spot node pool to an Azure Kubernetes Service (AKS) cluster

A spot node pool is a node pool backed by a spot virtual machine scale set. Using spot VMs for nodes in your AKS cluster lets you take advantage of unutilized capacity in Azure at a significant cost savings. The amount of available unutilized capacity varies based on many factors, including node size, region, and time of day.

When you deploy a spot node pool, Azure allocates the spot nodes if capacity is available. However, there's no SLA for spot nodes. The spot scale set that backs the spot node pool is deployed in a single fault domain and offers no high availability guarantees. Whenever Azure needs the capacity back, the Azure infrastructure evicts spot nodes.

Spot nodes are a good fit for workloads that can handle interruptions, early terminations, or evictions. For example, batch processing jobs, development and testing environments, and large compute workloads may be good candidates to schedule on a spot node pool.

In this article, you add a secondary spot node pool to an existing Azure Kubernetes Service (AKS) cluster.

This article assumes a basic understanding of Kubernetes and Azure Load Balancer concepts. For more information, see Kubernetes core concepts for Azure Kubernetes Service (AKS).

If you don't have an Azure subscription, create a free account before you begin.

Before you begin

When you create a cluster that will use a spot node pool, that cluster must also use Virtual Machine Scale Sets for node pools and the Standard SKU load balancer. To use a spot node pool, you must also add an additional node pool after you create the cluster. Adding an additional node pool is covered in a later step.

This article requires Azure CLI version 2.14 or later. Run az --version to find the version. If you need to install or upgrade, see Install Azure CLI.

Limitations

The following limitations apply when you create and manage AKS clusters with a spot node pool:

  • A spot node pool can't be the cluster's default node pool. A spot node pool can only be used as a secondary pool.
  • You can't upgrade a spot node pool, because spot node pools can't guarantee cordon and drain. To do operations such as upgrading the Kubernetes version, you must replace the existing spot node pool with a new one. To replace a spot node pool, create a new spot node pool with a different version of Kubernetes, wait until its status is Ready, then remove the old node pool.
  • The control plane and node pools can't be upgraded at the same time. You must upgrade them separately, or remove the spot node pool to upgrade the control plane and remaining node pools at the same time.
  • A spot node pool must use Virtual Machine Scale Sets.
  • You can't change ScaleSetPriority or SpotMaxPrice after creation.
  • When setting SpotMaxPrice, the value must be -1 or a positive value with up to five decimal places.
  • A spot node pool has the label kubernetes.azure.com/scalesetpriority:spot and the taint kubernetes.azure.com/scalesetpriority=spot:NoSchedule, and system pods have anti-affinity.
  • You must add a corresponding toleration to schedule workloads on a spot node pool.
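
The SpotMaxPrice rule in the list above (either -1, or a positive value with up to five decimal places) can be checked locally before calling the CLI. The following is a minimal sketch, not part of the Azure CLI: the function name is_valid_spot_max_price is hypothetical, and it assumes the value is written as a plain decimal with a leading digit (for example 0.5 rather than .5).

```shell
# Hypothetical helper (not part of the Azure CLI): returns 0 when the value
# matches the SpotMaxPrice rule described above, and 1 otherwise.
is_valid_spot_max_price() {
    value="$1"
    # -1 is the special value that disables price-based eviction.
    if [ "$value" = "-1" ]; then
        return 0
    fi
    # Require a plain decimal with at most five fractional digits.
    if ! printf '%s\n' "$value" | grep -Eq '^[0-9]+(\.[0-9]{1,5})?$'; then
        return 1
    fi
    # Reject zero, because the value must be positive.
    if printf '%s\n' "$value" | grep -Eq '^0+(\.0+)?$'; then
        return 1
    fi
    return 0
}

# Usage:
#   is_valid_spot_max_price 0.98765 && echo "ok"
```

A check like this only validates the format. Whether a given price is sensible for a particular region and VM size is a separate question.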

Add a spot node pool to an AKS cluster

You must add a spot node pool to an existing cluster that has multiple node pools enabled. For details on creating an AKS cluster with multiple node pools, see the AKS documentation on using multiple node pools.

Create a node pool using the az aks nodepool add command.

az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name spotnodepool \
    --priority Spot \
    --eviction-policy Delete \
    --spot-max-price -1 \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 3 \
    --no-wait

By default, when you create a cluster with multiple node pools, you create a node pool with a priority of Regular in your AKS cluster. The above command adds a secondary node pool with a priority of Spot to an existing AKS cluster. The priority of Spot makes the node pool a spot node pool.

The eviction-policy parameter is set to Delete in the above example, which is the default value. When you set the eviction policy to Delete, nodes in the underlying scale set of the node pool are deleted when they're evicted. You can also set the eviction policy to Deallocate. When you set the eviction policy to Deallocate, nodes in the underlying scale set are set to the stopped-deallocated state upon eviction. Nodes in the stopped-deallocated state count against your compute quota and can cause issues with cluster scaling or upgrading. The priority and eviction-policy values can only be set during node pool creation. Those values can't be updated later.

The command also enables the cluster autoscaler, which is recommended for spot node pools. Based on the workloads running in your cluster, the cluster autoscaler scales the number of nodes in the node pool up and down. For spot node pools, the cluster autoscaler scales up the number of nodes after an eviction if additional nodes are still needed. If you change the maximum number of nodes a node pool can have, you also need to adjust the maxCount value associated with the cluster autoscaler. If you don't use the cluster autoscaler, the spot pool will eventually decrease to zero as nodes are evicted, and a manual operation is required to receive any additional spot nodes.
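
As an illustration of adjusting the autoscaler limits mentioned above, the sketch below wraps az aks nodepool update with the --update-cluster-autoscaler flag. The wrapper name set_spot_pool_autoscaler_range is hypothetical, and the resource group, cluster, and pool names in the usage comment are the placeholder names used elsewhere in this article.

```shell
# Hypothetical wrapper: updates the cluster autoscaler range of an existing
# node pool. Arguments: resource group, cluster name, pool name, min, max.
set_spot_pool_autoscaler_range() {
    az aks nodepool update \
        --resource-group "$1" \
        --cluster-name "$2" \
        --name "$3" \
        --update-cluster-autoscaler \
        --min-count "$4" \
        --max-count "$5"
}

# Usage (placeholder names from this article):
#   set_spot_pool_autoscaler_range myResourceGroup myAKSCluster spotnodepool 1 5
```

This changes only the autoscaler range. The priority and eviction policy of the pool stay as they were set at creation.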

Important

Schedule only workloads that can handle interruptions, such as batch processing jobs and testing environments, on spot node pools. It is recommended that you set up taints and tolerations on your spot node pool to ensure that only workloads that can handle node evictions are scheduled on it. For example, the above command by default adds a taint of kubernetes.azure.com/scalesetpriority=spot:NoSchedule, so only pods with a corresponding toleration are scheduled on these nodes.

Verify the spot node pool

To verify that your node pool has been added as a spot node pool:

az aks nodepool show --resource-group myResourceGroup --cluster-name myAKSCluster --name spotnodepool

Confirm that scaleSetPriority is Spot.
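
For scripted checks, the same command can be narrowed to the single field with the global --query and --output options of the Azure CLI. This is a minimal sketch: the function name is_spot_node_pool is hypothetical, and it assumes the scaleSetPriority field appears at the top level of the command's JSON output, as the confirmation step above suggests.

```shell
# Hypothetical helper: succeeds only when the node pool reports a
# scaleSetPriority of Spot. Arguments: resource group, cluster name, pool name.
is_spot_node_pool() {
    priority=$(az aks nodepool show \
        --resource-group "$1" \
        --cluster-name "$2" \
        --name "$3" \
        --query scaleSetPriority \
        --output tsv)
    [ "$priority" = "Spot" ]
}

# Usage (placeholder names from this article):
#   is_spot_node_pool myResourceGroup myAKSCluster spotnodepool && echo "spot pool"
```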

To schedule a pod to run on a spot node, add a toleration that corresponds to the taint applied to your spot node. The following example shows a portion of a YAML file that defines a toleration corresponding to the kubernetes.azure.com/scalesetpriority=spot:NoSchedule taint applied in the previous step.

spec:
  containers:
  - name: spot-example
  tolerations:
  - key: "kubernetes.azure.com/scalesetpriority"
    operator: "Equal"
    value: "spot"
    effect: "NoSchedule"
   ...

When a pod with this toleration is deployed, Kubernetes can successfully schedule the pod on the nodes with the taint applied.
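
To try the toleration end to end, the fragment can be expanded into a complete pod manifest. The sketch below writes one to a file. The function name write_spot_pod_manifest and the nginx image are placeholders chosen for illustration and don't come from this article.

```shell
# Hypothetical helper: writes a complete pod manifest that carries the spot
# toleration to the file path given as the first argument.
write_spot_pod_manifest() {
    cat > "$1" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: spot-example
spec:
  containers:
  - name: spot-example
    image: nginx   # placeholder image, not from the article
  tolerations:
  - key: "kubernetes.azure.com/scalesetpriority"
    operator: "Equal"
    value: "spot"
    effect: "NoSchedule"
EOF
}

# Usage (requires kubectl access to the cluster):
#   write_spot_pod_manifest spot-pod.yaml && kubectl apply -f spot-pod.yaml
```

A toleration only permits the pod to land on tainted spot nodes. It doesn't require it, so the scheduler may still place the pod on a regular node.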

Max price for a spot pool

Pricing for spot instances is variable, based on region and SKU. For more information, see the pricing pages for Linux and Windows.

With variable pricing, you have the option to set a max price, in US dollars (USD), using up to five decimal places. For example, the value 0.98765 would be a max price of $0.98765 USD per hour. If you set the max price to -1, the instance won't be evicted based on price. The price for the instance will be the current price for Spot or the price for a standard instance, whichever is less, as long as there is capacity and quota available.
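
The pricing rule above can be expressed as a small calculation. The sketch below is illustrative only. The function name spot_price_outcome and the sample prices are hypothetical, and it assumes that a node with a max price other than -1 is evicted once the current Spot price exceeds that max price, which this article implies but doesn't state outright.

```shell
# Hypothetical helper: prints "evicted" or the hourly price that applies.
# Arguments: max price (-1 for no price cap), current Spot price,
# standard (regular) price, all in USD per hour.
spot_price_outcome() {
    awk -v max="$1" -v spot="$2" -v std="$3" 'BEGIN {
        # Assumption: price-based eviction happens only when a cap is set
        # and the current Spot price is above it.
        if (max != -1 && spot > max) { print "evicted"; exit }
        # Otherwise the lower of the Spot and standard prices applies.
        result = (spot < std) ? spot : std
        print result
    }'
}

# Usage (sample prices are made up):
#   spot_price_outcome -1 0.25 0.9
#   spot_price_outcome 0.2 0.25 0.9
```

Capacity-based evictions aren't modeled here. They can happen regardless of the max price setting.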

Next steps

In this article, you learned how to add a spot node pool to an AKS cluster. For more information about how to control pods across node pools, see Best practices for advanced scheduler features in AKS.