
Use low-priority VMs with Batch

Azure Batch offers low-priority virtual machines (VMs) to reduce the cost of Batch workloads. Low-priority VMs make new types of Batch workloads possible by providing a large amount of compute power at a very low cost.

Low-priority VMs take advantage of surplus capacity in Azure. When you specify low-priority VMs in your pools, Azure Batch uses this surplus when it is available.

The tradeoff for using low-priority VMs is that they may not be available for allocation, or may be preempted at any time, depending on available capacity. For this reason, low-priority VMs are most suitable for certain types of workloads. Use low-priority VMs for batch and asynchronous processing workloads where the job completion time is flexible and the work is distributed across many VMs.

Low-priority VMs are offered at a significantly reduced price compared with dedicated VMs. For pricing details, see Batch Pricing.

Use cases for low-priority VMs

Given the characteristics of low-priority VMs, which workloads can use them and which cannot? In general, batch processing workloads are a good fit, because jobs are broken into many parallel tasks, or there are many jobs that are scaled out and distributed across many VMs.

  • To maximize use of surplus capacity in Azure, suitable jobs can scale out.

  • Occasionally VMs may not be available or may be preempted, which reduces the capacity available to jobs and may lead to task interruption and reruns. Jobs must therefore be flexible in the time they can take to run.

  • Jobs with longer tasks may be impacted more if interrupted. If long-running tasks implement checkpointing to save progress as they execute, the impact of interruption is reduced. Tasks with shorter execution times tend to work best with low-priority VMs, because the impact of interruption is far less.

  • Long-running MPI jobs that use multiple VMs are not well suited to low-priority VMs, because one preempted VM can force the whole job to run again.
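The checkpointing idea mentioned in the list above can be sketched as follows. This is a minimal illustration, not part of the Batch SDK: the class CheckpointedWork, its methods, and the single-integer checkpoint format are all hypothetical. It also assumes the checkpoint path points to storage that outlives the node (for example, a mounted file share), because data stored locally on a preempted VM is lost.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper for illustration; none of these names come from the Batch SDK.
public static class CheckpointedWork
{
    // Returns the index of the next unprocessed item, or 0 if no usable checkpoint exists.
    public static int LoadCheckpoint(string path)
    {
        if (!File.Exists(path)) return 0;
        string text = File.ReadAllText(path).Trim();
        bool ok = int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out int next);
        return ok && next >= 0 ? next : 0;
    }

    // Writes to a temporary file first and then swaps it in, so an interruption
    // in the middle of a write does not leave a truncated checkpoint behind.
    public static void SaveCheckpoint(string path, int nextItem)
    {
        string temp = path + ".tmp";
        File.WriteAllText(temp, nextItem.ToString(CultureInfo.InvariantCulture));
        if (File.Exists(path)) File.Replace(temp, path, null);
        else File.Move(temp, path);
    }

    // Processes items [checkpoint, totalItems) and records progress after each item.
    // Returns the number of items processed in this run.
    public static int Run(string checkpointPath, int totalItems, Action<int> processItem)
    {
        int processed = 0;
        for (int i = LoadCheckpoint(checkpointPath); i < totalItems; i++)
        {
            processItem(i);
            SaveCheckpoint(checkpointPath, i + 1);
            processed++;
        }
        return processed;
    }
}
```

When a requeued task calls Run again with the same checkpoint path, it resumes from the first item that was not recorded as complete rather than from item 0. Saving after every item keeps the sketch simple; a real task would likely checkpoint less often to limit storage traffic.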

Some examples of batch processing use cases well suited to low-priority VMs are:

  • Development and testing: In particular, if large-scale solutions are being developed, significant savings can be realized. All types of testing can benefit, but large-scale load testing and regression testing are especially good uses.

  • Supplementing on-demand capacity: Low-priority VMs can supplement regular dedicated VMs. When they are available, jobs can scale out and therefore complete more quickly at lower cost; when they are not available, the baseline of dedicated VMs remains available.

  • Flexible job execution time: If there is flexibility in when jobs must complete, potential drops in capacity can be tolerated. With the addition of low-priority VMs, however, jobs frequently run faster and at lower cost.

Batch pools can be configured to use low-priority VMs in a few ways, depending on the flexibility in job execution time:

  • A pool can use low-priority VMs exclusively. In this case, Batch recovers any preempted capacity when it becomes available. This configuration is the cheapest way to execute jobs, because only low-priority VMs are used.

  • Low-priority VMs can be used in conjunction with a fixed baseline of dedicated VMs. The fixed number of dedicated VMs ensures there is always some capacity to keep a job progressing.

  • There can be a dynamic mix of dedicated and low-priority VMs, so that the cheaper low-priority VMs are used exclusively when available, and the full-priced dedicated VMs are scaled up when required. This configuration keeps a minimum amount of capacity available to keep jobs progressing.

Batch support for low-priority VMs

Azure Batch provides several capabilities that make it easy to consume and benefit from low-priority VMs:

  • Batch pools can contain both dedicated VMs and low-priority VMs. The number of each type of VM can be specified when a pool is created, or changed at any time for an existing pool, using the explicit resize operation or autoscale. Job and task submission can remain unchanged, regardless of the VM types in the pool. You can also configure a pool to use only low-priority VMs to run jobs as cheaply as possible, but spin up dedicated VMs if the capacity drops below a minimum threshold, to keep jobs running.

  • Batch pools automatically seek the target number of low-priority VMs. If VMs are preempted, Batch attempts to replace the lost capacity and return to the target.

  • When tasks are interrupted, Batch detects the interruption and automatically requeues the tasks to run again.

  • Low-priority VMs have a separate vCPU quota that differs from the one for dedicated VMs. The quota for low-priority VMs is higher than the quota for dedicated VMs, because low-priority VMs cost less. For more information, see Batch service quotas and limits.


Low-priority VMs are not currently supported for Batch accounts created in user subscription mode.

Create and update pools

A Batch pool can contain both dedicated and low-priority VMs (also referred to as compute nodes). You can set the target number of compute nodes separately for dedicated and low-priority VMs. The target number of nodes specifies the number of VMs you want to have in the pool.

For example, to create a pool using Azure cloud service VMs with a target of 5 dedicated VMs and 20 low-priority VMs:

CloudPool pool = batchClient.PoolOperations.CreatePool(
    poolId: "cspool",
    targetDedicatedComputeNodes: 5,
    targetLowPriorityComputeNodes: 20,
    virtualMachineSize: "Standard_D2_v2",
    cloudServiceConfiguration: new CloudServiceConfiguration(osFamily: "5") // WS 2016
);

To create a pool using Azure virtual machines (in this case Linux VMs) with a target of 5 dedicated VMs and 20 low-priority VMs:

ImageReference imageRef = new ImageReference(
    publisher: "Canonical",
    offer: "UbuntuServer",
    sku: "16.04-LTS",
    version: "latest");

// Pair the image with the matching Batch node agent SKU
VirtualMachineConfiguration virtualMachineConfiguration =
    new VirtualMachineConfiguration(
        imageReference: imageRef,
        nodeAgentSkuId: "batch.node.ubuntu 16.04");

// Create the pool
CloudPool pool = batchClient.PoolOperations.CreatePool(
    poolId: "vmpool",
    targetDedicatedComputeNodes: 5,
    targetLowPriorityComputeNodes: 20,
    virtualMachineSize: "Standard_D2_v2",
    virtualMachineConfiguration: virtualMachineConfiguration);

You can get the current number of nodes for both dedicated and low-priority VMs:

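// Fetch the pool from the service to read its current node counts
CloudPool pool1 = batchClient.PoolOperations.GetPool("vmpool");
int? numDedicated = pool1.CurrentDedicatedComputeNodes;
int? numLowPri = pool1.CurrentLowPriorityComputeNodes;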

Pool nodes have a property that indicates whether the node is a dedicated or low-priority VM:

bool? isNodeDedicated = poolNode.IsDedicated;

When one or more nodes in a pool are preempted, a list nodes operation on the pool still returns those nodes. The current number of low-priority nodes remains unchanged, but those nodes have their state set to Preempted. Batch attempts to find replacement VMs and, if successful, the nodes go through the Creating and then Starting states before becoming available for task execution, just like new nodes.

Scale a pool containing low-priority VMs

As with pools consisting solely of dedicated VMs, you can scale a pool containing low-priority VMs by calling the Resize method or by using autoscale.

The pool resize operation takes a second optional parameter, targetLowPriorityComputeNodes, that updates the target number of low-priority nodes:

pool.Resize(targetDedicatedComputeNodes: 0, targetLowPriorityComputeNodes: 25);

The pool autoscale formula supports low-priority VMs as follows:

  • You can get or set the value of the service-defined variable $TargetLowPriorityNodes.

  • You can get the value of the service-defined variable $CurrentLowPriorityNodes.

  • You can get the value of the service-defined variable $PreemptedNodeCount. This variable returns the number of nodes in the preempted state and allows you to scale the number of dedicated nodes up or down, depending on the number of preempted nodes that are unavailable.
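As an illustration of how these variables might be combined, the sketch below holds an autoscale formula in a C# string constant. The formula starts from an all-low-priority pool and adds one dedicated node for each preempted node, up to a cap. The class name LowPriorityAutoscale, the user-defined variable maxNodes, the cap of 25, and the 180-second sampling window are assumptions chosen for the example. $TargetDedicatedNodes is the service-defined counterpart of $TargetLowPriorityNodes for dedicated nodes. Treat this as a starting point and evaluate the formula against a test pool before relying on it.

```csharp
// Hypothetical container for an autoscale formula; only the $-prefixed
// variables and the GetSample/TimeInterval_Second syntax belong to Batch.
public static class LowPriorityAutoscale
{
    // Replace preempted low-priority nodes with dedicated nodes, capped at maxNodes.
    // The low-priority target shrinks by the same amount so the pool total stays at maxNodes.
    public const string Formula = @"
maxNodes = 25;
$TargetDedicatedNodes = min(maxNodes, $PreemptedNodeCount.GetSample(180 * TimeInterval_Second));
$TargetLowPriorityNodes = min(maxNodes, maxNodes - $TargetDedicatedNodes);
";
}
```

With this shape, the pool runs entirely on low-priority nodes while none are preempted, and shifts toward dedicated nodes only as preemptions are observed.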

Jobs and tasks

Jobs and tasks require little additional configuration for low-priority nodes. The only additions are as follows:

  • The JobManagerTask property of a job has a new property, AllowLowPriorityNode. When this property is true, the job manager task can be scheduled on either a dedicated or a low-priority node. When this property is false, the job manager task is scheduled on a dedicated node only.

  • An environment variable is available to a task application so that it can determine whether it is running on a low-priority or a dedicated node. The environment variable is AZ_BATCH_NODE_IS_DEDICATED.
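A task application could read the environment variable described above along the following lines. This is a sketch under one assumption: that the variable holds the text true or false. The class NodeInfo and its members are hypothetical names for the example, not part of the Batch SDK.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Batch SDK.
public static class NodeInfo
{
    public const string IsDedicatedVariable = "AZ_BATCH_NODE_IS_DEDICATED";

    // Returns true or false when the value parses as a Boolean (case-insensitive),
    // and null when the value is missing or has an unexpected format.
    public static bool? ParseIsDedicated(string rawValue)
    {
        return bool.TryParse(rawValue, out bool isDedicated) ? isDedicated : (bool?)null;
    }

    // Reads the variable from the current process environment.
    public static bool? IsRunningOnDedicatedNode()
    {
        return ParseIsDedicated(Environment.GetEnvironmentVariable(IsDedicatedVariable));
    }
}
```

A task might use the result to decide how often to save progress, for example checkpointing more frequently when IsRunningOnDedicatedNode() returns false. Returning null for an unrecognized value lets the caller choose the safer default itself.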

Handling preemption

VMs may occasionally be preempted. When preemption happens, Batch does the following:

  • The preempted VMs have their state updated to Preempted.
  • If tasks were running on the preempted VMs, those tasks are requeued and run again.
  • Each preempted VM is effectively deleted, so any data stored locally on the VM is lost.
  • The pool continually attempts to reach the target number of low-priority nodes. When replacement capacity is found, the nodes keep their IDs but are reinitialized, going through the Creating and Starting states before they are available for task scheduling.
  • Preemption counts are available as a metric in the Azure portal.


New metrics are available in the Azure portal for low-priority nodes. These metrics are:

  • Low-Priority Node Count
  • Low-Priority Core Count
  • Preempted Node Count

To view metrics in the Azure portal:

  1. Navigate to your Batch account in the portal, and view the settings for your Batch account.
  2. Select Metrics from the Monitoring section.
  3. Select the metrics you want from the Available Metrics list.


Next steps

  • Read the Batch feature overview for developers, essential information for anyone preparing to use Batch. The article contains more detailed information about Batch service resources like pools, nodes, jobs, and tasks, and the many API features that you can use while building your Batch application.
  • Learn about the Batch APIs and tools available for building Batch solutions.