Manage and increase quotas for resources with Azure Machine Learning

Azure uses limits and quotas to prevent budget overruns due to fraud, and to honor Azure capacity constraints. Consider these limits as you scale for production workloads. In this article, you learn about:

  • Default limits on Azure resources related to Azure Machine Learning.
  • Creating workspace-level quotas.
  • Viewing your quotas and limits.
  • Requesting quota increases.
  • Private endpoint and DNS quotas.

Along with managing quotas, you can learn how to plan and manage costs for Azure Machine Learning or learn about the service limits in Azure Machine Learning.

Special considerations

  • A quota is a credit limit, not a capacity guarantee. If you have large-scale capacity needs, contact Azure support to increase your quota.

  • A quota is shared across all the services in your subscriptions, including Azure Machine Learning. Calculate usage across all services when you're evaluating capacity.

    Azure Machine Learning compute is an exception. It has a separate quota from the core compute quota.

  • Default limits vary by offer category type (such as free trial or pay-as-you-go) and by virtual machine (VM) series (such as Dv2, F, and G).

Default resource quotas

In this section, you learn about the default and maximum quota limits for the following resources:

  • Azure Machine Learning assets
    • Azure Machine Learning compute
    • Azure Machine Learning pipelines
  • Virtual machines
  • Azure Container Instances
  • Azure Storage

Important

Limits are subject to change. For the latest information, see Service limits in Azure Machine Learning.

Azure Machine Learning assets

The following limits on assets apply on a per-workspace basis.

Resource | Maximum limit
Datasets | 10 million
Runs | 10 million
Models | 10 million
Artifacts | 10 million

In addition, the maximum run time is 30 days and the maximum number of metrics logged per run is 1 million.

Azure Machine Learning Compute

Azure Machine Learning Compute has a default quota limit on both the number of cores (split by each VM family and by cumulative total cores) and the number of unique compute resources allowed per region in a subscription. This quota is separate from the VM core quota described in the Virtual machines section, because it applies only to the managed compute resources of Azure Machine Learning.

Request a quota increase to raise the limits for the VM family core quotas, total subscription core quotas, and resources in this section.

Available resources:

  • Dedicated cores per region have a default limit of 24 to 300, depending on your subscription offer type. You can increase the number of dedicated cores per subscription for each VM family. Specialized VM families like NCv2, NCv3, or ND series start with a default of zero cores.

  • Low-priority cores per region have a default limit of 100 to 3,000, depending on your subscription offer type. The number of low-priority cores per subscription can be increased and is a single value across VM families.

  • Clusters per region have a default limit of 200. These are shared between training clusters and compute instances. (A compute instance is considered a single-node cluster for quota purposes.)
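The cluster limit above can be sketched as a small check. This is only an illustration of the counting rule, assuming the default limit of 200 from this section; the function names are hypothetical and are not part of any Azure SDK.

```python
# Hypothetical helpers illustrating how the per-region cluster quota is counted.
# The default of 200 comes from this section; your actual limit may differ.
DEFAULT_CLUSTER_LIMIT = 200


def clusters_used(training_clusters: int, compute_instances: int) -> int:
    """Each compute instance counts as one single-node cluster for quota purposes."""
    return training_clusters + compute_instances


def can_create_cluster(training_clusters: int, compute_instances: int,
                       limit: int = DEFAULT_CLUSTER_LIMIT) -> bool:
    """Return True if one more cluster or compute instance would still fit in the limit."""
    return clusters_used(training_clusters, compute_instances) < limit


if __name__ == "__main__":
    print(can_create_cluster(150, 49))
    print(can_create_cluster(150, 50))
```

With 150 training clusters and 50 compute instances in one region, the shared limit of 200 is already reached, so the second call reports that nothing more can be created.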

Tip

To learn more about which VM family to request a quota increase for, see virtual machine sizes in Azure. For instance, GPU VM families start with an "N" in their family name (for example, the NCv3 series).

The following table shows additional limits in the platform. To request an exception, contact the AzureML product team through a technical support ticket.

Resource or action | Maximum limit
Workspaces per resource group | 800
Nodes in a single Azure Machine Learning Compute (AmlCompute) cluster set up as a non-communication-enabled pool (that is, one that can't run MPI jobs) | 100 nodes, but configurable up to 65,000 nodes
Nodes in a single Parallel Run Step run on an Azure Machine Learning Compute (AmlCompute) cluster | 100 nodes, but configurable up to 65,000 nodes if your cluster is set up to scale as described in the previous row
Nodes in a single Azure Machine Learning Compute (AmlCompute) cluster set up as a communication-enabled pool | 300 nodes, but configurable up to 4,000 nodes
Nodes in a single Azure Machine Learning Compute (AmlCompute) cluster set up as a communication-enabled pool on an RDMA-enabled VM family | 100 nodes
Nodes in a single MPI run on an Azure Machine Learning Compute (AmlCompute) cluster | 100 nodes, but can be increased to 300 nodes
GPU MPI processes per node | 1-4
GPU workers per node | 1-4
Job lifetime | 21 days¹
Job lifetime on a low-priority node | 7 days²
Parameter servers per node | 1

¹ Maximum lifetime is the duration between when a run starts and when it finishes. Completed runs persist indefinitely. Data for runs not completed within the maximum lifetime is not accessible.

² Jobs on a low-priority node can be preempted whenever there's a capacity constraint. We recommend that you implement checkpoints in your job.
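One way to follow the checkpoint recommendation is sketched below. It is a minimal, generic pattern and not an Azure Machine Learning API: the file layout, the `run_job` function, and the `stop_after` parameter (which only simulates a preemption) are all assumptions made for illustration.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temporary file, then rename, so an interrupted write can't corrupt the checkpoint."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def run_job(path, num_steps, stop_after=None):
    """Resume from the last checkpoint; stop_after simulates a preemption at that step."""
    state = load_checkpoint(path)
    for step in range(state["next_step"], num_steps):
        if stop_after is not None and step >= stop_after:
            return state  # simulated preemption: progress so far is already saved
        state["total"] += step  # stand-in for one unit of real work
        state["next_step"] = step + 1
        save_checkpoint(path, state)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        checkpoint = os.path.join(workdir, "checkpoint.json")
        print(run_job(checkpoint, 10, stop_after=4))  # interrupted partway through
        print(run_job(checkpoint, 10))                # resumes at step 4 instead of step 0
```

In a real job, the checkpoint would need to live on storage that survives the node being reclaimed, and the state would hold model weights or similar rather than a running sum.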

Azure Machine Learning pipelines

Azure Machine Learning pipelines have the following limits.

Resource | Limit
Steps in a pipeline | 30,000
Workspaces per resource group | 800

Virtual machines

Each Azure subscription has a limit on the number of virtual machines across all services. Virtual machine cores have a regional total limit and a regional limit per size series. Both limits are separately enforced.

For example, consider a subscription with a US East total VM core limit of 30, an A series core limit of 30, and a D series core limit of 30. This subscription would be allowed to deploy 30 A1 VMs, or 30 D1 VMs, or a combination of the two that doesn't exceed a total of 30 cores.
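The example above can be checked with a short sketch. It assumes, as the example implies, that A1 and D1 VMs have one core each; the function name and the tuple layout are hypothetical and exist only for illustration.

```python
def fits_core_limits(requests, total_limit, series_limits):
    """Check a deployment against both the regional total and the per-series core limits.

    requests: list of (series, cores_per_vm, vm_count) tuples.
    series_limits: mapping of series name to its regional core limit.
    """
    used = {}
    for series, cores_per_vm, vm_count in requests:
        used[series] = used.get(series, 0) + cores_per_vm * vm_count

    # Both limits are enforced separately, so both checks must pass.
    within_series = all(
        cores <= series_limits.get(series, 0) for series, cores in used.items()
    )
    within_total = sum(used.values()) <= total_limit
    return within_series and within_total


if __name__ == "__main__":
    limits = {"A": 30, "D": 30}
    print(fits_core_limits([("A", 1, 30)], 30, limits))               # 30 A1 VMs
    print(fits_core_limits([("A", 1, 20), ("D", 1, 10)], 30, limits)) # mixed, 30 cores
    print(fits_core_limits([("A", 1, 20), ("D", 1, 20)], 30, limits)) # mixed, 40 cores
```

The last request stays within each series limit but needs 40 cores in total, so it fails the regional total check.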

You can't raise limits for virtual machines above the values shown in the following table.

Resource | Limit
Subscriptions per Azure Active Directory tenant | Unlimited
Coadministrators per subscription | Unlimited
Resource groups per subscription | 980
Azure Resource Manager API request size | 4,194,304 bytes
Tags per subscription¹ | 50
Unique tag calculations per subscription¹ | 10,000
Subscription-level deployments per location | 800²

¹ You can apply up to 50 tags directly to a subscription. However, the subscription can contain an unlimited number of tags that are applied to resource groups and resources within the subscription. The number of tags per resource or resource group is limited to 50. Resource Manager returns a list of unique tag names and values in the subscription only when the number of tags is 10,000 or less. You can still find a resource by tag when the number exceeds 10,000.

² If you reach the limit of 800 deployments, delete deployments that are no longer needed from the history. To delete subscription-level deployments, use Remove-AzDeployment or az deployment sub delete.

Container Instances

For more information, see Container Instances limits.

Storage

Azure Storage has a limit of 250 storage accounts per region, per subscription. This limit includes both Standard and Premium storage accounts.

To increase the limit, make a request through Azure Support. The Azure Storage team will review your case and can approve up to 250 storage accounts for a region.

Workspace-level quotas

Use workspace-level quotas to manage Azure Machine Learning compute target allocation between multiple workspaces in the same subscription.

By default, all workspaces share the same quota as the subscription-level quota for VM families. However, you can set a maximum quota for individual VM families on workspaces in a subscription. This lets you share capacity and avoid resource contention issues.

  1. Go to any workspace in your subscription.
  2. In the left pane, select Usages + quotas.
  3. Select the Configure quotas tab to view the quotas.
  4. Expand a VM family.
  5. Set a quota limit on any workspace listed under that VM family.

You can't set a negative value or a value higher than the subscription-level quota.
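The constraint on workspace-level values can be expressed as a simple validation. This is only a sketch of the rule stated above; the function name is hypothetical and is not an Azure API.

```python
def validate_workspace_quota(requested_cores: int, subscription_quota: int) -> int:
    """Return the requested value if it is a valid workspace-level quota for a VM family."""
    if requested_cores < 0:
        raise ValueError("A workspace-level quota can't be negative.")
    if requested_cores > subscription_quota:
        raise ValueError(
            "A workspace-level quota can't be higher than the subscription-level quota."
        )
    return requested_cores


if __name__ == "__main__":
    print(validate_workspace_quota(24, 100))  # accepted
    try:
        validate_workspace_quota(120, 100)    # above the subscription-level quota
    except ValueError as err:
        print(err)
```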

Screenshot that shows an Azure Machine Learning workspace-level quota.

Note

You need subscription-level permissions to set a quota at the workspace level.

View your usage and quotas

To view your quota for various Azure resources like virtual machines, storage, or network, use the Azure portal:

  1. On the left pane, select All services and then select Subscriptions under the General category.

  2. From the list of subscriptions, select the subscription whose quota you're looking for.

  3. Select Usage + quotas to view your current quota limits and usage. Use the filters to select the provider and locations.

You manage the Azure Machine Learning compute quota on your subscription separately from other Azure quotas:

  1. Go to your Azure Machine Learning workspace in the Azure portal.

  2. On the left pane, in the Support + troubleshooting section, select Usage + quotas to view your current quota limits and usage.

  3. Select a subscription to view the quota limits. Filter to the region you're interested in.

  4. You can switch between a subscription-level view and a workspace-level view.

Request quota increases

To raise the limit or quota above the default limit, open an online customer support request at no charge.

You can't raise limits above the maximum values shown in the preceding tables. If there's no maximum limit, you can't adjust the limit for the resource.

When you request a quota increase, select the service that you need the increase for. For example, select Azure Machine Learning, Container Instances, or Storage. For Azure Machine Learning compute, you can select the Request Quota button while viewing the quota in the preceding steps.

Note

Free trial subscriptions are not eligible for limit or quota increases. If you have a free trial subscription, you can upgrade to a pay-as-you-go subscription. For more information, see Upgrade Azure free trial to pay-as-you-go and Azure free account FAQ.

Private endpoint and private DNS quota increases

There are limits on the number of private endpoints and private DNS zones that you can create in a subscription.

Azure Machine Learning creates resources in your (customer) subscription, but some scenarios create resources in a Microsoft-owned subscription.

In the following scenarios, you might need to request a quota allowance in the Microsoft-owned subscription:

  • Azure Private Link enabled workspace with a customer-managed key (CMK)
  • Azure Container Registry for the workspace behind your virtual network
  • Attaching a Private Link enabled Azure Kubernetes Service cluster to your workspace

To request an allowance for these scenarios, use the following steps:

  1. Create an Azure support request and select the following options in the Basics section:

    Field | Selection
    Issue type | Technical
    Service | My services. Then select Machine Learning in the drop-down list.
    Problem type | Workspace Configuration and Security
    Problem subtype | Private Endpoint and Private DNS Zone allowance request

  2. In the Details section, use the Description field to provide the Azure region and the scenario that you plan to use. If you need to request quota increases for multiple subscriptions, list the subscription IDs in this field.

  3. Select Create to create the request.

Screenshot that shows a private endpoint and private DNS quota increase request.

Next steps