Manage and increase quotas for resources with Azure Machine Learning
Azure uses limits and quotas to prevent budget overruns due to fraud, and to honor Azure capacity constraints. Consider these limits as you scale for production workloads. In this article, you learn about:
- Default limits on Azure resources related to Azure Machine Learning.
- Creating workspace-level quotas.
- Viewing your quotas and limits.
- Requesting quota increases.
- Private endpoint and DNS quotas.
A quota is a credit limit, not a capacity guarantee. If you have large-scale capacity needs, contact Azure support to increase your quota.
A quota is shared across all the services in your subscriptions, including Azure Machine Learning. Calculate usage across all services when you're evaluating capacity.
Azure Machine Learning compute is an exception. It has a separate quota from the core compute quota.
Default limits vary by offer category type, such as free trial, pay-as-you-go, and virtual machine (VM) series (such as Dv2, F, and G).
Default resource quotas
In this section, you learn about the default and maximum quota limits for the following resources:
- Azure Machine Learning assets
- Azure Machine Learning compute
- Azure Machine Learning pipelines
- Virtual machines
- Azure Container Instances
- Azure Storage
Limits are subject to change. For the latest information, see Service limits in Azure Machine Learning.
Azure Machine Learning assets
The following limits on assets apply on a per-workspace basis.
In addition, the maximum run time is 30 days and the maximum number of metrics logged per run is 1 million.
Azure Machine Learning Compute
Azure Machine Learning Compute has a default quota limit on both the number of cores (split by VM family and by cumulative total cores) and the number of unique compute resources allowed per region in a subscription. This quota is separate from the general VM core quota because it applies only to the managed compute resources of Azure Machine Learning.
Request a quota increase to raise the limits for the VM family core quotas, total subscription core quotas, and resources in this section.
Dedicated cores per region have a default limit of 24 to 300, depending on your subscription offer type. You can increase the number of dedicated cores per subscription for each VM family. Specialized VM families like NCv2, NCv3, or ND series start with a default of zero cores.
Low-priority cores per region have a default limit of 100 to 3,000, depending on your subscription offer type. The number of low-priority cores per subscription can be increased and is a single value across VM families.
Clusters per region have a default limit of 200. These are shared between a training cluster and a compute instance. (A compute instance is considered a single-node cluster for quota purposes.)
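The arithmetic behind these limits can be sketched as follows. This is a minimal illustration, not part of any Azure SDK: the function names are hypothetical, and the default of 200 clusters per region is taken from the text above.

```python
def max_nodes_within_core_quota(core_quota, cores_per_node, cores_in_use=0):
    """Largest node count of one VM size that fits in the remaining core quota."""
    if cores_per_node <= 0:
        raise ValueError("cores_per_node must be positive")
    remaining = max(core_quota - cores_in_use, 0)
    return remaining // cores_per_node


def clusters_used(training_clusters, compute_instances):
    """A compute instance counts as one single-node cluster for quota purposes."""
    return training_clusters + compute_instances


def can_create_cluster(training_clusters, compute_instances, cluster_limit=200):
    """True if one more cluster or compute instance fits under the regional limit."""
    return clusters_used(training_clusters, compute_instances) < cluster_limit
```

For example, with a dedicated core quota of 24 and a 4-core VM size, `max_nodes_within_core_quota(24, 4)` gives 6 nodes.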
To learn more about which VM family to request a quota increase for, see virtual machine sizes in Azure. For instance, GPU VM families start with an "N" in their family name (for example, the NCv3 series).
The following table shows additional limits in the platform. To request an exception, contact the Azure Machine Learning product team through a technical support ticket.
|Resource or Action||Maximum limit|
|Workspaces per resource group||800|
|Nodes in a single Azure Machine Learning Compute (AmlCompute) cluster set up as a non-communication-enabled pool (that is, it can't run MPI jobs)||100 nodes, configurable up to 65,000 nodes|
|Nodes in a single Parallel Run Step run on an Azure Machine Learning Compute (AmlCompute) cluster||100 nodes, configurable up to 65,000 nodes if your cluster is set up to scale as described in the previous row|
|Nodes in a single Azure Machine Learning Compute (AmlCompute) cluster set up as a communication-enabled pool||300 nodes, configurable up to 4,000 nodes|
|Nodes in a single Azure Machine Learning Compute (AmlCompute) cluster set up as a communication-enabled pool on an RDMA-enabled VM family||100 nodes|
|Nodes in a single MPI run on an Azure Machine Learning Compute (AmlCompute) cluster||100 nodes, can be increased to 300 nodes|
|GPU MPI processes per node||1-4|
|GPU workers per node||1-4|
|Job lifetime||21 days¹|
|Job lifetime on a low-priority node||7 days²|
|Parameter servers per node||1|
¹ Maximum lifetime is the duration between when a run starts and when it finishes. Completed runs persist indefinitely. Data for runs not completed within the maximum lifetime is not accessible.
² Jobs on a low-priority node can be preempted whenever there's a capacity constraint. We recommend that you implement checkpoints in your job.
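The checkpoint recommendation for low-priority nodes can be sketched as follows. This is a minimal, framework-agnostic illustration under stated assumptions: the checkpoint is a small JSON file, the function names are hypothetical, and the `preempt_at` parameter exists only to simulate a preemption. A real training job would save model and optimizer state to durable storage instead.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the last saved step, or 0 if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["step"]


def save_checkpoint(path, step):
    """Write to a temp file, then rename, so a preemption mid-write can't corrupt it."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step}, f)
    os.replace(tmp, path)


def run_job(path, total_steps, preempt_at=None):
    """Resume from the last checkpoint; optionally simulate a preemption."""
    step = load_checkpoint(path)
    while step < total_steps:
        if preempt_at is not None and step == preempt_at:
            return step  # simulated preemption: progress so far is on disk
        step += 1  # placeholder for one unit of real work
        save_checkpoint(path, step)
    return step
```

When the job is rescheduled after a preemption, calling `run_job` again with the same checkpoint path continues from the last saved step instead of starting over.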
Azure Machine Learning pipelines
Azure Machine Learning pipelines have the following limits.
|Steps in a pipeline||30,000|
|Workspaces per resource group||800|
Virtual machines
Each Azure subscription has a limit on the number of virtual machines across all services. Virtual machine cores have a regional total limit and a regional limit per size series. Both limits are separately enforced.
For example, consider a subscription with a US East total VM core limit of 30, an A series core limit of 30, and a D series core limit of 30. This subscription would be allowed to deploy 30 A1 VMs, or 30 D1 VMs, or a combination of the two that does not exceed a total of 30 cores.
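The two separately enforced limits in this example can be expressed as a short check. This is a sketch only: the function name and dictionary layout are hypothetical, and it assumes one core per A1 or D1 VM, as the example implies.

```python
def fits_core_limits(requested_cores, series_limits, regional_total_limit):
    """Check a request against both the per-series and the regional total core limits.

    requested_cores maps a VM series name to the number of cores requested.
    """
    for series, cores in requested_cores.items():
        # A series without a configured limit is treated as having zero quota.
        if cores > series_limits.get(series, 0):
            return False
    return sum(requested_cores.values()) <= regional_total_limit


limits = {"A": 30, "D": 30}

print(fits_core_limits({"A": 30}, limits, 30))           # 30 A1 VMs
print(fits_core_limits({"A": 15, "D": 15}, limits, 30))  # mixed, 30 cores total
print(fits_core_limits({"A": 20, "D": 20}, limits, 30))  # 40 cores total
```

The first two requests pass both checks. The third stays within each series limit but exceeds the regional total of 30 cores, so it is rejected.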
You can't raise limits for virtual machines above the values shown in the following table.
|Subscriptions associated with an Azure Active Directory tenant||Unlimited|
|Coadministrators per subscription||Unlimited|
|Resource groups per subscription||980|
|Azure Resource Manager API request size||4,194,304 bytes|
|Tags per subscription¹||50|
|Unique tag calculations per subscription¹||80,000|
|Subscription-level deployments per location||800²|
¹ You can apply up to 50 tags directly to a subscription. However, the subscription can contain an unlimited number of tags that are applied to resource groups and resources within the subscription. The number of tags per resource or resource group is limited to 50. Resource Manager returns a list of unique tag names and values in the subscription only when the number of tags is 80,000 or less. You can still find a resource by tag when the number exceeds 80,000.
² If you reach the limit of 800 deployments, delete deployments that are no longer needed from the history. To delete subscription-level deployments, use Remove-AzDeployment or az deployment sub delete.
Azure Container Instances
For more information, see Container Instances limits.
Azure Storage
Azure Storage has a limit of 250 storage accounts per region, per subscription. This limit includes both Standard and Premium storage accounts.
To increase the limit, make a request through Azure Support. The Azure Storage team will review your case and can approve up to 250 storage accounts for a region.
Workspace-level quotas
Use workspace-level quotas to manage Azure Machine Learning compute target allocation between multiple workspaces in the same subscription.
By default, all workspaces share the same quota as the subscription-level quota for VM families. However, you can set a maximum quota for individual VM families on workspaces in a subscription. This lets you share capacity and avoid resource contention issues.
- Go to any workspace in your subscription.
- In the left pane, select Usage + quotas.
- Select the Configure quotas tab to view the quotas.
- Expand a VM family.
- Set a quota limit on any workspace listed under that VM family.
You can't set a negative value or a value higher than the subscription-level quota.
You need subscription-level permissions to set a quota at the workspace level.
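The value constraint on a workspace-level quota can be sketched as a simple validation. The function name is hypothetical and the check only mirrors the rule stated above. It doesn't call any Azure API, and it doesn't cover the permission requirement.

```python
def validate_workspace_quota(value, subscription_quota):
    """Return the value if it is a valid workspace-level quota for a VM family."""
    if value < 0:
        raise ValueError("workspace quota can't be negative")
    if value > subscription_quota:
        raise ValueError("workspace quota can't exceed the subscription-level quota")
    return value
```

With a subscription-level quota of 24 cores for a VM family, any value from 0 through 24 is accepted, and values such as -1 or 25 raise `ValueError`.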
View your usage and quotas
To view your quota for various Azure resources like virtual machines, storage, or network, use the Azure portal:
- On the left pane, select All services and then select Subscriptions under the General category.
- From the list of subscriptions, select the subscription whose quota you're looking for.
- Select Usage + quotas to view your current quota limits and usage. Use the filters to select the provider and locations.
You manage the Azure Machine Learning compute quota on your subscription separately from other Azure quotas:
- Go to your Azure Machine Learning workspace in the Azure portal.
- On the left pane, in the Support + troubleshooting section, select Usage + quotas to view your current quota limits and usage.
- Select a subscription to view the quota limits. Filter to the region you're interested in.
You can switch between a subscription-level view and a workspace-level view.
Request quota increases
To raise the limit or quota above the default limit, open an online customer support request at no charge.
You can't raise limits above the maximum values shown in the preceding tables. If there's no maximum limit, you can't adjust the limit for the resource.
When you're requesting a quota increase, select the service that you have in mind. For example, select Azure Machine Learning, Container Instances, or Storage. For Azure Machine Learning compute, you can select the Request Quota button while viewing the quota in the preceding steps.
Free trial subscriptions are not eligible for limit or quota increases. If you have a free trial subscription, you can upgrade to a pay-as-you-go subscription. For more information, see Upgrade Azure free trial to pay-as-you-go and Azure free account FAQ.
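The eligibility rules in this section can be summarized in a small decision function. This is a sketch under stated assumptions: the function name and the `"free_trial"` and `"pay_as_you_go"` strings are hypothetical labels, and `maximum=None` stands for a resource with no maximum limit listed in the tables.

```python
def can_request_increase(offer_type, requested, maximum):
    """Apply the rules above: offer type, adjustable limit, and maximum value."""
    if offer_type == "free_trial":
        return False  # free trial subscriptions aren't eligible
    if maximum is None:
        return False  # no maximum limit means the limit isn't adjustable
    return requested <= maximum  # can't go above the listed maximum
```

For example, the table of platform limits lists 300 nodes as the ceiling for a single MPI run, so `can_request_increase("pay_as_you_go", 300, 300)` is true, while a request for 301 nodes is not.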
Private endpoint and private DNS quota increases
There are limits on the number of private endpoints and private DNS zones that you can create in a subscription.
Azure Machine Learning creates resources in your (customer) subscription, but some scenarios create resources in a Microsoft-owned subscription.
In the following scenarios, you might need to request a quota allowance in the Microsoft-owned subscription:
- Azure Private Link enabled workspace with a customer-managed key (CMK)
- Attaching a Private Link enabled Azure Kubernetes Service cluster to your workspace
To request an allowance for these scenarios, use the following steps:
Create an Azure support request and select the following options in the Basics section:
|Field||Selection|
|Issue type||Technical|
|Service||My services. Then select Machine Learning in the drop-down list.|
|Problem type||Workspace Configuration and Security|
|Problem subtype||Private Endpoint and Private DNS Zone allowance request|
In the Details section, use the Description field to provide the Azure region and the scenario that you plan to use. If you need to request quota increases for multiple subscriptions, list the subscription IDs in this field.
Select Create to create the request.