What are compute targets in Azure Machine Learning?
A compute target is a designated compute resource or environment where you run your training script or host your service deployment. This location may be your local machine or a cloud-based compute resource. Using compute targets makes it easy for you to later change your compute environment without having to change your code.
In a typical model development lifecycle, you might:
- Start by developing and experimenting on a small amount of data. At this stage, we recommend your local environment (local computer or cloud-based VM) as your compute target.
- Scale up to larger data, or do distributed training using one of these training compute targets.
- Once your model is ready, deploy it to a web hosting environment or IoT device with one of these deployment compute targets.
The compute resources you use for your compute targets are attached to a workspace. Compute resources other than the local machine are shared by users of the workspace.
Training compute targets
Azure Machine Learning supports different scenarios on different compute resources. You can also attach your own compute resource, although the scenarios it supports may differ.
Compute targets can be reused from one training job to the next. For example, once you attach a remote VM to your workspace, you can reuse it for multiple jobs. For machine learning pipelines, use the appropriate pipeline step for each compute target.
|Training targets||Automated ML||ML pipelines||Azure Machine Learning designer|
|Azure Machine Learning compute cluster||yes &
|Azure Machine Learning compute instance||yes &
|Remote VM||yes &
|Azure Databricks||yes (SDK local mode only)||yes|
|Azure Data Lake Analytics||yes|
Learn more about setting up and using a compute target for model training.
The following compute resources can be used to host your model deployment.
The compute target you use to host your model will affect the cost and availability of your deployed endpoint. Use the table below to choose an appropriate compute target.
|Compute target||Used for||GPU support||FPGA support||Description|
|Local web service||Testing/debugging||Use for limited testing and troubleshooting. Hardware acceleration depends on use of libraries in the local system.|
|Azure Machine Learning compute instance web service||Testing/debugging||Use for limited testing and troubleshooting.|
|Azure Kubernetes Service (AKS)||Real-time inference||Yes (web service deployment)||Yes||Use for high-scale production deployments. Provides fast response time and autoscaling of the deployed service. Cluster autoscaling isn't supported through the Azure Machine Learning SDK. To change the nodes in the AKS cluster, use the UI for your AKS cluster in the Azure portal. AKS is the only option available for the designer.|
|Azure Container Instances||Testing or development||Use for low-scale CPU-based workloads that require less than 48 GB of RAM.|
|Azure Machine Learning compute clusters||Batch inference||Yes (machine learning pipeline)||Run batch scoring on serverless compute. Supports normal and low-priority VMs.|
|Azure Functions||(Preview) Real-time inference|
|Azure IoT Edge||(Preview) IoT module||Deploy and serve ML models on IoT devices.|
|Azure Data Box Edge||Via IoT Edge||Yes||Deploy and serve ML models on IoT devices.|
Although compute targets like local, Azure Machine Learning compute instance, and Azure Machine Learning compute clusters support GPU for training and experimentation, using GPU for inference when deployed as a web service is supported only on Azure Kubernetes Service.
Using a GPU for inference when scoring with a machine learning pipeline is supported only on Azure Machine Learning Compute.
- Azure Container Instances (ACI) are suitable only for small models less than 1 GB in size.
- We recommend using single-node Azure Kubernetes Service (AKS) clusters for dev-test of larger models.
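The guidance in this section can be condensed into a small decision helper. The following is only a sketch of the rules stated above, not an official selection tool: the `DeploymentNeeds` type, the `choose_deployment_target` function, and the mode names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

AKS = "Azure Kubernetes Service (AKS)"
ACI = "Azure Container Instances"
AML_CLUSTER = "Azure Machine Learning compute clusters"


@dataclass
class DeploymentNeeds:
    """Hypothetical description of a deployment (not part of any SDK)."""
    mode: str                   # "realtime", "batch", or "test"
    needs_gpu: bool = False     # GPU required for inference
    model_size_gb: float = 0.0  # size of the model
    ram_gb: float = 0.0         # memory the workload requires


def choose_deployment_target(needs: DeploymentNeeds) -> str:
    """Map the requirements to a compute target using the rules above."""
    if needs.mode not in ("realtime", "batch", "test"):
        raise ValueError(f"unknown mode: {needs.mode!r}")
    # Batch scoring runs in a machine learning pipeline on compute clusters,
    # which is also the only GPU option for pipeline scoring.
    if needs.mode == "batch":
        return AML_CLUSTER
    # Production real-time inference, and any GPU web service, needs AKS.
    if needs.mode == "realtime" or needs.needs_gpu:
        return AKS
    # ACI suits low-scale CPU workloads: models under 1 GB, under 48 GB of RAM.
    if needs.model_size_gb < 1 and needs.ram_gb < 48:
        return ACI
    # Larger models in dev-test: a single-node AKS cluster is recommended.
    return AKS


if __name__ == "__main__":
    print(choose_deployment_target(DeploymentNeeds("test", model_size_gb=0.3, ram_gb=4)))
    print(choose_deployment_target(DeploymentNeeds("test", model_size_gb=3.0)))
```

The helper ignores the preview targets (Azure Functions, Azure IoT Edge) and cost considerations, so treat its answer as a starting point rather than a final choice.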
Azure Machine Learning compute (managed)
A managed compute resource is created and managed by Azure Machine Learning. This compute is optimized for machine learning workloads. Azure Machine Learning compute clusters and compute instances are the only managed computes. Additional managed compute resources may be added in the future.
You can create Azure Machine Learning compute instances or compute clusters from:
- Azure Machine Learning studio
- Azure portal
- Python SDK ComputeInstance and AmlCompute classes
- R SDK (preview)
- Resource Manager template. For an example template, see the create Azure Machine Learning compute template.
- Machine learning extension for the Azure CLI.
When created, these compute resources are automatically part of your workspace, unlike other kinds of compute targets.
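As an illustration of the Python SDK route, the sketch below reuses a compute cluster if one with the given name already exists in the workspace and creates it otherwise. It assumes the v1 SDK (the `azureml-core` package) and a workspace object you have already loaded. The helper names `cluster_settings` and `get_or_create_cluster`, the VM size, the node counts, and the idle timeout are illustrative choices, not recommendations from this article.

```python
def cluster_settings(vm_size="STANDARD_D2_V2", max_nodes=4):
    """Build keyword arguments for AmlCompute.provisioning_configuration.

    The defaults are assumptions made for this example only.
    """
    return {
        "vm_size": vm_size,
        "min_nodes": 0,  # allow the cluster to scale down to zero when idle
        "max_nodes": max_nodes,
        "idle_seconds_before_scaledown": 1800,
    }


def get_or_create_cluster(workspace, name, settings):
    """Return the existing compute cluster called `name`, or create it."""
    # Imported lazily so the module loads even where azureml-core is absent.
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    try:
        # Raises ComputeTargetException if no such target is in the workspace.
        return ComputeTarget(workspace=workspace, name=name)
    except ComputeTargetException:
        config = AmlCompute.provisioning_configuration(**settings)
        target = ComputeTarget.create(workspace, name, config)
        target.wait_for_completion(show_output=True)
        return target
```

A typical call, assuming a workspace configuration file is available locally, would be `get_or_create_cluster(Workspace.from_config(), "cpu-cluster", cluster_settings())`, where `"cpu-cluster"` is a placeholder name.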
|Capability||Compute cluster||Compute instance|
|Single- or multi-node cluster||✓|
|Autoscales each time you submit a run||✓|
|Automatic cluster management and job scheduling||✓||✓|
|Support for both CPU and GPU resources||✓||✓|
When a compute cluster is idle, it autoscales to 0 nodes, so you don't pay when it's not in use. A compute instance, however, is always on and does not autoscale. You should stop the compute instance when you are not using it to avoid extra cost.
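To make the cost difference concrete, here is a back-of-the-envelope calculation. The hourly rate and usage figures are made up for illustration and are not Azure prices. The model is deliberately simplified: it ignores the scale-down delay, storage, and any other charges.

```python
def monthly_compute_cost(hourly_rate, busy_hours, always_on=False,
                         nodes=1, hours_in_month=730):
    """Estimate monthly VM cost under a simplified billing model.

    always_on=True models a compute instance that is never stopped;
    always_on=False models compute that is billed only while busy, such as
    a cluster that scales to 0 nodes or an instance stopped when unused.
    """
    billed_hours = hours_in_month if always_on else busy_hours
    return round(hourly_rate * billed_hours * nodes, 2)


if __name__ == "__main__":
    rate = 0.50  # hypothetical price per node-hour
    busy = 40    # hypothetical hours of real work in the month
    print("billed only while busy:", monthly_compute_cost(rate, busy))
    print("left running all month:", monthly_compute_cost(rate, busy, always_on=True))
```

With these assumed numbers, the always-on case bills 730 hours against 40 hours of actual use, which is why stopping an unused compute instance matters.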
Supported VM series and sizes
When you select a node size for a managed compute resource in Azure Machine Learning, you can choose from among select VM sizes available in Azure. Azure offers a range of sizes for Linux and Windows for different workloads. To learn more, see the Azure documentation on the different VM types and sizes.
There are a few exceptions and limitations to choosing a VM size:
- Some VM series are not supported in Azure Machine Learning.
- Some VM series are restricted. To use a restricted series, contact support and request a quota increase for the series. For information on contacting support, see Azure support options.
See the following table to learn more about supported series and restrictions.
|Supported VM series||Restrictions|
While Azure Machine Learning supports these VM series, they may not be available in all Azure regions. You can check which VM series are available here: Products Available by Region.
An unmanaged compute target is not managed by Azure Machine Learning. You create this type of compute target outside Azure Machine Learning, then attach it to your workspace. Unmanaged compute resources can require additional steps for you to maintain or to improve performance for machine learning workloads.
Learn how to: