Create compute targets for model training and deployment in Azure Machine Learning studio

In this article, learn how to create and manage compute targets in Azure Machine Learning studio. You can also create and manage compute targets with:

Important

Items marked (preview) in this article are currently in public preview. The preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Prerequisites

What's a compute target?

With Azure Machine Learning, you can train your model on a variety of resources or environments, collectively referred to as compute targets. A compute target can be a local machine or a cloud resource, such as an Azure Machine Learning Compute, Azure HDInsight, or a remote virtual machine. You can also create compute targets for model deployment as described in "Where and how to deploy your models".

View compute targets

To see all compute targets for your workspace, use the following steps:

  1. Navigate to Azure Machine Learning studio.

  2. Under Manage, select Compute.

  3. Select tabs at the top to show each type of compute target.


Start creation process

Follow the previous steps to view the list of compute targets. Then use these steps to create a compute target:

  1. Select the tab at the top corresponding to the type of compute you will create.

  2. If you have no compute targets, select Create in the middle of the page.


  3. If you see a list of compute resources, select +New above the list.


  4. Fill out the form for your compute type:

  5. Select Create.

  6. View the status of the create operation by selecting the compute target from the list:


Follow the steps in Create and manage an Azure Machine Learning compute instance.

Create compute clusters

Create a single-node or multi-node compute cluster for your training, batch inferencing, or reinforcement learning workloads. Use the steps above to create the compute cluster. Then fill out the form as follows:

Field | Description
Location | The Azure region where the compute cluster will be created. By default, this is the same location as the workspace. Setting the location to a different region than the workspace is in preview, and is only available for compute clusters, not compute instances. When using a different region than your workspace or datastores, you may see increased network latency and data transfer costs. The latency and costs can occur when creating the cluster, and when running jobs on it.
Virtual machine type | Choose CPU or GPU. This type cannot be changed after creation.
Virtual machine priority | Choose Dedicated or Low priority. Low priority virtual machines are cheaper but don't guarantee the compute nodes. Your job may be preempted.
Virtual machine size | Supported virtual machine sizes might be restricted in your region. Check the availability list.

Select Next to proceed to Advanced Settings and fill out the form as follows:

Field | Description
Compute name |
  • Name is required and must be between 3 and 24 characters long.
  • Valid characters are uppercase and lowercase letters, digits, and the - character.
  • Name must start with a letter.
  • Name must be unique across all existing computes within an Azure region. You will see an alert if the name you choose is not unique.
  • If the - character is used, it must be followed by at least one letter later in the name.
Minimum number of nodes | Minimum number of nodes that you want to provision. If you want a dedicated number of nodes, set that count here. Save money by setting the minimum to 0, so you won't pay for any nodes when the cluster is idle.
Maximum number of nodes | Maximum number of nodes that you want to provision. The compute will autoscale to a maximum of this node count when a job is submitted.
Idle seconds before scale down | Idle time before scaling the cluster down to the minimum node count.
Enable SSH access | Use the instructions in the Enable SSH access section that follows.
Advanced settings | Optional. Configure a virtual network. Specify the Resource group, Virtual network, and Subnet to create the compute cluster inside an Azure Virtual Network (vnet). For more information, see these network requirements for vnet. Also attach managed identities to grant access to resources.
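The naming rules in the form can be checked locally before you open the creation dialog. The following is a minimal sketch, not part of the studio or any SDK: the function name `is_valid_compute_name` is hypothetical, and it assumes the hyphen rule means that at least one letter must appear somewhere after the last hyphen. It cannot check uniqueness within a region, which only the service can verify.

```python
import re

# Letters, digits, and hyphens only; the first character must be a letter.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def is_valid_compute_name(name: str, min_len: int = 3, max_len: int = 24) -> bool:
    """Return True if `name` satisfies the naming rules listed in the form.

    The defaults match the compute cluster limits (3 to 24 characters).
    """
    if not (min_len <= len(name) <= max_len):
        return False
    if _NAME_PATTERN.fullmatch(name) is None:
        return False
    # Assumed reading of the hyphen rule: at least one letter must appear
    # somewhere after the last hyphen in the name.
    last_hyphen = name.rfind("-")
    if last_hyphen != -1:
        return any(ch.isalpha() for ch in name[last_hyphen + 1:])
    return True
```

For example, `is_valid_compute_name("cpu-cluster")` passes, while `"gpu-01"` fails under this reading because no letter follows the hyphen. The length limits are parameters so the same check can be reused for compute types with different limits.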

    Enable SSH access

    SSH access is disabled by default. SSH access cannot be changed after creation. Make sure to enable access if you plan to debug interactively with VS Code Remote.

    After you have selected Next: Advanced Settings:

    1. Turn on Enable SSH access.
    2. In the SSH public key source, select one of the options from the dropdown:
      • If you select Generate new key pair:
        1. Enter a name for the key in Key pair name.
        2. Select Create.
        3. Select Download private key and create compute. The key is usually downloaded into the Downloads folder.
      • If you select Use existing public key stored in Azure, search for and select the key in Stored key.
      • If you select Use existing public key, provide an RSA public key in the single-line format (starting with "ssh-rsa") or the multi-line PEM format. You can generate SSH keys using ssh-keygen on Linux and macOS, or PuTTYGen on Windows.
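Before pasting an existing public key into the form, it can help to confirm which of the two accepted formats it is in. The sketch below is illustrative only: `classify_public_key` is a hypothetical helper, and it assumes that a PEM key begins with a `-----BEGIN` line and ends with a `-----END` line. It checks the shape of the text, not whether the key itself is valid.

```python
def classify_public_key(key_text: str) -> str:
    """Return 'single-line', 'pem', or 'unknown' for a pasted public key."""
    text = key_text.strip()
    lines = text.splitlines()
    # Single-line format: one line that starts with the "ssh-rsa" prefix.
    if len(lines) == 1 and text.startswith("ssh-rsa "):
        return "single-line"
    # Assumed PEM shape: several lines wrapped in BEGIN/END marker lines.
    if (
        len(lines) > 1
        and lines[0].startswith("-----BEGIN")
        and lines[-1].startswith("-----END")
    ):
        return "pem"
    return "unknown"
```

A result of `'unknown'` suggests that the key is of a different type or was truncated when copied.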

    Once the compute cluster is created and running, see Connect with SSH access.

    Set up managed identity

    Azure Machine Learning compute clusters also support managed identities to authenticate access to Azure resources without including credentials in your code. There are two types of managed identities:

    • A system-assigned managed identity is enabled directly on the Azure Machine Learning compute cluster. The life cycle of a system-assigned identity is directly tied to the compute cluster. If the compute cluster is deleted, Azure automatically cleans up the credentials and the identity in Azure AD.
    • A user-assigned managed identity is a standalone Azure resource provided through Azure Managed Identity service. You can assign a user-assigned managed identity to multiple resources, and it persists for as long as you want.

    During cluster creation or when editing compute cluster details, in the Advanced settings, toggle Assign a managed identity and specify a system-assigned identity or user-assigned identity.

    Managed identity usage

    The default managed identity is the system-assigned managed identity or the first user-assigned managed identity.

    During a run there are two applications of an identity:

    1. The system uses an identity to set up the user's storage mounts, container registry, and datastores.

      • In this case, the system will use the default managed identity.
    2. The user applies an identity to access resources from within the code for a submitted run.

      • In this case, provide the client_id corresponding to the managed identity you want to use to retrieve a credential.
      • Alternatively, get the user-assigned identity's client ID through the DEFAULT_IDENTITY_CLIENT_ID environment variable.

      For example, to retrieve a token for a datastore with the default managed identity:

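      import os
      from azure.identity import ManagedIdentityCredential
      # Read the client ID from the environment variable described above
      client_id = os.environ.get('DEFAULT_IDENTITY_CLIENT_ID')
      credential = ManagedIdentityCredential(client_id=client_id)
      token = credential.get_token('https://storage.azure.com/')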
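The two ways of choosing a client ID described above can be combined into one lookup. The following sketch assumes a simple precedence: an explicitly supplied client ID wins, and otherwise the `DEFAULT_IDENTITY_CLIENT_ID` environment variable is used. The helper name `resolve_client_id` and its `env` parameter are hypothetical and exist only to make the logic easy to test. The helper does not create a credential.

```python
import os
from typing import Mapping, Optional


def resolve_client_id(
    explicit_client_id: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick the client ID to use when requesting a managed identity credential."""
    # An explicitly chosen identity takes precedence.
    if explicit_client_id:
        return explicit_client_id
    # Otherwise fall back to the environment variable, if it is set and non-empty.
    return env.get("DEFAULT_IDENTITY_CLIENT_ID") or None
```

The returned value can then be passed as the `client_id` argument when constructing the credential.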
      

    Create inference clusters

    Important

    Using Azure Kubernetes Service with Azure Machine Learning has multiple configuration options. Some scenarios, such as networking, require additional setup and configuration. For more information on using AKS with Azure ML, see Create and attach an Azure Kubernetes Service cluster.

    Create or attach an Azure Kubernetes Service (AKS) cluster for large scale inferencing. Use the steps above to create the AKS cluster. Then fill out the form as follows:

    Field | Description
    Compute name |
      • Name is required and must be between 2 and 16 characters long.
      • Valid characters are uppercase and lowercase letters, digits, and the - character.
      • Name must start with a letter.
      • Name must be unique across all existing computes within an Azure region. You will see an alert if the name you choose is not unique.
      • If the - character is used, it must be followed by at least one letter later in the name.
    Kubernetes Service | Select Create New and fill out the rest of the form. Or select Use existing and then select an existing AKS cluster from your subscription.
    Region | Select the region where the cluster will be created.
    Virtual machine size | Supported virtual machine sizes might be restricted in your region. Check the availability list.
    Cluster purpose | Select Production or Dev-test.
    Number of nodes | The number of nodes multiplied by the virtual machine’s number of cores (vCPUs) must be greater than or equal to 12.
    Network configuration | Select Advanced to create the compute within an existing virtual network. For more information about AKS in a virtual network, see Network isolation during training and inference with private endpoints and virtual networks.
    Enable SSL configuration | Use this to configure an SSL certificate on the compute.
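The Number of nodes rule is a simple product check, so you can work out the smallest valid node count for a given virtual machine size before filling out the form. This is a minimal sketch of that arithmetic. The function names are hypothetical, and the vCPU count for a virtual machine size must be looked up separately.

```python
import math

# Minimum total vCPUs, taken from the "Number of nodes" rule in the form.
MIN_TOTAL_VCPUS = 12


def meets_core_minimum(node_count: int, vcpus_per_node: int) -> bool:
    """Return True if nodes multiplied by vCPUs per node is at least 12."""
    return node_count * vcpus_per_node >= MIN_TOTAL_VCPUS


def smallest_node_count(vcpus_per_node: int) -> int:
    """Return the fewest nodes that satisfy the rule for a given VM size."""
    if vcpus_per_node <= 0:
        raise ValueError("vcpus_per_node must be a positive integer")
    return math.ceil(MIN_TOTAL_VCPUS / vcpus_per_node)
```

For a size with 4 vCPUs, `smallest_node_count(4)` gives 3, because 3 × 4 = 12, and two nodes would provide only 8 vCPUs.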

    Attach other compute

    To use compute targets created outside the Azure Machine Learning workspace, you must attach them. Attaching a compute target makes it available to your workspace. Use Attached compute to attach a compute target for training. Use Inference clusters to attach an AKS cluster for inferencing.

    Use the steps above to attach a compute. Then fill out the form as follows:

    1. Enter a name for the compute target.

    2. Select the type of compute to attach. Not all compute types can be attached from Azure Machine Learning studio. The compute types that can currently be attached for training include:

      • An Azure Virtual Machine (to attach a Data Science Virtual Machine)
      • Azure Databricks (for use in machine learning pipelines)
      • Azure Data Lake Analytics (for use in machine learning pipelines)
      • Azure HDInsight
      • Kubernetes (preview)
    3. Fill out the form and provide values for the required properties.

      Note

      Microsoft recommends that you use SSH keys, which are more secure than passwords. Passwords are vulnerable to brute force attacks. SSH keys rely on cryptographic signatures. For information on how to create SSH keys for use with Azure Virtual Machines, see the following documents:

    4. Select Attach.

    Note

    To create and attach a compute target for training on an Azure Arc enabled Kubernetes cluster, see Configure Azure Arc enabled Machine Learning.

    Important

    To attach an Azure Kubernetes Service (AKS) or Arc enabled Kubernetes cluster, you must be a subscription owner or have permission to access AKS cluster resources under the subscription. Otherwise, the cluster list on the "attach new compute" page will be blank.

    To detach your compute, use the following steps:

    1. In Azure Machine Learning studio, select Compute, Attached compute, and the compute you wish to remove.
    2. Use the Detach link to detach your compute.

    Connect with SSH access

    If you created your compute instance or compute cluster with SSH access enabled, use these steps for access.

    1. Find the compute in your workspace resources:

      1. On the left, select Compute.
      2. Use the tabs at the top to select Compute instance or Compute cluster to find your machine.
    2. Select the compute name in the list of resources.

    3. Find the connection string:

      • For a compute instance, select Connect at the top of the Details section.


      • For a compute cluster, select Nodes at the top, then select the Connection string in the table for your node.

    4. Copy the connection string.

    5. For Windows, open PowerShell or a command prompt:

      1. Go into the directory or folder where your key is stored.

      2. Add the -i flag to the connection string to locate the private key and point to where it is stored:

        ssh -i <keyname.pem> azureuser@... (rest of connection string)

    6. For Linux users, follow the steps from Create and use an SSH key pair for Linux VMs in Azure.
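Adding the -i flag to a copied connection string can be sketched as a small string transformation. The example below assumes the connection string is an ordinary `ssh` command line, such as `ssh azureuser@<host> -p <port>`. That shape, the helper name `add_identity_flag`, and the sample address are illustrative assumptions. Quoting follows POSIX shell rules, so a key path that contains spaces may need different quoting in a Windows command prompt.

```python
import shlex


def add_identity_flag(connection_string: str, key_path: str) -> str:
    """Insert `-i <key_path>` directly after `ssh` in a copied connection string."""
    parts = shlex.split(connection_string)
    if not parts or parts[0] != "ssh":
        raise ValueError("expected a connection string that starts with 'ssh'")
    # Rebuild the command with the private key option placed first.
    return shlex.join(["ssh", "-i", key_path] + parts[1:])
```

For example, `add_identity_flag("ssh azureuser@203.0.113.10 -p 50000", "mykey.pem")` returns `ssh -i mykey.pem azureuser@203.0.113.10 -p 50000`.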

    Next steps

    After a target is created and attached to your workspace, you use it in your run configuration with a ComputeTarget object:

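    from azureml.core import Workspace
    from azureml.core.compute import ComputeTarget
    ws = Workspace.from_config()  # load the workspace from a local config file
    myvm = ComputeTarget(workspace=ws, name='my-vm-name')  # look up the existing target by name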