Use an Azure Resource Manager template to create a workspace for Azure Machine Learning


In this article, you learn several ways to create an Azure Machine Learning workspace using Azure Resource Manager templates. A Resource Manager template makes it easy to create resources as a single, coordinated operation. A template is a JSON document that defines the resources that are needed for a deployment. It may also specify deployment parameters. Parameters are used to provide input values when using the template.

For more information, see Deploy an application with Azure Resource Manager template.

Prerequisites

Limitations

  • When creating a new workspace, you can either automatically create services needed by the workspace or use existing services. If you want to use existing services from a different Azure subscription than the workspace, you must register the Azure Machine Learning namespace in the subscription that contains those services. For example, if you create a workspace in subscription A that uses a storage account from subscription B, the Azure Machine Learning namespace must be registered in subscription B before you can use the storage account with the workspace.

    The resource provider for Azure Machine Learning is Microsoft.MachineLearningServices. For information on how to check whether it is registered and how to register it, see the Azure resource providers and types article.

    Important

    This only applies to resources provided during workspace creation: Azure Storage accounts, Azure Container Registry, Azure Key Vault, and Application Insights.
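
As a rough sketch of the registration check described above, the following shell function queries the registration state of a resource provider namespace and registers it if needed. The function name ensure_provider_registered is hypothetical and chosen only for this illustration; the namespace in the usage comment is the one used by the workspace resource type in the template. It assumes the Azure CLI is installed and signed in.

```shell
# Hypothetical helper: registers a resource provider namespace if it is not
# already registered in the currently selected subscription.
ensure_provider_registered() {
  namespace="$1"
  # Query only the registration state of the provider as plain text.
  state="$(az provider show --namespace "$namespace" --query registrationState --output tsv)"
  if [ "$state" = "Registered" ]; then
    echo "$namespace is already registered"
  else
    echo "$namespace is $state; registering"
    az provider register --namespace "$namespace"
  fi
}

# Example usage (run against the subscription that holds the existing services):
#   az account set --subscription "<subscription-B-id>"
#   ensure_provider_registered "Microsoft.MachineLearningServices"
```

Registration can take a few minutes to complete, so you may need to run the check again before deploying the template.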

Workspace Resource Manager template

The Azure Resource Manager template used throughout this document can be found in the 201-machine-learning-advanced directory of the Azure quickstart templates GitHub repository.

This template creates the following Azure services:

  • Azure Storage Account
  • Azure Key Vault
  • Azure Application Insights
  • Azure Container Registry
  • Azure Machine Learning workspace

The resource group is the container that holds the services. The various services are required by the Azure Machine Learning workspace.

The example template has two required parameters:

  • The location where the resources will be created.

    The template uses the location you select for most resources. The exception is the Application Insights service, which is not available in every location where the other services are. If you select a location where Application Insights is not available, it is created in the South Central US location.

  • The workspaceName, which is the friendly name of the Azure Machine Learning workspace.

    Note

    The workspace name is case-insensitive.

    The names of the other services are generated randomly.

Tip

While the template associated with this document creates a new Azure Container Registry, you can also create a new workspace without creating a container registry. One is created automatically when you perform an operation that requires a container registry, such as training or deploying a model.

You can also reference an existing container registry or storage account in the Azure Resource Manager template, instead of creating a new one. However, the container registry you use must have the admin account enabled. For information on enabling the admin account, see Admin account.

Warning

Once an Azure Container Registry has been created for a workspace, do not delete it. Doing so will break your Azure Machine Learning workspace.

For more information on templates, see the following articles:

Deploy template

Before you deploy the template, create a resource group.

See the Azure portal section if you prefer using the graphical user interface.

az group create --name "examplegroup" --location "eastus"

Once your resource group is successfully created, deploy the template with the following command:

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/201-machine-learning-advanced/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" location="eastus"

By default, all of the resources created as part of the template are new. However, you can use existing resources by providing additional parameters to the template. For example, if you want to use an existing storage account, set the storageAccountOption value to existing and provide the name of your storage account in the storageAccountName parameter.

Important

If you want to use an existing Azure Storage account, it cannot be a premium account (Premium_LRS and Premium_GRS). It also cannot have a hierarchical namespace (used with Azure Data Lake Storage Gen2). Neither premium storage nor hierarchical namespaces are supported with the default storage account of the workspace. You can use premium storage or hierarchical namespaces with non-default storage accounts.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/201-machine-learning-advanced/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      storageAccountOption="existing" \
      storageAccountName="existingstorageaccountname"
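
When the number of parameters grows, one option is to keep the values in a parameters file instead of passing them inline. The following sketch writes the same values used above to a local file and wraps the deployment command in a function. The file name azuredeploy.parameters.json and the function name deploy_with_parameters_file are arbitrary choices for this illustration.

```shell
# Write the same parameter values used above to a local parameters file.
# The file name is an arbitrary choice for this sketch.
cat > azuredeploy.parameters.json <<'EOF'
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "workspaceName": { "value": "exampleworkspace" },
    "location": { "value": "eastus" },
    "storageAccountOption": { "value": "existing" },
    "storageAccountName": { "value": "existingstorageaccountname" }
  }
}
EOF

# Hypothetical wrapper: deploys the template using the parameters file.
deploy_with_parameters_file() {
  az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/201-machine-learning-advanced/azuredeploy.json" \
    --parameters "@azuredeploy.parameters.json"
}

# Example usage:
#   deploy_with_parameters_file
```

A parameters file can be kept under version control, which makes repeated deployments easier to review than long inline parameter lists.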

Deploy an encrypted workspace

The following example template demonstrates how to create a workspace with three settings:

  • Enable high confidentiality settings for the workspace. This creates a new Cosmos DB instance.

  • Enable encryption for the workspace.

  • Use an existing Azure Key Vault to retrieve customer-managed keys. Customer-managed keys are used to create a new Cosmos DB instance for the workspace.

    Important

    The Cosmos DB instance is created in a Microsoft-managed resource group in your subscription, along with any resources it needs. This means that you are charged for this Cosmos DB instance. The managed resource group is named in the format <AML Workspace Resource Group Name><GUID>. If your Azure Machine Learning workspace uses a private endpoint, a virtual network is also created for the Cosmos DB instance. This VNet is used to secure communication between Cosmos DB and Azure Machine Learning.

    • Do not delete the resource group that contains this Cosmos DB instance, or any of the resources automatically created in this group. If you need to delete the resource group, Cosmos DB instance, etc., you must delete the Azure Machine Learning workspace that uses it. The resource group, Cosmos DB instance, and other automatically created resources are deleted when the associated workspace is deleted.
    • The default Request Units for this Cosmos DB account is set at 8000.
    • You cannot provide your own VNet for use with the Cosmos DB instance that is created. You also cannot modify the virtual network. For example, you cannot change the IP address range that it uses.

Important

Once a workspace has been created, you cannot change the settings for confidential data, encryption, key vault ID, or key identifiers. To change these values, you must create a new workspace using the new values.

For more information, see Encryption at rest.

Important

There are some specific requirements your subscription must meet before using this template:

  • You must have an existing Azure Key Vault that contains an encryption key.
  • The Azure Key Vault must be in the same region where you plan to create the Azure Machine Learning workspace.
  • You must specify the ID of the Azure Key Vault and the URI of the encryption key.

To get the values for the cmk_keyvault (ID of the Key Vault) and the resource_cmk_uri (key URI) parameters needed by this template, use the following steps:

  1. To get the Key Vault ID, use the following command:

    az keyvault show --name <keyvault-name> --query 'id' --output tsv	
    

    This command returns a value similar to /subscriptions/{subscription-guid}/resourceGroups/<resource-group-name>/providers/Microsoft.KeyVault/vaults/<keyvault-name>.

  2. To get the value for the URI for the customer managed key, use the following command:

    az keyvault key show --vault-name <keyvault-name> --name <key-name> --query 'key.kid' --output tsv	
    

    This command returns a value similar to https://mykeyvault.vault.azure.net/keys/mykey/{guid}.
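
The two lookups above can be combined so that the values do not have to be copied by hand. The following is a minimal sketch, assuming the Azure CLI is installed and signed in; the function name get_cmk_parameters is hypothetical, and the vault and key names in the usage comment are placeholders.

```shell
# Hypothetical helper: prints the two parameter assignments needed for a
# customer-managed key deployment, one per line.
get_cmk_parameters() {
  keyvault_name="$1"
  key_name="$2"
  # Resource ID of the Key Vault.
  cmk_keyvault="$(az keyvault show --name "$keyvault_name" --query id --output tsv)"
  # URI (key identifier) of the encryption key stored in that vault.
  resource_cmk_uri="$(az keyvault key show --vault-name "$keyvault_name" --name "$key_name" --query key.kid --output tsv)"
  printf 'cmk_keyvault=%s\n' "$cmk_keyvault"
  printf 'resource_cmk_uri=%s\n' "$resource_cmk_uri"
}

# Example usage:
#   get_cmk_parameters "<keyvault-name>" "<key-name>"
```

The printed name=value pairs use the same form that the --parameters argument of az deployment group create accepts.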

To enable use of Customer Managed Keys, set the following parameters when deploying the template:

  • encryption_status to Enabled.
  • cmk_keyvault to the cmk_keyvault value obtained in previous steps.
  • resource_cmk_uri to the resource_cmk_uri value obtained in previous steps.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/201-machine-learning-advanced/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      encryption_status="Enabled" \
      cmk_keyvault="/subscriptions/{subscription-guid}/resourceGroups/<resource-group-name>/providers/Microsoft.KeyVault/vaults/<keyvault-name>" \
      resource_cmk_uri="https://mykeyvault.vault.azure.net/keys/mykey/{guid}"

When using a customer-managed key, Azure Machine Learning creates a secondary resource group which contains the Cosmos DB instance. For more information, see encryption at rest - Cosmos DB.

An additional configuration you can provide for your data is to set the confidential_data parameter to true. Doing so does the following:

  • Starts encrypting the local scratch disk for Azure Machine Learning compute clusters, provided you have not created any previous clusters in your subscription. If you have previously created a cluster in the subscription, open a support ticket to have encryption of the scratch disk enabled for your compute clusters.

  • Cleans up the local scratch disk between runs.

  • Securely passes credentials for the storage account, container registry, and SSH account from the execution layer to your compute clusters by using key vault.

  • Enables IP filtering to ensure the underlying batch pools cannot be called by any external services other than AzureMachineLearningService.

Deploy workspace behind a virtual network

By setting the vnetOption parameter value to either new or existing, you are able to create the resources used by a workspace behind a virtual network.

Important

For the container registry, only the Premium SKU is supported.

Important

Application Insights does not support deployment behind a virtual network.

Only deploy workspace behind private endpoint

If your associated resources are not behind a virtual network, you can set the privateEndpointType parameter to AutoApproval or ManualApproval to deploy the workspace behind a private endpoint. This can be done for both new and existing workspaces. When updating an existing workspace, fill in the template parameters with the information from the existing workspace.

Important

Using an Azure Machine Learning workspace with private link is not available in the Azure Government regions or Azure China 21Vianet regions.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/201-machine-learning-advanced/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      privateEndpointType="AutoApproval"

Use a new virtual network

To deploy a resource behind a new virtual network, set the vnetOption to new along with the virtual network settings for the respective resource. The deployment below shows how to deploy a workspace with the storage account resource behind a new virtual network.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/201-machine-learning-advanced/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      vnetOption="new" \
      vnetName="examplevnet" \
      storageAccountBehindVNet="true" \
      privateEndpointType="AutoApproval"

Alternatively, you can deploy multiple or all dependent resources behind a virtual network.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/201-machine-learning-advanced/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      vnetOption="new" \
      vnetName="examplevnet" \
      storageAccountBehindVNet="true" \
      keyVaultBehindVNet="true" \
      containerRegistryBehindVNet="true" \
      containerRegistryOption="new" \
      containerRegistrySku="Premium" \
      privateEndpointType="AutoApproval"

Use an existing virtual network & resources

To deploy a workspace with existing associated resources, set the vnetOption parameter to existing and provide the subnet parameters. Before deployment, you must create service endpoints in the virtual network for each of the resources. As with new virtual network deployments, you can place one or all of your resources behind a virtual network.

Important

The subnet must have the Microsoft.Storage service endpoint enabled.

Important

By default, subnets do not allow the creation of private endpoints. Disable private endpoint network policies on the subnet before you deploy a private endpoint into it.

  1. Enable service endpoints for the resources.

    # Pass all service endpoints in a single call; --service-endpoints sets the full list for the subnet.
    az network vnet subnet update --resource-group "examplegroup" --vnet-name "examplevnet" --name "examplesubnet" --service-endpoints "Microsoft.Storage" "Microsoft.KeyVault" "Microsoft.ContainerRegistry"
    
  2. Deploy the workspace.

    az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/201-machine-learning-advanced/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      vnetOption="existing" \
      vnetName="examplevnet" \
      vnetResourceGroupName="examplegroup" \
      storageAccountBehindVNet="true" \
      keyVaultBehindVNet="true" \
      containerRegistryBehindVNet="true" \
      containerRegistryOption="new" \
      containerRegistrySku="Premium" \
      subnetName="examplesubnet" \
      subnetOption="existing" \
      privateEndpointType="AutoApproval"
    

Use the Azure portal

  1. Follow the steps in Deploy resources from custom template. When you arrive at the Select a template screen, choose the 201-machine-learning-advanced template from the dropdown.

  2. Select Select template to use the template. Provide the following required information and any other parameters depending on your deployment scenario.

    • Subscription: Select the Azure subscription to use for these resources.
    • Resource group: Select or create a resource group to contain the services.
    • Region: Select the Azure region where the resources will be created.
    • Workspace name: The name to use for the Azure Machine Learning workspace that will be created. The workspace name must be between 3 and 33 characters. It may only contain alphanumeric characters and '-'.
    • Location: Select the location where the resources will be created.
  3. Select Review + create.

  4. In the Review + create screen, agree to the listed terms and conditions and select Create.

For more information, see Deploy resources from custom template.
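
The workspace name rules listed above (3 to 33 characters, only alphanumeric characters and '-') can be checked before you start a deployment. The following is a minimal sketch; the function name is_valid_workspace_name is hypothetical, and it checks only the length and character rules stated in this article.

```shell
# Hypothetical helper: returns 0 if the name is 3 to 33 characters long and
# contains only letters, digits, and '-'; returns 1 otherwise.
is_valid_workspace_name() {
  name="$1"
  len=${#name}
  if [ "$len" -lt 3 ] || [ "$len" -gt 33 ]; then
    return 1
  fi
  case "$name" in
    # Reject any character outside the allowed set.
    *[!A-Za-z0-9-]*) return 1 ;;
  esac
  return 0
}

# Example usage:
#   if is_valid_workspace_name "exampleworkspace"; then
#     echo "name is valid"
#   fi
```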

Troubleshooting

Resource provider errors

When creating an Azure Machine Learning workspace, or a resource used by the workspace, you may receive an error similar to the following messages:

  • No registered resource provider found for location {location}
  • The subscription is not registered to use namespace {resource-provider-namespace}

Most resource providers are automatically registered, but not all. If you receive this message, you need to register the provider mentioned.

For information on registering resource providers, see Resolve errors for resource provider registration.

Azure Key Vault access policy and Azure Resource Manager templates

This problem can occur when you use an Azure Resource Manager template to create the workspace and associated resources (including Azure Key Vault) multiple times. For example, you might run the template repeatedly with the same parameters as part of a continuous integration and deployment pipeline.

Most resource creation operations through templates are idempotent, but Key Vault clears the access policies each time the template is used. Clearing the access policies breaks access to the Key Vault for any existing workspace that is using it. For example, Stop/Create functionalities of Azure Notebooks VM may fail.

To avoid this problem, we recommend one of the following approaches:

  • Do not deploy the template more than once with the same parameters, or delete the existing resources before using the template to recreate them.

  • Examine the Key Vault access policies and then use these policies to set the accessPolicies property of the template. To view the access policies, use the following Azure CLI command:

    az keyvault show --name mykeyvault --resource-group myresourcegroup --query properties.accessPolicies
    

    For more information on using the accessPolicies section of the template, see the AccessPolicyEntry object reference.

  • Check if the Key Vault resource already exists. If it does, do not recreate it through the template. For example, to use the existing Key Vault instead of creating a new one, make the following changes to the template:

    • Add a parameter that accepts the ID of an existing Key Vault resource:

      "keyVaultId":{
        "type": "string",
        "metadata": {
          "description": "Specify the existing Key Vault ID."
        }
      }
      
    • Remove the section that creates a Key Vault resource:

      {
        "type": "Microsoft.KeyVault/vaults",
        "apiVersion": "2018-02-14",
        "name": "[variables('keyVaultName')]",
        "location": "[parameters('location')]",
        "properties": {
          "tenantId": "[variables('tenantId')]",
          "sku": {
            "name": "standard",
            "family": "A"
          },
          "accessPolicies": [
          ]
        }
      },
      
    • Remove the "[resourceId('Microsoft.KeyVault/vaults', variables('keyVaultName'))]", line from the dependsOn section of the workspace. Also change the keyVault entry in the properties section of the workspace to reference the keyVaultId parameter:

      {
        "type": "Microsoft.MachineLearningServices/workspaces",
        "apiVersion": "2019-11-01",
        "name": "[parameters('workspaceName')]",
        "location": "[parameters('location')]",
        "dependsOn": [
          "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]",
          "[resourceId('Microsoft.Insights/components', variables('applicationInsightsName'))]"
        ],
        "identity": {
          "type": "systemAssigned"
        },
        "sku": {
          "tier": "[parameters('sku')]",
          "name": "[parameters('sku')]"
        },
        "properties": {
          "friendlyName": "[parameters('workspaceName')]",
          "keyVault": "[parameters('keyVaultId')]",
          "applicationInsights": "[resourceId('Microsoft.Insights/components',variables('applicationInsightsName'))]",
          "storageAccount": "[resourceId('Microsoft.Storage/storageAccounts/',variables('storageAccountName'))]"
        }
      }
      

    After these changes, you can specify the ID of the existing Key Vault resource when running the template. The template will then reuse the Key Vault by setting the keyVault property of the workspace to its ID.

    To get the ID of the Key Vault, you can reference the output of the original template run or use the Azure CLI. The following command is an example of using the Azure CLI to get the Key Vault resource ID:

    az keyvault show --name mykeyvault --resource-group myresourcegroup --query id
    

    This command returns a value similar to the following text:

    /subscriptions/{subscription-guid}/resourceGroups/myresourcegroup/providers/Microsoft.KeyVault/vaults/mykeyvault
    

Virtual network not linked to private DNS zone

When creating a workspace with a private endpoint, the template creates a Private DNS Zone named privatelink.api.azureml.ms. A virtual network link is automatically added to this private DNS zone. The link is only added for the first workspace and private endpoint you create in a resource group; if you create another virtual network and workspace with a private endpoint in the same resource group, the second virtual network may not get added to the private DNS zone.

To view the virtual network links that already exist for the private DNS zone, use the following Azure CLI command:

az network private-dns link vnet list --zone-name privatelink.api.azureml.ms --resource-group myresourcegroup

To add the virtual network that contains another workspace and private endpoint, use the following steps:

  1. To find the virtual network ID for the network that you want to add, use the following command:

    az network vnet show --name myvnet --resource-group myresourcegroup --query id
    

    This command returns a value similar to "/subscriptions/GUID/resourceGroups/myresourcegroup/providers/Microsoft.Network/virtualNetworks/myvnet". Save this value and use it in the next step.

  2. To add a virtual network link to the privatelink.api.azureml.ms Private DNS Zone, use the following command. For the --virtual-network parameter, use the output of the previous command:

    az network private-dns link vnet create --name mylinkname --registration-enabled true --resource-group myresourcegroup --virtual-network myvirtualnetworkid --zone-name privatelink.api.azureml.ms
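
The two steps above can also be chained so that the virtual network ID is passed along without copying it manually. The following is a minimal sketch that reuses the same commands and the same --registration-enabled value shown above; the function name link_vnet_to_private_dns_zone and its argument order are hypothetical.

```shell
# Hypothetical helper: looks up a virtual network ID and links that network
# to the privatelink.api.azureml.ms Private DNS Zone in the same resource group.
link_vnet_to_private_dns_zone() {
  resource_group="$1"
  vnet_name="$2"
  link_name="$3"
  # Step 1: resolve the virtual network resource ID as plain text.
  vnet_id="$(az network vnet show --name "$vnet_name" --resource-group "$resource_group" --query id --output tsv)"
  # Step 2: create the virtual network link in the private DNS zone.
  az network private-dns link vnet create \
    --name "$link_name" \
    --registration-enabled true \
    --resource-group "$resource_group" \
    --virtual-network "$vnet_id" \
    --zone-name privatelink.api.azureml.ms
}

# Example usage:
#   link_vnet_to_private_dns_zone "myresourcegroup" "myvnet" "mylinkname"
```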
    

Next steps