Manage Azure Machine Learning workspaces using Terraform

In this article, you learn how to create and manage an Azure Machine Learning workspace using Terraform configuration files. Terraform's template-based configuration files enable you to define, create, and configure Azure resources in a repeatable and predictable manner. Terraform tracks resource state and is able to clean up and destroy resources.

A Terraform configuration is a document that defines the resources that are needed for a deployment. It may also specify deployment variables. Variables are used to provide input values when using the configuration.

Prerequisites

Limitations

  • When creating a new workspace, you can either automatically create services needed by the workspace or use existing services. If you want to use existing services from a different Azure subscription than the workspace, you must register the Azure Machine Learning namespace in the subscription that contains those services. For example, creating a workspace in subscription A that uses a storage account from subscription B, the Azure Machine Learning namespace must be registered in subscription B before you can use the storage account with the workspace.

    The resource provider for Azure Machine Learning is Microsoft.MachineLearningServices. For information on how to see if it is registered and how to register it, see the Azure resource providers and types article.

    Important

    This only applies to resources provided during workspace creation; Azure Storage Accounts, Azure Container Register, Azure Key Vault, and Application Insights.

Tip

An Azure Application Insights instance is created when you create the workspace. You can delete the Application Insights instance after cluster creation if you want. Deleting it limits the information gathered from the workspace, and may make it more difficult to troubleshoot problems. If you delete the Application Insights instance created by the workspace, you cannot re-create it without deleting and recreating the workspace.

For more information on using this Application Insights instance, see Monitor and collect data from Machine Learning web service endpoints.

Declare the Azure provider

Create the Terraform configuration file that declares the Azure provider:

  1. Create a new file named main.tf. If working with Azure Cloud Shell, use bash:

    code main.tf
    
  2. Paste the following code into the editor:

    main.tf:

    terraform {
      required_version = ">=1.0"
    
      required_providers {
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "=2.76.0"
        }
      }
    }
    
    provider "azurerm" {
      features {}
    }
    
    data "azurerm_client_config" "current" {}
    
    resource "azurerm_resource_group" "default" {
      name     = "rg-${var.name}-${var.environment}"
      location = var.location
    }
    
  3. Save the file (<Ctrl>S) and exit the editor (<Ctrl>Q).

Deploy a workspace

The following Terraform configurations can be used to create an Azure Machine Learning workspace. When you create an Azure Machine Learning workspace, various other services are required as dependencies. The template also specifies these associated resources to the workspace. Depending on your needs, you can choose to use the template that creates resources with either public or private network connectivity.

Some resources in Azure require globally unique names. Before deploying your resources using the following templates, set the name variable to a value that is unique.

variables.tf:

variable "name" {
  type        = string
  description = "Name of the deployment"
}

variable "environment" {
  type        = string
  description = "Name of the environment"
  default     = "dev"
}

variable "location" {
  type        = string
  description = "Location of the resources"
  default     = "East US"
}

workspace.tf:

# Dependent resources for Azure Machine Learning
resource "azurerm_application_insights" "default" {
  name                = "appi-${var.name}-${var.environment}"
  location            = azurerm_resource_group.default.location
  resource_group_name = azurerm_resource_group.default.name
  application_type    = "web"
}

resource "azurerm_key_vault" "default" {
  name                     = "kv-${var.name}-${var.environment}"
  location                 = azurerm_resource_group.default.location
  resource_group_name      = azurerm_resource_group.default.name
  tenant_id                = data.azurerm_client_config.current.tenant_id
  sku_name                 = "premium"
  purge_protection_enabled = false
}

resource "azurerm_storage_account" "default" {
  name                     = "st${var.name}${var.environment}"
  location                 = azurerm_resource_group.default.location
  resource_group_name      = azurerm_resource_group.default.name
  account_tier             = "Standard"
  account_replication_type = "GRS"
}

resource "azurerm_container_registry" "default" {
  name                = "cr${var.name}${var.environment}"
  location            = azurerm_resource_group.default.location
  resource_group_name = azurerm_resource_group.default.name
  sku                 = "Premium"
  admin_enabled       = true
}

# Machine Learning workspace
resource "azurerm_machine_learning_workspace" "default" {
  name                    = "mlw-${var.name}-${var.environment}"
  location                = azurerm_resource_group.default.location
  resource_group_name     = azurerm_resource_group.default.name
  application_insights_id = azurerm_application_insights.default.id
  key_vault_id            = azurerm_key_vault.default.id
  storage_account_id      = azurerm_storage_account.default.id
  container_registry_id   = azurerm_container_registry.default.id

  identity {
    type = "SystemAssigned"
  }
}

Troubleshooting

Resource provider errors

When creating an Azure Machine Learning workspace, or a resource used by the workspace, you may receive an error similar to the following messages:

  • No registered resource provider found for location {location}
  • The subscription is not registered to use namespace {resource-provider-namespace}

Most resource providers are automatically registered, but not all. If you receive this message, you need to register the provider mentioned.

The following table contains a list of the resource providers required by Azure Machine Learning:

Resource provider Why it's needed
Microsoft.MachineLearningServices Creating the Azure Machine Learning workspace.
Microsoft.Storage Azure Storage Account is used as the default storage for the workspace.
Microsoft.ContainerRegistry Azure Container Registry is used by the workspace to build Docker images.
Microsoft.KeyVault Azure Key Vault is used by the workspace to store secrets.
Microsoft.Notebooks/NotebookProxies Integrated notebooks on Azure Machine Learning compute instance.
Microsoft.ContainerService If you plan on deploying trained models to Azure Kubernetes Services.

If you plan on using a customer-managed key with Azure Machine Learning, then the following service providers must be registered:

Resource provider Why it's needed
Microsoft.DocumentDB/databaseAccounts Azure CosmosDB instance that logs metadata for the workspace.
Microsoft.Search/searchServices Azure Search provides indexing capabilities for the workspace.

For information on registering resource providers, see Resolve errors for resource provider registration.

Next steps