Tutorial: How to create a secure workspace by using a template

Templates provide a convenient way to create reproducible service deployments. The template defines what will be created, and you provide some information when you use it. For example, you specify a unique name for the Azure Machine Learning workspace.

In this tutorial, you learn how to use a Microsoft Bicep or HashiCorp Terraform template to create the following Azure resources:

  • Azure Virtual Network. The following resources are secured behind this VNet:
    • Azure Machine Learning workspace
      • Azure Machine Learning compute instance
      • Azure Machine Learning compute cluster
    • Azure Storage Account
    • Azure Key Vault
    • Azure Application Insights
    • Azure Container Registry
    • Azure Bastion host
    • Azure Machine Learning Virtual Machine (Data Science Virtual Machine)
    • The Bicep template also creates an Azure Kubernetes Service cluster, and a separate resource group for it.

Tip

Microsoft recommends using Azure Machine Learning managed virtual networks instead of the steps in this article. With a managed virtual network, Azure Machine Learning handles the job of network isolation for your workspace and managed computes. You can also add private endpoints for resources needed by the workspace, such as Azure Storage Account. For more information, see Workspace managed network isolation.

Prerequisites

Before using the steps in this article, you must have an Azure subscription. If you don't have an Azure subscription, create a free account.

You must also have either a Bash or Azure PowerShell command line.

Tip

When reading this article, use the tabs in each section to select whether to view information on using Bicep or Terraform templates.

  1. To install the command-line tools, see Set up Bicep development and deployment environments.

  2. The Bicep template used in this article is located at https://github.com/Azure/azure-quickstart-templates/blob/master/quickstarts/microsoft.machinelearningservices/machine-learning-end-to-end-secure. Use the following commands to clone the GitHub repo to your development environment:

    Tip

    If you do not have the git command in your development environment, you can install it from https://git-scm.com/.

    git clone https://github.com/Azure/azure-quickstart-templates
    cd azure-quickstart-templates/quickstarts/microsoft.machinelearningservices/machine-learning-end-to-end-secure
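
Before cloning and deploying, it can help to confirm that the command-line tools used in this article are on your PATH. The following is a minimal sketch, assuming a POSIX-compatible shell; the function name require_tools is hypothetical and not part of the template or the Azure CLI:

```shell
# Report any required command-line tool that is missing from PATH.
# Returns 0 only when every tool named in the arguments is available.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf 'Missing required tool: %s\n' "$tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# git clones the repository; az is the Azure CLI used by the deployment commands.
require_tools git az || echo "Install the missing tools before continuing."
```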
    

Understanding the template

The Bicep template is made up of main.bicep and the .bicep files in the modules subdirectory. The following table describes what each file is responsible for:

File                              Description
main.bicep                        Defines parameters and variables, and passes them to the modules in the modules subdirectory.
vnet.bicep                        Defines the Azure Virtual Network and subnets.
nsg.bicep                         Defines the network security group rules for the VNet.
bastion.bicep                     Defines the Azure Bastion host and subnet. Azure Bastion allows you to easily access a VM inside the VNet using your web browser.
dsvmjumpbox.bicep                 Defines the Data Science Virtual Machine (DSVM). Azure Bastion is used to access this VM through your web browser.
storage.bicep                     Defines the Azure Storage account used by the workspace for default storage.
keyvault.bicep                    Defines the Azure Key Vault used by the workspace.
containerregistry.bicep           Defines the Azure Container Registry used by the workspace.
applicationinsights.bicep         Defines the Azure Application Insights instance used by the workspace.
machinelearningnetworking.bicep   Defines the private endpoints and DNS zones for the Azure Machine Learning workspace.
machinelearning.bicep             Defines the Azure Machine Learning workspace.
machinelearningcompute.bicep      Defines an Azure Machine Learning compute cluster and compute instance.
privateaks.bicep                  Defines an Azure Kubernetes Service cluster instance.

Important

The example templates may not always use the latest API version for Azure Machine Learning. Before using the template, we recommend modifying it to use the latest API versions. For information on the latest API versions for Azure Machine Learning, see the Azure Machine Learning REST API.

Each Azure service has its own set of API versions. For information on the API for a specific service, check the service information in the Azure REST API reference.

To update the API version, find the Microsoft.MachineLearningServices/<resource> entry for the resource type and update it to the latest version. The following example is an entry for the Azure Machine Learning workspace that uses an API version of 2022-05-01:

resource machineLearning 'Microsoft.MachineLearningServices/workspaces@2022-05-01' = {
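
One way to see which API versions the cloned template currently references is to search the .bicep files for Microsoft.MachineLearningServices entries. The following is a minimal sketch, assuming a POSIX-style shell with find and a grep that supports the -o option (GNU and BSD grep both do); the function name list_ml_api_versions is hypothetical and not part of the template:

```shell
# List each Azure Machine Learning resource type and API version referenced
# by the .bicep files under a directory (default: the current directory).
list_ml_api_versions() {
    find "${1:-.}" -type f -name '*.bicep' -exec \
        grep -hoE 'Microsoft\.MachineLearningServices/[A-Za-z/]+@[0-9]{4}-[0-9]{2}-[0-9]{2}(-preview)?' {} + \
        | sort -u
}
```

Run from the machine-learning-end-to-end-secure directory, list_ml_api_versions prints one line per distinct resource type and version, which you can compare against the versions listed in the REST API reference.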

Important

The DSVM and Azure Bastion are used as an easy way to connect to the secured workspace for this tutorial. In a production environment, we recommend using an Azure VPN gateway or Azure ExpressRoute to access the resources inside the VNet directly from your on-premises network.

Configure the template

To run the Bicep template, use the following commands from the machine-learning-end-to-end-secure directory, where the main.bicep file is located:

  1. To create a new Azure Resource Group, use the following command. Replace exampleRG with your resource group name, and eastus with the Azure region you want to use:

    az group create --name exampleRG --location eastus
    
  2. To run the template, use the following command. Replace prefix with a unique prefix, which is used when creating the Azure resources that Azure Machine Learning requires. Replace securepassword with a secure password for the jump box login account (azureadmin in the following example):

    Tip

    The prefix must be 5 characters or fewer. It can't be entirely numeric or contain the following characters: ~ ! @ # $ % ^ & * ( ) = + _ [ ] { } \ | ; : . ' " , < > / ?.

    az deployment group create \
        --resource-group exampleRG \
        --template-file main.bicep \
        --parameters \
        prefix=prefix \
        dsvmJumpboxUsername=azureadmin \
        dsvmJumpboxPassword=securepassword
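
Because an invalid prefix only surfaces as an error after the deployment starts, you might check the value locally first. The following is a minimal sketch of the prefix rules stated in the tip above, assuming a POSIX-compatible shell with tr; the function name is_valid_prefix is hypothetical, and the template itself might enforce additional rules that aren't covered here:

```shell
# Return 0 when the prefix is 1 to 5 characters long, is not entirely
# numeric, and contains none of the characters listed in the tip.
is_valid_prefix() {
    # Length check: at least 1 and at most 5 characters.
    [ "${#1}" -ge 1 ] && [ "${#1}" -le 5 ] || return 1

    # Reject values made up only of digits.
    case $1 in
        *[!0-9]*) ;;      # at least one non-digit character: acceptable
        *) return 1 ;;    # entirely numeric
    esac

    # Delete every disallowed character; if anything was removed,
    # the prefix contained at least one of them.
    forbidden='~!@#$%^&*()=+_[]{}\\|;:.",<>/?'"'"
    stripped=$(printf '%s' "$1" | tr -d "$forbidden")
    [ "$stripped" = "$1" ]
}
```

For example, `is_valid_prefix ml01 && echo ok` succeeds, while values such as 12345 or ab_c are rejected.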
    

Connect to the workspace

After the template completes, use the following steps to connect to the DSVM:

  1. From the Azure portal, select the Azure Resource Group you used with the template. Then, select the Data Science Virtual Machine that was created by the template. If you have trouble finding it, use the filters section to filter the Type to virtual machine.

    Screenshot of filtering and selecting the vm.

  2. From the Overview section of the Virtual Machine, select Connect, and then select Bastion from the dropdown.

    Screenshot of selecting to connect using Bastion.

  3. When prompted, provide the username and password you specified when configuring the template and then select Connect.

    Important

    The first time you connect to the DSVM desktop, a PowerShell window opens and begins running a script. Allow this to complete before continuing with the next step.

  4. From the DSVM desktop, start Microsoft Edge and enter https://ml.azure.com as the address. Sign in to your Azure subscription, and then select the workspace created by the template. The studio for your workspace is displayed.

Troubleshooting

Error: Windows computer name cannot be more than 15 characters long, be entirely numeric, or contain the following characters

This error can occur when the name for the DSVM jump box is longer than 15 characters, is entirely numeric, or includes one of the following characters: ~ ! @ # $ % ^ & * ( ) = + _ [ ] { } \ | ; : . ' " , < > / ?.

When using the Bicep template, the jump box name is generated programmatically from the prefix value provided to the template. To make sure the name does not exceed 15 characters or contain any invalid characters, use a prefix that is 5 characters or fewer and do not use any of the following characters in the prefix: ~ ! @ # $ % ^ & * ( ) = + _ [ ] { } \ | ; : . ' " , < > / ?.

When using the Terraform template, the jump box name is passed using the dsvm_name parameter. To avoid this error, use a name that is not greater than 15 characters and does not use any of the following characters as part of the name: ~ ! @ # $ % ^ & * ( ) = + _ [ ] { } \ | ; : . ' " , < > / ?.

Next steps

Important

The Data Science Virtual Machine (DSVM) and any compute instance resources bill you for every hour that they are running. To avoid excess charges, you should stop these resources when they are not in use.
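
Stopping these resources can also be done from the command line. The following is a minimal sketch, assuming the Azure CLI with the ml extension installed; the run and stop_billable_resources helpers are hypothetical, and the resource names in the usage line are placeholders that you replace with the names the template created in your resource group. The VM is deallocated rather than only shut down, because a VM that is stopped but not deallocated continues to accrue compute charges:

```shell
# Print the command when DRY_RUN=1; otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Arguments: resource group, DSVM name, workspace name, compute instance name.
stop_billable_resources() {
    # Deallocate the jump box VM so that compute billing stops.
    run az vm deallocate --resource-group "$1" --name "$2"
    # Stop the Azure Machine Learning compute instance.
    run az ml compute stop --resource-group "$1" --workspace-name "$3" --name "$4"
}
```

To preview the commands without running them, set DRY_RUN=1 and call, for example, `stop_billable_resources exampleRG my-dsvm my-workspace my-instance`.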

To continue learning how to use the secured workspace from the DSVM, see Tutorial: Azure Machine Learning in a day.

To learn more about common secure workspace configurations and input/output requirements, see Azure Machine Learning secure workspace traffic flow.