What is an Azure Machine Learning compute instance?
An Azure Machine Learning compute instance is a managed cloud-based workstation for data scientists.
Compute instances make it easy to get started with Azure Machine Learning development and provide management and enterprise readiness capabilities for IT administrators.
Use a compute instance as your fully configured and managed development environment in the cloud for machine learning. Compute instances can also be used as a compute target for training and inferencing during development and testing.
For Jupyter functionality on a compute instance to work, websocket communication must not be disabled. Ensure that your network allows websocket connections to *.instances.azureml.net and *.instances.azureml.ms.
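When auditing a proxy or firewall allow-list, it can help to check whether a given hostname falls under the two wildcard domains named above. The following is a minimal sketch using only the Python standard library; the function name and the sample hostnames are hypothetical, and the check covers the hostname pattern only, not whether websocket traffic is actually permitted.

```python
from fnmatch import fnmatch

# Wildcard domains taken from the paragraph above.
ALLOWED_PATTERNS = ("*.instances.azureml.net", "*.instances.azureml.ms")


def is_compute_instance_host(hostname: str) -> bool:
    """Return True if hostname matches one of the compute instance wildcard domains."""
    # Normalize case and drop a trailing dot so the comparison is stable.
    host = hostname.lower().rstrip(".")
    return any(fnmatch(host, pattern) for pattern in ALLOWED_PATTERNS)


# Hypothetical hostnames, for illustration only.
print(is_compute_instance_host("myci.westus2.instances.azureml.ms"))
print(is_compute_instance_host("example.com"))
```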
Why use a compute instance?
A compute instance is a fully managed cloud-based workstation optimized for your machine learning development environment. It provides the following benefits:
|Productivity||You can build and deploy models using integrated notebooks and the following tools in Azure Machine Learning studio:
- VS Code (preview)
- RStudio (preview)
Compute instance is fully integrated with Azure Machine Learning workspace and studio. You can share notebooks and data with other data scientists in the workspace.
You can also use VS Code with compute instances.
|Managed & secure||Reduce your security footprint and add compliance with enterprise security requirements. Compute instances provide robust management policies and secure networking configurations such as:
- Autoprovisioning from Resource Manager templates or Azure Machine Learning SDK
- Azure role-based access control (Azure RBAC)
- Virtual network support
- SSH policy to enable/disable SSH access
- TLS 1.2 enabled
|Preconfigured for ML||Save time on setup tasks with pre-configured and up-to-date ML packages, deep learning frameworks, GPU drivers.|
|Fully customizable||Broad support for Azure VM types including GPUs and persisted low-level customization such as installing packages and drivers makes advanced scenarios a breeze.|
You can also use a setup script (preview) for an automated way to customize and configure the compute instance as per your needs.
Tools and environments
Azure Machine Learning compute instance enables you to author, train, and deploy models in a fully integrated notebook experience in your workspace.
The following tools and environments are already installed on the compute instance:
|General tools & environments||Details|
|Intel MPI library|
|Azure Machine Learning samples|
|R tools & environments||Details|
|RStudio Server Open Source Edition (preview)|
|Azure Machine Learning SDK for R||azuremlsdk, SDK samples|
|Python tools & environments||Details|
|Jupyter and extensions|
|JupyterLab and extensions|
|Azure Machine Learning SDK for Python from PyPI||Includes most of the azureml extra packages. To see the full list, open a terminal window on your compute instance and run
|Other PyPI packages||
|Deep learning packages||
|Azure Machine Learning Python & R SDK samples|
Python packages are all installed in the Python 3.8 - AzureML environment. The compute instance uses Ubuntu 18.04 as its base OS.
Notebooks and R scripts are stored in the default storage account of your workspace in an Azure file share. These files are located under your “User files” directory. This storage makes it easy to share notebooks between compute instances. The storage account also keeps your notebooks safely preserved when you stop or delete a compute instance.
The Azure file share of your workspace is mounted as a drive on the compute instance. This drive is the default working directory for Jupyter, JupyterLab, and RStudio. This means that the notebooks and other files you create in Jupyter, JupyterLab, or RStudio are automatically stored on the file share and are available to use in other compute instances as well.
The files in the file share are accessible from all compute instances in the same workspace. Any changes to these files on the compute instance will be reliably persisted back to the file share.
You can also clone the latest Azure Machine Learning samples to your folder under the user files directory in the workspace file share.
Writing small files can be slower on network drives than writing to the local disk of the compute instance. If you are writing many small files, try using a directory directly on the compute instance, such as the /tmp directory. Note that these files will not be accessible from other compute instances.
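One way to apply the advice above is to write the small files to local scratch space and then copy a single archive to the shared location. This is a sketch under stated assumptions: the function name, file names, and the archive step are illustrative and not part of Azure Machine Learning, and `tempfile.TemporaryDirectory()` is assumed to resolve to /tmp (its usual default on Linux when TMPDIR is unset).

```python
import shutil
import tempfile
from pathlib import Path


def write_small_files(count: int, dest_dir: Path) -> Path:
    """Write `count` small files to local scratch space, then store one zip in dest_dir."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    # The scratch directory lives on the local disk and is removed on exit.
    with tempfile.TemporaryDirectory() as scratch:
        scratch_path = Path(scratch)
        for i in range(count):
            (scratch_path / f"part_{i:05d}.txt").write_text(f"record {i}\n")
        # A single large write to the destination replaces many small ones.
        archive = shutil.make_archive(str(dest_dir / "parts"), "zip", root_dir=scratch)
    return Path(archive)
```

Here `dest_dir` would typically be a folder on the mounted file share, so the archive stays reachable from other compute instances even though the scratch files do not.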
Do not store training data on the notebooks file share. You can use the /tmp directory on the compute instance for your temporary data. However, do not write very large data files to the OS disk of the compute instance. The OS disk on a compute instance has 128 GB capacity. You can also store temporary training data on the temporary disk mounted on /mnt. The temporary disk size depends on the VM size chosen, so a larger VM size can store larger amounts of data. You can also mount datastores and datasets.
Managing a compute instance
In your workspace in Azure Machine Learning studio, select Compute, then select Compute Instance at the top.
For more about managing the compute instance, see Create and manage an Azure Machine Learning compute instance.
Create a compute instance
As an administrator, you can create a compute instance for others in the workspace (preview).
You can also use a setup script (preview) for an automated way to customize and configure the compute instance.
To create a compute instance for yourself, use your workspace in Azure Machine Learning studio: create a new compute instance from either the Compute section or the Notebooks section when you are ready to run one of your notebooks.
You can also create an instance:
- Directly from the integrated notebooks experience
- In the Azure portal
- From an Azure Resource Manager template. For an example template, see the create an Azure Machine Learning compute instance template.
- With the Azure Machine Learning SDK
- From the CLI extension for Azure Machine Learning
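For the SDK option in the list above, creation might look like the following sketch. It assumes the azureml-core package (Python SDK v1) is installed and that a config.json for the workspace is available locally; the function name, the instance name, and the default VM size are placeholders rather than recommendations.

```python
def create_compute_instance(name: str, vm_size: str = "STANDARD_DS3_V2"):
    """Return the named compute instance, creating it if it does not exist yet."""
    # Imported lazily so this module still loads where azureml-core is absent.
    from azureml.core import Workspace
    from azureml.core.compute import ComputeInstance
    from azureml.core.compute_target import ComputeTargetException

    # Reads workspace details from a local config.json (assumed to exist).
    ws = Workspace.from_config()
    try:
        # Reuse an existing instance with this name, if there is one.
        instance = ComputeInstance(workspace=ws, name=name)
    except ComputeTargetException:
        config = ComputeInstance.provisioning_configuration(
            vm_size=vm_size,
            ssh_public_access=False,  # SSH stays disabled unless explicitly enabled
        )
        instance = ComputeInstance.create(ws, name, config)
        instance.wait_for_completion(show_output=True)
    return instance


# Example call (requires Azure credentials and a workspace):
# instance = create_compute_instance("my-instance-name")
```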
The dedicated cores per region per VM family quota and the total regional quota, which apply to compute instance creation, are unified and shared with the Azure Machine Learning training compute cluster quota. Stopping the compute instance does not release quota, which ensures that you will be able to restart the compute instance. Do not stop the compute instance through the OS terminal by running sudo shutdown.
A compute instance comes with a P10 OS disk. The temp disk type depends on the VM size chosen. Currently, it is not possible to change the OS disk type.
Compute instances can be used as a training compute target similar to Azure Machine Learning compute training clusters.
A compute instance:
- Has a job queue.
- Runs jobs securely in a virtual network environment, without requiring enterprises to open an SSH port. The job executes in a containerized environment and packages your model dependencies in a Docker container.
- Can run multiple small jobs in parallel (preview). Two jobs per core can run in parallel while the rest of the jobs are queued.
- Supports single-node multi-GPU distributed training jobs.
You can use a compute instance as a local inferencing deployment target for test/debug scenarios.
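The parallelism rule in the list above (two jobs per core, with the rest queued) can be worked through as simple arithmetic. This sketch only illustrates that rule; the function name is hypothetical and it does not call or model the actual scheduler.

```python
def split_jobs(total_jobs: int, cores: int, jobs_per_core: int = 2) -> tuple[int, int]:
    """Return (running, queued) for a number of submitted jobs on an instance."""
    if total_jobs < 0 or cores < 1 or jobs_per_core < 1:
        raise ValueError("total_jobs must be >= 0; cores and jobs_per_core must be >= 1")
    capacity = cores * jobs_per_core      # jobs that can run at the same time
    running = min(total_jobs, capacity)   # jobs that start immediately
    return running, total_jobs - running  # the remainder waits in the queue


# On a 4-core instance, 10 submitted jobs means 8 run and 2 wait.
print(split_jobs(10, cores=4))
```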
The compute instance has a 120 GB OS disk. If you run out of disk space and the instance gets into an unusable state, clear at least 5 GB of disk space on the OS disk (mounted on /) through the compute instance terminal by removing files or folders, and then run sudo reboot. To access the terminal, go to the compute list page or the compute instance details page and select the Terminal link. You can check available disk space by running df -h in the terminal. Do not stop or restart the compute instance through the studio until 5 GB of disk space has been cleared.
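As a scripted counterpart to df -h, the free space on the OS disk can be checked from Python before rebooting. This is a minimal sketch using the standard library; the function names are hypothetical, and treating the 5 GB threshold as 5 GiB (1024³ bytes each) is an assumption.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB; assumes the 5 GB threshold is meant as GiB


def free_space_gb(path: str = "/") -> float:
    """Return the free space, in GiB, on the disk that holds `path`."""
    return shutil.disk_usage(path).free / GIB


def safe_to_reboot(path: str = "/", minimum_gb: float = 5.0) -> bool:
    """Return True once at least `minimum_gb` is free on the OS disk."""
    return free_space_gb(path) >= minimum_gb


print(f"Free on /: {free_space_gb():.1f} GiB, safe to reboot: {safe_to_reboot()}")
```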