What tools are included on the Azure Data Science Virtual Machine?
The Data Science Virtual Machine is an easy way to explore data and do machine learning in the cloud. The Data Science Virtual Machines are pre-configured with the complete operating system, security patches, drivers, and popular data science and development software. You can choose the hardware environment, ranging from lower-cost CPU-centric machines to very powerful machines with multiple GPUs, NVMe storage, and large amounts of memory. For machines with GPUs, all drivers are installed, all machine learning frameworks are version-matched for GPU compatibility, and acceleration is enabled in all application software that supports GPUs.
The Data Science Virtual Machine comes with the most useful data-science tools pre-installed.
Build deep learning and machine learning solutions
| Tool | Windows Server 2019 DSVM | Ubuntu 18.04 DSVM | Usage notes |
|---|---|---|---|
| CUDA, cuDNN, NVIDIA Driver | ✅ | ✅ | CUDA, cuDNN, NVIDIA Driver on the DSVM |
| Horovod | ❌ | ✅ | Horovod on the DSVM |
| NVIDIA System Management Interface (nvidia-smi) | ✅ | ✅ | nvidia-smi on the DSVM |
| PyTorch | ✅ | ✅ | PyTorch on the DSVM |
| TensorFlow | ✅ | ✅ | TensorFlow on the DSVM |
| Integration with Azure Machine Learning (Python) | ✅ (Python SDK, samples) | ✅ (Python SDK, CLI, samples) | Azure ML SDK |
| XGBoost | ✅ (CUDA support) | ✅ (CUDA support) | XGBoost on the DSVM |
| Vowpal Wabbit | ✅ | ✅ | Vowpal Wabbit on the DSVM |
| Weka | ❌ | ❌ | |
| LightGBM | ❌ | ✅ (GPU, MPI support) | |
| H2O | ❌ | ✅ | |
| CatBoost | ❌ | ✅ | |
| Intel MKL | ❌ | ✅ | |
| OpenCV | ❌ | ✅ | |
| Dlib | ❌ | ✅ | |
| Docker | ✅ (Windows containers only) | ✅ | |
| NCCL | ❌ | ✅ | |
| Rattle | ❌ | ✅ | |
| ONNX Runtime | ❌ | ✅ | |
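
As a quick way to confirm that the GPU driver stack listed above is visible on a given VM, the following minimal sketch calls `nvidia-smi` and parses its CSV output. The helper names `parse_gpu_csv` and `query_gpus` are illustrative, not part of the DSVM. On a CPU-only VM size, or if `nvidia-smi` is not on the `PATH`, the sketch simply reports that no GPU was found.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader` output into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3:
            continue  # skip lines that do not match the three requested fields
        name, driver, memory = fields
        gpus.append(
            {"name": name, "driver_version": driver, "memory_total": memory}
        )
    return gpus


def query_gpus():
    """Return GPU details, or an empty list when nvidia-smi is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    result = subprocess.run(
        [exe, "--query-gpu=name,driver_version,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    gpus = query_gpus()
    if not gpus:
        print("No NVIDIA GPU detected (CPU-only VM size, or nvidia-smi not on PATH).")
    for gpu in gpus:
        print(f"{gpu['name']}: driver {gpu['driver_version']}, {gpu['memory_total']}")
```

Keeping the parsing separate from the subprocess call makes the sketch easy to check against saved `nvidia-smi` output from another machine.
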
Store, retrieve, and manipulate data
| Tool | Windows Server 2019 DSVM | Ubuntu 18.04 DSVM | Usage notes |
|---|---|---|---|
| Relational databases | SQL Server 2019 Developer Edition | SQL Server 2019 Developer Edition | SQL Server on the DSVM |
| Database tools | SQL Server Management Studio, SQL Server Integration Services, bcp, sqlcmd | SQuirreL SQL (querying tool), bcp, sqlcmd, ODBC/JDBC drivers | |
| Azure Storage Explorer | ✅ | ✅ | |
| Azure CLI | ✅ | ✅ | |
| AzCopy | ✅ | ❌ | AzCopy on the DSVM |
| Blob FUSE driver | ❌ | ✅ | blobfuse on the DSVM |
| Azure Cosmos DB Data Migration Tool | ✅ | ❌ | Cosmos DB on the DSVM |
| Unix/Linux command-line tools | ❌ | ✅ | |
| Apache Spark 3.1 (standalone) | ✅ | ✅ | |
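
Because the table above differs between the Windows and Ubuntu images, it can be handy to check which command-line data tools are actually reachable from the current shell. The sketch below uses only the Python standard library. The executable names in `CLI_TOOLS` (`az`, `azcopy`, `sqlcmd`, `bcp`, `blobfuse`) are assumptions about how these tools appear on the `PATH`; adjust them for your image.

```python
import shutil

# Assumed executable names for the tools in the table; not confirmed by the source.
CLI_TOOLS = ["az", "azcopy", "sqlcmd", "bcp", "blobfuse"]


def locate_tools(names):
    """Map each tool name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools(CLI_TOOLS).items():
        status = path if path else "not found on PATH"
        print(f"{name:10s} {status}")
```

A tool reported as missing may still be installed outside the `PATH`, so treat the result as a first check rather than a definitive inventory.
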
Program in Python, R, Julia, and Node.js
| Tool | Windows Server 2019 DSVM | Ubuntu 18.04 DSVM | Usage notes |
|---|---|---|---|
| CRAN-R with popular packages pre-installed | ✅ | ✅ | |
| Anaconda Python with popular packages pre-installed | ✅ (Miniconda) | ✅ (Miniconda) | |
| Julia (Julialang) | ✅ | ✅ | |
| JupyterHub (multiuser notebook server) | ❌ | ✅ | |
| JupyterLab (multiuser notebook server) | ✅ | ✅ | |
| Node.js | ✅ | ✅ | |
| Jupyter Notebook Server with the following kernels: | ✅ | ✅ | Jupyter Notebook samples |
| R | | | R Jupyter Samples |
| Python | | | Python Jupyter Samples |
| Julia | | | Julia Jupyter Samples |
| PySpark | | | PySpark Jupyter Samples |
The Ubuntu 18.04 DSVM and the Windows Server 2019 DSVM have the following Jupyter kernels:
- Python 3.8 - default
- Python 3.8 - PyTorch
- Python 3.8 - TensorFlow
- Python 3.6 - AzureML - TensorFlow
- Python 3.6 - AzureML - PyTorch
- Python 3.6 - AzureML - AutoML
- R
- Python 3.7 - Spark (local)
- Julia 1.2.0
- R Spark – HDInsight
- Scala Spark – HDInsight
- Python 3 Spark – HDInsight
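
To compare the kernel list above with what is registered on a particular VM, the following minimal sketch reads the output of `jupyter kernelspec list --json` and prints each kernel's display name. The helper names are illustrative, and the sketch assumes the `jupyter` executable is on the `PATH`; it returns an empty list otherwise.

```python
import json
import shutil
import subprocess


def kernel_display_names(kernelspec_json):
    """Extract sorted display names from `jupyter kernelspec list --json` output."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return sorted(info["spec"]["display_name"] for info in specs.values())


def list_kernels():
    """Return display names of registered kernels, or [] if they cannot be read."""
    exe = shutil.which("jupyter")
    if exe is None:
        return []
    result = subprocess.run(
        [exe, "kernelspec", "list", "--json"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    try:
        return kernel_display_names(result.stdout)
    except (ValueError, KeyError):
        return []  # output was not the expected JSON structure


if __name__ == "__main__":
    for name in list_kernels():
        print(name)
```
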
The Ubuntu 18.04 DSVM and the Windows Server 2019 DSVM have the following conda environments:
- py38_default
- py38_tensorflow
- py38_pytorch
- azureml_py36_tensorflow
- azureml_py36_pytorch
- azureml_py36_automl
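
The environment names above can be checked against a running VM with a short script. This sketch parses the output of `conda env list --json`, which reports environments as a list of paths under an `envs` key, and lists any expected environment that is absent. The function names and the `EXPECTED_ENVS` constant are illustrative; the environment names themselves come from the list above.

```python
import json
import re
import shutil
import subprocess

# Environment names taken from the list above.
EXPECTED_ENVS = [
    "py38_default",
    "py38_tensorflow",
    "py38_pytorch",
    "azureml_py36_tensorflow",
    "azureml_py36_pytorch",
    "azureml_py36_automl",
]


def env_names(conda_json):
    """Return the set of environment directory names from conda's JSON output."""
    paths = json.loads(conda_json).get("envs", [])
    # Split on both separators so Windows and Linux paths are handled alike.
    return {re.split(r"[\\/]", path.rstrip("\\/"))[-1] for path in paths}


def missing_envs(conda_json, expected=EXPECTED_ENVS):
    """Return the expected environment names that conda did not report."""
    found = env_names(conda_json)
    return [name for name in expected if name not in found]


if __name__ == "__main__":
    exe = shutil.which("conda")
    if exe is None:
        print("conda not found on PATH")
    else:
        result = subprocess.run(
            [exe, "env", "list", "--json"],
            capture_output=True, text=True, check=False,
        )
        if result.returncode == 0:
            print("Missing environments:", missing_envs(result.stdout) or "none")
        else:
            print("conda env list failed")
```

Once an environment is confirmed, it can be activated in a terminal with `conda activate py38_pytorch` (or any other name from the list).
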
Use your preferred editor or IDE
| Tool | Windows Server 2019 DSVM | Ubuntu 18.04 DSVM | Usage notes |
|---|---|---|---|
| Notepad++ | ✅ | ❌ | |
| Nano | ✅ | ❌ | |
| Visual Studio 2019 Community Edition | ✅ | ❌ | Visual Studio on the DSVM |
| Visual Studio Code | ✅ | ✅ | Visual Studio Code on the DSVM |
| RStudio Desktop | ✅ | ✅ | RStudio Desktop on the DSVM |
| RStudio Server (disabled by default) | ❌ | ✅ | |
| PyCharm Community Edition | ✅ | ✅ | PyCharm on the DSVM |
| IntelliJ IDEA | ❌ | ✅ | |
| Vim | ❌ | ✅ | |
| Emacs | ❌ | ✅ | |
| Git and Git Bash | ✅ | ✅ | |
| OpenJDK 11 | ✅ | ✅ | |
| .NET Framework | ✅ | ❌ | |
| Azure SDK | ✅ | ✅ | |
Organize & present results
| Tool | Windows Server 2019 DSVM | Ubuntu 18.04 DSVM | Usage notes |
|---|---|---|---|
| Microsoft 365 (Word, Excel, PowerPoint) | ✅ | ❌ | |
| Microsoft Teams | ✅ | ❌ | |
| Power BI Desktop | ✅ | ❌ | |
| Microsoft Edge Browser | ✅ | ✅ | |