GPU optimized virtual machine sizes

GPU optimized VM sizes are specialized virtual machines available with single or multiple NVIDIA GPUs. These sizes are designed for compute-intensive, graphics-intensive, and visualization workloads. This article provides information about the number and type of GPUs, vCPUs, data disks, and NICs. Storage throughput and network bandwidth are also included for each size in this grouping.

  • NC, NCv2, NCv3, ND, and NDv2 sizes are optimized for compute-intensive and network-intensive applications and algorithms. Some examples are CUDA- and OpenCL-based applications and simulations, AI, and Deep Learning. The NCv3-series is focused on high-performance computing workloads featuring NVIDIA’s Tesla V100 GPU. The ND-series is focused on training and inference scenarios for deep learning. It uses the NVIDIA Tesla P40 GPU.
  • NV and NVv2 sizes are optimized and designed for remote visualization, streaming, gaming, encoding, and VDI scenarios using frameworks such as OpenGL and DirectX. These VMs are backed by the NVIDIA Tesla M60 GPU.

NC-series

Premium Storage: Not Supported

Premium Storage Caching: Not Supported

NC-series VMs are powered by the NVIDIA Tesla K80 card. Users can crunch through data faster by leveraging CUDA for energy exploration applications, crash simulations, ray-traced rendering, deep learning, and more. The NC24r configuration provides a low-latency, high-throughput network interface optimized for tightly coupled parallel computing workloads.

| Size | vCPU | Memory: GiB | Temp storage (SSD): GiB | GPU | GPU memory: GiB | Max data disks | Max NICs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Standard_NC6 | 6 | 56 | 340 | 1 | 12 | 24 | 1 |
| Standard_NC12 | 12 | 112 | 680 | 2 | 24 | 48 | 2 |
| Standard_NC24 | 24 | 224 | 1440 | 4 | 48 | 64 | 4 |
| Standard_NC24r* | 24 | 224 | 1440 | 4 | 48 | 64 | 4 |

1 GPU = one-half K80 card.

*RDMA capable

NCv2-series

Premium Storage: Supported

Premium Storage Caching: Supported

NCv2-series VMs are powered by NVIDIA Tesla P100 GPUs. These GPUs can provide more than 2x the computational performance of the NC-series. Customers can take advantage of these updated GPUs for traditional HPC workloads such as reservoir modeling, DNA sequencing, protein analysis, Monte Carlo simulations, and others. The NC24rs v2 configuration provides a low-latency, high-throughput network interface optimized for tightly coupled parallel computing workloads.

Important

For this size family, the vCPU (core) quota in your subscription is initially set to 0 in each region. Request a vCPU quota increase for this family in an available region.
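
Before filing the request, you can check the current per-family vCPU usage and limits in a region by querying the compute usage API. The following is a minimal sketch, assuming the Python azure-identity and azure-mgmt-compute packages are installed; the subscription ID and region are placeholders, and the family name shown in the comment is illustrative.

```python
# Minimal sketch: list per-family vCPU quota and current usage for a region.
# Assumes azure-identity and azure-mgmt-compute are installed and you are
# logged in to an identity that DefaultAzureCredential can pick up.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

subscription_id = "<your-subscription-id>"   # placeholder
region = "eastus"                            # pick a region where the family is available

client = ComputeManagementClient(DefaultAzureCredential(), subscription_id)

# usage.list returns one entry per quota bucket (for example, a name such as
# "standardNCSv2Family" for this size family).
for usage in client.usage.list(region):
    name = usage.name.value
    if "NC" in name or "ND" in name or "NV" in name:
        print(f"{name}: {usage.current_value}/{usage.limit} vCPUs")
```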

| Size | vCPU | Memory: GiB | Temp storage (SSD): GiB | GPU | GPU memory: GiB | Max data disks | Max NICs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Standard_NC6s_v2 | 6 | 112 | 736 | 1 | 16 | 12 | 4 |
| Standard_NC12s_v2 | 12 | 224 | 1474 | 2 | 32 | 24 | 8 |
| Standard_NC24s_v2 | 24 | 448 | 2948 | 4 | 64 | 32 | 8 |
| Standard_NC24rs_v2* | 24 | 448 | 2948 | 4 | 64 | 32 | 8 |

1 GPU = one P100 card.

*RDMA capable

NCv3-series

Premium Storage: Supported

Premium Storage Caching: Supported

NCv3-series VMs are powered by NVIDIA Tesla V100 GPUs. These GPUs can provide 1.5x the computational performance of the NCv2-series. Customers can take advantage of these updated GPUs for traditional HPC workloads such as reservoir modeling, DNA sequencing, protein analysis, Monte Carlo simulations, and others. The NC24rs v3 configuration provides a low-latency, high-throughput network interface optimized for tightly coupled parallel computing workloads.

Important

For this size family, the vCPU (core) quota in your subscription is initially set to 0 in each region. Request a vCPU quota increase for this family in an available region.

| Size | vCPU | Memory: GiB | Temp storage (SSD): GiB | GPU | GPU memory: GiB | Max data disks | Max NICs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Standard_NC6s_v3 | 6 | 112 | 736 | 1 | 16 | 12 | 4 |
| Standard_NC12s_v3 | 12 | 224 | 1474 | 2 | 32 | 24 | 8 |
| Standard_NC24s_v3 | 24 | 448 | 2948 | 4 | 64 | 32 | 8 |
| Standard_NC24rs_v3* | 24 | 448 | 2948 | 4 | 64 | 32 | 8 |

1 GPU = one V100 card.

*RDMA capable

NDv2-series (Preview)

Premium Storage: Supported

Premium Storage Caching: Supported

InfiniBand: Not supported

The NDv2-series virtual machine is a new addition to the GPU family, designed for the needs of HPC, AI, and machine learning workloads. It's powered by 8 NVIDIA Tesla V100 NVLink-interconnected GPUs, 40 Intel Skylake cores, and 672 GiB of system memory. NDv2 instances provide excellent FP32 and FP64 performance for HPC and AI workloads utilizing CUDA, TensorFlow, PyTorch, Caffe, and other frameworks.

Sign up to get access to these machines during the preview.

| Size | vCPUs | GPU | Memory | NICs (max) | Max disk size | Max data disks (1023 GB each) | Max network bandwidth |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Standard_ND40s_v2 | 40 | 8 V100 (NVLink) | 672 GiB | 8 | Temporary: 1344 / 2948 GiB | 32 | 24,000 Mbps |

ND-series

Premium Storage: Supported

Premium Storage Caching: Supported

The ND-series virtual machines are a new addition to the GPU family, designed for AI and deep learning workloads. They offer excellent performance for training and inference. ND instances are powered by NVIDIA Tesla P40 GPUs and provide excellent performance for single-precision floating-point operations and for AI workloads utilizing Microsoft Cognitive Toolkit, TensorFlow, Caffe, and other frameworks. The ND-series also offers a much larger GPU memory size (24 GB), enabling you to fit much larger neural network models. Like the NC-series, the ND-series offers a configuration with a secondary low-latency, high-throughput network through RDMA and InfiniBand connectivity, so you can run large-scale training jobs spanning many GPUs.

Important

For this size family, the vCPU (core) quota per region in your subscription is initially set to 0. Request a vCPU quota increase for this family in an available region.

| Size | vCPU | Memory: GiB | Temp storage (SSD): GiB | GPU | GPU memory: GiB | Max data disks | Max NICs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Standard_ND6s | 6 | 112 | 736 | 1 | 24 | 12 | 4 |
| Standard_ND12s | 12 | 224 | 1474 | 2 | 48 | 24 | 8 |
| Standard_ND24s | 24 | 448 | 2948 | 4 | 96 | 32 | 8 |
| Standard_ND24rs* | 24 | 448 | 2948 | 4 | 96 | 32 | 8 |

1 GPU = one P40 card.

*RDMA capable

NV-series

Premium Storage: Not Supported

Premium Storage Caching: Not Supported

The NV-series virtual machines are powered by NVIDIA Tesla M60 GPUs and NVIDIA GRID technology for desktop-accelerated applications and virtual desktops where customers can visualize their data or simulations. Users can run graphics-intensive workflows on NV instances for superior graphics capability and can additionally run single-precision workloads such as encoding and rendering.

Each GPU in NV instances comes with a GRID license. This license gives you the flexibility to use an NV instance as a virtual workstation for a single user, or to let 25 concurrent users connect to the VM in a virtual application scenario.

| Size | vCPU | Memory: GiB | Temp storage (SSD): GiB | GPU | GPU memory: GiB | Max data disks | Max NICs | Virtual Workstations | Virtual Applications |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Standard_NV6 | 6 | 56 | 340 | 1 | 8 | 24 | 1 | 1 | 25 |
| Standard_NV12 | 12 | 112 | 680 | 2 | 16 | 48 | 2 | 2 | 50 |
| Standard_NV24 | 24 | 224 | 1440 | 4 | 32 | 64 | 4 | 4 | 100 |

1 GPU = one-half M60 card.

NVv2-series (Preview)

Premium Storage: Supported

Premium Storage Caching: Supported

The NVv2-series virtual machines are powered by NVIDIA Tesla M60 GPUs and NVIDIA GRID technology, with Intel Broadwell CPUs. These virtual machines are targeted at GPU-accelerated graphics applications and virtual desktops where customers want to visualize their data, simulate results, work on CAD, or render and stream content. Additionally, these virtual machines can run single-precision workloads such as encoding and rendering. NVv2 virtual machines support Premium Storage and come with twice the system memory (RAM) of their predecessor, the NV-series.

Each GPU in NVv2 instances comes with a GRID license. This license gives you the flexibility to use an NVv2 instance as a virtual workstation for a single user, or to let 25 concurrent users connect to the VM in a virtual application scenario.

| Size | vCPU | Memory: GiB | Temp storage (SSD): GiB | GPU | GPU memory: GiB | Max data disks | Max NICs | Virtual Workstations | Virtual Applications |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Standard_NV6s_v2 | 6 | 112 | 320 | 1 | 8 | 12 | 4 | 1 | 25 |
| Standard_NV12s_v2 | 12 | 224 | 640 | 2 | 16 | 24 | 8 | 2 | 50 |
| Standard_NV24s_v2 | 24 | 448 | 1280 | 4 | 32 | 32 | 8 | 4 | 100 |

1 GPU = one-half M60 card.

Size table definitions

  • Storage capacity is shown in units of GiB or 1024^3 bytes. When comparing disks measured in GB (1000^3 bytes) to disks measured in GiB (1024^3 bytes), remember that capacity numbers given in GiB may appear smaller. For example, 1023 GiB = 1098.4 GB (see the conversion sketch after this list).
  • Disk throughput is measured in input/output operations per second (IOPS) and MBps where MBps = 10^6 bytes/sec.
  • Data disks can operate in cached or uncached modes. For cached data disk operation, the host cache mode is set to ReadOnly or ReadWrite. For uncached data disk operation, the host cache mode is set to None.
  • To get the best performance for your VMs, limit the number of data disks to two disks per vCPU.
  • Expected network bandwidth is the maximum aggregated bandwidth allocated per VM type across all NICs, for all destinations. Upper limits are not guaranteed, but are intended to provide guidance for selecting the right VM type for the intended application. Actual network performance will depend on a variety of factors including network congestion, application loads, and network settings. For information on optimizing network throughput, see Optimizing network throughput for Windows and Linux. To achieve the expected network performance on Linux or Windows, it may be necessary to select a specific version or optimize your VM. For more information, see How to reliably test for virtual machine throughput.
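
To make the GiB/GB comparison and the two-data-disks-per-vCPU guideline above concrete, here is a small Python sketch; the example sizes are taken from the tables in this article.

```python
# Convert between GiB (1024^3 bytes) and GB (1000^3 bytes), and apply the
# "2 data disks per vCPU" performance guideline from the list above.

GIB = 1024 ** 3
GB = 1000 ** 3

def gib_to_gb(gib: float) -> float:
    return gib * GIB / GB

# The example from the list above: a 1023 GiB disk expressed in GB.
print(f"1023 GiB = {gib_to_gb(1023):.1f} GB")               # ~1098.4 GB

def recommended_max_data_disks(vcpus: int, size_limit: int) -> int:
    """Lesser of the size's published max data disks and 2 disks per vCPU."""
    return min(size_limit, 2 * vcpus)

# Standard_NC24 publishes 24 vCPUs and a maximum of 64 data disks.
print(recommended_max_data_disks(vcpus=24, size_limit=64))  # 48
```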

Supported operating systems and drivers

To take advantage of the GPU capabilities of Azure N-series VMs running Windows, NVIDIA GPU drivers must be installed. The NVIDIA GPU Driver Extension installs appropriate NVIDIA CUDA or GRID drivers on an N-series VM. Install or manage the extension using the Azure portal or tools such as Azure PowerShell or Azure Resource Manager templates. See the NVIDIA GPU Driver Extension documentation for supported operating systems and deployment steps. For general information about VM extensions, see Azure virtual machine extensions and features.

If you choose to install NVIDIA GPU drivers manually, see N-series GPU driver setup for Windows for supported operating systems, drivers, and installation and verification steps.
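
As an illustration of deploying the extension programmatically, the following is a minimal sketch using the Python azure-mgmt-compute SDK instead of the Azure PowerShell or Resource Manager template options mentioned above. The resource names and handler version are placeholders, the model attribute for the extension type (type_properties_type here) varies across SDK versions, and the exact publisher, type, and version values should be confirmed against the NVIDIA GPU Driver Extension documentation.

```python
# Minimal sketch: attach the NVIDIA GPU driver extension to an existing
# Windows N-series VM with azure-identity and azure-mgmt-compute.
# Resource names and the handler version are placeholders; confirm the
# publisher/type/version against the NVIDIA GPU Driver Extension docs.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.compute.models import VirtualMachineExtension

subscription_id = "<your-subscription-id>"
resource_group = "<your-resource-group>"
vm_name = "<your-n-series-vm>"

client = ComputeManagementClient(DefaultAzureCredential(), subscription_id)

extension = VirtualMachineExtension(
    location="eastus",                              # same region as the VM
    publisher="Microsoft.HpcCompute",               # per the extension docs
    type_properties_type="NvidiaGpuDriverWindows",  # attribute name varies by SDK version
    type_handler_version="1.2",                     # placeholder; use the current version
    auto_upgrade_minor_version=True,
)

# Long-running operation: create or update the extension on the VM.
poller = client.virtual_machine_extensions.begin_create_or_update(
    resource_group, vm_name, "NvidiaGpuDriverWindows", extension
)
print(poller.result().provisioning_state)
```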

Deployment considerations

  • For availability of N-series VMs, see Products available by region.

  • N-series VMs can only be deployed in the Resource Manager deployment model.

  • N-series VMs differ in the type of Azure Storage they support for their disks. NC and NV VMs only support VM disks that are backed by Standard Disk Storage (HDD). NCv2, NCv3, ND, NDv2, and NVv2 VMs only support VM disks that are backed by Premium Disk Storage (SSD).

  • If you want to deploy more than a few N-series VMs, consider a pay-as-you-go subscription or other purchase options. If you're using an Azure free account, you can use only a limited number of Azure compute cores.

  • You might need to increase the cores quota (per region) in your Azure subscription, and increase the separate quota for NC, NCv2, NCv3, ND, NDv2, NV, or NVv2 cores. To request a quota increase, open an online customer support request at no charge. Default limits may vary depending on your subscription category.

Other sizes

Next steps

Learn more about how Azure compute units (ACU) can help you compare compute performance across Azure SKUs.