Graphics processing unit (GPU) virtual machine (VM) on Azure Stack Hub

Applies to: Azure Stack integrated systems

This article describes which graphics processing unit (GPU) models are supported on an Azure Stack Hub multinode system. It also provides instructions for installing the drivers used with the GPUs. GPU support in Azure Stack Hub enables solutions such as artificial intelligence, training, inference, and data visualization. The AMD Radeon Instinct MI25 can be used to support graphics-intensive applications such as Autodesk AutoCAD.

You can choose from three GPU models during the public preview period: NVIDIA V100, NVIDIA T4, and AMD MI25. These physical GPUs align with the following Azure N-series virtual machine (VM) types:

Important

Azure Stack Hub GPU support is currently in public preview. To participate in the preview, complete the form at aka.ms/azurestackhubgpupreview. This preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

NCv3

NCv3-series VMs are powered by NVIDIA Tesla V100 GPUs. Customers can take advantage of these updated GPUs for traditional HPC workloads such as reservoir modeling, DNA sequencing, protein analysis, Monte Carlo simulations, and others.

| Size | vCPU | Memory: GiB | Temp storage (SSD): GiB | GPU | GPU memory: GiB | Max data disks | Max NICs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Standard_NC6s_v3 | 6 | 112 | 736 | 1 | 16 | 12 | 4 |
| Standard_NC12s_v3 | 12 | 224 | 1474 | 2 | 32 | 24 | 8 |
| Standard_NC24s_v3 | 24 | 448 | 2948 | 4 | 64 | 32 | 8 |

NVv4

The NVv4-series virtual machines are powered by AMD Radeon Instinct MI25 GPUs. With the NVv4-series, Azure Stack Hub is introducing virtual machines with partial GPUs. This size can be used for GPU-accelerated graphics applications and virtual desktops. NVv4 virtual machines currently support only the Windows guest operating system.

| Size | vCPU | Memory: GiB | Temp storage (SSD): GiB | GPU | GPU memory: GiB | Max data disks | Max NICs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Standard_NV4as_v4 | 4 | 14 | 88 | 1/8 | 2 | 4 | 2 |

NCasT4_v3

| Size | vCPU | Memory: GiB | GPU | GPU memory: GiB | Max data disks | Max NICs |
| --- | --- | --- | --- | --- | --- | --- |
| Standard_NC4as_T4_v3 | 4 | 28 | 1 | 16 | 8 | 4 |
| Standard_NC8as_T4_v3 | 8 | 56 | 1 | 16 | 16 | 8 |
| Standard_NC16as_T4_v3 | 16 | 112 | 1 | 16 | 32 | 8 |
| Standard_NC64as_T4_v3 | 64 | 448 | 4 | 64 | 32 | 8 |
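
As a rough illustration of how the size tables in this article might be used when planning a deployment, the following Python sketch encodes the listed sizes and picks the smallest one that meets a vCPU, memory, and GPU memory requirement. The `GpuVmSize` type and the `pick_size` helper are hypothetical names invented for this example and are not part of any Azure Stack Hub API. The numbers are copied from the tables in this article, and the "smallest" ordering (by vCPU, then memory, then GPU memory) is an assumption.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional


@dataclass(frozen=True)
class GpuVmSize:
    """One row of the size tables in this article (hypothetical helper type)."""
    name: str
    vcpu: int
    memory_gib: int
    gpu_count: Fraction        # NVv4 exposes a partial GPU (1/8)
    gpu_memory_gib: int
    windows_only: bool = False  # NVv4 currently supports only Windows guests


# Values copied from the NCv3, NVv4, and NCasT4_v3 tables.
SIZES = [
    GpuVmSize("Standard_NC6s_v3", 6, 112, Fraction(1), 16),
    GpuVmSize("Standard_NC12s_v3", 12, 224, Fraction(2), 32),
    GpuVmSize("Standard_NC24s_v3", 24, 448, Fraction(4), 64),
    GpuVmSize("Standard_NV4as_v4", 4, 14, Fraction(1, 8), 2, windows_only=True),
    GpuVmSize("Standard_NC4as_T4_v3", 4, 28, Fraction(1), 16),
    GpuVmSize("Standard_NC8as_T4_v3", 8, 56, Fraction(1), 16),
    GpuVmSize("Standard_NC16as_T4_v3", 16, 112, Fraction(1), 16),
    GpuVmSize("Standard_NC64as_T4_v3", 64, 448, Fraction(4), 64),
]


def pick_size(min_vcpu: int, min_memory_gib: int, min_gpu_memory_gib: int,
              linux_guest: bool = False) -> Optional[GpuVmSize]:
    """Return the smallest listed size meeting the minimums, or None."""
    candidates = [
        s for s in SIZES
        if s.vcpu >= min_vcpu
        and s.memory_gib >= min_memory_gib
        and s.gpu_memory_gib >= min_gpu_memory_gib
        and not (linux_guest and s.windows_only)
    ]
    # Assumed ordering: fewest vCPUs first, then least memory, then least GPU memory.
    return min(candidates,
               key=lambda s: (s.vcpu, s.memory_gib, s.gpu_memory_gib),
               default=None)
```

For example, `pick_size(8, 200, 32)` filters out every size with less than 200 GiB of memory or less than 32 GiB of GPU memory and returns the smallest remaining entry. Whether a given size is actually deployable still depends on the GPU hardware installed in the stamp.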

Patch and update, FRU behavior of VMs

GPU VMs undergo downtime during operations such as patch and update (PnU) and hardware replacement (FRU) of Azure Stack Hub. The following table describes the state of the VM as observed during these activities, along with the manual action you can take to make the VMs available again after these operations.

| Operation | PnU - Express Update | PnU - Full Update, OEM update | FRU |
| --- | --- | --- | --- |
| VM state | Unavailable during and after the update unless manually started. | Unavailable during the update. Available after the update with manual operation. | Unavailable during the update. Available after the update with manual operation. |
| Manual operation | If the VM needs to be available during the update and GPU partitions are available, restart the VM from the portal by selecting the Restart button. After the update, restart the VM from the portal using the Restart button. | The VM cannot be made available during the update. After the update completes, stop-deallocate the VM using the Stop button, then start it again using the Start button. | The VM cannot be made available during the update. After the update completes, stop-deallocate the VM using the Stop button, then start it again using the Start button. |
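
The table describes these recovery steps in terms of portal buttons. As a hedged sketch, the same sequence could be scripted with the Azure CLI, assuming `az` is installed and already signed in to your Azure Stack Hub environment. The `recovery_commands` and `run_recovery` helpers, and the operation labels `"express"`, `"full"`, `"oem"`, and `"fru"`, are hypothetical names for this example. Using the CLI instead of the portal is an assumption, not something this article prescribes.

```python
import subprocess
from typing import List


def recovery_commands(operation: str, resource_group: str,
                      vm_name: str) -> List[List[str]]:
    """Build the az CLI commands mirroring the manual operation for each case.

    'express'            -> restart the VM
    'full', 'oem', 'fru' -> stop-deallocate the VM, then start it again
    """
    target = ["--resource-group", resource_group, "--name", vm_name]
    if operation == "express":
        return [["az", "vm", "restart", *target]]
    if operation in ("full", "oem", "fru"):
        return [
            ["az", "vm", "deallocate", *target],
            ["az", "vm", "start", *target],
        ]
    raise ValueError(f"unknown operation: {operation!r}")


def run_recovery(operation: str, resource_group: str, vm_name: str) -> None:
    """Run the commands in order; stop at the first failure.

    Assumption: 'az' resolves on PATH (on Windows this may need 'az.cmd').
    """
    for cmd in recovery_commands(operation, resource_group, vm_name):
        subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution makes it possible to review or log the exact commands before anything is run against a VM.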

Guest driver installation

AMD MI25

The article Install AMD GPU drivers on N-series VMs running Windows provides instructions for installing the driver for the AMD Radeon Instinct MI25 inside the NVv4 GPU-P enabled VM, along with steps to verify the driver installation. This extension only works in connected mode.

NVIDIA

NVIDIA drivers must be installed inside the virtual machine for CUDA or GRID workloads that use the GPU.

Use case: graphics/visualization

This scenario requires the use of GRID drivers. GRID drivers can be downloaded through the NVIDIA Application Hub, provided you have the required licenses. The GRID drivers also require a GRID license server with appropriate GRID licenses before they can be used on the VM. See the NVIDIA license server documentation to learn how to set up the license server.

Use case: compute/CUDA

NVIDIA CUDA drivers and GRID drivers need to be manually installed on the VM. The Tesla CUDA drivers can be obtained from the NVIDIA download website. CUDA drivers do not need a license server.
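
After a manual driver installation, one way to confirm that the guest sees the GPU is to query `nvidia-smi`, the utility that ships with the NVIDIA driver. The sketch below is a minimal example that assumes `nvidia-smi` is on the PATH inside the VM. The `parse_gpu_list` and `installed_gpus` helpers are hypothetical names for this illustration, and this check is not a step prescribed by this article.

```python
import subprocess
from typing import List, Tuple

# Ask nvidia-smi for one CSV line per GPU: "<name>, <driver version>".
QUERY = ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"]


def parse_gpu_list(output: str) -> List[Tuple[str, str]]:
    """Parse nvidia-smi CSV output into (gpu_name, driver_version) pairs."""
    gpus = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the last comma so a comma inside the name would not break parsing.
        name, driver = (field.strip() for field in line.rsplit(",", 1))
        gpus.append((name, driver))
    return gpus


def installed_gpus() -> List[Tuple[str, str]]:
    """Run nvidia-smi inside the VM; raises if the tool is missing or fails."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_list(result.stdout)
```

If `installed_gpus()` raises `FileNotFoundError` or `CalledProcessError`, the driver is likely not installed or not loaded. An empty list would suggest that no GPU is visible to the guest.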

Next steps