
High performance computing VM sizes

Azure H-series virtual machines (VMs) are designed to deliver leadership-class performance, scalability, and cost efficiency for a variety of real-world HPC workloads.

HBv2-series VMs are optimized for applications driven by memory bandwidth, such as fluid dynamics, finite element analysis, and reservoir simulation. HBv2 VMs feature 120 AMD EPYC 7742 processor cores, 4 GB of RAM per CPU core, and no simultaneous multithreading. Each HBv2 VM provides up to 340 GB/sec of memory bandwidth and up to 4 teraFLOPS of FP64 compute.

HBv2 VMs feature 200 Gb/sec Mellanox HDR InfiniBand, while both HB and HC-series VMs feature 100 Gb/sec Mellanox EDR InfiniBand. Each of these VM types is connected in a non-blocking fat tree for optimized and consistent RDMA performance. HBv2 VMs support Adaptive Routing and the Dynamic Connected Transport (DCT), in addition to the standard RC and UD transports. These features enhance application performance, scalability, and consistency, and their use is strongly recommended.
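
As a hedged illustration only (the right settings depend on your MPI stack and UCX version, and HPC-X on the CentOS-HPC images may already choose sensible transports), a UCX-based MPI launch can explicitly request the dynamically connected transport. UCX_TLS is a standard UCX environment variable, not an Azure-specific setting, and the application name below is a placeholder:

    # Minimal sketch, assuming an Open MPI / HPC-X build on top of UCX.
    # "dc_x" requests the accelerated dynamically connected (DCT) transport;
    # "self" and "sm" keep loopback and shared-memory paths available.
    export UCX_TLS=dc_x,self,sm
    mpirun -np 240 --hostfile hosts ./my_hpc_app   # ./my_hpc_app is hypothetical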

HB-series VMs are optimized for applications driven by memory bandwidth, such as fluid dynamics, explicit finite element analysis, and weather modeling. HB VMs feature 60 AMD EPYC 7551 processor cores, 4 GB of RAM per CPU core, and no hyperthreading. The AMD EPYC platform provides more than 260 GB/sec of memory bandwidth.

HC-series VMs are optimized for applications driven by dense computation, such as implicit finite element analysis, molecular dynamics, and computational chemistry. HC VMs feature 44 Intel Xeon Platinum 8168 processor cores, 8 GB of RAM per CPU core, and no hyperthreading. The Intel Xeon Platinum platform supports Intel's rich ecosystem of software tools such as the Intel Math Kernel Library.

H-series VMs are optimized for applications driven by high CPU frequencies or large memory-per-core requirements. H-series VMs feature 8 or 16 Intel Xeon E5 2667 v3 processor cores, 7 or 14 GB of RAM per CPU core, and no hyperthreading. The H-series features 56 Gb/sec Mellanox FDR InfiniBand in a non-blocking fat tree configuration for consistent RDMA performance. H-series VMs support Intel MPI 5.x and MS-MPI.

Note

All HBv2, HB, and HC-series VMs have exclusive access to the physical servers. There is only one VM per physical server, and there is no shared multi-tenancy with any other VMs for these VM sizes.

Note

The A8 – A11 VMs are planned for retirement on 3/2021. For more information, see the HPC Migration Guide.

RDMA-capable instances

Most of the HPC VM sizes (HBv2, HB, HC, H16r, H16mr, A8, and A9) feature a network interface for remote direct memory access (RDMA) connectivity. Selected N-series sizes designated with 'r' (ND40rs_v2, ND24rs, NC24rs_v3, NC24rs_v2, and NC24r) are also RDMA-capable. This interface is in addition to the standard Azure Ethernet network interface available in the other VM sizes.
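
For reference, the VM sizes available in a given region can be listed with the Azure CLI and filtered for a family of interest. This is only a convenience sketch; the region name is an example:

    # List all VM sizes in a region, then filter for HB-series names.
    az vm list-sizes --location eastus --output table
    az vm list-sizes --location eastus --query "[?contains(name, 'Standard_HB')]" --output table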

This interface allows the RDMA-capable instances to communicate over an InfiniBand (IB) network, operating at HDR rates for HBv2; EDR rates for HB, HC, and NDv2; FDR rates for H16r, H16mr, and other RDMA-capable N-series virtual machines; and QDR rates for A8 and A9 VMs. These RDMA capabilities can boost the scalability and performance of certain Message Passing Interface (MPI) applications.
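
On a deployed RDMA-capable Linux VM, the InfiniBand port and its link rate can be inspected with standard InfiniBand tooling. This is a hedged sketch; package names vary by distribution, and the CentOS-HPC images ship these tools pre-installed:

    # From the infiniband-diags / libibverbs-utils packages:
    ibstat          # port state and link rate, e.g. "Rate: 200" on HBv2 (HDR)
    ibv_devinfo     # detailed device and port attributes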

Note

In Azure HPC, there are two classes of VMs depending on whether they are SR-IOV enabled for InfiniBand. Currently, almost all of the newer-generation, RDMA-capable or InfiniBand-enabled VMs on Azure are SR-IOV enabled, except for H16r, H16mr, NC24r, A8, and A9. RDMA is only enabled over the InfiniBand (IB) network and is supported for all RDMA-capable VMs. IP over IB is only supported on the SR-IOV enabled VMs. RDMA is not enabled over the Ethernet network.
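
As a rough way to tell the two classes apart from inside a Linux VM (a sketch, not an official check): an SR-IOV enabled VM exposes a Mellanox virtual function on the PCI bus and an IPoIB network interface, typically named ib0:

    lspci | grep -i mellanox    # SR-IOV VMs show a Mellanox ConnectX Virtual Function
    ip addr show ib0            # IPoIB interface; only present on SR-IOV enabled VMs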

  • Operating System - Linux is very well supported for HPC VMs; distros such as CentOS, RHEL, Ubuntu, and SUSE are commonly used. Regarding Windows support, Windows Server 2016 and newer versions are supported on all the HPC series VMs. Windows Server 2012 R2 and Windows Server 2012 are also supported on the non-SR-IOV enabled VMs (H16r, H16mr, A8, and A9). Note that Windows Server 2012 R2 is not supported on HBv2 and other VMs with more than 64 (virtual or physical) cores. See VM Images for a list of supported VM images on the Marketplace and how they can be configured appropriately.

  • InfiniBand and Drivers - On InfiniBand-enabled VMs, the appropriate drivers are required to enable RDMA. On Linux, for both SR-IOV and non-SR-IOV enabled VMs, the CentOS-HPC VM images in the Marketplace come pre-configured with the appropriate drivers. The Ubuntu VM images can be configured with the right drivers using the instructions here. See Configure and Optimize VMs for Linux OS for more details on ready-to-use VM Linux OS images.

    On Linux, the InfiniBandDriverLinux VM extension can be used to install the Mellanox OFED drivers and enable InfiniBand on the SR-IOV enabled H- and N-series VMs. Learn more about enabling InfiniBand on RDMA-capable VMs at HPC Workloads.

    On Windows, the InfiniBandDriverWindows VM extension installs Windows Network Direct drivers (on non-SR-IOV VMs) or Mellanox OFED drivers (on SR-IOV VMs) for RDMA connectivity. In certain deployments of A8 and A9 instances, the HpcVmDrivers extension is added automatically. Note that the HpcVmDrivers VM extension is being deprecated; it will not be updated.

    To add the VM extension to a VM, you can use Azure PowerShell cmdlets or the Azure CLI; a hedged CLI sketch is shown after this list. For more information, see Virtual machine extensions and features. You can also work with extensions for VMs deployed in the classic deployment model.

  • MPI - The SR-IOV enabled VM sizes on Azure allow almost any flavor of MPI to be used with Mellanox OFED. On non-SR-IOV enabled VMs, supported MPI implementations use the Microsoft Network Direct (ND) interface to communicate between VMs. Hence, only Microsoft MPI (MS-MPI) 2012 R2 or later and Intel MPI 5.x versions are supported. Later versions (2017, 2018) of the Intel MPI runtime library may or may not be compatible with the Azure RDMA drivers. See Setup MPI for HPC for more details on setting up MPI on HPC VMs on Azure; a minimal environment sketch for the non-SR-IOV sizes also follows this list.

  • RDMA network address space - The RDMA network in Azure reserves the address space 172.16.0.0/16. To run MPI applications on instances deployed in an Azure virtual network, make sure that the virtual network address space does not overlap the RDMA network (a sketch follows this list).
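
As referenced in the InfiniBand and Drivers item above, the following is a hedged Azure CLI sketch for installing the InfiniBandDriverLinux extension on an existing SR-IOV enabled VM. The resource group and VM names are placeholders, and the version shown is illustrative; verify the currently available extension version before using it:

    az vm extension set \
      --resource-group myResourceGroup \
      --vm-name myVM \
      --publisher Microsoft.HpcCompute \
      --name InfiniBandDriverLinux \
      --version 1.1    # illustrative; check the latest published version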
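
Also as referenced in the MPI item above, a minimal environment sketch for Intel MPI 5.x on the non-SR-IOV sizes (H16r, H16mr, A8, A9). Treat these values as starting points and consult Setup MPI for HPC for the authoritative settings:

    # Intel MPI 5.x over the Azure RDMA (Network Direct / DAPL) interface:
    export I_MPI_FABRICS=shm:dapl
    export I_MPI_DAPL_PROVIDER=ofa-v2-ib0
    export I_MPI_DYNAMIC_CONNECTION=0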
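
And for the RDMA network address space item, a sketch of creating a virtual network whose address space stays clear of 172.16.0.0/16. All names and prefixes below are examples:

    az network vnet create \
      --resource-group myResourceGroup \
      --name myVnet \
      --address-prefixes 10.0.0.0/16 \
      --subnet-name compute \
      --subnet-prefixes 10.0.0.0/24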

Cluster configuration options

Azure provides several options to create clusters of Windows HPC VMs that can communicate using the RDMA network, including:

  • Virtual machines - Deploy the RDMA-capable HPC VMs in the same scale set or availability set (when you use the Azure Resource Manager deployment model). If you use the classic deployment model, deploy the VMs in the same cloud service.

  • Virtual machine scale sets - In a virtual machine scale set, ensure that you limit the deployment to a single placement group for InfiniBand communication within the scale set. For example, in a Resource Manager template, set the singlePlacementGroup property to true (an Azure CLI sketch follows this list). Note that the maximum scale set size that can be spun up with the singlePlacementGroup property set to true is capped at 100 VMs by default. If your HPC job scale needs are higher than 100 VMs in a single tenant, you may request an increase; open an online customer support request at no charge. The limit on the number of VMs in a single scale set can be increased to 300. Note that when deploying VMs using availability sets, the maximum limit is 200 VMs per availability set.

  • MPI among virtual machines - If RDMA (for example, using MPI communication) is required between virtual machines (VMs), ensure that the VMs are in the same virtual machine scale set or availability set.

  • Azure CycleCloud - Create an HPC cluster in Azure CycleCloud to run MPI jobs.

  • Azure Batch - Create an Azure Batch pool to run MPI workloads. To use compute-intensive instances when running MPI applications with Azure Batch, see Use multi-instance tasks to run Message Passing Interface (MPI) applications in Azure Batch.

  • Microsoft HPC Pack - HPC Pack includes a runtime environment for MS-MPI that uses the Azure RDMA network when deployed on RDMA-capable Linux VMs. For example deployments, see Set up a Linux RDMA cluster with HPC Pack to run MPI applications.
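
As referenced in the virtual machine scale sets item above, a hedged Azure CLI sketch of a single-placement-group scale set deployment. The size, image URN, and resource names are examples only; pick a supported HPC image for your workload:

    az vmss create \
      --resource-group myResourceGroup \
      --name myHpcScaleSet \
      --image OpenLogic:CentOS-HPC:7.7:latest \
      --vm-sku Standard_HB120rs_v2 \
      --instance-count 4 \
      --single-placement-group true \
      --admin-username azureuser \
      --generate-ssh-keys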

Deployment considerations

  • Azure subscription - To deploy more than a few compute-intensive instances, consider a pay-as-you-go subscription or other purchase options. If you're using an Azure free account, you can use only a limited number of Azure compute cores.

  • Pricing and availability - These VM sizes are offered only in the Standard pricing tier. Check Products available by region for availability in Azure regions.

  • Cores quota - You might need to increase the cores quota in your Azure subscription from the default value. Your subscription might also limit the number of cores you can deploy in certain VM size families, including the H-series. To request a quota increase, open an online customer support request at no charge. (Default limits may vary depending on your subscription category.) A quick Azure CLI check of current usage against the quota is shown after this list.

    Note

    Contact Azure Support if you have large-scale capacity needs. Azure quotas are credit limits, not capacity guarantees. Regardless of your quota, you are only charged for the cores that you use.

  • Virtual network - An Azure virtual network is not required to use the compute-intensive instances. However, for many deployments you need at least a cloud-based Azure virtual network, or a site-to-site connection if you need to access on-premises resources. When needed, create a new virtual network to deploy the instances. Adding compute-intensive VMs to a virtual network in an affinity group is not supported.

  • Resizing - Because of their specialized hardware, you can only resize compute-intensive instances within the same size family (H-series or N-series). For example, you can only resize an H-series VM from one H-series size to another. Additional considerations around InfiniBand driver support and NVMe disks may apply for certain VMs. A basic Azure CLI resize sketch is shown after this list.
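
As mentioned in the cores quota item above, current core usage against the regional quota can be checked with the Azure CLI before requesting an increase. The region name is an example:

    az vm list-usage --location eastus --output table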
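
And for the resizing item, a basic sketch of resizing within the same family with the Azure CLI. Names and the target size are examples, and deallocating first is commonly required when the target size needs different underlying hardware:

    az vm deallocate --resource-group myResourceGroup --name myVM
    az vm resize --resource-group myResourceGroup --name myVM --size Standard_HC44rs
    az vm start --resource-group myResourceGroup --name myVM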

Other sizes

Next steps