# HBv4-series virtual machine performance
Applies to: ✔️ Linux VMs ✔️ Windows VMs ✔️ Flexible scale sets ✔️ Uniform scale sets
Performance expectations using common HPC microbenchmarks are as follows:
| Workload | HBv4 |
|---|---|
| STREAM Triad | 750-780 GB/s of DDR5 bandwidth, up to 5.7 TB/s of 3D V-Cache bandwidth |
| High-Performance Linpack (HPL) | Up to 7.6 TF (Rpeak, FP64) for the 144-core VM size |
| RDMA latency & bandwidth | < 2 microseconds (1 byte), 400 Gb/s (one-way) |
| FIO on local NVMe SSDs (RAID 0) | 12 GB/s reads, 7 GB/s writes; 186k IOPS reads, 201k IOPS writes |
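As a hedged illustration of how the local NVMe figures above could be measured, the sketch below stripes the local NVMe devices into a RAID 0 array with `mdadm` and runs a sequential-read FIO job against it. The device names, device count, and FIO job parameters are assumptions, not the exact methodology behind the numbers in the table.

```bash
# Assumption: two local NVMe devices; adjust names and --raid-devices to the VM size.
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1

# Illustrative sequential-read throughput job against the RAID 0 device.
sudo fio --name=seqread --filename=/dev/md0 --ioengine=libaio --direct=1 \
    --rw=read --bs=1M --iodepth=32 --numjobs=8 --runtime=60 --time_based \
    --group_reporting
```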
## Process pinning
Process pinning works well on HBv4-series VMs because we expose the underlying silicon as-is to the guest VM. We strongly recommend process pinning for optimal performance and consistency.
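As a minimal sketch (not a prescribed configuration), ranks can be pinned with the binding options of the HPC-X `mpirun`; the rank count, mapping policy, and application name below are assumptions to adapt to your workload and VM size.

```bash
# Load the HPC-X MPI environment, bind one rank per core, and print the bindings
# so the pinning can be verified. Replace ./my_hpc_app with your application.
module load mpi/hpcx
mpirun -np 144 --map-by core --bind-to core --report-bindings ./my_hpc_app
```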
## Memory bandwidth test
The STREAM memory test can be run using the scripts in this GitHub repository.
```bash
git clone https://github.com/Azure/woc-benchmarking
cd woc-benchmarking/apps/hpc/stream/
sh build_stream.sh
sh stream_run_script.sh $PWD "hbrs_v4"
```
## Compute performance test
The HPL benchmark can be run using the script in this GitHub repository.
```bash
git clone https://github.com/Azure/woc-benchmarking
cd woc-benchmarking/apps/hpc/hpl
sh hpl_build_script.sh
sh hpl_run_scr_hbv4.sh $PWD
```
## MPI latency
The MPI latency test from the OSU microbenchmark suite can be executed as shown. Sample scripts are on GitHub.
```bash
module load mpi/hpcx
mpirun -np 2 --host $src,$dst --map-by node -x LD_LIBRARY_PATH $HPCX_OSU_DIR/osu_latency
```
## MPI bandwidth
The MPI bandwidth test from the OSU microbenchmark suite can be executed as shown below. Sample scripts are on GitHub.
```bash
module load mpi/hpcx
mpirun -np 2 --host $src,$dst --map-by node -x LD_LIBRARY_PATH $HPCX_OSU_DIR/osu_bw
```
> [!NOTE]
> Define the source (`src`) and destination (`dst`) host names before running the commands.
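For example, the two variables can be exported before launching the tests; the host names below are hypothetical placeholders.

```bash
# Hypothetical host names; replace with two HBv4 VMs in your cluster.
export src=hbv4-node-000
export dst=hbv4-node-001
```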
## Mellanox Perftest
The Mellanox Perftest package includes many InfiniBand tests, such as latency (`ib_send_lat`) and bandwidth (`ib_send_bw`). An example command is shown below.
```bash
numactl --physcpubind=[INSERT CORE #] ib_send_lat -a
```
> [!NOTE]
> The NUMA node affinity for the InfiniBand NIC is NUMA 0.
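Because the InfiniBand NIC is attached to NUMA 0, a typical two-node latency run pins the test to a core on NUMA 0 on both sides. The sketch below assumes core 0 is on NUMA 0 and uses a hypothetical server host name.

```bash
# On the server VM (core 0 assumed to be on NUMA 0):
numactl --physcpubind=0 ib_send_lat -a

# On the client VM, pointing at the server (hypothetical host name):
numactl --physcpubind=0 ib_send_lat -a hbv4-node-000
```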
## Next steps
- Learn about scaling MPI applications.
- Review the performance and scalability results of HPC applications on the HBv4 VMs in the TechCommunity article.
- Read about the latest announcements, HPC workload examples, and performance results at the Azure HPC Microsoft Community Hub.
- For a higher-level architectural view of running HPC workloads, see High Performance Computing (HPC) on Azure.