Checklist: Best practices for SQL Server on Azure VMs

APPLIES TO: SQL Server on Azure VM

This article provides a quick checklist as a series of best practices and guidelines to optimize performance of your SQL Server on Azure Virtual Machines (VMs).

For comprehensive details, see the other articles in this series: Checklist, VM size, Storage, Security, HADR configuration, Collect baseline.

Overview

While running SQL Server on Azure Virtual Machines, continue using the same database performance tuning options that are applicable to SQL Server in on-premises server environments. However, the performance of a relational database in a public cloud depends on many factors, such as the size of a virtual machine, and the configuration of the data disks.

There is typically a trade-off between optimizing for costs and optimizing for performance. This performance best practices series is focused on getting the best performance for SQL Server on Azure Virtual Machines. If your workload is less demanding, you might not require every recommended optimization. Consider your performance needs, costs, and workload patterns as you evaluate these recommendations.

VM Size

The following is a quick checklist of VM size best practices for running your SQL Server on Azure VM:

  • Use VM sizes with 4 or more vCPU like the Standard_M8-4ms, the E4ds_v4, or the DS12_v2 or higher.
  • Use memory optimized virtual machine sizes for the best performance of SQL Server workloads.
  • The DSv2 11-15, Edsv4 series, the M-, and the Mv2- series offer the optimal memory-to-vCore ratio required for OLTP workloads. Both M series VMs offer the highest memory-to-vCore ratio required for mission critical workloads and are also ideal for data warehouse workloads.
  • Consider a higher memory-to-vCore ratio for mission critical and data warehouse workloads.
  • Use the Azure Virtual Machine marketplace images as the SQL Server settings and storage options are configured for optimal SQL Server performance.
  • Collect the target workload's performance characteristics and use them to determine the appropriate VM size for your business.
  • Use the Data Migration Assistant SKU recommendation tool to find the right VM size for your existing SQL Server workload.

To learn more, see the comprehensive VM size best practices.

Storage

The following is a quick checklist of storage configuration best practices for running your SQL Server on Azure VM:

  • Monitor the application and determine storage bandwidth and latency requirements for SQL Server data, log, and tempdb files before choosing the disk type.
  • To optimize storage performance, plan for highest uncached IOPS available and use data caching as a performance feature for data reads while avoiding virtual machine and disks capping/throttling.
  • Place data, log, and tempdb files on separate drives.
    • For the data drive, only use premium P30 and P40 disks to ensure the availability of cache support
    • For the log drive plan for capacity and test performance versus cost while evaluating the premium P30 - P80 disks.
      • If submillisecond storage latency is required, use Azure ultra disks for the transaction log.
      • For M-series virtual machine deployments consider Write Accelerator over using Azure ultra disks.
    • Place tempdb on the local ephemeral SSD (default D:\) drive for most SQL Server workloads after choosing the optimal VM size.
      • If the capacity of the local drive is not enough for tempdb, consider sizing up the VM. See Data file caching policies for more information.
  • Stripe multiple Azure data disks using Storage Spaces to increase I/O bandwidth up to the target virtual machine's IOPS and throughput limits.
  • Set host caching to read-only for data file disks.
  • Set host caching to none for log file disks.
    • Do not enable read/write caching on disks that contain SQL Server files.
    • Always stop the SQL Server service before changing the cache settings of your disk.
  • For development and test workloads consider using standard storage. It is not recommended to use Standard HDD/SDD for production workloads.
  • Credit-based Disk Bursting (P1-P20) should only be considered for smaller dev/test workloads and departmental systems.
  • Provision the storage account in the same region as the SQL Server VM.
  • Disable Azure geo-redundant storage (geo-replication) and use LRS (local redundant storage) on the storage account.
  • Format your data disk to use 64-KB allocation unit size for all data files placed on a drive other than the temporary D:\ drive (which has a default of 4 KB). SQL Server VMs deployed through Azure Marketplace come with data disks formatted with allocation unit size and interleave for the storage pool set to 64 KB.

To learn more, see the comprehensive Storage best practices.

SQL Server features

The following is a quick checklist of best practices for SQL Server configuration settings when running your SQL Server instances in an Azure virtual machine in production:

Azure features

The following is a quick checklist of best practices for Azure-specific guidance when running your SQL Server on Azure VM:

HADR configuration

High availability and disaster recovery (HADR) features, such as the Always On availability group and the failover cluster instance rely on underlying Windows Server Failover Cluster technology. Review the best practices for modifying your HADR settings to better support the cloud environment.

For your Windows cluster, consider these best practices:

  • Change the cluster to less aggressive parameters to avoid unexpected outages from transient network failures or Azure platform maintenance. To learn more, see heartbeat and threshold settings. For Windows Server 2012 and later, use the following recommended values:
    • SameSubnetDelay: 1 second
    • SameSubnetThreshold: 40 heartbeats
    • CrossSubnetDelay: 1 second
    • CrossSubnetThreshold: 40 heartbeats
  • Place your VMs in an availability set or different availability zones. To learn more, see VM availability settings.
  • Use a single NIC per cluster node and a single subnet.
  • Configure cluster quorum voting to use 3 or more odd number of votes. Do not assign votes to DR regions.
  • Carefully monitor resource limits to avoid unexpected restarts or failovers due to resource constraints.
    • Ensure your OS, drivers, and SQL Server are at the latest builds.
    • Optimize performance for SQL Server on Azure VMs. Review the other sections in this article to learn more.
    • Reduce or spread out workload to avoid resource limits.
    • Move to a VM or disk that his higher limits to avoid constraints.

For your SQL Server availability group or failover cluster instance, consider these best practices:

  • If you're experiencing frequent unexpected failures, follow the performance best practices outlined in the rest of this article.
  • If optimizing SQL Server VM performance does not resolve your unexpected failovers, consider relaxing the monitoring for the availability group or failover cluster instance. However, doing so may not address the underlying source of the issue and could mask symptoms by reducing the likelihood of failure. You may still need to investigate and address the underlying root cause. For Windows Server 2012 or higher, use the following recommended values:
    • Lease timeout: Use this equation to calculate the maximum lease time out value:
      Lease timeout < (2 * SameSubnetThreshold * SameSubnetDelay).
      Start with 40 seconds. If you're using the relaxed SameSubnetThreshold and SameSubnetDelay values recommended previously, do not exceed 80 seconds for the lease timeout value.
    • Max failures in a specified period: You can set this value to 6.
    • Healthcheck timeout: You can set this value to 60000 initially, adjust as necessary.
  • When using the virtual network name (VNN) to connect to your HADR solution, specify MultiSubnetFailover = true in the connection string, even if your cluster only spans one subnet.
    • If the client does not support MultiSubnetFailover = True you may need to set RegisterAllProvidersIP = 0 and HostRecordTTL = 300 to cache client credentials for shorter durations. However, doing so may cause additional queries to the DNS server.
  • To connect to your HADR solution using the distributed network name (DNN), consider the following:
    • You must use a client driver that supports MultiSubnetFailover = True, and this parameter must be in the connection string.
    • Use a unique DNN port in the connection string when connecting to the DNN listener for an availability group.
  • Use a database mirroring connection string for a basic availability group to bypass the need for a load balancer or DNN.
  • Validate the sector size of your VHDs before deploying your high availability solution to avoid having misaligned I/Os. See KB3009974 to learn more.

To learn more, see the comprehensive HADR best practices.

Next steps

To learn more, see the other articles in this series:

For security best practices, see Security considerations for SQL Server on Azure Virtual Machines.

Review other SQL Server Virtual Machine articles at SQL Server on Azure Virtual Machines Overview. If you have questions about SQL Server virtual machines, see the Frequently Asked Questions.