What are virtual machine scale sets in Azure?

Virtual machine scale sets are an Azure Compute resource that you can use to deploy and manage a set of identical VMs. Because all VMs are configured the same, scale sets are designed to support true autoscale (no pre-provisioning of VMs is required), which makes it easier to build large-scale services targeting big compute, big data, and containerized workloads.

For applications that need to scale compute resources out and in, scale operations are implicitly balanced across fault and update domains. For an introduction to scale sets, refer to the Azure blog announcement.

Creating and managing scale sets

You can create a scale set in the Azure portal by selecting New and typing "scale" in the search bar. "Virtual machine scale set" is listed in the results. From there, you can fill in the required fields to customize and deploy your scale set. The portal also offers options to set up basic autoscale rules based on CPU usage.

Scale sets can also be defined and deployed using JSON templates and REST APIs just like individual Azure Resource Manager VMs. Therefore, any standard Azure Resource Manager deployment methods can be used. For more information about templates, see Authoring Azure Resource Manager templates.

A set of example templates for virtual machine scale sets can be found in the Azure Quickstart templates GitHub repository at https://github.com/Azure/azure-quickstart-templates (look for templates with vmss in the title).

There is a button that links to the portal deployment feature in the detail pages for these templates. To deploy the scale set, click the button and then fill in any parameters that are required in the portal. If you are not sure whether a resource supports upper or mixed case, it is safer to always use lower case letters and numbers in parameter values. There is also a handy video dissection of a scale set template here:

VM scale set Template Dissection

Scaling a scale set out and in

You can change the capacity of a scale set in the Azure portal by clicking the Scaling section under Settings.

To change scale set capacity on the command line, Azure CLI provides a scale command. For example, to set a scale set to a capacity of 10 VMs:

az vmss scale -g resourcegroupname -n scalesetname --new-capacity 10 

To set the number of VMs in a scale set using PowerShell, use the Update-AzureRmVmss command:

$vmss = Get-AzureRmVmss -ResourceGroupName resourcegroupname -VMScaleSetName scalesetname  
$vmss.Sku.Capacity = 10
Update-AzureRmVmss -ResourceGroupName resourcegroupname -Name scalesetname -VirtualMachineScaleSet $vmss

To increase or decrease the number of virtual machines in a scale set using an Azure Resource Manager template, change the capacity property and redeploy the template. This simplicity makes it easy to integrate scale sets with Azure autoscale, or to write your own custom scaling layer if you need to define custom scale events that are not supported by Azure autoscale.

If you are redeploying an Azure Resource Manager template to change the capacity, you could define a much smaller template that includes only the 'sku' property block with the updated capacity. An example is shown here.
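As a rough illustration of what such a capacity-only template might contain, the following Python sketch builds one and prints it as JSON. The schema URL, the apiVersion value, the parameter names (vmssName, newCapacity), and the SKU name Standard_A1 are assumptions chosen for illustration. Match them to your existing scale set before deploying anything.

```python
import json


def capacity_only_template(sku_name="Standard_A1", api_version="2016-03-30"):
    """Build a minimal ARM template that only carries the scale set 'sku' block.

    sku_name and api_version are illustrative assumptions; use the values
    that match the scale set you are updating.
    """
    return {
        "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            "vmssName": {"type": "string"},    # hypothetical parameter name
            "newCapacity": {"type": "int"},    # hypothetical parameter name
        },
        "resources": [
            {
                "type": "Microsoft.Compute/virtualMachineScaleSets",
                "apiVersion": api_version,
                "name": "[parameters('vmssName')]",
                "location": "[resourceGroup().location]",
                # Only the sku block is present; capacity comes from a parameter.
                "sku": {
                    "name": sku_name,
                    "tier": "Standard",
                    "capacity": "[parameters('newCapacity')]",
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(capacity_only_template(), indent=2))
```

The generated JSON could then be passed to any standard Azure Resource Manager deployment method, with the new VM count supplied as the capacity parameter.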

Autoscale

A scale set can optionally be configured with autoscale settings when it is created in the Azure portal, allowing the number of VMs to be increased or decreased based on average CPU usage. Many of the scale set templates in Azure quickstart templates define autoscale settings. You can also add autoscale settings to an existing scale set. For example, here is an Azure PowerShell script to add CPU-based autoscale to a scale set:


$subid = "yoursubscriptionid"
$rgname = "yourresourcegroup"
$vmssname = "yourscalesetname"
$location = "yourlocation" # e.g. southcentralus

$rule1 = New-AzureRmAutoscaleRule -MetricName "Percentage CPU" -MetricResourceId /subscriptions/$subid/resourceGroups/$rgname/providers/Microsoft.Compute/virtualMachineScaleSets/$vmssname -Operator GreaterThan -MetricStatistic Average -Threshold 60 -TimeGrain 00:01:00 -TimeWindow 00:05:00 -ScaleActionCooldown 00:05:00 -ScaleActionDirection Increase -ScaleActionValue 1
$rule2 = New-AzureRmAutoscaleRule -MetricName "Percentage CPU" -MetricResourceId /subscriptions/$subid/resourceGroups/$rgname/providers/Microsoft.Compute/virtualMachineScaleSets/$vmssname -Operator LessThan -MetricStatistic Average -Threshold 30 -TimeGrain 00:01:00 -TimeWindow 00:05:00 -ScaleActionCooldown 00:05:00 -ScaleActionDirection Decrease -ScaleActionValue 1
$profile1 = New-AzureRmAutoscaleProfile -DefaultCapacity 2 -MaximumCapacity 10 -MinimumCapacity 2 -Rules $rule1,$rule2 -Name "autoprofile1"
Add-AzureRmAutoscaleSetting -Location $location -Name "autosetting1" -ResourceGroup $rgname -TargetResourceId /subscriptions/$subid/resourceGroups/$rgname/providers/Microsoft.Compute/virtualMachineScaleSets/$vmssname -AutoscaleProfiles $profile1
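To make the effect of the two rules and the profile more concrete, here is a small Python model of the decision they describe. It is a simplified sketch, not the Azure autoscale engine: it ignores the cooldown period and treats the input as the one-minute samples inside a single five-minute window. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AutoscaleProfile:
    """Values copied from the PowerShell script above."""
    minimum: int = 2
    maximum: int = 10
    scale_out_threshold: float = 60.0  # average "Percentage CPU" above this adds a VM
    scale_in_threshold: float = 30.0   # average "Percentage CPU" below this removes a VM
    step: int = 1


def next_capacity(current: int,
                  cpu_samples: Sequence[float],
                  profile: AutoscaleProfile = AutoscaleProfile()) -> int:
    """Return the capacity after evaluating one time window of CPU samples.

    Simplified model: no cooldown handling, one window at a time.
    """
    average = sum(cpu_samples) / len(cpu_samples)
    if average > profile.scale_out_threshold:
        current += profile.step
    elif average < profile.scale_in_threshold:
        current -= profile.step
    # Clamp the result to the profile's minimum and maximum capacity.
    return max(profile.minimum, min(profile.maximum, current))


if __name__ == "__main__":
    print(next_capacity(2, [70, 80, 65, 90, 75]))  # busy window
    print(next_capacity(5, [10, 12, 8, 15, 11]))   # idle window
```

In this model, a window between the two thresholds leaves the capacity unchanged, and the minimum and maximum act as hard limits regardless of load.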

For a list of valid metrics to scale on, see Supported metrics with Azure Monitor, under the heading Microsoft.Compute/virtualMachineScaleSets. More advanced autoscale options are also available, including schedule-based autoscale and using webhooks to integrate with alerting systems.

Monitoring your scale set

The Azure portal lists scale sets, and shows their properties. The portal also supports management operations, which can be performed both on scale sets, and on individual VMs within a scale set. The portal also provides a customizable resource usage graph. If you need to see or edit the underlying JSON definition of an Azure resource, you can also use the Azure Resource Explorer. Scale sets are a resource under the Microsoft.Compute Azure Resource Provider, so from this site you can see them by expanding the following links:

Subscriptions -> your subscription -> resourceGroups -> your resource group -> providers -> Microsoft.Compute -> virtualMachineScaleSets -> your scale set -> etc.
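The navigation path mirrors the resource ID of the scale set, in the same form as the -MetricResourceId and -TargetResourceId arguments of the autoscale script. A minimal Python sketch that assembles that ID from its parts (the function name is hypothetical) might look like this:

```python
def scale_set_resource_id(subscription_id: str,
                          resource_group: str,
                          scale_set_name: str) -> str:
    """Assemble the Azure Resource Manager ID of a scale set."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.Compute"
        f"/virtualMachineScaleSets/{scale_set_name}"
    )


if __name__ == "__main__":
    print(scale_set_resource_id("yoursubscriptionid", "yourresourcegroup", "yourscalesetname"))
```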

Scale set scenarios

This section lists some typical scale set scenarios. Some higher-level Azure services (like Batch, Service Fabric, Azure Container Service) use these scenarios.

  • RDP / SSH to scale set instances - A scale set is created inside a VNET and individual VMs in the scale set are not allocated public IP addresses. This policy avoids the expense and management overhead of allocating separate public IP addresses to all the nodes in your compute grid. You can connect to these VMs from other resources in your VNET, for example load balancers and standalone virtual machines, which can be allocated public IP addresses.
  • Connect to VMs using NAT rules - You can create a public IP address, assign it to a load balancer, and define an inbound NAT pool, which maps ports on the IP address to a port on a VM in the scale set. For example:

    Source      Source Port   Destination   Destination Port
    Public IP   Port 50000    vmss_0        Port 22
    Public IP   Port 50001    vmss_1        Port 22
    Public IP   Port 50002    vmss_2        Port 22

    In this example, NAT rules are defined to enable an SSH connection to every VM in a scale set, using a single public IP address: https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-linux-nat

    Here's an example of doing the same with RDP and Windows: https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-windows-nat

  • Connect to VMs using a "jumpbox" - If you create a scale set and a standalone VM in the same VNET, the standalone VM and the scale set VMs can connect to one another using their internal IP addresses as defined by the VNET/Subnet. If you create a public IP address and assign it to the standalone VM, you can RDP or SSH to the standalone VM, and then connect from that machine to your scale set instances. You may notice at this point that a simple scale set is inherently more secure than a simple standalone VM with a public IP address in its default configuration.

    For example, this template deploys a simple scale set with a standalone VM: https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-linux-jumpbox

  • Load balancing to scale set instances - If you want to deliver work to a compute cluster of VMs using a "round-robin" approach, you can configure an Azure load balancer with layer-4 load-balancing rules accordingly. You can define probes to verify your application is running by pinging ports with a specified protocol, interval, and request path. The Azure Application Gateway also supports scale sets, along with layer-7 and more sophisticated load balancing scenarios.

    This example creates a scale set running Apache web servers, and uses a load balancer to balance the load that each VM receives: https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-ubuntu-web-ssl (look at the Microsoft.Network/loadBalancers resource type and the networkProfile and extensionProfile in the virtualMachineScaleSet)

    This example uses an Application Gateway. Linux: https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-ubuntu-app-gateway. Windows: https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-windows-app-gateway

  • Deploying a scale set as a compute cluster in a PaaS cluster manager - scale sets are sometimes described as a next-generation worker role. Though a valid description, it does run the risk of confusing scale set features with Azure Cloud Services features. In a sense, scale sets provide a true "worker role" or worker resource, in that they are a generalized compute resource, which is platform/runtime independent, customizable, and integrates into Azure Resource Manager IaaS.

    A Cloud Services worker role, while limited in terms of platform/runtime support (Windows platform images only), also includes services such as VIP swap, configurable upgrade settings, and runtime/app deployment-specific settings, which are either not yet available in scale sets or are delivered by other higher-level PaaS services like Service Fabric. You can look at scale sets as an infrastructure that supports PaaS. PaaS solutions like Azure Service Fabric build on this infrastructure.

    For an example of this approach, the Azure Container Service deploys a cluster based on scale sets with a container orchestrator: https://github.com/Azure/azure-quickstart-templates/tree/master/101-acs-dcos.
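The inbound NAT pool described under "Connect to VMs using NAT rules" can be pictured as a simple port mapping. The Python sketch below reproduces the example table under the assumption that the front-end port is the starting port plus the instance index. The actual assignment is made by the load balancer, and the function names, user name, and IP address here are hypothetical.

```python
def nat_pool_mapping(instance_ids, frontend_port_start=50000, backend_port=22):
    """Map front-end ports on the shared public IP to (instance, backend port).

    Assumes front-end port = frontend_port_start + instance index,
    as in the example table; treat this as an illustration only.
    """
    return {
        frontend_port_start + instance_id: (f"vmss_{instance_id}", backend_port)
        for instance_id in instance_ids
    }


def ssh_command(public_ip, frontend_port, user):
    """Build the SSH command that reaches one instance through its NAT rule."""
    return f"ssh -p {frontend_port} {user}@{public_ip}"


if __name__ == "__main__":
    for port, (vm, backend) in sorted(nat_pool_mapping(range(3)).items()):
        print(port, "->", vm, backend)
    # 203.0.113.10 is a documentation-only address; azureuser is a placeholder.
    print(ssh_command("203.0.113.10", 50001, "azureuser"))
```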

Scale set performance and scale guidance

  • Scale sets support up to 1,000 VMs in a scale set. If you create and upload your own custom VM images, the limit is 100. For considerations when using large scale sets, see Working with large virtual machine scale sets.
  • You do not have to pre-create Azure storage accounts to use scale sets. Scale sets support Azure Managed Disks, which negates performance concerns about the number of disks per storage account. For more information, see Azure virtual machine scale sets and managed disks.
  • Consider using Azure Premium storage instead of Standard storage for faster, more predictable VM provisioning times, and improved IO performance.
  • The number of VMs you can create is limited by the core quota in the region in which you are deploying. You may need to contact Customer Support to have your Compute quota limit increased, even if you have a high limit of cores for use with Azure cloud services today. To query your quota, run this Azure CLI command: az vm list-usage --location <region>, or the following PowerShell command: Get-AzureRmVMUsage (if using a version of PowerShell below 1.0, use Get-AzureVMUsage).
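The limits above combine in a simple way: the usable size of a scale set is bounded both by the per-scale-set limit and by the cores left in your regional quota. The following Python sketch shows that arithmetic. The function name and the idea of passing the cores per VM as a parameter are assumptions for illustration; look up the core count of your chosen VM size.

```python
PLATFORM_IMAGE_LIMIT = 1000  # VMs per scale set with platform images
CUSTOM_IMAGE_LIMIT = 100     # VMs per scale set with custom images


def max_scale_set_vms(core_quota, cores_in_use, cores_per_vm, custom_image=False):
    """Estimate how many VMs a single scale set could reach in a region."""
    if cores_per_vm <= 0:
        raise ValueError("cores_per_vm must be positive")
    available_cores = max(0, core_quota - cores_in_use)
    by_quota = available_cores // cores_per_vm
    by_image = CUSTOM_IMAGE_LIMIT if custom_image else PLATFORM_IMAGE_LIMIT
    # The tighter of the two limits wins.
    return min(by_quota, by_image)


if __name__ == "__main__":
    # Hypothetical quota figures: 100 cores total, 20 in use, 2-core VMs.
    print(max_scale_set_vms(core_quota=100, cores_in_use=20, cores_per_vm=2))
```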

Scale set frequently asked questions

Q. How many VMs can you have in a scale set?

A. A scale set can have between 0 and 1,000 VMs based on platform images, or 0-100 VMs based on custom images.

Q. Are data disks supported within scale sets?

A. Yes. A scale set can define an attached data disks configuration that applies to all VMs in the set. For more information, see Azure scale sets and attached data disks. Other options for storing data include:

  • Azure files (SMB shared drives)
  • OS drive
  • Temp drive (local, not backed by Azure storage)
  • Azure data service (for example, Azure tables, Azure blobs)
  • External data service (for example, remote DB)

Q. Which Azure regions support scale sets?

A. All regions support scale sets.

Q. How do you create a scale set using a custom image?

A. Create a Managed Disk based on your custom image VHD and reference it in your scale set template. Here is an example: https://github.com/chagarw/MDPP/tree/master/101-vmss-custom-os.

Q. If I reduce my scale set capacity from 20 to 15, which VMs are removed?

A. Virtual machines are removed from the scale set evenly across update domains and fault domains to maximize availability. VMs with the highest IDs are removed first.

Q. What if I then increase the capacity from 15 to 18?

A. If you increase capacity to 18, then 3 new VMs are created. Each new VM instance ID is incremented from the previous highest value (for example, 20, 21, 22). VMs are balanced across FDs and UDs.
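The instance ID behavior described in the last two answers can be modeled in a few lines of Python. This is a deliberately simplified sketch: it tracks only the IDs and ignores the balancing across fault and update domains, which also influences which VMs are removed. The class name is hypothetical.

```python
class ScaleSetModel:
    """Toy model of how instance IDs change when capacity changes."""

    def __init__(self, capacity=0):
        self._next_id = 0        # IDs are never reused, only incremented
        self.instance_ids = []   # kept in ascending order
        self.scale_to(capacity)

    def scale_to(self, capacity):
        if capacity < 0:
            raise ValueError("capacity must not be negative")
        # Scale in: remove the VMs with the highest IDs first.
        while len(self.instance_ids) > capacity:
            self.instance_ids.pop()
        # Scale out: new IDs continue from the previous highest value.
        while len(self.instance_ids) < capacity:
            self.instance_ids.append(self._next_id)
            self._next_id += 1
        return list(self.instance_ids)


if __name__ == "__main__":
    scale_set = ScaleSetModel(20)
    print(scale_set.scale_to(15))
    print(scale_set.scale_to(18))
```

Under this model, scaling from 20 to 15 leaves IDs 0 through 14, and scaling back out to 18 adds IDs 20, 21, and 22, matching the example in the answer above.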

Q. When using multiple extensions in a scale set, can I enforce an execution sequence?

A. Not directly, but for the customScript extension, your script could wait for another extension to complete (for example by monitoring the extension log). Additional guidance on extension sequencing can be found in this blog post: Extension Sequencing in Azure VM Scale Sets.

Q. Do scale sets work with Azure availability sets?

A. Yes. A scale set is an implicit availability set with 5 FDs and 5 UDs. Scale sets of more than 100 VMs span multiple 'placement groups', which are equivalent to multiple availability sets. For more information about placement groups, see Working with large virtual machine scale sets. An availability set of VMs can exist in the same VNET as a scale set of VMs. A common configuration is to put control node VMs, which often require unique configuration, in an availability set, and data nodes in the scale set.

More frequently asked questions about scale sets can be found in the Azure Virtual Machine Scale Sets FAQ.