Overview of autoscale in Microsoft Azure

This article describes what Microsoft Azure autoscale is, its benefits, and how to get started using it.

Azure Monitor autoscale applies only to Virtual Machine Scale Sets, Cloud Services, App Service - Web Apps, API Management services, and Azure Data Explorer Clusters.


Azure has two autoscale methods. An older version of autoscale applies to Virtual Machines (availability sets). This feature has limited support and we recommend migrating to virtual machine scale sets for faster and more reliable autoscale support. A link on how to use the older technology is included in this article.

What is autoscale?

Autoscale allows you to have the right amount of resources running to handle the load on your application. It allows you to add resources to handle increases in load and also save money by removing resources that are sitting idle. You specify a minimum and maximum number of instances to run and add or remove VMs automatically based on a set of rules. Having a minimum makes sure your application is always running even under no load. Having a maximum limits your total possible hourly cost. You automatically scale between these two extremes using rules you create.

Autoscale explained. Add and remove VMs

When rule conditions are met, one or more autoscale actions are triggered. You can add and remove VMs, or perform other actions. The following conceptual diagram shows this process.

Autoscale Flow Diagram

The following explanation applies to the pieces of the previous diagram.

Resource Metrics

Resources emit metrics, these metrics are later processed by rules. Metrics come via different methods. Virtual machine scale sets use telemetry data from Azure diagnostics agents whereas telemetry for Web apps and Cloud services comes directly from the Azure Infrastructure. Some commonly used statistics include CPU Usage, memory usage, thread counts, queue length, and disk usage. For a list of what telemetry data you can use, see Autoscale Common Metrics.

Custom Metrics

You can also leverage your own custom metrics that your application(s) may be emitting. If you have configured your application(s) to send metrics to Application Insights you can leverage those metrics to make decisions on whether to scale or not.


Schedule-based rules are based on UTC. You must set your time zone properly when setting up your rules.


The diagram shows only one autoscale rule, but you can have many of them. You can create complex overlapping rules as needed for your situation. Rule types include

  • Metric-based - For example, do this action when CPU usage is above 50%.
  • Time-based - For example, trigger a webhook every 8am on Saturday in a given time zone.

Metric-based rules measure application load and add or remove VMs based on that load. Schedule-based rules allow you to scale when you see time patterns in your load and want to scale before a possible load increase or decrease occurs.

Actions and automation

Rules can trigger one or more types of actions.

  • Scale - Scale VMs in or out
  • Email - Send email to subscription admins, co-admins, and/or additional email address you specify
  • Automate via webhooks - Call webhooks, which can trigger multiple complex actions inside or outside Azure. Inside Azure, you can start an Azure Automation runbook, Azure Function, or Azure Logic App. Example third-party URL outside Azure include services like Slack and Twilio.

Autoscale Settings

Autoscale use the following terminology and structure.

  • An autoscale setting is read by the autoscale engine to determine whether to scale up or down. It contains one or more profiles, information about the target resource, and notification settings.

    • An autoscale profile is a combination of a:

      • capacity setting, which indicates the minimum, maximum, and default values for number of instances.

      • set of rules, each of which includes a trigger (time or metric) and a scale action (up or down).

      • recurrence, which indicates when autoscale should put this profile into effect.

        You can have multiple profiles, which allow you to take care of different overlapping requirements. You can have different autoscale profiles for different times of day or days of the week, for example.

    • A notification setting defines what notifications should occur when an autoscale event occurs based on satisfying the criteria of one of the autoscale setting’s profiles. Autoscale can notify one or more email addresses or make calls to one or more webhooks.

Azure autoscale setting, profile, and rule structure

The full list of configurable fields and descriptions is available in the Autoscale REST API.

For code examples, see

Horizontal vs vertical scaling

Autoscale only scales horizontally, which is an increase ("out") or decrease ("in") in the number of VM instances. Horizontal is more flexible in a cloud situation as it allows you to run potentially thousands of VMs to handle load.

In contrast, vertical scaling is different. It keeps the same number of VMs, but makes the VMs more ("up") or less ("down") powerful. Power is measured in memory, CPU speed, disk space, etc. Vertical scaling has more limitations. It's dependent on the availability of larger hardware, which quickly hits an upper limit and can vary by region. Vertical scaling also usually requires a VM to stop and restart.

Methods of access

You can set up autoscale via

Supported services for autoscale

Service Schema & Docs
Web Apps Scaling Web Apps
Cloud Services Autoscale a Cloud Service
Virtual Machines: Classic Scaling Classic Virtual Machine Availability Sets
Virtual Machines: Windows Scale Sets Scaling virtual machine scale sets in Windows
Virtual Machines: Linux Scale Sets Scaling virtual machine scale sets in Linux
Virtual Machines: Windows Example Advanced Autoscale configuration using Resource Manager templates for VM Scale Sets
Azure App Service Scale up an app in Azure App service
API Management service Automatically scale an Azure API Management instance
Azure Data Explorer Clusters Manage Azure Data Explorer clusters scaling to accommodate changing demand
Logic Apps Adding integration service environment (ISE) capacity
Spring Cloud Set up autoscale for microservice applications
Service Bus Automatically update messaging units of an Azure Service Bus namespace

Next steps

To learn more about autoscale, use the Autoscale Walkthroughs listed previously or refer to the following resources: