Scaling Application Gateway v2 and WAF v2

11/03/2023

Application Gateway and WAF can be configured to scale in two modes:

Autoscaling - With autoscaling enabled, the Application Gateway and WAF v2 SKUs scale out or in based on application traffic requirements. This mode offers better elasticity to your application and eliminates the need to guess the application gateway size or instance count. It also saves cost, because the gateway doesn't have to run at peak-provisioned capacity for the expected maximum traffic load. You must specify a minimum instance count and can optionally specify a maximum. The minimum ensures that Application Gateway and WAF v2 don't fall below the specified instance count, even without traffic. Each instance is roughly equivalent to 10 more reserved Capacity Units. A minimum of zero signifies no reserved capacity and is purely autoscaling in nature. The maximum ensures that the Application Gateway doesn't scale beyond the specified number of instances. You are only billed for the amount of traffic served by the gateway. Instance counts can range from 0 to 125. The default value for maximum instance count is 10 if not specified.
Manual - In Manual mode, the gateway doesn't autoscale. If there's more traffic than Application Gateway or WAF can handle, traffic loss can result. Specifying an instance count is mandatory in this mode, and the count can range from 1 to 125.
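
The limits for the two modes can be expressed as a small validation routine. The following is a minimal sketch in Python, not an Azure API: the ScalingConfig class, its field names, and the validate function are hypothetical names introduced only for illustration. The check that the maximum is not below the minimum is an assumption rather than something stated in this article.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_INSTANCES = 125  # upper bound for both modes, as stated in the article


@dataclass
class ScalingConfig:
    """Hypothetical container for the scaling settings described above."""
    mode: str                             # "autoscale" or "manual"
    min_instances: Optional[int] = None   # autoscale only, 0 to 125
    max_instances: Optional[int] = None   # autoscale only, optional
    instance_count: Optional[int] = None  # manual only, 1 to 125


def validate(cfg: ScalingConfig) -> List[str]:
    """Return a list of problems; an empty list means the settings fit the documented ranges."""
    problems: List[str] = []
    if cfg.mode == "autoscale":
        if cfg.min_instances is None:
            problems.append("autoscale requires a minimum instance count")
        elif not 0 <= cfg.min_instances <= MAX_INSTANCES:
            problems.append("minimum instance count must be between 0 and 125")
        if cfg.max_instances is not None:
            if not 0 <= cfg.max_instances <= MAX_INSTANCES:
                problems.append("maximum instance count must be between 0 and 125")
            elif cfg.min_instances is not None and cfg.max_instances < cfg.min_instances:
                # Assumption: a maximum below the minimum is treated as invalid.
                problems.append("maximum instance count is below the minimum")
    elif cfg.mode == "manual":
        if cfg.instance_count is None:
            problems.append("manual mode requires an instance count")
        elif not 1 <= cfg.instance_count <= MAX_INSTANCES:
            problems.append("instance count must be between 1 and 125")
    else:
        problems.append("unknown mode: " + cfg.mode)
    return problems
```

For example, validate(ScalingConfig(mode="autoscale", min_instances=0)) returns an empty list, while a manual configuration without an instance count returns one problem.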

Autoscaling and High Availability

Azure Application Gateways are always deployed in a highly available fashion. The service is made up of multiple instances, which are created as configured if autoscaling is disabled, or as required by the application load if autoscaling is enabled. You don't necessarily have visibility into the individual instances, only into the Application Gateway service as a whole. If an instance has a problem and stops functioning, Azure Application Gateway transparently creates a new instance.

Even if you configure autoscaling with zero minimum instances, the service is still highly available. High availability is always included with the fixed price.

However, creating a new instance can take around six or seven minutes. If you don't want to have this downtime, you can configure a minimum instance count of two, ideally with Availability Zone support. This way you have at least two instances in your Azure Application Gateway under normal circumstances, so if one of them has a problem, the other tries to handle the traffic while a new instance is being created. An Azure Application Gateway instance can support around 10 Capacity Units. Depending on how much traffic you typically have, you might want to set the minimum instance count to a value higher than two.
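
The sizing guidance above reduces to simple arithmetic. The following Python sketch is one way to estimate a minimum instance count, assuming you already know your typical load in Capacity Units. The function name suggested_min_instances and its parameters are hypothetical, and the figure of 10 Capacity Units per instance is the approximate value given in this article.

```python
import math

CAPACITY_UNITS_PER_INSTANCE = 10  # approximate figure from the article
MAX_INSTANCES = 125               # documented upper bound on instance count


def suggested_min_instances(typical_capacity_units: float, floor: int = 2) -> int:
    """Estimate a minimum instance count for a typical load.

    The floor of two follows the article's suggestion for avoiding downtime
    while a replacement instance is created. Treat the result as a starting
    point, not a guarantee.
    """
    if typical_capacity_units < 0:
        raise ValueError("typical_capacity_units must not be negative")
    needed = math.ceil(typical_capacity_units / CAPACITY_UNITS_PER_INSTANCE)
    return min(MAX_INSTANCES, max(floor, needed))
```

For example, a typical load of 35 Capacity Units gives suggested_min_instances(35) == 4, while a load of 5 Capacity Units still gives 2 because of the floor.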

For scale-in events, Application Gateway drains existing connections for 5 minutes on the instance that is selected for removal. After 5 minutes, existing connections are closed and the instance is removed. Any new connections made during or after the 5-minute scale-in period are established to other existing instances on the same gateway.
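
The drain behavior can be modeled as a simple decision on connection timing. The following Python sketch is an illustrative model only, not how the service is implemented. The function connection_outcome and its parameters are hypothetical, all times are in seconds on an arbitrary clock, and the function considers only connections that would otherwise be handled by the instance being removed.

```python
DRAIN_SECONDS = 5 * 60  # 5-minute drain period from the article


def connection_outcome(opened_at: float, duration: float, scale_in_start: float) -> str:
    """Classify what happens to one connection around a scale-in event."""
    if opened_at >= scale_in_start:
        # New connections during or after scale-in go to other instances.
        return "established on another instance"
    ends_at = opened_at + duration
    if ends_at <= scale_in_start + DRAIN_SECONDS:
        # The connection finishes before the drain period expires.
        return "completes normally"
    # Still open when the drain period ends, so it is closed.
    return "closed when drain period ends"
```

A connection that is already open when scale-in starts and needs more than 5 further minutes is the case to watch for: in this model it is closed when the drain period ends.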

Next steps

Learn how to Schedule autoscaling for Application Gateway
Learn more about Application Gateway v2
Create an autoscaling, zone redundant application gateway with a reserved virtual IP address using Azure PowerShell
