
caliburn1994 asked brtrach-MSFT answered

Azure App Service - WEBSITE_HEALTHCHECK_MAXPINGFAILURES and time of Load Balancing

Description


The required number of failed requests for an instance to be deemed unhealthy and removed from the load balancer. For example, when set to 2, your instances will be removed after 2 failed pings. (Default value is 10)
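As a rough illustration of the quoted description, the threshold can be modelled as a simple failure counter. This is only a toy sketch, not Azure's implementation: the class name `PingFailureTracker` is hypothetical, and resetting the count after a successful ping is an assumption the description does not confirm.

```python
class PingFailureTracker:
    """Toy model of a failed-ping threshold (hypothetical, not Azure's code)."""

    def __init__(self, max_ping_failures: int = 10):
        # 10 mirrors the default value quoted in the description.
        self.max_ping_failures = max_ping_failures
        self.failures = 0

    def record(self, ping_ok: bool) -> bool:
        """Record one ping; return True once the instance counts as unhealthy."""
        # Assumption: a successful ping resets the count (consecutive failures).
        self.failures = 0 if ping_ok else self.failures + 1
        return self.failures >= self.max_ping_failures


# With the threshold set to 2, the second failed ping in a row trips it.
tracker = PingFailureTracker(max_ping_failures=2)
results = [tracker.record(ok) for ok in (False, True, False, False)]
print(results)
```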

The quoted text is the description of WEBSITE_HEALTHCHECK_MAXPINGFAILURES. What is the difference between WEBSITE_HEALTHCHECK_MAXPINGFAILURES and the Load Balancing setting in the picture below?

I found that when I change Load Balancing to 5, the value of WEBSITE_HEALTHCHECK_MAXPINGFAILURES also changes to 5.


192643-image.png

Test


Localhost sends two requests per minute.

  • Before enabling Health Check, there are no requests.

  • After enabling Health Check, each instance receives two requests every minute.

  • Sometimes multiple failing requests are sent within one minute. (I made the health check API return an error for this test.)
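One way to check observations like these is to bucket the logged health check requests by minute and instance. The sketch below assumes the requests have already been extracted into (timestamp, instance, status) tuples. The sample entries, the instance names, and the `pings_per_minute` helper are all hypothetical and only illustrate the counting.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log entries: (timestamp, instance id, HTTP status).
SAMPLE_LOG = [
    ("2000-01-01T10:00:05", "instance-a", 500),
    ("2000-01-01T10:00:35", "instance-a", 500),
    ("2000-01-01T10:00:07", "instance-b", 200),
    ("2000-01-01T10:00:37", "instance-b", 200),
    ("2000-01-01T10:01:05", "instance-a", 500),
]


def pings_per_minute(entries):
    """Count health check requests per (minute, instance) pair."""
    counts = Counter()
    for timestamp, instance, _status in entries:
        # Truncate the timestamp to its hour and minute.
        minute = datetime.fromisoformat(timestamp).strftime("%H:%M")
        counts[(minute, instance)] += 1
    return counts


counts = pings_per_minute(SAMPLE_LOG)
for (minute, instance), n in sorted(counts.items()):
    print(minute, instance, n)
```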

192579-image.png


azure-webapps, azure-webapps-availability

1 Answer

brtrach-MSFT answered

@caliburn1994 Thank you for your question regarding Azure Web Apps high availability.

By default, high availability of your site's instances is enforced with a time limit: you define how many minutes an instance may keep failing health checks before it is removed from the load balancer or restarted.

However, some customers with a high incoming request load (tens of thousands of requests per minute, or even per second in some cases) may prefer an approach based on the number of failed requests rather than waiting for a timer to expire. WEBSITE_HEALTHCHECK_MAXPINGFAILURES lets customers define how many failures are unacceptable, after which the health service takes action.

When both are set, whichever threshold is reached first will cause the instance to be rebooted or removed from your site's load balancer.
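The "whichever comes first" behaviour can be sketched as follows. This is a simplified model of the explanation given here, not the platform's actual logic: the function `first_trigger`, its parameter names, and the rule that a healthy ping resets both thresholds are assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple


def first_trigger(
    pings: Iterable[Tuple[float, bool]],
    max_unhealthy_minutes: float,
    max_ping_failures: int,
) -> Optional[Tuple[str, float]]:
    """Return (rule, minute) for whichever threshold is reached first, else None.

    pings: (minute, ok) pairs in chronological order.
    """
    failing_since = None  # minute of the first failure in the current streak
    failures = 0
    for minute, ok in pings:
        if ok:
            # Assumption: a healthy ping resets both the timer and the count.
            failing_since, failures = None, 0
            continue
        if failing_since is None:
            failing_since = minute
        failures += 1
        if failures >= max_ping_failures:
            return ("count", minute)
        if minute - failing_since >= max_unhealthy_minutes:
            return ("time", minute)
    return None


# Hypothetical input: an instance that fails every ping, two pings per minute.
failing = [(i * 0.5, False) for i in range(40)]
print(first_trigger(failing, max_unhealthy_minutes=10, max_ping_failures=5))
print(first_trigger(failing, max_unhealthy_minutes=2, max_ping_failures=10))
```

In this model, a low failure count trips well before a long time limit, and the reverse holds when the time limit is short. If both thresholds are reached on the same ping, the sketch reports the count rule simply because it is checked first.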

Ultimately, the default time-based method is best for the majority of use cases unless you need more granular control.

Please let us know if you have any further questions or concerns.
