Azure Functions scale and hosting

When you create a function app in Azure, you must choose a hosting plan for your app. There are three basic hosting plans available for Azure Functions: Consumption plan, Premium plan, and Dedicated (App Service) plan. All hosting plans are generally available (GA) on both Linux and Windows virtual machines.

The hosting plan you choose dictates the following behaviors:

  • How your function app is scaled.
  • The resources available to each function app instance.
  • Support for advanced features, such as Azure Virtual Network connectivity.

Both Consumption and Premium plans automatically add compute power when your code is running. Your app is scaled out when needed to handle load, and scaled in when code stops running. For the Consumption plan, you also don't have to pay for idle VMs or reserve capacity in advance.

Premium plan provides additional features, such as premium compute instances, the ability to keep instances warm indefinitely, and VNet connectivity.

App Service plan allows you to take advantage of dedicated infrastructure, which you manage. Your function app doesn't scale based on events, which means it never scales in to zero. (This requires that Always on is enabled.)

For a detailed comparison between the various hosting plans (including Kubernetes-based hosting), see the Hosting plans comparison section.

Consumption plan

When you're using the Consumption plan, instances of the Azure Functions host are dynamically added and removed based on the number of incoming events. This serverless plan scales automatically, and you're charged for compute resources only when your functions are running. On a Consumption plan, a function execution times out after a configurable period of time.

Billing is based on number of executions, execution time, and memory used. Usage is aggregated across all functions within a function app. For more information, see the Azure Functions pricing page.

The Consumption plan is the default hosting plan and offers the following benefits:

  • Pay only when your functions are running
  • Scale out automatically, even during periods of high load

Function apps in the same region can be assigned to the same Consumption plan. Assigning multiple apps to the same Consumption plan has no impact on the resilience, scalability, or reliability of each app.

To learn more about how to estimate costs when running in a Consumption plan, see Understanding Consumption plan costs.

Premium plan

When you're using the Premium plan, instances of the Azure Functions host are added and removed based on the number of incoming events just like the Consumption plan. Premium plan supports the following features:

  • Perpetually warm instances to avoid any cold start
  • VNet connectivity
  • Unlimited execution duration (60 minutes guaranteed)
  • Premium instance sizes (one core, two core, and four core instances)
  • More predictable pricing
  • High-density app allocation for plans with multiple function apps

To learn how you can create a function app in a Premium plan, see Azure Functions Premium plan.

Instead of billing per execution and memory consumed, billing for the Premium plan is based on the number of core seconds and memory allocated across instances. There is no execution charge with the Premium plan. At least one instance must be allocated at all times per plan. This results in a minimum monthly cost per active plan, regardless of whether the function is active or idle. Keep in mind that all function apps in a Premium plan share allocated instances.
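The minimum monthly usage implied by the always-allocated instance can be estimated with simple arithmetic. The following is a minimal sketch, not an official calculator: the function name premium_min_usage is hypothetical, the instance size in the example (one core, 3.5 GB) is taken from the Premium instance sizes listed in this article, and actual rates must be looked up on the pricing page.

```shell
# Rough minimum monthly usage for a Premium plan that keeps one instance
# allocated at all times. Arguments: cores per instance, memory (GB) per
# instance, and number of days in the billing period.
# The function name is hypothetical and used only for this sketch.
premium_min_usage() {
  cores=$1
  mem_gb=$2
  days=$3
  awk -v c="$cores" -v m="$mem_gb" -v d="$days" 'BEGIN {
    seconds = d * 24 * 60 * 60   # seconds in the billing period
    printf "core_seconds=%d gb_seconds=%.0f\n", c * seconds, m * seconds
  }'
}

# Example: one single-core instance with 3.5 GB of memory over 30 days.
premium_min_usage 1 3.5 30
```

Multiplying each figure by the corresponding rate gives an approximate floor for the monthly bill of an active plan.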

Consider the Azure Functions Premium plan in the following situations:

  • Your function apps run continuously, or nearly continuously.
  • You have a high number of small executions, resulting in a high execution bill but a low GB-second bill in the Consumption plan.
  • You need more CPU or memory options than what is provided by the Consumption plan.
  • Your code needs to run longer than the maximum execution time allowed on the Consumption plan.
  • You require features that are only available on a Premium plan, such as virtual network connectivity.

Dedicated (App Service) plan

Your function apps can also run on the same dedicated VMs as other App Service apps (Basic, Standard, Premium, and Isolated SKUs).

Consider an App Service plan in the following situations:

  • You have existing, underutilized VMs that are already running other App Service instances.
  • You want to provide a custom image on which to run your functions.

You pay the same for function apps in an App Service plan as you would for other App Service resources, like web apps. For details about how the App Service plan works, see the Azure App Service plans in-depth overview.

Using an App Service plan, you can manually scale out by adding more VM instances. You can also enable autoscale, though autoscale will be slower than the elastic scale of the Premium plan. For more information, see Scale instance count manually or automatically. You can also scale up by choosing a different App Service plan. For more information, see Scale up an app in Azure.

When running JavaScript functions on an App Service plan, you should choose a plan that has fewer vCPUs. For more information, see Choose single-core App Service plans.

Running in an App Service Environment (ASE) lets you fully isolate your functions and take advantage of a higher number of instances than an App Service plan provides.

Always On

If you run on an App Service plan, you should enable the Always on setting so that your function app runs correctly. On an App Service plan, the functions runtime goes idle after a few minutes of inactivity, so only HTTP triggers will "wake up" your functions. Always on is available only on an App Service plan. On a Consumption plan, the platform activates function apps automatically.

Function app timeout duration

The timeout duration of a function app is defined by the functionTimeout property in the host.json project file. The following table shows the default and maximum values, in minutes, for each plan and runtime version:

Plan | Runtime version | Default | Maximum
Consumption | 1.x | 5 | 10
Consumption | 2.x | 5 | 10
Consumption | 3.x | 5 | 10
Premium | 1.x | 30 | Unlimited
Premium | 2.x | 30 | Unlimited
Premium | 3.x | 30 | Unlimited
App Service | 1.x | Unlimited | Unlimited
App Service | 2.x | 30 | Unlimited
App Service | 3.x | 30 | Unlimited
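
As an illustration of how the maximums in the table could be applied, the following sketch checks a requested timeout against the plan limit before it is written to host.json. The function name check_timeout and the plan labels (consumption, premium, appservice) are assumptions made for this example; they are not part of any Azure tool.

```shell
# Check a requested functionTimeout (in whole minutes) against the plan
# maximum from the table above. The function name and plan labels are
# hypothetical and exist only in this sketch.
check_timeout() {
  plan=$1
  minutes=$2
  case "$plan" in
    consumption)        max=10 ;;  # maximum for all runtime versions
    premium|appservice) max=0 ;;   # 0 stands for "Unlimited" in this sketch
    *) echo "unknown plan: $plan"; return 2 ;;
  esac
  if [ "$max" -ne 0 ] && [ "$minutes" -gt "$max" ]; then
    echo "invalid: $minutes min exceeds the $max min maximum on the $plan plan"
    return 1
  fi
  echo "ok"
}
```

For example, check_timeout consumption 10 prints ok, while check_timeout consumption 15 prints an invalid message and returns a non-zero status.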

Note

Regardless of the function app timeout setting, 230 seconds is the maximum amount of time that an HTTP triggered function can take to respond to a request. This is because of the default idle timeout of Azure Load Balancer. For longer processing times, consider using the Durable Functions async pattern, or defer the actual work and return an immediate response.

Even with Always On enabled, the execution timeout for individual functions is controlled by the functionTimeout setting in the host.json project file.
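
The following sketch shows one way the functionTimeout setting might look in practice. It writes a minimal host.json that sets the timeout to 10 minutes, expressed as an hh:mm:ss string. The folder name demo-function-app is hypothetical, and the "version": "2.0" property is assumed to apply to runtime versions 2.x and later.

```shell
# Create a scratch project folder (hypothetical name) and write a minimal
# host.json that sets the function timeout to 10 minutes (hh:mm:ss format).
mkdir -p demo-function-app
cat > demo-function-app/host.json <<'EOF'
{
  "version": "2.0",
  "functionTimeout": "00:10:00"
}
EOF
```

A value of 00:10:00 matches the Consumption plan maximum from the table earlier in this section; a larger value would be accepted only on the other plans.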

Determine the hosting plan of an existing application

To determine the hosting plan used by your function app, see App Service plan in the Overview tab for the function app in the Azure portal. To see the pricing tier, select the name of the App Service Plan, and then select Properties from the left pane.

[Image: View scaling plan in the portal]

You can also use the Azure CLI to determine the plan, as follows:

appServicePlanId=$(az functionapp show --name <my_function_app_name> --resource-group <my_resource_group> --query appServicePlanId --output tsv)
az appservice plan list --query "[?id=='$appServicePlanId'].sku.tier" --output tsv

When the output from this command is dynamic, your function app is in the Consumption plan. When the output from this command is ElasticPremium, your function app is in the Premium plan. All other values indicate different tiers of an App Service plan.
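
The interpretation rule above can be wrapped in a small helper. This is a sketch only: the function name plan_from_tier is hypothetical, and lowercasing the input is an assumption made so that differently cased outputs (for example, Dynamic and dynamic) are treated alike.

```shell
# Map the sku.tier value returned by the CLI query to a hosting plan name.
# The function name is hypothetical and used only for this sketch.
plan_from_tier() {
  # Lowercase the input so that casing differences do not matter
  # (an assumption about how the CLI may report the tier).
  tier=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$tier" in
    dynamic)        echo "Consumption plan" ;;
    elasticpremium) echo "Premium plan" ;;
    "")             echo "Unknown (empty tier)" ;;
    *)              echo "App Service plan ($1)" ;;
  esac
}

plan_from_tier ElasticPremium   # prints "Premium plan"
```

In practice, the argument would be the output of the second CLI command shown above.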

Storage account requirements

On any plan, a function app requires a general Azure Storage account, which supports Azure Blob, Queue, Files, and Table storage. Azure Functions relies on Azure Storage for operations such as managing triggers and logging function executions, and some storage accounts don't support queues and tables. These accounts, which include blob-only storage accounts (including premium storage) and general-purpose storage accounts with zone-redundant storage replication, are filtered out of your existing Storage Account selections when you create a function app.

The same storage account used by your function app can also be used by your triggers and bindings to store your application data. However, for storage-intensive operations, you should use a separate storage account.

It's possible for multiple function apps to share the same storage account without any issues. (A good example of this is when you develop multiple apps in your local environment using the Azure Storage Emulator, which acts like one storage account.)

To learn more about storage account types, see Introducing the Azure Storage services.

In Region Data Residency

When all customer data must remain within a single region, the storage account associated with the function app must be one with in-region redundancy. An in-region redundant storage account must also be used with Durable Functions.

Other platform-managed customer data will only be stored within the region when hosting in an Internal Load Balancer App Service Environment (or ILB ASE). Details can be found in ASE zone redundancy.

How the Consumption and Premium plans work

In the Consumption and Premium plans, the Azure Functions infrastructure scales CPU and memory resources by adding instances of the Functions host, based on the number of events that its functions are triggered on. Each instance of the Functions host in the Consumption plan is limited to 1.5 GB of memory and one CPU. An instance of the host is the entire function app, meaning all functions within a function app share resources within an instance and scale at the same time. Function apps that share the same Consumption plan are scaled independently. In the Premium plan, your plan size determines the available memory and CPU for all apps in that plan on that instance.

Function code files are stored on Azure Files shares on the function's main storage account. When you delete the main storage account of the function app, the function code files are deleted and cannot be recovered.

Runtime scaling

Azure Functions uses a component called the scale controller to monitor the rate of events and determine whether to scale out or scale in. The scale controller uses heuristics for each trigger type. For example, when you're using an Azure Queue storage trigger, it scales based on the queue length and the age of the oldest queue message.

The unit of scale for Azure Functions is the function app. When the function app is scaled out, additional resources are allocated to run multiple instances of the Azure Functions host. Conversely, as compute demand is reduced, the scale controller removes function host instances. The number of instances is eventually scaled in to zero when no functions are running within a function app.

[Image: Scale controller monitoring events and creating instances]

Cold Start

After your function app has been idle for a number of minutes, the platform may scale the number of instances on which your app runs down to zero. The next request has the added latency of scaling from zero to one. This latency is referred to as a cold start. The number of dependencies that must be loaded by your function app can impact the cold start time. Cold start is more of an issue for synchronous operations, such as HTTP triggers that must return a response. If cold starts are impacting your functions, consider running in a Premium plan or in a Dedicated plan with Always on enabled.

Understanding scaling behaviors

Scaling can vary based on a number of factors, and apps scale differently based on the trigger and language selected. There are a few intricacies of scaling behaviors to be aware of:

  • A single function app only scales out to a maximum of 200 instances. A single instance may process more than one message or request at a time though, so there isn't a set limit on number of concurrent executions. You can specify a lower maximum to throttle scale as required.
  • For HTTP triggers, new instances are allocated, at most, once per second.
  • For non-HTTP triggers, new instances are allocated, at most, once every 30 seconds. Scaling is faster when running in a Premium plan.
  • For Service Bus triggers, use Manage rights on resources for the most efficient scaling. With Listen rights, scaling isn't as accurate because the queue length can't be used to inform scaling decisions. To learn more about setting rights in Service Bus access policies, see Shared Access Authorization Policy.
  • For Event Hub triggers, see the scaling guidance in the reference article.
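
The allocation rates in the list above give a rough lower bound on how long a scale-out can take. The sketch below turns them into arithmetic; the function name min_scale_out_seconds is hypothetical, and the result ignores everything else that affects real scaling (scale controller decisions, instance startup time, and the faster scaling of the Premium plan).

```shell
# Lower bound on the seconds needed to grow from `current` to `target`
# instances, using the allocation rates listed above.
# The function name is hypothetical and used only for this sketch.
min_scale_out_seconds() {
  trigger=$1   # "http", or any other value for non-HTTP triggers
  current=$2
  target=$3
  if [ "$trigger" = "http" ]; then
    interval=1    # at most one new instance per second
  else
    interval=30   # at most one new instance every 30 seconds
  fi
  needed=$(( target - current ))
  if [ "$needed" -lt 0 ]; then
    needed=0      # already at or above the target
  fi
  echo $(( needed * interval ))
}

min_scale_out_seconds http 1 200    # prints 199
min_scale_out_seconds queue 1 11    # prints 300
```

Under these assumptions, an HTTP-triggered app needs at least a little over three minutes to go from one instance to the 200-instance maximum, while a non-HTTP trigger needs five minutes just to add ten instances.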

Limit scale out

You may wish to restrict the number of instances an app scales out to. This is most common for cases where a downstream component like a database has limited throughput. By default, Consumption plan functions scale out to as many as 200 instances, and Premium plan functions scale out to as many as 100 instances. You can specify a lower maximum for a specific app by modifying the functionAppScaleLimit value. The functionAppScaleLimit can be set to 0 or null for unrestricted, or a valid value between 1 and the app maximum.

az resource update --resource-type Microsoft.Web/sites -g <resource_group> -n <function_app_name>/config/web --set properties.functionAppScaleLimit=<scale_limit>
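
Before running the command above, it can help to check that the chosen limit falls in the allowed range. The following is a minimal sketch under the defaults stated in this section (200 for Consumption, 100 for Premium); the function name validate_scale_limit and the plan labels are hypothetical, and the null case is not modeled.

```shell
# Validate a functionAppScaleLimit value against the default plan maximums
# described above. The function name and plan labels are hypothetical.
validate_scale_limit() {
  plan=$1
  limit=$2
  case "$plan" in
    consumption) max=200 ;;
    premium)     max=100 ;;
    *) echo "unknown plan: $plan"; return 2 ;;
  esac
  if [ "$limit" -eq 0 ]; then
    echo "unrestricted"
  elif [ "$limit" -ge 1 ] && [ "$limit" -le "$max" ]; then
    echo "valid"
  else
    echo "invalid: expected 0 or a value from 1 to $max"
    return 1
  fi
}
```

For example, validate_scale_limit consumption 5 prints valid, and validate_scale_limit premium 150 prints an invalid message and returns a non-zero status.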

Best practices and patterns for scalable apps

There are many aspects of a function app that will impact how well it will scale, including host configuration, runtime footprint, and resource efficiency. For more information, see the scalability section of the performance considerations article. You should also be aware of how connections behave as your function app scales. For more information, see How to manage connections in Azure Functions.

For more information on scaling in Python and Node.js, see Azure Functions Python developer guide - Scaling and concurrency and Azure Functions Node.js developer guide - Scaling and concurrency.

Billing model

Billing for the different plans is described in detail on the Azure Functions pricing page. Usage is aggregated at the function app level and counts only the time that function code is executed. The following are units for billing:

  • Resource consumption in gigabyte-seconds (GB-s). Computed as a combination of memory size and execution time for all functions within a function app.
  • Executions. Counted each time a function is executed in response to an event trigger.

Useful queries and information on how to understand your consumption bill can be found on the billing FAQ.
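
To make the GB-s unit concrete, the sketch below multiplies memory size by execution time across a number of executions. It is a simplified illustration: the function name gb_seconds is hypothetical, and any rounding or minimums that the platform applies to memory and execution time are not modeled here.

```shell
# Estimate resource consumption in GB-s as memory (GB) x execution time (s)
# x number of executions. Platform rounding rules are NOT modeled.
# The function name is hypothetical and used only for this sketch.
gb_seconds() {
  mem_mb=$1
  duration_ms=$2
  executions=$3
  awk -v m="$mem_mb" -v t="$duration_ms" -v n="$executions" 'BEGIN {
    printf "%.2f\n", (m / 1024) * (t / 1000) * n
  }'
}

# Example: 1,000,000 executions that each use 512 MB for 1 second.
gb_seconds 512 1000 1000000   # prints 500000.00
```

The executions unit is billed separately, so in this example the bill would reflect both 500,000 GB-s and one million executions.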

Hosting plans comparison

The following comparison tables show the important aspects to consider when choosing a hosting plan for your function app:

Plan summary

Consumption plan: Scale automatically and only pay for compute resources when your functions are running. On the Consumption plan, instances of the Functions host are dynamically added and removed based on the number of incoming events.
  ✔ Default hosting plan.
  ✔ Pay only when your functions are running.
  ✔ Scale out automatically, even during periods of high load.

Premium plan: While automatically scaling based on demand, use pre-warmed workers to run applications with no delay after being idle, run on more powerful instances, and connect to VNets. Consider the Azure Functions Premium plan in the following situations, in addition to all features of the App Service plan:
  ✔ Your function apps run continuously, or nearly continuously.
  ✔ You have a high number of small executions, resulting in a high execution bill but a low GB-second bill in the Consumption plan.
  ✔ You need more CPU or memory options than what is provided by the Consumption plan.
  ✔ Your code needs to run longer than the maximum execution time allowed on the Consumption plan.
  ✔ You require features that are only available on a Premium plan, such as virtual network connectivity.

Dedicated plan¹: Run your functions within an App Service plan at regular App Service plan rates. Good fit for long-running operations, as well as when more predictable scaling and costs are required. Consider an App Service plan in the following situations:
  ✔ You have existing, underutilized VMs that are already running other App Service instances.
  ✔ You want to provide a custom image on which to run your functions.

ASE¹: App Service Environment (ASE) is an App Service feature that provides a fully isolated and dedicated environment for securely running App Service apps at high scale. ASEs are appropriate for application workloads that require:
  ✔ Very high scale.
  ✔ Full compute isolation and secure network access.
  ✔ High memory utilization.

Kubernetes: Kubernetes provides a fully isolated and dedicated environment running on top of the Kubernetes platform. Kubernetes is appropriate for application workloads that require:
  ✔ Custom hardware requirements.
  ✔ Isolation and secure network access.
  ✔ Ability to run in hybrid or multi-cloud environment.
  ✔ Run alongside existing Kubernetes applications and services.

¹ For specific limits for the various App Service plan options, see the App Service plan limits.

Operating system/runtime

Plan | Linux¹ (code-only) | Windows² (code-only) | Linux¹,³ (Docker container)
Consumption plan | .NET Core, Node.js, Java, Python | .NET Core, Node.js, Java, PowerShell Core | No support
Premium plan | .NET Core, Node.js, Java, Python | .NET Core, Node.js, Java, PowerShell Core | .NET Core, Node.js, Java, PowerShell Core, Python
Dedicated plan⁴ | .NET Core, Node.js, Java, Python | .NET Core, Node.js, Java, PowerShell Core | .NET Core, Node.js, Java, PowerShell Core, Python
ASE⁴ | .NET Core, Node.js, Java, Python | .NET Core, Node.js, Java, PowerShell Core | .NET Core, Node.js, Java, PowerShell Core, Python
Kubernetes | n/a | n/a | .NET Core, Node.js, Java, PowerShell Core, Python

¹ Linux is the only supported operating system for the Python runtime stack.
² Windows is the only supported operating system for the PowerShell runtime stack.
³ Linux is the only supported operating system for Docker containers.
⁴ For specific limits for the various App Service plan options, see the App Service plan limits.

Scale

Plan | Scale out | Max # instances
Consumption plan | Event driven. Scale out automatically, even during periods of high load. Azure Functions infrastructure scales CPU and memory resources by adding additional instances of the Functions host, based on the number of events that its functions are triggered on. | 200
Premium plan | Event driven. Scale out automatically, even during periods of high load. Azure Functions infrastructure scales CPU and memory resources by adding additional instances of the Functions host, based on the number of events that its functions are triggered on. | 100
Dedicated plan¹ | Manual/autoscale | 10-20
ASE¹ | Manual/autoscale | 100
Kubernetes | Event-driven autoscale for Kubernetes clusters using KEDA. | Varies by cluster.

¹ For specific limits for the various App Service plan options, see the App Service plan limits.

Cold start behavior

Consumption plan: Apps may scale to zero if idle for a period of time, meaning some requests may have additional latency at startup. The Consumption plan does have some optimizations to help decrease cold start time, including pulling from pre-warmed placeholder functions that already have the function host and language processes running.
Premium plan: Perpetually warm instances to avoid any cold start.
Dedicated plan¹: When running in a Dedicated plan, the Functions host can run continuously, which means that cold start isn’t really an issue.
ASE¹: When running in a Dedicated plan, the Functions host can run continuously, which means that cold start isn’t really an issue.
Kubernetes: Depends on KEDA configuration. Apps can be configured to always run and never have cold start, or configured to scale to zero, which results in cold start on new events.

¹ For specific limits for the various App Service plan options, see the App Service plan limits.

Service limits

Resource | Consumption plan | Premium plan | Dedicated plan | ASE | Kubernetes
Default timeout duration (min) | 5 | 30 | 30¹ | 30 | 30
Max timeout duration (min) | 10 | unbounded⁷ | unbounded² | unbounded | unbounded
Max outbound connections (per instance) | 600 active (1200 total) | unbounded | unbounded | unbounded | unbounded
Max request size (MB)³ | 100 | 100 | 100 | 100 | Depends on cluster
Max query string length³ | 4096 | 4096 | 4096 | 4096 | Depends on cluster
Max request URL length³ | 8192 | 8192 | 8192 | 8192 | Depends on cluster
ACU per instance | 100 | 210-840 | 100-840 | 210-250⁸ | AKS pricing
Max memory (GB per instance) | 1.5 | 3.5-14 | 1.75-14 | 3.5-14 | Any node is supported
Function apps per plan | 100 | 100 | unbounded⁴ | unbounded | unbounded
App Service plans | 100 per region | 100 per resource group | 100 per resource group | - | -
Storage⁵ | 5 TB | 250 GB | 50-1000 GB | 1 TB | n/a
Custom domains per app | 500⁶ | 500 | 500 | 500 | n/a
Custom domain SSL support | unbounded SNI SSL connection included | unbounded SNI SSL and 1 IP SSL connections included | unbounded SNI SSL and 1 IP SSL connections included | unbounded SNI SSL and 1 IP SSL connections included | n/a

¹ By default, the timeout for the Functions 1.x runtime in an App Service plan is unbounded.
² Requires the App Service plan be set to Always On. Pay at standard rates.
³ These limits are set in the host.
⁴ The actual number of function apps that you can host depends on the activity of the apps, the size of the machine instances, and the corresponding resource utilization.
⁵ The storage limit is the total content size in temporary storage across all apps in the same App Service plan. Consumption plan uses Azure Files for temporary storage.
⁶ When your function app is hosted in a Consumption plan, only the CNAME option is supported. For function apps in a Premium plan or an App Service plan, you can map a custom domain using either a CNAME or an A record.
⁷ Guaranteed for up to 60 minutes.
⁸ Workers are roles that host customer apps. Workers are available in three fixed sizes: One vCPU/3.5 GB RAM; Two vCPU/7 GB RAM; Four vCPU/14 GB RAM.

Networking features

Feature | Consumption plan | Premium plan | Dedicated plan | ASE | Kubernetes
Inbound IP restrictions and private site access | ✅Yes | ✅Yes | ✅Yes | ✅Yes | ✅Yes
Virtual network integration | ❌No | ✅Yes (Regional) | ✅Yes (Regional and Gateway) | ✅Yes | ✅Yes
Virtual network triggers (non-HTTP) | ❌No | ✅Yes | ✅Yes | ✅Yes | ✅Yes
Hybrid connections (Windows only) | ❌No | ✅Yes | ✅Yes | ✅Yes | ✅Yes
Outbound IP restrictions | ❌No | ✅Yes | ✅Yes | ✅Yes | ✅Yes

Billing

Consumption plan: Pay only for the time your functions run. Billing is based on number of executions, execution time, and memory used.
Premium plan: Billing is based on the number of core seconds and memory used across needed and pre-warmed instances. At least one instance per plan must be kept warm at all times. This plan provides more predictable pricing.
Dedicated plan¹: You pay the same for function apps in an App Service plan as you would for other App Service resources, like web apps.
ASE¹: There's a flat monthly rate for an ASE that pays for the infrastructure and doesn't change with the size of the ASE. In addition, there's a cost per App Service plan vCPU. All apps hosted in an ASE are in the Isolated pricing SKU.
Kubernetes: You pay only the costs of your Kubernetes cluster; no additional billing for Functions. Your function app runs as an application workload on top of your cluster, just like a regular app.

¹ For specific limits for the various App Service plan options, see the App Service plan limits.

Next steps