Power BI Premium Gen2 FAQ

This article addresses questions frequently asked about Power BI Premium Gen2. For an overview, see What is Power BI Premium Gen2?.

Power BI Premium Gen2

This section addresses questions and answers for Power BI Premium Gen2.

What is Power BI Premium Generation 2?

Power BI recently released Premium Gen2, a new version of Power BI Premium. Premium Gen2 simplifies the management of Premium capacities and reduces management overhead. For more information about Premium Gen2, see Power BI Premium Generation 2.

How can I control the costs of autoscaling?

Autoscaling is an optional feature of Premium Gen2 and is subject to two limits, each of which is configured by Power BI administrators:

  • Proactive limit – a proactive limit caps the rate at which autoscale can generate expenses by limiting the number of autoscale v-cores a capacity can use. For example, by setting the maximum number of autoscale v-cores to one, you ensure that the maximum charge you can incur is 30 days of autoscaling with one v-core.

  • Reactive limit – you can also set a reactive limit on the cost of autoscaling by setting an expenditure limit on the Azure subscription used with autoscale. If the subscription’s budget is exhausted, Power BI is prevented from using the v-core resources for that subscription, and autoscale shuts off. You can set a budget for the Azure subscription that autoscale uses by following the Azure budget tutorial.
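
To make the proactive limit concrete, the following minimal sketch computes the upper bound that the limit implies. The function names and the `price_per_vcore_day` parameter are hypothetical and used only for illustration; the actual price of an autoscale v-core is not stated in this article and depends on your Azure agreement. The 30-day window comes from the proactive limit example.

```python
def max_autoscale_vcore_days(max_autoscale_vcores: int, days: int = 30) -> int:
    """Upper bound on billable autoscale v-core-days within the window."""
    if max_autoscale_vcores < 0 or days < 0:
        raise ValueError("v-core count and days must be non-negative")
    return max_autoscale_vcores * days


def max_autoscale_charge(max_autoscale_vcores: int,
                         price_per_vcore_day: float,
                         days: int = 30) -> float:
    """Worst-case charge, assuming every allowed v-core runs the whole window.

    price_per_vcore_day is a hypothetical input; look up your actual rate.
    """
    return max_autoscale_vcore_days(max_autoscale_vcores, days) * price_per_vcore_day


if __name__ == "__main__":
    # With a limit of one autoscale v-core, the bound is 30 v-core-days.
    print(max_autoscale_vcore_days(1))
```

A budget on the Azure subscription (the reactive limit) can then be compared against this worst-case figure.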

How does resource utilization cause Gen2 to autoscale?

Power BI Premium Gen2 evaluates your level of utilization by aggregating utilization records every 30 seconds. Each evaluation is composed of two different aggregations: interactive utilization and background utilization.

Interactive utilization is evaluated by considering all the interactive operations that completed in or near the current 30-second evaluation cycle.

Background utilization is evaluated by considering all the background operations that completed during the past twenty-four hours, where each background operation contributes only 1/2880 of its total CPU cost (there are 2880 evaluation cycles in each 24-hour period).
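
The smoothing of background operations can be sketched as follows. This is an illustration of the arithmetic only, not the service's implementation; the function name and the input list of per-operation CPU seconds are assumptions made for the example.

```python
CYCLE_SECONDS = 30
# Number of 30-second evaluation cycles in 24 hours: 86,400 / 30 = 2880.
CYCLES_PER_DAY = 24 * 60 * 60 // CYCLE_SECONDS


def background_utilization(cpu_seconds_last_24h):
    """CPU seconds charged to one evaluation cycle for background work.

    cpu_seconds_last_24h: total CPU cost of each background operation
    that completed during the past 24 hours (hypothetical input shape).
    Each operation contributes 1/2880 of its total CPU cost.
    """
    return sum(cpu_seconds_last_24h) / CYCLES_PER_DAY


if __name__ == "__main__":
    # Two refreshes costing 2,880 and 5,760 CPU seconds in the last day.
    print(background_utilization([2880.0, 5760.0]))
```

Because the cost is spread over the whole day, a single heavy refresh adds only a small, constant amount to every evaluation cycle instead of a spike in one cycle.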

A capacity consists of an equal number of frontend and backend v-cores. The CPU time measured in utilization records reflects the utilization of the backend v-cores, and this utilization drives the need to autoscale. Utilization of frontend v-cores is not tracked. You cannot convert frontend v-cores to backend v-cores.

If you have a P1 subscription with four backend v-cores, the quota for each evaluation cycle is 4*30 = 120 seconds of CPU time. If the sum of the interactive and background utilization exceeds the total backend v-core quota of your capacity, your capacity autoscales by adding one v-core for the next 24 hours.

Autoscale always looks at your current capacity size to evaluate how much resource you use. If you have already autoscaled by one v-core, that v-core is split evenly between frontend and backend at 50% each, so in the P1 example your quota is now 120 + 0.5*30 = 135 seconds of CPU time in an evaluation cycle.
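
The quota arithmetic for the P1 example can be written as a short sketch. The function names are hypothetical, and the check models only the quota comparison described here, not any other condition the service applies before it autoscales.

```python
CYCLE_SECONDS = 30


def cycle_quota(backend_vcores: int, autoscale_vcores: int = 0) -> float:
    """CPU seconds available in one 30-second evaluation cycle.

    Each autoscale v-core is split 50/50 between frontend and backend,
    so only half of it adds to the backend quota.
    """
    return (backend_vcores + 0.5 * autoscale_vcores) * CYCLE_SECONDS


def exceeds_quota(interactive_cpu_s: float,
                  background_cpu_s: float,
                  backend_vcores: int,
                  autoscale_vcores: int = 0) -> bool:
    """True when combined utilization is above the current cycle quota."""
    used = interactive_cpu_s + background_cpu_s
    return used > cycle_quota(backend_vcores, autoscale_vcores)


if __name__ == "__main__":
    print(cycle_quota(4))        # P1 with four backend v-cores
    print(cycle_quota(4, 1))     # same capacity after one autoscale v-core
```

For example, 125 CPU seconds of combined utilization is over the quota of a P1 capacity with no autoscale v-cores, but under the quota once one autoscale v-core has been added.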

Autoscale always makes sure that no single interactive operation can consume all of your capacity, and you must have two or more interactive operations taking place in a single evaluation cycle to initiate autoscale.

What happens to traffic during overload if I don't autoscale?

If a capacity’s utilization exceeds 100% and the capacity cannot use autoscale, either because autoscale is turned off or because it is already at its maximum v-core utilization value, the capacity enters a temporary interactive request delay mode. In this mode, each interactive request (such as a report load or a visual interaction) is delayed before it is sent to the engine for execution. The amount of delay is proportional to the amount of overload detected. An overload of 100% incurs a delay of 20 seconds.
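
As a rough illustration of the proportional delay, the sketch below assumes that overload means the utilization above 100% (so 200% utilization is a 100% overload) and that the delay scales linearly with no upper cap. Both are assumptions made for the example; this article states only that the delay is proportional and that a 100% overload gives 20 seconds. The function name is hypothetical.

```python
DELAY_AT_FULL_OVERLOAD_S = 20.0  # delay stated for an overload of 100%


def interactive_delay_seconds(utilization_pct: float) -> float:
    """Estimated delay applied to each interactive request.

    Assumes overload = utilization above 100%, scaled linearly
    (an assumption of this sketch, not a documented formula).
    """
    overload_pct = max(0.0, utilization_pct - 100.0)
    return DELAY_AT_FULL_OVERLOAD_S * overload_pct / 100.0


if __name__ == "__main__":
    for pct in (80.0, 100.0, 150.0, 200.0):
        print(pct, interactive_delay_seconds(pct))
```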

The capacity stays in interactive request delay mode as long as the previous evaluation showed resource usage greater than 100%.

Which operations contribute to interactive utilization, and which to background utilization?

The following events are interactive operations:

  • Datasets workload - Report View, Query, XMLA read
  • Dataflows workloads
  • Paginated reports workload - paginated report render

The following are background operations:

  • Datasets workload - scheduled refresh, on-demand refresh, background query (after refresh)
  • Dataflows workload - scheduled dataflow refresh
  • Paginated reports workload - data-driven subscription renders
  • Export report to file API calls
  • AI workloads

How can I use my utilization data to predict my capacity needs?

Your metrics report dataset retains 30 to 45 days of data. You can use the report to see how close you are to your capacity's maximum resources. If you save monthly snapshots, you can compare them to identify growth trends and extrapolate the rate at which you will reach 100% utilization of your resources.
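
One simple way to extrapolate from saved snapshots is a straight-line fit, sketched below. The function name and the input (one peak utilization percentage per monthly snapshot, oldest first) are assumptions made for the example; the metrics report does not provide this calculation, and real growth may not be linear.

```python
from typing import Optional, Sequence


def months_until_full(monthly_peak_pct: Sequence[float]) -> Optional[float]:
    """Months after the latest snapshot until the fitted trend reaches 100%.

    Fits a least-squares line through the snapshots (index 0 = oldest).
    Returns None when utilization is flat or falling.
    """
    n = len(monthly_peak_pct)
    if n < 2:
        raise ValueError("need at least two monthly snapshots")
    mean_x = (n - 1) / 2
    mean_y = sum(monthly_peak_pct) / n
    slope_num = sum((x - mean_x) * (y - mean_y)
                    for x, y in enumerate(monthly_peak_pct))
    slope_den = sum((x - mean_x) ** 2 for x in range(n))
    slope = slope_num / slope_den
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    month_at_full = (100.0 - intercept) / slope
    # Already at or past 100% on the trend line counts as zero months left.
    return max(0.0, month_at_full - (n - 1))


if __name__ == "__main__":
    # Peaks of 50%, 60%, and 70% over three consecutive months.
    print(months_until_full([50.0, 60.0, 70.0]))
```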

How can my utilization data inform me I should turn on autoscale?

Utilization data does not currently indicate whether requests were throttled because the capacity was in interactive request delay mode. This information will be added to the utilization app so that admins can determine whether users experienced delays, and to what extent the delays were due to overload without autoscaling.

How can I get notified that I'm approaching my max capacity?

The Capacity management page in the Power BI admin portal has a utilization notification checkbox. Users can choose the threshold at which an alert will be triggered (default is 80%) and the email address to which utilization alerts should be sent.

How much data is Power BI storing? How can I retain more?

The Power BI service stores over 90 days of utilization data. Users who need longer data retention can use Bring Your Own Log Analytics (BYOLA) to store more utilization data.

How do I get visibility into resources of Gen2 beyond CPU time?

Today, utilization data doesn't give customers visibility into the memory footprint of their operations, so customers cannot know ahead of time whether any of their operations is subject to failure.

How do I use utilization data to perform chargebacks?

On the left side of the utilization report, a bar chart visual displays how utilization is distributed across workspaces for the time span of the report. The bar chart visual can be used for chargebacks, provided that each workspace represents a different business unit, cost center, or other entity to which chargebacks can apply.
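
A proportional split is one possible chargeback scheme, sketched below. The function name, the workspace names, and the CPU figures are hypothetical; the values would be read manually from the bar chart, and the total cost is whatever amount you decide to distribute.

```python
from typing import Dict


def chargeback_by_workspace(cpu_seconds_by_workspace: Dict[str, float],
                            total_cost: float) -> Dict[str, float]:
    """Split total_cost across workspaces in proportion to their CPU use."""
    total_cpu = sum(cpu_seconds_by_workspace.values())
    if total_cpu <= 0:
        raise ValueError("total utilization must be positive")
    return {
        workspace: total_cost * cpu / total_cpu
        for workspace, cpu in cpu_seconds_by_workspace.items()
    }


if __name__ == "__main__":
    # Hypothetical utilization figures for two workspaces.
    usage = {"Finance": 300.0, "Sales": 100.0}
    print(chargeback_by_workspace(usage, 1000.0))
```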

Premium Per User (PPU)

For information about Premium Per User (PPU), see the Power BI Premium Per User article.

Next steps

More questions? Try asking the Power BI Community