Creating computed entities in dataflows
You can perform in-storage computations when using dataflows with a Power BI Premium subscription. This lets you do calculations on your existing dataflows, and return results that enable you to focus on report creation and analytics.
To perform in-storage computations, you first must create the dataflow and bring data into that Power BI dataflow storage. After you have a dataflow that contains data, you can create computed entities, which are entities that do in-storage computations.
There are two ways you can connect dataflow data to Power BI:
- Using self-service authoring of a dataflow
- Using an external dataflow
The following sections describe how to create computed entities on your dataflow data.
How to create computed entities
After you have a dataflow with a list of entities, you can perform calculations on those entities.
In the dataflow authoring tool in the Power BI service, select Edit entities. Then right-click the entity you want to use as the basis for your computed entity and on which you want to perform calculations. On the shortcut menu, select Reference.
For the entity to be eligible as a computed entity, Enable load must be selected, as shown in the following image. Right-click the entity to display this shortcut menu.
By selecting Enable load, you create a new entity whose source is the referenced entity. The icon changes to the computed icon, as shown in the following image.
Any transformation you do on this newly created entity will be run on the data that already resides in Power BI dataflow storage. That means that the query won't run against the external data source from which the data was imported (for example, the SQL database from which the data was pulled).
Example use cases
What kind of transformations can be done with computed entities? Any transformation that you usually specify by using the transformation user interface in Power BI, or the M editor, are all supported when performing in-storage computation.
Consider the following example. You have an Account entity that contains the raw data for all the customers from your Dynamics 365 subscription. You also have ServiceCalls raw data from the service center, with data from the support calls that were performed from the different accounts on each day of the year.
Imagine you want to enrich the Account entity with data from ServiceCalls.
First you would need to aggregate the data from the ServiceCalls to calculate the number of support calls that were done for each account in the last year.
Next, you merge the Account entity with the ServiceCallsAggregated entity to calculate the enriched Account table.
Then you can see the results, shown as EnrichedAccount in the following image.
And that's it—the transformation is done on the data in the dataflow that resides in your Power BI Premium subscription, not on the source data.
Considerations and limitations
It's important to note that if you remove the workspace from Power BI Premium capacity, the associated dataflow will no longer be refreshed.
When working with dataflows specifically created in an organization's Azure Data Lake Storage account, linked entities and computed entities only work properly when the entities reside in the same storage account. More information: Connect Azure Data Lake Storage Gen2 for dataflow storage
Linked entities are only available for dataflows created in Power BI and Power Apps. As a best practice, when doing computations on data joined by on-premises and cloud data, create a new entity to perform such computations. This provides a better experience than using an existing entity for computations, such as an entity that is also querying data from both sources and doing in-storage transformations.
This article described computed entities and dataflows. Here are some more articles that might be useful:
- Self-service data prep in Power BI
- Using incremental refresh with dataflows
- Connect to data sources for dataflows
- Link entities between dataflows
The following links provide additional information about dataflows in Power BI and other resources:
- Create and use dataflows in Power BI
- Using dataflows with on-premises data sources
- Developer resources for Power BI dataflows
- Configure workspace dataflow settings (Preview)
- Add a CDM folder to Power BI as a dataflow (Preview)
- Connect Azure Data Lake Storage Gen2 for dataflow storage (Preview)
For more information about Power Query and scheduled refresh, you can read these articles:
For more information about Common Data Model, you can read its overview article: