Model relationships in Power BI Desktop
This article targets Import data modelers working with Power BI Desktop. It's an important model design topic that's essential to delivering intuitive, accurate, and optimal models.
For a deeper discussion on optimal model design, including table roles and relationships, see the Understand star schema and the importance for Power BI article.
Put simply, Power BI relationships propagate filters applied on the columns of model tables to other model tables. Filters will propagate so long as there's a relationship path to follow, which can involve propagation to multiple tables.
Relationship paths are deterministic, meaning that filters are always propagated in the same way and without random variation. Relationships can, however, be disabled, or have filter context modified by model calculations that use particular DAX functions. For more information, see the Relevant DAX functions topic later in this article.
It's important to understand that model relationships do not enforce data integrity. For more information, see the Relationship evaluation topic later in this article. This topics explains how model relationships behave when there are data integrity issues with your data.
Let's see how relationships propagate filters with an animated example.
In this example, the model consists of four tables: Category, Product, Year, and Sales. The Category table relates to the Product table, and the Product table relates to the Sales table. The Year table also relates to the Sales table. All relationships are one-to-many (the details of which are described later in this article).
A query—possibly generated by a Power BI card visual—requests the total sales quantity for sales orders made for a single category, Cat-A, and for a single year, CY2018. It's why you can see filters applied on the Category and Year tables. The filter on the Category table propagates to the Product table to isolate two products that are assigned to the category Cat-A. Then the Product table filters propagate to the Sales table to isolate just two sales rows for these products. These two sales rows represent the sales of products assigned to category Cat-A. Their combined quantity is 14 units. At the same time, the Year table filter propagates to further filter the Sales table, resulting in just the one sales row that is for products assigned to category Cat-A and that was ordered in year CY2018. The quantity value returned by the query is 11 units. Note that when multiple filters are applied to a table (like the Sales table in this example), it's always an AND operation, requiring that all conditions must be true.
It's unusual that a model table isn't related to another model table. Such a table in a valid model design can be described as a disconnected table. A disconnected table isn't intended to propagate filters to other model tables. Instead, it's used to accept "user input" (perhaps with a slicer visual), allowing model calculations to use the input value in a meaningful way. For example, consider a disconnected table that is loaded with a range of currency exchange rate values. As long as a filter is applied to filter by a single rate value, the value can be used by a measure expression to convert sales values.
The Power BI Desktop what-if parameter is feature that creates a disconnected table. For more information, see the Create and use a What if parameter to visualize variables in Power BI Desktop article.
A model relationship relates one column in a table to one column in a different table. (There's one specialized case where this requirement isn't true, and it applies only to multi-column relationships in DirectQuery models. For more information, see the COMBINEVALUES DAX function article.)
It's not possible to relate a column to a different column in the same table. This is sometimes confused with the ability to define a relational database foreign key constraint that is table self-referencing. This relational database concept can be used to store parent-child relationships (for example, each employee record is related to a "reports to" employee). Generating a model hierarchy based on this type of relationship can't be solved by creating model relationships. To achieve this, see the Parent and Child functions article.
Each model relationship must be defined with a cardinality type. There are four cardinality type options, representing the data characteristics of the "from" and "to" related columns. The "one" side means the column contains unique values; the "two" side means the column can contain duplicate values.
If a data refresh operation attempts to load duplicate values into a "one" side column, the entire data refresh will fail.
The four options—together with their shorthand notations—are described in the following bulleted list:
- One-to-many (1:*)
- Many-to-one (*:1)
- One-to-one (1:1)
- Many-to-many (*:*)
When you create a relationship in Power BI Desktop, the designer will automatically detect and set the cardinality type. The designer queries the model to know which columns contain unique values. For Import models, it uses internal storage statistics; for DirectQuery models it sends profiling queries to the data source. Sometimes, however, it can get it wrong. It happens because tables are yet to be loaded with data, or because columns that you expect to contain duplicate values currently contain unique values. In either case, you can update the cardinality type as long as any "one" side columns contain unique values (or the table is yet to be loaded with rows of data).
The One-to-many and Many-to-one cardinality options are essentially the same, and they're also the most common cardinality types.
When configuring a One-to-many or Many-to-one relationship, you'll choose the one that matches the order in which you related the columns. Consider how you would configure the relationship from the Product table to the Sales table by using the ProductID column found in each table. The cardinality type would be One-to-many, as the ProductID column in the Product table contains unique values. If you related the tables in the reverse direction, Sales to Product, then the cardinality would be Many-to-one.
A One-to-one relationship means both columns contain unique values. This cardinality type isn't common, and it likely represents a suboptimal model design because of the storage of redundant data. For more information on using this cardinality type, see One-to-one relationship guidance.
A Many-to-many relationship means both columns can contain duplicate values. This cardinality type is infrequently used. It's typically useful when designing complex model requirements. For guidance on using this cardinality type, see Many-to-many relationship guidance.
The Many-to-many cardinality type isn't currently supported for models developed for Power BI Report Server.
In Power BI Desktop model view, you can interpret a relationship's cardinality type by looking at the indicators (1 or *) on either side of the relationship line. To determine which columns are related, you'll need to select—or hover the cursor over—the relationship line to highlight the columns.
Cross filter direction
Each model relationship must be defined with a cross filter direction. Your selection determines the direction(s) that filters will propagate. The possible cross filter options are dependent on the cardinality type.
|Cardinality type||Cross filter options|
|One-to-many (or Many-to-one)||Single
|Many-to-many||Single (Table1 to Table2)
Single (Table2 to Table1)
Single cross filter direction means "single direction", and Both means "both directions". A relationship that filters in both directions is commonly described as bi-directional.
For One-to-many relationships, the cross filter direction is always from the "one" side, and optionally from the "many" side (bi-directional). For One-to-one relationships, the cross filter direction is always from both tables. Lastly, for the Many-to-many relationships, cross filter direction can be from either one of the tables, or from both tables. Notice that when the cardinality type includes a "one" side, that filters will always propagate from that side.
When the cross filter direction is set to Both, an additional property is available. It can apply bi-directional filtering when row-level security (RLS) rules are enforced. For more information about RLS, see the Row-level security (RLS) with Power BI Desktop article.
Modifying the relationship cross filter direction—including the disabling of filter propagation—can also be done by a model calculation. It's achieved by using the CROSSFILTER DAX function.
Bi-directional relationships can impact negatively on performance. Further, attempting to configure a bi-directional relationship could result in ambiguous filter propagation paths. In this case, Power BI Desktop may fail to commit the relationship change and will alert you with an error message. Sometimes, however, Power BI Desktop may allow you to define ambiguous relationship paths between tables. Precedence rules that affect ambiguity detection and path resolution are described later in this article in the Precedence rules topic.
We recommend using bi-directional filtering only as needed. For more information, see Bi-directional relationship guidance.
In Power BI Desktop model view, you can interpret a relationship's cross filter direction by noticing the arrowhead(s) along the relationship line. A single arrowhead represents a single-direction filter in the direction of the arrowhead; a double arrowhead represents a bi-directional relationship.
Make this relationship active
There can only be one active filter propagation path between two model tables. However, it's possible to introduce additional relationship paths, though these relationships must all be configured as inactive. Inactive relationships can only be made active during the evaluation of a model calculation. It is achieved by using the USERELATIONSHIP DAX function.
For more information, see Active vs inactive relationship guidance.
In Power BI Desktop model view, you can interpret a relationship's active vs inactive status. An active relationship is represented by a solid line; an inactive relationship is represented as a dashed line.
Assume referential integrity
The Assume referential integrity property is available only for One-to-many and One-to-one relationships between two DirectQuery storage mode tables that are based on the same data source. When enabled, native queries sent to the data source will join the two tables together by using an INNER JOIN rather than an OUTER JOIN. Generally, enabling this property improves query performance, though it does depend on the specifics of the data source.
Always enable this property when a database foreign key constraint exists between the two tables. When a foreign key constraint doesn't exist, you can still enable the property as long as you're certain data integrity exists.
If data integrity should become compromised, the inner join will eliminate unmatched rows between the tables. For example, consider a model Sales table with a ProductID column value that did not exist in the related Product table. Filter propagation from the Product table to the Sales table will eliminate sales rows for unknown products. This would result in an understatement of the sales results.
For more information, see the Assume referential integrity settings in Power BI Desktop article.
Relevant DAX functions
There are several DAX functions that are relevant to model relationships. Each function is described briefly in the following bulleted list:
- RELATED: Retrieves the value from "one" side.
- RELATEDTABLE: Retrieve a table of rows from "many" side.
- USERELATIONSHIP: Forces the use of a specific inactive model relationship.
- CROSSFILTER: Modifies the relationship cross filter direction (to one or both), or it disables filter propagation (none).
- COMBINEVALUES: Joins two or more text strings into one text string. The purpose of this function is to support multi-column relationships in DirectQuery models.
- TREATAS: Applies the result of a table expression as filters to columns from an unrelated table.
- Parent and Child functions: A family of related functions that can be used to generate calculated columns to naturalize a parent-child hierarchy. These columns can then be used to create a fixed-level hierarchy.
Model relationships, from an evaluation perspective, are classified as either strong or weak. It's not a configurable relationship property. It is in fact inferred from the cardinality type and the data source of the two related tables. It's important to understand the evaluation type because there may be performance implications or consequences should data integrity be compromised. These implications and integrity consequences are described in this topic.
First, some modeling theory is required to fully understand relationship evaluations.
An Import or DirectQuery model sources all of its data from either the Vertipaq cache or the source database. In both instances, Power BI is able to determine that a "one" side of a relationship exists.
A Composite model, however, can comprise tables using different storage modes (Import, DirectQuery or Dual), or multiple DirectQuery sources. Each source, including the Vertipaq cache of Import data, is considered to be a data island. Model relationships can then be classified as intra-island or cross-island. An intra-island relationship is one that relates two tables within a data island, while a cross-island relationship relates tables from different data islands. Note that relationships in Import or DirectQuery models are always intra-island.
Let's see an example of a Composite model.
In this example, the Composite model consists of two islands: a Vertipaq data island and a DirectQuery source data island. The Vertipaq data island contains three tables, and the DirectQuery source data island contains two tables. One cross-island relationship exists to relate a table in the Vertipaq data island to a table in the DirectQuery source data island.
A model relationship is strong when the query engine can determine the "one" side of relationship. It has confirmation that the "one" side column contains unique values. All One-to-many intra-island relationships are strong relationships.
In the following example, there are two strong relationships, both marked as S. Relationships include the One-to-many relationship contained within the Vertipaq island, and the One-to-many relationship contained within the DirectQuery source.
For Import models, where all data is stored in the Vertipaq cache, a data structure is created for each strong relationship at data refresh time. The data structures consist of indexed mappings of all column-to-column values, and their purpose is to accelerate joining tables at query time.
At query time, strong relationships permit table expansion to take place. Table expansion results in the creation of a virtual table by including the native columns of the base table and then expanding into related tables. For Import tables, it's done in the query engine; for DirectQuery tables it is done in the native query sent to the source database (as long as the Assume referential integrity property isn't enabled). The query engine then acts upon the expanded table, applying filters and grouping by the values in the expanded table columns.
Inactive relationships are expanded also, even when the relationship isn't used by a calculation. Bi-directional relationships have no impact on table expansion.
For One-to-many relationships, table expansion takes place from the "many" to the "one" sides by using LEFT OUTER JOIN semantics. When a matching value from the "many" to the "one" side doesn't exist, a blank virtual row is added to the "one" side table.
Table expansion also occurs for One-to-one intra-island relationships, but by using FULL OUTER JOIN semantics. It ensures that blank virtual rows are added on either side, when necessary.
The blank virtual rows are effectively Unknown Members. Unknown members represent referential integrity violations where the "many" side value has no corresponding "one" side value. Ideally these blanks should not exist, and they can be eliminated by cleansing or repairing the source data.
Let's see how table expansion works with an animated example.
In this example, the model consists of three tables: Category, Product, and Sales. The Category table relates to the Product table with a One-to-many relationship, and the Product table relates to the Sales table with a One-to-many relationship. The Category table contains two rows, the Product table contains three rows, and the Sales tables contains five rows. There are matching values on both sides of all relationships meaning that there are no referential integrity violations. A query-time expanded table is revealed. The table consists of the columns from all three tables. It's effectively a denormalized perspective of the data contained in the three tables. A new row is added to the Sales table, and it has a production identifier value (9) that has no matching value in the Product table. It's a referential integrity violation. In the expanded table, the new row has (Blank) values for the Category and Product table columns.
A model relationship is weak when there's no guaranteed "one" side. It can be the case for two reasons:
- The relationship uses a Many-to-many cardinality type (even if one or both columns contain unique values)
- The relationship is cross-island (which can only ever be the case for Composite models)
In the following example, there are two weak relationships, both marked as W. The two relationships include the Many-to-many relationship contained within the Vertipaq island, and the One-to-many cross-island relationship.
For Import models, data structures are never created for weak relationships. This means table joins must be resolved at query time.
Table expansion never occurs for weak relationships. Table joins are achieved by using INNER JOIN semantics, and for this reason, blank virtual rows are not added to compensate for referential integrity violations.
There are additional restrictions related to weak relationships:
- The RELATED DAX function can't be used to retrieve the "one" side column values
- Enforcing RLS has topology restrictions
In Power BI Desktop model view, it's not always possible to determine whether a model relationship is strong or weak. A Many-to-many relationship will always be weak, as is a One-to-many relationship when it's a cross-island relationship. To determine whether it's a cross-island relationship, you'll need to inspect the table storage modes and data sources to arrive at the correct determination.
Bi-directional relationships can introduce multiple—and therefore ambiguous—filter propagation paths between model tables. The following list presents precedence rules that Power BI uses for ambiguity detection and path resolution:
- Many-to-one and One-to-one relationships, including weak relationships
- Many-to-many relationships
- Bi-directional relationships, in the reverse direction (that is, from the "Many" side)
The following list orders filter propagation performance, from fastest to slowest performance:
- One-to-many intra-island relationships
- Many-to-many cardinality relationships
- Many-to-many model relationships achieved with an intermediary table and that involves at least one bi-directional relationship
- Cross-island relationships
For more information about this article, check out the following resources:
- Understand star schema and the importance for Power BI
- One-to-one relationship guidance
- Many-to-many relationship guidance
- Active vs inactive relationship guidance
- Bi-directional relationship guidance
- Relationship troubleshooting guidance
- Video: The Do's and Don'ts of Power BI Relationships
- Questions? Try asking the Power BI Community
- Suggestions? Contribute ideas to improve Power BI