Build the data model for Power BI

Completed

After you import multiple data sets into Power BI Desktop, the next step is to create relationships between the different data sets or tables.

When you import multiple tables, chances are you'll do some analysis using data from all of the tables. Relationships between the imported tables are necessary to accurately calculate results and display the correct information in your reports.

Power BI Desktop makes creating those relationships relatively easy. In fact, in most cases you won't have to do anything, the auto-detect feature will do it for you. However, sometimes you might have to create relationships yourself, or will need to make changes to a relationship. Either way, it is important to understand relationships in Power BI Desktop and how to create and edit them.

The star schema

As a data model for Power BI the star schema is highly recommended. A star schema is a mature modeling approach widely adopted by relational data warehouses. It requires modelers to classify their model tables as either dimension or fact.

Dimension tables describe business entities---the things you model. Entities can include products, people, places, and concepts including time itself. The most consistent table you'll find in a star schema is a date dimension table. A dimension table contains a key column (or columns) that acts as a unique identifier, and descriptive columns.

Fact tables store observations or events, and can be sales orders, stock balances, exchange rates, temperatures, and so on. A fact table contains dimension key columns that relate to dimension tables, and numeric measure columns. The dimension key columns determine the dimensionality of a fact table, while the dimension key values determine the granularity of a fact table.

Diagram of a fact table star schema with five dimensions.

Generally, dimension tables contain a relatively small number of rows. Fact tables, by contrast, can contain a large number of rows and continue to grow over time.

Star schema design and many related concepts we introduce in this module are highly relevant to developing Power BI models that are optimized for performance and usability.

Consider that each Power BI report visual generates a query to send to the Power BI model, which the Power BI service calls a dataset. These queries are used to filter, group, and summarize model data. A well-designed model is one that provides tables for filtering and grouping, and tables for summarizing.

This design fits well with star schema principles:

  • Dimension tables support filtering and grouping

  • Fact tables support summarization

There's no table property that modelers set to configure the table type as dimension or fact. It's in fact determined by the model relationships. A model relationship establishes a filter propagation path between two tables, and it's the Cardinality property of the relationship that determines the table type. A common relationship cardinality is one-to-many or its inverse many-to-one. The "one" side is always a dimension-type table while the "many" side is always a fact-type table.

A well-structured model design should include tables that are either dimension-type tables or fact-type tables. Avoid mixing the two types together for a single table. We also recommend that you should strive to deliver the right number of tables with the right relationships in place. It's also important that fact-type tables always load data at a consistent grain.

Lastly, it's important to understand that optimal model design is part science and part art. Sometimes you can break with good guidance when it makes sense to do so.

Creating an order intake report

Let us look at a practical example. In this example we want to display the order intake. There are many definitions of order intake, and in this example, for the sake of simplicity we define the order intake as what is expected to be sold and shipped but is not yet shipped or invoiced. To be more precise, we want to know the order information already present in Business Central, for the orders that are on the shipment pipeline. So, the order intake can be found in this example in the Sales Line table.

So, it would be a good idea to start with the following queries:

  • Customer
  • Item
  • Order intake

You can instead create one data set that joins customer and items and order intake entries, but in doing so, this data set might not be reusable when you create other reports.

Reusability means that you should think about your data sets as little building blocks. For a certain report you will use them to create a certain data model and for another report you can reuse them to build another data model.

And, if more detail is required, we can always add other queries to the data set of the report.

As an example, we have created the following queries:

Customers query

query 50100 CustomersAPI
{
    QueryType = API;
    APIPublisher = 'contoso';
    APIGroup = 'learnPowerBI';
    APIVersion = 'v2.0';
    EntityName = 'customer';
    EntitySetName = 'customers';
    DataAccessIntent = ReadOnly;

    elements
    {
        
        dataitem(Customer; Customer)
        {
            DataItemTableFilter = Blocked = filter(" ");
            column(SalespersonCode; "Salesperson Code")
            {}
            column(CustomerNo; "No.")
            {}
            column(CustomerName; Name)
            {}
            column(City; City)
            {}
            column(Country; "Country/Region Code")
            {}
        }
    }
}

The customers query fetches all customers from the Customer table who are not blocked.

Items query

query 50101 ItemsAPI
{
    QueryType = API;
    APIPublisher = 'contoso';
    APIGroup = 'learnPowerBI';
    APIVersion = 'v2.0';
    EntityName = 'item';
    EntitySetName = 'items';
    DataAccessIntent = ReadOnly;
    
    elements
    {
        dataitem(Item; Item)
        {
            DataItemTableFilter = Blocked = filter(false);
            column(ItemNo; "No.")
            { }
            column(ItemDescription; Description)
            { }
            column(ItemCategoryCode; "Item Category Code")
            { }
        }
    }
}

The items query fetches all items from the Item table that are not blocked.

Order intake query

The order intake query fetches all sales lines from the Sales Line table that are of type Item.

query 50102 OrderIntakeAPI
{
    QueryType = API;
    APIPublisher = 'contoso';
    APIGroup = 'learnPowerBI';
    APIVersion = 'v2.0';
    EntityName = 'orderintake';
    EntitySetName = 'orderintakes';
    DataAccessIntent = ReadOnly;

    elements
    {
        dataitem(SalesLine; "Sales Line")
        {
            DataItemTableFilter = Type = filter(Item);
            column(CustomerNo; "Sell-to Customer No.")
            { }
            column(LocationCode; "Location Code")
            { }
            column(ItemNo; "No.")
            { }
            column(ShipmentDate; "Shipment Date")
            { }
            column(QuantityOutstanding; "Outstanding Quantity")
            { }
            column(AmountOutstanding; "Outstanding Amount (LCY)")
            { }
        }
    }
}

Screenshot of the Order intake queries.

Now you can import the queries in Power BI Desktop. To do that, follow these steps:

  1. In Power BI Desktop, select Get Data, More....

  2. In the Get Data window, select Online Services, and then Dynamics 365 Business Central.

  3. Expand your sandbox and select the queries and then click Transform Data.

    Screenshot showing the Navigator page demo.

  4. The Power Query Editor should open.

    Screenshot showing the Power Query Editor.

  5. In the Power Query Editor, you can open each query on the left side, select the query and then in the middle window verify the data type of every column.

  6. Select Close & Apply. The queries will be imported into Power BI Desktop.

  7. Select the Model View, in the sidebar on the left.

    Screenshot showing the Power BI Model View.

In the Model View you can see that all queries are imported and that they have been connected using relationships. This is because in all three queries there were matching field named (and data types). The OrderIntake table has been linked with the Customers table using the CustomerNo field and the OrderIntake table has been linked with the Items table using the ItemNo.

In case the relationships were not created automatically, you need to create them manually. You can do that by selecting the CustomerNo field in the OrderIntake table and drag/drop it on the CustomerNo field in the Customer table. Then do the same for the ItemNo field and the Item table.

The data model is now ready. The OrderIntake table is in the center of the data model and can be considered as the fact table. The Item and Customer tables are in the leaves of the data model, in the point of the start, and are considered dimensional tables. Because of the relationships in this data model, you can now aggregate the numerical fields in the fact (OrderIntake) table using fields from the dimensional tables.

If you would want to display data on a timeline, you might be tempted to use a date field from the fact table. Although that might work for most visuals, once you start to create measures, it might give you unexpected and wrong results. To fix this, you need a separate date table and link it to the fact table. This topic will be discussed later.

Shape data

During or after importing data from a data source, like Business Central, you can adjust the data to meet your needs. To shape data, you provide Power Query Editor with step-by-step instructions for adjusting the data while loading and presenting it. Shaping does not affect the original data source, only this view of the data.

Shaping can mean transforming the data, such as renaming columns or tables, removing rows or columns, or changing data types. Power Query Editor captures these steps sequentially under Applied Steps in the Query Settings pane. Each time this query connects to the data source, those steps are carried out, so the data is always shaped the way you specify. This process occurs when you use the query in Power BI Desktop, or when anyone uses your shared query, such as in the Power BI service.

Notice that the Applied Steps in Query Settings already contain a few steps. You can select each step to see its effect in the Power Query Editor. First, you specified a web source, and then you previewed the table in the Navigator window.

Screenshot showing the Query Settings page.

If you need to change a data type, select the column or columns to change. Hold down the Shift key to select several adjacent columns, or Ctrl to select non-adjacent columns. Either right-click a column header, select Change Type, and choose a new data type from the menu, or drop down the list next to Data Type Text in the Transform group of the Home tab, and select a new data type.

Screenshot of the Change Datatype using Power Query Editor feature.

The Power Query Editor in Power BI Desktop uses the ribbon or the right-click menus for available tasks. Most of the tasks you can select on the Home or Transform tabs of the ribbon are also available by right clicking an item and choosing from the menu that appears. You can now apply your own changes and transformations to the data and see them in Applied Steps.

It's important not to import too many columns. In case the data set contains columns you do not want to import then, from the Manage Columns group of the Home tab, select Remove Columns. You can also right-click one of the selected column headers and select Remove Columns from the menu. The selected columns are removed, and the step Removed Columns appears in Applied Steps.

Screenshot of the Remove Columns using Power Query Editor feature.

Finally, you usually change the table title of all your queries. Under Properties in the Query Settings pane, replace the old title with the new name. It's important that you use a name that is easy to understand and that clearly explains to a user what kind of data is in the table.