Dataset design for the Power BI Connector for Azure DevOps
Azure DevOps Services | Azure DevOps Server 2019
Each Analytics view defines a dataset in Power BI. Datasets are the tables and properties used to create visualizations. The datasets generated by the Power BI Data Connector for Azure DevOps have the following characteristics:
- The entities and associated fields available from Analytics are flattened (denormalized) into a single table. For example, the user name in "Created By" is modeled as a string (User Name), rather than the user ID. This eliminates the need to create relationships between tables to create reports.
- Historical data is modeled as snapshots for each time period, so trending reporting is extremely simple.
To learn more about Power BI and datasets, see Power BI - basic concepts for Power BI service.
Flattening the Analytics OData endpoint
The Analytics OData endpoint provides a normalized representation of Analytics data. The data is normalized primarily to support reporting on the "many-to-many" relationships that exist between the data, such as work items and associated tags.
The Power BI Data Connector represents this data as a single table so that the relationships shown in our Analytics data model do not need to be recreated in Power BI. This allows you to filter immediately on complex fields, such as work item tags.
The process greatly simplifies getting your reports up and running. However, not all of the fields available through the Analytics OData endpoint are available for selection in an Analytics view.
Selectable fields in Analytics views
The fields you can select in an Analytics view correspond to regular work tracking fields and Analytics data store fields.
Work tracking fields
You can select all work tracking fields in an Analytics view, except for the following fields:
- Fields that aren't part of the project where the view was created
- Long text fields such as Description, History, and other fields with an HTML data type
- Work item link count fields such as ExternalLinkCount, HyperLinkCount, AttachedFileCount, RelatedLinkCount
- Specific REST API Fields, such as Watermark, IsDeleted
- Fields with many-to-many relationships, such as Team, Board Column, Board Name
Identity or person-name fields, such as Created By, Assigned To, etc., are selectable fields, however you can't select these fields at this time as field criteria for the purposes of filtering work items.
For a description of each work item tracking fields, see Work item field index.
Analytics data store fields
You can select the following Analytics-based fields in an Analytics view:
|Cycle Time||Time for a work item to move from a state category of "In Progress" to "Completed"|
|Date (included automatically with history)||Supports viewing the daily, weekly, or monthly history of the filtered set of work items|
|Is Current (included automatically with history)||Supports filtering the data to view the most recent snapshot of the filtered set of work items by setting the value to
|Lead Time||Time for a work item to move from a state category of "Proposed" to "Completed"|
|Parent Work Item Id||The Work Item Id for the parent of a work item|
|Project Name||Equivalent to the project field|
|Revision||A number that is assigned to the historical revision of a work item|
|Tags||Semicolon delimited list of Tags|
|WorkItemRevisionSK||The Analytics unique key for the work item revision, used to join related entities|
To access any additional fields available through Analytics, include the corresponding Surrogate Keys(SK) or Work Item Id in the Analytics view and create the necessary mapping tables based on the Analytics Navigational Property.
- Iterations (IterationSK)
- Areas (AreaSK)
- Teams (AreaSK - generate mapping Table based on Teams Navigational Property)
- BoardLocations (AreaSK - generate mapping Table based on BoardLocations Navigational Property)
- Dates (DateSK)
- Process (AreaSK - generate mapping Table based on Process Navigational Property)
- WorkItemLinks (Work Item Id)
Understanding the Analytics data model is critical to building good relationships between entities.
By default, when basic data is returned from Analytics, the data is related as shown in the figure below:
You will note that Tags, Teams, and Users are not related to any of the other data. This is due to how those entities are related. They are related by either many-to-many relationships which are not easily handled in these models or there are multiple relationships between the entities such as between users and work items (they are related by Assigned To, Created By, Changed By, and others).
You can handle multiple relationships fairly simply. For example, in the default model you can edit the query, select the AssignedTo column of the WorkItems table and expand the column to include all of the data from the Users table and you can repeat this process for the Created By and Changed By columns as well. This gets you around having multiple links from one table to another which is not allowed.
Another reason for expanding columns in this way is to handle circular relationships which are also not allowed. For example, take the following path: Projects > Areas > Work Items > Projects. This presents a typical circular problem. What if you wanted to see which ares were part of a given project? The model as it is build has relationships between Areas and Work Items and Projects and Work Items but Projects cannot be related to Areas because that completes the circular relationship and so it is not allowed. To handle this, you could expand the Project column in the Areas table. To do that, take the following steps:
Select Edit Queries from the Home tab
Select the Areas query
Scroll to the Project column (the last column) and click the Expand icon at the top of the column
Uncheck all of the columns except for the ProjectName and click OK
Now you can list Areas by Project and get a count of Areas in each project.
Historical data modeling
Analytics models historical data as a periodic snapshot fact table. The fact table contains one row created at midnight for each work item at the end of each period. For example, history on a daily period is modeled as one row at midnight for each day, while a weekly period would be one row at midnight of the last day of the week. If the week has not completed, the snapshot value for the week is based on the current value.
The grain of this table is the period, not the individual work item. This means that a single Work Item will appear multiple times, once for each historical period. Selecting the last 30 days of history will result in a single work item appearing 30 times in the data model. If the work item has not changed within the last 30 days, the most recent revision of the work item is replicated on each day.
When working with the Power BI Data Connector and historical data, we recommend using the
Date field. If the dataset contains historical data, but only the current values are needed, this can be setup by filtering
For example, if you want to show a table of work items and values for the associated fields you would use
Is Current as a filter which is set to True. If you wanted instead to show a trend of work items based on state you would include the Date column on the Axis of the visualization.
Always use the Date option when using the Date column. The Date field is not intended to support default hierarchies in Power BI.