Use Azure Digital Twins data history

Data history is an Azure Digital Twins feature for automatically historizing twin property updates to Azure Data Explorer. This data can be queried using the Azure Digital Twins query plugin for Azure Data Explorer to gain insights about your environment over time.

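This article shows how to set up a working data history connection between Azure Digital Twins and Azure Data Explorer. It uses the Azure CLI and the Azure portal to set up and connect the required data history resources, including:

  • an Azure Digital Twins instance
  • an Event Hubs namespace containing an event hub
  • an Azure Data Explorer cluster containing a database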

It also contains a sample twin graph that you can use to see the historized twin property updates in Azure Data Explorer.

Tip

Although this article uses the Azure CLI and the Azure portal, you can also work with data history using the 2022-05-31 version of the REST APIs.

Prerequisites

Prepare your environment for the Azure CLI

Note

You can also use Azure Cloud Shell in the PowerShell environment instead of the Bash environment, if you prefer. The commands on this page are written for the Bash environment, so they may require some small adjustments to be run in PowerShell.

Set up CLI session

To start working with Azure Digital Twins in the CLI, the first thing to do is log in and set the CLI context to your subscription for this session. Run these commands in your CLI window:

az login
az account set --subscription "<your-Azure-subscription-ID>"

Tip

You can also use your subscription name instead of the ID in the command above.

If this is the first time you've used this subscription with Azure Digital Twins, run this command to register with the Azure Digital Twins namespace. (If you're not sure, it's OK to run it again even if you've run it before.)

az provider register --namespace 'Microsoft.DigitalTwins'
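Registration runs asynchronously, so it can be useful to confirm that it has finished before continuing. The following is a minimal sketch, assuming the provider reports its status in the registrationState field returned by az provider show. The function names, attempt count, and delay are illustrative choices and not part of the original steps.

```bash
# Print the registration state of a resource provider (for example, "Registered").
provider_state() {
  local namespace="$1"
  az provider show --namespace "$namespace" --query registrationState --output tsv
}

# Poll until the provider reports "Registered", or give up after a number of attempts.
# The defaults (30 attempts, 10 seconds apart) are arbitrary values for this sketch.
wait_for_provider() {
  local namespace="$1" attempts="${2:-30}" delay="${3:-10}" state i
  for ((i = 1; i <= attempts; i++)); do
    state="$(provider_state "$namespace")"
    if [ "$state" = "Registered" ]; then
      echo "Provider $namespace is registered."
      return 0
    fi
    echo "Attempt $i: state is '$state'; waiting ${delay}s..."
    sleep "$delay"
  done
  echo "Provider $namespace is still not registered." >&2
  return 1
}
```

For example, run wait_for_provider 'Microsoft.DigitalTwins' after the registration command above.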

Next, you'll add the Microsoft Azure IoT Extension for Azure CLI, which enables commands for interacting with Azure Digital Twins and other IoT services. Run this command to make sure you have the latest version of the extension:

az extension add --upgrade --name azure-iot

Now you are ready to work with Azure Digital Twins in the Azure CLI.

You can verify this by running az dt --help at any time to see a list of the top-level Azure Digital Twins commands that are available.
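If you'd rather check the extension from a script than read the help output, something like the following may work. It's a sketch that relies on az extension show returning a version field; the function names are hypothetical helpers introduced here for illustration.

```bash
# Print the installed version of a CLI extension; fails if the extension is missing.
extension_version() {
  local name="$1"
  az extension show --name "$name" --query version --output tsv 2>/dev/null
}

# Report whether the extension is available in this CLI session.
check_extension() {
  local name="$1" version
  if version="$(extension_version "$name")" && [ -n "$version" ]; then
    echo "$name extension version $version is installed."
  else
    echo "$name extension is not installed." >&2
    return 1
  fi
}
```

For example, check_extension azure-iot prints the installed version, or returns a non-zero status if the extension hasn't been added yet.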

Set up local variables for CLI session

This article provides CLI commands for creating the data history resources. To make those commands easy to copy and run, set up local variables in your CLI session now and refer to them in the later commands. Update the placeholders (identified with <...> brackets) in the commands below, following the naming rules described in the comments, and then run the commands to create the variables.

Note

These commands are written for the Bash environment. They can be adjusted for PowerShell if you prefer to use a PowerShell CLI environment.

## General Setup
location="<your-resource-region>"
resourcegroup="<your-resource-group-name>"

## Azure Digital Twins Setup
# Instance name can contain letters, numbers, and hyphens. It must start and end with a letter or number, and be between 4 and 62 characters long.
dtname="<name-for-your-digital-twins-instance>"
# Connection name can contain letters, numbers, and hyphens. It must contain at least one letter, and be between 3 and 50 characters long.
connectionname="<name-for-your-data-history-connection>"

## Event Hub Setup
# Namespace can contain letters, numbers, and hyphens. It must start with a letter, end with a letter or number, and be between 6 and 50 characters long.
eventhubnamespace="<name-for-your-event-hub-namespace>"
# Event hub name can contain only letters, numbers, periods, hyphens and underscores. It must start and end with a letter or number.
eventhub="<name-for-your-event-hub>"

## Azure Data Explorer Setup
# Cluster name can contain only lowercase alphanumeric characters. It must start with a letter, and be between 4 and 22 characters long.
clustername="<name-for-your-cluster>"
# Database name can contain only alphanumeric, space, dash, and dot characters, and be up to 260 characters long.
databasename="<name-for-your-database>"
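Because a badly formed name only fails later, when the resource is created, it can help to check names locally first. The sketch below encodes three of the naming rules from the comments above as regular expressions. The function names are hypothetical, and the patterns reflect only the rules as stated in this article, not an authoritative Azure validation.

```bash
# Return 0 when the value ($1) matches the extended regular expression ($2).
# LC_ALL=C keeps character ranges such as [a-z] limited to ASCII.
matches() {
  local LC_ALL=C
  [[ "$1" =~ $2 ]]
}

# Instance name: letters, numbers, hyphens; starts and ends alphanumeric; 4-62 characters.
valid_instance_name() {
  matches "$1" '^[A-Za-z0-9][A-Za-z0-9-]{2,60}[A-Za-z0-9]$'
}

# Event Hubs namespace: starts with a letter, ends alphanumeric; 6-50 characters.
valid_namespace_name() {
  matches "$1" '^[A-Za-z][A-Za-z0-9-]{4,48}[A-Za-z0-9]$'
}

# Cluster name: lowercase alphanumeric only, starts with a letter; 4-22 characters.
valid_cluster_name() {
  matches "$1" '^[a-z][a-z0-9]{3,21}$'
}
```

For example, valid_cluster_name "$clustername" || echo "Check the cluster name" flags a name that breaks the cluster rule before any resources are created.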

Create an Azure Digital Twins instance with a managed identity

If you already have an Azure Digital Twins instance, ensure that you've enabled a system-managed identity for it.

If you don't have an Azure Digital Twins instance, set one up using the instructions in this section.

Use the following command to create a new instance with a system-managed identity. The command uses three local variables ($dtname, $resourcegroup, and $location) that were created earlier in Set up local variables for CLI session.

az dt create --dt-name $dtname --resource-group $resourcegroup --location $location --assign-identity
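Whether the instance is new or already existed, you may want to confirm that its managed identity is enabled. This is a sketch that assumes az dt show returns the identity under an identity.type field with a value containing SystemAssigned; the function name is an illustrative helper.

```bash
# Return 0 when the instance reports a system-assigned managed identity.
# The identity type can also be a combined value, so a substring match is used.
has_system_identity() {
  local instance="$1" identity_type
  identity_type="$(az dt show --dt-name "$instance" --query identity.type --output tsv)"
  [[ "$identity_type" == *SystemAssigned* ]]
}
```

For example, has_system_identity "$dtname" && echo "Managed identity is enabled" confirms the setting for the instance named in your local variable.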

Next, use the following command to grant yourself the Azure Digital Twins Data Owner role on the instance. The command has one placeholder, <owneruser@microsoft.com>, that you should replace with your own Azure account information, and uses a local variable ($dtname) that was created earlier in Set up local variables for CLI session.

az dt role-assignment create --dt-name $dtname --assignee "<owneruser@microsoft.com>" --role "Azure Digital Twins Data Owner"

Note

It may take up to five minutes for this RBAC change to apply.
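To see whether the role assignment is visible yet, you can list the assignments on the instance. The following sketch uses az dt role-assignment list with its --role filter and counts the results; the function names are hypothetical, and the count only shows that some principal holds the role, not specifically your account.

```bash
# Print how many assignments of the given role exist on the instance.
count_role_assignments() {
  local instance="$1" role="$2"
  az dt role-assignment list --dt-name "$instance" --role "$role" \
    --query "length(@)" --output tsv
}

# Return 0 when at least one assignment of the role exists.
has_role_assignment() {
  local count
  count="$(count_role_assignments "$1" "$2")"
  [ "${count:-0}" -gt 0 ]
}
```

For example, has_role_assignment "$dtname" "Azure Digital Twins Data Owner" succeeds once the assignment created above is listed.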

Create an Event Hubs namespace and event hub

The next step is to create an Event Hubs namespace and an event hub. This hub will receive digital twin property update notifications from the Azure Digital Twins instance and then forward the messages to the target Azure Data Explorer cluster.

As part of the data history connection setup later, you'll grant the Azure Digital Twins instance the Azure Event Hubs Data Owner role on the event hub resource.

For more information about Event Hubs and their capabilities, see the Event Hubs documentation.

Use the following CLI commands to create the required resources. The commands use several local variables ($location, $resourcegroup, $eventhubnamespace, and $eventhub) that were created earlier in Set up local variables for CLI session.

Create an Event Hubs namespace:

az eventhubs namespace create --name $eventhubnamespace --resource-group $resourcegroup --location $location

Create an event hub in your namespace:

az eventhubs eventhub create --name $eventhub --resource-group $resourcegroup --namespace-name $eventhubnamespace
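Before moving on, you can optionally confirm that the event hub exists and is usable. This sketch assumes the status field returned by az eventhubs eventhub show reads Active for a healthy event hub; the function names are illustrative.

```bash
# Print the status of an event hub (arguments: resource group, namespace, event hub).
event_hub_status() {
  local resource_group="$1" namespace="$2" hub="$3"
  az eventhubs eventhub show --name "$hub" --namespace-name "$namespace" \
    --resource-group "$resource_group" --query status --output tsv
}

# Return 0 when the event hub reports the "Active" status.
event_hub_is_active() {
  [ "$(event_hub_status "$@")" = "Active" ]
}
```

For example, event_hub_is_active "$resourcegroup" "$eventhubnamespace" "$eventhub" succeeds when the event hub created above is ready.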

Create a Kusto (Azure Data Explorer) cluster and database

Next, create a Kusto (Azure Data Explorer) cluster and database to receive the data from Azure Digital Twins.

As part of the data history connection setup later, you'll grant the Azure Digital Twins instance the Contributor role on at least the database (it can also be scoped to the cluster), and the Admin role on the database.

Use the following CLI commands to create the required resources. The commands use several local variables ($location, $resourcegroup, $clustername, and $databasename) that were created earlier in Set up local variables for CLI session.

Start by adding the Kusto extension to your CLI session, if you don't have it already.

az extension add --name kusto

Next, create the Kusto cluster. The command below requires 5-10 minutes to execute, and will create an E2a v4 cluster in the developer tier. This type of cluster has a single node for the engine and data-management cluster, and is applicable for development and test scenarios. For more information about the tiers in Azure Data Explorer and how to select the right options for your production workload, see Select the correct compute SKU for your Azure Data Explorer cluster and Azure Data Explorer Pricing.

az kusto cluster create --cluster-name $clustername --sku name="Dev(No SLA)_Standard_E2a_v4" tier="Basic" --resource-group $resourcegroup --location $location --type SystemAssigned
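Because cluster creation takes several minutes, a script may need to wait until the cluster is usable before creating the database. The sketch below polls az kusto cluster show and assumes that the state field reads Running once the cluster is ready. The function names, attempt count, and delay are assumptions made for illustration.

```bash
# Print the current state of a Kusto cluster (arguments: cluster name, resource group).
cluster_state() {
  az kusto cluster show --cluster-name "$1" --resource-group "$2" \
    --query state --output tsv
}

# Poll until the cluster reports "Running", or give up after a number of attempts.
# The defaults (40 attempts, 30 seconds apart) are arbitrary values for this sketch.
wait_for_cluster() {
  local cluster="$1" resource_group="$2" attempts="${3:-40}" delay="${4:-30}" state i
  for ((i = 1; i <= attempts; i++)); do
    state="$(cluster_state "$cluster" "$resource_group")"
    if [ "$state" = "Running" ]; then
      echo "Cluster $cluster is running."
      return 0
    fi
    echo "Attempt $i: cluster state is '$state'; waiting ${delay}s..."
    sleep "$delay"
  done
  echo "Cluster $cluster did not reach the Running state." >&2
  return 1
}
```

For example, wait_for_cluster "$clustername" "$resourcegroup" blocks until the cluster is ready or the attempts run out.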

Create a database in your new Kusto cluster (using the cluster name from above and in the same location). This database will be used to store contextualized Azure Digital Twins data. The command below creates a database with a soft delete period of 365 days, and a hot cache period of 31 days. For more information about the options available for this command, see az kusto database create.

az kusto database create --cluster-name $clustername --database-name $databasename --resource-group $resourcegroup --read-write-database soft-delete-period=P365D hot-cache-period=P31D location=$location

Set up data history connection

Now that you've created the required resources, use the command below to create a data history connection between the Azure Digital Twins instance, the event hub, and the Azure Data Explorer cluster.

By default, the command assumes that all resources are in the same resource group as the Azure Digital Twins instance. To specify resources in other resource groups, use the parameter options for this command, which you can display by running az dt data-history connection create adx -h. The command uses several local variables ($connectionname, $dtname, $clustername, $databasename, $eventhub, and $eventhubnamespace) that were created earlier in Set up local variables for CLI session.

az dt data-history connection create adx --cn $connectionname --dt-name $dtname --adx-cluster-name $clustername --adx-database-name $databasename --eventhub $eventhub --eventhub-namespace $eventhubnamespace

When you run the above command, you'll be given the option of having the permissions required for the data history connection assigned on your behalf (if you've already assigned them, you can skip these prompts). These permissions are granted to the managed identity of your Azure Digital Twins instance. The minimum required roles are:

  • Azure Event Hubs Data Owner on the event hub
  • Contributor scoped at least to the specified database (it can also be scoped to the cluster)
  • Database principal assignment with role Admin (for table creation / management) scoped to the specified database

For regular data plane operation, these roles can be reduced to a single Azure Event Hubs Data Sender role, if desired.

Note

If you encounter the error "Could not create Azure Digital Twins instance connection. Unable to create table and mapping rule in database. Check your permissions for the Azure Database Explorer and run az login to refresh your credentials," resolve the error by adding yourself as an AllDatabasesAdmin under Permissions in your Azure Data Explorer cluster.

If you're using the Cloud Shell and encounter the error "Failed to connect to MSI. Please make sure MSI is configured correctly," try running the command with a local Azure CLI installation instead.
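Once the create command finishes, you can inspect the connection to confirm that it was provisioned. This is a sketch using az dt data-history connection show; the properties.provisioningState path is an assumption about the shape of the command's JSON output, and the function name is an illustrative helper.

```bash
# Print the provisioning state of a data history connection
# (arguments: instance name, connection name).
connection_state() {
  local instance="$1" connection="$2"
  az dt data-history connection show --dt-name "$instance" --cn "$connection" \
    --query properties.provisioningState --output tsv
}
```

For example, connection_state "$dtname" "$connectionname" prints the state of the connection created above. If the query path doesn't match your CLI version's output, run the show command without --query to see the full JSON.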

After setting up the data history connection, you can optionally remove the roles granted to your Azure Digital Twins instance for accessing the Event Hubs and Azure Data Explorer resources. In order to use data history, the only role the instance needs going forward is Azure Event Hubs Data Sender (or a higher role that includes these permissions, such as Azure Event Hubs Data Owner) on the Event Hubs resource.
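To decide which roles to remove, it helps to see what the instance's managed identity currently holds. The sketch below looks up the identity's principal ID with az dt show and lists its Azure role assignments with az role assignment list; the function names are hypothetical helpers.

```bash
# Print the object ID of the instance's system-assigned managed identity.
instance_principal_id() {
  az dt show --dt-name "$1" --query identity.principalId --output tsv
}

# List every Azure role assignment held by that identity, with its scope.
list_instance_roles() {
  local principal_id
  principal_id="$(instance_principal_id "$1")" || return 1
  [ -n "$principal_id" ] || { echo "No managed identity found on $1." >&2; return 1; }
  az role assignment list --assignee "$principal_id" --all \
    --query "[].{role:roleDefinitionName, scope:scope}" --output table
}
```

For example, list_instance_roles "$dtname" shows the Event Hubs and Contributor roles granted during setup. The database Admin principal assignment is an Azure Data Explorer permission rather than an Azure role, so it doesn't appear in this list.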

Note

Once the connection is set up, the default settings on your Azure Data Explorer cluster will result in an ingestion latency of approximately 10 minutes or less. You can reduce this latency by enabling streaming ingestion (less than 10 seconds of latency) or an ingestion batching policy. For more information about Azure Data Explorer ingestion latency, see End-to-end ingestion latency.

Verify with a sample twin graph

Now that your data history connection is set up, you can test it with data from your digital twins.

If you already have twins in your Azure Digital Twins instance that are receiving property updates, you can skip this section and visualize the results using your own resources.

Otherwise, continue through this section to set up a sample graph containing twins that receive property updates.

You can set up a sample graph for this scenario using the Azure Digital Twins Data Simulator. The Azure Digital Twins Data Simulator continuously pushes property updates to several twins in an Azure Digital Twins instance.

Create a sample graph

You can use the Azure Digital Twins Data Simulator to provision a sample twin graph and push property updates to it. The twin graph created here models pasteurization processes for a dairy company.

Start by opening the Azure Digital Twins Data Simulator in your browser. Set these fields:

  • Instance URL: Enter the host name of your Azure Digital Twins instance. The host name can be found in the portal page for your instance, and has a format like <Azure-Digital-Twins-instance-name>.api.<region-code>.digitaltwins.azure.net.
  • Simulation Type: Select Dairy facility from the dropdown menu.
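As an alternative to looking up the host name in the portal, you can read it from the CLI. This is a minimal sketch, assuming az dt show exposes the value in a hostName field; the function names are illustrative, and the https:// form is the one expected later by the Kusto query's <ADT-instance> placeholder.

```bash
# Print the host name of an Azure Digital Twins instance.
instance_host_name() {
  az dt show --dt-name "$1" --query hostName --output tsv
}

# Print the instance URL in the format https://<instance-host-name>.
instance_endpoint() {
  local host
  host="$(instance_host_name "$1")" || return 1
  [ -n "$host" ] || return 1
  echo "https://$host"
}
```

For example, instance_host_name "$dtname" prints the value for the Instance URL field above.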

Select Generate Environment.

Screenshot of the Azure Digital Twins Data Simulator.

You'll see confirmation messages on the screen as models, twins, and relationships are created in your environment.

When the simulation is ready, the Start simulation button will become enabled. Select Start simulation to push simulated data to your Azure Digital Twins instance. To continuously update the twins in your Azure Digital Twins instance, keep this browser window in the foreground on your desktop (and complete other browser actions in a separate window).

To verify that data is flowing through the data history pipeline, navigate to the Azure portal and open the Event Hubs namespace resource you created. You should see charts showing the flow of messages into and out of the namespace, indicating the flow of incoming messages from Azure Digital Twins and outgoing messages to Azure Data Explorer.
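The same message counts shown in the portal charts can also be read from the CLI. The sketch below uses az monitor metrics list and assumes the Event Hubs metric names IncomingMessages and OutgoingMessages; the function name, the one-hour interval, and the Total aggregation are illustrative choices.

```bash
# Print hourly incoming and outgoing message totals for an Event Hubs namespace
# (arguments: resource group, namespace name).
namespace_message_totals() {
  local resource_group="$1" namespace="$2" resource_id
  # Look up the full resource ID of the namespace, which the metrics command requires.
  resource_id="$(az eventhubs namespace show --name "$namespace" \
    --resource-group "$resource_group" --query id --output tsv)" || return 1
  az monitor metrics list --resource "$resource_id" \
    --metric IncomingMessages OutgoingMessages \
    --interval PT1H --aggregation Total --output table
}
```

For example, namespace_message_totals "$resourcegroup" "$eventhubnamespace" should show non-zero totals once the simulation has been running for a few minutes.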

Screenshot of the Azure portal showing an Event Hubs namespace for the simulated environment.

View the historized twin updates in Azure Data Explorer

In this section, you'll view the historized twin updates being stored in Azure Data Explorer.

Start in the Azure portal and navigate to the Azure Data Explorer cluster you created earlier. Choose the Databases pane from the left menu to open the database view. Find the database you created for this article and select the checkbox next to it, then select Query.

Screenshot of the Azure portal showing a database in an Azure Data Explorer cluster.

Next, expand the cluster and database in the left pane to see the name of the table. You'll use this name to run queries on the table.

Screenshot of the Azure portal showing the query view for the database. The name of the data history table is highlighted.

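Copy the command below. The command sets the table's ingestion batching policy so that a batch is sealed and ingested after at most 10 seconds, or sooner if 500 items or 1,024 MB of raw data accumulate first.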

.alter table <table_name> policy ingestionbatching @'{"MaximumBatchingTimeSpan":"00:00:10", "MaximumNumberOfItems": 500, "MaximumRawDataSizeMB": 1024}'

Paste the command into the query window, replacing the <table_name> placeholder with the name of your table. Select the Run button.

Screenshot of the Azure portal showing the query view for the database. The Run button is highlighted.

Next, add the following command to the query window, and run it again to verify that Azure Data Explorer has ingested twin updates into the table.

Note

It may take up to 5 minutes for the first batch of ingested data to appear.

<table_name>
| count

In the results, the count of items in the table should be greater than 0.

You can also add and run the following command to view 100 records in the table:

<table_name>
| limit 100

Next, run a query based on the data of your twins to see the contextualized time series data.

Use the query below to chart the outflow of all salt machine twins in the Oslo dairy. This Kusto query uses the Azure Digital Twins plugin to select the twins of interest, joins those twins against the data history time series in Azure Data Explorer, and then charts the results. Make sure to replace the <ADT-instance> placeholder with the URL of your instance, in the format https://<instance-host-name>.

let ADTendpoint = "<ADT-instance>";
let ADTquery = ```SELECT SALT_MACHINE.$dtId as tid
FROM DIGITALTWINS FACTORY 
JOIN SALT_MACHINE RELATED FACTORY.contains 
WHERE FACTORY.$dtId = 'OsloFactory'
AND IS_OF_MODEL(SALT_MACHINE , 'dtmi:assetGen:SaltMachine;1')```;
evaluate azure_digital_twins_query_request(ADTendpoint, ADTquery)
| extend Id = tostring(tid)
| join kind=inner (<table_name>) on Id
| extend val_double = todouble(Value)
| where Key == "OutFlow"
| render timechart with (ycolumns = val_double)

The results should show the outflow numbers changing over time.

Screenshot of the Azure portal showing the query view for the database. The result for the example query is a line graph showing changing values over time for the salt machine outflows.

Next steps

To keep exploring the dairy scenario, you can view more sample queries on GitHub that show how you can monitor the performance of the dairy operation based on machine type, factory, maintenance technician, and various combinations of these parameters.

To build Grafana dashboards that visualize the performance of the dairy operation, read Creating dashboards with Azure Digital Twins, Azure Data Explorer, and Grafana.

For more information on using the Azure Digital Twins query plugin for Azure Data Explorer, see Querying with the Azure Data Explorer plugin and this blog post.