Send data to Azure Data Explorer from an Azure IoT Data Processor Preview pipeline

Important

Azure IoT Operations Preview – enabled by Azure Arc is currently in PREVIEW. You shouldn't use this preview software in production environments.

See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.

Use the Azure Data Explorer destination to write data to a table in Azure Data Explorer from an Azure IoT Data Processor Preview pipeline. The destination stage batches messages before it sends them to Azure Data Explorer.

Prerequisites

To configure and use an Azure Data Explorer destination pipeline stage, you need:

Set up Azure Data Explorer

Before you can write to Azure Data Explorer from a data pipeline, you need to grant access to the database from the pipeline. You can use either a service principal or a managed identity to authenticate the pipeline to the database. The advantage of using a managed identity is that you don't need to manage the lifecycle of the service principal. The managed identity is automatically managed by Azure and is tied to the lifecycle of the resource it's assigned to.

To create a service principal with a client secret:

  1. Use the following Azure CLI command to create a service principal.

    az ad sp create-for-rbac --name <YOUR_SP_NAME> 
    
  2. The output of this command includes an appId, displayName, password, and tenant. Make a note of these values to use when you configure access to your cloud resource such as Microsoft Fabric, create a secret, and configure a pipeline destination:

    {
        "appId": "<app-id>",
        "displayName": "<name>",
        "password": "<client-secret>",
        "tenant": "<tenant-id>"
    }
    

To grant admin access to your Azure Data Explorer database, run the following command in your database query tab:

.add database <DatabaseName> admins ('aadapp=<ApplicationId>;<TenantId>') '<Notes>'

For the destination stage to connect to Azure Data Explorer, it needs access to a secret that contains the authentication details. To create a secret:

  1. Use the following command to add a secret to your Azure Key Vault that contains the client secret you made a note of when you created the service principal:

    az keyvault secret set --vault-name <your-key-vault-name> --name AccessADXSecret --value <client-secret>
    
  2. Add the secret reference to your Kubernetes cluster by following the steps in Manage secrets for your Azure IoT Operations Preview deployment.

Batching

Data Processor writes to Azure Data Explorer in batches. While you batch data in Data Processor before sending it, Azure Data Explorer also applies its own default ingestion batching policy. As a result, you might not see your data in Azure Data Explorer immediately after Data Processor writes it to the Azure Data Explorer destination.

To view data in Azure Data Explorer as soon as the pipeline sends it, you can set the ingestion batching policy count to 1. To edit the ingestion batching policy, run the following command in your database query tab:

.alter database <your-database-name> policy ingestionbatching
```
{
    "MaximumBatchingTimeSpan" : "00:00:30",
    "MaximumNumberOfItems" : 1,
    "MaximumRawDataSizeMB": 1024
}
```
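Both limits apply at once: whichever threshold is reached first triggers ingestion. The following sketch illustrates how count and time limits interact (illustrative Python only; `IngestionBatcher` is not part of any Azure SDK):

```python
import time
from dataclasses import dataclass, field

@dataclass
class IngestionBatcher:
    """Illustrative sketch of count/time batching limits, loosely
    mirroring MaximumNumberOfItems and MaximumBatchingTimeSpan.
    Not an Azure SDK class."""
    max_items: int = 1
    max_seconds: float = 30.0
    _items: list = field(default_factory=list)
    _opened: float = 0.0

    def add(self, item):
        """Add an item; return a batch to ingest if the count limit is hit."""
        if not self._items:
            self._opened = time.monotonic()
        self._items.append(item)
        if len(self._items) >= self.max_items:
            return self._flush()
        return None

    def poll(self):
        """Called periodically; flushes when the time limit expires."""
        if self._items and time.monotonic() - self._opened >= self.max_seconds:
            return self._flush()
        return None

    def _flush(self):
        batch, self._items = self._items, []
        return batch

# With MaximumNumberOfItems set to 1, every message flushes immediately:
batcher = IngestionBatcher(max_items=1)
print(batcher.add({"temperature": 21.5}))  # [{'temperature': 21.5}]
```

With a larger `max_items`, messages accumulate until either limit is reached, which is why data can lag behind the pipeline's writes.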

Configure the destination stage

The Azure Data Explorer destination stage JSON configuration defines the details of the stage. To author the stage, you can either interact with the form-based UI, or provide the JSON configuration on the Advanced tab:

| Field | Type | Description | Required | Default | Example |
|---|---|---|---|---|---|
| Display name | String | A name to show in the Data Processor UI. | Yes | - | Azure IoT MQ output |
| Description | String | A user-friendly description of what the stage does. | No | - | Write to topic default/topic1 |
| Cluster URL | String | The cluster URI. This value isn't the data ingestion URI. | Yes | - | - |
| Database | String | The database name. | Yes | - | - |
| Table | String | The name of the table to write to. | Yes | - | - |
| Batch | Batch | How to batch data. | No | 60s | 10s |
| Retry | Retry | The retry policy to use. | No | default | fixed |
| Authentication¹ | String | The authentication details to connect to Azure Data Explorer: service principal or managed identity. | Yes | Service principal | - |
| Columns > Name | String | The name of the column. | Yes | - | temperature |
| Columns > Path | Path | The location within each record of the data where the value of the column should be read from. | No | .{{name}} | .temperature |

¹Authentication: Currently, the destination stage supports service principal or managed identity authentication when it connects to Azure Data Explorer.

To configure service principal based authentication, provide the following values. You made a note of these values when you created the service principal and added the secret reference to your cluster.

| Field | Description | Required |
|---|---|---|
| TenantId | The tenant ID. | Yes |
| ClientId | The app ID you made a note of when you created the service principal that has access to the database. | Yes |
| Secret | The secret reference you created in your cluster. | Yes |
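As a sketch, these values map onto the stage's `authentication` object as follows. The placeholder values are assumptions; note that `clientSecret` holds the name of the secret reference in your cluster, not the raw client secret:

```python
import json

# Placeholder values -- substitute the ones you noted earlier.
tenant_id = "<tenant-id>"
client_id = "<app-id>"
secret_reference = "AccessADXSecret"  # the secret reference name, not the raw secret

authentication = {
    "type": "servicePrincipal",
    "tenantId": tenant_id,
    "clientId": client_id,
    "clientSecret": secret_reference,
}

print(json.dumps(authentication, indent=4))
```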

Sample configuration

The following JSON example shows a complete Azure Data Explorer destination stage configuration that writes the entire message to the quickstart table in the database:

{
    "displayName": "Azure data explorer - 71c308",
    "type": "output/dataexplorer@v1",
    "viewOptions": {
        "position": {
            "x": 0,
            "y": 784
        }
    },
    "clusterUrl": "https://clusterurl.region.kusto.windows.net",
    "database": "databaseName",
    "table": "quickstart",
    "authentication": {
        "type": "servicePrincipal",
        "tenantId": "tenantId",
        "clientId": "clientId",
        "clientSecret": "secretReference"
    },
    "batch": {
        "time": "5s",
        "path": ".payload"
    },
    "columns": [
        {
            "name": "Timestamp",
            "path": ".Timestamp"
        },
        {
            "name": "AssetName",
            "path": ".assetName"
        },
        {
            "name": "Customer",
            "path": ".Customer"
        },
        {
            "name": "Batch",
            "path": ".Batch"
        },
        {
            "name": "CurrentTemperature",
            "path": ".CurrentTemperature"
        },
        {
            "name": "LastKnownTemperature",
            "path": ".LastKnownTemperature"
        },
        {
            "name": "Pressure",
            "path": ".Pressure"
        },
        {
            "name": "IsSpare",
            "path": ".IsSpare"
        }
    ],
    "retry": {
        "type": "fixed",
        "interval": "20s",
        "maxRetries": 4
    }
}

The configuration defines that:

  • Messages are batched for 5 seconds before they're written.
  • The batch path .payload locates the data for the column values.
  • A fixed retry policy makes up to 4 retries at 20-second intervals.
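The fixed retry policy from the sample configuration can be sketched as follows (illustrative Python, not part of Data Processor; the function and parameter names are assumptions):

```python
import time

def write_with_fixed_retry(write, max_retries=4, interval_seconds=20):
    """Sketch of a 'fixed' retry policy: retry a failed write at a
    constant interval, giving up after max_retries retries."""
    for attempt in range(max_retries + 1):
        try:
            return write()
        except Exception:
            if attempt == max_retries:
                raise
            time.sleep(interval_seconds)

# Example: an operation that fails twice, then succeeds.
attempts = {"n": 0}
def flaky_write():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(write_with_fixed_retry(flaky_write, max_retries=4, interval_seconds=0))  # ok
```

A `default` retry policy would differ only in how the interval between attempts is chosen.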

Example

The following example shows a sample input message to the Azure Data Explorer destination stage:

{
  "payload": {
    "Batch": 102,
    "CurrentTemperature": 7109,
    "Customer": "Contoso",
    "Equipment": "Boiler",
    "IsSpare": true,
    "LastKnownTemperature": 7109,
    "Location": "Seattle",
    "Pressure": 7109,
    "Timestamp": "2023-08-10T00:54:58.6572007Z",
    "assetName": "oven"
  },
  "qos": 0,
  "systemProperties": {
    "partitionId": 0,
    "partitionKey": "quickstart",
    "timestamp": "2023-11-06T23:42:51.004Z"
  },
  "topic": "quickstart"
}
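To see how the batch path and column paths from the sample configuration apply to this message, here's a sketch of the lookup. The `resolve` helper is hypothetical; the real stage uses jq-style path expressions, and this sketch handles only simple dotted field access:

```python
# Sample input message (abbreviated from above).
message = {
    "payload": {
        "Batch": 102,
        "CurrentTemperature": 7109,
        "Timestamp": "2023-08-10T00:54:58.6572007Z",
        "assetName": "oven",
    },
    "topic": "quickstart",
}

def resolve(path: str, record: dict):
    """Hypothetical helper: follow a dotted path such as '.payload'
    or '.assetName' into a nested dict. Plain field access only."""
    value = record
    for part in path.lstrip(".").split("."):
        value = value[part]
    return value

# The batch path from the sample configuration selects the record data.
data = resolve(".payload", message)

# A few of the configured columns, each read relative to the batch path.
columns = [("Timestamp", ".Timestamp"), ("AssetName", ".assetName"), ("Batch", ".Batch")]
row = {name: resolve(path, data) for name, path in columns}
print(row)
# {'Timestamp': '2023-08-10T00:54:58.6572007Z', 'AssetName': 'oven', 'Batch': 102}
```

Each such row becomes one record in the quickstart table, with column names taken from the `name` fields of the configuration.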