Streaming ingestion (Preview)

Streaming ingestion targets scenarios that require low latency, with an ingestion time of less than 10 seconds, for data of varied volume. It optimizes operational processing of many tables, in one or more databases, where the stream of data into each table is relatively small (a few records per second) but the overall ingestion volume is high (thousands of records per second).

Use classic (bulk) ingestion instead of streaming ingestion when the amount of data grows to more than 1 MB per second per table. To learn more about the various ingestion methods, see Data ingestion overview.
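The 1 MB per second per table guidance can be expressed as a simple check. The following is a minimal sketch, not part of any Azure Data Explorer library: the function name `choose_ingestion_method` and the constant are hypothetical, and it assumes 1 MB means 1,048,576 bytes.

```python
# Threshold from the guidance above: about 1 MB per second per table.
# Assumption: 1 MB is treated as 1,048,576 bytes.
STREAMING_MAX_BYTES_PER_SECOND = 1 * 1024 * 1024


def choose_ingestion_method(bytes_per_second_per_table: float) -> str:
    """Return "streaming" or "bulk" for one table's sustained data rate."""
    if bytes_per_second_per_table < 0:
        raise ValueError("data rate must be non-negative")
    if bytes_per_second_per_table > STREAMING_MAX_BYTES_PER_SECOND:
        return "bulk"
    return "streaming"
```

For example, a table that receives roughly 200 KB per second stays on streaming ingestion, while a table that receives 5 MB per second would move to bulk ingestion.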

Note

Streaming ingestion doesn't support the following features:

Prerequisites

Enable streaming ingestion on your cluster

  1. In the Azure portal, go to your Azure Data Explorer cluster. In Settings, select Configurations.

  2. In the Configurations pane, select On to enable Streaming ingestion.

  3. Select Save.


  4. In the Web UI, define a streaming ingestion policy on the tables or databases that will receive streaming data.

    Note

    • If the policy is defined at the database level, all tables in the database are enabled for streaming ingestion.
    • The applied policy can reference only newly ingested data and not other tables in the database.
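As an illustration of step 4, the policy is typically defined with a management command run in the Web UI. The sketch below only builds the command text. It assumes the `.alter <table|database> <name> policy streamingingestion enable` form, which should be verified against the current management command reference. The helper name `enable_streaming_policy_command` is hypothetical, and the entity names are assumed to need no quoting.

```python
def enable_streaming_policy_command(entity_kind: str, entity_name: str) -> str:
    """Build the text of a command that enables the streaming ingestion policy.

    Assumption: the command follows the form
    `.alter <table|database> <name> policy streamingingestion enable`.
    Names containing special characters would need quoting, which this
    sketch does not handle.
    """
    if entity_kind not in ("table", "database"):
        raise ValueError("entity_kind must be 'table' or 'database'")
    if not entity_name:
        raise ValueError("entity_name must not be empty")
    return f".alter {entity_kind} {entity_name} policy streamingingestion enable"


# Hypothetical names: a policy on a database covers all of its tables.
print(enable_streaming_policy_command("database", "Telemetry"))
print(enable_streaming_policy_command("table", "Events"))
```

Setting the policy at the database level avoids issuing one command per table, at the cost of enabling streaming ingestion for tables that may not need it.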

Use streaming ingestion to ingest data to your cluster

There are two supported streaming ingestion types:

  • Event Hub, used as a data source.
  • Custom ingestion, which requires you to write an application that uses one of the Azure Data Explorer client libraries. See streaming ingestion sample for a sample application.

Choose the appropriate streaming ingestion type

| Criterion | Event Hub | Custom ingestion |
| Data delay between ingestion initiation and the data being available for query | Longer delay | Shorter delay |
| Development overhead | Fast and easy setup, no development overhead | High development overhead for the application to handle errors and ensure data consistency |

Disable streaming ingestion on your cluster

Warning

Disabling streaming ingestion may take a few hours.

  1. Drop the streaming ingestion policy from all relevant tables and databases. Removing the policy triggers movement of streaming ingestion data from the initial storage to permanent storage in the column store (extents or shards). The data movement can take from a few seconds to a few hours, depending on the amount of data in the initial storage and on the CPU and memory utilization of the cluster.

  2. In the Azure portal, go to your Azure Data Explorer cluster. In Settings, select Configurations.

  3. In the Configurations pane, select Off to disable Streaming ingestion.

  4. Select Save.
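Step 1 above can involve many tables and databases, so it may help to generate the commands up front. The sketch below only builds command text. It assumes the `.delete <table|database> <name> policy streamingingestion` form, which should be checked against the current management command reference. The helper name and the entity names are hypothetical.

```python
from typing import Iterable, List


def drop_streaming_policy_commands(
    databases: Iterable[str], tables: Iterable[str]
) -> List[str]:
    """Build commands that drop the streaming ingestion policy.

    Assumption: the command follows the form
    `.delete <table|database> <name> policy streamingingestion`.
    Table-level policies are listed first, then database-level ones.
    """
    commands = [f".delete table {name} policy streamingingestion" for name in tables]
    commands += [
        f".delete database {name} policy streamingingestion" for name in databases
    ]
    return commands


# Hypothetical names: run the printed commands before turning the setting off.
for command in drop_streaming_policy_commands(["Telemetry"], ["Events", "Logs"]):
    print(command)
```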


Limitations

  • Streaming ingestion performance and capacity scale with increased VM and cluster sizes. For a single D14 node, the recommended load is up to 150 requests per second.
  • Currently, only 8-core and 16-core SKUs are supported (D13, D14, L8, and L16).
  • The data size limit per ingestion request is 4 MB.
  • Schema updates, such as creation and modification of tables and ingestion mappings, may take up to 5 minutes to take effect in the streaming ingestion service.
  • Enabling streaming ingestion on a cluster, even when data isn't ingested via streaming, uses part of the local SSD disk of the cluster machines for streaming ingestion data and reduces the storage available for hot cache.
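A custom ingestion application needs to respect the 4 MB request limit and the recommended request rate. The following is a rough sketch under stated assumptions: the function names are hypothetical, 4 MB is taken as 4,194,304 bytes, request size is approximated as the sum of record sizes (ignoring separators and headers), and capacity is assumed to scale linearly with the number of D14 nodes, which the limitations above don't guarantee.

```python
import math
from typing import Iterable, Iterator, List

MAX_REQUEST_BYTES = 4 * 1024 * 1024  # 4 MB limit; assumes 1 MB = 1,048,576 bytes
RECOMMENDED_REQUESTS_PER_SECOND_PER_D14_NODE = 150


def split_into_requests(
    records: Iterable[bytes], max_bytes: int = MAX_REQUEST_BYTES
) -> Iterator[List[bytes]]:
    """Group records into batches whose total size stays within max_bytes."""
    batch: List[bytes] = []
    batch_size = 0
    for record in records:
        if len(record) > max_bytes:
            raise ValueError("a single record exceeds the request size limit")
        # Start a new batch when adding this record would exceed the limit.
        if batch and batch_size + len(record) > max_bytes:
            yield batch
            batch, batch_size = [], 0
        batch.append(record)
        batch_size += len(record)
    if batch:
        yield batch


def d14_nodes_needed(requests_per_second: float) -> int:
    """Estimate D14 nodes for a request rate, assuming linear scaling."""
    per_node = RECOMMENDED_REQUESTS_PER_SECOND_PER_D14_NODE
    return max(1, math.ceil(requests_per_second / per_node))
```

Batching small records into fewer, larger requests lowers the request rate per node, but each batch adds latency while it fills, so the batch size is a trade-off against the low-latency goal.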

Next steps