Azure Stream Analytics job has watermark delay consistently over 1 hour

Anand Pangare (Tata Consultancy Services) 1 Reputation point Microsoft Vendor
2020-10-14T19:50:11.527+00:00

Hi,

We are consistently seeing a watermark delay in one of our production Stream Analytics jobs, even though SU% utilization appears to stay below 10%. As a result, our NRT pipeline has a latency of up to 4 hours. The Stream Analytics job reads from an Event Hub with 100 partitions and writes events to an ADLS Cosmos endpoint. The query is also partitioned by PartitionId.

SELECT
    *
INTO
    ADLS
FROM
    EH
    PARTITION BY PartitionId

The Stream Analytics job receives almost 9.7 TB of data every day and has a very large backlog of input events. The streaming units are already set on the high side, at 186 SUs. Could you please let us know how we can resolve this issue?
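For a rough sense of scale, the figures above can be turned into sustained throughput numbers. This is only a back-of-the-envelope sketch: it assumes decimal terabytes (1 TB = 10^12 bytes), a load spread evenly over the day, and events distributed evenly across the 100 partitions, none of which is stated in the question.

```python
# Back-of-the-envelope throughput from the figures quoted in the question.
# Assumptions (not from the question): decimal TB, even load over the day,
# even distribution of events across partitions.

SECONDS_PER_DAY = 24 * 60 * 60


def throughput_mb_per_s(tb_per_day: float) -> float:
    """Convert a daily volume in TB to a sustained rate in MB/s."""
    bytes_per_day = tb_per_day * 1e12
    return bytes_per_day / SECONDS_PER_DAY / 1e6


total = throughput_mb_per_s(9.7)   # whole job
per_partition = total / 100        # 100 Event Hub partitions
per_su = total / 186               # 186 streaming units

print(f"total:         {total:.2f} MB/s")
print(f"per partition: {per_partition:.3f} MB/s")
print(f"per SU:        {per_su:.3f} MB/s")
```

Under those assumptions the job has to sustain roughly 112 MB/s overall, or a little over 1 MB/s per partition. Real traffic is rarely this even, so peak rates may be considerably higher.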

Azure Stream Analytics

1 answer

  1. JS Azure 76 Reputation points
    2020-10-20T22:56:22.343+00:00

    Hi,
    I see that your query uses PARTITION BY on the input and that you write to ADLS (which supports partitioned output), so you are on the right track to scale out.
    What can be confusing about the SU% metric is that it only reflects memory usage. If the job is CPU-bound, you may not see it from this metric alone. A good indication of a CPU limitation is a growing number of backlogged events.
    In this case I would recommend raising the number of SUs to see whether the job catches up.
    Let me know if it works, happy to investigate this further.
    Thanks,
    JS
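One way to check whether the backlog is growing or the job is catching up is to look at the trend of the backlogged input events metric over time. The sketch below is a minimal illustration, assuming you have already exported the metric as a list of samples taken at regular intervals; the function names and the sample values are hypothetical and do not come from any Azure SDK or from this job.

```python
# Minimal sketch: estimate the trend of a backlog metric from evenly spaced
# samples using a least-squares slope. All names and numbers here are
# hypothetical; nothing below calls an Azure API.

from typing import Sequence


def backlog_slope(samples: Sequence[float]) -> float:
    """Return the least-squares slope in events per sampling interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def is_backlog_growing(samples: Sequence[float]) -> bool:
    """A positive slope suggests the job is falling further behind."""
    return backlog_slope(samples) > 0


# Hypothetical samples, one per minute.
before_scaling = [1000, 1400, 1900, 2300]
after_scaling = [2300, 1800, 1200, 700]

print(is_backlog_growing(before_scaling))
print(is_backlog_growing(after_scaling))
```

Comparing the slope before and after a change in the number of SUs gives a simple signal of whether the extra capacity is letting the job catch up.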