question

urweiss asked ThomasVandenBossche-1492 commented

ASB + Application Insights - excessive dependency logging

Originally asked in the Azure Service Bus GitHub repo (link).

Description


I have a .NET Core 3.1 Worker Service listening for messages from multiple ASB topics/subscriptions. The service is configured to push telemetry to an Application Insights instance.

There are no messages being sent through the topic => the service just sits idle and waits for messages.
The ASB SDK logs a heartbeat-like Application Insights dependency every 60s for every subscription.

The problem is that, if the SubscriptionClient is configured with MaxConcurrentCalls > 1, the SDK will log just as many (= MaxConcurrentCalls) such dependencies every 60s.

If your service (as mine does in the real scenario that prompted this) listens to 5 subscriptions with 20 MaxConcurrentCalls you will get: 5 subscriptions x 20 MaxConcurrentCalls x 60min x 24h = 144000 individual dependency log entries / day / service instance (with the service just sitting idly by).
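
The arithmetic above can be sketched as a small helper. This is only an illustration: `HeartbeatVolume` and `ItemsPerDay` are hypothetical names, and the sketch assumes exactly one telemetry item per concurrent call per 60 s interval, as described in the question.

```csharp
public static class HeartbeatVolume
{
    // Idle telemetry items logged per day by one service instance,
    // assuming one item per concurrent call per interval (60 s in the question).
    public static long ItemsPerDay(int subscriptions, int maxConcurrentCalls, int intervalSeconds = 60)
    {
        long intervalsPerDay = 24L * 60 * 60 / intervalSeconds; // 1440 for a 60 s interval
        return (long)subscriptions * maxConcurrentCalls * intervalsPerDay;
    }
}
```

`HeartbeatVolume.ItemsPerDay(5, 20)` returns 144000, matching the figure above. With one entry per subscription instead (the expected behavior), `ItemsPerDay(5, 1)` would give 7200.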

As Application Insights bills by the GB of data ingested, you can see how this could become a problem.
Another issue is that these completely swamp out other "real" dependencies logged in the same interval.

Is this behavior by design? If so, what is the reasoning behind it? Can it be configured / filtered out / reduced somehow?

I'm using Application Insights to get end-to-end observability over a workflow that spans (via ASB) multiple services. I want to retain the ASB dependencies logged as part of those workflows. How can I do that while still reducing the amount of heartbeat noise?

--- Repro available here - it uses the latest ASB and Application Insights SDKs

Actual Behavior

  1. Every 60s, for each subscription it listens to, the ASB SDK logs exactly as many Application Insights dependency entries as the subscription's MaxConcurrentCalls

Expected Behavior

  1. just one dependency log entry per subscription should be enough



Later edit: it seems that Microsoft.Azure.ServiceBus v5.1.3 is not the latest ASB SDK; Azure.Messaging.ServiceBus v7.2.0 is.

Still, the behavior remains with the new SDK, except that the log entries are now of type request instead of dependency. The number of log items is the same as with the old SDK.

azure-monitor, azure-service-bus

Hello @urweiss

I'm taking a look at this to see what is happening. Thank you for including sample code for easy repro of the issue. A custom initializer or processor would work to set a sampling percentage or a filter on this specific data, but I also want to check whether this is expected behavior and see what other options there might be for you.



Thanks for the reply.

The sample repo has custom processors in both cases. I do not want to apply across-the-board sampling, because I would lose the end-to-end observability for my system.

In any case, as far as I can tell, only the old SDK (Microsoft.Azure.ServiceBus) provides some discriminator between the dependencies logged by the heartbeat and those logged when an actual message arrives (see CustomTelemetryProcessor.cs).

For Azure.Messaging.ServiceBus I was not able to find a way to discriminate between the two.

Later edit: updated the sample to use the latest Azure.Messaging.ServiceBus (v7.2.0), with the same result.





1 Answer

SamaraSoucy-MSFT answered ThomasVandenBossche-1492 commented

I think that sampling is going to be the best way to handle this scenario. I noticed you have adaptive sampling turned off, but even with it on at the default 5 items per second you may not see an effect: the percentage retained is based on a rolling average, not on the number of items in a specific second.

What I'm suggesting is to set a separate sample rate specifically for the heartbeat events. Any exceptions would still be logged if the connection is lost, and you can do this without affecting the sampling rate of any other telemetry items. App Insights will also adjust for sampling when calculating metrics: even if you only log one specific request, the metrics would still show that 20 were made in total.

Using the new SDK, I added this telemetry initializer to the code:

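     using Microsoft.ApplicationInsights.Channel;        // ITelemetry
     using Microsoft.ApplicationInsights.DataContracts;  // RequestTelemetry, ISupportSampling
     using Microsoft.ApplicationInsights.Extensibility;  // ITelemetryInitializer

     public class MyTelemetryInitializer : ITelemetryInitializer
     {
         public void Initialize(ITelemetry telemetry)
         {
             // Only touch the Service Bus receive requests; all other telemetry is left as-is.
             if (telemetry is RequestTelemetry rt && rt.Name == "ServiceBusReceiver.Receive")
             {
                 // Mark these items with a 5% sampling rate.
                 ((ISupportSampling)telemetry).SamplingPercentage = 5;
             }
         }
     }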

and registered it in the Program.cs file:

 services.AddSingleton<Microsoft.ApplicationInsights.Extensibility.ITelemetryInitializer, MyTelemetryInitializer>();

With that in place, I get the reduced traffic in the Request channel while keeping everything else.

[Image: 112176-sampling-percentage.png]

The KQL to check sampling rates in App Insights is:

 union requests,dependencies,pageViews,browserTimings,exceptions,traces
 | where timestamp > ago(1d)
 | summarize RetainedPercentage = 100/avg(itemCount) by bin(timestamp, 1h), itemType
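
To make the query's `100/avg(itemCount)` expression concrete, here is a minimal sketch of the same arithmetic. It assumes that each retained record carries `itemCount = 100 / SamplingPercentage`, which is how sampled items are represented in Application Insights; `SamplingMath` and its methods are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SamplingMath
{
    // Number of original items that one retained record stands for.
    public static double ItemCount(double samplingPercentage)
    {
        if (samplingPercentage <= 0 || samplingPercentage > 100)
            throw new ArgumentOutOfRangeException(nameof(samplingPercentage));
        return 100.0 / samplingPercentage;
    }

    // Mirrors the query's 100/avg(itemCount) for the records in one time bin.
    public static double RetainedPercentage(IEnumerable<double> itemCounts)
        => 100.0 / itemCounts.Average();
}
```

With the 5% rate from the initializer, `SamplingMath.ItemCount(5)` is 20, so one stored request represents 20 original ones, and a bin containing only such records reports a `RetainedPercentage` of 5.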





I'll give it a try, but:

With the new SDK, when receiving an actual message, the SDK also logs a ServiceBusReceiver.Receive.
Would the sampling not also be applied to these items? If yes, how would this impact correlated traces from multiple services?

As mentioned in the initial post, my real-life scenario has multiple .NET Core services communicating via Service Bus, with all the traces correlated.

[Image: 112450-image.png]


Thanks for your support!

Razvan



Regardless of the type of sampling involved or what the sampling rate is, App Insights will always do its best to keep correlated events together, so that any data that is kept also preserves the end-to-end transactions. If you are keeping 100% of the message logs, it should keep 100% of the correlated Receive requests and fewer of the heartbeat events. This principle should apply even if it means keeping a higher percentage than what you have set for the request items.


I've tried the same approach, but I now receive the following warning on startup:

"AI: A Metric Extractor detected a telemetry item with SamplingPercentage < 100. Metrics Extractors should be used before Sampling Processors or any other Telemetry Processors that might filter out Telemetry Items. Otherwise, extracted metrics may be incorrect."

I registered the telemetry initializer as follows:

services.AddSingleton<ITelemetryInitializer, MyTelemetryInitializer>();
services.AddApplicationInsightsTelemetryWorkerService();

It's a console application running the ServiceBusProcessor in a Hosted Service.
Any ideas?

