RobinHeller-6437 asked ·

App Insights: Metrics Outage (no-report) alerting rule

I use App Insights metrics to report the delay between an application event and its processing. The processing time is, by definition, >1s since it uses a cron-based scheduler. I've written a bash script to report the time difference to Azure App Insights. This is working fine so far. Now I've configured two alerts:

  • avg(time difference) of the last 5 minutes > 120

  • avg(time difference) of the last 5 minutes <= 1

The first alert is pretty obvious: catch an instance where my application is not processing the event correctly at all.
The second alert might need some more explanation: I want to catch the case in which my bash script is not reporting any data at all (e.g. system downtime, complete application crash…). In theory, the average value would drop to 0 within 5 minutes after the application crash, thus triggering the alert and sending me an email.

This is not working at all: I can kill my bash script that transfers the data to the custom metrics API and not receive an alert at all (yes, I've waited the 5 minutes). If I manually (from my bash script, that is) report values of 0 for the time difference, the alert fires correctly. If I then change the script to report a value > 0, the alert is deactivated properly as well. I have also tested this with avg(td) < 0 (which is my preferred way of doing it), but that doesn't work either. Is this expected / documented behavior? It really doesn't make a whole lot of sense to me. Is there a better way to alert on this "non-reporting" of certain metrics?
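One possible reading of the behavior described above, sketched here purely as an illustration: if the alert averages only the samples that actually arrived in the 5-minute window, then a window with no samples has no average at all rather than an average of 0, so a comparison such as `<= 1` never becomes true. The function names below are hypothetical and this is not Azure's implementation; it only models that assumption.

```python
from typing import Optional, Sequence


def window_average(samples: Sequence[float]) -> Optional[float]:
    """Average of the samples in a window, or None when the window is empty.

    Assumption: missing data is treated as "no value", not as 0.
    """
    if not samples:
        return None
    return sum(samples) / len(samples)


def low_alert_fires(samples: Sequence[float], threshold: float = 1.0) -> bool:
    """Model of the 'avg <= threshold' rule: it can only fire when an average exists."""
    avg = window_average(samples)
    return avg is not None and avg <= threshold


# Script killed: no samples arrive, so there is no average to compare.
print(low_alert_fires([]))
# Script explicitly reports zeros: the average exists and is 0.
print(low_alert_fires([0.0, 0.0, 0.0]))
```

Under this assumption the model matches the observations in the question: explicitly reported zeros trigger the rule, while an absence of reports does not.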

azure-monitor

1 Answer

kobulloc-MSFT answered ·

There are a couple of features in Application Insights to be aware of if you are looking for a 1:1 mapping of activity and result rather than a statistically relevant overview. Sampling in Application Insights is one of the first things I would look at if you are not seeing specific events that you expect. You should also be aware of a 5-10 minute delay in the availability of data, although that may not be important in your scenario. I would also take a quick look at similar services, such as Stream Analytics, to see if they are more in line with your goals for this project.
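The 5-10 minute data delay mentioned above matters for any check that treats silence as a failure. As a rough sketch only, a "no report" check could compare the timestamp of the last received value against the current time and allow for that delay before declaring an outage. The function name, parameters, and default durations here are assumptions for illustration, not part of any Application Insights API.

```python
from datetime import datetime, timedelta, timezone


def is_stale(
    last_report: datetime,
    now: datetime,
    max_silence: timedelta = timedelta(minutes=5),       # assumed reporting window
    ingestion_delay: timedelta = timedelta(minutes=10),  # assumed worst-case data delay
) -> bool:
    """Return True when no value has arrived for longer than the allowed silence.

    The ingestion delay is added as a grace period so that data which is merely
    late is not mistaken for data that was never sent.
    """
    return now - last_report > max_silence + ingestion_delay


now = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
# Last value seen 8 minutes ago: still within the grace period.
print(is_stale(now - timedelta(minutes=8), now))
# Last value seen 20 minutes ago: treated as an outage.
print(is_stale(now - timedelta(minutes=20), now))
```

Checking the age of the last report sidesteps the question of what an average over an empty window evaluates to.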

