MartinJrgensen-7072 asked MartinJrgensen-1297 edited

IoT Hub does not allow near real time messaging

We are using IoT Hub in a scenario where near-real-time communication is required.
We define near real time as less than 2 seconds for a message loop (C2D -> D2C).
We have observed that, even though we run emulated devices on a virtual server located in an Azure data center, some messages still take 10 seconds or more to be delivered. Most messages complete in 400 ms.
What causes this huge difference in delivery time, and what can be changed?
Is IoT Hub usable for any near-real-time communication?
The Microsoft comment "Avoid making any assumptions about the maximum latency of any IoT Hub operation" (https://docs.microsoft.com/en-us/azure/iot-hub/iot-hub-devguide-quotas-throttling#latency) does not sound promising...

Thanks
Martin

44372-image.png


azure-iot-hub azure-iot

Hi Martin,
10 seconds is definitely a lot more than what I'm used to seeing from an IoT Hub, but in the end, it's important to design for a scenario where this might be the case. While IoT Hub is usually very fast to respond, it's a cloud service and latency can vary. The docs have little info about what the latency could be, but it's always possible that you hit temporary delays.

I'm sorry to hear that Azure IoT Edge isn't a suitable solution for you. It might still be worth reaching out to Microsoft's Global Blackbelt team to evaluate your architecture options, as Sander suggested.

And just to be thorough, your VM and IoT Hub are provisioned in what region?

1 Vote 1 ·

IoT Hub and VM are both in "West Europe".
We have logged additional data, and it still shows delays, though not 10-second ones. I would like to understand why almost all IoT method calls complete within 300 ms, while the ones that do not typically take around 2-3 seconds. Why do we not see more widely varying delays?
44657-image.png
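
For anyone wanting to put numbers on a distribution like this, here is a minimal sketch that summarizes logged round-trip times against the 2-second target from the question. The function name `summarize` and the sample values are made up for illustration; they are not the data from this thread.

```python
def summarize(latencies_ms, threshold_ms=2000.0):
    """Return median, 99th percentile (nearest-rank), max, and share of samples over the threshold."""
    ordered = sorted(latencies_ms)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")

    def percentile(p):
        # Nearest-rank method: rank = ceil(p * n / 100), done in integer arithmetic.
        rank = max(1, (p * n + 99) // 100)
        return ordered[rank - 1]

    over = sum(1 for x in ordered if x > threshold_ms)
    return {
        "p50": percentile(50),
        "p99": percentile(99),
        "max": ordered[-1],
        "over_threshold": over / n,
    }


# Hypothetical sample: 97 fast round trips, two at 2.5 s, one at 10 s.
sample_ms = [300.0] * 97 + [2500.0, 2500.0, 10000.0]
stats = summarize(sample_ms)
print(stats)
```

Looking at the tail (p99, max, share over the threshold) rather than the average is what makes the rare slow messages visible.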


0 Votes 0 ·

Hey @SatishBoddu-MSFT, any insights from your side? I think the answer to the question "should you use IoT Hub for near-real-time messaging?" is simply no, but these regular delays are higher than what I've seen before.

0 Votes 0 ·
asergaz (replying to MartinJrgensen-7072):

1/2

@MartinJrgensen-7072 thank you for your great questions and feedback! I wanted to understand better your scenario, please provide more details about:

1) What do MessageSentTicks and MessageResponseTicks mean? Are you measuring the time from sending a D2C or C2D message until you receive confirmation that the message was delivered? How do you validate that a message was delivered, and how do you measure it? Example: "Receive delivery feedback"
2) In the original post you mentioned that you are measuring C2D messages, but later you wrote "why almost all IoT method calls are completed within 300ms, and when not, the time is typically around 2-3 seconds." I want to be sure that we are not talking about Direct Methods, as their behaviour is substantially different from that of C2D messages.




0 Votes 0 ·
asergaz answered MartinJrgensen-1297 edited

Hello @MartinJrgensen-7072 ,
Thank you so much for your time on the Azure Support Ticket. I am now posting our conclusions and would appreciate it if you could verify this as the answer or add any comments for further explanation.

On analyzing the logs you provided more carefully, we found that we never saw two consecutive messages each take more than 1 second to be delivered, which is acceptable behavior. The strategy to overcome the occasional slow message is to send it again if no ack is received after 1 second (or after whatever retry interval you choose based on your own benchmarks). We already document that there are no guarantees on message delivery latency (see "IoT Hub quotas and throttling", section "Latency", on Microsoft Docs), and as expected the appropriate retry period (for example, 1 second without an ack) will vary with many factors, including the device's network connection and the device's processing. Note that a very aggressive retry policy may get you throttled, so there needs to be a balance, and delay expectations should be set clearly in the customer experience design.
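
A minimal sketch of that retry strategy follows. It is an illustration under assumptions, not IoT Hub SDK code: `send_with_retry`, the `send` and `wait_for_ack` callables, and `FakeTransport` are hypothetical stand-ins for whatever client you use. Growing the wait after each attempt is one possible way to keep retries from becoming aggressive enough to be throttled.

```python
def send_with_retry(send, wait_for_ack, ack_timeout_s=1.0, max_attempts=4, backoff=2.0):
    """Send, wait for an ack, and resend on timeout.

    Returns the attempt number that was acknowledged, or None if all attempts timed out.
    `send(attempt)` and `wait_for_ack(timeout_s) -> bool` are supplied by the caller.
    """
    timeout = ack_timeout_s
    for attempt in range(1, max_attempts + 1):
        send(attempt)
        if wait_for_ack(timeout):
            return attempt
        # Wait longer before the next resend so retries stay well below throttling limits.
        timeout *= backoff
    return None


class FakeTransport:
    """Hypothetical stand-in for a real client: the first `drop_first` sends never get an ack."""

    def __init__(self, drop_first):
        self.drop_first = drop_first
        self.sent = []

    def send(self, attempt):
        self.sent.append(attempt)

    def wait_for_ack(self, timeout_s):
        # A real implementation would block for up to timeout_s; this one answers immediately.
        return len(self.sent) > self.drop_first


transport = FakeTransport(drop_first=1)
result = send_with_retry(transport.send, transport.wait_for_ack)
print(result, transport.sent)
```

Because a slow message may still arrive after its resend, the receiving side should tolerate duplicates, for example by carrying a message ID and ignoring IDs it has already handled.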

Thank you so much for these great questions and we hope we have provided you the right tools to proceed with your development.

Remember:
- Please accept an answer if correct. Original posters help the community find answers faster by identifying the correct answer. Here is how.
- Want a reminder to come back and check responses? Here is how to subscribe to a notification.






I accept the answer that you cannot guarantee any message delivery latency. In real life it is possible to observe delays of 10 s or more, and you are right that 99% of the time the delay is very short. The delays we observed did not depend on any network connection outside Microsoft, because all tests were done inside the same Microsoft data center. If delivery time is critical to a solution, IoT Hub is not optimal.

0 Votes 0 ·
asergaz (replying to MartinJrgensen-1297):

@MartinJrgensen-1297 yes, you are correct. IoT Hub does not yet offer a 100% Service Level Agreement (SLA): https://azure.microsoft.com/en-us/support/legal/sla/iot-hub/v1_2/

"For IoT Hub, we promise that at least 99.9% of the time deployed IoT hubs will be able to send messages to and receive messages from registered devices and the Service will be able to perform create, read, update, and delete operations on IoT hubs."

In a 30-day billing period, the SLA is still met with up to about 43 minutes of downtime or delay.
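
The 43-minute figure follows directly from the 99.9% number. A small sketch of the arithmetic (the function name `allowed_downtime_minutes` is made up for illustration):

```python
def allowed_downtime_minutes(days, availability):
    """Minutes in the period during which the service may be unavailable without breaking the SLA."""
    total_minutes = days * 24 * 60
    return total_minutes * (1.0 - availability)


# 30 days is 43,200 minutes; 0.1% of that is roughly 43.2 minutes.
minutes = allowed_downtime_minutes(30, 0.999)
print(round(minutes, 1))
```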

Let me know if you have further questions.
Thank you!

1 Vote 1 ·

I understand.
This question was about "real time messaging".
All the tests show that delivery delays of 10 seconds or more are not unusual.
But most of the time, the delay is less than 200 ms.
In a "real time scenario", like sending a command from an app to a device attached to IoT Hub, a 10-second delay is not always acceptable, even though it happens less than 1% of the time.

Thanks

1 Vote 1 ·
SandervandeVelde42 answered SandervandeVelde42 edited

Well, if the internet connection is lost, the delivery times will be even larger ;-)

My impression is that IoT Hub is built for scalability and reliability. Still, waiting times of 10 seconds seem strange.

Did you look at the partition setting already? Under the covers, the number of parallel processes is set with this setting:

44392-image.png

If you rely on sub-second response times, please consider an Edge solution, where the cloud logic you use to make decisions is put on an edge device. This takes at least the internet component out of your roundtrip.

Update: Check for the nearest region for the IoT Hub (lowest latency)




MartinJrgensen-7072 answered asergaz edited

At the moment the IoT Hub only handles a few messages per second, so it should not be a load issue.
We are using "S1 - Standard" and 4 partitions.
The test setup I have made is using the IoT Hub directly, without use of Azure function triggers etc.
Our complete solution, which involves several Event Hubs, Service Bus instances, etc., shows additional latency.
Often the latency in each "message" component in Azure is about 1 second! At other times it is perhaps 100 ms.
We have observed even higher delays in "message" components, but Microsoft replies that this is simply because we are using a "shared resource" and someone else in the Microsoft infrastructure is putting pressure on the "message" components.
Sub-second latency is definitely not something you can rely on (except perhaps with single-tenant deployments; we have not tested that).

Our solution is based on a customer initiating an operation on a device, like turning on a light. That function is difficult to implement as an edge solution...
And it does make a difference, if the light turns on in 1 second or 10 seconds.


Also check out if you are actually running on the 'nearest' IoTHub...

Why would Edge not work in your case for customer actions?

Edge computing is a very good solution, especially in this case. If your button/light is connected to the IoTHub (by some direct-connected IoT device) you should definitely look into the transparent gateway option of IoT Edge.

There, child devices push their messages through the edge to the cloud while based on these messages commands (direct methods) can be sent back to these or other devices.

So just move over your cloud logic to the edge and have a much better latency.

I recommend getting in contact with the Global Blackbelt team of Microsoft through your regional Microsoft office to discuss your requirements and architecture.


1 Vote 1 ·

In our system, the "button" and the "light" are not on the same network...
My question is about the "expected"/"acceptable" delay for IoT Hub, and whether IoT Hub is suitable for (near) real-time control.
For us, edge computing does not solve the problem.
I will leave the question open until someone can confirm that 10 seconds or more is what to expect from IoT Hub (when the test is performed on the "internal Microsoft" network).

1 Vote 1 ·
asergaz (replying to MartinJrgensen-7072):

"I will leave the question open until someone can confirm that 10 seconds or more is what to expect from IoT Hub (when the test is performed on the "internal Microsoft" network)."

This is not expected to happen on multiple messages sent in a row, though we need to plan for latency as explained in the answer provided.

0 Votes 0 ·

See this blog for an example of a transparent gateway implementation.


0 Votes 0 ·