Hello again,
After some setup issues, I updated to the latest version via UUU tool which allowed setup to complete (https://learn.microsoft.com/en-us/answers/questions/760976/percept-setup-doesn39t-complete.html)
Device was working great and was accessible in Azure Percept Studio and etc.
However, after about 15-20 minutes device network connection seems to drop and device cannot be contacted. This is not a one off, is consistent now following a restart device will become inaccessible after about 15-20 minutes with the following symptoms:
- Device is powered on
- LED light on device is solid white
- Connection in Azure Percept Studio changes to "Disconnected"
- Connection graph in IOT hub drops from 1 to 0.
- Cannot SSH to device anymore from local WIFI network
- If I press the OFF button it will go through graceful shutdown process
Logs on device itself are still being generated and can access these via SSH following restart so appears has lost network access rather than just full crashed.
Device edgeHub_log.txt shows the following:
<6> 2022-03-07 08:20:11.390 +00:00 [INF] - Processing subscriptions TwinResponse, Methods, ModuleMessages, DesiredPropertyUpdates for client zeperceptdevice1/devmmclientmodule.
<6> 2022-03-07 08:20:11.393 +00:00 [INF] - Set subscriptions from session state for zeperceptdevice1/devmmclientmodule
<4> 2022-03-07 08:20:15.332 +00:00 [WRN] - Error creating cloud connection for client zeperceptdevice1/azureeyemodule
Microsoft.Azure.Devices.Client.Exceptions.IotHubCommunicationException: Transient network error occurred, please retry.
---> System.Net.Sockets.SocketException (11): Resource temporarily unavailable
at Microsoft.Azure.Devices.Client.Transport.AmqpIoT.AmqpIoTTransport.InitializeAsync(TimeSpan timeout)
at Microsoft.Azure.Devices.Client.Transport.Amqp.AmqpIoTConnector.OpenConnectionAsync(TimeSpan timeout)
at Microsoft.Azure.Devices.Client.Transport.Amqp.AmqpConnectionHolder.EnsureConnectionAsync(TimeSpan timeout)
at Microsoft.Azure.Devices.Client.Transport.Amqp.AmqpConnectionHolder.OpenSessionAsync(DeviceIdentity deviceIdentity, TimeSpan timeout)
at Microsoft.Azure.Devices.Client.Transport.AmqpIoT.AmqpUnit.EnsureSessionAsync(TimeSpan timeout)
at Microsoft.Azure.Devices.Client.Transport.AmqpIoT.AmqpUnit.OpenAsync(TimeSpan timeout)
at Microsoft.Azure.Devices.Client.Transport.Amqp.AmqpTransportHandler.OpenAsync(CancellationToken cancellationToken)
at Microsoft.Azure.Devices.Client.Transport.ProtocolRoutingDelegatingHandler.OpenAsync(CancellationToken cancellationToken)
at Microsoft.Azure.Devices.Client.Transport.ErrorDelegatingHandler.<>c__DisplayClass23_0.<<ExecuteWithErrorHandlingAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.Devices.Client.Transport.ErrorDelegatingHandler.ExecuteWithErrorHandlingAsync[T](Func`1 asyncOperation)
--- End of inner exception stack trace ---
at Microsoft.Azure.Devices.Client.Transport.ErrorDelegatingHandler.ExecuteWithErrorHandlingAsync[T](Func`1 asyncOperation)
at Microsoft.Azure.Devices.Client.Transport.RetryDelegatingHandler.<>c__DisplayClass33_0.<<OpenInternalAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.Devices.Client.Transport.RetryDelegatingHandler.EnsureOpenedAsync(CancellationToken cancellationToken)
at Microsoft.Azure.Devices.Client.InternalClient.OpenAsync()
at Microsoft.Azure.Devices.Edge.Util.TaskEx.TimeoutAfter(Task task, TimeSpan timeout) in /home/vsts/work/1/s/edge-util/src/Microsoft.Azure.Devices.Edge.Util/TaskEx.cs:line 144
at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ModuleClientWrapper.OpenAsync() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ModuleClientWrapper.cs:line 63
at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient.<>c__DisplayClass30_0.<<InvokeFunc>b__0>d.MoveNext() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ConnectivityAwareClient.cs:line 187
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient.InvokeFunc[T](Func`1 func, String operation, Boolean useForConnectivityCheck) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ConnectivityAwareClient.cs:line 180
at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient.InvokeFunc[T](Func`1 func, String operation, Boolean useForConnectivityCheck) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ConnectivityAwareClient.cs:line 180
at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.ConnectivityAwareClient.OpenAsync() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/ConnectivityAwareClient.cs:line 64
at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.ConnectToIoTHub(ITokenProvider newTokenProvider) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 134
at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.CreateNewCloudProxyAsync(ITokenProvider newTokenProvider) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 117
at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnection.Create(IIdentity identity, Action`2 connectionStatusChangedHandler, ITransportSettings[] transportSettings, IMessageConverterProvider messageConverterProvider, IClientProvider clientProvider, ICloudListener cloudListener, ITokenProvider tokenProvider, TimeSpan idleTimeout, Boolean closeOnIdleTimeout, TimeSpan operationTimeout, String productInfo, Option`1 modelId) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnection.cs:line 99
at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnectionProvider.<>c__DisplayClass17_1.<<Connect>b__2>d.MoveNext() in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnectionProvider.cs:line 162
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Azure.Devices.Edge.Hub.CloudProxy.CloudConnectionProvider.Connect(IIdentity identity, Action`2 connectionStatusChangedHandler) in /home/vsts/work/1/s/edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.CloudProxy/CloudConnectionProvider.cs:line 178
docket.txt shows that the device is still running and does a graceful shutdown when I reboot:
Mar 07 08:35:40 apd-d8c0a6a82c29 dockerd[1245]: time="2022-03-07T08:35:40.961752744Z" level=warning msg="[resolver] connect failed: dial udp 8.8.8.8:53: connect: network is unreachable"
Mar 07 08:35:40 apd-d8c0a6a82c29 dockerd[1245]: time="2022-03-07T08:35:40.961847739Z" level=warning msg="[resolver] connect failed: dial udp 8.8.8.8:53: connect: network is unreachable"
Mar 07 08:35:41 apd-d8c0a6a82c29 systemd[1]: Stopping Docker Application Container Engine...
Mar 07 08:35:41 apd-d8c0a6a82c29 dockerd[1245]: time="2022-03-07T08:35:41.124130902Z" level=info msg="Processing signal 'terminated'"
Mar 07 08:35:41 apd-d8c0a6a82c29 dockerd[1245]: time="2022-03-07T08:35:41.128800354Z" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
Mar 07 08:35:41 apd-d8c0a6a82c29 dockerd[1245]: time="2022-03-07T08:35:41.131750129Z" level=info msg="Daemon shutdown complete"
Mar 07 08:35:41 apd-d8c0a6a82c29 systemd[1]: Stopped Docker Application Container Engine.
-- Reboot --
Any thoughts?
UPDATE:
> cat /etc/os-release
NAME="Common Base Linux Mariner"
VERSION="1.0.20220127"
ID=mariner
VERSION_ID=1.0
PRETTY_NAME="CBL-Mariner/Linux"
ANSI_COLOR="1;34"
HOME_URL="https://aka.ms/cbl-mariner"
BUG_REPORT_URL="https://aka.ms/cbl-mariner"
SUPPORT_URL="https://aka.ms/cbl-mariner"
> cat /etc/os-subrelease
BUILDER_NAME=Microsoft.AED
BUILD_DATE="2022-02-09T02:49:50Z"
ID=santacruz
VERSION_ID=165650
NAME="Santa Cruz Mariner"
VERSION="Drop 165650"
BRANCH="refs/heads/releases/public_preview_1.0"
COMMIT_ID="dffb37184b8cb0feb32bf2ad40bc0cd8980ea662"
> cat /etc/adu-version
2022.102.109.102
> Run the following command and provide us the network_log.txt to examine?
Start up:
Mar 08 17:39:48 apd-d8c0a6a82c29 ztpd[709]: set pattern='0 0 0 0' for led '/sys/class/leds/white:wlan'
Mar 08 17:39:48 apd-d8c0a6a82c29 ztpd[709]: set pattern='255 0 255 0' for led '/sys/class/leds/white:wlan'
Mar 08 17:39:48 apd-d8c0a6a82c29 systemd-networkd[488]: wlan0: DHCPv4 address 192.168.86.58/24 via 192.168.86.1
Mar 08 17:39:48 apd-d8c0a6a82c29 systemd-networkd[488]: wlan0: Configured
Below: Only events that happen in the network log between start up and shutdown. Device went offline around this time. Seems to correlate with the device stopping the access point:
Mar 08 17:42:46 apd-d8c0a6a82c29 ztpd[709]: [wlan0] event=<3>CTRL-EVENT-BSS-REMOVED 0 70:3a:cb:97:c8:7f
Mar 08 17:42:46 apd-d8c0a6a82c29 ztpd[709]: [wlan0] event=<3>CTRL-EVENT-BSS-REMOVED 2 12:13:31:da:48:a7
Mar 08 17:42:46 apd-d8c0a6a82c29 ztpd[709]: [wlan0] event=<3>CTRL-EVENT-BSS-REMOVED 3 10:13:31:da:48:9f
Mar 08 17:42:46 apd-d8c0a6a82c29 ztpd[709]: [wlan0] event=<3>CTRL-EVENT-BSS-REMOVED 4 12:13:31:da:48:a9
Mar 08 18:05:19 apd-d8c0a6a82c29 systemd[1]: hostapd.service: Unit not needed anymore. Stopping.
Mar 08 18:05:19 apd-d8c0a6a82c29 hostapd[853]: wlan1: interface state ENABLED->DISABLED
Mar 08 18:05:19 apd-d8c0a6a82c29 hostapd[853]: wlan1: AP-DISABLED
Mar 08 18:05:19 apd-d8c0a6a82c29 hostapd[853]: wlan1: CTRL-EVENT-TERMINATING
Mar 08 18:05:19 apd-d8c0a6a82c29 hostapd[853]: nl80211: deinit ifname=wlan1 disabled_11b_rates=0
Mar 08 18:05:19 apd-d8c0a6a82c29 ztpd[709]: inotify path=/var/run/wpa_supplicant target=hostapd mask=0x40000200
Mar 08 18:05:19 apd-d8c0a6a82c29 systemd[1]: Stopping hostapd...
Mar 08 18:05:19 apd-d8c0a6a82c29 systemd[1]: Stopped hostapd.
Start of shutdown/reboot
Mar 08 18:20:03 apd-d8c0a6a82c29 ztpd[709]: inotify path=/var/run/wpa_supplicant target=sshd.pid mask=0x00000200
Mar 08 18:20:03 apd-d8c0a6a82c29 ztpd[709]: failed to retrieve pending sd-bus message (-104)
Mar 08 18:20:03 apd-d8c0a6a82c29 ztpd[709]: failed to update sd-bus fd settings for epoll (9)