IanFerguson-1800 asked

Managed identity not working on one instance of a Windows VMSS

We have a Windows VMSS with system-assigned managed identity enabled (the VMSS is used for a Service Fabric cluster).
Applications running on the instances connect to external dependencies using managed identity.
We observe that apps running on one particular instance are unable to authorize with these external dependencies.
Restarting an app on that instance leads to the same failure. However, when we move the app to another instance, it connects successfully.

I used Azure Cloud Shell to inspect the managed identity of the VMSS and of the individual instances:

 az vmss show --name {redacted} --resource-group {redacted}
 az vmss show --name {redacted} --resource-group {redacted} --instance-id 0 #the faulty instance
 az vmss show --name {redacted} --resource-group {redacted} --instance-id 1
 etc

All instances report the same principal ID / managed identity as the VMSS, as expected.

However, I then ran the following PowerShell command on the VMSS instances, while logged in over RDP, to request a managed identity token from the local metadata REST endpoint
(as described in this article: https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/how-to-use-vm-token#get-a-token-using-azure-powershell ):

 Invoke-WebRequest -Uri 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F' -Headers @{Metadata="true"}

The command returns an error on the faulty instance ('Unable to connect to the remote server'). On the other instances, the same command authenticates and obtains a token.
I can't understand why one scale set instance should exhibit different networking behavior from the other instances, since all are subject to the same networking policies.
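
To narrow down whether the failure is at the network level or the identity level, the same request can be wrapped in a small probe. The following is only a sketch in Python, assuming Python 3 is available on the instance (nothing above says it is). The names `probe_imds` and `classify_imds_failure` are made up for illustration and are not part of any Azure SDK. It sends the same URL and `Metadata` header as the PowerShell command above, ignores any system proxy settings (an assumption about one possible cause, not a confirmed diagnosis), and reports whether the endpoint answered at all.

```python
import socket
import urllib.error
import urllib.request

# Same endpoint and query string as the PowerShell command above.
TOKEN_URL = (
    "http://169.254.169.254/metadata/identity/oauth2/token"
    "?api-version=2018-02-01"
    "&resource=https%3A%2F%2Fmanagement.azure.com%2F"
)


def classify_imds_failure(exc):
    """Map an exception raised by the token request to a rough diagnosis."""
    if isinstance(exc, urllib.error.HTTPError):
        # The endpoint replied with an error status, so the network path
        # works and the problem is more likely identity-related.
        return "reachable, HTTP %d" % exc.code
    if isinstance(exc, (urllib.error.URLError, socket.timeout, OSError)):
        # No reply at all: refused, timed out, or no route to the address.
        return "unreachable"
    return "unknown"


def probe_imds(url=TOKEN_URL, timeout=5, opener=None):
    """Send the token request and return a one-line diagnosis string."""
    if opener is None:
        # An empty ProxyHandler makes urllib ignore system proxy settings.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers={"Metadata": "true"})
    try:
        with opener.open(request, timeout=timeout) as response:
            return "reachable, HTTP %d" % response.status
    except Exception as exc:
        return classify_imds_failure(exc)
```

Running `print(probe_imds())` on a healthy instance and on the faulty one would show whether the faulty instance gets any HTTP reply. An "unreachable" result would roughly correspond to the 'Unable to connect to the remote server' error above and would point at the route, firewall, or proxy configuration on that instance rather than at the identity assignment.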

Restarting or reimaging may fix it, but I would like to understand what happened and what may have caused it, so that I can prevent it from happening again.

Thanks in advance

azure-virtual-machines-scale-set

0 Answers