Workarounds for known issues in AKS on Azure Stack HCI

This article includes workaround steps for resolving known issues that occur when using Azure Kubernetes Service on Azure Stack HCI.

Install-AksHci fails on a multi-node installation

When running Install-AksHci on a single-node setup, the installation succeeded, but when setting up the failover cluster, the installation failed with the error message Nodes have not reached active state. However, pinging the cloud agent showed the CloudAgent was reachable.

To ensure all nodes can resolve the CloudAgent's DNS, run the following command on each node:

Resolve-DnsName <FQDN of cloudagent>

If the command above succeeds on all nodes, make sure the nodes can reach the CloudAgent port so you can verify that a proxy is not blocking the connection and that the port is open. To do this, run the following command on each node:

Test-NetConnection <FQDN of cloudagent> -Port <CloudAgent port - default 65000>
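If the nodes are already joined to the failover cluster, you can run both checks on every node at once. The following is a minimal sketch, assuming PowerShell remoting is enabled between the nodes and that the FQDN and port values below are placeholders for your environment:

# Hypothetical values; replace with your CloudAgent FQDN and port
$cloudAgentFqdn = 'ca-00000000-0000-0000-0000-000000000000.contoso.local'
$cloudAgentPort = 65000

# Run the DNS and port checks on every node in the failover cluster
Invoke-Command -ComputerName (Get-ClusterNode).Name -ScriptBlock {
    Resolve-DnsName $using:cloudAgentFqdn
    Test-NetConnection $using:cloudAgentFqdn -Port $using:cloudAgentPort
}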

Linux and Windows VMs were not configured as highly available VMs

When scaling out a workload cluster, the corresponding Linux and Windows VMs were added as worker nodes, but they were not configured as highly available VMs. When running the Get-ClusterGroup command, the newly created Linux VM was not configured as a Cluster Group.

This is a known issue. After a reboot, the ability to have VMs configured as highly available is sometimes lost. The current workaround is to restart wssdagent on each of the Azure Stack HCI nodes. Note that this works only for new VMs that are generated by creating node pools, performing a scale-up operation, or creating new Kubernetes clusters after restarting wssdagent on the nodes. Existing VMs must be added to the failover cluster manually.

When you scale down a cluster, the high availability cluster resources are left in a failed state while the VMs are removed. The workaround for this issue is to manually remove the failed resources, as shown in the sketch below.
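The following is a minimal cleanup sketch, assuming the FailoverClusters PowerShell module is available and that you have confirmed the failed resources belong to VMs that no longer exist:

# Review the failed cluster resources and groups before removing anything
Get-ClusterResource | Where-Object { $_.State -eq 'Failed' }
Get-ClusterGroup | Where-Object { $_.State -eq 'Failed' }

# Remove the failed resources once you have confirmed their VMs were deleted
Get-ClusterResource | Where-Object { $_.State -eq 'Failed' } | Remove-ClusterResource -Force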

Attempt to create new workload clusters failed

An AKS on Azure Stack HCI cluster deployed in an Azure VM was previously working, but after the AKS host was turned off for several days, the kubectl command no longer worked. After running either the kubectl get nodes or kubectl get services command, this error message appeared: Error from server (InternalError): an error on the server ("") has prevented the request from succeeding.

This issue occurred because the AKS host was turned off for longer than four days, which caused the certificates to expire. Certificates are rotated on a four-day cycle. Run Repair-AksHciClusterCerts to fix the certificate expiration issue.

After running Set-AksHciRegistration in an AKS on Azure Stack HCI installation, an error occurred

The error Unable to check registered Resource Providers occurred after running Set-AksHciRegistration in an AKS on Azure Stack HCI installation. This error indicates that the Kubernetes Resource Providers are not registered for the current logged-in tenant.

To resolve this issue, run either the Azure CLI or PowerShell steps below:

az provider register --namespace Microsoft.Kubernetes
az provider register --namespace Microsoft.KubernetesConfiguration
Register-AzResourceProvider -ProviderNamespace Microsoft.Kubernetes
Register-AzResourceProvider -ProviderNamespace Microsoft.KubernetesConfiguration

The registration takes approximately 10 minutes to complete. To monitor the registration process, use the following commands.

az provider show -n Microsoft.Kubernetes -o table
az provider show -n Microsoft.KubernetesConfiguration -o table
Get-AzResourceProvider -ProviderNamespace Microsoft.Kubernetes
Get-AzResourceProvider -ProviderNamespace Microsoft.KubernetesConfiguration
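If you prefer to wait for the registration from PowerShell, the following is a minimal polling sketch, assuming the Az PowerShell module is installed and you are signed in to the correct subscription:

# Poll every 30 seconds until both resource providers report a Registered state
$namespaces = 'Microsoft.Kubernetes', 'Microsoft.KubernetesConfiguration'
do {
    Start-Sleep -Seconds 30
    $states = foreach ($ns in $namespaces) {
        (Get-AzResourceProvider -ProviderNamespace $ns).RegistrationState | Select-Object -First 1
    }
    Write-Host "Current states: $($states -join ', ')"
} while ($states -contains 'NotRegistered' -or $states -contains 'Registering')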

Creating a workload cluster fails with the error A parameter cannot be found that matches parameter name 'nodePoolName'

On an AKS on Azure Stack HCI installation with Windows Admin Center extension version 1.82.0, the management cluster was set up using PowerShell, and an attempt was made to deploy a workload cluster using Windows Admin Center. One machine had PowerShell module version 1.0.2 installed, while the other machines had version 1.1.3 installed. The attempt to deploy the workload cluster failed with the error A parameter cannot be found that matches parameter name 'nodePoolName'. This error likely occurred because of the version mismatch: starting with PowerShell module version 1.1.0, the -nodePoolName <String> parameter was added to the New-AksHciCluster cmdlet, and by design, this parameter is mandatory when using Windows Admin Center extension version 1.82.0.

To resolve this issue, do one of the following:

  • Use PowerShell to manually update the workload cluster to version 1.1.0 or later.
  • Use Windows Admin Center to update the cluster to version 1.1.0 or to the latest PowerShell version.

This issue does not occur if the management cluster is deployed using Windows Admin Center, as it already has the latest PowerShell modules installed.
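If you take the PowerShell route, the following is a minimal sketch for aligning the module version on a machine, assuming the module is named AksHci and was installed from the PowerShell Gallery:

# Check which AksHci PowerShell module versions are installed on this machine
Get-Module -ListAvailable -Name AksHci | Select-Object Name, Version

# Update to the latest published version, then restart the PowerShell session
Update-Module -Name AksHci -Force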

When using PowerShell to upgrade, an excess number of Kubernetes configuration secrets is created on a cluster

The June 1.0.1.10628 build of AKS on Azure Stack HCI creates an excess number of Kubernetes configuration secrets in the cluster. The upgrade path from the June 1.0.1.10628 release to the July 1.0.2.10723 release was improved to clean up the extra Kubernetes secrets. However, in some cases the secrets were still not cleaned up during the upgrade, which causes the upgrade process to fail.

If you experience this issue, run the following steps:

  1. Save the script below as a file named fix_leaked_secrets.ps1:

    param (
    [Parameter(Mandatory=$true)]
    [string] $ClusterName,
    [Parameter(Mandatory=$true)]
    [string] $ManagementKubeConfigPath
    )
    
    $ControlPlaneHostName = kubectl get nodes --kubeconfig $ManagementKubeConfigPath -o=jsonpath='{.items[0].metadata.name}'
    "Hostname is: $ControlPlaneHostName"
    
    $leakedSecretPath1 = "$ClusterName-template-secret-akshci-cc"
    $leakedSecretPath2 = "$ClusterName-moc-kms-plugin"
    $leakedSecretPath3 = "$ClusterName-kube-vip"
    $leakedSecretPath4 = "$ClusterName-template-secret-akshc"
    $leakedSecretPath5 = "$ClusterName-linux-template-secret-akshci-cc"
    $leakedSecretPath6 = "$ClusterName-windows-template-secret-akshci-cc"
    
    $leakedSecretNameList = New-Object -TypeName 'System.Collections.ArrayList';
    $leakedSecretNameList.Add($leakedSecretPath1) | Out-Null
    $leakedSecretNameList.Add($leakedSecretPath2) | Out-Null
    $leakedSecretNameList.Add($leakedSecretPath3) | Out-Null
    $leakedSecretNameList.Add($leakedSecretPath4) | Out-Null
    $leakedSecretNameList.Add($leakedSecretPath5) | Out-Null
    $leakedSecretNameList.Add($leakedSecretPath6) | Out-Null
    
    foreach ($leakedSecretName in $leakedSecretNameList)
    {
    "Deleting secrets with the prefix $leakedSecretName"
    $output = kubectl --kubeconfig $ManagementKubeConfigPath exec etcd-$ControlPlaneHostName -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --key /etc/kubernetes/pki/etcd/server.key --cert /etc/kubernetes/pki/etcd/server.crt del /registry/secrets/default/$leakedSecretName --prefix=true"
    "Deleted: $output"
    }
    
  2. Next, run the following command using the fix_leaked_secrets.ps1 file you saved:

       .\fix_leaked_secrets.ps1 -ClusterName (Get-AksHciConfig).Kva.KvaName -ManagementKubeConfigPath (Get-AksHciConfig).Kva.Kubeconfig
    
  3. Finally, use the following PowerShell command to repeat the upgrade process:

       Update-AksHci
    

Attempt to upgrade from the GA release to version 1.0.1.10628 is stuck at Update-KvaInternal

When attempting to upgrade AKS on Azure Stack HCI from the GA release to version 1.0.1.10628, if the ClusterStatus shows OutOfPolicy, the upgrade can get stuck at the Update-KvaInternal stage of the upgrade installation. Using the repair-akshcicerts PowerShell cmdlet as a workaround may also not help. Make sure the AKS on Azure Stack HCI billing status shows as connected before you upgrade. An AKS on Azure Stack HCI upgrade is forward only and does not support version rollback, so if the upgrade gets stuck, you cannot roll back to the earlier version.
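Before you upgrade, you can check the billing state from PowerShell. The following is a minimal sketch, assuming your AksHci module version includes the Get-AksHciBillingStatus cmdlet:

# Confirm billing shows as connected before running Update-AksHci
Get-AksHciBillingStatus

# If the status is not connected, trigger a sync and check again
Sync-AksHciBilling
Get-AksHciBillingStatus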

Install-AksHci timed out with an error

After running Install-AksHci, the installation stopped and displayed the following waiting for API server error message:

\kubectl.exe --kubeconfig=C:\AksHci\0.9.7.3\kubeconfig-clustergroup-management 
get akshciclusters -o json returned a non zero exit code 1 
[Unable to connect to the server: dial tcp 192.168.0.150:6443: 
connectex: A connection attempt failed because the connected party 
did not properly respond after a period of time, or established connection 
failed because connected host has failed to respond.]

There are multiple reasons why an installation might fail with the waiting for API server error. See the following sections for possible causes and solutions for this error.

Reason 1: Incorrect IP gateway configuration

If you're using static IP and you received the following error message, confirm that the configuration for the IP address and gateway is correct.

Install-AksHci 
C:\AksHci\kvactl.exe create --configfile C:\AksHci\yaml\appliance.yaml --outfile C:\AksHci\kubeconfig-clustergroup-management returned a non zero exit code 1 [ ]

To check whether you have the right configuration for your IP address and gateway, run the following:

ipconfig /all

In the displayed settings, confirm that the IP address and gateway are configured as expected. You can also try pinging the IP gateway and DNS server, as shown below.
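The following is a minimal sketch of those checks; the addresses are placeholders for the gateway and DNS server values reported by ipconfig /all:

# Replace with the gateway and DNS server addresses from ipconfig /all
Test-Connection 192.168.0.1 -Count 2
Test-Connection 192.168.0.10 -Count 2

# Confirm that name resolution works through the configured DNS server
Resolve-DnsName <FQDN of cloudagent> -Server 192.168.0.10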

If these methods don't work, use New-AksHciNetworkSetting to change the configuration.

Reason 2: Incorrect DNS server

If you’re using static IP, confirm that the DNS server is correctly configured. To check the host's DNS server address, use the following command:

((Get-NetIPConfiguration).DNSServer | ?{ $_.AddressFamily -ne 23 }).ServerAddresses

Confirm that the DNS server address is the same as the address used when running New-AksHciNetworkSetting by running the following command:

Get-MocConfig

If the DNS server has been incorrectly configured, reinstall AKS on Azure Stack HCI with the correct DNS server. For more information, see Restart, remove, or reinstall Azure Kubernetes Service on Azure Stack HCI.

In this case, the issue was resolved after deleting the configuration and restarting the VM with a new configuration.

Install-AksHci fails due to an Azure Arc onboarding failure

After running Install-AksHci, a Failed to wait for addon arc-onboarding error occurred.

To resolve this issue, use the following steps:

  1. Open PowerShell and run Uninstall-AksHci.
  2. Open the Azure portal and navigate to the resource group you used when running Install-AksHci.
  3. Check for any connected cluster resources that appear in a Disconnected state and include a name shown as a randomly generated GUID.
  4. Delete these cluster resources.
  5. Close the PowerShell session and open new session before running Install-AksHci again.
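The following is a minimal sketch for finding and deleting the leftover connected cluster resources with the Az PowerShell module, assuming you are signed in to the correct subscription; review the list before deleting anything:

# List connected cluster resources in the resource group used for Install-AksHci
$leftovers = Get-AzResource -ResourceGroupName <resource group name> -ResourceType 'Microsoft.Kubernetes/connectedClusters'
$leftovers | Select-Object Name, ResourceId

# After confirming the entries are disconnected and have GUID-like names, delete them
$leftovers | Remove-AzResource -Force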

Install-AksHci fails because the nodes did not reach an Active state

After running Uninstall-AksHci, Install-AksHci may fail with a Nodes have not reached Active state error message if it's run in the same PowerShell session that was used when running Uninstall-AksHci. You should close the PowerShell session after running Uninstall-AksHci and then open a new session before running Install-AksHci. This issue can also appear when deploying AKS on Azure Stack HCI using Windows Admin Center.

This error message indicates an infrastructure issue: the node agent is unable to connect to the CloudAgent. There should be connectivity between the nodes, and each node should be able to resolve the CloudAgent's DNS name (ca-<guid>). While the deployment is stuck, manually check each node to see if Resolve-DnsName succeeds.

When running Get-AksHciCluster, a release version not found error occurs

When running Get-AksHciCluster to verify the status of an AKS on Azure Stack HCI installation in Windows Admin Center, the output showed an error: A release with version 1.0.3.10818 was NOT FOUND. However, when running Get-AksHciVersion, it showed that the same version was installed. This error indicates that the build has expired.

To resolve this issue, run Uninstall-AksHci, and then install a new AKS on Azure Stack HCI build.

When multiple versions of PowerShell modules are installed, Windows Admin Center does not pick the latest version

If you have multiple versions of the PowerShell modules installed (for example, 0.2.26, 0.2.27, and 0.2.28), Windows Admin Center may not use the latest version (or the one it requires). Make sure only one version of the PowerShell modules is installed: uninstall all unused versions and leave just one, as shown below. For information on which Windows Admin Center version is compatible with which PowerShell version, see the release notes.
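The following is a minimal cleanup sketch, assuming the module in question is named AksHci, was installed from the PowerShell Gallery, and that the version numbers are the ones from the example above:

# Show every installed version of the module
Get-InstalledModule -Name AksHci -AllVersions

# Remove the older versions, then verify that only one version remains
Uninstall-Module -Name AksHci -RequiredVersion 0.2.26
Uninstall-Module -Name AksHci -RequiredVersion 0.2.27
Get-InstalledModule -Name AksHci -AllVersions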

After a failed installation, the Install-AksHci PowerShell command cannot be run

If your installation fails using Install-AksHci, you should run Uninstall-AksHci before running Install-AksHci again. This issue happens because a failed installation may result in leaked resources that have to be cleaned up before you can install again.

During deployment, the error Waiting for pod 'Cloud Operator' to be ready appears

When attempting to deploy an AKS on Azure Stack HCI cluster on an Azure VM, the installation was stuck at Waiting for pod 'Cloud Operator' to be ready..., and then failed and timed out after two hours. Troubleshooting showed that the gateway and DNS server were working correctly and that there were no IP or MAC address conflicts. However, the VIP pool had not appeared in the logs, and pulling the container image with sudo docker pull ecpacr.azurecr.io/kube-vip:0.3.4 returned a Transport Layer Security (TLS) timeout rather than an unauthorized error.

To resolve this issue, run the following steps:

  1. Start to deploy your cluster.

  2. When deployed, connect to the management cluster VM through SSH as shown below:

    ssh -i (Get-MocConfig)['sshPrivateKey'] clouduser@<IP Address>
    
  3. Change the maximum transmission unit (MTU) setting. Make this change promptly after connecting, because if you make it too late, the deployment fails. Modifying the MTU setting helps unblock the container image pull.

    sudo ifconfig eth0 mtu 1300
    
  4. To view the status of your containers, run the following command:

    sudo docker ps -a
    

After performing these steps, the container image pull should be unblocked.

When running Update-AksHci, the update process was stuck at Waiting for deployment 'AksHci Billing Operator' to be ready

When running the Update-AksHci PowerShell cmdlet, the update was stuck with a status message: Waiting for deployment 'AksHci Billing Operator' to be ready.

This issue could have the following root causes:

  • Reason one: During the update of the AksHci Billing Operator, it's possible that the Operator incorrectly marked itself as out of policy. To resolve this, open up a new PowerShell window and run Sync-AksHciBilling. You should see the billing operation continue within the next 20-30 minutes.

  • Reason two: The management cluster VM may be out of memory, which causes the API server to be unreachable and consequently makes all commands from Get-AksHciCluster, billing, and update run into a timeout. As a workaround, set the management cluster VM memory to 32 GB in Hyper-V and reboot it (see the sketch after these steps).

  • Reason three: The AKS on Azure Stack HCI Billing Operator may be out of storage space due to a bug in the Microsoft SQL configuration settings. The lack of storage space may cause the upgrade to stop responding. To work around this issue, manually resize the billing pod's persistent volume claim (PVC) using the following steps.

    1. Run the following command to edit the pod settings:

      kubectl edit pvc mssql-data-claim --kubeconfig (Get-AksHciConfig).Kva.kubeconfig -n azure-arc
      
    2. When Notepad or another editor opens with a YAML file, edit the line for storage from 100Mi to 5Gi:

      spec:
        resources:
          requests:
            storage: 5Gi
      
    3. Check the status of the billing deployment using the following command:

      kubectl get deployments/billing-manager-deployment --kubeconfig (Get-AksHciConfig).Kva.kubeconfig -n azure-arc
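For reason two, the following is a minimal sketch of the memory workaround, run on the node that hosts the management cluster VM; it assumes the Hyper-V PowerShell module is available and the VM name is a placeholder:

# Replace with the name of your management cluster VM as shown in Hyper-V Manager
$vmName = '<management cluster VM name>'

# Stop the VM, give it 32 GB of startup memory, and start it again
Stop-VM -Name $vmName
Set-VMMemory -VMName $vmName -StartupBytes 32GB
Start-VM -Name $vmName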
      

Using Remote Desktop to connect to the management cluster produces a connection error

When using Remote Desktop (RDP) to connect to one of the nodes in an Azure Stack HCI cluster and then running the Get-AksHciCluster command, an error appears and says the connection failed because the host failed to respond.

The connection fails because some PowerShell commands that use kubeconfig-mgmt fail with an error similar to the following one:

Unable to connect to the server: dial tcp 172.168.10.0:6443, where 172.168.10.0 is the IP address of the control plane.

The kube-vip pod can go down for two reasons:

  • The memory pressure in the system can slow down etcd, which ends up affecting kube-vip.
  • The kube-apiserver is not available.

To help resolve this issue, try rebooting the machine. However, if memory pressure is the root cause, the issue may return.

When running kubectl get pods, pods were stuck in a Terminating state

When deploying AKS on Azure Stack HCI and then running kubectl get pods, pods on the same node were stuck in the Terminating state. The machine rejected SSH connections because the node was likely experiencing high memory demand.

This issue occurs because the Windows nodes are over-provisioned, and there's no reserve for core components. To avoid this situation, add resource limits and resource requests for CPU and memory to the pod specification (see the sketch below) to ensure that the nodes aren't over-provisioned. Windows nodes don't support eviction based on resource limits, so you should estimate how much the containers will use and then set the CPU and memory amounts accordingly.
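The following is a minimal sketch of a pod specification with requests and limits, applied from PowerShell; the pod name, image, and resource amounts are placeholders you should size for your own workload, and it assumes kubectl is already pointed at the workload cluster (for example, through Get-AksHciCredential):

# Hypothetical pod spec; adjust the CPU and memory values for your containers
@"
apiVersion: v1
kind: Pod
metadata:
  name: sample-windows-pod
spec:
  nodeSelector:
    kubernetes.io/os: windows
  containers:
  - name: sample
    image: mcr.microsoft.com/windows/servercore:ltsc2019
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
      limits:
        cpu: "1"
        memory: 1Gi
"@ | kubectl apply -f -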

Running the Remove-ClusterNode command evicts the node from the failover cluster, but the node still exists

When running the Remove-ClusterNode command, the node is evicted from the failover cluster, but if Remove-AksHciNode is not run afterwards, the node will still exist in CloudAgent.

Since the node was removed from the cluster but not from CloudAgent, if you use the VHD to create a new node, a File not found error appears. This issue occurs because the VHD is in shared storage, and the evicted node no longer has access to it.

To resolve this issue, remove a physical node from the cluster and then follow the steps below:

  1. Run Remove-AksHciNode to de-register the node from CloudAgent.
  2. Perform routine maintenance, such as re-imaging the machine.
  3. Add the node back to the cluster.
  4. Run Add-AksHciNode to register the node with CloudAgent.

An Arc connection on an AKS cluster cannot be enabled after disabling it

To enable an Arc connection after disabling it, run the following PowerShell commands as an administrator, where -Name is the name of your workload cluster:

Get-AksHciCredential -Name myworkloadcluster
kubectl --kubeconfig=kubeconfig delete secrets sh.helm.release.v1.azure-arc.v1
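After the secret is deleted, you can try to enable the Arc connection again. The following is a minimal sketch, assuming your AksHci module version provides the Enable-AksHciArcConnection cmdlet:

Enable-AksHciArcConnection -Name myworkloadcluster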

Container storage interface pod stuck in a ContainerCreating state

A new Kubernetes workload cluster was created with Kubernetes version 1.16.10 and then updated to version 1.16.15. After the update, the csi-msk8scsi-node-9x47m pod was stuck in the ContainerCreating state, and the kube-proxy-qqnkr pod was stuck in the Terminating state, as shown in the output below:

Error: kubectl.exe get nodes  
NAME              STATUS     ROLES    AGE     VERSION 
moc-lf22jcmu045   Ready      <none>   5h40m   v1.16.15 
moc-lqjzhhsuo42   Ready      <none>   5h38m   v1.16.15 
moc-lwan4ro72he   NotReady   master   5h44m   v1.16.15

\kubectl.exe get pods -A 

NAMESPACE     NAME                        READY   STATUS              RESTARTS   AGE 
kube-system   csi-msk8scsi-node-9x47m     0/3     ContainerCreating   0          5h44m 
kube-system   kube-proxy-qqnkr            1/1     Terminating         0          5h44m  

Since kubelet ended up in a bad state and can no longer talk to the API server, the only solution is to restart the kubelet service. After restarting, the cluster goes into a running state.
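The following is a minimal sketch of restarting kubelet on the affected node over SSH, assuming the node image uses systemd to manage the kubelet service:

# Connect to the node, then restart kubelet and confirm it is running again
ssh -i (Get-MocConfig)['sshPrivateKey'] clouduser@<IP Address>
sudo systemctl restart kubelet
sudo systemctl status kubelet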

All pods in a Windows node are stuck in a ContainerCreating state

In a workload cluster with the Calico network plug-in enabled, all of the pods in a Windows node are stuck in the ContainerCreating state except for the calico-node-windows daemonset pod.

To resolve this issue, find the name of the kube-proxy pod on that node and then run the following command:

kubectl delete pod <KUBE-PROXY-NAME> -n kube-system

All the pods should start on the node.

In a workload cluster with static IP, all pods in a node are stuck in a ContainerCreating state

In a workload cluster with static IP and Windows nodes, all of the pods in a node (including the daemonset pods) are stuck in a ContainerCreating state. When attempting to connect to that node using SSH, it fails with a Connection timed out error.

To resolve this issue, use Hyper-V Manager or Failover Cluster Manager to turn off the VM of that node. After five to ten minutes, the node should be recreated with all the pods running.
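If you prefer PowerShell over the graphical tools, the following is a minimal sketch using the Hyper-V cmdlets on the host that owns the node VM; the node name is a placeholder:

# Find the VM that backs the stuck node, then turn it off; the node should be recreated automatically
Get-VM | Where-Object Name -like '*<node name>*'
Stop-VM -Name <node VM name> -TurnOff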

Attempt to increase the number of worker nodes fails

When using PowerShell to create a cluster with static IP and then attempting to increase the number of worker nodes in the workload cluster, the installation got stuck at control plane count at 2, still waiting for desired state: 3. After a period of time, another error message appeared: Error: timed out waiting for the condition.

When Get-AksHciCluster was run, it showed that the control plane nodes were created and provisioned and were in a Ready state. However, when kubectl get nodes was run, it showed that the control plane nodes had been created but not provisioned and were not in a Ready state.

If you get this error, verify that the IP addresses have been assigned to the created nodes using either Hyper-V Manager or PowerShell:

(Get-VM | Get-VMNetworkAdapter).IPAddresses | fl

Then, verify the network settings to ensure there are enough IP addresses left in the pool to create more VMs.

When deploying AKS on Azure Stack HCI with a misconfigured network, deployment timed out at various points

When deploying AKS on Azure Stack HCI, the deployment may time out at different points of the process depending on where the misconfiguration occurred. You should review the error message to determine the cause and where it occurred.

For example, in the following error, the point at which the misconfiguration occurred is in Get-DownloadSdkRelease -Name "mocstack-stable":

$vnet = New-AksHciNetworkSetting
Set-AksHciConfig -vnet $vnet
Install-AksHci
VERBOSE: Initializing environment
VERBOSE: [AksHci] Importing Configuration
VERBOSE: [AksHci] Importing Configuration Completed
powershell : GetRelease - error returned by API call:
Post "https://msk8s.api.cdp.microsoft.com/api/v1.1/contents/default/namespaces/default/names/mocstack-stable/versions/0.9.7.0/files?action=generateDownloadInfo&ForegroundPriority=True":
dial tcp 52.184.220.11:443: connectex:
A connection attempt failed because the connected party did not properly
respond after a period of time, or established connection failed because
connected host has failed to respond.
At line:1 char:1
+ powershell -command { Get-DownloadSdkRelease -Name "mocstack-stable"}

This indicates that the physical Azure Stack HCI node can resolve the name of the download URL, msk8s.api.cdp.microsoft.com, but the node can't connect to the target server.

To resolve this issue, you need to determine where the breakdown occurred in the connection flow. Here are some steps to try to resolve the issue from the physical cluster node:

  1. Ping the destination DNS name: ping msk8s.api.cdp.microsoft.com.
  2. If you get a response back and no time-out, then the basic network path is working.
  3. If the connection times out, there could be a break in the data path; check your proxy settings. Or, there could be a break in the return path, so you should check the firewall rules. You can also test TCP connectivity to the endpoint, as shown below.
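The following is a minimal sketch of those connectivity checks from a cluster node; the endpoint name is taken from the error above, and port 443 is the standard HTTPS port:

# Basic reachability check for the download endpoint
ping msk8s.api.cdp.microsoft.com

# Verify that TCP port 443 is reachable and not blocked by a proxy or firewall
Test-NetConnection msk8s.api.cdp.microsoft.com -Port 443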

An Unable to acquire token error appears when running Set-AksHciRegistration

An Unable to acquire token error can occur when you have multiple tenants on your Azure account. Use $tenantId = (Get-AzContext).Tenant.Id to select the right tenant, and then include this tenant ID as a parameter when running Set-AksHciRegistration.
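The following is a minimal sketch, assuming Set-AksHciRegistration in your module version accepts a -TenantId parameter and that the subscription and resource group values are placeholders:

# Select the tenant from the current Azure context, then pass it to registration
$tenantId = (Get-AzContext).Tenant.Id
Set-AksHciRegistration -SubscriptionId <subscription ID> -ResourceGroupName <resource group name> -TenantId $tenantId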

When upgrading a deployment, some pods might be stuck at waiting for static pods to have a ready condition

To release the pods and resolve this issue, you should restart kubelet. To view the NotReady node with the static pods, run the following command:

kubectl get nodes -o wide

To get more information on the faulty node, run the following command:

kubectl describe node <IP of the node>

Use SSH to log into the NotReady node by running the following command:

ssh -i <path of the private key file> administrator@<IP of the node>

Then, to restart kubelet, run the following command:

/etc/.../kubelet restart

When creating a persistent volume, an attempt to mount the volume fails

After deleting a persistent volume or a persistent volume claim in an AKS on Azure Stack HCI environment, a new persistent volume is created to map to the same share. However, when attempting to mount the volume, the mount fails, and the pod times out with the error, NewSmbGlobalMapping failed.

To work around the failure to mount the new volume, SSH into the Windows node and run Remove-SMBGlobalMapping, providing the share that corresponds to the volume; see the sketch below. After running this command, attempts to mount the volume should succeed.
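The following is a minimal sketch of that cleanup on the Windows node, assuming the SMB global mapping cmdlets from the SmbShare module are available and that the share path is a placeholder:

# List the existing SMB global mappings to identify the stale one
Get-SmbGlobalMapping

# Remove the mapping for the share that backed the deleted volume
Remove-SmbGlobalMapping -RemotePath \\<file server>\<share name>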

Next steps

If you continue to run into problems when you're using Azure Kubernetes Service on Azure Stack HCI, you can file bugs through GitHub.