Troubleshoot SSH connection issues in Azure Linux VM due to permission and ownership issues

Note

CentOS referenced in this article is a Linux distribution and will reach End Of Life (EOL). Consider your use and plan accordingly. For more information, see CentOS End Of Life guidance.

This article provides solutions to an issue in which connecting to a Linux virtual machine (VM) via Secure Shell (SSH) fails because a directory that the SSH daemon requires (/var/empty/sshd in RHEL, /var/lib/empty in SUSE, or /var/run/sshd in Ubuntu) doesn't exist, isn't owned by the root user, or is group-writable or world-writable.

Symptoms

When you connect to a Linux virtual machine (VM) via SSH, the connection fails. You may receive the following error message about the affected directory, depending on your Linux distribution.

sudo tail /var/log/messages
sshd: /var/empty/sshd must be owned by root and not group or world-writable.  

Cause

This problem may occur if the affected directory doesn't exist, if it isn't owned by the root user, or if it's group-writable or world-writable.
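
The conditions described above can be checked directly on the VM, for example from the Serial Console. The following is a minimal sketch rather than part of the official guidance: check_privsep_dir is a hypothetical helper name, and the sketch assumes that GNU stat (the -c option) is available, which is typically the case on RHEL, SUSE, and Ubuntu.

```shell
# Hypothetical helper: report whether a directory meets the conditions
# that sshd expects (exists, not group/world-writable, owned by root).
# Assumes GNU stat, which supports the -c format option.
check_privsep_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing"
        return 1
    fi
    mode=$(stat -c '%a' "$dir")        # octal mode, for example 755
    owner_uid=$(stat -c '%u' "$dir")   # numeric owner ID, 0 means root
    # 022 is the octal mask for the group-write and world-write bits.
    if [ $(( 0$mode & 022 )) -ne 0 ]; then
        echo "group- or world-writable"
        return 1
    fi
    if [ "$owner_uid" -ne 0 ]; then
        echo "not owned by root"
        return 1
    fi
    echo "ok"
}
```

For example, running check_privsep_dir /var/empty/sshd on a RHEL VM prints ok when the directory exists, is owned by root, and isn't group-writable or world-writable.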

To resolve this issue, use one of the following resolutions:

Resolution 1: Repair the VM online

Here are two methods to repair the VM online:

Use the Serial Console

  1. Connect to the Serial Console of the VM from Azure portal.

  2. Sign in to the VM by using a local administrative account and its corresponding credential or password.

  3. Run the following commands to resolve the permission and ownership issue:

    sudo mkdir -p /var/empty/sshd
    sudo chmod 755 /var/empty/sshd
    sudo chown root:root /var/empty/sshd
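
The commands above use the RHEL path. As a hedged sketch, the affected directory could instead be selected from the distribution ID in /etc/os-release, using the paths listed at the beginning of this article. The function name privsep_dir_for_distro is hypothetical, and the ID patterns (rhel, centos, sles, opensuse, ubuntu) are assumptions about the values that the VM reports.

```shell
# Hypothetical helper: map an /etc/os-release ID to the directory
# that this article lists for that distribution.
privsep_dir_for_distro() {
    case "$1" in
        rhel|centos)           echo "/var/empty/sshd" ;;
        sles*|opensuse*|suse*) echo "/var/lib/empty" ;;
        ubuntu)                echo "/var/run/sshd" ;;
        *)
            echo "unknown distribution ID: $1" >&2
            return 1
            ;;
    esac
}

# Possible usage on the VM (not run here):
#   . /etc/os-release
#   dir=$(privsep_dir_for_distro "$ID") || exit 1
#   sudo mkdir -p "$dir" && sudo chmod 755 "$dir" && sudo chown root:root "$dir"
```

If the distribution ID isn't recognized, the helper returns a nonzero status so that no directory is changed by mistake.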
    

Use the "Run Command" extension

Note

This method relies on the Azure Linux VM Agent (waagent). Therefore, make sure that the agent is installed in the VM and that its service is running.

In the Azure portal, open the Properties window of the VM to check the agent status. If the agent is enabled and has the Ready status, follow these steps to change the permission:

  1. Go to the Azure portal, locate your VM settings, and then select Run Command under Operations.

  2. Execute the following shell script by selecting RunShellScript > Run:

    #!/bin/bash
    
    # Create the directory if it's missing, and then set its permissions and ownership
    mkdir -p /var/empty/sshd
    chmod 755 /var/empty/sshd
    chown root:root /var/empty/sshd
    

  3. After the script execution finishes, the output console window shows an "Enable succeeded" message.

If you can connect to the VM via SSH, and you want to analyze the details of the Run-command script execution, examine the handler.log file in the /var/log/azure/run-command directory.
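
The same script can also be submitted from the Azure CLI instead of the portal. The following is a minimal sketch that wraps the az vm run-command invoke command. The function name run_sshd_dir_fix and its argument order are hypothetical, the directory defaults to the RHEL path, and the sketch assumes that the Azure CLI is installed and signed in.

```shell
# Hypothetical helper: run the directory fix through Run Command.
# $1 = resource group, $2 = VM name, $3 = directory (optional, RHEL path by default)
run_sshd_dir_fix() {
    fix_dir="${3:-/var/empty/sshd}"
    az vm run-command invoke \
        --resource-group "$1" \
        --name "$2" \
        --command-id RunShellScript \
        --scripts "mkdir -p $fix_dir && chmod 755 $fix_dir && chown root:root $fix_dir"
}

# Possible usage (not run here):
#   run_sshd_dir_fix "$RGNAME" "$VMNAME"
#   run_sshd_dir_fix "$RGNAME" "$VMNAME" /var/lib/empty
```

Like the portal method, this approach depends on the Azure Linux VM Agent being installed and ready in the VM.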

Resolution 2: Repair the VM offline

Note

  • Use this resolution if the VM serial console access isn't available and the waagent isn't ready.
  • In Ubuntu, the /var/run/sshd directory resides in memory and is re-created when the VM starts, so restarting the VM also fixes the issue. Therefore, offline troubleshooting in Ubuntu VMs isn't necessary.

Here are two methods to repair the VM offline:

Use Azure Linux Auto Repair (ALAR)

Azure Linux Auto Repair (ALAR) scripts are a part of the VM repair extension described in Repair a Linux VM by using the Azure Virtual Machine repair commands.

Follow these steps to automate the manual offline process:

Note

In the following steps, replace $RGNAME, $VMNAME, $USERNAME, $PASSWORD, and repairdiskcopy values accordingly.
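
Before running the commands in the following steps, it can help to confirm that every placeholder has a value, because an empty variable would produce an incomplete az command line. The following is a small sketch under that assumption. The function name require_repair_vars is hypothetical, and the variable names match the placeholders used in the steps.

```shell
# Hypothetical helper: fail early if any placeholder variable is empty.
require_repair_vars() {
    for name in RGNAME VMNAME USERNAME PASSWORD; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing value for $name" >&2
            return 1
        fi
    done
}

# Possible usage (not run here):
#   require_repair_vars && az vm repair create --verbose -g "$RGNAME" -n "$VMNAME" \
#       --repair-username "$USERNAME" --repair-password "$PASSWORD" \
#       --copy-disk-name repairdiskcopy
```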

  1. Use the az vm repair create command to create a repair VM. The repair VM has a copy of the OS disk for the problematic VM attached.

    az vm repair create --verbose -g $RGNAME -n $VMNAME --repair-username $USERNAME --repair-password $PASSWORD --copy-disk-name repairdiskcopy
    
  2. Sign in to the repair VM. Mount and chroot to the filesystem of the attached copy of the OS disk. Follow the detailed chroot instructions.

  3. Run the following commands to resolve the permission and ownership issues:

    mkdir -p /var/empty/sshd
    chmod 755 /var/empty/sshd
    chown root:root /var/empty/sshd
    
  4. Once the changes are applied, run the following az vm repair restore command to perform an automatic OS disk swap with the original VM.

    az vm repair restore --verbose -g $RGNAME -n $VMNAME
    

Use the manual method

If neither the Serial Console nor the ALAR approach applies to your situation, or if both fail, perform the repair manually. Follow the steps below to manually attach the OS disk to a recovery VM and swap the OS disk back to the original VM:

Once the OS disk is successfully attached to the recovery VM, follow the detailed chroot instructions to mount and chroot to the filesystems of the attached OS disk. Then, follow step 3 in the Use Azure Linux Auto Repair (ALAR) section to resolve the permission and ownership issues.
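
As an alternative to an interactive chroot session, the fix can be applied in a single non-interactive chroot call after the filesystems of the attached OS disk are mounted. This is a sketch under stated assumptions, not the documented procedure: the function name fix_sshd_dir_in_chroot is hypothetical, the mount point (for example /rescue) is whichever path you used when following the chroot instructions, and the directory defaults to the RHEL path.

```shell
# Hypothetical helper: apply the fix inside the mounted OS disk.
# $1 = mount point of the attached OS disk (assumption: already mounted)
# $2 = directory to repair (optional, RHEL path by default)
fix_sshd_dir_in_chroot() {
    chroot_root="$1"
    target_dir="${2:-/var/empty/sshd}"
    chroot "$chroot_root" /bin/sh -c \
        "mkdir -p $target_dir && chmod 755 $target_dir && chown root:root $target_dir"
}

# Possible usage on the recovery VM as root (not run here):
#   fix_sshd_dir_in_chroot /rescue
```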

Contact us for help

If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to Azure feedback community.