Troubleshooting backup failures on Azure virtual machines
You can troubleshoot errors encountered while using Azure Backup with the information listed below:
This section covers backup operation failures on Azure virtual machines.
- Ensure that the VM Agent (WA Agent) is the latest version.
- Ensure that the Windows or Linux VM OS version is supported. For details, refer to the IaaS VM Backup Support Matrix.
- Verify that another backup service isn't running.
- To ensure there are no snapshot extension issues, uninstall extensions to force reload and then retry the backup.
- Verify that the VM has internet connectivity.
- In Services.msc, ensure the Windows Azure Guest Agent service is Running. If the Windows Azure Guest Agent service is missing, install it by following the steps in Back up Azure VMs in a Recovery Services vault.
- The Event log may show backup failures that are from other backup products, for example, Windows Server backup, and aren't due to Azure Backup. Use the following steps to determine whether the issue is with Azure Backup:
- If there's an error with the entry Backup in the event source or message, check whether Azure IaaS VM Backup backups were successful, and whether a Restore Point was created with the desired snapshot type.
- If Azure Backup is working, then the issue is likely with another backup solution.
- For example, Event Viewer error 517 can appear when Azure Backup is working fine but "Windows Server Backup" is failing.
- If Azure Backup is failing, then look for the corresponding Error Code in the section Common VM backup errors in this article.
The following are common issues with backup failures on Azure virtual machines.
VMRestorePointInternalError - Antivirus configured in the VM is restricting the execution of backup extension
Error code: VMRestorePointInternalError
If, at the time of backup, the Event Viewer Application logs display the message Faulting application name: IaaSBcdrExtension.exe, the antivirus configured in the VM is restricting the execution of the backup extension. To resolve this issue, exclude the directories below in the antivirus configuration and retry the backup operation.
CopyingVHDsFromBackUpVaultTakingLongTime - Copying backed up data from vault timed out
Error code: CopyingVHDsFromBackUpVaultTakingLongTime
Error message: Copying backed up data from vault timed out
This could happen due to transient storage errors or insufficient storage account IOPS for the backup service to transfer data to the vault within the timeout period. Configure VM backup according to the recommended best practices and retry the backup operation.
UserErrorVmNotInDesirableState - VM is not in a state that allows backups
Error code: UserErrorVmNotInDesirableState
Error message: VM is not in a state that allows backups.
The backup operation failed because the VM is in Failed state. For a successful backup, the VM state should be Running, Stopped, or Stopped (deallocated).
- If the VM is in a transient state between Running and Shut down, wait for the state to change. Then trigger the backup job.
- If the VM is a Linux VM and uses the Security-Enhanced Linux kernel module, exclude the Azure Linux Agent path /var/lib/waagent from the security policy and make sure the Backup extension is installed.
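The state rule above can be sketched as a small pre-check before triggering a backup job. This is a minimal illustration, not part of Azure Backup: the function name `can_trigger_backup` and the lowercase state strings are assumptions, and only the three states named in this section are treated as backup-ready.

```python
# States this article lists as allowing a successful backup.
BACKUP_READY_STATES = {"running", "stopped", "stopped (deallocated)"}


def can_trigger_backup(vm_state: str) -> bool:
    """Return True if vm_state is one of the backup-ready states listed above."""
    return vm_state.strip().lower() in BACKUP_READY_STATES


if __name__ == "__main__":
    # Hypothetical states; anything not in the list means "wait or fix the VM first".
    for state in ("Running", "Stopped (deallocated)", "Failed"):
        action = "trigger backup" if can_trigger_backup(state) else "wait or fix the VM state"
        print(f"{state}: {action}")
```

A transient or Failed state returns False, which corresponds to waiting for the state to change before retrying.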
UserErrorFsFreezeFailed - Failed to freeze one or more mount-points of the VM to take a file-system consistent snapshot
Error code: UserErrorFsFreezeFailed
Error message: Failed to freeze one or more mount-points of the VM to take a file-system consistent snapshot.
- Unmount the devices for which the file system state wasn't cleaned, using the umount command.
- Run a file system consistency check on these devices by using the fsck command.
- Mount the devices again and retry backup operation.
If you can't unmount the devices, you can update the VM backup configuration to ignore certain mount points. For example, if the '/mnt/resource' mount point can't be unmounted and is causing VM backup failures, update the VM backup configuration file with the MountsToSkip property as follows.
cat /var/lib/waagent/Microsoft.Azure.RecoveryServices.VMSnapshotLinux-1.0.9170.0/main/tempPlugin/vmbackup.conf
[SnapshotThread]
fsfreeze: True
MountsToSkip = /mnt/resource
SafeFreezeWaitInSeconds=600
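To confirm which mount points a configuration would skip, the vmbackup.conf content shown above can be read with Python's standard configparser, which accepts both the ":" and "=" separators used in the file. This is a sketch under assumptions: the helper name `mounts_to_skip` is hypothetical, and treating multiple mount points as comma-separated is an assumption not confirmed by this article.

```python
import configparser

# Sample content mirroring the vmbackup.conf snippet above.
SAMPLE_CONF = """\
[SnapshotThread]
fsfreeze: True
MountsToSkip = /mnt/resource
SafeFreezeWaitInSeconds=600
"""


def mounts_to_skip(conf_text: str) -> list:
    """Return the mount points listed under MountsToSkip, or an empty list."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Option lookup is case-insensitive by default in configparser.
    raw = parser.get("SnapshotThread", "MountsToSkip", fallback="")
    # Assumption: several mount points, if present, are comma-separated.
    return [item.strip() for item in raw.split(",") if item.strip()]


if __name__ == "__main__":
    print(mounts_to_skip(SAMPLE_CONF))
```

On a real VM, the text would come from the file at the path shown above rather than from the sample string.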
ExtensionSnapshotFailedCOM / ExtensionInstallationFailedCOM / ExtensionInstallationFailedMDTC - Extension installation/operation failed due to a COM+ error
Error code: ExtensionSnapshotFailedCOM
Error message: Snapshot operation failed due to COM+ error
Error code: ExtensionInstallationFailedCOM
Error message: Extension installation/operation failed due to a COM+ error
Error code: ExtensionInstallationFailedMDTC
Error message: Extension installation failed with the error "COM+ was unable to talk to the Microsoft Distributed Transaction Coordinator"
The Backup operation failed due to an issue with Windows service COM+ System application. To resolve this issue, follow these steps:
- Try starting/restarting Windows service COM+ System Application (from an elevated command prompt - net start COMSysApp).
- Ensure Distributed Transaction Coordinator service is running as Network Service account. If not, change it to run as Network Service account and restart COM+ System Application.
- If unable to restart the service, then reinstall Distributed Transaction Coordinator service by following the steps below:
- Stop the MSDTC service
- Open a command prompt (cmd)
- Run the command
- Run the command
- Start the MSDTC service
- Start the Windows service COM+ System Application. After the COM+ System Application starts, trigger a backup job from the Azure portal.
ExtensionFailedVssWriterInBadState - Snapshot operation failed because VSS writers were in a bad state
Error code: ExtensionFailedVssWriterInBadState
Error message: Snapshot operation failed because VSS writers were in a bad state.
This error occurs because the VSS writers were in a bad state. Azure Backup extensions interact with VSS Writers to take snapshots of the disks. To resolve this issue, follow these steps:
Step 1: Restart VSS writers that are in a bad state.
- From an elevated command prompt, run vssadmin list writers.
- The output contains all VSS writers and their state. For every VSS writer with a state that's not Stable, restart the respective VSS writer's service.
- To restart the service, run the following commands from an elevated command prompt:
net stop serviceName
net start serviceName
Restarting some services can have an impact on your production environment. Ensure the approval process is followed and the service is restarted at the scheduled downtime.
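When the writer list is long, the check in Step 1 can be done by filtering the vssadmin list writers output for any state other than Stable. The sketch below assumes the usual "Writer name: '...'" and "State: [n] ..." line layout. The sample text is illustrative and trimmed, not captured output, and the function name `unstable_writers` is hypothetical.

```python
import re

# Illustrative, trimmed text in the shape `vssadmin list writers` prints.
SAMPLE_OUTPUT = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'SqlServerWriter'
   State: [8] Failed
   Last error: Non-retryable error
"""


def unstable_writers(vssadmin_output: str) -> list:
    """Return (writer name, state) pairs for writers whose state is not Stable."""
    results = []
    current_writer = None
    for line in vssadmin_output.splitlines():
        name_match = re.match(r"\s*Writer name:\s*'(.+)'", line)
        if name_match:
            current_writer = name_match.group(1)
            continue
        state_match = re.match(r"\s*State:\s*\[\d+\]\s*(.+)", line)
        if state_match and current_writer is not None:
            state = state_match.group(1).strip()
            if state.lower() != "stable":
                results.append((current_writer, state))
            current_writer = None
    return results


if __name__ == "__main__":
    for writer, state in unstable_writers(SAMPLE_OUTPUT):
        print(f"{writer}: {state}")
```

The script only reports writers. Mapping each writer to the service that must be restarted is still a manual step, subject to the downtime caution above.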
Step 2: If restarting the VSS writers did not resolve the issue, then run the following command from an elevated command-prompt (as an administrator) to prevent the threads from being created for blob-snapshots.
REG ADD "HKLM\SOFTWARE\Microsoft\BcdrAgentPersistentKeys" /v SnapshotWithoutThreads /t REG_SZ /d True /f
Step 3: If steps 1 and 2 did not resolve the issue, then the failure could be due to VSS writers timing out due to limited IOPS.
To verify, navigate to System and Event Viewer Application logs and check for the following error message:
The shadow copy provider timed out while holding writes to the volume being shadow copied. This is probably due to excessive activity on the volume by an application or a system service. Try again later when activity on the volume is reduced.
- Check whether the load can be distributed across the VM disks. This reduces the load on any single disk. You can check for IOPS throttling by enabling diagnostic metrics at the storage level.
- Change the backup policy to perform backups during off-peak hours, when the load on the VM is at its lowest.
- Upgrade the Azure disks to support higher IOPS.
ExtensionFailedVssServiceInBadState - Snapshot operation failed due to VSS (Volume Shadow Copy) service in bad state
Error code: ExtensionFailedVssServiceInBadState
Error message: Snapshot operation failed due to VSS (Volume Shadow Copy) service in bad state.
This error occurs because the VSS service was in a bad state. Azure Backup extensions interact with VSS service to take snapshots of the disks. To resolve this issue, follow these steps:
Restart VSS (Volume Shadow Copy) service.
- Navigate to Services.msc and restart 'Volume Shadow Copy service'.
- Run the following commands from an elevated command prompt:
net stop VSS
net start VSS
If the issue still persists, restart the VM at the scheduled downtime.
UserErrorSkuNotAvailable - VM creation failed as VM size selected is not available
Error code: UserErrorSkuNotAvailable
Error message: VM creation failed as VM size selected is not available.
This error occurs because the VM size selected during the restore operation is an unsupported size.
UserErrorMarketPlaceVMNotSupported - VM creation failed due to Market Place purchase request being not present
Error code: UserErrorMarketPlaceVMNotSupported
Error message: VM creation failed due to Market Place purchase request being not present.
Azure Backup supports backup and restore of VMs that are available in Azure Marketplace. This error occurs when you try to restore a VM (with a specific Plan/Publisher setting) that is no longer available in Azure Marketplace.
- To resolve this issue, use the restore disks option during the restore operation and then use PowerShell or Azure CLI cmdlets to create the VM with the latest marketplace information corresponding to the VM.
- If the publisher does not have any Marketplace information, you can use the data disks to retrieve your data and you can attach them to an existing VM.
ExtensionConfigParsingFailure - Failure in parsing the config for the backup extension
Error code: ExtensionConfigParsingFailure
Error message: Failure in parsing the config for the backup extension.
This error happens because of changed permissions on the MachineKeys directory: %systemdrive%\programdata\microsoft\crypto\rsa\machinekeys.
Run the following command and verify that permissions on the MachineKeys directory are default ones:
Default permissions are as follows:
- Everyone: (R,W)
- BUILTIN\Administrators: (F)
If you see permissions in the MachineKeys directory that are different than the defaults, follow these steps to correct permissions, delete the certificate, and trigger the backup:
Fix permissions on the MachineKeys directory. By using Explorer security properties and advanced security settings in the directory, reset permissions back to the default values. Remove all user objects except the defaults from the directory and make sure the Everyone permission has special access as follows:
- List folder/read data
- Read attributes
- Read extended attributes
- Create files/write data
- Create folders/append data
- Write attributes
- Write extended attributes
- Read permissions
Delete all certificates where Issued To is the classic deployment model or Windows Azure CRP Certificate Generator:
- Open certificates on a local computer console.
- Under Personal > Certificates, delete all certificates where Issued To is the classic deployment model or Windows Azure CRP Certificate Generator.
Trigger a VM backup job.
ExtensionStuckInDeletionState - Extension state is not supportive to backup operation
Error code: ExtensionStuckInDeletionState
Error message: Extension state is not supportive to backup operation
The Backup operation failed due to inconsistent state of Backup Extension. To resolve this issue, follow these steps:
- Ensure Guest Agent is installed and responsive
- From the Azure portal, go to Virtual Machine > All Settings > Extensions
- Select the backup extension VmSnapshot or VmSnapshotLinux and select Uninstall.
- After deleting backup extension, retry the backup operation
- The subsequent backup operation will install the new extension in the desired state
ExtensionFailedSnapshotLimitReachedError - Snapshot operation failed as snapshot limit is exceeded for some of the disks attached
Error code: ExtensionFailedSnapshotLimitReachedError
Error message: Snapshot operation failed as snapshot limit is exceeded for some of the disks attached
The snapshot operation failed because the snapshot limit has been exceeded for some of the attached disks. Complete the following troubleshooting steps and then retry the operation.
Delete the disk blob-snapshots that aren't required. Be careful to not delete disk blobs. Only snapshot blobs should be deleted.
If Soft-delete is enabled on VM disk Storage-Accounts, configure soft-delete retention so existing snapshots are less than the maximum allowed at any point of time.
If Azure Site Recovery is enabled in the backed-up VM, then perform the steps below:
- Ensure the value of isanysnapshotfailed is set as false in /etc/azure/vmbackup.conf
- Schedule Azure Site Recovery at a different time, so it doesn't conflict with the backup operation.
ExtensionFailedTimeoutVMNetworkUnresponsive - Snapshot operation failed due to inadequate VM resources
Error code: ExtensionFailedTimeoutVMNetworkUnresponsive
Error message: Snapshot operation failed due to inadequate VM resources.
Backup operation on the VM failed due to delay in network calls while performing the snapshot operation. To resolve this issue, perform Step 1. If the issue persists, try steps 2 and 3.
Step 1: Create snapshot through Host
From an elevated (admin) command prompt, run the following commands:
REG ADD "HKLM\SOFTWARE\Microsoft\BcdrAgentPersistentKeys" /v SnapshotMethod /t REG_SZ /d firstHostThenGuest /f
REG ADD "HKLM\SOFTWARE\Microsoft\BcdrAgentPersistentKeys" /v CalculateSnapshotTimeFromHost /t REG_SZ /d True /f
This will ensure the snapshots are taken through host instead of Guest. Retry the backup operation.
Step 2: Try changing the backup schedule to a time when the VM is under less load (like less CPU or IOPS)
Step 3: Try increasing the size of the VM and retry the operation
320001, ResourceNotFound - Could not perform the operation as VM no longer exists / 400094, BCMV2VMNotFound - The virtual machine doesn't exist / An Azure virtual machine wasn't found
Error code: 320001, ResourceNotFound
Error message: Could not perform the operation as VM no longer exists.
Error code: 400094, BCMV2VMNotFound
Error message: The virtual machine doesn't exist
An Azure virtual machine wasn't found.
This error happens when the primary VM is deleted, but the backup policy still looks for a VM to back up. To fix this error, take one of the following steps:
- Re-create the virtual machine with the same name and the same resource group name or cloud service name.
- Stop protecting the virtual machine with or without deleting the backup data. For more information, see Stop protecting virtual machines.
UserErrorBCMPremiumStorageQuotaError - Could not copy the snapshot of the virtual machine, due to insufficient free space in the storage account
Error code: UserErrorBCMPremiumStorageQuotaError
Error message: Could not copy the snapshot of the virtual machine, due to insufficient free space in the storage account
For premium VMs on VM backup stack V1, we copy the snapshot to the storage account. This step makes sure that backup management traffic, which works on the snapshot, doesn't limit the number of IOPS available to the application using premium disks.
We recommend that you allocate only 50 percent, 17.5 TB, of the total storage account space. Then the Azure Backup service can copy the snapshot to the storage account and transfer data from this copied location in the storage account to the vault.
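The figures above imply a total storage account capacity of 35 TB, since 17.5 TB is stated as 50 percent of it. A minimal sketch of that arithmetic and a usage check follows. The function names and the sample usage values are hypothetical.

```python
# Figures quoted in this section.
RECOMMENDED_FRACTION = 0.5   # allocate only 50 percent of the account
RECOMMENDED_MAX_TB = 17.5    # the 50 percent figure, in TB


def implied_capacity_tb() -> float:
    """Total capacity implied by the two figures above."""
    return RECOMMENDED_MAX_TB / RECOMMENDED_FRACTION


def within_recommendation(used_tb: float) -> bool:
    """Return True if current usage leaves room for the copied snapshot."""
    return used_tb <= RECOMMENDED_MAX_TB


if __name__ == "__main__":
    print(implied_capacity_tb())          # 35.0
    print(within_recommendation(12.0))    # True
    print(within_recommendation(20.0))    # False
```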
380008, AzureVmOffline - Failed to install Microsoft Recovery Services extension as virtual machine is not running
Error code: 380008, AzureVmOffline
Error message: Failed to install Microsoft Recovery Services extension as virtual machine is not running
The VM Agent is a prerequisite for the Azure Recovery Services extension. Install the Azure Virtual Machine Agent and restart the registration operation.
- Check if the VM Agent is installed correctly.
- Make sure that the flag on the VM config is set correctly.
ExtensionSnapshotBitlockerError - The snapshot operation failed with the Volume Shadow Copy Service (VSS) operation error
Error code: ExtensionSnapshotBitlockerError
Error message: The snapshot operation failed with the Volume Shadow Copy Service (VSS) operation error This drive is locked by BitLocker Drive Encryption. You must unlock this drive from the Control Panel.
Turn off BitLocker for all drives on the VM and check if the VSS issue is resolved.
VmNotInDesirableState - The VM isn't in a state that allows backups
Error code: VmNotInDesirableState
Error message: The VM isn't in a state that allows backups.
If the VM is in a transient state between Running and Shut down, wait for the state to change. Then trigger the backup job.
If the VM is a Linux VM and uses the Security-Enhanced Linux kernel module, exclude the Azure Linux Agent path /var/lib/waagent from the security policy and make sure the Backup extension is installed.
The VM Agent isn't present on the virtual machine:
Install any prerequisite and the VM Agent. Then restart the operation. Read more about VM Agent installation and how to validate VM Agent installation.
ExtensionSnapshotFailedNoSecureNetwork - The snapshot operation failed because of failure to create a secure network communication channel
Error code: ExtensionSnapshotFailedNoSecureNetwork
Error message: The snapshot operation failed because of failure to create a secure network communication channel.
- Open the Registry Editor by running regedit.exe in an elevated mode.
- Identify all versions of the .NET Framework present in your system. They're present under the hierarchy of registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft.
- For each .NET Framework present in the registry key, add the following key:
ExtensionVCRedistInstallationFailure - The snapshot operation failed because of failure to install Visual C++ Redistributable for Visual Studio 2012
Error code: ExtensionVCRedistInstallationFailure
Error message: The snapshot operation failed because of failure to install Visual C++ Redistributable for Visual Studio 2012.
- Navigate to C:\Packages\Plugins\Microsoft.Azure.RecoveryServices.VMSnapshot\agentVersion and install vcredist2013_x64.
Make sure that the registry key value that allows the service installation is set to the correct value. That is, set the Start value in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Msiserver to 3 and not 4.
If you still have issues with installation, restart the installation service by running MSIEXEC /UNREGISTER followed by MSIEXEC /REGISTER from an elevated command prompt.
- Check the event log to verify if you're noticing access related issues. For example: Product: Microsoft Visual C++ 2013 x64 Minimum Runtime - 12.0.21005 -- Error 1401.Could not create key: Software\Classes. System error 5. Verify that you have sufficient access to that key, or contact your support personnel.
Ensure the administrator or user account has sufficient permissions to update the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Classes. Provide sufficient permissions and restart the Windows Azure Guest Agent.
- If you have antivirus products in place, ensure they have the right exclusion rules to allow the installation.
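The Start value mentioned in the steps above follows the standard Windows service start-type encoding, where 3 means Manual and 4 means Disabled. The sketch below only interprets a value you have already read from the registry. It does not read the registry itself, and the function names are hypothetical.

```python
# Standard Windows service Start values under ...\Services\<name>.
START_TYPES = {
    0: "Boot",
    1: "System",
    2: "Automatic",
    3: "Manual",
    4: "Disabled",
}


def describe_start_value(value: int) -> str:
    """Return the start type name for a service Start value."""
    return START_TYPES.get(value, f"Unknown ({value})")


def matches_article_setting(value: int) -> bool:
    """Return True if the Msiserver Start value is 3, as this article recommends."""
    return value == 3


if __name__ == "__main__":
    for value in (3, 4):
        status = "OK" if matches_article_setting(value) else "change to 3"
        print(f"Start={value} ({describe_start_value(value)}): {status}")
```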
UserErrorRequestDisallowedByPolicy - An invalid policy is configured on the VM which is preventing Snapshot operation
Error code: UserErrorRequestDisallowedByPolicy
Error message: An invalid policy is configured on the VM which is preventing Snapshot operation.
If you have an Azure Policy that governs tags within your environment, either consider changing the policy from a Deny effect to a Modify effect, or create the resource group manually according to the naming schema required by Azure Backup.
- Cancellation isn't supported for this job type: Wait until the job finishes.
- The job isn't in a cancelable state, or the selected job isn't in a cancelable state: It's likely that the job is almost finished. Wait until the job finishes.
- Backup can't cancel the job because it isn't in progress: Cancellation is supported only for jobs in progress. Try to cancel an in-progress job. This error happens because of a transitory state. Wait a minute and retry the cancel operation.
- Backup failed to cancel the job: Wait until the job finishes.
Disks appear offline after File Restore
If you notice that the disks are offline after a restore:
- Verify that the machine where the script is executed meets the OS requirements.
- Ensure you are not restoring to the same source.
UserErrorInstantRpNotFound - Restore failed because the Snapshot of the VM was not found
Error code: UserErrorInstantRpNotFound
Error message: Restore failed because the snapshot of the VM was not found. The snapshot could have been deleted, please check.
This error occurs when you are trying to restore from a recovery point that was not transferred to the vault and was deleted in the snapshot phase.
To resolve this issue, try to restore the VM from a different restore point.
- Restore failed with a cloud internal error.
- The selected DNS name is already taken: Specify a different DNS name and try again. This DNS name refers to the cloud service name, usually ending with .cloudapp.net. This name needs to be unique. If you get this error, choose a different VM name during restore. This error is shown only to users of the Azure portal. The restore operation through PowerShell succeeds because it restores only the disks and doesn't create the VM. The error occurs when you explicitly create the VM after the disk restore operation.
- The specified virtual network configuration isn't correct: Specify a different virtual network configuration and try again.
- The specified cloud service is using a reserved IP that doesn't match the configuration of the virtual machine being restored: Specify a different cloud service that isn't using a reserved IP. Or choose another recovery point to restore from.
- The cloud service has reached its limit on the number of input endpoints: Retry the operation by specifying a different cloud service or by using an existing endpoint.
- The Recovery Services vault and target storage account are in two different regions: Make sure the storage account specified in the restore operation is in the same Azure region as your Recovery Services vault.
- The storage account specified for the restore operation isn't supported: Only Basic or Standard storage accounts with locally redundant or geo-redundant replication settings are supported. Select a supported storage account.
- The type of storage account specified for the restore operation isn't online: Make sure that the storage account specified in the restore operation is online. This error might happen because of a transient error in Azure Storage or because of an outage. Choose another storage account.
- The resource group quota has been reached: Delete some resource groups from the Azure portal or contact Azure Support to increase the limits.
- The selected subnet doesn't exist: Select a subnet that exists.
- The Backup service doesn't have authorization to access resources in your subscription: To resolve this error, first restore disks by using the steps in Restore backed-up disks. Then use the PowerShell steps in Create a VM from restored disks.
Backup or restore takes time
Set up the VM Agent
Typically, the VM Agent is already present in VMs that are created from the Azure gallery. But virtual machines that are migrated from on-premises datacenters won't have the VM Agent installed. For those VMs, the VM Agent needs to be installed explicitly.
Windows VMs - Set up the agent
- Download and install the agent MSI. You need Administrator privileges to finish the installation.
- For virtual machines created by using the classic deployment model, update the VM property to indicate that the agent is installed. This step isn't required for Azure Resource Manager virtual machines.
Linux VMs - Set up the agent
- Install the latest version of the agent from the distribution repository. For details on the package name, see the Linux Agent repository.
- For VMs created by using the classic deployment model, update the VM property and verify that the agent is installed. This step isn't required for Resource Manager virtual machines.
Update the VM Agent
Windows VMs - Update the agent
- To update the VM Agent, reinstall the VM Agent binaries. Before you update the agent, make sure no backup operations occur during the VM Agent update.
Linux VMs - Update the agent
To update the Linux VM Agent, follow the instructions in the article Updating the Linux VM Agent.
Always use the distribution repository to update the agent.
Don't download the agent code from GitHub. If the latest agent isn't available for your distribution, contact the distribution support for instructions to acquire the latest agent. You can also check the latest Windows Azure Linux agent information in the GitHub repository.
Validate VM Agent installation
Verify the VM Agent version on Windows VMs:
- Sign in to the Azure virtual machine and navigate to the folder C:\WindowsAzure\Packages. You should find the WaAppAgent.exe file.
- Right-click the file and go to Properties. Then select the Details tab. The Product Version field should be 2.6.1198.718 or higher.
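The version check above needs a numeric, field-by-field comparison. A plain string comparison would wrongly rank 2.6.1198.99 above 2.6.1198.718. A minimal sketch follows. The function names are hypothetical, and the sample versions other than the minimum are made up for illustration.

```python
# Minimum Product Version stated in this section.
MIN_AGENT_VERSION = "2.6.1198.718"


def parse_version(version: str) -> tuple:
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in version.strip().split("."))


def agent_is_current(version: str, minimum: str = MIN_AGENT_VERSION) -> bool:
    """Return True if version is at or above the minimum, compared numerically."""
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    # Hypothetical values as they might appear on the Details tab.
    for candidate in ("2.6.1198.718", "2.7.0.0", "2.6.1198.99"):
        print(candidate, agent_is_current(candidate))
```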
Troubleshoot VM snapshot issues
VM backup relies on issuing snapshot commands to underlying storage. Not having access to storage or delays in a snapshot task run can cause the backup job to fail. The following conditions can cause snapshot task failure:
VMs with SQL Server backup configured can cause snapshot task delay. By default, VM backup creates a VSS full backup on Windows VMs. VMs that run SQL Server, with SQL Server backup configured, can experience snapshot delays. If snapshot delays cause backup failures, set the following registry key:
VM status is reported incorrectly because the VM is shut down in RDP. If you used the remote desktop to shut down the virtual machine, verify that the VM status in the portal is correct. If the status isn't correct, use the Shutdown option in the portal VM dashboard to shut down the VM.
If more than four VMs share the same cloud service, spread the VMs across multiple backup policies. Stagger the backup times, so no more than four VM backups start at the same time. Try to separate the start times in the policies by at least an hour.
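The staggering guidance above can be sketched as a simple slot assignment: at most four VMs per start time, with start times one hour apart. The function name, VM names, and the first start time are hypothetical. Real schedules are set in the backup policies, not by a script like this.

```python
from datetime import datetime, timedelta

MAX_VMS_PER_SLOT = 4            # no more than four backups start together
SLOT_GAP = timedelta(hours=1)   # separate start times by at least an hour


def stagger_backups(vm_names, first_start, max_per_slot=MAX_VMS_PER_SLOT, gap=SLOT_GAP):
    """Map each VM name to a start time, filling one slot before opening the next."""
    schedule = {}
    for index, name in enumerate(vm_names):
        slot = index // max_per_slot
        schedule[name] = first_start + slot * gap
    return schedule


if __name__ == "__main__":
    vms = [f"vm-{i}" for i in range(10)]  # hypothetical VM names
    plan = stagger_backups(vms, datetime(2024, 1, 1, 22, 0))
    for name, start in plan.items():
        print(name, start.strftime("%Y-%m-%d %H:%M"))
```

Each distinct start time in the result would correspond to a separate backup policy.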
The VM runs at high CPU or memory. If the virtual machine runs at high memory or CPU usage, more than 90 percent, your snapshot task is queued and delayed. Eventually it times out. If this issue happens, try an on-demand backup.
DHCP must be enabled inside the guest for IaaS VM backup to work. If you need a static private IP, configure it through the Azure portal or PowerShell. Make sure the DHCP option inside the VM is enabled. Get more information on how to set up a static IP through PowerShell: