About disaster recovery of VMware VMs to Azure

This article provides an overview of disaster recovery for on-premises VMware VMs to Azure using the Azure Site Recovery service.

What is BCDR?

A business continuity and disaster recovery (BCDR) strategy helps keep your business up and running. During planned downtime and unexpected outages, BCDR keeps data safe and available, and ensures that apps continue running. In addition to platform BCDR features such as regional pairing and high-availability storage, Azure provides Recovery Services as an integral part of your BCDR solution. Recovery Services include:

  • Azure Backup backs up your on-premises and Azure VM data. You can back up files and folders, specific workloads, or an entire VM.
  • Azure Site Recovery provides resilience and disaster recovery for apps and workloads running on on-premises machines, or Azure IaaS VMs. Site Recovery orchestrates replication, and handles failover to Azure when outages occur. It also handles recovery from Azure to your primary site.

How does Site Recovery do disaster recovery?

  1. After preparing Azure and your on-premises site, you set up and enable replication for your on-premises machines.
  2. Site Recovery orchestrates initial replication of the machine, in accordance with your policy settings.
  3. After the initial replication, Site Recovery replicates delta changes to Azure.
  4. When everything's replicating as expected, you run a disaster recovery drill.
    • The drill helps ensure that failover will work as expected when a real need arises.
    • The drill performs a test failover without impacting your production environment.
  5. If an outage occurs, you run a full failover to Azure. You can fail over a single machine, or you can create a recovery plan that fails over multiple machines at the same time.
  6. On failover, Azure VMs are created from the VM data in Azure Storage. Users can continue accessing apps and workloads from the Azure VMs.
  7. When your on-premises site is available again, you fail back from Azure.
  8. After you fail back and are working from your primary site once more, you start replicating on-premises VMs to Azure again.
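
The lifecycle in steps 1 through 8 can be read as a small state machine. The following is a minimal sketch in Python, not Site Recovery's implementation or API: the state names, the action names, the ProtectedMachine class, and the run_recovery_plan helper are all hypothetical and exist only to illustrate the order of the steps.

```python
from enum import Enum, auto


class State(Enum):
    """Hypothetical lifecycle states for one protected machine."""
    NOT_PROTECTED = auto()
    INITIAL_REPLICATION = auto()
    DELTA_REPLICATION = auto()
    FAILED_OVER = auto()
    FAILED_BACK = auto()


# Allowed (action, current state) -> next state, following steps 1-8.
# The action names are illustrative, not real Site Recovery operations.
TRANSITIONS = {
    ("enable_replication", State.NOT_PROTECTED): State.INITIAL_REPLICATION,
    ("finish_initial_replication", State.INITIAL_REPLICATION): State.DELTA_REPLICATION,
    ("failover", State.DELTA_REPLICATION): State.FAILED_OVER,
    ("failback", State.FAILED_OVER): State.FAILED_BACK,
    ("resume_replication", State.FAILED_BACK): State.DELTA_REPLICATION,
}


class ProtectedMachine:
    def __init__(self, name):
        self.name = name
        self.state = State.NOT_PROTECTED
        self.drills_run = 0

    def apply(self, action):
        """Move to the next state, or raise if the action is out of order."""
        key = (action, self.state)
        if key not in TRANSITIONS:
            raise ValueError(f"{action!r} is not valid in state {self.state.name}")
        self.state = TRANSITIONS[key]
        return self.state

    def test_failover(self):
        """A drill requires ongoing replication and leaves the state unchanged."""
        if self.state is not State.DELTA_REPLICATION:
            raise ValueError("run a drill only while delta replication is active")
        self.drills_run += 1
        return self.state


def run_recovery_plan(machines):
    """Fail over several machines together, as a recovery plan would."""
    return [machine.apply("failover") for machine in machines]
```

In this sketch a drill never changes a machine's state, which mirrors the point that a test failover doesn't affect production, and an out-of-order action (for example, failing back before failing over) raises an error.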

How do I know if my environment is suitable for disaster recovery to Azure?

Site Recovery can replicate any workload running on a supported VMware VM or physical server. Here are the things you need to check in your environment:

  • If you're replicating VMware VMs, are you running the right versions of VMware virtualization servers? Check here.
  • Are the machines you want to replicate running a supported operating system? Check here.
  • For Linux disaster recovery, are machines running a supported file system/guest storage? Check here.
  • Do the machines you want to replicate comply with Azure requirements? Check here.
  • Is your network configuration supported? Check here.
  • Is your storage configuration supported? Check here.

What do I need to set up in Azure before I start?

In Azure you need to prepare the following:

  1. Verify that your Azure account has permissions to create VMs in Azure.
  2. Create a storage account to hold images of replicated machines.
  3. Create an Azure network that Azure VMs will join when they're created from storage after failover.
  4. Set up an Azure Recovery Services vault for Site Recovery. The vault resides in the Azure portal, and is used to deploy, configure, orchestrate, monitor, and troubleshoot your Site Recovery deployment.

Need more help?

Learn how to set up Azure by verifying your account, creating a storage account and network, and setting up a vault.

What do I need to set up on-premises before I start?

Here's what you need to do on-premises:

  1. You need to set up a couple of accounts:

    • If you're replicating VMware VMs, an account is needed for Site Recovery to access vCenter Server or vSphere ESXi hosts to automatically discover VMs.
    • An account is needed to install the Site Recovery Mobility service agent on each physical machine or VM you want to replicate.
  2. If you haven't already done so, check the compatibility of your VMware infrastructure.

  3. Ensure that you can connect to Azure VMs after a failover. To do this, set up RDP on on-premises Windows machines, or SSH on on-premises Linux machines.
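
As a rough way to confirm that RDP or SSH is listening on a machine, you can test whether its TCP port accepts connections. The following Python sketch uses only the standard library. It assumes the default ports (3389 for RDP, 22 for SSH); the port_is_open helper and the address in the usage note are hypothetical and aren't part of Site Recovery.

```python
import socket

# Default ports: RDP listens on TCP 3389, SSH on TCP 22.
# A machine configured with non-default ports needs different values.
DEFAULT_PORTS = {"rdp": 3389, "ssh": 22}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, port_is_open("10.0.0.4", DEFAULT_PORTS["rdp"]) checks a Windows machine at a placeholder address. A successful connection shows only that the port is reachable from where the script runs. It doesn't verify credentials, firewall rules on other network paths, or connectivity to the Azure VM after failover.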

How do I set up disaster recovery?

After you have your Azure and on-premises infrastructure in place, you can set up disaster recovery.

  1. To understand the components that you'll need to deploy, review the VMware to Azure architecture, and the physical to Azure architecture. There are a number of components, so it's important to understand how they all fit together.
  2. Source environment: As a first step in deployment, you set up your replication source environment. You specify what you want to replicate, and where you want to replicate to.
  3. Configuration server: You need to set up a configuration server in your on-premises source environment:
    • The configuration server is a single on-premises machine. For VMware disaster recovery, we recommend that you deploy it as a VMware VM from a downloadable OVF template.
    • The configuration server coordinates communications between on-premises and Azure.
    • A couple of other components run on the configuration server machine.
      • The process server receives, optimizes, and sends replication data to Azure storage. It also handles automatic installation of the Mobility service on machines you want to replicate, and performs automatic discovery of VMs on VMware servers.
      • The master target server handles replication data during failback from Azure.
    • Setup includes registering the configuration server in the vault, downloading MySQL Server and VMware PowerCLI, and specifying the accounts created for automatic discovery and Mobility service installation.
  4. Target environment: You set up your target Azure environment by specifying your Azure subscription, storage, and network settings.
  5. Replication policy: You specify how replication should occur. Settings include how often recovery points are created and stored, and whether app-consistent snapshots should be created.
  6. Enable replication: You enable replication for on-premises machines. If you created an account to install the Mobility service, the service is installed automatically when you enable replication for a machine.
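
To make the replication policy in step 5 more concrete, here is a minimal sketch of the settings it describes, modeled as a plain Python data class. This isn't the Site Recovery API: the class name, field names, units, and example values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical model of the policy settings named in step 5."""
    name: str
    # How long recovery points are kept (assumed unit: hours).
    recovery_point_retention_hours: int
    # How often app-consistent snapshots are taken (assumed unit: minutes).
    # Zero means app-consistent snapshots are turned off.
    app_consistent_snapshot_minutes: int = 0

    def __post_init__(self):
        # Reject values that could not describe a usable policy.
        if self.recovery_point_retention_hours <= 0:
            raise ValueError("retention must be a positive number of hours")
        if self.app_consistent_snapshot_minutes < 0:
            raise ValueError("snapshot frequency cannot be negative")

    @property
    def app_consistent_enabled(self):
        return self.app_consistent_snapshot_minutes > 0
```

For example, ReplicationPolicy("example-policy", 24, 60) describes a policy that keeps recovery points for 24 hours and takes an app-consistent snapshot every 60 minutes. Validating the values when the object is created surfaces a bad setting before any machine is associated with the policy.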

Something went wrong. How do I troubleshoot?

  • As a first step, try monitoring your deployment to verify the status of replicated items, jobs, and infrastructure issues, and identify any errors.
  • If you're unable to complete the initial replication, or ongoing replication isn't working as expected, review this article for common errors and troubleshooting tips.
  • If you're having issues with the automatic installation of the Mobility service on machines you want to replicate, review common errors in this article.
  • If failover isn't working as expected, check common errors in this article.
  • If failback isn't working, check whether your issue appears in this article.

Next steps

With replication now in place, you should run a disaster recovery drill to ensure that failover works as expected.