Deploy disaster recovery with VMware Site Recovery Manager

This article explains how to implement disaster recovery for on-premises VMware virtual machines (VMs) or Azure VMware Solution-based VMs. The solution in this article uses VMware Site Recovery Manager (SRM) and vSphere Replication with Azure VMware Solution. Instances of SRM and replication servers are deployed at both the protected and the recovery sites.

SRM is a disaster recovery solution designed to minimize downtime of the virtual machines in an Azure VMware Solution environment if a disaster occurs. SRM automates and orchestrates failover and failback. Its built-in non-disruptive testing helps you verify that your recovery time objectives are met. Overall, SRM simplifies management through automation and provides fast, predictable recovery times.

vSphere Replication is VMware's hypervisor-based replication technology for vSphere VMs. It protects VMs from partial or complete site failures. In addition, it simplifies DR protection through storage-independent, VM-centric replication. vSphere Replication is configured on a per-VM basis, allowing more control over which VMs are replicated.

In this article, you'll implement disaster recovery for on-premises VMware virtual machines (VMs) or Azure VMware Solution-based VMs.

Supported scenarios

SRM helps you plan, test, and run the recovery of VMs between a protected vCenter Server site and a recovery vCenter Server site. You can use SRM with Azure VMware Solution with the following two DR scenarios:

  • On-premises VMware to Azure VMware Solution private cloud disaster recovery
  • Primary Azure VMware Solution to secondary Azure VMware Solution private cloud disaster recovery

The diagram shows the deployment of the primary Azure VMware Solution to secondary Azure VMware Solution scenario.

Diagram showing the VMware Site Recovery Manager (SRM) disaster recovery solution in Azure VMware Solution.

You can use SRM to implement different types of recovery, such as:

  • Planned migration commences when both the primary and secondary Azure VMware Solution sites are running and fully functional. It's an orderly migration of virtual machines from the protected site to the recovery site, so no data loss is expected.

  • Disaster recovery using SRM can be invoked when the protected Azure VMware Solution site goes offline unexpectedly. Site Recovery Manager orchestrates the recovery process with the replication mechanisms to minimize data loss and system downtime.

    In Azure VMware Solution, only individual VMs can be protected on a host by using SRM in combination with vSphere Replication.

  • Bidirectional Protection uses a single set of paired SRM sites to protect VMs in both directions. Each site can simultaneously be a protected site and a recovery site, but for a different set of VMs.

Important

Azure VMware Solution doesn't support:

  • Array-based replication and storage policy protection groups
  • VVOLs Protection Groups
  • SRM IP customization using SRM command-line tools
  • One-to-Many and Many-to-One topology
  • Custom SRM plug-in identifier or extension ID

Deployment workflow

The workflow diagram shows the primary Azure VMware Solution to secondary Azure VMware Solution workflow. It also shows the steps to take within the Azure portal and the VMware environments of Azure VMware Solution to achieve end-to-end protection of VMs.

Diagram showing the deployment workflow for VMware Site Recovery Manager on Azure VMware Solution.

Prerequisites

Make sure you've explicitly granted the remote user the VRM administrator and SRM administrator roles in the remote vCenter.

Scenario: On-premises to Azure VMware Solution

  • Azure VMware Solution private cloud deployed as a secondary region.

  • DNS resolution to on-premises SRM and virtual cloud appliances.

    Note

    For private clouds created on or after July 1, 2021, you can configure private DNS resolution. For private clouds created before July 1, 2021, that need a private DNS resolution, open a support request to request Private DNS configuration.

  • ExpressRoute connectivity between on-premises and Azure VMware Solution - 2 Gbps.
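The DNS prerequisite above can be spot-checked before you start pairing. The following is a minimal sketch, not an official tool: it assumes you run it from a machine that uses the same DNS servers as the site you're validating, and the hostnames in the usage note are hypothetical placeholders for your own SRM and vSphere Replication appliance names.

```python
import socket
from typing import Callable, Dict, Iterable, List, Optional


def _system_resolver(name: str) -> List[str]:
    """Resolve a hostname with the operating system's resolver."""
    infos = socket.getaddrinfo(name, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


def check_dns(
    hostnames: Iterable[str],
    resolver: Optional[Callable[[str], List[str]]] = None,
) -> Dict[str, List[str]]:
    """Map each hostname to its resolved addresses (empty list if resolution fails)."""
    resolve = resolver or _system_resolver
    results: Dict[str, List[str]] = {}
    for name in hostnames:
        try:
            results[name] = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = []
    return results
```

For example, `check_dns(["srm.onprem.example", "vr.onprem.example"])` (hypothetical names) returns a dictionary in which any hostname mapped to an empty list didn't resolve and needs attention. The `resolver` parameter exists only so the function can be exercised without network access.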

Scenario: Primary Azure VMware Solution to secondary

  • Azure VMware Solution private clouds must be deployed in both the primary and secondary regions.

    Screenshot showing two Azure VMware Solution private clouds in separate regions.

  • Connectivity, like ExpressRoute Global Reach, between the source and target Azure VMware Solution private cloud.

    Screenshot showing the connectivity between the source and target private clouds.

Install SRM in Azure VMware Solution

  1. In your on-premises datacenter, install VMware SRM and vSphere.

    Note

    Use the Two-site Topology with one vCenter Server instance per PSC deployment model. Also, make sure that the required vSphere Replication Network ports are opened.

  2. In your Azure VMware Solution private cloud, under Manage, select Add-ons > Disaster recovery.

    The default CloudAdmin user in the Azure VMware Solution private cloud doesn't have sufficient privileges to install VMware SRM or vSphere Replication. The installation process involves multiple steps outlined in the Prerequisites section. Instead, you can install VMware SRM with vSphere Replication as an add-on service from your Azure VMware Solution private cloud.

    Screenshot of Azure VMware Solution private cloud to install VMware SRM with vSphere Replication as an add-on

  3. From the Disaster Recovery Solution drop-down, select VMware Site Recovery Manager (SRM) – vSphere Replication.

    Screenshot showing the Disaster recovery tab under Add-ons with VMware Site Recovery Manager (SRM) - vSphere replication selected.

  4. Provide the license key, select the option to agree with the terms and conditions, and then select Install.

    Note

    If you don't provide the license key, SRM is installed in evaluation mode. The license is used only to enable VMware SRM.

    Screenshot showing the Disaster recovery tab under Add-ons with the License key field selected.

Install the vSphere Replication appliance

After the SRM appliance installs successfully, you'll need to install the vSphere Replication appliances. Each replication server accommodates up to 200 protected VMs. Scale the number of replication servers in or out to match your needs.
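As a rough sizing aid, the 200-VMs-per-server figure translates into a simple ceiling division. The sketch below is illustrative only; the function and constant names are made up for this example, and it doesn't account for any minimum or maximum the portal slider might enforce.

```python
import math

# Stated in this article: each replication server accommodates up to 200 protected VMs.
VMS_PER_REPLICATION_SERVER = 200


def replication_servers_needed(protected_vms: int) -> int:
    """Return the smallest number of replication servers that covers the VM count."""
    if protected_vms < 0:
        raise ValueError("protected_vms must not be negative")
    return math.ceil(protected_vms / VMS_PER_REPLICATION_SERVER)
```

For instance, 450 protected VMs need three replication servers, while exactly 200 VMs fit on one.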

  1. From the Replication using drop-down, on the Disaster recovery tab, select vSphere Replication.

    Screenshot showing the vSphere Replication selected for the Replication using option.

  2. Move the vSphere server slider to indicate the number of replication servers you want based on the number of VMs to be protected. Then select Install.

    Screenshot showing how to increase or decrease the number of replication servers.

  3. Once installed, verify that both SRM and the vSphere Replication appliances are installed.

    Tip

    The Uninstall button indicates that both SRM and the vSphere Replication appliances are currently installed.

    Screenshot showing that both SRM and the replication appliance are installed.

Configure site pairing in vCenter

After installing VMware SRM and vSphere Replication, you need to complete the configuration and site pairing in vCenter.

  1. Sign in to vCenter as cloudadmin@vsphere.local.

  2. Navigate to Site Recovery, check the status of both vSphere Replication and VMware SRM, and then select OPEN Site Recovery to launch the client.

    Screenshot showing vSphere Client with the vSphere Replication and Site Recovery Manager installation status as OK.

  3. Select NEW SITE PAIR in the Site Recovery (SR) client in the new tab that opens.

    Screenshot showing vSphere Client with the New Site Pair button selected for Site Recovery.

  4. Enter the remote site details, and then select NEXT.

    Note

    An Azure VMware Solution private cloud operates with an embedded Platform Services Controller (PSC), so only one local vCenter can be selected. If the remote vCenter is using an embedded PSC, use the vCenter's FQDN (or its IP address) and port to specify the PSC.

    The remote user must have sufficient permissions to perform the pairings. An easy way to ensure this is to give that user the VRM administrator and SRM administrator roles in the remote vCenter. For a remote Azure VMware Solution private cloud, cloudadmin is configured with those roles.

    Screenshot showing the Site details for the new site pair.

  5. Select CONNECT to accept the certificate for the remote vCenter.

    At this point, the client should discover the VRM and SRM appliances on both sides as services to pair.

  6. Select the appliances to pair and then select NEXT.

    Screenshot showing the vCenter Server and services details for the new site pair.

  7. Select CONNECT to accept the certificates for the remote VMware SRM and the remote vCenter (again).

  8. Select CONNECT to accept the certificates for the local VMware SRM and the local vCenter.

  9. Review the settings and then select FINISH.

    If the pairing succeeds, the client displays another panel for the pairing. If it fails, an alarm is reported.

  10. In the lower-right corner, select the double-up arrow to expand the panel to show Recent Tasks and Alarms.

    Note

    The SR client sometimes takes a long time to refresh. If an operation seems to take too long or appears "stuck", select the refresh icon on the menu bar.

  11. Select VIEW DETAILS to open the panel for remote site pairing, which opens a dialog to sign in to the remote vCenter.

    Screenshot showing the new site pair details for Site Recovery Manager and vSphere Replication.

  12. Enter the username with sufficient permissions to do replication and site recovery and then select LOG IN.

    The login used earlier to establish the pairing, which is often a different user, is a one-time action. By contrast, the SR client requires this login every time the client is launched to work with the pairing.

    Note

    The user with sufficient permissions should have VRM administrator and SRM administrator roles given to them in the remote vCenter. The user should also have access to the remote vCenter inventory, like folders and datastores. For a remote Azure VMware Solution private cloud, the cloudadmin user has the appropriate permissions and access.

    Screenshot showing the vCenter Server credentials.

    You'll see a warning message indicating that the embedded VRS in the local VRM isn't running. This is because Azure VMware Solution doesn't use the embedded VRS in an Azure VMware Solution private cloud. Instead, it uses VRS appliances.

    Screenshot showing the site pair summary for Site Recovery Manager and vSphere Replication.

SRM protection, reprotection, and failback

After you've created the site pairing, follow the VMware documentation mentioned below for end-to-end protection of VMs from the Azure portal.

Ongoing management of your SRM solution

While Microsoft aims to simplify VMware SRM and vSphere Replication installation on an Azure VMware Solution private cloud, you are responsible for managing your license and the day-to-day operation of the disaster recovery solution.

Scale limitations

Scale limitations are per private cloud.

Configuration | Limit
Number of protected virtual machines | 1000
Number of virtual machines per recovery plan | 1000
Number of protection groups per recovery plan | 250
RPO values | 5 min or higher*
Total number of virtual machines per protection group | 500
Total number of recovery plans | 250

* For information about Recovery Point Objective (RPO) lower than 15 minutes, see How the 5 Minute Recovery Point Objective Works in the vSphere Replication Administration guide.
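When you plan protection groups and recovery plans, it can help to check the intended numbers against these limits up front. The following is a minimal sketch that encodes the limits from this section; the dictionary keys and the `check_limits` function are hypothetical names chosen for illustration, not part of any SRM or Azure API.

```python
from typing import Dict, List

# Per-private-cloud limits as listed in this article.
LIMITS: Dict[str, int] = {
    "protected_vms": 1000,
    "vms_per_recovery_plan": 1000,
    "protection_groups_per_recovery_plan": 250,
    "vms_per_protection_group": 500,
    "recovery_plans": 250,
}
MIN_RPO_MINUTES = 5  # "5 min or higher"


def check_limits(planned: Dict[str, int], rpo_minutes: int) -> List[str]:
    """Return a list of human-readable violations (empty if the plan fits)."""
    problems: List[str] = []
    for key, limit in LIMITS.items():
        value = planned.get(key, 0)  # values not supplied are treated as zero
        if value > limit:
            problems.append(f"{key}: {value} exceeds limit {limit}")
    if rpo_minutes < MIN_RPO_MINUTES:
        problems.append(f"rpo_minutes: {rpo_minutes} is below minimum {MIN_RPO_MINUTES}")
    return problems
```

Calling `check_limits({"protected_vms": 1200}, rpo_minutes=15)` reports one violation for the protected VM count, while a plan within every limit returns an empty list.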

SRM licenses

You can install VMware SRM using an evaluation license or a production license. The evaluation license is valid for 60 days. After the evaluation period, you'll be required to obtain a production license of VMware SRM.

You can't use pre-existing on-premises VMware SRM licenses for your Azure VMware Solution private cloud. Work with your sales teams and VMware to acquire a new term-based production license of VMware SRM.

Once a production license of SRM is acquired, you'll be able to use the Azure VMware Solution portal to update SRM with the new production license.

Uninstall SRM

If you no longer require SRM, you must uninstall it cleanly. Before you uninstall SRM, you must remove all SRM configurations from both sites in the correct order. If you don't remove all configurations before uninstalling SRM, some SRM components, such as placeholder VMs, might remain in the Azure VMware Solution infrastructure.

  1. In the vSphere Client or the vSphere Web Client, select Site Recovery > Open Site Recovery.

  2. On the Site Recovery home tab, select a site pair and select View Details.

  3. Select the Recovery Plans tab, right-click on a recovery plan and select Delete.

    Note

    You cannot delete recovery plans that are running.

  4. Select the Protection Groups tab, select a protection group, and select the Virtual Machines tab.

  5. Highlight all virtual machines, right-click, and select Remove Protection.

    Removing protection from a VM deletes the placeholder VM from the recovery site. Repeat this operation for all protection groups.

  6. In the Protection Groups tab, right-click a protection group and select Delete.

    Note

    You cannot delete a protection group that is included in a recovery plan. You cannot delete vSphere Replication protection groups that contain virtual machines on which protection is still configured.

  7. Select Site Pair > Configure and remove all inventory mappings.

    a. Select each of the Network Mappings, Folder Mappings, and Resource Mappings tabs.

    b. In each tab, select a site, right-click a mapping, and select Delete.

  8. For both sites, select Placeholder Datastores, right-click the placeholder datastore, and select Remove.

  9. Select Site Pair > Summary, and select Break Site Pair.

    Note

    Breaking the site pairing removes all information related to registering the local Site Recovery Manager with the Site Recovery Manager, vCenter Server, and Platform Services Controller on the remote site.

  10. In your private cloud, under Manage, select Add-ons > Disaster recovery, and then select Uninstall the replication appliances.

  11. Once replication appliances are uninstalled, from the Disaster recovery tab, select Uninstall for the Site Recovery Manager.

  12. Repeat these steps on the secondary Azure VMware Solution site.
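Because the order of these steps matters, a runbook or checklist script can refuse to move on until the earlier steps are done. The sketch below is one way to model that, assuming the step labels shown here as shorthand for the numbered steps above. It only tracks progress and doesn't call SRM or Azure.

```python
from typing import Iterable, Optional

# Shorthand labels for the teardown steps above, in the required order.
TEARDOWN_ORDER = [
    "delete recovery plans",
    "remove VM protection",
    "delete protection groups",
    "remove inventory mappings",
    "remove placeholder datastores",
    "break site pair",
    "uninstall replication appliances",
    "uninstall Site Recovery Manager",
]


def next_step(completed: Iterable[str]) -> Optional[str]:
    """Return the next step to perform, or None when the teardown is finished.

    Raises ValueError if a later step is marked done while an earlier one isn't,
    or if an unknown step label is supplied.
    """
    done = set(completed)
    unknown = done - set(TEARDOWN_ORDER)
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    for index, step in enumerate(TEARDOWN_ORDER):
        if step not in done:
            out_of_order = done.intersection(TEARDOWN_ORDER[index + 1:])
            if out_of_order:
                raise ValueError(f"'{step}' must be completed before {sorted(out_of_order)}")
            return step
    return None
```

With nothing completed, `next_step([])` returns the recovery plan deletion step. Marking the site pair as broken before the protection groups are deleted raises an error instead of silently continuing.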

Support

VMware SRM is a Disaster Recovery solution from VMware.

Microsoft supports only the installation and uninstallation of SRM and vSphere Replication Manager, and the scaling up or down of vSphere Replication appliances within Azure VMware Solution.

For all other issues, such as configuration and replication, contact VMware for support.

VMware and Microsoft support teams will engage each other as needed to troubleshoot SRM issues on Azure VMware Solution.

References