High-availability architecture and scenarios for SAP NetWeaver

Terminology definitions

High availability: Refers to a set of technologies that minimize IT disruptions by providing business continuity of IT services through redundant, fault-tolerant, or failover-protected components inside the same data center. In our case, the data center resides within one Azure region.

Disaster recovery: Also refers to minimizing the disruption of IT services and to their recovery, but across data centers that might be hundreds of miles apart. In our case, the data centers might reside in various Azure regions within the same geopolitical region or in locations that you establish as a customer.

Overview of high availability

SAP high availability in Azure can be separated into three types:

  • Azure infrastructure high availability:

    This type covers the high availability of compute (VMs), network, and storage, and its benefits for increasing the availability of SAP applications.

  • Utilizing Azure infrastructure VM restart to achieve higher availability of SAP applications:

    If you decide not to use functionalities such as Windows Server Failover Clustering (WSFC) or Pacemaker on Linux, Azure VM restart is utilized. It protects SAP systems against planned and unplanned downtime of the Azure physical server infrastructure and overall underlying Azure platform.

  • SAP application high availability:

    To achieve full SAP system high availability, you must protect all critical SAP system components. For example:

    • Redundant SAP application servers.
    • Unique components. An example might be a single point of failure (SPOF) component, such as an SAP ASCS/SCS instance or a database management system (DBMS).

SAP high availability in Azure differs from SAP high availability in an on-premises physical or virtual environment. The paper SAP NetWeaver high availability and business continuity in virtual environments with VMware and Hyper-V on Microsoft Windows describes standard SAP high-availability configurations in virtualized environments on Windows.

There is no sapinst-integrated SAP high-availability configuration for Linux as there is for Windows. For information about SAP high availability on-premises for Linux, see High availability partner information.

Azure infrastructure high availability

SLA for single-instance virtual machines

There is currently a single-VM SLA of 99.9% with premium storage. To estimate the availability of a single VM, you can multiply the availabilities stated in the various applicable Azure Service Level Agreements.

The basis for the calculation is 30 days per month, or 43,200 minutes. For example, a 0.05% downtime corresponds to 21.6 minutes. As usual, the availability of the various services is calculated in the following way:

(Availability Service #1/100) * (Availability Service #2/100) * (Availability Service #3/100) *…

For example:

(99.95/100) * (99.9/100) * (99.9/100) = 0.9975 or an overall availability of 99.75%.
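
The calculation above can be reproduced with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the three percentages are the example values from the text, not the SLAs of specific services.

```python
from math import prod

MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes, the 30-day basis used above


def composite_availability(sla_percentages):
    """Multiply the individual SLAs (given in percent) into one fraction."""
    return prod(p / 100 for p in sla_percentages)


def monthly_downtime_minutes(availability):
    """Expected downtime per 30-day month for an availability fraction."""
    return (1 - availability) * MINUTES_PER_MONTH


# Example values from the text above.
overall = composite_availability([99.95, 99.9, 99.9])
print(f"Overall availability: {overall * 100:.2f}%")
print(f"Downtime per month: {monthly_downtime_minutes(overall):.1f} minutes")
```

With the example values, the overall availability of 99.75% corresponds to roughly 108 minutes of downtime in a 30-day month.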

Multiple instances of virtual machines in the same availability set

For all virtual machines that have two or more instances deployed in the same availability set, we guarantee that you will have virtual machine connectivity to at least one instance at least 99.95% of the time.

When two or more VMs are part of the same availability set, each virtual machine in the availability set is assigned an update domain and a fault domain by the underlying Azure platform.

  • Update domains guarantee that multiple VMs are not rebooted at the same time during the planned maintenance of an Azure infrastructure. Only one VM is rebooted at a time.

  • Fault domains guarantee that VMs are deployed on hardware components that do not share a common power source and network switch. When servers, a network switch, or a power source undergo an unplanned downtime, only one VM is affected.

For more information, see Manage the availability of Windows virtual machines in Azure.

An availability set is used for achieving high availability of:

  • Redundant SAP application servers.
  • Clusters with two or more nodes (VMs, for example) that protect SPOFs such as an SAP ASCS/SCS instance or a DBMS.

Azure Availability Zones

Azure is in the process of rolling out the concept of Azure Availability Zones across Azure regions. In Azure regions where Availability Zones are offered, the region has multiple data centers that are independent in their supply of power, cooling, and network. The reason for offering different zones within a single Azure region is to enable you to deploy applications across two or three of the Availability Zones offered. Assuming that issues with power sources or the network affect the infrastructure of only one Availability Zone, your application deployment within the Azure region remains functional, though possibly with reduced capacity because some VMs in one zone might be lost. The VMs in the other zones are still up and running. The Azure regions that offer zones are listed in Azure Availability Zones.

When you use Availability Zones, there are some things to consider:

  • You can't deploy Azure Availability Sets within an Availability Zone. You need to choose either an Availability Zone or an Availability Set as deployment frame for a VM.
  • You can't use the Basic Load Balancer to create failover cluster solutions based on Windows Server Failover Clustering or Linux Pacemaker. Instead, you need to use the Azure Standard Load Balancer SKU.
  • Azure Availability Zones don't guarantee a certain distance between the different zones within one region.
  • The network latency between Availability Zones can differ from one Azure region to another. In some cases, you can reasonably run the SAP application layer across different zones because the network latency from one zone to the active DBMS VM is still acceptable in terms of its impact on business processes. In other customer scenarios, the latency between the active DBMS VM in one zone and an SAP application instance in a VM in another zone can be too intrusive and not acceptable for the SAP business processes. As a result, the deployment architecture needs to differ: an active/active architecture for the application if the latency is acceptable, or an active/passive architecture if the latency is too high.
  • Using Azure managed disks is mandatory for deploying into Azure Availability Zones.
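
The latency consideration in the list above can be expressed as a simple decision rule. The following sketch is illustrative only: the function name, the zone names, the latency figures, and the threshold are all hypothetical. The text does not prescribe a latency limit, so you would need to derive one from your own measurements and business processes.

```python
def plan_zonal_application_layer(latency_to_dbms_ms, max_acceptable_ms):
    """Suggest a deployment pattern for the SAP application layer.

    latency_to_dbms_ms: mapping of zone name -> measured network latency (ms)
        from that zone to the active DBMS VM (hypothetical measurements).
    max_acceptable_ms: workload-specific limit chosen by you; the text above
        does not prescribe a value.
    """
    usable = sorted(zone for zone, ms in latency_to_dbms_ms.items()
                    if ms <= max_acceptable_ms)
    too_slow = sorted(zone for zone in latency_to_dbms_ms if zone not in usable)
    # Active/active only if every zone can host active application instances.
    pattern = "active/active" if not too_slow else "active/passive"
    return pattern, usable, too_slow


# Placeholder numbers for illustration only, not measured Azure latencies.
measurements = {"zone-1": 0.3, "zone-2": 0.8, "zone-3": 2.4}
pattern, usable, too_slow = plan_zonal_application_layer(measurements, 1.0)
print(pattern, usable, too_slow)
```

In this sketch, a single zone above the limit is enough to fall back to active/passive. A real design might instead keep active instances only in the zones that meet the limit.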

Planned and unplanned maintenance of virtual machines

Two types of Azure platform events can affect the availability of your virtual machines:

  • Planned maintenance events are periodic updates made by Microsoft to the underlying Azure platform. The updates improve overall reliability, performance, and security of the platform infrastructure that your virtual machines run on.

  • Unplanned maintenance events occur when the hardware or physical infrastructure underlying your virtual machine has failed in some way. It might include local network failures, local disk failures, or other rack level failures. When such a failure is detected, the Azure platform automatically migrates your virtual machine from the unhealthy physical server that hosts your virtual machine to a healthy physical server. Such events are rare, but they might also cause your virtual machine to reboot.

For more information, see Manage the availability of Windows virtual machines in Azure.

Azure Storage redundancy

The data in your storage account is always replicated to ensure durability and high availability, meeting the Azure Storage SLA even in the face of transient hardware failures.

Because Azure Storage keeps three images of the data by default, the use of RAID 5 or RAID 1 across multiple Azure disks is unnecessary.

For more information, see Azure Storage replication.

Azure Managed Disks

Managed Disks is a resource type in Azure Resource Manager that is recommended to be used instead of virtual hard disks (VHDs) that are stored in Azure storage accounts. Managed disks automatically align with an Azure availability set of the virtual machine they are attached to. They increase the availability of your virtual machine and the services that are running on it.

For more information, see Azure Managed Disks overview.

We recommend that you use managed disks because they simplify the deployment and management of your virtual machines.

Utilizing Azure infrastructure high availability to achieve higher availability of SAP applications

If you decide not to use functionalities such as WSFC or Pacemaker on Linux (currently supported only for SUSE Linux Enterprise Server [SLES] 12 and later), Azure VM restart is utilized. It protects SAP systems against planned and unplanned downtime of the Azure physical server infrastructure and overall underlying Azure platform.

For more information about this approach, see Utilize Azure infrastructure VM restart to achieve higher availability of the SAP system.

High availability of SAP applications on Azure IaaS

To achieve full SAP system high availability, you must protect all critical SAP system components. For example:

  • Redundant SAP application servers.
  • Unique components. An example might be a single point of failure (SPOF) component, such as an SAP ASCS/SCS instance or a database management system (DBMS).

The next sections discuss how to achieve high availability for all three critical SAP system components.

High-availability architecture for SAP application servers

This section applies to:

Windows and Linux

You usually don't need a specific high-availability solution for the SAP application server and dialog instances. You achieve high availability through redundancy by configuring multiple dialog instances on different Azure virtual machines. You should have at least two SAP application instances installed on two Azure virtual machines.

Figure 1: High-availability SAP application server

You must place all virtual machines that host SAP application server instances in the same Azure availability set. An Azure availability set ensures that:

  • Virtual machines are distributed across different update domains.
    Update domains ensure that the virtual machines aren't updated at the same time during planned maintenance downtime.

    The basic functionality, which builds on different update and fault domains within an Azure scale unit, was already introduced in the section about multiple instances of virtual machines in the same availability set.

  • Virtual machines are distributed across different fault domains.
    Fault domains ensure that virtual machines are deployed so that no single point of failure affects the availability of all virtual machines.

The number of update and fault domains that can be used by an Azure availability set within an Azure scale unit is finite. If you keep adding VMs to a single availability set, two or more VMs will eventually end up in the same fault or update domain.

If you deploy a few SAP application server instances in their dedicated VMs, assuming that we have five update domains, the following picture emerges. The actual maximum number of update and fault domains within an availability set might change in the future:

Figure 2: High availability of SAP application servers in an Azure availability set
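
To see why VMs eventually share a domain, consider the following minimal sketch. It assumes five update domains, as in the example above, and a simple round-robin assignment. The round-robin rule, the function name, and the VM names are assumptions for illustration; the actual placement is decided by the Azure platform.

```python
from collections import defaultdict


def assign_update_domains(vm_names, update_domain_count=5):
    """Assign VMs to update domains round-robin (an illustrative assumption)."""
    domains = defaultdict(list)
    for index, name in enumerate(vm_names):
        domains[index % update_domain_count].append(name)
    return dict(domains)


# Seven hypothetical SAP application server VMs in one availability set.
vms = [f"sap-app-{n}" for n in range(1, 8)]
for domain, members in assign_update_domains(vms).items():
    print(f"Update domain {domain}: {', '.join(members)}")
```

With seven VMs and five update domains, two of the domains necessarily hold two VMs each, so those pairs can be rebooted together during planned maintenance.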

For more information, see Manage the availability of Windows virtual machines in Azure.

For more information, see the Azure availability sets section of the Azure virtual machines planning and implementation for SAP NetWeaver document.

Unmanaged disks only: Because the Azure storage account is a potential single point of failure, it's important to have at least two Azure storage accounts, in which at least two virtual machines are distributed. In an ideal setup, the disks of each virtual machine that is running an SAP dialog instance would be deployed in a different storage account.
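
As a rough illustration of the unmanaged-disk guidance above, the following sketch spreads VMs over storage accounts and reports how many VMs would be affected if one account failed. The function names, VM names, and storage account names are hypothetical, and the round-robin placement is only one possible way to distribute the VMs.

```python
from collections import Counter


def spread_over_storage_accounts(vm_names, storage_accounts):
    """Place each VM's disks in a storage account, round-robin."""
    if len(storage_accounts) < 2:
        raise ValueError("use at least two storage accounts")
    return {vm: storage_accounts[i % len(storage_accounts)]
            for i, vm in enumerate(vm_names)}


def worst_case_vms_lost(placement):
    """Number of VMs affected if the most heavily used account fails."""
    return max(Counter(placement.values()).values())


# Hypothetical names for illustration.
placement = spread_over_storage_accounts(
    ["sap-di-1", "sap-di-2", "sap-di-3"], ["sapstorage1", "sapstorage2"])
print(placement)
print("VMs affected if the busiest account fails:", worst_case_vms_lost(placement))
```

In the ideal setup described above, there is one storage account per VM, so the worst case drops to a single VM.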

Important

We strongly recommend that you use Azure managed disks for your SAP high-availability installations. Because managed disks automatically align with the availability set of the virtual machine they are attached to, they increase the availability of your virtual machine and the services that are running on it.

High-availability architecture for an SAP ASCS/SCS instance on Windows

Windows

You can use a WSFC solution to protect the SAP ASCS/SCS instance. The solution has two variants:

High-availability architecture for an SAP ASCS/SCS instance on Linux

Linux

For more information about clustering the SAP ASCS/SCS instance by using the SLES cluster framework, see High availability for SAP NetWeaver on Azure VMs on SUSE Linux Enterprise Server for SAP applications. For an alternative HA architecture on SLES that doesn't require highly available NFS, see High-availability guide for SAP NetWeaver on SUSE Linux Enterprise Server with Azure NetApp Files for SAP applications.

For more information about clustering the SAP ASCS/SCS instance by using the Red Hat cluster framework, see Azure Virtual Machines high availability for SAP NetWeaver on Red Hat Enterprise Linux.

SAP NetWeaver multi-SID configuration for a clustered SAP ASCS/SCS instance

Windows

Currently, multi-SID is supported only with WSFC. Multi-SID is supported with both file share and shared disk.

For more information about multi-SID high-availability architecture, see:

High-availability DBMS instance

The DBMS is also a single point of failure (SPOF) in an SAP system. You need to protect it by using a high-availability solution. The following figure shows a SQL Server AlwaysOn high-availability solution in Azure, with Windows Server Failover Clustering and the Azure internal load balancer. SQL Server AlwaysOn replicates DBMS data and log files by using its own DBMS replication. In this case, you don't need a cluster shared disk, which simplifies the entire setup.

Figure 3: Example of a high-availability SAP DBMS, with SQL Server AlwaysOn

For more information about clustering SQL Server DBMS in Azure by using the Azure Resource Manager deployment model, see these articles:

For more information about clustering SAP HANA DBMS in Azure by using the Azure Resource Manager deployment model, see High availability of SAP HANA on Azure virtual machines (VMs).