High availability for NFS on Azure VMs on SUSE Linux Enterprise Server

This article describes how to deploy and configure the virtual machines, install the cluster framework, and install a highly available NFS server that can be used to store the shared data of a highly available SAP system. This guide describes how to set up a highly available NFS server that is used by two SAP systems, NW1 and NW2. The names of the resources (for example, virtual machines and virtual networks) in the example assume that you have used the SAP file server template with the resource prefix prod.

Read the following SAP Notes and papers first

Overview

To achieve high availability, SAP NetWeaver requires an NFS server. The NFS server is configured in a separate cluster and can be used by multiple SAP systems.

SAP NetWeaver High Availability overview

The NFS server uses a dedicated virtual hostname and virtual IP addresses for every SAP system that uses this NFS server. On Azure, a load balancer is required to use a virtual IP address. The following list shows the configuration of the load balancer.

  • Frontend configuration
    • IP address 10.0.0.4 for NW1
    • IP address 10.0.0.5 for NW2
  • Backend configuration
    • Connected to primary network interfaces of all virtual machines that should be part of the NFS cluster
  • Probe ports
    • Port 61000 for NW1
    • Port 61001 for NW2
  • Load balancing rules
    • 2049 TCP for NW1
    • 2049 UDP for NW1
    • 2049 TCP for NW2
    • 2049 UDP for NW2
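
If you prefer the Azure CLI over the portal, the frontend IP addresses, health probes, and load-balancing rules listed above can be sketched as follows. This is a minimal sketch only; the resource group, load balancer, virtual network, and backend pool names are examples, and the load balancer and backend pools are assumed to exist already.

    # Example only: resource group, load balancer, VNet, and pool names are placeholders
    # Frontend IP for NW1 (repeat with 10.0.0.5 and nw2-frontend for NW2)
    az network lb frontend-ip create --resource-group prod-nfs-rg --lb-name prod-nfs-lb \
      --name nw1-frontend --vnet-name prod-nfs-vnet --subnet default --private-ip-address 10.0.0.4

    # Health probe for NW1 (repeat with port 61001 and nw2-hp for NW2)
    az network lb probe create --resource-group prod-nfs-rg --lb-name prod-nfs-lb \
      --name nw1-hp --protocol tcp --port 61000 --interval 5 --threshold 2

    # Load-balancing rule for NFS over TCP for NW1 (repeat for UDP and for NW2)
    az network lb rule create --resource-group prod-nfs-rg --lb-name prod-nfs-lb \
      --name nw1-lb-2049-tcp --protocol Tcp --frontend-port 2049 --backend-port 2049 \
      --frontend-ip-name nw1-frontend --backend-pool-name nw1-backend --probe-name nw1-hp \
      --floating-ip true --idle-timeout 30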

Set up a highly available NFS server

You can either use an Azure Template from GitHub to deploy all required Azure resources, including the virtual machines, availability set, and load balancer, or you can deploy the resources manually.

Deploy Linux via Azure Template

The Azure Marketplace contains an image for SUSE Linux Enterprise Server for SAP Applications 12 that you can use to deploy new virtual machines. You can use one of the quickstart templates on GitHub to deploy all required resources. The template deploys the virtual machines, the load balancer, the availability set, and so on. Follow these steps to deploy the template:

  1. Open the SAP file server template in the Azure portal
  2. Enter the following parameters
    1. Resource Prefix
      Enter the prefix you want to use. The value is used as a prefix for the resources that are deployed.
    2. SAP System Count
      Enter the number of SAP systems that will use this file server. The template deploys the required number of frontend configurations, load balancing rules, probe ports, disks, and so on.
    3. Os Type
      Select one of the Linux distributions. For this example, select SLES 12
    4. Admin Username and Admin Password
      A new user is created that can be used to log on to the machine.
    5. Subnet ID
      If you want to deploy the VM into an existing VNet that has a subnet the VM should be assigned to, provide the ID of that specific subnet. The ID usually looks like /subscriptions/<subscription ID>/resourceGroups/<resource group name>/providers/Microsoft.Network/virtualNetworks/<virtual network name>/subnets/<subnet name>
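
If you prefer to script the deployment instead of using the portal, it could look like the following sketch. The template URI and the parameter names (resourcePrefix, sapSystemCount, osType, adminUsername, adminPassword, subnetId) are assumptions that mirror the portal parameters above; check the template on GitHub for the exact names and allowed values.

    # Sketch only: <template-uri>, parameter names, and values are placeholders/assumptions
    az group create --name prod-nfs-rg --location westeurope
    az deployment group create --resource-group prod-nfs-rg \
      --template-uri <template-uri> \
      --parameters resourcePrefix=prod sapSystemCount=2 osType="SLES 12" \
                   adminUsername=azureuser adminPassword=<password> subnetId=<subnet ID>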

Deploy Linux manually via Azure portal

You first need to create the virtual machines for this NFS cluster. Afterwards, you create a load balancer and use the virtual machines in the backend pools.

  1. Create a Resource Group
  2. Create a Virtual Network
  3. Create an Availability Set
    Set max update domain
  4. Create Virtual Machine 1. Use at least SLES4SAP 12 SP3; in this example, the SLES4SAP 12 SP3 BYOS image SLES For SAP Applications 12 SP3 (BYOS) is used
    Select the Availability Set created earlier
  5. Create Virtual Machine 2. Use at least SLES4SAP 12 SP3; in this example, the SLES4SAP 12 SP3 BYOS image SLES For SAP Applications 12 SP3 (BYOS) is used
    Select the Availability Set created earlier
  6. Add one data disk for each SAP system to both virtual machines.
  7. Create a Load Balancer (internal)
    1. Create the frontend IP addresses
      1. IP address 10.0.0.4 for NW1
        1. Open the load balancer, select frontend IP pool, and click Add
        2. Enter the name of the new frontend IP pool (for example nw1-frontend)
        3. Set the Assignment to Static and enter the IP address (for example 10.0.0.4)
        4. Click OK
      2. IP address 10.0.0.5 for NW2
        • Repeat the steps above for NW2
    2. Create the backend pools
      1. Connected to primary network interfaces of all virtual machines that should be part of the NFS cluster for NW1
        1. Open the load balancer, select backend pools, and click Add
        2. Enter the name of the new backend pool (for example nw1-backend)
        3. Click Add a virtual machine
        4. Select the Availability Set you created earlier
        5. Select the virtual machines of the NFS cluster
        6. Click OK
      2. Connected to primary network interfaces of all virtual machines that should be part of the NFS cluster for NW2
        • Repeat the steps above to create a backend pool for NW2
    3. Create the health probes
      1. Port 61000 for NW1
        1. Open the load balancer, select health probes, and click Add
        2. Enter the name of the new health probe (for example nw1-hp)
        3. Select TCP as protocol, port 61000, keep Interval 5 and Unhealthy threshold 2
        4. Click OK
      2. Port 61001 for NW2
        • Repeat the steps above to create a health probe for NW2
    4. Load balancing rules
      1. 2049 TCP for NW1
        1. Open the load balancer, select load balancing rules and click Add
        2. Enter the name of the new load balancer rule (for example nw1-lb-2049)
        3. Select the frontend IP address, backend pool, and health probe you created earlier (for example nw1-frontend, nw1-backend, and nw1-hp)
        4. Keep protocol TCP, enter port 2049
        5. Increase idle timeout to 30 minutes
        6. Make sure to enable Floating IP
        7. Click OK
      2. 2049 UDP for NW1
        • Repeat the steps above for port 2049 and UDP for NW1
      3. 2049 TCP for NW2
        • Repeat the steps above for port 2049 and TCP for NW2
      4. 2049 UDP for NW2
        • Repeat the steps above for port 2049 and UDP for NW2
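
As an alternative to portal steps 1 through 6, the resource group, availability set, virtual machines, and data disks can also be created with the Azure CLI. The following is a minimal sketch; the names, region, disk sizes, and the image URN are examples (verify the URN with az vm image list), and the load balancer and its rules can be created with the corresponding az network lb commands.

    # Sketch only: names, region, sizes, and the image URN are examples
    az group create --name prod-nfs-rg --location westeurope
    az vm availability-set create --resource-group prod-nfs-rg --name prod-nfs-avset \
      --platform-update-domain-count 20 --platform-fault-domain-count 2

    for vm in prod-nfs-0 prod-nfs-1; do
      az vm create --resource-group prod-nfs-rg --name $vm \
        --image SUSE:SLES-SAP-BYOS:12-SP3:latest \
        --availability-set prod-nfs-avset \
        --vnet-name prod-nfs-vnet --subnet default \
        --admin-username azureuser --generate-ssh-keys \
        --data-disk-sizes-gb 128 128   # one data disk per SAP system (NW1 and NW2)
    done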

Important

Do not enable TCP timestamps on Azure VMs placed behind Azure Load Balancer. Enabling TCP timestamps will cause the health probes to fail. Set parameter net.ipv4.tcp_timestamps to 0. For details see Load Balancer health probes.
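
For example, the parameter can be applied at runtime and persisted across reboots as follows (the drop-in file name is just an example):

    # Disable TCP timestamps now and make the setting persistent
    sudo sysctl -w net.ipv4.tcp_timestamps=0
    sudo sh -c 'echo "net.ipv4.tcp_timestamps = 0" > /etc/sysctl.d/99-lb-probe.conf'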

Create Pacemaker cluster

Follow the steps in Setting up Pacemaker on SUSE Linux Enterprise Server in Azure to create a basic Pacemaker cluster for this NFS server.

Configure NFS server

The following items are prefixed with either [A] - applicable to all nodes, [1] - only applicable to node 1, or [2] - only applicable to node 2.

  1. [A] Set up host name resolution

    You can either use a DNS server or modify the /etc/hosts on all nodes. This example shows how to use the /etc/hosts file. Replace the IP address and the hostname in the following commands

    sudo vi /etc/hosts
    

    Insert the following lines into /etc/hosts. Change the IP address and hostname to match your environment

    # IP address of the load balancer frontend configuration for NFS
    10.0.0.4 nw1-nfs
    10.0.0.5 nw2-nfs
    
  2. [A] Enable NFS server

    Create the root NFS export entry

    # This writes the line "/srv/nfs/ *(rw,no_root_squash,fsid=0)" to /etc/exports
    sudo sh -c 'echo /srv/nfs/ *\(rw,no_root_squash,fsid=0\)>/etc/exports'
    
    sudo mkdir /srv/nfs/
    
  3. [A] Install drbd components

    sudo zypper install drbd drbd-kmp-default drbd-utils
    
  4. [A] Create a partition for the drbd devices

    List all available data disks

    sudo ls /dev/disk/azure/scsi1/
    

    Example output

    lun0  lun1
    

    Create partitions for every data disk

    # Each command creates one partition spanning the whole disk, accepting the fdisk defaults
    sudo sh -c 'echo -e "n\n\n\n\n\nw\n" | fdisk /dev/disk/azure/scsi1/lun0'
    sudo sh -c 'echo -e "n\n\n\n\n\nw\n" | fdisk /dev/disk/azure/scsi1/lun1'
    
  5. [A] Create LVM configurations

    List all available partitions

    ls /dev/disk/azure/scsi1/lun*-part*
    

    Example output

    /dev/disk/azure/scsi1/lun0-part1  /dev/disk/azure/scsi1/lun1-part1
    

    Create LVM volumes for every partition

    sudo pvcreate /dev/disk/azure/scsi1/lun0-part1  
    sudo vgcreate vg-NW1-NFS /dev/disk/azure/scsi1/lun0-part1
    sudo lvcreate -l 100%FREE -n NW1 vg-NW1-NFS
    
    sudo pvcreate /dev/disk/azure/scsi1/lun1-part1
    sudo vgcreate vg-NW2-NFS /dev/disk/azure/scsi1/lun1-part1
    sudo lvcreate -l 100%FREE -n NW2 vg-NW2-NFS
    
  6. [A] Configure drbd

    sudo vi /etc/drbd.conf
    

    Make sure that the drbd.conf file contains the following two lines

    include "drbd.d/global_common.conf";
    include "drbd.d/*.res";
    

    Change the global drbd configuration

    sudo vi /etc/drbd.d/global_common.conf
    

    Add the following entries to the handlers, startup, disk, and net sections.

    global {
         usage-count no;
    }
    common {
         handlers {
              fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
              after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
              split-brain "/usr/lib/drbd/notify-split-brain.sh root";
              pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
         }
         startup {
              wfc-timeout 0;
         }
         options {
         }
         disk {
              md-flushes yes;
              disk-flushes yes;
              c-plan-ahead 1;
              c-min-rate 100M;
              c-fill-target 20M;
              c-max-rate 4G;
         }
         net {
              after-sb-0pri discard-younger-primary;
              after-sb-1pri discard-secondary;
              after-sb-2pri call-pri-lost-after-sb;
              protocol     C;
              tcp-cork yes;
              max-buffers 20000;
              max-epoch-size 20000;
              sndbuf-size 0;
              rcvbuf-size 0;
         }
    }
    
  7. [A] Create the NFS drbd devices

    sudo vi /etc/drbd.d/NW1-nfs.res
    

    Insert the configuration for the new drbd device and exit

    resource NW1-nfs {
         protocol     C;
         disk {
              on-io-error       detach;
         }
         on prod-nfs-0 {
              address   10.0.0.6:7790;
              device    /dev/drbd0;
              disk      /dev/vg-NW1-NFS/NW1;
              meta-disk internal;
         }
         on prod-nfs-1 {
              address   10.0.0.7:7790;
              device    /dev/drbd0;
              disk      /dev/vg-NW1-NFS/NW1;
              meta-disk internal;
         }
    }
    
    sudo vi /etc/drbd.d/NW2-nfs.res
    

    Insert the configuration for the new drbd device and exit

    resource NW2-nfs {
         protocol     C;
         disk {
              on-io-error       detach;
         }
         on prod-nfs-0 {
              address   10.0.0.6:7791;
              device    /dev/drbd1;
              disk      /dev/vg-NW2-NFS/NW2;
              meta-disk internal;
         }
         on prod-nfs-1 {
              address   10.0.0.7:7791;
              device    /dev/drbd1;
              disk      /dev/vg-NW2-NFS/NW2;
              meta-disk internal;
         }
    }
    

    Create the drbd devices and start them

    sudo drbdadm create-md NW1-nfs
    sudo drbdadm create-md NW2-nfs
    sudo drbdadm up NW1-nfs
    sudo drbdadm up NW2-nfs
    
  8. [1] Skip initial synchronization

    sudo drbdadm new-current-uuid --clear-bitmap NW1-nfs
    sudo drbdadm new-current-uuid --clear-bitmap NW2-nfs
    
  9. [1] Set the primary node

    sudo drbdadm primary --force NW1-nfs
    sudo drbdadm primary --force NW2-nfs
    
  10. [1] Wait until the new drbd devices are synchronized

    sudo drbdsetup wait-sync-resource NW1-nfs
    sudo drbdsetup wait-sync-resource NW2-nfs
    
  11. [1] Create file systems on the drbd devices

    sudo mkfs.xfs /dev/drbd0
    sudo mkdir /srv/nfs/NW1
    sudo chattr +i /srv/nfs/NW1
    sudo mount -t xfs /dev/drbd0 /srv/nfs/NW1
    sudo mkdir /srv/nfs/NW1/sidsys
    sudo mkdir /srv/nfs/NW1/sapmntsid
    sudo mkdir /srv/nfs/NW1/trans
    sudo mkdir /srv/nfs/NW1/ASCS
    sudo mkdir /srv/nfs/NW1/ASCSERS
    sudo mkdir /srv/nfs/NW1/SCS
    sudo mkdir /srv/nfs/NW1/SCSERS
    sudo umount /srv/nfs/NW1
    
    sudo mkfs.xfs /dev/drbd1
    sudo mkdir /srv/nfs/NW2
    sudo chattr +i /srv/nfs/NW2
    sudo mount -t xfs /dev/drbd1 /srv/nfs/NW2
    sudo mkdir /srv/nfs/NW2/sidsys
    sudo mkdir /srv/nfs/NW2/sapmntsid
    sudo mkdir /srv/nfs/NW2/trans
    sudo mkdir /srv/nfs/NW2/ASCS
    sudo mkdir /srv/nfs/NW2/ASCSERS
    sudo mkdir /srv/nfs/NW2/SCS
    sudo mkdir /srv/nfs/NW2/SCSERS
    sudo umount /srv/nfs/NW2
    
  12. [A] Set up drbd split-brain detection

    When using drbd to synchronize data from one host to another, a so-called split brain can occur. A split brain is a scenario in which both cluster nodes promoted the drbd device to primary and the data went out of sync. It might be a rare situation, but you still want to handle and resolve a split brain as fast as possible. It is therefore important to be notified when a split brain has happened.

    Read the official drbd documentation on how to set up a split brain notification.

    It is also possible to automatically recover from a split brain scenario. For more information, read Automatic split brain recovery policies.
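
    If a split brain does occur despite the notification handler, a common manual recovery pattern is to discard the changes on one node and resynchronize from the peer. The following is a sketch only, shown for the NW1-nfs resource; make sure the node whose data you discard is really the one whose changes you want to throw away.

    # On the node whose changes should be discarded (the split-brain victim)
    sudo drbdadm secondary NW1-nfs
    sudo drbdadm connect --discard-my-data NW1-nfs

    # On the node that keeps its data (the survivor), reconnect if necessary
    sudo drbdadm connect NW1-nfs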

Configure Cluster Framework

  1. [1] Add the NFS drbd devices for SAP system NW1 to the cluster configuration

    sudo crm configure rsc_defaults resource-stickiness="200"
    
    # Enable maintenance mode
    sudo crm configure property maintenance-mode=true
    
    sudo crm configure primitive drbd_NW1_nfs \
      ocf:linbit:drbd \
      params drbd_resource="NW1-nfs" \
      op monitor interval="15" role="Master" \
      op monitor interval="30" role="Slave"
    
    sudo crm configure ms ms-drbd_NW1_nfs drbd_NW1_nfs \
      meta master-max="1" master-node-max="1" clone-max="2" \
      clone-node-max="1" notify="true" interleave="true"
    
    sudo crm configure primitive fs_NW1_sapmnt \
      ocf:heartbeat:Filesystem \
      params device=/dev/drbd0 \
      directory=/srv/nfs/NW1  \
      fstype=xfs \
      op monitor interval="10s"
    
    sudo crm configure primitive nfsserver systemd:nfs-server \
      op monitor interval="30s"
    sudo crm configure clone cl-nfsserver nfsserver
    
    sudo crm configure primitive exportfs_NW1 \
      ocf:heartbeat:exportfs \
      params directory="/srv/nfs/NW1" \
      options="rw,no_root_squash,crossmnt" clientspec="*" fsid=1 wait_for_leasetime_on_stop=true op monitor interval="30s"
    
    sudo crm configure primitive vip_NW1_nfs \
      IPaddr2 \
      params ip=10.0.0.4 cidr_netmask=24 op monitor interval=10 timeout=20
    
    sudo crm configure primitive nc_NW1_nfs \
      anything \
      params binfile="/usr/bin/nc" cmdline_options="-l -k 61000" op monitor timeout=20s interval=10 depth=0
    
    sudo crm configure group g-NW1_nfs \
      fs_NW1_sapmnt exportfs_NW1 nc_NW1_nfs vip_NW1_nfs
    
    sudo crm configure order o-NW1_drbd_before_nfs inf: \
      ms-drbd_NW1_nfs:promote g-NW1_nfs:start
    
    sudo crm configure colocation col-NW1_nfs_on_drbd inf: \
      g-NW1_nfs ms-drbd_NW1_nfs:Master
    
  2. [1] Add the NFS drbd devices for SAP system NW2 to the cluster configuration

    # Enable maintenance mode
    sudo crm configure property maintenance-mode=true
    
    sudo crm configure primitive drbd_NW2_nfs \
      ocf:linbit:drbd \
      params drbd_resource="NW2-nfs" \
      op monitor interval="15" role="Master" \
      op monitor interval="30" role="Slave"
    
    sudo crm configure ms ms-drbd_NW2_nfs drbd_NW2_nfs \
      meta master-max="1" master-node-max="1" clone-max="2" \
      clone-node-max="1" notify="true" interleave="true"
    
    sudo crm configure primitive fs_NW2_sapmnt \
      ocf:heartbeat:Filesystem \
      params device=/dev/drbd1 \
      directory=/srv/nfs/NW2  \
      fstype=xfs \
      op monitor interval="10s"
    
    sudo crm configure primitive exportfs_NW2 \
      ocf:heartbeat:exportfs \
      params directory="/srv/nfs/NW2" \
      options="rw,no_root_squash" clientspec="*" fsid=2 wait_for_leasetime_on_stop=true op monitor interval="30s"
    
    sudo crm configure primitive vip_NW2_nfs \
      IPaddr2 \
      params ip=10.0.0.5 cidr_netmask=24 op monitor interval=10 timeout=20
    
    sudo crm configure primitive nc_NW2_nfs \
      anything \
      params binfile="/usr/bin/nc" cmdline_options="-l -k 61001" op monitor timeout=20s interval=10 depth=0
    
    sudo crm configure group g-NW2_nfs \
      fs_NW2_sapmnt exportfs_NW2 nc_NW2_nfs vip_NW2_nfs
    
    sudo crm configure order o-NW2_drbd_before_nfs inf: \
      ms-drbd_NW2_nfs:promote g-NW2_nfs:start
    
    sudo crm configure colocation col-NW2_nfs_on_drbd inf: \
      g-NW2_nfs ms-drbd_NW2_nfs:Master
    
  3. [1] Disable maintenance mode

    sudo crm configure property maintenance-mode=false
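
    # Sketch of a quick check: after maintenance mode is disabled, all resources
    # (drbd masters, file systems, exports, virtual IPs, and netcat listeners) should start
    sudo crm status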
    

Next steps