Connect an on-premises network to Azure using a VPN gateway

This reference architecture shows how to extend an on-premises network to Azure, using a site-to-site virtual private network (VPN). Traffic flows between the on-premises network and an Azure Virtual Network (VNet) through an IPSec VPN tunnel.

Download a Visio file of this architecture.

Architecture

The architecture consists of the following components.

  • On-premises network. A private local-area network running within an organization.

  • VPN appliance. A device or service that provides external connectivity to the on-premises network. The VPN appliance may be a hardware device, or it can be a software solution such as the Routing and Remote Access Service (RRAS) in Windows Server 2012. For a list of supported VPN appliances and information on configuring them to connect to an Azure VPN gateway, see the instructions for the selected device in the article About VPN devices for Site-to-Site VPN Gateway connections.

  • Virtual network (VNet). The cloud application and the components for the Azure VPN gateway reside in the same VNet.

  • Azure VPN gateway. The VPN gateway service enables you to connect the VNet to the on-premises network through a VPN appliance. For more information, see Connect an on-premises network to a Microsoft Azure virtual network. The VPN gateway includes the following elements:

    • Virtual network gateway. A resource that provides a virtual VPN appliance for the VNet. It is responsible for routing traffic from the on-premises network to the VNet.
    • Local network gateway. An abstraction of the on-premises VPN appliance. Network traffic from the cloud application to the on-premises network is routed through this gateway.
    • Connection. The connection has properties that specify the connection type (IPSec) and the key shared with the on-premises VPN appliance to encrypt traffic.
    • Gateway subnet. The virtual network gateway is held in its own subnet, which is subject to various requirements, described in the Recommendations section below.
  • Cloud application. The application hosted in Azure. It might include multiple tiers, with multiple subnets connected through Azure load balancers. For more information about the application infrastructure, see Running Windows VM workloads and Running Linux VM workloads.

  • Internal load balancer. Network traffic from the VPN gateway is routed to the cloud application through an internal load balancer. The load balancer is located in the front-end subnet of the application.

Recommendations

The following recommendations apply for most scenarios. Follow these recommendations unless you have a specific requirement that overrides them.

VNet and gateway subnet

Create an Azure VNet with an address space large enough for all of your required resources. Ensure that the VNet address space has sufficient room for growth if additional VMs are likely to be needed in the future. The address space of the VNet must not overlap with the on-premises network. For example, the diagram above uses the address space 10.20.0.0/16 for the VNet.
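
The non-overlap requirement can be checked before any resources are created. The following is a minimal sketch using Python's standard `ipaddress` module; the function name and the on-premises ranges are hypothetical and only illustrate the check.

```python
import ipaddress

def overlapping_ranges(vnet_cidr, on_prem_cidrs):
    """Return the on-premises ranges that overlap the proposed VNet range."""
    vnet = ipaddress.ip_network(vnet_cidr)
    return [c for c in on_prem_cidrs if vnet.overlaps(ipaddress.ip_network(c))]

# Hypothetical on-premises address ranges, for illustration only.
on_prem = ["192.168.0.0/16", "10.10.0.0/16"]

print(overlapping_ranges("10.20.0.0/16", on_prem))    # []
print(overlapping_ranges("10.10.128.0/17", on_prem))  # ['10.10.0.0/16']
```

An empty list means the proposed VNet address space does not collide with any of the listed on-premises ranges.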

Create a subnet named GatewaySubnet, with an address range of /27. This subnet is required by the virtual network gateway. Allocating 32 addresses to this subnet will help to prevent reaching gateway size limitations in the future. Also, avoid placing this subnet in the middle of the address space. A good practice is to set the address space for the gateway subnet at the upper end of the VNet address space. The example shown in the diagram uses 10.20.255.224/27. Here is a quick procedure to calculate the CIDR:

  1. Set the variable bits in the address space of the VNet to 1, up to the bits being used by the gateway subnet, then set the remaining bits to 0.
  2. Convert the resulting bits to decimal and express it as an address space with the prefix length set to the size of the gateway subnet.

For example, for a VNet with an IP address range of 10.20.0.0/16, applying step 1 above gives 10.20.0b11111111.0b11100000. Converting that to decimal and expressing it as an address space yields 10.20.255.224/27.
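
The two-step procedure above can also be sketched in code. This is an illustrative helper, not part of any Azure tooling: the function name `gateway_subnet` is an assumption, and the sketch handles IPv4 ranges only.

```python
import ipaddress

def gateway_subnet(vnet_cidr, prefix_len=27):
    """Return the highest /prefix_len subnet inside an IPv4 VNet range."""
    vnet = ipaddress.IPv4Network(vnet_cidr)
    if prefix_len < vnet.prefixlen:
        raise ValueError("gateway subnet cannot be larger than the VNet")
    # Step 1: set the bits between the VNet prefix and the subnet prefix
    # to 1, and leave the remaining (host) bits at 0.
    variable_bits = prefix_len - vnet.prefixlen
    mask = ((1 << variable_bits) - 1) << (32 - prefix_len)
    base = int(vnet.network_address) | mask
    # Step 2: convert to dotted decimal and attach the subnet prefix length.
    return ipaddress.IPv4Network(f"{ipaddress.IPv4Address(base)}/{prefix_len}")

print(gateway_subnet("10.20.0.0/16"))  # 10.20.255.224/27
```

For the 10.20.0.0/16 example, the eleven bits between /16 and /27 are set to 1, which reproduces the 10.20.255.224/27 range shown in the text.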

Warning

Do not deploy any VMs to the gateway subnet. Also, do not assign an NSG to this subnet, as it will cause the gateway to stop functioning.

Virtual network gateway

Allocate a public IP address for the virtual network gateway.

Create the virtual network gateway in the gateway subnet and assign it the newly allocated public IP address. Use the gateway type that most closely matches your requirements and that is enabled by your VPN appliance:

  • Create a policy-based gateway if you need to closely control how requests are routed based on policy criteria such as address prefixes. Policy-based gateways use static routing, and only work with site-to-site connections.

  • Create a route-based gateway if you connect to the on-premises network using RRAS, support multi-site or cross-region connections, or implement VNet-to-VNet connections (including routes that traverse multiple VNets). Route-based gateways use dynamic routing to direct traffic between networks. They can tolerate failures in the network path better than static routes because they can try alternative routes. Route-based gateways can also reduce the management overhead because routes might not need to be updated manually when network addresses change.

For a list of supported VPN appliances, see About VPN devices for Site-to-Site VPN Gateway connections.

Note

After the gateway has been created, you cannot change between gateway types without deleting and re-creating the gateway.

Select the Azure VPN gateway SKU that most closely matches your throughput requirements. Azure VPN gateway is available in the three SKUs shown in the following table.

| SKU | VPN Throughput | Max IPSec Tunnels |
|---|---|---|
| Basic | 100 Mbps | 10 |
| Standard | 100 Mbps | 10 |
| High Performance | 200 Mbps | 30 |

Note

The Basic SKU is not compatible with Azure ExpressRoute. You can change the SKU after the gateway has been created.
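
As a rough aid for choosing a SKU, the limits in the table can be encoded and filtered against your requirements. This is only a sketch: the figures are copied from the table above and may not reflect current service limits, and the function name and parameters are hypothetical.

```python
# (name, throughput in Mbps, max IPSec tunnels), copied from the table above.
SKUS = [
    ("Basic", 100, 10),
    ("Standard", 100, 10),
    ("High Performance", 200, 30),
]

def candidate_skus(required_mbps, required_tunnels, needs_expressroute=False):
    """Return the SKUs whose listed limits cover the stated requirements."""
    result = []
    for name, mbps, tunnels in SKUS:
        # The Basic SKU is not compatible with ExpressRoute.
        if needs_expressroute and name == "Basic":
            continue
        if mbps >= required_mbps and tunnels >= required_tunnels:
            result.append(name)
    return result

print(candidate_skus(80, 5, needs_expressroute=True))  # ['Standard', 'High Performance']
print(candidate_skus(150, 12))                         # ['High Performance']
```

An empty result indicates that no single gateway in the table meets the requirement, in which case the partitioning approach described under Scalability considerations may apply.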

You are charged based on the amount of time that the gateway is provisioned and available. See VPN Gateway Pricing.

Create routing rules for the gateway subnet that direct incoming application traffic from the gateway to the internal load balancer, rather than allowing requests to pass directly to the application VMs.

On-premises network connection

Create a local network gateway. Specify the public IP address of the on-premises VPN appliance, and the address space of the on-premises network. Note that the on-premises VPN appliance must have a public IP address that can be accessed by the local network gateway in Azure VPN Gateway. The VPN device cannot be located behind a network address translation (NAT) device.

Create a site-to-site connection for the virtual network gateway and the local network gateway. Select the site-to-site (IPSec) connection type, and specify the shared key. Site-to-site encryption with the Azure VPN gateway is based on the IPSec protocol, using preshared keys for authentication. You specify the key when you create the connection. You must configure the VPN appliance running on-premises with the same key. Other authentication mechanisms are not currently supported.

Ensure that the on-premises routing infrastructure is configured to forward requests intended for addresses in the Azure VNet to the VPN device.

Open any ports required by the cloud application in the on-premises network.

Test the connection to verify that:

  • The on-premises VPN appliance correctly routes traffic to the cloud application through the Azure VPN gateway.
  • The VNet correctly routes traffic back to the on-premises network.
  • Prohibited traffic in both directions is blocked correctly.
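
The checks above can be scripted from an on-premises machine. The following sketch uses only Python's standard library to attempt a TCP connection and report the connect time; the function name is hypothetical, and it tests reachability of a single port only, not routing policy as a whole.

```python
import socket
import time

def tcp_connect_ms(host, port, timeout=3.0):
    """Return the TCP connect time in milliseconds, or None on failure."""
    start = time.perf_counter()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None
```

For example, `tcp_connect_ms("10.20.0.5", 80)` should return a number when a permitted web server in the VNet is reachable, while a port that is meant to be blocked should return `None`.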

Scalability considerations

You can achieve limited vertical scalability by moving from the Basic or Standard VPN Gateway SKUs to the High Performance VPN SKU.

For VNets that expect a large volume of VPN traffic, consider distributing the different workloads into separate smaller VNets and configuring a VPN gateway for each of them.

You can partition the VNet either horizontally or vertically. To partition horizontally, move some VM instances from each tier into subnets of the new VNet. The result is that each VNet has the same structure and functionality. To partition vertically, redesign each tier to divide the functionality into different logical areas (such as handling orders, invoicing, customer account management, and so on). Each functional area can then be placed in its own VNet.

Replicating an on-premises Active Directory domain controller in the VNet, and implementing DNS in the VNet, can help to reduce some of the security-related and administrative traffic flowing from on-premises to the cloud. For more information, see Extending Active Directory Domain Services (AD DS) to Azure.

Availability considerations

If you need to ensure that the on-premises network remains available to the Azure VPN gateway, implement a failover cluster for the on-premises VPN gateway.

If your organization has multiple on-premises sites, create multi-site connections to one or more Azure VNets. This approach requires dynamic (route-based) routing, so make sure that the on-premises VPN gateway supports this feature.

For details about service level agreements, see SLA for VPN Gateway.

Manageability considerations

Monitor diagnostic information from on-premises VPN appliances. This process depends on the features provided by the VPN appliance. For example, if you are using the Routing and Remote Access Service on Windows Server 2012, you can use RRAS logging.

Use Azure VPN gateway diagnostics to capture information about connectivity issues. These logs can be used to track information such as the source and destinations of connection requests, which protocol was used, and how the connection was established (or why the attempt failed).

Monitor the operational logs of the Azure VPN gateway using the audit logs available in the Azure portal. Separate logs are available for the local network gateway, the Azure network gateway, and the connection. This information can be used to track any changes made to the gateway, and can be useful if a previously functioning gateway stops working for some reason.

Monitor connectivity, and track connectivity failure events. You can use a monitoring package such as Nagios to capture and report this information.

Security considerations

Generate a different shared key for each VPN gateway. Use a strong shared key to help resist brute-force attacks.
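
One way to produce a strong, unique key per gateway is to draw it from a cryptographically secure random source. The sketch below uses Python's `secrets` module. The alphanumeric alphabet and the 16 to 128 character bounds are assumptions chosen for broad appliance compatibility, so check the limits of your gateway and VPN appliance before relying on them.

```python
import secrets
import string

def generate_shared_key(length=64):
    """Return a random alphanumeric pre-shared key of the given length."""
    # Length bounds are an assumption, not a documented gateway limit.
    if not 16 <= length <= 128:
        raise ValueError("length should be between 16 and 128 characters")
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

key = generate_shared_key()
print(len(key))  # 64
```

Call the function once per VPN gateway so that no two gateways share a key, and store the result in whatever secret store your organization already uses.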

Note

Currently, you cannot use Azure Key Vault to preshare keys for the Azure VPN gateway.

Ensure that the on-premises VPN appliance uses an encryption method that is compatible with the Azure VPN gateway. For policy-based routing, the Azure VPN gateway supports the AES256, AES128, and 3DES encryption algorithms. Route-based gateways support AES256 and 3DES.

If your on-premises VPN appliance is on a perimeter network (DMZ) that has a firewall between the perimeter network and the Internet, you might have to configure additional firewall rules to allow the site-to-site VPN connection.

If the application in the VNet sends data to the Internet, consider implementing forced tunneling to route all Internet-bound traffic through the on-premises network. This approach enables you to audit outgoing requests made by the application from the on-premises infrastructure.

Note

Forced tunneling can impact connectivity to Azure services (the Storage Service, for example) and the Windows license manager.

Troubleshooting

For general information on troubleshooting common VPN-related errors, see Troubleshooting common VPN related errors.

The following recommendations are useful for determining if your on-premises VPN appliance is functioning correctly.

  • Check any log files generated by the VPN appliance for errors or failures.

    This will help you determine if the VPN appliance is functioning correctly. The location of this information will vary according to your appliance. For example, if you are using RRAS on Windows Server 2012, you can use the following PowerShell command to display error event information for the RRAS service:

    Get-EventLog -LogName System -EntryType Error -Source RemoteAccess | Format-List -Property *
    

    The Message property of each entry provides a description of the error. Some common examples are:

      - Inability to connect, possibly due to an incorrect IP address specified for the Azure VPN gateway in the RRAS VPN network interface configuration.
    
      ```
      EventID            : 20111
      MachineName        : on-prem-vm
      Data               : {41, 3, 0, 0}
      Index              : 14231
      Category           : (0)
      CategoryNumber     : 0
      EntryType          : Error
      Message            : RoutingDomainID- {00000000-0000-0000-0000-000000000000}: A demand dial connection to the remote
                           interface AzureGateway on port VPN2-4 was successfully initiated but failed to complete
                           successfully because of the  following error: The network connection between your computer and
                           the VPN server could not be established because the remote server is not responding. This could
                           be because one of the network devices (for example, firewalls, NAT, routers, and so on) between your computer
                           and the remote server is not configured to allow VPN connections. Please contact your
                           Administrator or your service provider to determine which device may be causing the problem.
      Source             : RemoteAccess
      ReplacementStrings : {{00000000-0000-0000-0000-000000000000}, AzureGateway, VPN2-4, The network connection between
                           your computer and the VPN server could not be established because the remote server is not
                           responding. This could be because one of the network devices (for example, firewalls, NAT, routers, and so on)
                           between your computer and the remote server is not configured to allow VPN connections. Please
                           contact your Administrator or your service provider to determine which device may be causing the
                           problem.}
      InstanceId         : 20111
      TimeGenerated      : 3/18/2016 1:26:02 PM
      TimeWritten        : 3/18/2016 1:26:02 PM
      UserName           :
      Site               :
      Container          :
      ```
    
      - The wrong shared key being specified in the RRAS VPN network interface configuration.
    
      ```
      EventID            : 20111
      MachineName        : on-prem-vm
      Data               : {233, 53, 0, 0}
      Index              : 14245
      Category           : (0)
      CategoryNumber     : 0
      EntryType          : Error
      Message            : RoutingDomainID- {00000000-0000-0000-0000-000000000000}: A demand dial connection to the remote
                           interface AzureGateway on port VPN2-4 was successfully initiated but failed to complete
                           successfully because of the  following error: Internet key exchange (IKE) authentication credentials are unacceptable.
    
      Source             : RemoteAccess
      ReplacementStrings : {{00000000-0000-0000-0000-000000000000}, AzureGateway, VPN2-4, IKE authentication credentials are
                           unacceptable.
                           }
      InstanceId         : 20111
      TimeGenerated      : 3/18/2016 1:34:22 PM
      TimeWritten        : 3/18/2016 1:34:22 PM
      UserName           :
      Site               :
      Container          :
      ```
    

    You can also obtain event log information about attempts to connect through the RRAS service using the following PowerShell command:

    Get-EventLog -LogName Application -Source RasClient | Format-List -Property *
    

    In the event of a failure to connect, this log will contain errors that look similar to the following:

    EventID            : 20227
    MachineName        : on-prem-vm
    Data               : {}
    Index              : 4203
    Category           : (0)
    CategoryNumber     : 0
    EntryType          : Error
    Message            : CoId={B4000371-A67F-452F-AA4C-3125AA9CFC78}: The user SYSTEM dialed a connection named
                         AzureGateway that has failed. The error code returned on failure is 809.
    Source             : RasClient
    ReplacementStrings : {{B4000371-A67F-452F-AA4C-3125AA9CFC78}, SYSTEM, AzureGateway, 809}
    InstanceId         : 20227
    TimeGenerated      : 3/18/2016 1:29:21 PM
    TimeWritten        : 3/18/2016 1:29:21 PM
    UserName           :
    Site               :
    Container          :
    
  • Verify connectivity and routing across the VPN gateway.

    The VPN appliance may not be correctly routing traffic through the Azure VPN Gateway. Use a tool such as PsPing to verify connectivity and routing across the VPN gateway. For example, to test connectivity from an on-premises machine to a web server located on the VNet, run the following command (replacing <<web-server-address>> with the address of the web server):

    PsPing -t <<web-server-address>>:80
    

    If the on-premises machine can route traffic to the web server, you should see output similar to the following:

    D:\PSTools>psping -t 10.20.0.5:80
    
    PsPing v2.01 - PsPing - ping, latency, bandwidth measurement utility
    Copyright (C) 2012-2014 Mark Russinovich
    Sysinternals - www.sysinternals.com
    
    TCP connect to 10.20.0.5:80:
    Infinite iterations (warmup 1) connecting test:
    Connecting to 10.20.0.5:80 (warmup): 6.21ms
    Connecting to 10.20.0.5:80: 3.79ms
    Connecting to 10.20.0.5:80: 3.44ms
    Connecting to 10.20.0.5:80: 4.81ms
    
      Sent = 3, Received = 3, Lost = 0 (0% loss),
      Minimum = 3.44ms, Maximum = 4.81ms, Average = 4.01ms
    

    If the on-premises machine cannot communicate with the specified destination, you will see messages like this:

    D:\PSTools>psping -t 10.20.1.6:80
    
    PsPing v2.01 - PsPing - ping, latency, bandwidth measurement utility
    Copyright (C) 2012-2014 Mark Russinovich
    Sysinternals - www.sysinternals.com
    
    TCP connect to 10.20.1.6:80:
    Infinite iterations (warmup 1) connecting test:
    Connecting to 10.20.1.6:80 (warmup): This operation returned because the timeout period expired.
    Connecting to 10.20.1.6:80: This operation returned because the timeout period expired.
    Connecting to 10.20.1.6:80: This operation returned because the timeout period expired.
    Connecting to 10.20.1.6:80: This operation returned because the timeout period expired.
    Connecting to 10.20.1.6:80:
      Sent = 3, Received = 0, Lost = 3 (100% loss),
      Minimum = 0.00ms, Maximum = 0.00ms, Average = 0.00ms
    
  • Verify that the on-premises firewall allows VPN traffic to pass and that the correct ports are opened.

  • Verify that the on-premises VPN appliance uses an encryption method that is compatible with the Azure VPN gateway. For policy-based routing, the Azure VPN gateway supports the AES256, AES128, and 3DES encryption algorithms. Route-based gateways support AES256 and 3DES.

The following recommendations are useful for determining if there is a problem with the Azure VPN gateway:

  • Examine Azure VPN gateway diagnostic logs for potential issues.

  • Verify that the Azure VPN gateway and on-premises VPN appliance are configured with the same shared authentication key.

    You can view the shared key stored by the Azure VPN gateway using the following Azure CLI command:

    azure network vpn-connection shared-key show <<resource-group>> <<vpn-connection-name>>
    

    Use the command appropriate for your on-premises VPN appliance to show the shared key configured for that appliance.

  • Verify that the GatewaySubnet subnet holding the Azure VPN gateway is not associated with an NSG.

    You can view the subnet details using the following Azure CLI command:

    azure network vnet subnet show -g <<resource-group>> -e <<vnet-name>> -n GatewaySubnet
    

    Ensure there is no data field named Network Security Group id. The following example shows the results for an instance of the GatewaySubnet that has an assigned NSG (VPN-Gateway-Group). This can prevent the gateway from working correctly if there are any rules defined for this NSG.

    C:\>azure network vnet subnet show -g profx-prod-rg -e profx-vnet -n GatewaySubnet
        info:    Executing command network vnet subnet show
        + Looking up virtual network "profx-vnet"
        + Looking up the subnet "GatewaySubnet"
        data:    Id                              : /subscriptions/########-####-####-####-############/resourceGroups/profx-prod-rg/providers/Microsoft.Network/virtualNetworks/profx-vnet/subnets/GatewaySubnet
        data:    Name                            : GatewaySubnet
        data:    Provisioning state              : Succeeded
        data:    Address prefix                  : 10.20.3.0/27
        data:    Network Security Group id       : /subscriptions/########-####-####-####-############/resourceGroups/profx-prod-rg/providers/Microsoft.Network/networkSecurityGroups/VPN-Gateway-Group
        info:    network vnet subnet show command OK
    
  • Verify that the virtual machines in the Azure VNet are configured to permit traffic coming in from outside the VNet.

    Check any NSG rules associated with subnets containing these virtual machines. You can view all NSG rules using the following Azure CLI command:

    azure network nsg show -g <<resource-group>> -n <<nsg-name>>
    
  • Verify that the Azure VPN gateway is connected.

    You can use the following Azure PowerShell command to check the current status of the Azure VPN connection. The <<connection-name>> parameter is the name of the Azure VPN connection that links the virtual network gateway and the local gateway.

    Get-AzureRmVirtualNetworkGatewayConnection -Name <<connection-name>> -ResourceGroupName <<resource-group>>
    

    The following snippets highlight the output generated if the gateway is connected (the first example), and disconnected (the second example):

    PS C:\> Get-AzureRmVirtualNetworkGatewayConnection -Name profx-gateway-connection -ResourceGroupName profx-prod-rg
    
    AuthorizationKey           :
    VirtualNetworkGateway1     : Microsoft.Azure.Commands.Network.Models.PSVirtualNetworkGateway
    VirtualNetworkGateway2     :
    LocalNetworkGateway2       : Microsoft.Azure.Commands.Network.Models.PSLocalNetworkGateway
    Peer                       :
    ConnectionType             : IPsec
    RoutingWeight              : 0
    SharedKey                  : ####################################
    ConnectionStatus           : Connected
    EgressBytesTransferred     : 55254803
    IngressBytesTransferred    : 32227221
    ProvisioningState          : Succeeded
    ...
    
    PS C:\> Get-AzureRmVirtualNetworkGatewayConnection -Name profx-gateway-connection2 -ResourceGroupName profx-prod-rg
    
    AuthorizationKey           :
    VirtualNetworkGateway1     : Microsoft.Azure.Commands.Network.Models.PSVirtualNetworkGateway
    VirtualNetworkGateway2     :
    LocalNetworkGateway2       : Microsoft.Azure.Commands.Network.Models.PSLocalNetworkGateway
    Peer                       :
    ConnectionType             : IPsec
    RoutingWeight              : 0
    SharedKey                  : ####################################
    ConnectionStatus           : NotConnected
    EgressBytesTransferred     : 0
    IngressBytesTransferred    : 0
    ProvisioningState          : Succeeded
    ...
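
If the connection output shown above is captured as text (for example, redirected to a file by a scheduled job), the status field can be extracted for monitoring. This is a minimal sketch that assumes the `Name : Value` layout shown in the examples; the function name is hypothetical.

```python
def parse_format_list(text):
    """Parse 'Name : Value' lines into a dict; other lines are ignored."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if sep and name.strip():
            fields[name.strip()] = value.strip()
    return fields

sample = """
ConnectionType             : IPsec
ConnectionStatus           : NotConnected
EgressBytesTransferred     : 0
"""
print(parse_format_list(sample).get("ConnectionStatus"))  # NotConnected
```

A monitoring script could raise an alert whenever the extracted value is anything other than `Connected`.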
    

The following recommendations are useful for determining if there is an issue with host VM configuration, network bandwidth utilization, or application performance:

  • Verify that the firewall in the guest operating system running on the Azure VMs in the subnet is configured correctly to allow permitted traffic from the on-premises IP ranges.

  • Verify that the volume of traffic is not close to the limit of the bandwidth available to the Azure VPN gateway.

    How to verify this depends on the VPN appliance running on-premises. For example, if you are using RRAS on Windows Server 2012, you can use Performance Monitor to track the volume of data being received and transmitted over the VPN connection. Using the RAS Total object, select the Bytes Received/Sec and Bytes Transmitted/Sec counters.

    Compare the results with the bandwidth available to the VPN gateway (100 Mbps for the Basic and Standard SKUs, and 200 Mbps for the High Performance SKU).

  • Verify that you have deployed the right number and size of VMs for your application load.

    Determine if any of the virtual machines in the Azure VNet are running slowly. If so, they may be overloaded, there may be too few to handle the load, or the load-balancers may not be configured correctly. To determine this, capture and analyze diagnostic information. You can examine the results using the Azure portal, but many third-party tools are also available that can provide detailed insights into the performance data.

  • Verify that the application is making efficient use of cloud resources.

    Instrument application code running on each VM to determine whether applications are making the best use of resources. You can use tools such as Application Insights.
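
For the bandwidth check described in this list, the Performance Monitor counters report bytes per second while the gateway limits are quoted in megabits per second, so a conversion is needed before comparing. The following sketch shows that arithmetic; the SKU limits are the figures quoted in this article, and the function name is hypothetical.

```python
# Throughput limits in Mbps, as quoted in this article.
SKU_LIMIT_MBPS = {"Basic": 100, "Standard": 100, "High Performance": 200}

def utilization(bytes_per_sec, sku):
    """Return a counter reading as a fraction of the SKU throughput limit."""
    mbps = bytes_per_sec * 8 / 1_000_000  # bytes/sec -> megabits/sec
    return mbps / SKU_LIMIT_MBPS[sku]

# 10,000,000 bytes/sec is 80 Mbps, which is 80% of a Standard gateway's limit.
print(utilization(10_000_000, "Standard"))  # 0.8
```

Apply the function to the Bytes Received/Sec and Bytes Transmitted/Sec readings separately; a value that stays close to 1.0 suggests the gateway is near its limit.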

Deploy the solution

Prerequisites. You must have an existing on-premises infrastructure already configured with a suitable network appliance.

To deploy the solution, perform the following steps.

  1. Click the button below:
  2. Wait for the link to open in the Azure portal, then follow these steps:
    • The Resource group name is already defined in the parameter file, so select Create New and enter ra-hybrid-vpn-rg in the text box.
    • Select the region from the Location drop-down box.
    • Do not edit the Template Root Uri or the Parameter Root Uri text boxes.
    • Review the terms and conditions, then click the I agree to the terms and conditions stated above checkbox.
    • Click the Purchase button.
  3. Wait for the deployment to complete.