Troubleshoot Always On Availability Groups Configuration (SQL Server)

Applies to: yesSQL Server (all supported versions)

This topic provides information to help you troubleshoot typical problems with configuring server instances for Always On availability groups. Typical configuration problems include Always On availability groups is disabled, accounts are incorrectly configured, the database mirroring endpoint does not exist, the endpoint is inaccessible (SQL Server Error 1418), network access does not exist, and a join database command fails (SQL Server Error 35250).

Note

Ensure that you are meeting the Always On availability groups prerequisites. For more information, see Prerequisites, Restrictions, and Recommendations for Always On Availability Groups (SQL Server).

In This Topic:

Section Description
Always On Availability Groups Is Not Enabled If an instance of SQL Server is not enabled for Always On availability groups, the instance does not support availability group creation and cannot host any availability replicas.
Accounts Discusses requirements for correctly configuring the accounts under which SQL Server is running.
Endpoints Discusses how to diagnose issues with the database mirroring endpoint of a server instance.
Network access Documents the requirement that each server instance that is hosting an availability replica must be able to access the port of each of the other server instances over TCP.
Endpoint Access (SQL Server Error 1418) Contains information about this SQL Server error message.
Join Database Fails (SQL Server Error 35250) Discusses the possible causes and resolution of a failure to join secondary databases to an availability group because the connection to the primary replica is not active.
Read-Only Routing is Not Working Correctly
Related Tasks Contains a list of task-oriented topics in SQL Server Books Online that are particularly relevant to troubleshooting an availability group configuration.
Related Content Contains a list of relevant resources that are external to SQL Server Books Online.

Always On Availability Groups Is Not Enabled

The Always On availability groups feature must be enabled on each of the instances of SQL Server. For more information, see Enable and Disable Always On Availability Groups (SQL Server).

Accounts

The accounts under which SQL Server is running must be correctly configured.

  1. Do the accounts have the correct permissions?

    1. If the partners run under the same domain account, the correct user logins exist automatically in both master databases. This simplifies the security configuration and is recommended.

    2. If two server instances run under different accounts, the each account must be created in master on the remote server instance, and that login must be granted CONNECT permissions to connect to the database mirroring endpoint of that server instance. For more information, seeSet Up Login Accounts for Database Mirroring or Always On Availability Groups (SQL Server). You can use the following query on each instance to check if the logins have CONNECT permissions:

    SELECT 
      perm.class_desc,
      prin.name,
      perm.permission_name,
      perm.state_desc,
      prin.type_desc as PrincipalType,
      prin.is_disabled
    FROM sys.server_permissions perm
      LEFT JOIN sys.server_principals prin ON perm.grantee_principal_id = prin.principal_id
      LEFT JOIN sys.tcp_endpoints tep ON perm.major_id = tep.endpoint_id
    WHERE 
      perm.class_desc = 'ENDPOINT'
      AND perm.permission_name = 'CONNECT'
      AND tep.type = 4    
    
  2. If SQL Server is running under a built-in account, such as Local System, Local Service, or Network Service, or a nondomain account, you must use certificates for endpoint authentication. If your service accounts are using domain accounts in the same domain, you can choose to grant CONNECT access for each service account on all the replica locations or you can use certificates. For more information, seeUse Certificates for a Database Mirroring Endpoint (Transact-SQL).

Endpoints

Endpoints must be correctly configured.

  1. Make sure that each instance of SQL Server that is going to host an availability replica (each replica location) has a database mirroring endpoint. To determine whether a database mirroring endpoint exists on a given server instance, use the sys.database_mirroring_endpoints catalog view:

    SELECT name, state_desc FROM sys.database_mirroring_endpoints  
    

    For more information on creating endpoints, see either Create a Database Mirroring Endpoint for Windows Authentication (Transact-SQL) or Allow a Database Mirroring Endpoint to Use Certificates for Outbound Connections (Transact-SQL).

  2. Check that the port numbers are correct.

    To identify the port currently associated with database mirroring endpoint of a server instance, use the following Transact-SQL statement:

    SELECT type_desc, port FROM sys.tcp_endpoints;  
    GO  
    
  3. For Always On availability groups setup issues that are difficult to explain, we recommend that you inspect each server instance to determine whether it is listening on the correct ports.

  4. Make sure that the endpoints are started (STATE=STARTED). On each server instance, use the following Transact-SQL statement:

    SELECT state_desc FROM sys.database_mirroring_endpoints  
    

    For more information about the state_desc column, see sys.database_mirroring_endpoints (Transact-SQL).

    To start an endpoint, use the following Transact-SQL statement:

    ALTER ENDPOINT Endpoint_Mirroring   
    STATE = STARTED   
    AS TCP (LISTENER_PORT = <port_number>)  
    FOR database_mirroring (ROLE = ALL);  
    GO  
    

    For more information, see ALTER ENDPOINT (Transact-SQL).

  5. Make sure that the login from the other server has CONNECT permission. To determine who has CONNECT permission for an endpoint, on each server instance use the following Transact-SQL statement:

    SELECT 'Metadata Check';  
    SELECT EP.name, SP.STATE,   
       CONVERT(nvarchar(38), suser_name(SP.grantor_principal_id))   
          AS GRANTOR,   
       SP.TYPE AS PERMISSION,  
       CONVERT(nvarchar(46),suser_name(SP.grantee_principal_id))   
          AS GRANTEE   
       FROM sys.server_permissions SP , sys.endpoints EP  
       WHERE SP.major_id = EP.endpoint_id  
       ORDER BY Permission,grantor, grantee;   
    
  6. Ensure correct server name is used in the endpoint URL

    For server name in an endpoint URL, it is recommended to use fully qualified domain name (FQDN), although you can use any name that uniquely identifies the machine. The server address can be a Netbios name (if the systems are in the same domain), a fully qualified domain name (FQDN), or an IP address (preferably, a static IP address). Using the fully qualified domain name is the recommended option.

    If you have already defined an Endpoint URL, you can query it by using:

    select endpoint_url from sys.availability_replicas
    

    Next, compare the endpoint_url output to the server name (Netbios or FQDN). To query the Netbios and FQDN, run the following in a Command Prompt on the replica locally:

    hostname & echo %COMPUTERNAME%.%USERDNSDOMAIN%
    

    For query the server name of a remote computer, run this from a Command Prompt. Then compare the endpoint_url

    ping -a servername_from_endpoint_url
    

    For more information, see Specify the Endpoint URL When Adding or Modifying an Availability Replica (SQL Server).

Network Access

Each server instance that is hosting an availability replica must be able to access the port of each of the other server instance over TCP. This is especially important if the server instances are in different domains that do not trust each other (untrusted domains). Check if you can connect to the endpoints by following these steps:

  • Use Telnet to validate connectivity. Here are examples of commands you can use:

    telnet ServerName Port
    telnet IP_Address Port
    
    
  • If the Endpoint is listening and connection is successful, then you will see a blank screen.¬† If not, you will receive a connection error from Telnet

  • If Telnet connection to the IP address works but to the ServerName it does not, there is likely a DNS or name resolution issue

  • If connection works by ServerName and not by IP address, then there could be more than one endpoint defined on that server (another SQL instance perhaps) that is listening on that port. Though the status of the endpoint on the instance in question shows "STARTED" another instance may actually have the port binding and prevent the correct instance from listening and establishing TCP connections.

  • If Telnet fails to connect, look for Firewall and/or Anti-virus software that may be blocking the endpoint port in question. Check the firewall setting to see if it allows the endpoint port communication between the server instances that host primary replica and the secondary replica (port 5022 by default). Run the following PowerShell script to examine for disabled inbound traffic rules

  • If Telnet fails to connect, look for Firewall and/or antivirus software that may be blocking the endpoint port in question. If you are running SQL Server on Azure VM, additionally you would need to ensure Network Security Group (NSG) allows the traffic to endpoint port. Check the firewall (and NSG, for Azure VM) setting to see if it allows the endpoint port communication between the server instances that host primary replica and the secondary replica (port 5022 by default)

    Get-NetFirewallRule -Action Block -Enabled True -Direction Inbound |Format-Table
    
    
  • Capture a NETSTAT -a output and verify the status is a LISTENING or ESTABLISHED on the IP:Port for the endpoint specified

    netstat -a
    
    
    

Endpoint Access (SQL Server Error 1418)

This SQL Server message indicates that the server network address specified in the endpoint URL cannot be reached or does not exist, and it suggests that you verify the network address name and reissue the command.

Join Database Fails (SQL Server Error 35250)

This section discusses the possible causes and resolution of a failure to join secondary databases to the availability group because the connection to the primary replica is not active. This is the full error message:

Msg 35250 The connection to the primary replica is not active. The command cannot be processed.

Resolution:

Summary of steps is outlined below.

For detailed step-by-step instructions, please refer to Engine error MSSQLSERVER_35250

  1. Ensure the endpoint is created and started.
  2. Check if you can connect to the endpoint via Telnet and ensure no firewall rules are blocking connectivity
  3. Check for errors in the system. You can query the sys.dm_hadr_availability_replica_states for the last_connect_error_number that may help you diagnose the join issue.
  4. Ensure the endpoint is defined so it correctly matches the IP/port that AG is using.
  5. Check whether the network service account has CONNECT permission to the endpoint.
  6. Check for possible name resolution issues
  7. Ensure your SQL Server is running a recent build (preferably the latest build to protect from running into fixed issues.

Read-Only Routing is Not Working Correctly

  1. Ensure that you have set up read-only routing by following Configure read-only routing document.

  2. Ensure Client Driver Support

    The client application must use a client providers that support ApplicationIntent parameter. See Driver and client connectivity support for availability groups

    Note

    If you are connecting to a distributed network name (DNN) Listener, the provider must also support MultiSubnetFailover parameter

  3. Ensure connection string properties are set correctly

    For read-only routing to work properly, your client application must use these properties in the connection string:

    • A database name that belongs to the AG
    • An availability group listener name
      • If you are using DNN, you must specify DNN listener name and DNN port number <DNN name,DNN port>
    • ApplicationIntent set to ReadOnly
    • MultiSubnetFailover set to true is required for Distributed network name (DNN)

    Examples

    This illustrates the connection string for .NET System.Data.SqlClient provider for a virtual network name (VNN) listener:

    Server=tcp:VNN_AgListener,1433;Database=AgDb1;ApplicationIntent=ReadOnly;MultiSubnetFailover=True
    

    This illustrates the connection string for .NET System.Data.SqlClient provider for a distributed network name (DNN) listener:

    Server=tcp:DNN_AgListener,DNN_Port;Database=AgDb1;ApplicationIntent=ReadOnly;MultiSubnetFailover=True
    

    Note

    If you are using command line programs like SQLCMD, ensure that you specify the correct switches for server name. For instance, in SQLCMD you must use the upper case -S switch that specifies server name, not the lower case -s switch which is used for column separator.
    Example: sqlcmd -S AG_Listener,port -E -d AgDb1 -K ReadOnly -M

  4. Ensure that the availability group listener is online. To ensure that the availability group listener is online run the following query on the primary replica:

    SELECT * FROM sys.dm_tcp_listener_states;
    

    If you find the listener is offline you can attempt to bring it online using a command like this:

    ALTER AVAILABILITY GROUP myAG RESTART LISTENER 'AG_Listener';
    
  5. Ensure READ_ONLY_ROUTING_LIST is correctly populated. On Primary replica, ensure that the READ_ONLY_ROUTING_LIST contains only server instances that are hosting readable secondary replicas.

    To view the properties of each replica you can run this query and examine the connectivity endpoint (URL) of the read only replica.

    SELECT replica_id, replica_server_name, secondary_role_allow_connections_desc, read_only_routing_url 
    FROM sys.availability_replicas;   
    

    To view a read-only routing list and compare to the endpoint URL:

    SELECT * FROM sys.availability_read_only_routing_lists;
    

    To change a read-only routing list you can use a query like this:

    ALTER AVAILABILITY GROUP [AG1]   
    MODIFY REPLICA ON  
    N'COMPUTER02' WITH   
    (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=('COMPUTER01','COMPUTER02')));  
    

    For more information see Configure read-only routing for an availability group - SQL Server Always On

  6. Check that READ_ONLY_ROUTING_URL port is open. Ensure that the Windows firewall is not blocking the READ_ONLY_ROUTING_URL port. Configure a Windows Firewall for database engine access on every replica in the read_only_routing_list and any for clients that will be connecting to those replicas.

    Note

    If you are running SQL Server on Azure VM, you must take additional configuration steps. Ensure that the network security group (NSG) of each replica VM allows traffic to the endpoint port and the DNN port, if you are using DNN listener. If you are using VNN listener, you must ensure the load balancer is configured correctly.

  7. Ensure that the READ_ONLY_ROUTING_URL (TCP://system-address:port) contains the correct fully-qualified domain name (FQDN) and port number. See:

  8. Ensure proper SQL Server Networking configuration in the SQL Server Configuration Manager.

    Verify on every replica in the read_only_routing_list that:

    • SQL Server remote connectivity is enabled
    • TCP/IP is enabled
    • The IP addresses are configured correctly

    Note

    You can quickly verify all of these are properly configured if you can connect from a remote machine to a target secondary replica's SQL Server instance name using TCP:SQL_Instance syntax.

See: Configure a Server to Listen on a Specific TCP Port (SQL Server Configuration Manager) and View or Change Server Properties (SQL Server)

Related Tasks

Related Content

See Also

Transport Security for Database Mirroring and Always On Availability Groups (SQL Server)
Client Network Configuration
Prerequisites, Restrictions, and Recommendations for Always On Availability Groups (SQL Server)