Cannot Connect to SQL Availability Group Using ApplicationIntent = ReadOnly Get Network Error From Outside AWS

Robert Lubbers 21 Reputation points
2021-02-02T23:07:35.333+00:00

I have what I thought was a very simple Availability Group set up on two EC2 instances in AWS. There are two instances, one running on Windows Server 2016, and another running in Ubuntu 18.04. I created a cluster-less, domain-less Availability Group to allow for read-only queries to be routed to the secondary (Linux) server. Because it has no cluster (type=NONE), it really doesn't have a Listener, although I did create something like a dummy listener using the Private IP address of the primary (Windows) server. Then I set up Read-only routing on both servers.

Rather than DNS, I am depending on Hosts files on all the servers. The Listener and the primary server are set up with the same IP address in the Hosts files. I have a dummy domain as well, .DOMAIN, in the hosts files. It doesn't seem to matter

When I connect to the Availability group from a machine inside my AWS Security Group, I don't have any trouble attaching to the appropriate Availability Group database. The SQLCMD works as I supposed it would:

SQLCMD -S AG_LISTENER,1433 -U sa -P ********* -d AvailabilityDatabaseA -K ReadOnly -Q "SELECT @@SERVERNAME" returns 'LINUX-SQLHOST'
SQLCMD -S AG_LISTENER,1433 -U sa -P ********* -d AvailabilityDatabaseA -Q "SELECT @@SERVERNAME" returns 'WINDOWS-SQLHOST'
SQLCMD -S WINDOWS_SQLHOST,1433 -U sa -P ********* -d AvailabilityDatabaseA -K ReadOnly -Q "SELECT @@SERVERNAME" returns 'WINDOWS-SQLHOST'

Yet, when I attempt to attach using a Public IP (either the DNS AWS gave me or an Elastic IP) I get some strange behavior:

SQLCMD -S PUBLIC_IP_OF_WINDOWS_HOST,1433 -U sa -P ********* -d AvailabilityDatabaseA -Q "SELECT @@SERVERNAME" returns 'WINDOWS-SQLHOST'
SQLCMD -S PUBLIC_IP_OF_WINDOWS_HOST,1433 -U sa -P ********* -d AvailabilityDatabaseA -K ReadOnly -Q "SELECT @@SERVERNAME" times out, then gives a 'network-related or instance-specific error establishing a connection to SQL Server. Server is not found or not accessible.'

And strangely,

SQLCMD -S PUBLIC_IP_OF_LINUX_HOST,1433 -U sa -P ********* -d AvailabilityDatabaseA -K ReadOnly -Q "SELECT @@SERVERNAME" returns 'LINUX-HOST'

and

SQLCMD -S PUBLIC_IP_OF_LINUX_HOST,1433 -U sa -P ********* -d AvailabilityDatabaseA -Q "SELECT @@SERVERNAME" (no Read-Only Intent) returns an error 'the target database ('AvailabilityDatabaseA') is in an availability group and is currently accessible for connections when the application intent is set to read only.'

Is there some DNS hoodoo I need to attend to? That's what it seems like. I am not connecting to the 'Listener'.

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,840 questions
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,559 questions
0 comments No comments
{count} votes

Accepted answer
  1. Sean Gallardy - MSFT 1,886 Reputation points Microsoft Employee
    2021-02-03T00:31:42.497+00:00

    Yet, when I attempt to attach using a Public IP (either the DNS AWS gave me or an Elastic IP) I get some strange behavior:

    Is there some DNS hoodoo I need to attend to? That's what it seems like. I am not connecting to the 'Listener'.

    Let me explain how read only routing works and then I think you'll understand since you've done a fairly good job investigating on your own.

    When you setup read only routing you have an endpoint url and a routing list. The routing list is for the most part, ordinary, it lists who to go to and when, depending on various situations. The endpoint url, however, is the real key here. This is setup to point to a TDS endpoint (which is the protocol SQL Server uses to communicate to clients and other servers).

    The TDS specification defines this behavior, so when read only routing is evaluated and the server decides that this connection is indeed good to be read only routing to a specific replica, it fills in the bits in the TDS packet to send back to the client saying that there will be an environment change, this is read only routing, and it's going to now go to the endpoint url that was previously configured.

    Here's where you run into a problem as you have a private address and a public address. The routing url is most likely configured with the private address, which is what is passed directly back to the client driver. Let's make up some IPv4 addresses (works with IPv6 too), let's say the internal AWS address is 192.168.1.20 and the public address is 20.153.200.5 - great. If the routing url was configured with 192.168.1.20, then this will be passed back to the client driver, the client driver will then disconnect from the current connection to the primary and create a new connection with the exact url it was given, in this case 192.168.1.20. This works when you're inside your private area as this can be resolved quite easily. However, when you're outside AWS, say at your home without VPN or other items, the client driver will still get the same value and still try to connect to the same address, 192.168.1.20 - it's at this point, that's not going to resolve anywhere helpful and you will get the error you've been receiving.

    There isn't a whole lot you can do... you can make your own host entries, aliases, etc., but if the internal servers can resolve the public address and you're going to use it from the outside more than internally, I'd make the routing url the public address.

    1 person found this answer helpful.
    0 comments No comments

2 additional answers

Sort by: Most helpful
  1. Abdulhakim M. Elrhumi 351 Reputation points
    2021-02-02T23:50:57.027+00:00

    Hi

    connection string :

    string strcnn = "Data Source=192.168.xxx.xxx,1433; Network Library=DBMSSOCN;Initial Catalog=DBName; User ID=sa;Password=*******";
    

    You must know IP Address:192.168.xxx.xxx
    Port No:1433
    User ID:sa
    Password:******
    Test connection With ODBC Data Source.

    Best Regards.
    Please remember to mark the replies as answers if they help.

    0 comments No comments

  2. Robert Lubbers 21 Reputation points
    2021-02-03T15:12:35.487+00:00

    Thank you, Sean

    So, in order to do read-only routing from outside the VCP, the easiest way to do it would be to give a separate static IP address to both replicas.

    That seems to me to be costly and wasteful if we grow this application to a non-trivial number of read-only replicas (say, six). Could it be done using PAT and 1 IP address? I think I can get the secondary replica SQL server talking to the 'listener' on a different port, or rather, that I could configure the endpoint to use a different port. Then I could route the traffic to the same address, but the 'router' would rewrite the URL to point to a different port.

    That would be a decent workaround for a non-trivial number of secondary replicas (all on separate ports) since going cap in hand to Amazon for the additional IP addresses could be both costly and wasteful. It seems to me like a flaw in Microsoft's design that it would return another URL to the client rather than accepting the responsibility to get the packets to the right SQL instance. Of course, if I had a "real" cluster with a real listener, or even if I sprang for a load balancer for those future instances, the discussion wouldn't matter.

    But, hey...