Deploy SQL Server Big Data Clusters in AD mode on Azure Kubernetes Services (AKS)

SQL Server Big Data Clusters support Active Directory (AD) deployment mode for Identity and Access Management (IAM). IAM for Azure Kubernetes Service (AKS) has been challenging because industry-standard protocols such as OAuth 2.0 and OpenID Connect, which are widely supported by the Microsoft identity platform, are not supported by SQL Server.

This article explains how to deploy a big data cluster in AD mode on Azure Kubernetes Service (AKS).

Important

The Microsoft SQL Server 2019 Big Data Clusters add-on will be retired. Support for SQL Server 2019 Big Data Clusters will end on February 28, 2025. All existing users of SQL Server 2019 with Software Assurance will be fully supported on the platform and the software will continue to be maintained through SQL Server cumulative updates until that time. For more information, see the announcement blog post and Big data options on the Microsoft SQL Server platform.

Architecture topologies

Active Directory Domain Services (AD DS) runs on an Azure virtual machine (VM) in the same way it runs in many on-premises instances. After promoting the new domain controllers in Azure, set them as the primary and secondary DNS servers for the virtual network, and demote any on-premises DNS servers to tertiary or later. AD authentication enables domain-joined clients on Linux to authenticate to SQL Server using their domain credentials and the Kerberos protocol.
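The DNS ordering described above can be sketched as a small helper. This is only an illustration, not part of any Azure tooling: the function name and the IP addresses are hypothetical, and the resulting list would still have to be applied to the virtual network's DNS server settings by whatever deployment tooling you use.

```python
def order_dns_servers(azure_dc_ips, on_prem_dns_ips):
    """Return the DNS server list for the virtual network.

    The domain controllers promoted in Azure come first (primary and
    secondary); on-premises DNS servers follow as tertiary or later.
    """
    ordered = []
    for ip in list(azure_dc_ips) + list(on_prem_dns_ips):
        if ip not in ordered:  # keep the first occurrence, drop duplicates
            ordered.append(ip)
    return ordered


# Hypothetical addresses: two domain controllers in Azure and one
# on-premises DNS server.
dns_servers = order_dns_servers(["10.0.1.4", "10.0.1.5"], ["192.168.0.10"])
print(dns_servers)
```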

There are a few ways to deploy a big data cluster in AD mode in AKS. This article introduces two methods that are easier to implement and that integrate with existing enterprise-grade architectures:

  • Extend your on-premises Active Directory domain to Azure. This method enables your Active Directory environment to provide distributed authentication services using Active Directory Domain Services (AD DS) on Azure. You replicate your on-premises AD DS to Azure to reduce the latency caused by sending authentication requests from the cloud back to on-premises AD DS. A typical use case for this solution is an application that is hosted partly on-premises and partly in Azure, so that authentication requests need to travel back and forth.

    See how to deploy this solution step by step in this reference architecture.

  • Extend the Active Directory Domain Services (AD DS) resource forest to Azure. In this architecture, you create a new domain in Azure that is trusted by your on-premises AD forest. This architecture shows a one-way trust from the domain in Azure to the on-premises forest.

    The trust allows on-premises users to access resources in the domain in Azure. See how to deploy this solution step by step in this reference architecture.

The reference architectures described above let you create a landing zone, either with all resources deployed from scratch or as an addition to an existing architecture. In addition to those reference architectures, deploy the big data cluster in an AKS cluster on a separate subnet that resides in your target VNet or in a peered VNet.
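When you plan the separate subnet for the AKS cluster, it can help to check the address ranges before deploying anything. The following sketch uses only the Python standard library. The function name, subnet names, and address prefixes are hypothetical, and the check assumes IPv4 prefixes throughout.

```python
import ipaddress


def validate_aks_subnet(vnet_prefixes, existing_subnets, aks_subnet):
    """Return a list of problems with the proposed AKS subnet (empty if none).

    vnet_prefixes:    address prefixes of the target (or peered) VNet
    existing_subnets: mapping of subnet name -> prefix already in use
    aks_subnet:       prefix proposed for the AKS cluster
    """
    candidate = ipaddress.ip_network(aks_subnet)
    problems = []

    # The AKS subnet must sit inside one of the VNet address prefixes.
    vnets = [ipaddress.ip_network(p) for p in vnet_prefixes]
    if not any(candidate.subnet_of(v) for v in vnets):
        problems.append(f"{candidate} is outside the VNet address space")

    # It must not overlap a subnet that is already used, for example the
    # subnet that hosts the AD DS domain controllers.
    for name, prefix in existing_subnets.items():
        if candidate.overlaps(ipaddress.ip_network(prefix)):
            problems.append(f"{candidate} overlaps subnet '{name}' ({prefix})")

    return problems


# Hypothetical layout: one VNet with an AD DS subnet and a gateway subnet.
existing = {"adds": "10.0.1.0/24", "gateway": "10.0.255.0/27"}
print(validate_aks_subnet(["10.0.0.0/16"], existing, "10.0.2.0/24"))
print(validate_aks_subnet(["10.0.0.0/16"], existing, "10.0.1.128/25"))
```

The first call reports no problems. The second reports an overlap with the hypothetical AD DS subnet.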

The following image represents a typical architecture:

AKS cluster with AD and SQL Server Big Data Cluster

Recommendations

The following recommendations apply to most big data cluster deployments in AD mode on AKS. Each component lists the available options to guide integration with an enterprise-grade architecture.

Networking recommendations

A few key components can be used to connect your on-premises environment to Azure:

  • Azure VPN Gateway: A VPN gateway is a specific type of virtual network gateway that is used to send encrypted traffic between an Azure virtual network and an on-premises location over the public internet. You use both an Azure virtual network gateway and a local network gateway. For information about how to configure them, see What is VPN Gateway.
  • Azure ExpressRoute: ExpressRoute connections do not go over the public internet, and offer higher security, reliability, and speeds with lower latencies than typical connections over the internet. The choice of your connectivity option will affect the latency, performance, and SLA level of your solution depending on the SKUs. For specific information, see About ExpressRoute virtual network gateways.

Most customers use a jump box or Azure Bastion to access other Azure infrastructure. Azure Private Link enables you to securely access Azure PaaS services, including AKS in this scenario, as well as other Azure-hosted services, over a private endpoint in your virtual network. Traffic between your virtual network and the service travels over the Microsoft backbone network, which eliminates exposure to the public internet. You can also create your own private link service in your virtual network and deliver it privately to your customers.

Active Directory and Azure recommendation

On-premises AD DS stores information about user accounts, and enables other authorized users on the same network to access this information by authenticating identities associated with users, computers, applications, or other resources that are included in a security boundary. In most hybrid scenarios, user authentication runs over a VPN Gateway or ExpressRoute connection to the on-premises AD DS environment.

For a big data cluster deployment in AD mode, the solution that integrates on-premises Active Directory with Azure must meet the following prerequisites:

  • An AD account that has specific permissions to create users, groups, and machine accounts inside the provided organizational unit (OU) in your on-premises Active Directory.
  • A DNS server to resolve internal DNS names. The DNS server must contain both A (forward lookup) and PTR (reverse lookup) records for names in this domain. Specify the domain DNS settings in the big data cluster deployment profile.
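One way to check the DNS prerequisite before deployment is to confirm that a host name resolves forward (A record) and that the returned address resolves back to the same name (PTR record). The following is a minimal sketch using the Python standard library. The function name and the host name in the usage note are hypothetical. The lookups go through the system resolver of the machine that runs the script, so run it from a host that uses the domain DNS server, and note that a local hosts file can mask missing records.

```python
import socket


def check_dns_records(hostname,
                      forward=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Check forward (A) and reverse (PTR) resolution for hostname.

    Returns (ok, message). The forward and reverse callables default to the
    system resolver and can be replaced, for example in tests.
    """
    try:
        ip = forward(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"no A record for {hostname}: {exc}"

    try:
        ptr_name, _aliases, _addresses = reverse(ip)
    except OSError as exc:  # socket.herror is a subclass of OSError
        return False, f"no PTR record for {ip}: {exc}"

    # Compare names case-insensitively and ignore a trailing dot.
    if ptr_name.rstrip(".").lower() != hostname.rstrip(".").lower():
        return False, f"PTR for {ip} returns {ptr_name}, expected {hostname}"

    return True, f"{hostname} <-> {ip}"
```

For example, `check_dns_records("dc01.contoso.local")` (a made-up domain controller name) returns a tuple whose first element is `True` only when both lookups succeed and agree.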

Next steps

Tutorial: Deploy SQL Server Big Data Clusters in AD mode on Azure Kubernetes Services (AKS)