Create a Kubernetes cluster with Azure Kubernetes Service using Terraform

Article tested with the following Terraform and Terraform provider versions:

Terraform enables the definition, preview, and deployment of cloud infrastructure. Using Terraform, you create configuration files using HCL syntax. The HCL syntax allows you to specify the cloud provider - such as Azure - and the elements that make up your cloud infrastructure. After you create your configuration files, you create an execution plan that allows you to preview your infrastructure changes before they're deployed. Once you verify the changes, you apply the execution plan to deploy the infrastructure. For more information about using Terraform in Azure, see the Azure Terraform developer center.

Azure Kubernetes Service (AKS) manages your hosted Kubernetes environment. AKS allows you to deploy and manage containerized applications without container orchestration expertise. AKS also enables you to do many common maintenance operations without taking your app offline. These operations include provisioning, upgrading, and scaling resources on demand.

In this article, you learn how to:

  • Use HCL (HashiCorp Configuration Language) to define a Kubernetes cluster
  • Use Terraform and AKS to create a Kubernetes cluster
  • Use the kubectl tool to test the availability of a Kubernetes cluster

1. Configure your environment

  • Azure subscription: If you don't have an Azure subscription, create a free account before you begin.
  • Azure service principal: If you don't have a service principal, create a service principal. Make note of the appId, display_name, password, and tenant.

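  • Service principal object ID: Run the following command to get the object ID of the service principal. (In newer Azure CLI releases that use Microsoft Graph, the property is named id rather than objectId, so adjust the query if the column comes back empty.)

    az ad sp list --display-name "<display_name>" --query "[].{\"Object ID\":objectId}" --output table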

  • SSH key pair: Use one of the following articles:
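As an alternative to the articles, a key pair can be generated locally. The following is a minimal sketch, assuming OpenSSH's ssh-keygen is installed. The helper name ensure_ssh_key is hypothetical, and its default path is chosen to match the ssh_public_key default that appears later in variables.tf.

```shell
# ensure_ssh_key is a hypothetical helper, not part of any Azure or Terraform tooling.
# It creates an RSA key pair only when the public key file is missing.
ensure_ssh_key() {
  key_path="${1:-$HOME/.ssh/id_rsa}"
  if [ -f "${key_path}.pub" ]; then
    echo "Public key already exists: ${key_path}.pub"
    return 0
  fi
  mkdir -p "$(dirname "$key_path")"
  # -N "" sets an empty passphrase, which is an assumption suited to a demo only.
  ssh-keygen -t rsa -b 4096 -f "$key_path" -N "" -q
}
```

Calling ensure_ssh_key with no arguments targets ~/.ssh/id_rsa and leaves an existing key untouched.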

2. Configure Azure storage to store Terraform state

Terraform tracks state locally via the terraform.tfstate file. This pattern works well in a single-person environment. However, in a more practical multi-person environment, you need to track state on the server using Azure storage. In this section, you learn to retrieve the necessary storage account information and create a storage container. The Terraform state information is then stored in that container.

  1. Use one of the following options to create an Azure storage account:

  2. Browse to the Azure portal.

  3. Under Azure services, select Storage accounts. (If the Storage accounts option isn't visible on the main page, select More services to locate the option.)

  4. On the Storage accounts page, select the storage account where Terraform will store the state information.

  5. On the Storage account page, in the left menu, in the Security + networking section, select Access keys.

  6. On the Access keys page, select Show keys to display the key values.

  7. Locate the key1 key on the page and select the icon to its right to copy the key value to the clipboard.

  8. From a command line prompt, run az storage container create. This command creates a container in your Azure storage account. Replace the placeholders with the appropriate values for your Azure storage account.

    az storage container create -n tfstate \
       --account-name <storage_account_name> \
       --account-key <storage_account_key>
    
  9. When the command successfully completes, it displays a JSON block with a key of "created" and a value of true. You can also run az storage container list to verify the container was successfully created.

    az storage container list \
       --account-name <storage_account_name> \
       --account-key <storage_account_key>
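
The azurerm backend block used later in providers.tf contains no access key, so Terraform needs the key from somewhere else. One option, sketched here on the assumption that you're signed in with the Azure CLI, is to export it as ARM_ACCESS_KEY, the environment variable the azurerm backend reads. The helper name export_backend_access_key is hypothetical.

```shell
# export_backend_access_key is a hypothetical helper name.
# It looks up the first key of the storage account and exports it as
# ARM_ACCESS_KEY so the azurerm backend can authenticate to the state container.
export_backend_access_key() {
  resource_group="$1"
  account_name="$2"
  key="$(az storage account keys list \
    --resource-group "$resource_group" \
    --account-name "$account_name" \
    --query "[0].value" --output tsv)" || return 1
  if [ -z "$key" ]; then
    echo "No key returned for storage account $account_name" >&2
    return 1
  fi
  export ARM_ACCESS_KEY="$key"
}
```

For example, export_backend_access_key "<storage_account_resource_group>" "<storage_account_name>" sets the variable for the current shell session only, which keeps the key out of the configuration files.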
    

3. Implement the Terraform code

  1. Create a directory in which to test the sample Terraform code and make it the current directory.

  2. Create a file named providers.tf and insert the following code.

    terraform {
    
      required_version = ">=0.12"
    
      required_providers {
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "~>2.0"
        }
      }
      backend "azurerm" {
        resource_group_name  = "<storage_account_resource_group>"
        storage_account_name = "<storage_account_name>"
        container_name       = "tfstate"
        key                  = "codelab.microsoft.tfstate"
      }
    }
    
    provider "azurerm" {
      features {}
    }
    
  3. Create a file named main.tf and insert the following code:

    # Generate random resource group name
    resource "random_pet" "rg-name" {
      prefix    = var.resource_group_name_prefix
    }
    
    resource "azurerm_resource_group" "rg" {
      name      = random_pet.rg-name.id
      location  = var.resource_group_location
    }
    
    resource "random_id" "log_analytics_workspace_name_suffix" {
        byte_length = 8
    }
    
    resource "azurerm_log_analytics_workspace" "test" {
        # The workspace name has to be unique across the whole of Azure, not just the current subscription/tenant.
        name                = "${var.log_analytics_workspace_name}-${random_id.log_analytics_workspace_name_suffix.dec}"
        location            = var.log_analytics_workspace_location
        resource_group_name = azurerm_resource_group.rg.name
        sku                 = var.log_analytics_workspace_sku
    }
    
    resource "azurerm_log_analytics_solution" "test" {
        solution_name         = "ContainerInsights"
        location              = azurerm_log_analytics_workspace.test.location
        resource_group_name   = azurerm_resource_group.rg.name
        workspace_resource_id = azurerm_log_analytics_workspace.test.id
        workspace_name        = azurerm_log_analytics_workspace.test.name
    
        plan {
            publisher = "Microsoft"
            product   = "OMSGallery/ContainerInsights"
        }
    }
    
    resource "azurerm_kubernetes_cluster" "k8s" {
        name                = var.cluster_name
        # The resource group is declared above as azurerm_resource_group.rg.
        location            = azurerm_resource_group.rg.location
        resource_group_name = azurerm_resource_group.rg.name
        dns_prefix          = var.dns_prefix
    
        linux_profile {
            admin_username = "ubuntu"
    
            ssh_key {
                key_data = file(var.ssh_public_key)
            }
        }
    
        default_node_pool {
            name            = "agentpool"
            node_count      = var.agent_count
            vm_size         = "Standard_D2_v2"
        }
    
        service_principal {
            client_id     = var.aks_service_principal_app_id
            client_secret = var.aks_service_principal_client_secret
        }
    
        addon_profile {
            oms_agent {
                enabled                    = true
                log_analytics_workspace_id = azurerm_log_analytics_workspace.test.id
            }
        }
    
        network_profile {
            load_balancer_sku = "Standard"
            network_plugin    = "kubenet"
        }
    
        tags = {
            Environment = "Development"
        }
    }
    
  4. Create a file named variables.tf and insert the following code:

    variable "resource_group_name_prefix" {
      default       = "rg"
      description   = "Prefix of the resource group name that's combined with a random ID so name is unique in your Azure subscription."
    }
    
    variable "resource_group_location" {
      default       = "eastus"
      description   = "Location of the resource group."
    }
    
    variable "agent_count" {
        default = 3
    }
    
    variable "ssh_public_key" {
        default = "~/.ssh/id_rsa.pub"
    }
    
    variable "dns_prefix" {
        default = "k8stest"
    }
    
    variable "cluster_name" {
        default = "k8stest"
    }
    
    variable "resource_group_name" {
        default = "azure-k8stest"
    }
    
    variable "location" {
        default = "Central US"
    }
    
    variable "log_analytics_workspace_name" {
        default = "testLogAnalyticsWorkspaceName"
    }
    
    # Refer to https://azure.microsoft.com/global-infrastructure/services/?products=monitor for the regions where Log Analytics is available.
    variable "log_analytics_workspace_location" {
        default = "eastus"
    }
    
    # Refer to https://azure.microsoft.com/pricing/details/monitor/ for Log Analytics pricing.
    variable "log_analytics_workspace_sku" {
        default = "PerGB2018"
    }
    
    # Service principal values referenced by main.tf and set in terraform.tfvars.
    variable "aks_service_principal_app_id" {}
    
    variable "aks_service_principal_client_secret" {}
    
    variable "aks_service_principal_object_id" {}
    
  5. Create a file named output.tf and insert the following code.

    output "resource_group_name" {
      value = azurerm_resource_group.rg.name
    }
    
    # Credential outputs are marked sensitive so Terraform doesn't print them
    # in plan and apply output.
    output "client_key" {
        value     = azurerm_kubernetes_cluster.k8s.kube_config.0.client_key
        sensitive = true
    }
    
    output "client_certificate" {
        value     = azurerm_kubernetes_cluster.k8s.kube_config.0.client_certificate
        sensitive = true
    }
    
    output "cluster_ca_certificate" {
        value     = azurerm_kubernetes_cluster.k8s.kube_config.0.cluster_ca_certificate
        sensitive = true
    }
    
    output "cluster_username" {
        value = azurerm_kubernetes_cluster.k8s.kube_config.0.username
    }
    
    output "cluster_password" {
        value     = azurerm_kubernetes_cluster.k8s.kube_config.0.password
        sensitive = true
    }
    
    output "kube_config" {
        value = azurerm_kubernetes_cluster.k8s.kube_config_raw
        sensitive = true
    }
    
    output "host" {
        value = azurerm_kubernetes_cluster.k8s.kube_config.0.host
    }
    
  6. Create a file named terraform.tfvars and insert the following code.

    aks_service_principal_app_id = "<service_principal_app_id>"
    
    aks_service_principal_client_secret = "<service_principal_password>"
    
    aks_service_principal_object_id = "<service_principal_object_id>"
    

    Key points:

    • Set aks_service_principal_app_id to the service principal appId value.
    • Set aks_service_principal_client_secret to the service principal password value.
    • Set aks_service_principal_object_id to the service principal object ID. (The Azure CLI command for obtaining this value is in the Configure your environment section.)
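
If you'd rather not keep the service principal password in a file, Terraform also reads input variables from environment variables named TF_VAR_<variable_name>. The following is a minimal sketch of that approach; export_sp_variables is a hypothetical helper name. Values in terraform.tfvars take precedence over environment variables, so this only takes effect if the corresponding entries are removed from that file.

```shell
# export_sp_variables is a hypothetical helper name.
# It exports the three service principal values as TF_VAR_* environment
# variables, which Terraform maps to the input variables of the same name.
export_sp_variables() {
  if [ "$#" -ne 3 ]; then
    echo "usage: export_sp_variables <app_id> <password> <object_id>" >&2
    return 1
  fi
  export TF_VAR_aks_service_principal_app_id="$1"
  export TF_VAR_aks_service_principal_client_secret="$2"
  export TF_VAR_aks_service_principal_object_id="$3"
}
```

Arguments typed on the command line can end up in shell history, so in practice you might read the password from a prompt or a secret store before calling the function.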

4. Initialize Terraform

Run terraform init to initialize the Terraform deployment. This command downloads the Azure provider required to manage your Azure resources and configures the azurerm backend defined in providers.tf.

terraform init
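
The backend settings can also be supplied at initialization time with the -backend-config option of terraform init, which overrides the matching values in the backend block. The sketch below assumes you want to pass the resource group and storage account names this way; init_with_backend is a hypothetical helper name.

```shell
# init_with_backend is a hypothetical helper name.
# It passes the state storage location to terraform init on the command line,
# overriding the placeholder values in the backend "azurerm" block.
init_with_backend() {
  if [ "$#" -ne 2 ]; then
    echo "usage: init_with_backend <resource_group> <storage_account>" >&2
    return 1
  fi
  terraform init \
    -backend-config="resource_group_name=$1" \
    -backend-config="storage_account_name=$2"
}
```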

5. Create a Terraform execution plan

Run terraform plan to create an execution plan.

terraform plan -out main.tfplan

Key points:

  • The terraform plan command creates an execution plan, but doesn't execute it. Instead, it determines what actions are necessary to create the configuration specified in your configuration files. This pattern allows you to verify whether the execution plan matches your expectations before making any changes to actual resources.
  • The optional -out parameter allows you to specify an output file for the plan. Using the -out parameter ensures that the plan you reviewed is exactly what is applied.
  • To read more about persisting execution plans and security, see the security warning section.
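
When the plan step runs from a script, the -detailed-exitcode option of terraform plan helps distinguish "no changes" from "changes pending": the command exits with 0 when the plan is empty, 1 on error, and 2 when there are changes. The following is a minimal sketch built on that option; plan_with_status is a hypothetical helper name.

```shell
# plan_with_status is a hypothetical helper name.
# It saves a plan file and reports what the -detailed-exitcode status means.
plan_with_status() {
  plan_file="${1:-main.tfplan}"
  status=0
  terraform plan -out "$plan_file" -detailed-exitcode || status=$?
  case "$status" in
    0) echo "No changes: nothing to apply." ;;
    2) echo "Changes present: review the plan, then run terraform apply $plan_file" ;;
    *) echo "terraform plan failed with exit code $status" >&2
       return "$status" ;;
  esac
  return 0
}
```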

6. Apply a Terraform execution plan

Run terraform apply to apply the execution plan to your cloud infrastructure.

terraform apply main.tfplan

Key points:

  • The terraform apply command above assumes you previously ran terraform plan -out main.tfplan.
  • If you specified a different filename for the -out parameter, use that same filename in the call to terraform apply.
  • If you didn't use the -out parameter, simply call terraform apply without any parameters.

7. Verify the results

  1. Get the resource group name.

    echo "$(terraform output resource_group_name)"
    
  2. Browse to the Azure portal.

  3. Under Azure services, select Resource groups and locate your new resource group to see the following resources created in this demo:

    • Log Analytics Solution: By default, the demo names this solution ContainerInsights. The portal shows the solution's workspace in parentheses.
    • Log Analytics Workspace: By default, the demo names this workspace with a prefix of testLogAnalyticsWorkspaceName- followed by a random number.
    • Kubernetes service: By default, the demo names this service k8stest. (A managed Kubernetes cluster is also known as an AKS, or Azure Kubernetes Service, cluster.)
  4. Get the Kubernetes configuration from the Terraform state and store it in a file that kubectl can read.

    echo "$(terraform output kube_config)" > ./azurek8s
    
  5. Verify the previous command didn't wrap the configuration in heredoc markers (<< EOT and EOT).

    cat ./azurek8s
    

    Key points:

    • If you see << EOT at the beginning and EOT at the end, edit the content of the file to remove these characters. Otherwise, you could receive the following error message: error: error loading config file "./azurek8s": yaml: line 2: mapping values are not allowed in this context
  6. Set an environment variable so that kubectl picks up the correct config.

    export KUBECONFIG=./azurek8s
    
  7. Verify the health of the cluster.

    kubectl get nodes
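
To turn the health check into a pass/fail result, the node list can be parsed for the STATUS column. This is a minimal sketch that assumes the default column layout of kubectl get nodes, where STATUS is the second column; all_nodes_ready is a hypothetical helper name.

```shell
# all_nodes_ready is a hypothetical helper name.
# It succeeds only when at least one node exists and every node reports a
# STATUS of exactly "Ready". A cordoned node ("Ready,SchedulingDisabled")
# is counted as not ready by this simple comparison.
all_nodes_ready() {
  nodes="$(kubectl get nodes --no-headers)" || return 1
  total="$(printf '%s\n' "$nodes" | awk 'NF { n++ } END { print n + 0 }')"
  ready="$(printf '%s\n' "$nodes" | awk '$2 == "Ready" { n++ } END { print n + 0 }')"
  echo "$ready of $total nodes Ready"
  [ "$total" -gt 0 ] && [ "$ready" -eq "$total" ]
}
```

With the default agent_count of 3, a healthy cluster would be expected to report all three nodes as Ready once provisioning finishes.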
    

Key points:

  • When the AKS cluster was created, monitoring was enabled to capture health metrics for both the cluster nodes and pods. These health metrics are available in the Azure portal. For more information on container health monitoring, see Monitor Azure Kubernetes Service health.
  • Several key values were output when you applied the Terraform execution plan. For example, the host address, AKS cluster user name, and AKS cluster password are output.
  • To view all of the output values, run terraform output.
  • To view a specific output value, run echo "$(terraform output <output_value_name>)".
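
The manual cleanup of the azurek8s file described in the steps above can also be scripted. The sketch below removes a leading << EOT line and a trailing EOT line if they are present and leaves the file unchanged otherwise; strip_heredoc_markers is a hypothetical helper name. On Terraform 0.14 and later, running terraform output -raw kube_config avoids the markers in the first place.

```shell
# strip_heredoc_markers is a hypothetical helper name.
# It deletes the first line if it is a heredoc opener (<<EOT, << EOT, or <<-EOT)
# and the last line if it is the closing EOT marker, then rewrites the file.
strip_heredoc_markers() {
  file="$1"
  tmp="$(mktemp)" || return 1
  sed -e '1{/^<<-\{0,1\} *EOT$/d;}' -e '${/^EOT$/d;}' "$file" > "$tmp" \
    && mv "$tmp" "$file"
}
```

For example, strip_heredoc_markers ./azurek8s can be run right after the file is written and before KUBECONFIG is set.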

8. Clean up resources

Delete AKS resources

When you no longer need the resources created via Terraform, do the following steps:

  1. Run terraform plan and specify the destroy flag.

    terraform plan -destroy -out main.destroy.tfplan
    

    Key points:

    • The terraform plan command creates an execution plan, but doesn't execute it. With the -destroy flag, it determines what actions are necessary to destroy the resources defined in your configuration files. This pattern allows you to verify whether the execution plan matches your expectations before making any changes to actual resources.
    • The optional -out parameter allows you to specify an output file for the plan. Using the -out parameter ensures that the plan you reviewed is exactly what is applied.
    • To read more about persisting execution plans and security, see the security warning section.
  2. Run terraform apply to apply the execution plan.

    terraform apply main.destroy.tfplan
    

Delete storage account

Caution

Only delete the resource group containing the storage account you used in this demo if you're not using either for anything else.

Run az group delete to delete the resource group and the storage account it contains.

az group delete --name <storage_resource_group_name> --yes

Key points:

  • Replace the storage_resource_group_name placeholder with the resource_group_name value in the providers.tf file.

Delete service principal

Caution

Only delete the service principal you used in this demo if you're not using it for anything else.

Run az ad sp delete to delete the service principal.

az ad sp delete --id <service_principal_object_id>

Troubleshoot Terraform on Azure

Troubleshoot common problems when using Terraform on Azure
