Create Azure Arc-enabled data controller using Kubernetes tools

A data controller manages Azure Arc-enabled data services for a Kubernetes cluster. This article describes how to use Kubernetes tools to create a data controller.

Creating the data controller has the following high-level steps:

  1. Create the namespace and bootstrapper service
  2. Create the data controller

Note

For simplicity, the steps below assume that you are a Kubernetes cluster administrator. For production deployments or more secure environments, follow the security best practice of least privilege: grant only the specific permissions that the users and service accounts involved in the deployment process need.

See the topic Operate Arc-enabled data services with least privileges for detailed instructions.

Prerequisites

Review the topic Plan an Azure Arc-enabled data services deployment for overview information.

To create the data controller using Kubernetes tools, you need the Kubernetes tools installed. The examples in this article use kubectl, but similar approaches work with other Kubernetes tools, such as the Kubernetes dashboard, oc, or helm, if you are familiar with those tools and with Kubernetes yaml/json.

Install the kubectl tool

Create the namespace and bootstrapper service

The bootstrapper service handles incoming requests for creating, editing, and deleting custom resources such as a data controller.

Save a copy of bootstrapper-unified.yaml, and replace the placeholder {{NAMESPACE}} everywhere it appears in the file with the desired namespace name, for example: arc.
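As a minimal sketch of the placeholder replacement, the following shell function substitutes every {{NAMESPACE}} occurrence with sed and prints the result. The function name render_namespace and the output file name are illustrative assumptions, not part of the template or of any Arc tooling.

```shell
# Replace every {{NAMESPACE}} placeholder in a template and print the result.
# $1: path to the template file, $2: namespace name (for example, arc)
render_namespace() {
  template="$1"
  namespace="$2"
  sed "s/{{NAMESPACE}}/${namespace}/g" "$template"
}

# Usage (file names are illustrative):
#   render_namespace bootstrapper-unified.yaml arc > bootstrapper-arc.yaml
#   grep -c '{{NAMESPACE}}' bootstrapper-arc.yaml   # prints 0 when none remain
```

Writing to a new file keeps the original template intact in case you need to render it again for a different namespace.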

Important

The bootstrapper-unified.yaml template file defaults to pulling the bootstrapper container image from the Microsoft Container Registry (MCR). If your environment can't directly access the Microsoft Container Registry, you can do the following:

Run the following command to create the namespace and bootstrapper service with the edited file.

kubectl apply --namespace arc -f bootstrapper-unified.yaml

Verify that the bootstrapper pod is running using the following command.

kubectl get pod --namespace arc -l app=bootstrapper

If the status is not Running, rerun the command until the status is Running.
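Instead of rerunning the command by hand, you could poll for the pod phase. The following is a sketch only: the function name wait_for_running, the POLL_SECONDS variable, and the default of 30 attempts are assumptions chosen for illustration.

```shell
# Poll until the first pod matching a label selector reports phase Running.
# $1: namespace, $2: label selector, $3: maximum attempts (default 30)
# POLL_SECONDS sets the delay between attempts (default 10).
wait_for_running() {
  namespace="$1"
  selector="$2"
  attempts="${3:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # An empty phase means the pod does not exist yet or the query failed.
    phase=$(kubectl get pod --namespace "$namespace" -l "$selector" \
      -o jsonpath='{.items[0].status.phase}' 2>/dev/null) || phase=""
    if [ "$phase" = "Running" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_SECONDS:-10}"
  done
  return 1
}

# Usage:
#   wait_for_running arc app=bootstrapper || echo "bootstrapper pod is not Running yet"
```

The function returns a nonzero status if the pod never reaches Running within the allowed attempts, so it can gate later steps in a script.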

Create the data controller

Now you are ready to create the data controller itself.

First, create a copy of the template file locally on your computer so that you can modify some of the settings.

Create the metrics and logs dashboards user names and passwords

At the top of the file, you can specify a user name and password that are used to authenticate to the metrics and logs dashboards as an administrator. Choose a secure password and share it only with those who need these privileges.

Kubernetes secrets store values as base64-encoded strings, so you need two encoded values: one for the username and one for the password.

You can use an online tool to base64 encode your desired username and password, or you can use the built-in CLI tools for your platform.

PowerShell

[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes('<your string to encode here>'))

#Example
#[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes('example'))

Linux/macOS

echo -n '<your string to encode here>' | base64

#Example
# echo -n 'example' | base64
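On Linux or macOS, it can help to wrap the encoding step in a small function and decode the result to confirm that it round-trips. This is a sketch; the function names encode_secret and decode_secret are illustrative. It uses printf to avoid a trailing newline and strips line breaks, because some base64 implementations wrap long output.

```shell
# Encode a value for a Kubernetes secret without a trailing newline or line wrapping.
encode_secret() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Decode a value to confirm that it matches the original string.
decode_secret() {
  printf '%s' "$1" | base64 --decode
}

# Usage:
#   encode_secret 'example'        # prints ZXhhbXBsZQ==
#   decode_secret 'ZXhhbXBsZQ=='   # prints example
```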

Create certificates for logs and metrics dashboards

Optionally, you can create SSL/TLS certificates for the logs and metrics dashboards. Follow the instructions at Specify SSL/TLS certificates during Kubernetes native tools deployment.

Edit the data controller configuration

Edit the data controller configuration as needed:

REQUIRED

  • location: Change this to be the Azure location where the metadata about the data controller will be stored. Review the list of available regions.
  • resourceGroup: The Azure resource group where you want to create the data controller Azure resource in Azure Resource Manager. This resource group typically already exists, but it is not required until you upload data to Azure.
  • subscription: the Azure subscription GUID for the subscription that you want to create the Azure resources in.
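Because the subscription value must be a GUID, a quick format check can catch a pasted subscription name or a leftover placeholder before you deploy. The following is a sketch; the function name is_guid is an assumption, and the check validates only the format, not whether the subscription exists.

```shell
# Return success if the argument has the 8-4-4-4-12 hexadecimal GUID format.
is_guid() {
  printf '%s\n' "$1" |
    grep -Eqi '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# Usage:
#   is_guid "$SUBSCRIPTION_ID" || echo "subscription must be a GUID"
```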

RECOMMENDED TO REVIEW AND POSSIBLY CHANGE DEFAULTS

  • storage..className: The storage class to use for the data controller data and log files. If you are unsure of the available storage classes in your Kubernetes cluster, run kubectl get storageclass. The default value is default, which assumes that a storage class named default exists; it does not mean the storage class that is marked as the cluster default. Note: There are two className settings to set to the desired storage class, one for data and one for logs.
  • serviceType: Change the service type to NodePort if you are not using a LoadBalancer.
  • Security: For Azure Red Hat OpenShift or Red Hat OpenShift Container Platform, replace the security: settings in the data controller yaml file with the following values.
  security:
    allowDumps: false
    allowNodeMetricsCollection: false
    allowPodMetricsCollection: false
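Since the className default refers to a storage class that is literally named default, you may want to confirm that such a class exists before keeping that value. The following is a sketch; the function name storage_class_exists is an illustrative assumption.

```shell
# Return success if a storage class with the given name exists in the cluster.
storage_class_exists() {
  kubectl get storageclass "$1" >/dev/null 2>&1
}

# Usage:
#   kubectl get storageclass
#   storage_class_exists default || echo "No storage class named 'default'; set className explicitly."
```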

OPTIONAL

  • name: The name of the data controller. The example in this article uses arc-dc, but you can change it if you want.
  • displayName: Set this to the same value as the name attribute under metadata in the DataController resource.
  • logsui-certificate-secret: The name of the secret created on the Kubernetes cluster for the logs UI certificate.
  • metricsui-certificate-secret: The name of the secret created on the Kubernetes cluster for the metrics UI certificate.

The following example shows a completed data controller yaml.

apiVersion: v1
data:
  password: <your base64 encoded password>
  username: <your base64 encoded username>
kind: Secret
metadata:
  name: metricsui-admin-secret
type: Opaque

---

apiVersion: v1
data:
  password: <your base64 encoded password>
  username: <your base64 encoded username>
kind: Secret
metadata:
  name: logsui-admin-secret
type: Opaque

---

apiVersion: arcdata.microsoft.com/v5
kind: DataController
metadata:
  name: arc-dc
spec:
  credentials:
    dockerRegistry: arc-private-registry # Create a registry secret named 'arc-private-registry' if you are going to pull from a private registry instead of MCR.
    serviceAccount: sa-arc-controller
  docker:
    imagePullPolicy: Always
    imageTag: v1.29.0_2024-04-09
    registry: mcr.microsoft.com
    repository: arcdata
  infrastructure: other # Must be a value in the array [alibaba, aws, azure, gcp, onpremises, other]
  security:
    allowDumps: true # Set this to false if deploying on OpenShift
    allowNodeMetricsCollection: true # Set this to false if deploying on OpenShift
    allowPodMetricsCollection: true # Set this to false if deploying on OpenShift
  services:
  - name: controller
    port: 30080
    serviceType: LoadBalancer # Modify serviceType based on your Kubernetes environment
  settings:
    ElasticSearch:
      vm.max_map_count: "-1"
    azure:
      connectionMode: indirect # Only indirect is supported for Kubernetes-native deployment for now.
      location: eastus # Choose a different Azure location if you want
      resourceGroup: <your resource group>
      subscription: <your subscription GUID>
    controller:
      displayName: arc-dc
      enableBilling: true
      logs.rotation.days: "7"
      logs.rotation.size: "5000"
  storage:
    data:
      accessMode: ReadWriteOnce
      className: default # Expects a storage class named 'default'; change to a storage class that exists in your Kubernetes environment
      size: 15Gi
    logs:
      accessMode: ReadWriteOnce
      className: default # Expects a storage class named 'default'; change to a storage class that exists in your Kubernetes environment
      size: 10Gi
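The example yaml contains several "<your ...>" placeholders that must be replaced before the file is usable. As a sketch, the following function lists any that remain; the function name find_placeholders and the file name in the usage comment are illustrative assumptions.

```shell
# Print, with line numbers, any lines that still contain a "<your ...>" placeholder.
# Returns success (0) when at least one placeholder is found.
find_placeholders() {
  grep -n '<your [^>]*>' "$1"
}

# Usage:
#   if find_placeholders data-controller.yaml; then
#     echo "Replace the placeholders listed above before creating the data controller."
#   fi
```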

Save the edited file on your local computer and run the following command to create the data controller:

kubectl create --namespace arc -f <path to your data controller file>

#Example
kubectl create --namespace arc -f data-controller.yaml

Monitoring the creation status

Creating the controller will take a few minutes to complete. You can monitor the progress in another terminal window with the following commands:

kubectl get datacontroller --namespace arc
kubectl get pods --namespace arc
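To focus on pods that are still starting, you could filter the pod list by phase with a field selector. This is a sketch; the function name pods_not_running is an assumption. Pods that have finished successfully also appear in the output, because their phase is Succeeded rather than Running.

```shell
# List pods in a namespace whose phase is anything other than Running.
# $1: namespace
pods_not_running() {
  kubectl get pods --namespace "$1" --field-selector='status.phase!=Running'
}

# Usage:
#   pods_not_running arc
```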

You can also check the creation status or logs of a particular pod by running commands like the following. This is especially useful for troubleshooting.

kubectl describe pod/<pod name> --namespace arc
kubectl logs <pod name> --namespace arc

#Example:
#kubectl describe pod/control-2g7bl --namespace arc
#kubectl logs control-2g7bl --namespace arc
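When it is not obvious which pod to inspect, the namespace events sorted by time can point to the failing component. The following is a sketch; the function name recent_events is an illustrative assumption.

```shell
# Show events in a namespace, oldest first, so the latest activity is at the bottom.
# $1: namespace
recent_events() {
  kubectl get events --namespace "$1" --sort-by=.lastTimestamp
}

# Usage:
#   recent_events arc
```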

Troubleshooting creation problems

If you encounter problems during creation, see the troubleshooting guide.