Bring your own key for Apache Kafka on Azure HDInsight (Preview)

Azure HDInsight includes Bring Your Own Key (BYOK) support for Apache Kafka. This capability lets you own and manage the keys used to encrypt data at rest.

All managed disks in HDInsight are protected with Azure Storage Service Encryption (SSE). By default, the data on those disks is encrypted using Microsoft-managed keys. If you enable BYOK, you provide the encryption key that HDInsight uses, and you manage that key in Azure Key Vault.

BYOK encryption is a one-step process handled during cluster creation at no additional cost. All you need to do is register HDInsight as a managed identity with Azure Key Vault and add the encryption key when you create your cluster.

All messages to the Kafka cluster (including replicas maintained by Kafka) are encrypted with a symmetric Data Encryption Key (DEK). The DEK is protected using the Key Encryption Key (KEK) from your key vault. The encryption and decryption processes are handled entirely by Azure HDInsight.
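The DEK/KEK relationship described above is commonly called envelope encryption. The following is a minimal sketch of that structure only, not of how HDInsight implements it: the cipher is a toy keystream built from the standard library so the example runs anywhere, and the function names (`encrypt_with_envelope`, `decrypt_with_envelope`) are hypothetical. A real system would use an authenticated cipher from a vetted cryptography library, and the KEK would never leave the key vault.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with an HMAC-SHA256 keystream.

    Illustration only. Applying it twice with the same key and nonce
    returns the original data, so it serves as both encrypt and decrypt.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def encrypt_with_envelope(kek: bytes, plaintext: bytes) -> dict:
    """Encrypt data with a fresh DEK, then wrap the DEK with the KEK."""
    dek = secrets.token_bytes(32)  # symmetric Data Encryption Key
    data_nonce = secrets.token_bytes(16)
    wrap_nonce = secrets.token_bytes(16)
    return {
        "ciphertext": _keystream_xor(dek, data_nonce, plaintext),
        "data_nonce": data_nonce,
        # Only the wrapped DEK is stored next to the data.
        "wrapped_dek": _keystream_xor(kek, wrap_nonce, dek),
        "wrap_nonce": wrap_nonce,
    }


def decrypt_with_envelope(kek: bytes, envelope: dict) -> bytes:
    """Unwrap the DEK with the KEK, then decrypt the data with the DEK."""
    dek = _keystream_xor(kek, envelope["wrap_nonce"], envelope["wrapped_dek"])
    return _keystream_xor(dek, envelope["data_nonce"], envelope["ciphertext"])
```

One consequence of this design is that replacing the KEK only requires re-wrapping the small DEK, not re-encrypting the data itself.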

You can use the Azure portal or Azure CLI to safely rotate the keys in the key vault. When a key rotates, the HDInsight Kafka cluster starts using the new key within minutes. Enable the "Do Not Purge" and "Soft Delete" key protection features to protect against ransomware scenarios and accidental deletion. Keys without these protection features are not supported.
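As a rough illustration of the protection requirement, the sketch below checks a dictionary of key vault properties for both features before a key is used. The helper name `missing_key_protections` is hypothetical, and the property names `enableSoftDelete` and `enablePurgeProtection` are assumed to match how the Key Vault resource exposes "Soft Delete" and "Do Not Purge"; verify them against the output of whichever tool you use to read the vault.

```python
def missing_key_protections(vault_properties: dict) -> list:
    """Return the protection features that are not enabled on the vault.

    An empty list means both features required for BYOK are turned on.
    """
    required = {
        "enableSoftDelete": "Soft Delete",       # assumed property name
        "enablePurgeProtection": "Do Not Purge",  # assumed property name
    }
    return [
        label
        for prop, label in required.items()
        if vault_properties.get(prop) is not True
    ]
```

For example, calling `missing_key_protections({"enableSoftDelete": True})` reports that "Do Not Purge" still needs to be enabled.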

Get started with BYOK

  1. Create managed identities for Azure resources.

    To authenticate to Key Vault, create a user-assigned managed identity using the Azure portal, Azure PowerShell, Azure Resource Manager, or Azure CLI. Although Azure Active Directory is required for managed identities and for BYOK to Kafka, the Enterprise Security Package (ESP) isn't a requirement. Be sure to save the managed identity resource ID for when you add it to the Key Vault access policy.

    Create user-assigned managed identity in Azure portal

  2. Import an existing key vault or create a new one.

    HDInsight only supports Azure Key Vault. If you have your own key vault, you can import your keys into Azure Key Vault. Remember that the keys must have "Soft Delete" and "Do Not Purge" enabled. The "Soft Delete" and "Do Not Purge" features are available through the REST, .NET/C#, PowerShell, and Azure CLI interfaces.

    To create a new key vault, follow the Azure Key Vault quickstart. For more information about importing existing keys, visit About keys, secrets, and certificates.

    To create a new key, select Generate/Import from the Keys menu under Settings.

    Generate a new key in Azure Key Vault

    Set Options to Generate and give the key a name.

    Generate a new key in Azure Key Vault

    Select the key you created from the list of keys.

    Azure Key Vault key list

    When you use your own key for Kafka cluster encryption, you need to provide the key URI. Copy the Key identifier and save it somewhere until you're ready to create your cluster.

    Copy key identifier

  3. Add managed identity to the key vault access policy.

    Create a new Azure Key Vault access policy.

    Create new Azure Key Vault access policy

    Under Select Principal, choose the user-assigned managed identity you created.

    Set Select Principal for Azure Key Vault access policy

    Set Key Permissions to Get, Unwrap Key, and Wrap Key.

    Set Key Permissions for Azure Key Vault access policy

    Set Secret Permissions to Get, Set, and Delete.

    Set Secret Permissions for Azure Key Vault access policy

  4. Create the HDInsight cluster.

    You're now ready to create a new HDInsight cluster. BYOK can only be applied to new clusters during cluster creation. Encryption can't be removed from BYOK clusters, and BYOK can't be added to existing clusters.

    Kafka disk encryption in Azure portal

    During cluster creation, provide the full key URL (the key identifier you copied earlier), including the key version. You also need to assign the managed identity to the cluster.
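Because the key URL must include the key version, it can help to validate the value before you start cluster creation. The sketch below assumes the key identifier has the form `https://<vault-host>/keys/<key-name>/<key-version>`; the `KeyIdentifier` type, the `parse_key_identifier` helper, and the vault and key names in the usage note are all hypothetical.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class KeyIdentifier(NamedTuple):
    vault_url: str
    key_name: str
    key_version: str


def parse_key_identifier(key_url: str) -> KeyIdentifier:
    """Split a versioned key URL into vault URL, key name, and key version.

    Raises ValueError if the URL is not https or the version is missing.
    """
    parsed = urlparse(key_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("key URL must be an absolute https URL")
    # Drop empty segments so a trailing slash does not count as a version.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) != 3 or parts[0] != "keys":
        raise ValueError(
            "expected a path of the form /keys/<key-name>/<key-version>"
        )
    return KeyIdentifier(f"https://{parsed.netloc}", parts[1], parts[2])
```

With a made-up identifier such as `https://example-vault.vault.azure.net/keys/example-key/0123456789abcdef`, the helper returns the three parts; a URL that stops at the key name raises `ValueError`.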

FAQ for BYOK to Apache Kafka

How does the Kafka cluster access my key vault?

Associate a managed identity with the HDInsight Kafka cluster during cluster creation. This managed identity can be created before or during cluster creation. You also need to grant the managed identity access to the key vault where the key is stored.
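To make the required grant concrete, here is a sketch of the access policy entry that corresponds to the permissions listed in the steps above (key permissions Get, Unwrap Key, and Wrap Key; secret permissions Get, Set, and Delete). The field names `tenantId`, `objectId`, and `permissions` and the lowercase permission strings are assumptions about how a Key Vault access policy is expressed in a resource template; `build_access_policy` is a hypothetical helper and the IDs are placeholders you would replace with your own.

```python
import json


def build_access_policy(tenant_id: str, principal_object_id: str) -> dict:
    """Build an access policy entry for the cluster's managed identity.

    The structure is an assumed template shape, not output from Azure.
    """
    return {
        "tenantId": tenant_id,
        "objectId": principal_object_id,  # the managed identity's principal
        "permissions": {
            "keys": ["get", "unwrapKey", "wrapKey"],
            "secrets": ["get", "set", "delete"],
        },
    }


if __name__ == "__main__":
    policy = build_access_policy("<tenant-id>", "<managed-identity-object-id>")
    print(json.dumps(policy, indent=2))
```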

Is this feature available for all Kafka clusters on HDInsight?

BYOK encryption is only possible for Kafka 1.1 and above clusters.

Can I have different keys for different topics/partitions?

No, all managed disks in the cluster are encrypted by the same key.

How can I recover the cluster if the keys are deleted?

Because only keys with "Soft Delete" enabled are supported, a deleted key can be restored in the key vault, after which the cluster should regain access to it. To restore an Azure Key Vault key, see Restore-AzureKeyVaultKey.

Can I have producer/consumer applications working with a BYOK cluster and a non-BYOK cluster simultaneously?

Yes. The use of BYOK is transparent to producer/consumer applications. Encryption happens at the OS layer. No changes need to be made to existing producer/consumer Kafka applications.

Are OS disks/Resource disks also encrypted?

No. OS disks and Resource disks are not encrypted.

If a cluster is scaled up, will the new brokers support BYOK seamlessly?

Yes. The cluster needs access to the key in the key vault during scale up. The same key is used to encrypt all managed disks in the cluster.

Is BYOK available in my location?

Kafka BYOK is available in all public clouds.

Next steps