Azure security baseline for HDInsight

This security baseline applies guidance from the Microsoft cloud security benchmark version 1.0 to HDInsight. The Microsoft cloud security benchmark provides recommendations on how you can secure your cloud solutions on Azure. The content is grouped by the security controls defined by the Microsoft cloud security benchmark and the related guidance applicable to HDInsight.

You can monitor this security baseline and its recommendations using Microsoft Defender for Cloud. Azure Policy definitions will be listed in the Regulatory Compliance section of the Microsoft Defender for Cloud portal page.

When a feature has relevant Azure Policy Definitions, they are listed in this baseline to help you measure compliance with the Microsoft cloud security benchmark controls and recommendations. Some recommendations may require a paid Microsoft Defender plan to enable certain security scenarios.

Note

Features not applicable to HDInsight have been excluded. To see how HDInsight completely maps to the Microsoft cloud security benchmark, see the full HDInsight security baseline mapping file.

Security profile

The security profile summarizes high-impact behaviors of HDInsight, which may result in increased security considerations.

Service Behavior Attribute Value
Product Category Analytics
Customer can access HOST / OS Read Only
Service can be deployed into customer's virtual network True
Stores customer content at rest True

Network security

For more information, see the Microsoft cloud security benchmark: Network security.

NS-1: Establish network segmentation boundaries

Features

Virtual Network Integration

Description: Service supports deployment into customer's private Virtual Network (VNet). Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Feature notes: Perimeter security in Azure HDInsight is achieved through virtual networks. An enterprise administrator can create a cluster inside a virtual network and use a network security group (NSG) to restrict access to the virtual network.

Configuration Guidance: Deploy the service into a virtual network. Assign private IPs to the resource (where applicable) unless there is a strong reason to assign public IPs directly to the resource.

Note: Based on your applications and enterprise segmentation strategy, restrict or allow traffic between internal resources based on your NSG rules. For specific, well-defined applications like a three-tier app, this can be a highly secure deny-by-default.

Reference: Plan a virtual network for Azure HDInsight

Network Security Group Support

Description: Service network traffic respects Network Security Groups rule assignment on its subnets. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Feature notes: Perimeter security in Azure HDInsight is achieved through virtual networks. An enterprise administrator can create a cluster inside a virtual network and use a network security group (NSG) to restrict access to the virtual network. Only the allowed IP addresses in the inbound NSG rules can communicate with the Azure HDInsight cluster. This configuration provides perimeter security. All clusters deployed in a virtual network will also have a private endpoint. The endpoint will resolve to a private IP address inside the Virtual Network. It provides private HTTP access to the cluster gateways.

Based on your applications and enterprise segmentation strategy, restrict or allow traffic between internal resources based on your NSG rules. For specific, well-defined applications like a three-tier app, this can be a highly secure deny-by-default.

Ports required generally across all types of clusters:

22-23 - SSH access to the cluster resources

443 - Ambari, WebHCat REST API, HiveServer ODBC, and JDBC

Configuration Guidance: Use network security groups (NSG) to restrict or monitor traffic by port, protocol, source IP address, or destination IP address. Create NSG rules to restrict your service's open ports (such as preventing management ports from being accessed from untrusted networks). Be aware that by default, NSGs deny all inbound traffic but allow traffic from virtual network and Azure Load Balancers.

Reference: Control network traffic in Azure HDInsight

NS-2: Secure cloud services with network controls

Features

Description: Service native IP filtering capability for filtering network traffic (not to be confused with NSG or Azure Firewall). Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Feature notes: Use Azure Private Link to enable private access to HDInsight from your virtual networks without crossing the internet. Private access adds a defense-in-depth measure to Azure authentication and traffic security.

Configuration Guidance: Deploy private endpoints for all Azure resources that support the Private Link feature, to establish a private access point for the resources.

Note: Use Azure Private Link to enable private access to HDInsight from your virtual networks without crossing the internet. Private access adds a defense-in-depth measure to Azure authentication and traffic security.

Reference: Enable Private Link on an HDInsight cluster

Disable Public Network Access

Description: Service supports disabling public network access either through using service-level IP ACL filtering rule (not NSG or Azure Firewall) or using a 'Disable Public Network Access' toggle switch. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Configuration Guidance: Disable public network access either using the service-level IP ACL filtering rule or a toggling switch for public network access.

Reference: Restrict public connectivity in Azure HDInsight

Identity management

For more information, see the Microsoft cloud security benchmark: Identity management.

IM-1: Use centralized identity and authentication system

Features

Azure AD Authentication Required for Data Plane Access

Description: Service supports using Azure AD authentication for data plane access. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Configuration Guidance: Use Azure Active Directory (Azure AD) as the default authentication method to control your data plane access.

Reference: Overview of enterprise security in Azure HDInsight

Local Authentication Methods for Data Plane Access

Description: Local authentications methods supported for data plane access, such as a local username and password. Learn more.

Supported Enabled By Default Configuration Responsibility
True True Microsoft

Feature notes: When an HDI cluster is created, two local admin accounts are created in the data plane (Apache Ambari). One corresponding to the user for which credential is passed by the cluster creator. The other is created by the HDI control plane. The HDI control plane uses this account to make data plane calls. Avoid the usage of local authentication methods or accounts, these should be disabled wherever possible. Instead use Azure AD to authenticate where possible.

Configuration Guidance: No additional configurations are required as this is enabled on a default deployment.

IM-3: Manage application identities securely and automatically

Features

Managed Identities

Description: Data plane actions support authentication using managed identities. Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Service Principals

Description: Data plane supports authentication using service principals. Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

IM-7: Restrict resource access based on conditions

Features

Conditional Access for Data Plane

Description: Data plane access can be controlled using Azure AD Conditional Access Policies. Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

IM-8: Restrict the exposure of credential and secrets

Features

Service Credential and Secrets Support Integration and Storage in Azure Key Vault

Description: Data plane supports native use of Azure Key Vault for credential and secrets store. Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Privileged access

For more information, see the Microsoft cloud security benchmark: Privileged access.

PA-1: Separate and limit highly privileged/administrative users

Features

Local Admin Accounts

Description: Service has the concept of a local administrative account. Learn more.

Supported Enabled By Default Configuration Responsibility
True True Microsoft

Feature notes: When an HDI cluster is created, two local admin accounts are created in the data plane (Apache Ambari). One corresponding to the user for which credential is passed by the cluster creator. The other is created by the HDI control plane. The HDI control plane uses this account to make data plane calls. Avoid the usage of local authentication methods or accounts, these should be disabled wherever possible. Instead use Azure AD to authenticate where possible.

Configuration Guidance: No additional configurations are required as this is enabled on a default deployment.

PA-7: Follow just enough administration (least privilege) principle

Features

Azure RBAC for Data Plane

Description: Azure Role-Based Access Control (Azure RBAC) can be used to managed access to service's data plane actions. Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Feature notes: Data plane supports only Ambari based roles. Fine grained ACL is done via Ranger.

Configuration Guidance: This feature is not supported to secure this service.

PA-8: Determine access process for cloud provider support

Features

Customer Lockbox

Description: Customer Lockbox can be used for Microsoft support access. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Feature notes: In support scenarios where Microsoft needs to access customer data, HDInsight supports Customer Lockbox. It provides an interface for you to review customer data access requests and approve or reject them.

Configuration Guidance: In support scenarios where Microsoft needs to access your data, use Customer Lockbox to review, then approve or reject each of Microsoft's data access requests.

Reference: Customer Lockbox for Microsoft Azure

Data protection

For more information, see the Microsoft cloud security benchmark: Data protection.

DP-1: Discover, classify, and label sensitive data

Features

Sensitive Data Discovery and Classification

Description: Tools (such as Azure Purview or Azure Information Protection) can be used for data discovery and classification in the service. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Feature notes: Use tags on resources related to your Azure HDInsight deployments to help tracking Azure resources that store or process sensitive information. Classify and identify sensitive data using Microsoft Purview. Use the service for any data stored in SQL databases or Azure Storage accounts associated to your HDInsight cluster.

For the underlying platform, which Microsoft manages, Microsoft treats all customer content as sensitive. Microsoft goes to great lengths to guard against customer data loss and exposure. To ensure customer data within Azure remains secure, Microsoft has implemented and maintains a suite of robust data protection controls and capabilities.

Configuration Guidance: Use tools such as Azure Purview, Azure Information Protection, and Azure SQL Data Discovery and Classification to centrally scan, classify and label any sensitive data that resides in Azure, on-premises, Microsoft 365, or other locations.

Reference: Azure customer data protection

DP-2: Monitor anomalies and threats targeting sensitive data

Features

Data Leakage/Loss Prevention

Description: Service supports DLP solution to monitor sensitive data movement (in customer's content). Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

DP-3: Encrypt sensitive data in transit

Features

Data in Transit Encryption

Description: Service supports data in-transit encryption for data plane. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Shared

Feature notes: HDInsight supports data encryption in transit with TLS v1.2 or greater. Encrypt all sensitive information in transit. Make sure that any clients connecting to your Azure HDInsight cluster or cluster data stores (Azure Storage Accounts or Azure Data Lake Storage Gen1/Gen2) can negotiate TLS 1.2 or greater. Microsoft Azure resources will negotiate TLS 1.2 by default.

To complement access controls, protect data in transit against "out of band" attacks like traffic capture. Use encryption to make sure that attackers can't easily read or modify the data.

For remote management, use SSH (for Linux) or RDP/TLS (for Windows) instead of an unencrypted protocol. Obsolete SSL, TLS, SSH versions and protocols, and weak ciphers should be disabled.

Configuration Guidance: Enable secure transfer in services where there is a native data in transit encryption feature built in. Enforce HTTPS on any web applications and services and ensure TLS v1.2 or later is used. Legacy versions such as SSL 3.0, TLS v1.0 should be disabled. For remote management of Virtual Machines, use SSH (for Linux) or RDP/TLS (for Windows) instead of an unencrypted protocol.

Note: HDInsight supports data encryption in transit with TLS v1.2 or greater. Encrypt all sensitive information in transit. Make sure that any clients connecting to your Azure HDInsight cluster or cluster data stores (Azure Storage Accounts or Azure Data Lake Storage Gen1/Gen2) can negotiate TLS 1.2 or greater. Microsoft Azure resources will negotiate TLS 1.2 by default.

To complement access controls, protect data in transit against "out of band" attacks like traffic capture. Use encryption to make sure that attackers can't easily read or modify the data.

For remote management, use SSH (for Linux) or RDP/TLS (for Windows) instead of an unencrypted protocol. Obsolete SSL, TLS, SSH versions and protocols, and weak ciphers should be disabled.

By default, Azure provides encryption for data in transit between Azure data centers.

DP-4: Enable data at rest encryption by default

Features

Data at Rest Encryption Using Platform Keys

Description: Data at-rest encryption using platform keys is supported, any customer content at rest is encrypted with these Microsoft managed keys. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Shared

Feature notes: If using Azure SQL Database to store Apache Hive and Apache Oozie metadata, ensure SQL data always remains encrypted. For Azure Storage Accounts and Data Lake Storage (Gen1 or Gen2), it's recommended to allow Microsoft to manage your encryption keys, however, you can manage your own keys.

HDInsight supports multiple types of encryption in two different layers:

Server Side Encryption (SSE) - SSE is performed by the storage service. In HDInsight, SSE is used to encrypt OS disks and data disks. It's enabled by default. SSE is a layer 1 encryption service.

Encryption at host using platform-managed key - Similar to SSE, this type of encryption is performed by the storage service. However, it's only for temporary disks and isn't enabled by default. Encryption at host is also a layer 1 encryption service.

Encryption at rest using customer managed key - This type of encryption can be used on data and temporary disks. It isn't enabled by default and requires the customer to provide their own key through Azure key vault. Encryption at rest is a layer 2 encryption service.

Configuration Guidance: Enable data at rest encryption using platform managed (Microsoft managed) keys where not automatically configured by the service.

Note: If using Azure SQL Database to store Apache Hive and Apache Oozie metadata, ensure SQL data always remains encrypted. For Azure Storage Accounts and Data Lake Storage (Gen1 or Gen2), it's recommended to allow Microsoft to manage your encryption keys, however, you can manage your own keys.

HDInsight supports multiple types of encryption in two different layers:

Server Side Encryption (SSE) - SSE is performed by the storage service. In HDInsight, SSE is used to encrypt OS disks and data disks. It's enabled by default. SSE is a layer 1 encryption service.

Encryption at host using platform-managed key - Similar to SSE, this type of encryption is performed by the storage service. However, it's only for temporary disks and isn't enabled by default. Encryption at host is also a layer 1 encryption service.

Encryption at rest using customer managed key - This type of encryption can be used on data and temporary disks. It isn't enabled by default and requires the customer to provide their own key through Azure key vault. Encryption at rest is a layer 2 encryption service.

Reference: Azure HDInsight double encryption for data at rest

DP-5: Use customer-managed key option in data at rest encryption when required

Features

Data at Rest Encryption Using CMK

Description: Data at-rest encryption using customer-managed keys is supported for customer content stored by the service. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Shared

Feature notes: If using Azure SQL Database to store Apache Hive and Apache Oozie metadata, ensure SQL data always remains encrypted. For Azure Storage Accounts and Data Lake Storage (Gen1 or Gen2), it's recommended to allow Microsoft to manage your encryption keys, however, you can manage your own keys.

HDInsight supports multiple types of encryption in two different layers:

Server Side Encryption (SSE) - SSE is performed by the storage service. In HDInsight, SSE is used to encrypt OS disks and data disks. It's enabled by default. SSE is a layer 1 encryption service.

Encryption at host using platform-managed key - Similar to SSE, this type of encryption is performed by the storage service. However, it's only for temporary disks and isn't enabled by default. Encryption at host is also a layer 1 encryption service.

Encryption at rest using customer managed key - This type of encryption can be used on data and temporary disks. It isn't enabled by default and requires the customer to provide their own key through Azure key vault. Encryption at rest is a layer 2 encryption service.

Configuration Guidance: If required for regulatory compliance, define the use case and service scope where encryption using customer-managed keys are needed. Enable and implement data at rest encryption using customer-managed key for those services.

Note: If using Azure SQL Database to store Apache Hive and Apache Oozie metadata, ensure SQL data always remains encrypted. For Azure Storage Accounts and Data Lake Storage (Gen1 or Gen2), it's recommended to allow Microsoft to manage your encryption keys, however, you can manage your own keys.

HDInsight supports multiple types of encryption in two different layers:

Server Side Encryption (SSE) - SSE is performed by the storage service. In HDInsight, SSE is used to encrypt OS disks and data disks. It's enabled by default. SSE is a layer 1 encryption service.

Encryption at host using platform-managed key - Similar to SSE, this type of encryption is performed by the storage service. However, it's only for temporary disks and isn't enabled by default. Encryption at host is also a layer 1 encryption service.

Encryption at rest using customer managed key - This type of encryption can be used on data and temporary disks. It isn't enabled by default and requires the customer to provide their own key through Azure key vault. Encryption at rest is a layer 2 encryption service.

Reference: Azure HDInsight double encryption for data at rest

DP-6: Use a secure key management process

Features

Key Management in Azure Key Vault

Description: The service supports Azure Key Vault integration for any customer keys, secrets, or certificates. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Shared

Feature notes: If using Azure SQL Database to store Apache Hive and Apache Oozie metadata, ensure SQL data always remains encrypted. For Azure Storage Accounts and Data Lake Storage (Gen1 or Gen2), it's recommended to allow Microsoft to manage your encryption keys, however, you can manage your own keys.

HDInsight supports multiple types of encryption in two different layers:

Server Side Encryption (SSE) - SSE is performed by the storage service. In HDInsight, SSE is used to encrypt OS disks and data disks. It's enabled by default. SSE is a layer 1 encryption service.

Encryption at host using platform-managed key - Similar to SSE, this type of encryption is performed by the storage service. However, it's only for temporary disks and isn't enabled by default. Encryption at host is also a layer 1 encryption service.

Encryption at rest using customer managed key - This type of encryption can be used on data and temporary disks. It isn't enabled by default and requires the customer to provide their own key through Azure key vault. Encryption at rest is a layer 2 encryption service.

Configuration Guidance: Use Azure Key Vault to create and control the life cycle of your encryption keys, including key generation, distribution, and storage. Rotate and revoke your keys in Azure Key Vault and your service based on a defined schedule or when there is a key retirement or compromise. When there is a need to use customer-managed key (CMK) in the workload, service, or application level, ensure you follow the best practices for key management: Use a key hierarchy to generate a separate data encryption key (DEK) with your key encryption key (KEK) in your key vault. Ensure keys are registered with Azure Key Vault and referenced via key IDs from the service or application. If you need to bring your own key (BYOK) to the service (such as importing HSM-protected keys from your on-premises HSMs into Azure Key Vault), follow recommended guidelines to perform initial key generation and key transfer.

Note: If you're using Azure Key Vault with your Azure HDInsight deployment, periodically test restoration of backed up customer-managed keys.

Reference: Azure HDInsight double encryption for data at rest

DP-7: Use a secure certificate management process

Features

Certificate Management in Azure Key Vault

Description: The service supports Azure Key Vault integration for any customer certificates. Learn more.

Supported Enabled By Default Configuration Responsibility
True True Microsoft

Configuration Guidance: No additional configurations are required as this is enabled on a default deployment.

Reference: Azure HDInsight double encryption for data at rest

Asset management

For more information, see the Microsoft cloud security benchmark: Asset management.

AM-2: Use only approved services

Features

Azure Policy Support

Description: Service configurations can be monitored and enforced via Azure Policy. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Feature notes: Use Azure Policy aliases in the "Microsoft.HDInsight" namespace to create custom policies. Configure the policies to audit or enforce the network configuration of your HDInsight cluster.

If you have a Rapid7, Qualys, or any other vulnerability management platform subscription, you have options. You can use script actions to install vulnerability assessment agents on your Azure HDInsight cluster nodes and manage the nodes through the respective portal.

With Azure HDInsight ESP, you can use Apache Ranger to create and manage fine-grained access control and data obfuscation policies. You can do so for your data stored in: Files/Folders/Databases/Tables/Rows/Columns.

The Hadoop admin can configure Azure RBAC to secure Apache Hive, HBase, Kafka, and Spark using those plugins in Apache Ranger.

Configuration Guidance: Use Microsoft Defender for Cloud to configure Azure Policy to audit and enforce configurations of your Azure resources. Use Azure Monitor to create alerts when there is a configuration deviation detected on the resources. Use Azure Policy [deny] and [deploy if not exists] effects to enforce secure configuration across Azure resources.

Reference: Azure Policy built-in definitions for Azure HDInsight

AM-5: Use only approved applications in virtual machine

Features

Microsoft Defender for Cloud - Adaptive Application Controls

Description: Service can limit what customer applications run on the virtual machine using Adaptive Application Controls in Microsoft Defender for Cloud. Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Feature notes: Azure HDInsight doesn't support defender natively; however, it does use ClamAV. Additionally, when using the ESP for HDInsight, you can use some of the Microsoft Defender for Cloud built-in threat detection capability. You can also enable Microsoft Defender for your VMs associated to HDInsight.

Configuration Guidance: This feature is not supported to secure this service.

Logging and threat detection

For more information, see the Microsoft cloud security benchmark: Logging and threat detection.

LT-1: Enable threat detection capabilities

Features

Microsoft Defender for Service / Product Offering

Description: Service has an offering-specific Microsoft Defender solution to monitor and alert on security issues. Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

LT-4: Enable logging for security investigation

Features

Azure Resource Logs

Description: Service produces resource logs that can provide enhanced service-specific metrics and logging. The customer can configure these resource logs and send them to their own data sink like a storage account or log analytics workspace. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Feature notes: Activity logs are available automatically. The logs contain all PUT, POST, and DELETE, but not GET, operations for your HDInsight resources except read operations (GET). You can use activity logs to find errors when troubleshooting, or to monitor how users in your organization modified resources.

Enable Azure resource logs for HDInsight. You can use Microsoft Defender for Cloud and Azure Policy to enable resource logs and log data collecting. These logs can be critical for investigating security incidents and carrying out forensic exercises.

HDInsight also produces security audit logs for the local administer accounts. Enable these local admin audit logs.

Configuration Guidance: Enable resource logs for the service. For example, Key Vault supports additional resource logs for actions that get a secret from a key vault or and Azure SQL has resource logs that track requests to a database. The content of resource logs varies by the Azure service and resource type.

Reference: Manage logs for an HDInsight cluster

Posture and vulnerability management

For more information, see the Microsoft cloud security benchmark: Posture and vulnerability management.

PV-3: Define and establish secure configurations for compute resources

Features

Azure Automation State Configuration

Description: Azure Automation State Configuration can be used to maintain the security configuration of the operating system. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Feature notes: Azure HDInsight Operating System Images are managed and maintained by Microsoft. However, the customer is responsible for implementing OS-level state configuration for that image. Microsoft VM templates combined with Azure Automation State Configuration can help meet and maintain security requirements.

Configuration Guidance: Use Azure Automation State Configuration to maintain the security configuration of the operating system.

Reference: Azure Automation State Configuration overview

Azure Policy Guest Configuration Agent

Description: Azure Policy guest configuration agent can be installed or deployed as an extension to compute resources. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Configuration Guidance: There is no current Microsoft guidance for this feature configuration. Please review and determine if your organization wants to configure this security feature.

Reference: Understand the machine configuration feature of Azure Automanage

Custom VM Images

Description: Service supports using user-supplied VM images or pre-built images from the marketplace with certain baseline configurations pre-applied. Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Custom Containers Images

Description: Service supports using user-supplied container images or pre-built images from the marketplace with certain baseline configurations pre-applied. Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

PV-5: Perform vulnerability assessments

Features

Vulnerability Assessment using Microsoft Defender

Description: Service can be scanned for vulnerability scan using Microsoft Defender for Cloud or other Microsoft Defender services embedded vulnerability assessment capability (including Microsoft Defender for server, container registry, App Service, SQL, and DNS). Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Feature notes: Azure HDInsight doesn't support Microsoft Defender for vulnerability assessment natively, it uses ClamAV for malware protection. However, when using the ESP for HDInsight, you can use some of the Microsoft Defender for Cloud built-in threat detection capability. You can also enable Microsoft Defender for your VMs associated to HDInsight.

Forward any logs from HDInsight to your SIEM, which can be used to set up custom threat detections. Ensure that you're monitoring different types of Azure assets for potential threats and anomalies. Focus on getting high-quality alerts to reduce false positives for analysts to sort through. Alerts can be sourced from log data, agents, or other data.

Configuration Guidance: Follow recommendations from Microsoft Defender for Cloud for performing vulnerability assessments on your Azure virtual machines, container images, and SQL servers.

Note: Azure HDInsight doesn't support defender natively, it uses ClamAV. However, when using the ESP for HDInsight, you can use some of the Microsoft Defender for Cloud built-in threat detection capability. You can also enable Microsoft Defender for your VMs associated to HDInsight.

Forward any logs from HDInsight to your SIEM, which can be used to set up custom threat detections. Ensure that you're monitoring different types of Azure assets for potential threats and anomalies. Focus on getting high-quality alerts to reduce false positives for analysts to sort through. Alerts can be sourced from log data, agents, or other data.

PV-6: Rapidly and automatically remediate vulnerabilities

Features

Azure Automation Update Management

Description: Service can use Azure Automation Update Management to deploy patches and updates automatically. Learn more.

Supported Enabled By Default Configuration Responsibility
True False Shared

Feature notes: Ubuntu images become available for new Azure HDInsight cluster creation within three months of being published. Running clusters are not autopatched. Customers must use script actions or other mechanisms to patch a running cluster. As a best practice, you can run these script actions and apply security updates right after the cluster creation.

Configuration Guidance: Use Azure Automation Update Management or a third-party solution to ensure that the most recent security updates are installed on your Windows and Linux VMs. For Windows VMs, ensure Windows Update has been enabled and set to update automatically.

Note: Ubuntu images become available for new Azure HDInsight cluster creation within three months of being published. Running clusters aren't autopatched. Customers must use script actions or other mechanisms to patch a running cluster. As a best practice, you can run these script actions and apply security updates right after the cluster creation.

Reference: Update Management overview

Endpoint security

For more information, see the Microsoft cloud security benchmark: Endpoint security.

ES-1: Use Endpoint Detection and Response (EDR)

Features

EDR Solution

Description: Endpoint Detection and Response (EDR) feature such as Azure Defender for servers can be deployed into the endpoint. Learn more.

Supported Enabled By Default Configuration Responsibility
True True Microsoft

Feature notes: Azure HDInsight doesn’t support Microsoft Defender for Endpoint natively, it uses ClamAV for malware protection.

Configuration Guidance: No additional configurations are required as this is enabled on a default deployment.

Reference: Can I disable Clamscan on my cluster?

ES-2: Use modern anti-malware software

Features

Anti-Malware Solution

Description: Anti-malware feature such as Microsoft Defender Antivirus, Microsoft Defender for Endpoint can be deployed on the endpoint. Learn more.

Supported Enabled By Default Configuration Responsibility
True True Microsoft

Feature notes: Azure HDInsight uses ClamAV. Forward the ClamAV logs to a centralized SIEM or other detection and alerting system.

Configuration Guidance: No additional configurations are required as this is enabled on a default deployment.

Reference: Security and Certificates

ES-3: Ensure anti-malware software and signatures are updated

Features

Anti-Malware Solution Health Monitoring

Description: Anti-malware solution provides health status monitoring for platform, engine, and automatic signature updates. Learn more.

Supported Enabled By Default Configuration Responsibility
True True Microsoft

Feature notes: Azure HDInsight comes with Clamscan pre-installed and enabled for the cluster node images. Clamscan will perform engine and definition updates automatically and update its anti-malware signatures based on ClamAV’s official virus signature database.

Configuration Guidance: No additional configurations are required as this is enabled on a default deployment.

Reference: Security and Certificates

Backup and recovery

For more information, see the Microsoft cloud security benchmark: Backup and recovery.

BR-1: Ensure regular automated backups

Features

Azure Backup

Description: The service can be backed up by the Azure Backup service. Learn more.

Supported Enabled By Default Configuration Responsibility
False Not Applicable Not Applicable

Configuration Guidance: This feature is not supported to secure this service.

Service Native Backup Capability

Description: Service supports its own native backup capability (if not using Azure Backup). Learn more.

Supported Enabled By Default Configuration Responsibility
True False Customer

Feature notes: HBase Export and HBase Replication are common ways of enabling business continuity between HDInsight HBase clusters.

HBase Export is a batch replication process that uses the HBase Export Utility to export tables from the primary HBase cluster to its underlying Azure Data Lake Storage Gen 2 storage. The exported data can then be accessed from the secondary HBase cluster and imported into tables which must preexist in the secondary. While HBase Export does offer table level granularity, in incremental update situations, the export automation engine controls the range of incremental rows to include in each run.

Configuration Guidance: There is no current Microsoft guidance for this feature configuration. Please review and determine if your organization wants to configure this security feature.

Reference: Set up backup and replication for Apache HBase and Apache Phoenix on HDInsight

Next steps