HDInsight cluster management best practices

Learn best practices for managing HDInsight clusters.

How do I create HDInsight clusters?

| Option | Documents |
| --- | --- |
| Azure Data Factory | Create on-demand Apache Hadoop clusters in HDInsight using Azure Data Factory |
| Custom Resource Manager template | Create Apache Hadoop clusters in HDInsight by using Resource Manager templates |
| Quickstart templates | HDInsight Quickstart templates |
| Azure samples | HDInsight Azure samples |
| Azure portal | Create Linux-based clusters in HDInsight by using the Azure portal |
| Azure CLI | Create HDInsight clusters using the Azure CLI |
| Azure PowerShell | Create Linux-based clusters in HDInsight using Azure PowerShell |
| cURL | Create Apache Hadoop clusters using the Azure REST API |
| SDKs (.NET, Python, Java) | .NET, Python, Java, Go |

Note

If you create a cluster that reuses the name of a previously created cluster, wait until the deletion of the previous cluster has completed before you create the new one.
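The wait described in this note can be scripted as a simple polling loop. The following is a minimal sketch, not an HDInsight API: `wait_until_deleted` and its `cluster_exists` callback are hypothetical names, and you would supply your own check (for example, one that asks the Azure CLI or an SDK whether the old cluster is still listed). The polling interval and timeout are illustrative defaults only.

```python
import time
from typing import Callable


def wait_until_deleted(
    cluster_exists: Callable[[], bool],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until cluster_exists() returns False.

    Returns True when the old cluster is gone, or False if the
    timeout elapses first. cluster_exists is a caller-supplied,
    hypothetical check; it is not part of any Azure SDK.
    """
    waited = 0.0
    while cluster_exists():
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    return True


# Simulated check: the old cluster is still visible for two polls, then gone.
_states = iter([True, True, False])
gone = wait_until_deleted(
    lambda: next(_states), poll_seconds=1.0, sleep=lambda _seconds: None
)
print(gone)  # prints True
```

Create the new cluster only after the function returns `True`. If it returns `False`, the previous deletion is probably still in progress.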

How do I customize HDInsight clusters?

| Option | Documents |
| --- | --- |
| Script actions | Customize Azure HDInsight clusters by using script actions |
| Bootstrap | Customize HDInsight clusters using Bootstrap |
| External metastores | Use external metadata stores in Azure HDInsight |
| Custom Ambari DB | Set up HDInsight clusters with a custom Ambari DB |

What are some errors I might face when creating clusters?

| Error | More information |
| --- | --- |
| No quota | Each subscription has a quota for the number of cores that you can create in each region. For more information, see Capacity planning: quotas. |
| No more IP addresses available | Each VNet has a limited number of IP addresses. When you create an HDInsight cluster, each node (including ZooKeeper and gateway nodes) uses some of these allotted IP addresses. When all of the IP addresses are in use, you encounter this error. |
| Network security group (NSG) rules don't allow communication with HDInsight resource providers | If you use NSGs or user-defined routes (UDRs) to control inbound traffic to your HDInsight cluster, you must ensure that your cluster can communicate with critical Azure health and management services. For more information, see Network security group (NSG) service tags for Azure HDInsight. |
| Reuse of cluster name | When you use a cluster name that you have used before, you need to wait X number of minutes before recreating the cluster. Otherwise, you see a message that the resource already exists. |
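To make the IP address error concrete, the following sketch estimates whether a subnet still has room for a planned cluster. It rests on stated assumptions: each node consumes exactly one address (the actual usage per node may differ), Azure reserves five addresses in every subnet, and the function names, CIDR range, and node counts are hypothetical values chosen for illustration.

```python
import ipaddress

# Assumption: Azure reserves five addresses in each subnet.
AZURE_RESERVED_PER_SUBNET = 5


def free_addresses(subnet_cidr: str, addresses_in_use: int) -> int:
    """Estimate how many addresses remain assignable in a subnet."""
    subnet = ipaddress.ip_network(subnet_cidr)
    usable = subnet.num_addresses - AZURE_RESERVED_PER_SUBNET
    return max(usable - addresses_in_use, 0)


def cluster_fits(subnet_cidr: str, addresses_in_use: int, cluster_nodes: int) -> bool:
    """Return True if the subnet can hold one address per cluster node.

    cluster_nodes should count every node, including ZooKeeper and
    gateway nodes (assumed here to use one address each).
    """
    return cluster_nodes <= free_addresses(subnet_cidr, addresses_in_use)


# Hypothetical example: a /24 subnet with 240 addresses already taken.
print(free_addresses("10.0.0.0/24", 240))    # prints 11
print(cluster_fits("10.0.0.0/24", 240, 12))  # prints False
```

A check like this is only a rough pre-flight estimate. The authoritative count of available addresses comes from the virtual network itself.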

How do I manage running HDInsight clusters?

| Option | Documents |
| --- | --- |
| Autoscale | Automatically scale Azure HDInsight clusters |
| Manual scaling | Scale Azure HDInsight clusters |
| Monitoring with Ambari | Monitor cluster performance in Azure HDInsight |
| Monitoring with Azure Monitor logs | Use Azure Monitor logs to monitor HDInsight clusters |
| Service issues, planned maintenance, health & security advisories | Subscribe to subscription specific service health alerts |

How do I check on deleted HDInsight clusters?

Azure Monitor logs

The following Azure Monitor logs query returns the successful create, update, and delete operations for HDInsight clusters, which you can use to track deleted clusters.

AzureActivity
| where ResourceProvider == "Microsoft.HDInsight" and (OperationName == "Create or Update Cluster" or OperationName == "Delete Cluster") and ActivityStatus == "Succeeded"
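Because the query returns create and update operations as well as deletions, a cluster that was deleted and later recreated appears in both groups. The sketch below shows one way to reduce exported results to the clusters whose most recent successful operation is a deletion. It assumes the rows are available as dictionaries with `TimeGenerated`, `Resource`, and `OperationName` keys and that timestamps share one ISO 8601 format. The column names may differ in your workspace, and the function name and sample rows are hypothetical.

```python
from typing import Dict, Iterable, List, Mapping


def currently_deleted(rows: Iterable[Mapping[str, str]]) -> List[str]:
    """Return clusters whose latest recorded operation is "Delete Cluster".

    Assumes each row has "TimeGenerated" (ISO 8601 string), "Resource"
    (cluster name), and "OperationName" keys.
    """
    latest: Dict[str, str] = {}
    # Process rows oldest first so later operations overwrite earlier ones.
    for row in sorted(rows, key=lambda r: r["TimeGenerated"]):
        latest[row["Resource"]] = row["OperationName"]
    return sorted(name for name, op in latest.items() if op == "Delete Cluster")


# Hypothetical sample rows, shaped like an export of the query results.
sample_rows = [
    {"TimeGenerated": "2024-01-01T10:00:00Z", "Resource": "cluster-a",
     "OperationName": "Create or Update Cluster"},
    {"TimeGenerated": "2024-01-02T10:00:00Z", "Resource": "cluster-a",
     "OperationName": "Delete Cluster"},
    {"TimeGenerated": "2024-01-03T10:00:00Z", "Resource": "cluster-a",
     "OperationName": "Create or Update Cluster"},
    {"TimeGenerated": "2024-01-01T12:00:00Z", "Resource": "cluster-b",
     "OperationName": "Create or Update Cluster"},
    {"TimeGenerated": "2024-01-04T12:00:00Z", "Resource": "cluster-b",
     "OperationName": "Delete Cluster"},
]
print(currently_deleted(sample_rows))  # prints ['cluster-b']
```

In the sample, `cluster-a` was recreated after its deletion, so only `cluster-b` is reported as deleted.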

Next steps