Microsoft.HDInsight clusters template reference

Template format

To create a Microsoft.HDInsight/clusters resource, add the following JSON to the resources section of your template.

{
  "name": "string",
  "type": "Microsoft.HDInsight/clusters",
  "apiVersion": "2018-06-01-preview",
  "location": "string",
  "tags": {},
  "properties": {
    "clusterVersion": "string",
    "osType": "string",
    "tier": "string",
    "clusterDefinition": {
      "kind": "string",
      "componentVersion": {},
      "configurations": {}
    },
    "kafkaRestProperties": {
      "clientGroupInfo": {
        "groupName": "string",
        "groupId": "string"
      }
    },
    "securityProfile": {
      "directoryType": "ActiveDirectory",
      "domain": "string",
      "organizationalUnitDN": "string",
      "ldapsUrls": [
        "string"
      ],
      "domainUsername": "string",
      "domainUserPassword": "string",
      "clusterUsersGroupDNs": [
        "string"
      ],
      "aaddsResourceId": "string",
      "msiResourceId": "string"
    },
    "computeProfile": {
      "roles": [
        {
          "name": "string",
          "minInstanceCount": "integer",
          "targetInstanceCount": "integer",
          "autoscale": {
            "capacity": {
              "minInstanceCount": "integer",
              "maxInstanceCount": "integer"
            },
            "recurrence": {
              "timeZone": "string",
              "schedule": [
                {
                  "days": [
                    "string"
                  ],
                  "timeAndCapacity": {
                    "time": "string",
                    "minInstanceCount": "integer",
                    "maxInstanceCount": "integer"
                  }
                }
              ]
            }
          },
          "hardwareProfile": {
            "vmSize": "string"
          },
          "osProfile": {
            "linuxOperatingSystemProfile": {
              "username": "string",
              "password": "string",
              "sshProfile": {
                "publicKeys": [
                  {
                    "certificateData": "string"
                  }
                ]
              }
            }
          },
          "virtualNetworkProfile": {
            "id": "string",
            "subnet": "string"
          },
          "dataDisksGroups": [
            {
              "disksPerNode": "integer"
            }
          ],
          "scriptActions": [
            {
              "name": "string",
              "uri": "string",
              "parameters": "string"
            }
          ]
        }
      ]
    },
    "storageProfile": {
      "storageaccounts": [
        {
          "name": "string",
          "isDefault": "boolean",
          "container": "string",
          "fileSystem": "string",
          "key": "string",
          "resourceId": "string",
          "msiResourceId": "string"
        }
      ]
    },
    "diskEncryptionProperties": {
      "vaultUri": "string",
      "keyName": "string",
      "keyVersion": "string",
      "encryptionAlgorithm": "string",
      "msiResourceId": "string"
    },
    "minSupportedTlsVersion": "string"
  },
  "identity": {
    "type": "string",
    "userAssignedIdentities": {}
  }
}

Property values

The following tables describe the values you need to set in the schema.

Microsoft.HDInsight/clusters object

Name Type Required Value
name string Yes The name of the cluster.
type enum Yes Microsoft.HDInsight/clusters
apiVersion enum Yes 2018-06-01-preview
location string No The location of the cluster.
tags object No The resource tags.
properties object Yes The cluster create parameters. - ClusterCreateProperties object
identity object No The identity of the cluster, if configured. - ClusterIdentity object

ClusterCreateProperties object

Name Type Required Value
clusterVersion string No The version of the cluster.
osType enum No The type of operating system. - Windows or Linux
tier enum No The cluster tier. - Standard or Premium
clusterDefinition object No The cluster definition. - ClusterDefinition object
kafkaRestProperties object No The cluster Kafka REST proxy configuration. - KafkaRestProperties object
securityProfile object No The security profile. - SecurityProfile object
computeProfile object No The compute profile. - ComputeProfile object
storageProfile object No The storage profile. - StorageProfile object
diskEncryptionProperties object No The disk encryption properties. - DiskEncryptionProperties object
minSupportedTlsVersion string No The minimum supported TLS version.

ClusterIdentity object

Name Type Required Value
type enum No The type of identity used for the cluster. The type 'SystemAssigned, UserAssigned' includes both an implicitly created identity and a set of user-assigned identities. - One of: SystemAssigned; UserAssigned; 'SystemAssigned, UserAssigned'; None
userAssignedIdentities object No The list of user identities associated with the cluster. The user identity dictionary key references will be ARM resource ids in the form: '/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identityName}'.
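
As an illustration of the identity shape described above, the fragment below attaches one user-assigned identity to a cluster. The subscription ID, the resource group name (my-rg), and the identity name (my-identity) are hypothetical placeholders. Using an empty object as the dictionary value is an assumption based on how ARM templates commonly express user-assigned identities.

```json
{
  "identity": {
    "type": "UserAssigned",
    "userAssignedIdentities": {
      "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/my-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/my-identity": {}
    }
  }
}
```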

ClusterDefinition object

Name Type Required Value
kind string No The type of cluster.
componentVersion object No The versions of different services in the cluster.
configurations object No The cluster configurations.

KafkaRestProperties object

Name Type Required Value
clientGroupInfo object No Information about the AAD security group. - ClientGroupInfo object

SecurityProfile object

Name Type Required Value
directoryType enum No The directory type. - ActiveDirectory
domain string No The organization's Active Directory domain.
organizationalUnitDN string No The organizational unit within the Active Directory to place the cluster and service accounts.
ldapsUrls array No The LDAPS protocol URLs to communicate with the Active Directory. - string
domainUsername string No The domain user account that will have admin privileges on the cluster.
domainUserPassword string No The domain admin password.
clusterUsersGroupDNs array No Optional. The Distinguished Names for cluster user groups - string
aaddsResourceId string No The resource ID of the user's Azure Active Directory Domain Service.
msiResourceId string No User assigned identity that has permissions to read and create cluster-related artifacts in the user's AADDS.
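
A minimal sketch of a securityProfile that uses only properties from the table above. The domain name, the LDAPS URL and its port, the user name, and the two template parameter names (aaddsResourceId, managedIdentityResourceId) are hypothetical. The password and the optional clusterUsersGroupDNs are left out here; in a real template the password would normally come from a secure parameter rather than a literal.

```json
{
  "securityProfile": {
    "directoryType": "ActiveDirectory",
    "domain": "contoso.onmicrosoft.com",
    "ldapsUrls": [
      "ldaps://contoso.onmicrosoft.com:636"
    ],
    "domainUsername": "clusteradmin@contoso.onmicrosoft.com",
    "aaddsResourceId": "[parameters('aaddsResourceId')]",
    "msiResourceId": "[parameters('managedIdentityResourceId')]"
  }
}
```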

ComputeProfile object

Name Type Required Value
roles array No The list of roles in the cluster. - Role object

StorageProfile object

Name Type Required Value
storageaccounts array No The list of storage accounts in the cluster. - StorageAccount object

DiskEncryptionProperties object

Name Type Required Value
vaultUri string No Base key vault URI where the customer's key is located, e.g. https://myvault.vault.azure.net
keyName string No Key name that is used for enabling disk encryption.
keyVersion string No Specific key version that is used for enabling disk encryption.
encryptionAlgorithm enum No Algorithm identifier for encryption, default RSA-OAEP. - RSA-OAEP, RSA-OAEP-256, RSA1_5
msiResourceId string No Resource ID of Managed Identity that is used to access the key vault.
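
The fragment below sketches how the disk encryption properties fit together. The vault URI reuses the example from the table; the key name, the key version placeholder, and the managedIdentityResourceId parameter are hypothetical. RSA-OAEP is written out explicitly even though the table lists it as the default.

```json
{
  "diskEncryptionProperties": {
    "vaultUri": "https://myvault.vault.azure.net",
    "keyName": "my-cluster-key",
    "keyVersion": "<key-version>",
    "encryptionAlgorithm": "RSA-OAEP",
    "msiResourceId": "[parameters('managedIdentityResourceId')]"
  }
}
```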

ClientGroupInfo object

Name Type Required Value
groupName string No The AAD security group name.
groupId string No The AAD security group ID.

Role object

Name Type Required Value
name string No The name of the role.
minInstanceCount integer No The minimum instance count of the cluster.
targetInstanceCount integer No The instance count of the cluster.
autoscale object No The autoscale configurations. - Autoscale object
hardwareProfile object No The hardware profile. - HardwareProfile object
osProfile object No The operating system profile. - OsProfile object
virtualNetworkProfile object No The virtual network profile. - VirtualNetworkProfile object
dataDisksGroups array No The data disks groups for the role. - DataDisksGroups object
scriptActions array No The list of script actions on the role. - ScriptAction object

StorageAccount object

Name Type Required Value
name string No The name of the storage account.
isDefault boolean No Whether or not the storage account is the default storage account.
container string No The container in the storage account, only to be specified for WASB storage accounts.
fileSystem string No The filesystem, only to be specified for Azure Data Lake Storage Gen 2.
key string No The storage account access key.
resourceId string No The resource ID of storage account, only to be specified for Azure Data Lake Storage Gen 2.
msiResourceId string No The managed identity (MSI) that is allowed to access the storage account, only to be specified for Azure Data Lake Storage Gen 2.
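
To show which properties go with which storage type, the sketch below lists a default WASB account next to an additional Azure Data Lake Storage Gen 2 account. The account names, the endpoint suffixes (blob.core.windows.net and dfs.core.windows.net), the container and file system names, and the three template parameters are all assumptions made for illustration. Whether this particular combination is supported depends on the cluster, so treat it as a shape reference only.

```json
{
  "storageProfile": {
    "storageaccounts": [
      {
        "name": "mystorageaccount.blob.core.windows.net",
        "isDefault": true,
        "container": "mycluster-container",
        "key": "[parameters('storageAccountKey')]"
      },
      {
        "name": "mydatalake.dfs.core.windows.net",
        "isDefault": false,
        "fileSystem": "myfilesystem",
        "resourceId": "[parameters('dataLakeStorageResourceId')]",
        "msiResourceId": "[parameters('managedIdentityResourceId')]"
      }
    ]
  }
}
```

The WASB entry sets container and key, while the Data Lake Storage Gen 2 entry sets fileSystem, resourceId, and msiResourceId, matching the "only to be specified for" notes in the table.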

Autoscale object

Name Type Required Value
capacity object No Parameters for load-based autoscale - AutoscaleCapacity object
recurrence object No Parameters for schedule-based autoscale - AutoscaleRecurrence object
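
For load-based autoscale, only the capacity object is set. The role fragment below is a sketch: the role name workernode and the instance counts are hypothetical values, and the fragment belongs inside the roles array of computeProfile.

```json
{
  "name": "workernode",
  "targetInstanceCount": 3,
  "autoscale": {
    "capacity": {
      "minInstanceCount": 3,
      "maxInstanceCount": 10
    }
  }
}
```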

HardwareProfile object

Name Type Required Value
vmSize string No The size of the VM

OsProfile object

Name Type Required Value
linuxOperatingSystemProfile object No The Linux OS profile. - LinuxOperatingSystemProfile object

VirtualNetworkProfile object

Name Type Required Value
id string No The ID of the virtual network.
subnet string No The name of the subnet.

DataDisksGroups object

Name Type Required Value
disksPerNode integer No The number of disks per node.

ScriptAction object

Name Type Required Value
name string Yes The name of the script action.
uri string Yes The URI to the script.
parameters string Yes The parameters for the script provided.
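
Because all three ScriptAction properties are required, a script that takes no arguments still needs a parameters value. The sketch below assumes an empty string is acceptable for that case; the action name and the script URI are hypothetical.

```json
{
  "scriptActions": [
    {
      "name": "install-tools",
      "uri": "https://example.com/scripts/install-tools.sh",
      "parameters": ""
    }
  ]
}
```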

AutoscaleCapacity object

Name Type Required Value
minInstanceCount integer No The minimum instance count of the cluster
maxInstanceCount integer No The maximum instance count of the cluster

AutoscaleRecurrence object

Name Type Required Value
timeZone string No The time zone for the autoscale schedule times
schedule array No Array of schedule-based autoscale rules - AutoscaleSchedule object

LinuxOperatingSystemProfile object

Name Type Required Value
username string No The username.
password string No The password.
sshProfile object No The SSH profile. - SshProfile object

AutoscaleSchedule object

Name Type Required Value
days array No Days of the week for a schedule-based autoscale rule - Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
timeAndCapacity object No Time and capacity for a schedule-based autoscale rule - AutoscaleTimeAndCapacity object

SshProfile object

Name Type Required Value
publicKeys array No The list of SSH public keys. - SshPublicKey object

AutoscaleTimeAndCapacity object

Name Type Required Value
time string No 24-hour time in the form hh:mm
minInstanceCount integer No The minimum instance count of the cluster
maxInstanceCount integer No The maximum instance count of the cluster
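
Putting the AutoscaleRecurrence, AutoscaleSchedule, and AutoscaleTimeAndCapacity objects together, a schedule-based configuration might look like the sketch below. The time zone string, the times, and the counts are hypothetical. Setting minInstanceCount equal to maxInstanceCount in each rule is an assumption about typical usage, not something the tables above require.

```json
{
  "autoscale": {
    "recurrence": {
      "timeZone": "Pacific Standard Time",
      "schedule": [
        {
          "days": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
          "timeAndCapacity": {
            "time": "09:00",
            "minInstanceCount": 10,
            "maxInstanceCount": 10
          }
        },
        {
          "days": ["Saturday", "Sunday"],
          "timeAndCapacity": {
            "time": "09:00",
            "minInstanceCount": 3,
            "maxInstanceCount": 3
          }
        }
      ]
    }
  }
}
```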

SshPublicKey object

Name Type Required Value
certificateData string No The certificate for SSH.
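
The nesting from OsProfile down to SshPublicKey can be sketched as follows for a role that uses SSH public key authentication. The user name sshuser and the sshPublicKey template parameter are hypothetical, and the password property is left out on the assumption that key-based authentication is used instead.

```json
{
  "osProfile": {
    "linuxOperatingSystemProfile": {
      "username": "sshuser",
      "sshProfile": {
        "publicKeys": [
          {
            "certificateData": "[parameters('sshPublicKey')]"
          }
        ]
      }
    }
  }
}
```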

Quickstart templates

The following quickstart templates deploy this resource type.

Template Description
HDInsight with Loadbased Autoscale Enabled
This template allows you to create an HDInsight Spark cluster with load-based Autoscale enabled.
HDInsight with schedule-based Autoscale Enabled
This template allows you to create an HDInsight Spark cluster with schedule-based Autoscale enabled.
HDInsight with custom Ambari + Hive Metastore DB in VNET
This template allows you to create an HDInsight cluster in an existing virtual network with a new SQL DB that serves as both a custom Ambari DB and Hive Metastore. You must have an existing SQL Server, storage account, and VNET.
Deploy Linux HBase cluster with enhanced writes in HDInsight
This template allows you to create a Linux-based HBase cluster with enhanced writes in Azure HDInsight.
Deploy a Linux-based HBase cluster in HDInsight
This template allows you to create a Linux-based HBase cluster in Azure HDInsight.
Deploy a VNet, and a HBase cluster within the VNet
This template allows you to create an Azure VNet and an HDInsight HBase cluster running Linux within the VNet.
Deploy an Azure VNet and two HBase clusters within the VNet
This template allows you to configure an HBase environment with two HBase clusters within a VNet for configuring HBase replication.
Deploy HBase replication with two VNets in one region
This template allows you to configure an HBase environment with two HBase clusters within two VNets in the same region for configuring HBase replication.
Deploy an Interactive Hive cluster in HDInsight.
This template allows you to create an Interactive Hive (LLAP) cluster in HDInsight and the dependent Azure Storage account. The SSH authentication method for the cluster is username and password. For a template using SSH public key authentication, see https://azure.microsoft.com/resources/templates/101-hdinsight-linux-ssh-publickey/
Deploy Kafka on HDInsight in a virtual network
This template allows you to create an Azure Virtual Network and a Kafka on HDInsight cluster in the virtual network. The SSH authentication method for the cluster is username and password. For a template using SSH public key authentication, see https://azure.microsoft.com/resources/templates/101-hdinsight-linux-ssh-publickey/
Deploy HDInsight cluster + Confluent Schema Registry node
This template allows you to create an HDInsight cluster running Linux with a schema registry edge node. For more information, see https://docs.microsoft.com/azure/hdinsight/hdinsight-apps-use-edge-node
Deploy HDInsight cluster with Storage and SSH password
This template allows you to create a Linux-based Hadoop cluster in HDInsight and the dependent Azure Storage account. The SSH authentication method for the cluster is username and password. For a template using SSH public key authentication, see https://azure.microsoft.com/resources/templates/101-hdinsight-linux-ssh-publickey/
Deploy HDInsight on Linux (w/ Azure Storage, SSH key)
This template allows you to create an HDInsight cluster running Linux. This template also creates an Azure Storage account. The SSH authentication method for the cluster is username / public key.
HDInsight (Linux on existing Hive metastore, SSH, vnet)
This template allows you to create an HDInsight cluster running Linux, on an existing Hive metastore and virtual network. The SSH authentication method for the cluster is username / password.
Deploy a HDInsight cluster with an edge node
This template allows you to create an HDInsight cluster running Linux with an empty edge node. For more information, see https://docs.microsoft.com/azure/hdinsight/hdinsight-apps-use-edge-node
Deploy HDInsight cluster with existing default storage
This template allows you to create a Hadoop cluster in HDInsight. The cluster uses an existing storage account as the default storage account.
Deploy HDInsight cluster with existing linked storage
This template allows you to create a Hadoop cluster in HDInsight and the dependent default storage account. The template also links an existing storage account. The linked storage account usually contains the business data.
Deploy a HDInsight cluster and a SQL database
This template allows you to create an HDInsight cluster and a SQL Database for testing Sqoop.
HDInsight cluster with TLS version 1.2 or newer
This template allows you to create an HDInsight cluster that enforces TLS protocol version 1.2 or newer.
Deploy an R-server HDInsight cluster
This template allows you to create an HDInsight cluster running Linux with R Server for HDInsight. This template also creates an Azure Storage account. The SSH authentication method for the cluster is username / password.
Deploy a secure VNet and a HDInsight cluster within the VNet
This template allows you to create an Azure VNet and an HDInsight Hadoop cluster running Linux within the VNet.
Deploy a Spark cluster in Azure HDInsight
This template allows you to create a Spark cluster in Azure HDInsight.
Deploy a Spark cluster in a VNet
This template allows you to create an Azure VNet and an HDInsight Spark cluster within the VNet.
Deploy HDInsight on new Data Lake Store and Storage
This template allows you to deploy a new Linux HDInsight cluster with new Data Lake Store and Storage accounts.
Creates an HDInsight cluster running Apache Spark 1.4.1.
Creates an HDInsight Linux cluster running Apache Spark 1.4.1.
Creates an HDInsight cluster running ADAM
Creates an HDInsight Linux cluster running the genomics analysis platform ADAM.
Create HDInsight Linux Cluster and run a script action
This template creates an HDInsight Linux cluster in a virtual network, then runs a custom script action on every node and sets an environment variable.