Create Linux-based clusters in HDInsight using the Azure portal

The Azure portal is a web-based management tool for services and resources hosted in the Microsoft Azure cloud. In this article, you learn how to create Linux-based HDInsight clusters using the portal.



Billing for HDInsight clusters is prorated per minute, whether you are using them or not. Be sure to delete your cluster after you have finished using it. For more information, see How to delete an HDInsight cluster.
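As a rough illustration of what per-minute proration means for cost, the following sketch computes a charge from an hourly rate and a running time. The function name and the rate of 3.00 per hour are hypothetical and chosen only for the arithmetic; actual rates depend on node sizes and region, and are listed on the HDInsight pricing page.

```python
def prorated_cost(hourly_rate: float, minutes_running: int) -> float:
    """Return the charge for a cluster billed per minute at a given hourly rate."""
    if hourly_rate < 0 or minutes_running < 0:
        raise ValueError("rate and minutes must be non-negative")
    # Per-minute proration: each minute costs 1/60 of the hourly rate.
    return round(hourly_rate * minutes_running / 60, 2)


# Hypothetical rate, for illustration only.
print(prorated_cost(3.00, 90))       # 90 minutes of use -> 4.5
print(prorated_cost(3.00, 48 * 60))  # left running idle for 48 hours -> 144.0
```

The second call shows why deleting an idle cluster matters: the charge accrues whether or not any jobs are running.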

  • An Azure subscription. See Get Azure free trial.
  • A modern web browser. The Azure portal uses HTML5 and JavaScript, and may not function correctly in older web browsers.

Create clusters

The Azure portal exposes most of the cluster properties. By using an Azure Resource Manager template instead, you can hide many of these details. For more information, see Create Linux-based Hadoop clusters in HDInsight using Azure Resource Manager templates.


The Secure transfer required feature of a storage account requires that all requests to the account use a secure connection. This feature is only supported by HDInsight cluster version 3.6 or newer. For more information, see Create Hadoop cluster with secure transfer storage accounts in Azure HDInsight.

  1. Sign in to the Azure portal.
  2. Click +, click Intelligence + Analytics, and then click HDInsight.

    Creating a new cluster in the Azure portal

  3. In the HDInsight blade, click Custom (size, settings, apps), click Basics, and then enter the following information.


    • Enter Cluster Name: This name must be globally unique.

    • From the Subscription drop-down, select the Azure subscription that is used for the cluster.

    • Click Cluster type, and then select the type of cluster (Hadoop, Spark, etc.) you want to create. For Operating system, click Linux and then select a version. Use the default version if you don't know what to choose. For more information, see HDInsight cluster versions.

      For Hadoop, Spark, and Interactive Query cluster types, you can also choose to install the Enterprise Security Package. The Enterprise Security Package enables security features such as Azure Active Directory integration and Apache Ranger for the clusters. For more information, see Enterprise Security Package in Azure HDInsight.

      Enable Enterprise Security Package


      HDInsight clusters come in a variety of types, which correspond to the workload or technology that the cluster is tuned for. There is no supported method to create a cluster that combines multiple types, such as Storm and HBase on one cluster.

    • For Cluster login username and Cluster login password, provide the username and password for the admin user.

    • Enter an SSH Username. To use the same password for SSH as the admin password you specified earlier, select the Use same password as cluster login check box. Otherwise, provide either a PASSWORD or a PUBLIC KEY to authenticate the SSH user. Using a public key is the recommended approach. Click Select at the bottom to save the credentials configuration.

      For more information, see Use SSH with HDInsight.

    • For Resource group, specify whether you want to create a new resource group or use an existing one.

    • Specify the data center location in which to create the cluster.

    • Click Next.

  4. For Storage, specify whether you want Azure Storage (WASB) or Data Lake Storage as your default storage. See the table below for details.


    Storage | Description

    Azure Storage Blobs as default storage
    • For Primary Storage type, select Azure Storage. Then, for Selection method, choose My subscriptions to select a storage account that is part of your Azure subscription, or click Access key and provide the information for a storage account outside your Azure subscription.
    • For Default container, either accept the default container name suggested by the portal or specify your own.
    • If you are using WASB as default storage, you can optionally click Additional Storage Accounts to associate additional storage accounts with the cluster. For Azure Storage Keys, click Add a storage key, and then provide a storage account from your Azure subscriptions or from other subscriptions (by providing the storage account access key).
    • If you are using WASB as default storage, you can optionally click Data Lake Store access to specify Azure Data Lake Storage as additional storage. For more information, see Quickstart: Set up clusters in HDInsight.

    Azure Data Lake Storage as default storage
    For Primary storage type, select Azure Data Lake Storage Gen1 or Azure Data Lake Storage Gen2 (Preview), and then refer to the article Quickstart: Set up clusters in HDInsight for instructions.

    External metastores
    Optionally, you can specify a SQL database to save the Hive and Oozie metadata associated with the cluster. For Select a SQL database for Hive, select a SQL database, and then provide the username and password for the database. Repeat these steps for Oozie metadata.

    Keep the following in mind when using an Azure SQL database for metastores:
    • The Azure SQL database used for the metastore must allow connectivity to other Azure services, including Azure HDInsight. On the Azure SQL database dashboard, on the right side, click the server name. This is the server on which the SQL database instance is running. Once you are on the server view, click Configure, and then for Azure Services, click Yes, and then click Save.
    • When creating a metastore, do not use a database name that contains dashes or hyphens, as this can cause the cluster creation process to fail.
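The naming restriction above is easy to check before you start cluster creation. The following is a minimal sketch of such a check; the function name is hypothetical, and it covers only the dash and hyphen rule stated here, not any other naming rules that Azure SQL Database may impose.

```python
def is_safe_metastore_name(name: str) -> bool:
    """Return True if the database name is non-empty and contains no dashes or hyphens."""
    # Hyphen-minus plus the en dash and em dash code points,
    # which can sneak in through copy and paste.
    forbidden = {"-", "\u2013", "\u2014"}
    return bool(name) and not any(ch in forbidden for ch in name)


print(is_safe_metastore_name("hivemetastore"))   # True
print(is_safe_metastore_name("hive-metastore"))  # False
```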

    Click Next.


    Using an additional storage account in a different location than the HDInsight cluster is not supported.
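Once the cluster exists, data in the default Azure Storage container is typically addressed with a WASB URI. The sketch below builds such a URI; the URI layout is the standard Azure Storage convention rather than something stated in this article, and the function name and the account and container names are hypothetical placeholders. It defaults to the wasbs scheme, which is the variant to use when Secure transfer required is enabled on the storage account.

```python
def default_storage_uri(container: str, account: str, path: str = "",
                        secure: bool = True) -> str:
    """Build a WASB URI for a blob path in an Azure Storage container."""
    # wasbs uses TLS; plain wasb is unencrypted.
    scheme = "wasbs" if secure else "wasb"
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


# Placeholder names, for illustration only.
print(default_storage_uri("mycontainer", "myaccount", "example/data"))
# wasbs://mycontainer@myaccount.blob.core.windows.net/example/data
```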

  5. Optionally, click Applications to install applications that work with HDInsight clusters. These applications can be developed by Microsoft, by independent software vendors (ISVs), or by you. For more information, see Install HDInsight applications.

  6. Click Cluster size to display information about the nodes that are used for this cluster. Set the number of worker nodes that you need for the cluster. The estimated cost of running the cluster is also shown.

    Node pricing tiers


    If you plan on more than 32 worker nodes, either at cluster creation or by scaling the cluster after creation, then you must select a head node size with at least 8 cores and 14 GB RAM.
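The sizing rule above can be expressed as a small check to run while planning a cluster. This is a sketch only: the NodeSize type and function name are hypothetical, and the worker count you pass in should be the largest number you expect to reach, including any later scaling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSize:
    cores: int
    ram_gb: float


def head_node_is_sufficient(head: NodeSize, max_worker_nodes: int) -> bool:
    """Apply the rule: more than 32 workers needs a head node with >= 8 cores and >= 14 GB RAM."""
    if max_worker_nodes <= 32:
        # The rule places no constraint at or below 32 worker nodes.
        return True
    return head.cores >= 8 and head.ram_gb >= 14


print(head_node_is_sufficient(NodeSize(cores=4, ram_gb=28), 40))  # False: too few cores
print(head_node_is_sufficient(NodeSize(cores=8, ram_gb=14), 40))  # True
```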

    For more information on node sizes and associated costs, see HDInsight pricing.

    Click Next to save the node pricing configuration.

  7. Click Advanced settings to configure other optional settings, such as using Script Actions to install custom components on the cluster or joining a Virtual Network. See the table below for details.


    Option | Description

    Script Actions
    Use this option if you want to use a custom script to customize a cluster, as the cluster is being created. For more information about script actions, see Customize HDInsight clusters using Script Action.

    Virtual Network
    Select an Azure virtual network and the subnet if you want to place the cluster into a virtual network. For information on using HDInsight with a Virtual Network, including specific configuration requirements for the Virtual Network, see Extend HDInsight capabilities by using an Azure Virtual Network.

    Click Next.

  8. For Summary, verify the information you entered earlier and then click Create.



    It takes some time for the cluster to be created, usually around 15 minutes. Use the tile on the Startboard, or the Notifications entry on the left of the page to check on the provisioning process.

  9. Once the creation process completes, click the tile for the cluster from the Startboard. The cluster window provides the following information.

    Cluster interface

    The icons at the top of the cluster window provide the following:

    • Overview: Provides the essential information about the cluster, such as the name, the resource group it belongs to, the location, the operating system, and the URL for the cluster dashboard.
    • Dashboard: Directs you to the Ambari portal associated with the cluster.
    • Secure Shell: Provides the information needed to access the cluster using SSH.
    • Scale cluster: Lets you increase the number of worker nodes associated with the cluster.
    • Delete: Deletes the HDInsight cluster.
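The dashboard URL and SSH details shown in the cluster window usually follow a predictable pattern based on the cluster name. The sketch below derives them, assuming the standard public Azure cloud naming convention; that convention is not stated in this article, the function name and example values are hypothetical, and the Overview and Secure Shell entries in the portal remain the authoritative source for your cluster.

```python
from typing import Dict


def cluster_endpoints(cluster_name: str, ssh_user: str) -> Dict[str, str]:
    """Derive the likely dashboard URL and SSH command for a cluster (assumed convention)."""
    return {
        # Ambari dashboard, signed in to with the cluster login (admin) credentials.
        "dashboard": f"https://{cluster_name}.azurehdinsight.net",
        # SSH endpoint, signed in to with the SSH user configured at creation.
        "ssh": f"ssh {ssh_user}@{cluster_name}-ssh.azurehdinsight.net",
    }


# Placeholder values, for illustration only.
for name, value in cluster_endpoints("mycluster", "sshuser").items():
    print(f"{name}: {value}")
```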

Customize clusters

Delete the cluster


Billing for HDInsight clusters is prorated per minute, whether you are using them or not. Be sure to delete your cluster after you have finished using it. For more information, see How to delete an HDInsight cluster.


If you run into issues with creating HDInsight clusters, see access control requirements.

Next steps

Now that you have successfully created an HDInsight cluster, use the following to learn how to work with your cluster:

Hadoop clusters

HBase clusters

Storm clusters

Spark clusters