Tutorial: Create Azure HDInsight clusters with Azure Automation

Azure Automation allows you to create scripts that run in the cloud and manage Azure resources on demand or on a schedule. This article shows how to write PowerShell runbooks that create and delete Azure HDInsight clusters.

In this tutorial, you learn how to:

  • Install modules necessary for interacting with HDInsight.
  • Create and store credentials needed during cluster creation.
  • Create a new Azure Automation runbook to create an HDInsight cluster.

If you don’t have an Azure subscription, create a free account before you begin.

Prerequisites

Install HDInsight modules

  1. Sign in to the Azure portal.

  2. Select your Azure Automation account.

  3. Select Modules gallery under Shared Resources.

  4. Type AzureRM.Profile in the search box and press Enter. Select the matching search result.

  5. On the AzureRM.Profile screen, select Import. Check the box to update Azure modules, and then select OK.

  6. Return to the modules gallery by selecting Modules gallery under Shared Resources.

  7. Type HDInsight in the search box and press Enter. Select AzureRM.HDInsight.

  8. On the AzureRM.HDInsight panel, select Import and OK.
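
Once both imports finish, a runbook can confirm that the modules are present before it calls any HDInsight cmdlet. The following is a minimal sketch, not part of the original tutorial: `Get-MissingModule` is a hypothetical helper name, and it relies only on the built-in `Get-Module -ListAvailable` cmdlet.

```powershell
# Hypothetical helper: returns each name passed in -Name that is NOT installed
# in the environment where the script runs.
function Get-MissingModule {
    param(
        [Parameter(Mandatory = $true)]
        [string[]] $Name
    )
    foreach ($moduleName in $Name) {
        if (-not (Get-Module -ListAvailable -Name $moduleName)) {
            $moduleName
        }
    }
}

# Modules imported in the steps above.
$requiredModules = @('AzureRM.Profile', 'AzureRM.HDInsight')

# Example use at the top of a runbook (uncomment to stop early when a module is missing):
# $missing = @(Get-MissingModule -Name $requiredModules)
# if ($missing.Count -gt 0) { throw "Import these modules first: $($missing -join ', ')" }
```

Failing early with a clear message is usually easier to diagnose than a later "command not recognized" error from the first missing cmdlet.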

Create credentials

  1. Under Shared Resources, select Credentials.

  2. Select Add a credential.

  3. Enter the required information on the New Credential panel. This credential stores the cluster password, which you use to sign in to Ambari.

    Property          Value
    Name              cluster-password
    User name         admin
    Password          SECURE_PASSWORD
    Confirm password  SECURE_PASSWORD

  4. Select Create.

  5. Repeat the process to add a second credential named ssh-password, with the user name sshuser and a password of your choice. Select Create. This credential stores the SSH password for your cluster.
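
The same credential assets can also be created from a script instead of the portal. The sketch below assumes a PowerShell session that is already signed in to Azure and has the AzureRM.Automation module installed. `New-ClusterCredentialAsset` is a hypothetical wrapper name; the cmdlet it calls, `New-AzureRmAutomationCredential`, is part of AzureRM.Automation. The resource group and account names in the commented example are placeholders.

```powershell
# Hypothetical wrapper around New-AzureRmAutomationCredential.
# Nothing runs until the function is called.
function New-ClusterCredentialAsset {
    param(
        [Parameter(Mandatory = $true)] [string] $ResourceGroupName,
        [Parameter(Mandatory = $true)] [string] $AutomationAccountName,
        [Parameter(Mandatory = $true)] [string] $AssetName,
        [Parameter(Mandatory = $true)] [pscredential] $Credential
    )
    $assetParams = @{
        ResourceGroupName     = $ResourceGroupName
        AutomationAccountName = $AutomationAccountName
        Name                  = $AssetName      # for example 'cluster-password'
        Value                 = $Credential     # user name and password pair
    }
    New-AzureRmAutomationCredential @assetParams
}

# Example (placeholder names; Get-Credential prompts for the password):
# $adminCred = Get-Credential -UserName 'admin' -Message 'Cluster admin password'
# New-ClusterCredentialAsset -ResourceGroupName '<resource-group>' `
#     -AutomationAccountName '<automation-account>' `
#     -AssetName 'cluster-password' -Credential $adminCred
```

Prompting with `Get-Credential` keeps the password out of the script file and out of the command history.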

Create a runbook to create a cluster

  1. Select Runbooks under Process Automation.

  2. Select Create a runbook.

  3. On the Create a runbook panel, enter a name for the runbook, such as hdinsight-cluster-create. Select PowerShell from the Runbook type dropdown.

  4. Select Create.

  5. Enter the following code on the Edit PowerShell Runbook screen and select Publish:

    Param
    (
      [Parameter (Mandatory= $true)]
      [String] $subscriptionID,
    
      [Parameter (Mandatory= $true)]
      [String] $resourceGroup,
    
      [Parameter (Mandatory= $true)]
      [String] $storageAccount,
    
      [Parameter (Mandatory= $true)]
      [String] $containerName,
    
      [Parameter (Mandatory= $true)]
      [String] $clusterName
    )
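    ### Authenticate to Azure
    $Conn = Get-AutomationConnection -Name 'AzureRunAsConnection'
    Add-AzureRMAccount -ServicePrincipal -Tenant $Conn.TenantID -ApplicationId $Conn.ApplicationID -CertificateThumbprint $Conn.CertificateThumbprint

    # Target the subscription passed in, so the cmdlets below run against it
    Set-AzureRmContext -SubscriptionId $subscriptionID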
    
    # Set cluster variables
    $storageAccountKey = (Get-AzureRmStorageAccountKey -Name $storageAccount -ResourceGroupName $resourceGroup)[0].Value

    # Setting cluster credentials

    # Automation credential for the cluster admin
    $clusterCreds = Get-AutomationPSCredential -Name 'cluster-password'

    # Automation credential for the user who connects to the cluster over SSH
    $sshCreds = Get-AutomationPSCredential -Name 'ssh-password'

    $clusterType = "Hadoop" # Use any supported cluster type (Hadoop, HBase, etc.)
    $clusterOS = "Linux"
    $clusterWorkerNodes = 3
    $clusterNodeSize = "Standard_D3_v2"
    $location = Get-AzureRmStorageAccount -StorageAccountName $storageAccount -ResourceGroupName $resourceGroup | %{$_.Location}

    ### Provision HDInsight cluster
    $clusterParams = @{
        ClusterName               = $clusterName
        ResourceGroupName         = $resourceGroup
        Location                  = $location
        DefaultStorageAccountName = "$storageAccount.blob.core.windows.net"
        DefaultStorageAccountKey  = $storageAccountKey
        DefaultStorageContainer   = $containerName
        ClusterType               = $clusterType
        OSType                    = $clusterOS
        Version                   = "3.6"
        HttpCredential            = $clusterCreds
        SshCredential             = $sshCreds
        ClusterSizeInNodes        = $clusterWorkerNodes
        HeadNodeSize              = $clusterNodeSize
        WorkerNodeSize            = $clusterNodeSize
    }
    New-AzureRmHDInsightCluster @clusterParams
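
After the provisioning call returns, the runbook can confirm that the cluster actually reached a usable state. The following is a minimal sketch rather than part of the original runbook. `Test-ClusterRunning` is a hypothetical helper, and it assumes the object returned by `Get-AzureRmHDInsightCluster` exposes a `ClusterState` property whose value is `Running` for a healthy cluster.

```powershell
# Hypothetical helper: takes a cluster object (as returned by
# Get-AzureRmHDInsightCluster) and reports whether its state is 'Running'.
# Assumption: the object has a ClusterState property.
function Test-ClusterRunning {
    param(
        [Parameter(Mandatory = $true)]
        $Cluster
    )
    return ($Cluster.ClusterState -eq 'Running')
}

# In the runbook, after New-AzureRmHDInsightCluster returns:
# $cluster = Get-AzureRmHDInsightCluster -ResourceGroupName $resourceGroup -ClusterName $clusterName
# if (-not (Test-ClusterRunning -Cluster $cluster)) {
#     throw "Cluster $clusterName is in state '$($cluster.ClusterState)'."
# }
```

Throwing here marks the Automation job as failed, which makes a bad deployment visible in the job history instead of leaving it to be discovered later.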
    

Create a runbook to delete a cluster

  1. Select Runbooks under Process Automation.

  2. Select Create a runbook.

  3. On the Create a runbook panel, enter a name for the runbook, such as hdinsight-cluster-delete. Select PowerShell from the Runbook type dropdown.

  4. Select Create.

  5. Enter the following code on the Edit PowerShell Runbook screen and select Publish:

    Param
    (
      [Parameter (Mandatory= $true)]
      [String] $clusterName
    )
    
    ### Authenticate to Azure 
    $Conn = Get-AutomationConnection -Name 'AzureRunAsConnection'
    Add-AzureRMAccount -ServicePrincipal -Tenant $Conn.TenantID -ApplicationId $Conn.ApplicationID -CertificateThumbprint $Conn.CertificateThumbprint
    
    Remove-AzureRmHDInsightCluster -ClusterName $clusterName
    

Execute Runbooks

Create a cluster

  1. View the list of runbooks for your Automation account by selecting Runbooks under Process Automation.

  2. Select hdinsight-cluster-create, or the name that you used when creating your cluster creation runbook.

  3. Select Start to execute the runbook immediately. You can also schedule runbooks to run periodically. See Scheduling a runbook in Azure Automation.

  4. Enter the required parameters for the script and select OK. This creates a new HDInsight cluster with the name that you specified in the CLUSTERNAME parameter.
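
The steps above can also be performed from a PowerShell session instead of the portal. This sketch assumes you are signed in to Azure with the AzureRM.Automation module installed and that the runbook was published as hdinsight-cluster-create. `Start-ClusterCreateRunbook` is a hypothetical wrapper name, and every value in angle brackets is a placeholder. The hashtable keys must match the runbook's parameter names.

```powershell
# Values for the runbook's five mandatory parameters (placeholders).
$runbookParameters = @{
    subscriptionID = '<subscription-id>'
    resourceGroup  = '<resource-group>'
    storageAccount = '<storage-account>'
    containerName  = '<container-name>'
    clusterName    = '<cluster-name>'
}

# Hypothetical wrapper around Start-AzureRmAutomationRunbook.
# Nothing runs until the function is called.
function Start-ClusterCreateRunbook {
    param(
        [Parameter(Mandatory = $true)] [string] $ResourceGroupName,
        [Parameter(Mandatory = $true)] [string] $AutomationAccountName,
        [Parameter(Mandatory = $true)] [hashtable] $RunbookParameters
    )
    $startParams = @{
        ResourceGroupName     = $ResourceGroupName
        AutomationAccountName = $AutomationAccountName
        Name                  = 'hdinsight-cluster-create'
        Parameters            = $RunbookParameters
    }
    Start-AzureRmAutomationRunbook @startParams
}

# Example:
# Start-ClusterCreateRunbook -ResourceGroupName '<automation-resource-group>' `
#     -AutomationAccountName '<automation-account>' -RunbookParameters $runbookParameters
```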

Delete a cluster

Delete the cluster by selecting the hdinsight-cluster-delete runbook that you created. Select Start, enter the CLUSTERNAME parameter, and select OK.
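
Because a cluster accrues charges until it is actually removed, it can be useful to start the delete runbook from a script and wait for the job to finish. The following is a sketch under stated assumptions: `Test-JobFinished` and `Invoke-ClusterDeleteRunbook` are hypothetical helper names, the set of statuses treated as final (`Completed`, `Failed`, `Stopped`, `Suspended`) is an assumption about Automation job states, and the polling interval and limit are arbitrary defaults.

```powershell
# Hypothetical helper: true when a job status needs no further polling.
# Assumption: these four statuses mean the job will not progress on its own.
function Test-JobFinished {
    param([string] $Status)
    return ($Status -in @('Completed', 'Failed', 'Stopped', 'Suspended'))
}

# Hypothetical wrapper: starts the delete runbook, polls the job, and
# returns the last status seen. Nothing runs until the function is called.
function Invoke-ClusterDeleteRunbook {
    param(
        [Parameter(Mandatory = $true)] [string] $ResourceGroupName,
        [Parameter(Mandatory = $true)] [string] $AutomationAccountName,
        [Parameter(Mandatory = $true)] [string] $ClusterName,
        [int] $PollSeconds = 30,
        [int] $MaxPolls = 60
    )
    $account = @{
        ResourceGroupName     = $ResourceGroupName
        AutomationAccountName = $AutomationAccountName
    }
    $job = Start-AzureRmAutomationRunbook @account -Name 'hdinsight-cluster-delete' `
        -Parameters @{ clusterName = $ClusterName }

    for ($i = 0; ($i -lt $MaxPolls) -and (-not (Test-JobFinished -Status $job.Status)); $i++) {
        Start-Sleep -Seconds $PollSeconds
        $job = Get-AzureRmAutomationJob @account -Id $job.JobId
    }
    return $job.Status
}
```

A returned status other than `Completed` is a signal to inspect the job output in the portal before assuming the cluster is gone.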

Clean up resources

When you no longer need it, delete the Azure Automation account to avoid unintended charges. In the Azure portal, select the resource group that contains the Automation account, select the account, and then select Delete.

Next steps