Invoke-AzureRmHDInsightHiveJob

Submits a Hive query to an HDInsight cluster and retrieves query results in one operation.

Warning

The AzureRM PowerShell module has been officially deprecated as of February 29, 2024. Users are advised to migrate from AzureRM to the Az PowerShell module to ensure continued support and updates.

Although the AzureRM module may still function, it's no longer maintained or supported, placing any continued use at the user's discretion and risk. Please refer to our migration resources for guidance on transitioning to the Az module.

Syntax

Invoke-AzureRmHDInsightHiveJob
      [-Arguments <String[]>]
      [-Files <String[]>]
      [-StatusFolder <String>]
      [-Defines <Hashtable>]
      [-File <String>]
      [-JobName <String>]
      [-Query <String>]
      [-RunAsFileJob]
      [-DefaultContainer <String>]
      [-DefaultStorageAccountName <String>]
      [-DefaultStorageAccountKey <String>]
      [-DefaultProfile <IAzureContextContainer>]
      [<CommonParameters>]

Description

The Invoke-AzureRmHDInsightHiveJob cmdlet submits a Hive query to an Azure HDInsight cluster and retrieves query results in one operation. Use the Use-AzureRmHDInsightCluster cmdlet before calling Invoke-AzureRmHDInsightHiveJob to specify which cluster will be used for the query.

Examples

Example 1: Submit a Hive query to an Azure HDInsight cluster

PS C:\># Primary storage account info
PS C:\> $storageAccountResourceGroupName = "Group"
PS C:\> $storageAccountName = "yourstorageacct001"
PS C:\> $storageAccountKey = (Get-AzureRmStorageAccountKey -ResourceGroupName $storageAccountResourceGroupName -Name $storageAccountName)[0].value


PS C:\> $storageContainer = "container001"

# Cluster info
PS C:\> $clusterName = "your-hadoop-001"
PS C:\> $clusterCreds = Get-Credential

# Hive job details
PS C:\> $statusFolder = "tempStatusFolder/"
PS C:\> $query = "SHOW TABLES"

PS C:\> Use-AzureRmHDInsightCluster `
            -ClusterCredential $clusterCreds `
            -ClusterName $clusterName

PS C:\> Invoke-AzureRmHDInsightHiveJob -StatusFolder $statusFolder `
            -Query $query `
            -DefaultContainer $storageAccountContainer `
            -DefaultStorageAccountName "$storageAccountName.blob.core.windows.net" `
            -DefaultStorageAccountKey $storageAccountKey

This command submits the query SHOW TABLES to the cluster named your-hadoop-001.

Parameters

-Arguments

Specifies an array of arguments for the job. The arguments are passed as command-line arguments to each task.

Type:String[]
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-DefaultContainer

Specifies the name of the default container in the default Azure Storage account that an HDInsight cluster uses.

Type:String
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-DefaultProfile

The credentials, account, tenant, and subscription used for communication with azure

Type:IAzureContextContainer
Aliases:AzureRmContext, AzureCredential
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-DefaultStorageAccountKey

Specifies the account key for the default storage account that the HDInsight cluster uses.

Type:String
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-DefaultStorageAccountName

Specifies the name of the default storage account that the HDInsight cluster uses.

Type:String
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-Defines

Specifies Hadoop configuration values to set when a job runs.

Type:Hashtable
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-File

Specifies the path to a file in Azure Storage that contains the query to run. You can use this parameter instead of the Query parameter.

Type:String
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-Files

Specifies a collection of files that are required for a Hive job.

Type:String[]
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-JobName

Specifies the name of a Hive job. If you do not specify this parameter, this cmdlet uses the default value: "Hive: <first 100 characters of Query>".

Type:String
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-Query

Specifies the Hive query.

Type:String
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-RunAsFileJob

Indicates that this cmdlet creates a file in the default Azure storage account in which to store a query. This cmdlet submits the job that references this file as a script to run. You can use this functionality to handle special characters such as percent sign (%) that would fail on a job submission through Templeton, because Templeton interprets a query with a percent sign as a URL parameter.

Type:SwitchParameter
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

-StatusFolder

Specifies the location of the folder that contains standard outputs and error outputs for a job.

Type:String
Position:Named
Default value:None
Required:False
Accept pipeline input:False
Accept wildcard characters:False

Inputs

None

Outputs

String