New-AzureHDInsightMapReduceJobDefinition

Defines a new MapReduce job.

Syntax

New-AzureHDInsightMapReduceJobDefinition
   [-Arguments <String[]>]
   -ClassName <String>
   [-Defines <Hashtable>]
   [-Files <String[]>]
   -JarFile <String>
   [-JobName <String>]
   [-LibJars <String[]>]
   [-Profile <AzureSMProfile>]
   [-StatusFolder <String>]
   [<CommonParameters>]

Description

This version of Azure PowerShell HDInsight is deprecated. These cmdlets will be removed by January 1, 2017. Please use the newer version of Azure PowerShell HDInsight.

For information about how to use the new HDInsight to create a cluster, see Create Linux-based clusters in HDInsight using Azure PowerShell (https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-create-linux-clusters-azure-powershell/). For information about how to submit jobs by using Azure PowerShell and other approaches, see Submit Hadoop jobs in HDInsight (https://azure.microsoft.com/en-us/documentation/articles/hdinsight-submit-hadoop-jobs-programmatically/). For reference information about Azure PowerShell HDInsight, see Azure HDInsight Cmdlets (https://msdn.microsoft.com/en-us/library/mt438705.aspx).

The New-AzureHDInsightMapReduceJobDefinition cmdlet defines a new MapReduce job to run on an Azure HDInsight cluster.

Examples

Example 1: Define a MapReduce job, run the job, and get the output

PS C:\> $SubId = (Get-AzureSubscription -Current).SubscriptionId
PS C:\> $ClusterName = "MyCluster"
PS C:\> $WordCountJob = New-AzureHDInsightMapReduceJobDefinition -JarFile "/Example/Apps/Hadoop-examples.jar" -ClassName "WordCount" -Defines @{ "mapred.map.tasks" = "3" } -Arguments "/Example/Data/Gutenberg/Davinci.txt", "/Example/Output/WordCount"
PS C:\> $WordCountJob | Start-AzureHDInsightJob -Cluster $ClusterName |
    Wait-AzureHDInsightJob -Subscription $SubId -WaitTimeoutInSeconds 3600 |
    Get-AzureHDInsightJobOutput -Cluster $ClusterName -Subscription $SubId -StandardError

The first command gets the ID of the current subscription, and then stores it in the $SubId variable.

The second command assigns the name MyCluster to the $ClusterName variable.

The third command uses the New-AzureHDInsightMapReduceJobDefinition cmdlet to create a MapReduce job definition, and then stores it in the $WordCountJob variable.

The fourth command passes the job definition through a pipeline of these cmdlets:

- Start-AzureHDInsightJob to start the job on $ClusterName.
- Wait-AzureHDInsightJob to wait for the job to finish and to display the progress toward completion.
- Get-AzureHDInsightJobOutput to get the standard error output of the job.

Required Parameters

-ClassName

Specifies the name of the job class in the Java Archive (JAR) file.

Type: String
Aliases: Class
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False

-JarFile

Specifies the fully qualified name of a JAR file that contains the code and dependencies of a MapReduce job.

Type: String
Aliases: Jar
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False

Optional Parameters

-Arguments

Specifies an array of arguments for a Hadoop job. The arguments are passed as command-line arguments to each task.

Type: String[]
Aliases: Args
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False

-Defines

Specifies Hadoop configuration values to set when the job runs.

Type: Hashtable
Aliases: Params
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False
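
As a minimal sketch, the values for -Arguments and -Defines can be prepared as ordinary PowerShell variables and passed to the cmdlet by splatting. The variable names and the "mapred.reduce.tasks" entry are illustrative assumptions, not part of this reference; the other values are taken from Example 1. The cmdlet call is guarded so that the snippet does nothing when the deprecated module is not installed.

```powershell
# Hadoop configuration values for -Defines.
# "mapred.map.tasks" comes from Example 1; "mapred.reduce.tasks" is an illustrative assumption.
$Defines = @{
    "mapred.map.tasks"    = "3"
    "mapred.reduce.tasks" = "1"
}

# Command-line arguments for -Arguments: input path, then output path, as in Example 1.
[string[]]$JobArguments = @(
    "/Example/Data/Gutenberg/Davinci.txt",
    "/Example/Output/WordCount"
)

# Collect the parameters in one hashtable so that they can be splatted.
$JobParams = @{
    JarFile   = "/Example/Apps/Hadoop-examples.jar"
    ClassName = "WordCount"
    Defines   = $Defines
    Arguments = $JobArguments
}

# Call the cmdlet only if the deprecated module is actually available.
if (Get-Command New-AzureHDInsightMapReduceJobDefinition -ErrorAction SilentlyContinue) {
    $WordCountJob = New-AzureHDInsightMapReduceJobDefinition @JobParams
}
```

Keeping the parameters in a hashtable makes it easy to add or change a configuration value without editing one long command line.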

-Files

Specifies an array of Windows Azure Storage Blob (WASB) files that are required for a job.

Type: String[]
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False
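
The following sketch shows one way to build the value for -Files, assuming that the files are referenced by full wasb:// URIs in the public Azure cloud (the blob.core.windows.net endpoint). The helper Get-WasbUri, the container and storage account names, and the file path are all hypothetical and exist only for illustration.

```powershell
# Hypothetical helper: builds a wasb:// URI for a blob in an Azure Storage container.
function Get-WasbUri {
    param(
        [Parameter(Mandatory = $true)] [string]$Container,
        [Parameter(Mandatory = $true)] [string]$StorageAccount,
        [Parameter(Mandatory = $true)] [string]$BlobPath
    )
    # Strip leading slashes so that the URI has exactly one separator before the path.
    $Path = $BlobPath.TrimStart('/')
    'wasb://{0}@{1}.blob.core.windows.net/{2}' -f $Container, $StorageAccount, $Path
}

# Hypothetical container, account, and file names.
$Files = @(
    Get-WasbUri -Container "mycontainer" -StorageAccount "mystorageaccount" -BlobPath "/Example/Data/lookup.txt"
)
```

The resulting $Files array could then be passed as the value of the -Files parameter.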

-JobName

Specifies the name of a MapReduce job. If you do not specify this parameter, the value of the ClassName parameter is used.

Type: String
Aliases: Name
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False

-LibJars

Specifies an array of references to the additional library JAR files (LibJars) that the job requires.

Type: String[]
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False

-Profile

Specifies the Azure profile from which this cmdlet reads. If you do not specify a profile, this cmdlet reads from the local default profile.

Type: AzureSMProfile
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False

-StatusFolder

Specifies the location of the folder that contains standard outputs and error outputs for a job, including its exit code and task logs.

Type: String
Position: Named
Default value: None
Accept pipeline input: False
Accept wildcard characters: False
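
One possible convention, which is an assumption and not something this reference prescribes, is to give each run its own status folder so that the logs of different runs are easy to tell apart. The sketch below derives a folder name from the current UTC time; the base path /Example/Status and the variable names are hypothetical.

```powershell
# Hypothetical base folder; the UTC timestamp makes each run's folder distinct.
$RunStamp     = (Get-Date).ToUniversalTime().ToString("yyyyMMdd-HHmmss")
$StatusFolder = "/Example/Status/WordCount-$RunStamp"

# Job parameters from Example 1, plus the generated status folder.
$StatusJobParams = @{
    JarFile      = "/Example/Apps/Hadoop-examples.jar"
    ClassName    = "WordCount"
    Arguments    = "/Example/Data/Gutenberg/Davinci.txt", "/Example/Output/WordCount"
    StatusFolder = $StatusFolder
}

# Call the cmdlet only if the deprecated module is actually available.
if (Get-Command New-AzureHDInsightMapReduceJobDefinition -ErrorAction SilentlyContinue) {
    $WordCountJob = New-AzureHDInsightMapReduceJobDefinition @StatusJobParams
}
```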