New-AzureHDInsightStreamingMapReduceJobDefinition

Defines a new streaming MapReduce job definition.

Syntax

New-AzureHDInsightStreamingMapReduceJobDefinition
   [-Arguments <String[]>]
   [-CmdEnv <String[]>]
   [-Combiner <String>]
   [-Defines <Hashtable>]
   [-Files <String[]>]
   [-InputPath <String>]
   [-JobName <String>]
   [-Mapper <String>]
   [-OutputPath <String>]
   [-Profile <AzureSMProfile>]
   [-Reducer <String>]
   [-StatusFolder <String>]
   [<CommonParameters>]

Description

This version of Azure PowerShell HDInsight is deprecated. These cmdlets will be removed by January 1, 2017. Please use the newer version of Azure PowerShell HDInsight.

For information about how to use the new HDInsight to create a cluster, see Create Linux-based clusters in HDInsight using Azure PowerShell (https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-create-linux-clusters-azure-powershell/). For information about how to submit jobs by using Azure PowerShell and other approaches, see Submit Hadoop jobs in HDInsight (https://azure.microsoft.com/en-us/documentation/articles/hdinsight-submit-hadoop-jobs-programmatically/). For reference information about Azure PowerShell HDInsight, see Azure HDInsight Cmdlets (https://msdn.microsoft.com/en-us/library/mt438705.aspx).

The New-AzureHDInsightStreamingMapReduceJobDefinition cmdlet defines a new job definition object that represents the parameters of a Hadoop streaming job.

Examples

Example 1: Create a streaming MapReduce job definition

PS C:\>$StreamingWordCount = New-AzureHDInsightStreamingMapReduceJobDefinition -Files "/Example/Apps/WordCount.exe", "/Example/Apps/Cat.exe" -InputPath "/Example/Data/Gutenberg/Davinci.txt" -OutputPath "/Example/Data/StreamingOutput/WordCount.txt" -Mapper "Cat.exe" -Reducer "WordCount.exe"

This command creates the specified streaming MapReduce job definition, and then stores it in the $StreamingWordCount variable.
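A job definition by itself does not run anything; it is typically submitted to a cluster and awaited using the Start-AzureHDInsightJob and Wait-AzureHDInsightJob cmdlets from the same module. A minimal sketch, assuming a cluster named "MyCluster" exists in the current subscription:

PS C:\>$StreamingWordCount = New-AzureHDInsightStreamingMapReduceJobDefinition -Files "/Example/Apps/WordCount.exe", "/Example/Apps/Cat.exe" -InputPath "/Example/Data/Gutenberg/Davinci.txt" -OutputPath "/Example/Data/StreamingOutput/WordCount.txt" -Mapper "Cat.exe" -Reducer "WordCount.exe"
PS C:\>$WordCountJob = Start-AzureHDInsightJob -Cluster "MyCluster" -JobDefinition $StreamingWordCount
PS C:\>Wait-AzureHDInsightJob -Job $WordCountJob -WaitTimeoutInSeconds 3600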

Optional Parameters

-Arguments

Specifies an array of arguments for a Hadoop job. The arguments are passed as command-line arguments to each task.

Type:String[]
Aliases:Args
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
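The argument array is appended to the streaming command line of each task. A sketch passing two hypothetical flags (the flag names are illustrative, not part of any real executable):

PS C:\>$JobDef = New-AzureHDInsightStreamingMapReduceJobDefinition -Mapper "Cat.exe" -Reducer "WordCount.exe" -InputPath "/Example/Data/Input" -OutputPath "/Example/Data/Output" -Arguments "-verbose", "-caseSensitive"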
-CmdEnv

Specifies an array of command-line environment variables to set when a job runs on data nodes.

Type:String[]
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
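Each entry takes the form NAME=VALUE. A sketch setting a hypothetical environment variable that the mapper or reducer process could read at run time:

PS C:\>$JobDef = New-AzureHDInsightStreamingMapReduceJobDefinition -Mapper "Cat.exe" -Reducer "WordCount.exe" -InputPath "/Example/Data/Input" -OutputPath "/Example/Data/Output" -CmdEnv "WORDCOUNT_MODE=strict"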
-Combiner

Specifies a Combiner file name.

Type:String
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
-Defines

Specifies Hadoop configuration values to set when the job runs.

Type:Hashtable
Aliases:Params
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
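These key-value pairs are passed to Hadoop as -D configuration properties. A sketch overriding the number of reduce tasks; mapred.reduce.tasks is a standard Hadoop property, but the exact property name honored depends on the Hadoop version on the cluster:

PS C:\>$JobDef = New-AzureHDInsightStreamingMapReduceJobDefinition -Mapper "Cat.exe" -Reducer "WordCount.exe" -InputPath "/Example/Data/Input" -OutputPath "/Example/Data/Output" -Defines @{ "mapred.reduce.tasks" = "4" }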
-Files

Specifies an array of files that are required for a job.

Type:String[]
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
-InputPath

Specifies the WASB path to the input files.

Type:String
Aliases:Input
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
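The path may be given relative to the cluster's default storage container, as in Example 1, or as a fully qualified WASB URI. A sketch of the fully qualified form, with placeholder container and storage account names:

PS C:\>$JobDef = New-AzureHDInsightStreamingMapReduceJobDefinition -Mapper "Cat.exe" -Reducer "WordCount.exe" -InputPath "wasb://mycontainer@mystorageaccount.blob.core.windows.net/Example/Data/Input" -OutputPath "/Example/Data/Output"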
-JobName

Specifies a friendly name for the new MapReduce job definition.

Type:String
Aliases:Name
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
-Mapper

Specifies a Mapper file name.

Type:String
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
-OutputPath

Specifies the WASB path for the job output.

Type:String
Aliases:Output
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
-Profile

Specifies the Azure profile from which this cmdlet reads. If you do not specify a profile, this cmdlet reads from the local default profile.

Type:AzureSMProfile
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
-Reducer

Specifies a Reducer file name.

Type:String
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
-StatusFolder

Specifies the folder that contains the standard outputs and error outputs for the job, including its exit code and task logs.

Type:String
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False
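After the job completes, the contents of the status folder can be retrieved with the Get-AzureHDInsightJobOutput cmdlet from the same module. A sketch, assuming a cluster named "MyCluster":

PS C:\>$JobDef = New-AzureHDInsightStreamingMapReduceJobDefinition -Mapper "Cat.exe" -Reducer "WordCount.exe" -InputPath "/Example/Data/Input" -OutputPath "/Example/Data/Output" -StatusFolder "/Example/Status"
PS C:\>$Job = Start-AzureHDInsightJob -Cluster "MyCluster" -JobDefinition $JobDef
PS C:\>Wait-AzureHDInsightJob -Job $Job -WaitTimeoutInSeconds 3600
PS C:\>Get-AzureHDInsightJobOutput -Cluster "MyCluster" -JobId $Job.JobId -StandardError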