New-AzureRmHDInsightStreamingMapReduceJobDefinition

Creates a Streaming MapReduce job object.

Syntax

New-AzureRmHDInsightStreamingMapReduceJobDefinition
   [-Arguments <String[]>]
   [-CommandEnvironment <Hashtable>]
   [-Defines <Hashtable>]
   [-File <String>]
   [-Files <String[]>]
   -InputPath <String>
   [-Mapper <String>]
   [-OutputPath <String>]
   [-Reducer <String>]
   [-StatusFolder <String>]
   [<CommonParameters>]

Description

The New-AzureRmHDInsightStreamingMapReduceJobDefinition cmdlet defines a Streaming MapReduce job object for use with an Azure HDInsight cluster.

Examples

Example 1: Create a Streaming MapReduce job definition

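PS C:\># Cluster info
PS C:\>$clusterName = "your-hadoop-001"
PS C:\>$clusterCreds = Get-Credential

PS C:\># Streaming MapReduce job details (placeholder paths and file names; replace with your own)
PS C:\>$statusFolder = "tempStatusFolder/"
PS C:\>$inputPath = "/example/data/input.txt"
PS C:\>$outputPath = "/example/data/StreamingOutput"
PS C:\>$mapper = "mapper.exe"
PS C:\>$reducer = "reducer.exe"

PS C:\>New-AzureRmHDInsightStreamingMapReduceJobDefinition -StatusFolder $statusFolder `
            -InputPath $inputPath `
            -OutputPath $outputPath `
            -Mapper $mapper `
            -Reducer $reducer `
        | Start-AzureRmHDInsightJob `
            -ClusterName $clusterName `
            -ClusterCredential $clusterCreds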

This command creates a Streaming MapReduce job definition and pipes it to Start-AzureRmHDInsightJob, which submits the job to the specified cluster.
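As a follow-up, the sketch below stores the job definition in a variable, submits it, waits for it to finish, and reads its standard output. It assumes that $clusterName and $clusterCreds are set as in Example 1 and that the Wait-AzureRmHDInsightJob and Get-AzureRmHDInsightJobOutput cmdlets from the same module are available. The paths, file names, and storage account values are placeholders, not values taken from this article.

```powershell
# Placeholder storage details for the cluster's default storage account
$storageAccountName = "yourstorageaccount"
$storageAccountKey  = "your-storage-account-key"
$container          = "your-container"

# Build the job definition; the paths and file names are placeholders
$jobDefinition = New-AzureRmHDInsightStreamingMapReduceJobDefinition `
            -InputPath "/example/data/input.txt" `
            -OutputPath "/example/data/StreamingOutput" `
            -Mapper "mapper.exe" `
            -Reducer "reducer.exe" `
            -StatusFolder "tempStatusFolder/"

# Submit the job and keep the returned job object
$job = Start-AzureRmHDInsightJob -ClusterName $clusterName `
            -JobDefinition $jobDefinition `
            -HttpCredential $clusterCreds

# Block until the job completes
Wait-AzureRmHDInsightJob -ClusterName $clusterName `
            -JobId $job.JobId `
            -HttpCredential $clusterCreds

# Read the job's standard output from the default storage container
Get-AzureRmHDInsightJobOutput -ClusterName $clusterName `
            -JobId $job.JobId `
            -DefaultContainer $container `
            -DefaultStorageAccountName $storageAccountName `
            -DefaultStorageAccountKey $storageAccountKey `
            -HttpCredential $clusterCreds
```

Keeping the definition in a variable rather than piping it makes it possible to inspect or reuse it before submission.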

Required Parameters

-InputPath

Specifies the path to the input files.

Type:String
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False

Optional Parameters

-Arguments

Specifies an array of arguments for the job. The arguments are passed as command-line arguments to each task.

Type:String[]
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False

-CommandEnvironment

Specifies a hash table of environment variables to set when a job runs on worker nodes.
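Because the parameter type is Hashtable, the variables can be passed as name/value pairs. The following is a minimal sketch; the variable name LOG_LEVEL, its value, and the paths are hypothetical.

```powershell
# Hypothetical environment variable for the mapper process
$environment = @{ "LOG_LEVEL" = "debug" }

# Paths and file names are placeholders
$jobDefinition = New-AzureRmHDInsightStreamingMapReduceJobDefinition `
            -InputPath "/example/data/input.txt" `
            -Mapper "mapper.exe" `
            -CommandEnvironment $environment
```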

Type:Hashtable
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False

-Defines

Specifies Hadoop configuration values to set when the job runs.
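Configuration values are supplied as a hash table of property names and values. The sketch below uses two standard Hadoop property names; the values and the paths are illustrative assumptions.

```powershell
# Standard Hadoop property names; the values are illustrative
$defines = @{
    "mapreduce.job.reduces" = "2"
    "mapreduce.job.name"    = "streaming-sample"
}

# Paths and file names are placeholders
$jobDefinition = New-AzureRmHDInsightStreamingMapReduceJobDefinition `
            -InputPath "/example/data/input.txt" `
            -Mapper "mapper.exe" `
            -Reducer "reducer.exe" `
            -Defines $defines
```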

Type:Hashtable
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False

-File

Specifies the path to a single file that is associated with the Streaming MapReduce job. To associate several files, use the Files parameter.

Type:String
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False

-Files

Specifies a collection of files that are associated with a Streaming MapReduce job.

Type:String[]
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False

-Mapper

Specifies a Mapper file name.

Type:String
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False

-OutputPath

Specifies the path for the job output.

Type:String
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False

-Reducer

Specifies a Reducer file name.

Type:String
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False

-StatusFolder

Specifies the location of the folder that contains standard outputs and error outputs for a job.

Type:String
Position:Named
Default value:None
Accept pipeline input:False
Accept wildcard characters:False