New-AzureHDInsightPigJobDefinition

[This topic is pre-release documentation and is subject to change in future releases. Blank topics are included as placeholders.]

New-AzureHDInsightPigJobDefinition

Defines a new pig job for an Azure HDInsight service.

Syntax

Parameter Set: Default
New-AzureHDInsightPigJobDefinition [-Arguments <String[]> ] [-File <String> ] [-Files <String[]> ] [-Query <String> ] [-StatusFolder <String> ] [ <CommonParameters>]

Detailed Description

Defines a new pig job for an Azure HDInsight service.

This topic describes the cmdlet in the 0.8.1 version of the Microsoft Azure PowerShell module. To find out the version of the module you're using, from the Azure PowerShell console, type (get-module azure).version.

Parameters

-Arguments<String[]>

Arguments of the Hadoop job. The arguments will be passed as command line arguments to each task.

Aliases

Args

Required?

false

Position?

named

Default Value

none

Accept Pipeline Input?

false

Accept Wildcard Characters?

false

-File<String>

The path to a file that contains the query to be executed. Use this File parameter in place of a query parameter.

Aliases

QueryFile

Required?

false

Position?

named

Default Value

none

Accept Pipeline Input?

false

Accept Wildcard Characters?

false

-Files<String[]>

A collections of files associated with the pig job.

Aliases

none

Required?

false

Position?

named

Default Value

none

Accept Pipeline Input?

false

Accept Wildcard Characters?

false

-Query<String>

The query to be executed by the pig job.

Aliases

QueryText

Required?

false

Position?

named

Default Value

none

Accept Pipeline Input?

false

Accept Wildcard Characters?

false

-StatusFolder<String>

Contains the standard and error outputs for the job, including its exit code and task logs.

Aliases

none

Required?

false

Position?

named

Default Value

none

Accept Pipeline Input?

false

Accept Wildcard Characters?

false

<CommonParameters>

This cmdlet supports the common parameters: -Verbose, -Debug, -ErrorAction, -ErrorVariable, -OutBuffer, and -OutVariable. For more information, see  about_CommonParameters (https://go.microsoft.com/fwlink/p/?LinkID=113216).

Inputs

The input type is the type of the objects that you can pipe to the cmdlet.

Outputs

The output type is the type of the objects that the cmdlet emits.

Examples

PS C:\> 

Define a Pig job

Define a new pig job for an Azure HDInsight service.

PS C:\> # Provide the HDInsight cluster name
$subscriptionName = "<SubscriptionName>"
$clusterName = "<HDInsightClusterName>"

# Create the Pig job definition
$0 = '$0';
$QueryString =  "LOGS = LOAD 'wasb:///example/data/sample.log';" +
                "LEVELS = foreach LOGS generate REGEX_EXTRACT($0, '(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)', 1)  as LOGLEVEL;" +
                "FILTEREDLEVELS = FILTER LEVELS by LOGLEVEL is not null;" +
                "GROUPEDLEVELS = GROUP FILTEREDLEVELS by LOGLEVEL;" +
                "FREQUENCIES = foreach GROUPEDLEVELS generate group as LOGLEVEL, COUNT(FILTEREDLEVELS.LOGLEVEL) as COUNT;" +
                "RESULT = order FREQUENCIES by COUNT desc;" +
                "DUMP RESULT;" 

$pigJobDefinition = New-AzureHDInsightPigJobDefinition -Query $QueryString