New-AzureHDInsightSqoopJobDefinition

[This topic is pre-release documentation and is subject to change in future releases. Blank topics are included as placeholders.]

Defines a new Sqoop job.

Syntax

Parameter Set: Default
New-AzureHDInsightSqoopJobDefinition [-Command <String>] [-File <String>] [-Files <String[]>] [-StatusFolder <String>] [<CommonParameters>]

Detailed Description

Defines a Sqoop job to be run on an HDInsight cluster.

This topic describes the cmdlet in the 0.8.1 version of the Microsoft Azure PowerShell module. To find the version of the module you're using, type (Get-Module Azure).Version in the Azure PowerShell console.

Parameters

-Command <String>

Specifies a Sqoop command and its arguments.

Aliases

none

Required?

false

Position?

named

Default Value

none

Accept Pipeline Input?

false

Accept Wildcard Characters?

false

-File <String>

Specifies the path to a script file that contains the commands to be executed. The script file must be stored in Azure Blob storage and referenced by its wasb path.

Aliases

QueryFile

Required?

false

Position?

named

Default Value

none

Accept Pipeline Input?

false

Accept Wildcard Characters?

false

-Files <String[]>

Specifies the collection of files required to run the job. Reference each file by its wasb path.

Aliases

none

Required?

false

Position?

named

Default Value

none

Accept Pipeline Input?

false

Accept Wildcard Characters?

false

-StatusFolder <String>

Specifies the location of the status folder in which the standard output and standard error of the Sqoop job are stored, along with its exit code and task logs.

Aliases

none

Required?

false

Position?

named

Default Value

none

Accept Pipeline Input?

false

Accept Wildcard Characters?

false

<CommonParameters>

This cmdlet supports the common parameters: -Verbose, -Debug, -ErrorAction, -ErrorVariable, -OutBuffer, and -OutVariable. For more information, see about_CommonParameters (https://go.microsoft.com/fwlink/p/?LinkID=113216).

Inputs

The input type is the type of the objects that you can pipe to the cmdlet.

Outputs

The output type is the type of the objects that the cmdlet emits.

Notes

  • Sqoop is a tool designed to transfer data between Hadoop clusters and relational databases. You can use Sqoop to import data from a SQL database into the Hadoop Distributed File System (HDFS), transform the data with Hadoop MapReduce, and then export the data from HDFS back into the SQL database.
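
The export direction mentioned in the note above can be sketched as follows. This is a minimal illustration, not a tested recipe: every value in angle brackets is a placeholder, the variable names are chosen only for this sketch, and it assumes the data to export already sits in a wasb folder in a format that Sqoop's export command can parse with its default delimiters.

```powershell
# Sketch only: replace every <...> placeholder with your own values.
# Build the JDBC connection string for the target Azure SQL Database.
$connect = "jdbc:sqlserver://<SQLDatabaseServerName>.database.windows.net:1433;" +
           "username=<SQLDatabaseUsername>@<SQLDatabaseServerName>;" +
           "password=<SQLDatabasePassword>;database=<SQLDatabaseDatabaseName>"

# Folder in Azure Blob storage that holds the files to export (assumed to exist).
$exportDir = "wasb://<ContainerName>@<WindowsAzureStorageAccountName>.blob.core.windows.net/<Path>"

# Define, but do not yet run, a Sqoop job that exports the files into <TableName>.
$sqoopExportDef = New-AzureHDInsightSqoopJobDefinition -Command "export --connect $connect --table <TableName> --export-dir $exportDir"
```

The cmdlet only creates the job definition; nothing runs on the cluster until the definition is submitted.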

Examples

Import data

Define a Sqoop job that imports all the rows of a table from an Azure SQL Database to an HDInsight cluster.

PS C:\> $sqoopJobDef = New-AzureHDInsightSqoopJobDefinition -Command "import --connect jdbc:sqlserver://<SQLDatabaseServerName>.database.windows.net:1433;username=<SQLDatabaseUsername>@<SQLDatabaseServerName>;password=<SQLDatabasePassword>;database=<SQLDatabaseDatabaseName> --table <TableName> --target-dir wasb://<ContainerName>@<WindowsAzureStorageAccountName>.blob.core.windows.net/<Path>"
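
Defining the job does not run it. As a hedged sketch of one way to submit the $sqoopJobDef object created above and inspect the result, the following uses the Start-AzureHDInsightJob, Wait-AzureHDInsightJob, and Get-AzureHDInsightJobOutput cmdlets from the same module. The cluster name is a placeholder, the one-hour timeout is an arbitrary choice, and the sketch assumes an Azure subscription is already selected in the session.

```powershell
# Sketch only: <ClusterName> is a placeholder for an existing HDInsight cluster.
$clusterName = "<ClusterName>"

# Submit the job definition to the cluster; the cmdlet returns a job object.
$sqoopJob = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $sqoopJobDef

# Block until the job completes or the timeout (in seconds) elapses.
Wait-AzureHDInsightJob -Job $sqoopJob -WaitTimeoutInSeconds 3600

# Read the job's standard error stream, where progress and errors are typically reported.
Get-AzureHDInsightJobOutput -Cluster $clusterName -JobId $sqoopJob.JobId -StandardError
```

If the job definition also sets -StatusFolder, the same output should be available in that folder after the job finishes.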