HDFSOutputDatasetConfig Class

Represents how to write output to an HDFS path and promote it to a FileDataset.

Initialize an HDFSOutputDatasetConfig.

Inheritance
HDFSOutputDatasetConfig

Constructor

HDFSOutputDatasetConfig(name=None, destination=None)

Parameters

name
str
Required

The name of the output specific to this run. This is generally used for lineage purposes. If set to None, a name is generated automatically.

destination
tuple
Required

The destination of the output, given as a tuple whose first item is the datastore and whose second item is the path within that datastore. If set to None, the output is written to the workspaceblobstore datastore under the path /dataset/{run-id}/{output-name}, where run-id is the run's ID and output-name is the value of the name parameter above.
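The default-destination rule can be illustrated with a small, self-contained sketch. It is not the SDK's implementation: `default_output_path` and `resolve_destination` are hypothetical helpers, and the datastore is represented by its name as a plain string purely for illustration.

```python
def default_output_path(run_id, output_name):
    """Build the documented default path: /dataset/{run-id}/{output-name}."""
    return f"/dataset/{run_id}/{output_name}"


def resolve_destination(destination, run_id, output_name,
                        default_datastore="workspaceblobstore"):
    """Return a (datastore, path) tuple, applying the default when destination is None.

    Hypothetical helper: it only mirrors the rule described in the parameter text.
    """
    if destination is None:
        return (default_datastore, default_output_path(run_id, output_name))
    # An explicit destination is already a (datastore, path) tuple.
    datastore, path = destination
    return (datastore, path)


# No destination given: fall back to the default datastore and path.
print(resolve_destination(None, "run123", "cleaned"))
# An explicit destination is returned unchanged.
print(resolve_destination(("mydatastore", "outputs/cleaned"), "run123", "cleaned"))
```

The run ID `run123`, the output name `cleaned`, and the datastore name `mydatastore` are placeholder values.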

Remarks

You can pass an HDFSOutputDatasetConfig as an argument of a run, and it is automatically translated to an HDFS path.
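A minimal sketch of this pattern follows. It assumes the class can be imported from `azureml.data` and that a datastore object from your workspace is already available. The helper names, the `--output-dir` flag, and the default name and path values are illustrative, not part of the SDK.

```python
def build_hdfs_output(datastore, output_name="processed", path="outputs/processed"):
    """Create an HDFSOutputDatasetConfig that writes to (datastore, path)."""
    # Assumed import path; imported lazily so this sketch loads without azureml-core.
    from azureml.data import HDFSOutputDatasetConfig

    return HDFSOutputDatasetConfig(name=output_name, destination=(datastore, path))


def build_run_arguments(output_config, flag="--output-dir"):
    """Return a run argument list that contains the output config object itself."""
    # The object is passed as-is; per the remarks, it is translated to an HDFS path for the run.
    return [flag, output_config]
```

Given a datastore object named `datastore`, calling `build_run_arguments(build_hdfs_output(datastore))` would produce an argument list to hand to the run, so the script receives the resolved HDFS path after the flag.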

Methods

as_input

Specify how to consume the output as an input in subsequent pipeline steps.

as_input(name=None)

Parameters

name
str
Required

The name of the input specific to the run.

Returns

A DatasetConsumptionConfig instance describing how to deliver the input data.
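As a hedged illustration of chaining, the sketch below wraps the documented `as_input(name=...)` call. The helper `consume_in_next_step` and the input name `prepared_data` are hypothetical; `upstream_output` is assumed to be an HDFSOutputDatasetConfig produced by an earlier step.

```python
def consume_in_next_step(upstream_output, input_name="prepared_data"):
    """Turn an upstream output config into an input for a later pipeline step."""
    # as_input(name=...) is the documented method; input_name is an illustrative value.
    return upstream_output.as_input(name=input_name)
```

The returned object can then be supplied to the downstream step as one of its inputs.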

Return type