InputPortBinding Class
Defines a binding from a source to an input of a pipeline step.
An InputPortBinding can be used as an input to a step. The source can be a PipelineData, PortDataReference, DataReference, PipelineDataset, or OutputPortBinding.
InputPortBinding is useful for specifying the name of the step input when it should differ from the name of the bind object (for example, to avoid duplicate input/output names, or because the step script needs an input to have a certain name). It can also be used to specify the bind_mode for PythonScriptStep inputs.
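The name clash mentioned above can be illustrated without the SDK. The following is a minimal sketch, assuming that names must be unique across a step's inputs and outputs; `find_duplicate_port_names` is a hypothetical helper, not part of azureml.

```python
from collections import Counter


def find_duplicate_port_names(input_names, output_names):
    """Return, sorted, every name used more than once across inputs and outputs."""
    counts = Counter(list(input_names) + list(output_names))
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical step that reads and writes objects both named "data".
print(find_duplicate_port_names(["data", "labels"], ["data"]))

# Giving the input its own name (the role the `name` parameter plays) removes the clash.
print(find_duplicate_port_names(["raw_data", "labels"], ["data"]))
```

The first call reports "data" as a duplicate; the second reports none, because the input was given a distinct name.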
Initialize InputPortBinding.
- Inheritance
builtins.object -> InputPortBinding
Constructor
InputPortBinding(name, bind_object=None, bind_mode='mount', path_on_compute=None, overwrite=None, is_resource=False, additional_transformations=None, **kwargs)
Parameters
- name
- str
Name of the input port to bind, which can contain only letters, digits, and underscores.
- bind_object
- Union[PortDataReference, DataReference, PipelineData, OutputPortBinding, PipelineDataset]
The object to bind to the input port.
- bind_mode
- str
Specifies whether the consuming step will use the "download" or "mount" method to access the data.
- path_on_compute
- str
For "download" mode, the local path the step will read the data from.
- overwrite
- bool
For "download" mode, indicates whether to overwrite existing data.
- is_resource
- bool
Indicates whether the input is a resource. Resources are downloaded to the script folder and provide a way to change the behavior of the script at run-time.
- additional_transformations
- <xref:azureml.dataprep.Dataflow>
Additional transformations to apply to the input. This will only be applied if the output of the previous step is an Azure Machine Learning Dataset.
- name
- str
Name of the input port to bind, which can contain only letters, digits, and underscores.
- bind_object
- Union[PortDataReference, DataReference, PipelineData, OutputPortBinding, PipelineDataset]
The object to bind to the input port.
- bind_mode
- str
Specifies whether the consuming step will use the "download", "mount", or "direct" method to access the data.
- is_resource
- bool
Indicates whether the input is a resource. Resources are downloaded to the script folder and provide a way to change the behavior of the script at run-time.
- additional_transformations
- <xref:azureml.dataprep.Dataflow>
Additional transformations to apply to the input. This will only be applied if the output of the previous step is an Azure Machine Learning Dataset.
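The naming rule stated for the name parameter (only letters, digits, and underscores) can be checked before a binding is constructed. This is a minimal sketch, assuming "letters" and "digits" mean ASCII characters and that an empty name is not acceptable; `is_valid_port_name` is a hypothetical helper, and the SDK may apply its own validation differently.

```python
import re

# Assumption: "letters" and "digits" are taken to mean ASCII characters.
_PORT_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_port_name(name):
    """Return True if `name` is a non-empty string of letters, digits, and underscores."""
    return isinstance(name, str) and _PORT_NAME_PATTERN.fullmatch(name) is not None


# Hypothetical candidate names for an input port.
for candidate in ["input_1", "training-data", "my input"]:
    print(candidate, is_valid_port_name(candidate))
```

Only "input_1" passes here; the hyphen and the space in the other two names fall outside the allowed character set.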
Remarks
InputPortBinding is used to specify data dependencies in a Pipeline. It represents an input that a step requires for execution. Each InputPortBinding has a source, called bind_object, which specifies how the input data is produced.
PipelineData and OutputPortBinding can be used as the bind_object for an InputPortBinding to specify that the input to the step will be produced by another step in the Pipeline.
An example to build a Pipeline using InputPortBinding and PipelineData is as follows:
from azureml.pipeline.core import PipelineData, InputPortBinding, Pipeline
from azureml.pipeline.steps import PythonScriptStep

# datastore, compute, and workspace are assumed to be defined earlier.
step_1_output = PipelineData("output", datastore=datastore, output_mode="mount")

step_1 = PythonScriptStep(
    name='prepare data',
    script_name="prepare_data.py",
    compute_target=compute,
    arguments=["--output", step_1_output],
    outputs=[step_1_output]
)

# Bind the output of step_1 to an input named "input" for step_2.
step_2_input = InputPortBinding("input", bind_object=step_1_output)

step_2 = PythonScriptStep(
    name='train',
    script_name="train.py",
    compute_target=compute,
    arguments=["--input", step_2_input],
    inputs=[step_2_input]
)

pipeline = Pipeline(workspace=workspace, steps=[step_1, step_2])
In this example, the "train" step requires the output of the "prepare data" step as an input.
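Because the binding is passed in arguments after "--input", the train.py script would typically read the resolved location from its command line. The following is a minimal sketch of that script side, assuming the argument arrives as a filesystem path (as with "mount" or "download"); the helper names `parse_input_path` and `list_input_files` are hypothetical and not part of the SDK.

```python
import argparse
import os


def parse_input_path(argv=None):
    """Read the value passed after --input; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Hypothetical train.py entry point")
    parser.add_argument("--input", required=True,
                        help="Path where the step input is available")
    return parser.parse_args(argv).input


def list_input_files(input_path):
    """Return entry names under a directory input, or the single file name otherwise."""
    if os.path.isdir(input_path):
        return sorted(os.listdir(input_path))
    return [os.path.basename(input_path)]
```

Inside train.py, calling `parse_input_path()` with no arguments reads the path from the command line the step was launched with; passing an explicit list such as `["--input", "/some/path"]` makes the function easy to exercise locally.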
PortDataReference, DataReference, or PipelineDataset can be used as the bind_object for an InputPortBinding to specify that the input to the step already exists at a specified location.
An example to build a Pipeline using InputPortBinding and DataReference is as follows:
from azureml.data.data_reference import DataReference
from azureml.pipeline.core import InputPortBinding, Pipeline
from azureml.pipeline.steps import PythonScriptStep

# datastore, compute, and workspace are assumed to be defined earlier.
data_reference = DataReference(datastore=datastore, path_on_datastore='sample_data.txt', mode="mount")

# Bind the existing file to an input named "input" for the step.
step_1_input = InputPortBinding("input", bind_object=data_reference)

step_1 = PythonScriptStep(
    name='train',
    script_name="train.py",
    compute_target=compute,
    arguments=["--input", step_1_input],
    inputs=[step_1_input]
)

pipeline = Pipeline(workspace=workspace, steps=[step_1])
In this example, the "train" step requires the "sample_data.txt" file specified by the DataReference as an input.
Methods
as_resource | Get a duplicate input port binding which can be used as a resource.
get_bind_object_data_type | Get the data type of the bind object.
get_bind_object_name | Get the name of the bind object.
as_resource
Get a duplicate input port binding which can be used as a resource.
as_resource()
Returns
InputPortBinding with the is_resource property set to True.
Return type
get_bind_object_data_type
Get the data type of the bind object.
get_bind_object_data_type()
Returns
The data type name.
Return type
get_bind_object_name
Get the name of the bind object.
get_bind_object_name()
Returns
The bind object name.
Return type
Attributes
additional_transformations
Get the additional transformations to apply to the input data.
Returns
The additional transformations to apply to the input data.
Return type
bind_mode
Get the mode ("download", "mount", "direct", or "hdfs") the consuming step will use to access the data.
Returns
The bind mode ("download", "mount", "direct", or "hdfs").
Return type
bind_object
Get the object the InputPort will be bound to.
Returns
The bind object.
Return type
data_reference_name
Get the name of the data reference associated with the InputPortBinding.
Returns
The data reference name.
Return type
data_type
is_resource
name
overwrite
For "download" mode, indicates whether to overwrite existing data.
Returns
The overwrite property.
Return type
path_on_compute