Graph class

Definition

A class to define a pipeline run graph.

Graph(name, context)

Inheritance

builtins.object
Graph

Parameters

name
str

Name of the graph.

context
_GraphContext

The current graph context.

Methods

add_datasource_node(name, datasource=None, datasource_builder=None, datapath_param_name=None)

Add a datasource node to the graph.

add_module_node(name, input_bindings, output_bindings=None, param_bindings=None, module=None, module_builder=None, module_wiring=None)

Add a module node to the graph.

connect(source_port, dest_port)

Connect two ports and create an Edge.

delete_node(node_id)

Delete a node from the graph.

finalize(dry_run=None, regenerate_outputs=False)

Finalize resources for nodes in the graph.

generate_yaml()

Generate the yaml representation of the graph.

get_node(node_id)

Get a node by id.

sequence(nodes)

Configure a list of nodes to run in a sequence following the first node in the list.

submit(pipeline_parameters=None, continue_on_step_failure=False, regenerate_outputs=False, parent_run_id=None, **kwargs)

Submit the graph to run in the cloud.

validate()

Validate graph. Returns a list of errors.

add_datasource_node(name, datasource=None, datasource_builder=None, datapath_param_name=None)

Add a datasource node to the graph.

add_datasource_node(name, datasource=None, datasource_builder=None, datapath_param_name=None)

Parameters

name
str

Name of the node.

datasource
DataSource

Datasource for this node.

default value: None

datasource_builder
_DatasourceBuilder

_DatasourceBuilder for this node.

default value: None

datapath_param_name
str

Datapath parameter name.

default value: None

Returns

The node that was added to the graph.

Return type

add_module_node(name, input_bindings, output_bindings=None, param_bindings=None, module=None, module_builder=None, module_wiring=None)

Add a module node to the graph.

add_module_node(name, input_bindings, output_bindings=None, param_bindings=None, module=None, module_builder=None, module_wiring=None)

Parameters

name
str

Name of the node.

input_bindings
list

List of input port bindings.

output_bindings
list

List of output port bindings.

default value: None

param_bindings
dict

Dictionary of name-value pairs for parameter assignments.

default value: None

module
Module

Module for this node.

default value: None

module_builder
_ModuleBuilder

_ModuleBuilder for this node.

default value: None

module_wiring
dict

Mapping between the node's inputs/outputs and the module's inputs/outputs. Holds two keys, inputs and outputs, each mapped to a dict whose keys are the module's input/output names and whose values are the node's ports.

default value: None

Returns

The node that was added to the graph.

Return type
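
The shape of the module_wiring parameter described above can be sketched as a plain dictionary. This is a minimal illustration, not SDK code: the names raw_data and model, the string placeholders standing in for the node's ports, and the check_wiring helper are all hypothetical.

```python
# Hypothetical module input/output names mapped to placeholder node ports.
# In real use the values would be the node's port objects, not strings.
module_wiring = {
    "inputs": {
        "raw_data": "node_input_port_0",   # module input name -> node input port
    },
    "outputs": {
        "model": "node_output_port_0",     # module output name -> node output port
    },
}


def check_wiring(wiring):
    """Return True if the mapping has exactly the keys inputs and outputs, each a dict."""
    return set(wiring) == {"inputs", "outputs"} and all(
        isinstance(wiring[key], dict) for key in ("inputs", "outputs")
    )


print(check_wiring(module_wiring))
```

A check like this only confirms the two-key layout; it says nothing about whether the named ports exist on the module or the node.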

connect(source_port, dest_port)

Connect two ports and create an Edge.

connect(source_port, dest_port)

Parameters

source_port
OutputPort

Output port from the node that is the source of the connection.

dest_port
InputPort

Input port from the node that is the destination of the connection.

Returns

The Edge that was created.

Return type

delete_node(node_id)

Delete a node from the graph.

delete_node(node_id)

Parameters

node_id
str

The node id.

finalize(dry_run=None, regenerate_outputs=False)

Finalize resources for nodes in the graph.

finalize(dry_run=None, regenerate_outputs=False)

Parameters

dry_run
bool

Set to True to verify that the graph can be built without making any external API calls to Azure ML service.

default value: None

regenerate_outputs
bool

Set to True to force a new run (disallows module/datasource reuse).

default value: False

Returns

Dictionary of {node_id, (resource_id, is_new_resource)}

Return type
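
The return value of finalize is described as a dictionary of {node_id, (resource_id, is_new_resource)}. The sketch below shows one way such a value might be consumed. No service call is made: the dictionary is written by hand, and the node ids, resource ids, and the new_resource_nodes helper are hypothetical.

```python
# Hand-written stand-in for the value described under "Returns";
# the node and resource ids below are hypothetical.
finalize_result = {
    "node-1": ("resource-aaa", True),    # is_new_resource is True
    "node-2": ("resource-bbb", False),   # is_new_resource is False
}


def new_resource_nodes(result):
    """Return the ids of nodes whose is_new_resource flag is set, sorted."""
    return sorted(
        node_id
        for node_id, (resource_id, is_new_resource) in result.items()
        if is_new_resource
    )


print(new_resource_nodes(finalize_result))
```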

generate_yaml()

Generate the yaml representation of the graph.

generate_yaml()

Returns

The yaml dict.

Return type

get_node(node_id)

Get a node by id.

get_node(node_id)

Parameters

node_id
str

The node id.

Returns

The node.

Return type

sequence(nodes)

Configure a list of nodes to run in a sequence following the first node in the list.

sequence(nodes)

Parameters

nodes
list

The list of nodes.
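
One reading of the description of sequence is that each node in the list runs after the node before it, starting from the first. The sketch below illustrates that reading with plain strings in place of nodes. The chaining behavior is an assumption that the text above does not confirm, and sequence_pairs is a hypothetical helper, not part of the Graph class.

```python
def sequence_pairs(nodes):
    """Return (earlier, later) pairs that chain the nodes in list order."""
    return list(zip(nodes, nodes[1:]))


# Hypothetical node names used only for illustration.
print(sequence_pairs(["prepare", "train", "evaluate"]))
```

With fewer than two nodes there is nothing to order, so the helper returns an empty list.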

submit(pipeline_parameters=None, continue_on_step_failure=False, regenerate_outputs=False, parent_run_id=None, **kwargs)

Submit the graph to run in the cloud.

submit(pipeline_parameters=None, continue_on_step_failure=False, regenerate_outputs=False, parent_run_id=None, **kwargs)

Parameters

pipeline_parameters
dict

Parameters for pipeline execution. Optional.

default value: None

continue_on_step_failure
bool

Indicates whether to let the experiment continue executing if one step fails. If True, only steps that have no dependency on the output of the failed step will continue execution.

default value: False

regenerate_outputs
bool

Set to True to force a new run (disallows module/datasource reuse).

default value: False

parent_run_id

Optional run ID to set for the parent run of this pipeline run, which is reflected in RunHistory. The parent run must belong to the same experiment that this pipeline is being submitted to.

default value: None

kwargs
dict

Custom keyword arguments, reserved for future development.

Returns

A PipelineRun.

Return type
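
The effect of continue_on_step_failure=True, as described in the submit parameters, can be illustrated with a small dependency model. This is a sketch under stated assumptions, not the service's scheduling logic: the step names and the steps_that_continue helper are hypothetical, and treating indirect (transitive) dependencies as blocked is an assumption.

```python
def steps_that_continue(dependencies, failed_step):
    """Return steps that do not depend, directly or indirectly, on failed_step.

    dependencies maps each step name to the list of steps whose output it consumes.
    """
    blocked = {failed_step}
    changed = True
    # Keep marking steps as blocked until no new downstream step is found.
    while changed:
        changed = False
        for step, upstream in dependencies.items():
            if step not in blocked and any(u in blocked for u in upstream):
                blocked.add(step)
                changed = True
    return sorted(step for step in dependencies if step not in blocked)


# Hypothetical pipeline: evaluate consumes train's output, report_stats does not.
pipeline = {
    "prepare": [],
    "train": ["prepare"],
    "evaluate": ["train"],
    "report_stats": ["prepare"],
}
print(steps_that_continue(pipeline, "train"))
```

In this model, a failure in train blocks evaluate, while report_stats is unaffected because it only consumes the output of prepare.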

validate()

Validate graph. Returns a list of errors.

validate()

Returns

List of errors.

Return type
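
Because validate returns a list of errors rather than raising, a caller may want to check that list before going further. The sketch below shows one possible pattern. FakeGraph is a hypothetical stand-in that exposes only validate(), and ensure_valid is an illustrative helper, not part of the Graph class.

```python
class FakeGraph:
    """Hypothetical stand-in exposing only validate(), for illustration."""

    def __init__(self, errors):
        self._errors = list(errors)

    def validate(self):
        # Mirrors the documented contract: return a list of errors.
        return list(self._errors)


def ensure_valid(graph):
    """Raise ValueError listing every error if validate() reports any."""
    errors = graph.validate()
    if errors:
        raise ValueError(
            "Graph validation failed: " + "; ".join(str(e) for e in errors)
        )
    return True


print(ensure_valid(FakeGraph([])))
```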

Attributes

datasource_nodes

Get a list containing all datasource nodes.

Returns

List of Node

Return type

edges

Get an iterator of edges.

Returns

A list of Edge

Return type

module_nodes

Get a list containing all module nodes.

Returns

List of Node

Return type

node_dict

Get a dictionary containing all nodes.

Returns

Dictionary of {node Id, Node}

Return type

node_name_dict

Get a dictionary containing all nodes indexed by name.

Returns

Dictionary of {node name, Node}

Return type

nodes

Get a list containing all nodes.

Returns

List of Node

Return type

params

Get a dictionary containing all graph parameters. Values are literal types or data reference as JSON string.

Returns

Dictionary of {param name, param value}

Return type