DeriveColumnByExampleBuilder class

Definition

Interactive object that can be used to learn a program for deriving a column from a set of source columns and examples.

DeriveColumnByExampleBuilder(dataflow: azureml.dataprep.api.dataflow.Dataflow, engine_api: azureml.dataprep.api.engineapi.api.EngineAPI, source_columns: typing.List[str], new_column_name: str)
Inheritance
builtins.object
DeriveColumnByExampleBuilder

Methods

add_example(source_data: SourceData, example_value: str) -> None

Adds an example value that will be used when learning a program to derive the new column.

delete_example(example_id: int = None, example_row: typing.Union[pandas.core.series.Series, NoneType] = None)

Deletes an example so that it is no longer considered in program generation.

Note

Accepts either a full example row from the list_examples() result or just an example_id.

generate_suggested_examples() -> pandas.core.frame.DataFrame

Lists examples that, if provided, would improve confidence in the generated program.

Note

This operation internally pulls the data in order to generate suggestions.

learn() -> None

Learns a program that adds a new column whose values satisfy the constraints set by the source data and the provided examples.

list_examples() -> pandas.core.frame.DataFrame

Gets examples that are currently used to generate a program to derive a column.

preview(skip: int = 0, count: int = 10) -> pandas.core.frame.DataFrame

Previews the result of the generated program.

to_dataflow() -> azureml.dataprep.api.dataflow.Dataflow

Uses the program learned based on the provided examples to derive a new column and create a new dataflow.

add_example(source_data: SourceData, example_value: str) -> None

Adds an example value that will be used when learning a program to derive the new column.

Parameters

source_data

Source data for the provided example. Generally a Dict[str, str] or pandas.Series, where the dictionary keys or the series index are column names and the values are the corresponding column values. The easiest way to provide source_data is to pass in a specific row of a pandas.DataFrame (e.g., df.iloc[2]).

example_value

Desired result for the provided source data.

Remarks

If an identical example is already present, this will do nothing. If a conflicting example is given (identical source_data but different example_value), an exception will be raised.
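The remark above amounts to a small rule set, which the following plain-Python sketch illustrates. It is a stand-in and not the library's implementation: `ExampleStore` and `ConflictingExampleError` are hypothetical names, and source_data is assumed to be a Dict[str, str] as described under Parameters.

```python
from typing import Dict, List, Tuple


class ConflictingExampleError(Exception):
    """Hypothetical error type; this reference does not name the real exception."""


class ExampleStore:
    """Hypothetical in-memory stand-in for the builder's list of examples."""

    def __init__(self) -> None:
        self._examples: List[Tuple[Dict[str, str], str]] = []

    def add_example(self, source_data: Dict[str, str], example_value: str) -> None:
        for known_source, known_value in self._examples:
            if known_source == source_data:
                if known_value == example_value:
                    # Identical example already present: do nothing.
                    return
                # Same source_data but a different example_value: conflict.
                raise ConflictingExampleError(
                    f"{source_data!r} already maps to {known_value!r}"
                )
        # Store a copy so later changes to the caller's dict have no effect.
        self._examples.append((dict(source_data), example_value))

    def __len__(self) -> int:
        return len(self._examples)
```

Adding the same pair twice leaves one stored example, while changing only example_value for the same source_data raises the error.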

delete_example(example_id: int = None, example_row: typing.Union[pandas.core.series.Series, NoneType] = None)

Deletes an example so that it is no longer considered in program generation.

Note

Accepts either a full example row from the list_examples() result or just an example_id.

Parameters

example_id

ID of the example to delete.

example_row

Example row to delete.
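The two ways of identifying an example can be pictured with a small stand-in written over plain dictionaries. This is only a sketch under assumptions: the function below is not the library's code, and the "example_id" key used to look up the id inside a row is an assumed name that this reference does not confirm.

```python
from typing import Dict, List, Optional


def delete_example(
    examples: List[Dict[str, object]],
    example_id: Optional[int] = None,
    example_row: Optional[Dict[str, object]] = None,
) -> List[Dict[str, object]]:
    """Return the examples without the one identified by id or by row.

    Assumption: every row stores its id under the key "example_id".
    """
    if example_id is None and example_row is None:
        raise ValueError("Provide either example_id or example_row.")
    if example_id is None:
        # Fall back to the id carried by the full row.
        example_id = example_row["example_id"]
    return [row for row in examples if row["example_id"] != example_id]
```

Both call styles end up removing the same entry; passing neither argument is treated as an error in this sketch.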

generate_suggested_examples() -> pandas.core.frame.DataFrame

Lists examples that, if provided, would improve confidence in the generated program.

Note

This operation internally pulls the data in order to generate suggestions.

Returns

pandas.DataFrame of suggested examples.

Return type

pandas.core.frame.DataFrame

learn() -> None

Learns a program that adds a new column whose values satisfy the constraints set by the source data and the provided examples.

Remarks

Calling this function will trigger an attempt to generate a program that satisfies all the provided constraints (examples).
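To make "a program that satisfies all the provided constraints" concrete, here is a toy sketch that searches a fixed list of candidate programs and keeps the first one consistent with every example. It is not how the library learns programs; the candidate list, the single "name" column, and the function name `learn_program` are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, str]
Program = Callable[[Row], str]

# Hypothetical candidate programs over a single "name" column.
CANDIDATES: List[Tuple[str, Program]] = [
    ("upper", lambda row: row["name"].upper()),
    ("first_word", lambda row: row["name"].split()[0]),
    ("last_word", lambda row: row["name"].split()[-1]),
]


def learn_program(examples: List[Tuple[Row, str]]) -> Optional[str]:
    """Return the name of the first candidate that reproduces every example."""
    for name, program in CANDIDATES:
        if all(program(source) == expected for source, expected in examples):
            return name
    # No candidate satisfies all constraints.
    return None
```

Each example acts as a constraint: a candidate is rejected as soon as one example disagrees with its output, so adding examples narrows the set of acceptable programs.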

list_examples() -> pandas.core.frame.DataFrame

Gets examples that are currently used to generate a program to derive a column.

Returns

pandas.DataFrame with examples.

Return type

pandas.core.frame.DataFrame

preview(skip: int = 0, count: int = 10) -> pandas.core.frame.DataFrame

Previews the result of the generated program.

Parameters

skip

Number of rows to skip. Use this to move the preview window forward. Default is 0.

count

Number of rows to preview. Default is 10.

Returns

pandas.DataFrame with preview data.

Return type

pandas.core.frame.DataFrame

Remarks

The returned DataFrame consists of all the source columns used by the program as well as the derived column.
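The skip and count parameters describe a sliding window over the rows. The sketch below shows that windowing over a plain list of dictionaries; it assumes simple slice semantics and is a stand-in, not the library's preview implementation (which returns a pandas.DataFrame).

```python
from typing import Dict, List


def preview_window(
    rows: List[Dict[str, str]], skip: int = 0, count: int = 10
) -> List[Dict[str, str]]:
    """Return up to `count` rows, starting after the first `skip` rows."""
    return rows[skip:skip + count]
```

With the defaults this yields the first 10 rows; calling it again with skip=10 moves the window to the next 10.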

to_dataflow() -> azureml.dataprep.api.dataflow.Dataflow

Uses the program learned based on the provided examples to derive a new column and create a new dataflow.

Returns

A new Dataflow with a derived column.
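Taken together, the methods above suggest a simple workflow: add examples, learn, then convert to a dataflow. The helper below is a minimal sketch of that sequence. The name `derive_with_examples` is hypothetical, the builder is passed in rather than constructed, and the explicit learn() call is an assumption made for clarity, since this reference does not state whether to_dataflow() triggers learning on its own.

```python
def derive_with_examples(builder, examples):
    """Feed examples to a builder, learn a program, and return the new dataflow.

    `builder` is any object exposing add_example(), learn(), and to_dataflow()
    as documented above; `examples` is an iterable of
    (source_data, example_value) pairs.
    """
    for source_data, example_value in examples:
        builder.add_example(source_data=source_data, example_value=example_value)
    # Explicit learning step (an assumption; see the note above the code).
    builder.learn()
    return builder.to_dataflow()
```

In practice it may be worth calling preview() or generate_suggested_examples() between learn() and to_dataflow() to check the derived values before committing to the new dataflow.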