Add Columns

Important

Support for Machine Learning Studio (classic) will end on 31 August 2024. We recommend you transition to Azure Machine Learning by that date.

Beginning 1 December 2021, you will not be able to create new Machine Learning Studio (classic) resources. Through 31 August 2024, you can continue to use the existing Machine Learning Studio (classic) resources.

ML Studio (classic) documentation is being retired and may not be updated in the future.

Adds a set of columns from one dataset to another.

Category: Data Transformation / Manipulation

Note

Applies to: Machine Learning Studio (classic) only

Similar drag-and-drop modules are available in Azure Machine Learning designer.

Module overview

This article describes how to use the Add Columns module in Machine Learning Studio (classic) to concatenate two datasets.

You combine all columns from the two datasets that you specify as inputs to create a single dataset. If you need to concatenate more than two datasets, use several instances of Add Columns.
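The idea of chaining can be sketched outside of Studio (classic) in plain Python. This is only an illustration, not the module's implementation: the dict-of-lists dataset representation, the `add_columns` helper, and the sample data are all assumptions made for the example.

```python
from functools import reduce


def add_columns(left, right):
    """Concatenate two datasets column-wise, pairing rows by position.

    Each dataset is a dict mapping a column name to a list of values.
    Assumes both datasets have the same number of rows and share no
    column names (hypothetical helper, for illustration only).
    """
    combined = dict(left)    # keep all columns of the left dataset
    combined.update(right)   # then append all columns of the right dataset
    return combined


# Three hypothetical datasets with three rows each.
ids = {"id": [1, 2, 3]}
features = {"age": [34, 51, 27], "income": [48000, 62000, 39000]}
labels = {"outcome": ["yes", "no", "yes"]}

# More than two datasets: apply the two-input operation repeatedly,
# in the same way you would chain several Add Columns instances.
combined = reduce(add_columns, [ids, features, labels])
print(list(combined))  # ['id', 'age', 'income', 'outcome']
```

Each call takes exactly two inputs, which mirrors the two input ports of the module.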

When combining two datasets that contain a different number of rows, we recommend using the Join Data module, which supports outer joins on a common key column.
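To show why a key-based outer join suits datasets of different lengths, here is a minimal plain-Python sketch. It does not reproduce the Join Data module; the `outer_join` helper, the list-of-dicts row format, and the sample data are assumptions, and the sketch further assumes unique key values and no shared non-key column names.

```python
def outer_join(left_rows, right_rows, key):
    """Full outer join of two lists of row dicts on a shared key column.

    Hypothetical helper: assumes key values are unique within each input
    and that the inputs share no column names other than the key.
    """
    left_by_key = {row[key]: row for row in left_rows}
    right_by_key = {row[key]: row for row in right_rows}

    # Output columns: all left columns, then the right columns minus the key.
    columns = list(left_rows[0]) + [c for c in right_rows[0] if c != key]

    # Keys in left order, followed by keys that appear only on the right.
    keys = list(left_by_key) + [k for k in right_by_key if k not in left_by_key]

    joined = []
    for k in keys:
        row = dict.fromkeys(columns)          # every column starts as None
        row.update(left_by_key.get(k, {}))    # fill in left values, if any
        row.update(right_by_key.get(k, {}))   # fill in right values, if any
        joined.append(row)
    return joined


customers = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ben"}, {"id": 3, "name": "Cy"}]
orders = [{"id": 2, "total": 40.0}, {"id": 4, "total": 15.5}]

joined = outer_join(customers, orders, "id")
for row in joined:
    print(row)
```

Rows are matched on the `id` value rather than on row position, so unmatched rows from either side are kept and their absent columns are left as `None`.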

How to configure Add Columns

  1. Add the Add Columns module to your experiment.

  2. Connect the two datasets that you want to concatenate. If you want to combine more than two datasets, you can chain together several instances of Add Columns.

    • It is possible to combine two datasets that have a different number of rows. The output dataset is padded with missing values for each row that is absent from the smaller dataset.

    • You cannot choose individual columns to add. All the columns from each dataset are concatenated when you use Add Columns. Therefore, if you want to add only a subset of the columns, use Select Columns in Dataset to create a dataset with the columns you want.

  3. Run the experiment.
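The two notes under step 2 (padding when row counts differ, and selecting a subset of columns before combining) can be illustrated with a short plain-Python sketch. The helpers `add_columns_padded` and `select_columns`, the dict-of-lists representation, and the use of `None` as the missing value are assumptions for this example, not the module's actual code.

```python
def select_columns(dataset, names):
    """Keep only the named columns (a stand-in for Select Columns in Dataset)."""
    return {name: dataset[name] for name in names}


def add_columns_padded(left, right, missing=None):
    """Concatenate two datasets column-wise, padding shorter columns.

    Hypothetical helper: datasets are non-empty dicts of column name -> list,
    and the two inputs are assumed to share no column names.
    """
    all_columns = list(left.values()) + list(right.values())
    n_rows = max(len(values) for values in all_columns)

    combined = {}
    for dataset in (left, right):
        for name, values in dataset.items():
            # Pad with the missing-value marker up to the longest column.
            combined[name] = list(values) + [missing] * (n_rows - len(values))
    return combined


left = {"id": [1, 2, 3]}
right = {"score": [0.9, 0.4], "notes": ["ok", "retry"]}

# Only "score" is wanted from the right dataset, so select it first.
result = add_columns_padded(left, select_columns(right, ["score"]))
print(result)  # {'id': [1, 2, 3], 'score': [0.9, 0.4, None]}
```

Selecting columns before combining keeps the unwanted `notes` column out of the output, because the combining step itself always takes every column it is given.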

Results

After the experiment has run:

  • To see the first rows of the new dataset, right-click the output of Add Columns and select Visualize.
  • To save and name the concatenated dataset, right-click the output and select Save as Dataset.

The number of columns in the new dataset equals the sum of the columns of both input datasets.

If the two input datasets contain a column with the same name, a numeric suffix is added to the name of the column from the dataset connected to the right input. For example, if there are two instances of a column named TargetOutcome, the column from the right dataset is renamed TargetOutcome (1).
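The column-count and renaming rules can be sketched in plain Python as follows. This is an illustration under assumptions, not the module's implementation: the `add_columns_with_suffix` helper and the sample data are hypothetical, and continuing to (2), (3), and so on when a suffixed name is also taken is a guess that the article does not confirm.

```python
def add_columns_with_suffix(left, right):
    """Concatenate two datasets column-wise, renaming duplicate right columns.

    Hypothetical helper: datasets are dicts of column name -> list of values.
    A right-hand column whose name is already used gets a " (1)" suffix.
    Trying higher numbers after that is an assumption of this sketch.
    """
    combined = dict(left)
    for name, values in right.items():
        new_name = name
        suffix = 1
        while new_name in combined:
            new_name = f"{name} ({suffix})"
            suffix += 1
        combined[new_name] = values
    return combined


left = {"Age": [34, 51], "TargetOutcome": [0, 1]}
right = {"TargetOutcome": [1, 1], "Score": [0.7, 0.2]}

combined = add_columns_with_suffix(left, right)
print(list(combined))
# ['Age', 'TargetOutcome', 'TargetOutcome (1)', 'Score']

# The output has as many columns as both inputs together.
print(len(combined) == len(left) + len(right))  # True
```

Because the duplicate is renamed rather than overwritten, no column is lost and the column count stays equal to the sum of the inputs.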

Examples

For examples of how Add Columns is used in an experiment, see the Azure AI Gallery.

Expected inputs

Name | Type | Description
Left dataset | Data Table | Left dataset
Right dataset | Data Table | Right dataset

Output

Name | Type | Description
Combined dataset | Data Table | Combined dataset

Exceptions

Exception | Description
Error 0003 | An exception occurs if one or more input datasets is null or empty.
Error 0017 | An exception occurs if one or more specified columns has a type that is unsupported by the current module.

For a list of errors specific to Studio (classic) modules, see Machine Learning Error codes.

For a list of API exceptions, see Machine Learning REST API Error Codes.

See also

Manipulation
Data Transformation
A-Z Module List