Adds a set of columns from one dataset to another
Category: Data Transformation / Manipulation
Applies to: Machine Learning Studio
This content pertains only to Studio. Similar drag-and-drop modules have been added to the visual interface in Machine Learning service. Learn more in this article comparing the two versions.
This article describes how to use the Add Columns module in Azure Machine Learning Studio to concatenate two datasets.
You combine all columns from the two datasets that you specify as inputs to create a single dataset. If you need to concatenate more than two datasets, use several instances of Add Columns.
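As a rough analogy outside Studio, the same column-wise concatenation can be sketched with pandas. This is not how the module is implemented; the DataFrames and column names below are made up for illustration.

```python
import pandas as pd

# Two hypothetical datasets with the same number of rows.
left = pd.DataFrame({"age": [34, 51, 27], "income": [48000, 72000, 39000]})
right = pd.DataFrame({"label": [0, 1, 0]})

# axis=1 places the columns side by side. Rows are aligned on the index,
# which is positional here because both frames use the default RangeIndex.
combined = pd.concat([left, right], axis=1)

# Adding a third dataset takes a second concatenation, much like
# chaining a second Add Columns instance.
extra = pd.DataFrame({"region": ["N", "S", "E"]})
combined_three = pd.concat([combined, extra], axis=1)

print(list(combined_three.columns))
```

The second `pd.concat` call plays the role of a second Add Columns instance whose left input is the output of the first.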
When combining two datasets that contain a different number of rows, we recommend using the Join Data module, which supports outer joins on a common key column.
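To illustrate why a key-based outer join is the safer choice when row counts differ, here is a minimal pandas sketch. It is an analogy for what Join Data does, not the module itself; `customer_id` and the other column names are hypothetical.

```python
import pandas as pd

# Hypothetical datasets with different row counts and a shared key column.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "age": [34, 51, 27]})
labels = pd.DataFrame({"customer_id": [2, 3, 4, 5], "churned": [1, 0, 1, 0]})

# An outer join keeps every key from both sides and matches rows by key,
# not by position, so values never end up next to the wrong row.
joined = pd.merge(customers, labels, on="customer_id", how="outer")

print(joined)
```

Keys that appear on only one side get missing values in the columns that come from the other side.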
How to configure Add Columns
Add the Add Columns module to your experiment.
Connect the two datasets that you want to concatenate. If you want to combine more than two datasets, you can chain together several instances of Add Columns.
It is possible to combine two datasets that have a different number of rows. The output dataset is padded with missing values for each row that the smaller dataset lacks.
You cannot choose individual columns to add. All the columns from each dataset are concatenated when you use Add Columns. Therefore, if you want to add only a subset of the columns, use Select Columns in Dataset to create a dataset with the columns you want.
Run the experiment.
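The padding and column-subset behavior described in the steps above can be sketched in pandas. This is an illustrative analogy, assuming made-up data; the plain column selection stands in for Select Columns in Dataset.

```python
import pandas as pd

# Hypothetical inputs: the left dataset has four rows, the right has two.
left = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})
right = pd.DataFrame({"c": ["x", "y"], "d": [0.1, 0.2], "unused": [True, False]})

# Keep only the columns you want before combining.
right_subset = right[["c", "d"]]

# Resetting the index makes the alignment purely positional. Rows that the
# shorter dataset lacks are filled with missing values (NaN).
padded = pd.concat(
    [left.reset_index(drop=True), right_subset.reset_index(drop=True)],
    axis=1,
)

print(padded)
```

The last two rows of `c` and `d` are missing, which mirrors the padding that Add Columns applies to the smaller input.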
After the experiment has run:
- To see the first rows of the new dataset, right-click the output of Add Columns and select Visualize.
- To save and name the concatenated dataset, right-click the output and select Save as Dataset.
The number of columns in the new dataset equals the sum of the columns of both input datasets.
If the two input datasets contain a column with the same name, a numeric suffix is added to the name of the column that comes from the dataset connected to the right input. For example, if there are two instances of a column named TargetOutcome, the column from the right dataset is renamed TargetOutcome (1).
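The naming rule can be sketched in a few lines of plain Python. `resolve_column_names` is a hypothetical helper, not part of Studio. Counting past (1) when that name is already taken is an assumption, because the text only documents the (1) case.

```python
def resolve_column_names(left_names, right_names):
    """Return the output column names: left names first, then right names,
    with a numeric suffix added to any right name that is already in use."""
    taken = set(left_names)
    resolved = list(left_names)
    for name in right_names:
        candidate, n = name, 0
        # Assumption: keep incrementing the suffix until the name is unique.
        while candidate in taken:
            n += 1
            candidate = f"{name} ({n})"
        taken.add(candidate)
        resolved.append(candidate)
    return resolved


left_names = ["Feature1", "TargetOutcome"]
right_names = ["TargetOutcome", "Feature2"]
combined_names = resolve_column_names(left_names, right_names)

print(combined_names)
# ['Feature1', 'TargetOutcome', 'TargetOutcome (1)', 'Feature2']
```

The result has four names, which is the sum of the two input column counts. Only the name from the right input changes.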
For examples of how Add Columns is used in an experiment, see the Azure AI Gallery:
Customer relationship prediction: A column that contains labels is combined with a feature dataset.
Expected inputs
|Name|Type|Description|
|Left dataset|Data Table|Left dataset|
|Right dataset|Data Table|Right dataset|
Outputs
|Name|Type|Description|
|Combined dataset|Data Table|Combined dataset|
Exceptions
|Exception|Description|
|Error 0003|An exception occurs if one or more input datasets is null or empty.|
|Error 0017|An exception occurs if one or more specified columns has a type that is unsupported by the current module.|
For a list of errors specific to Studio modules, see Machine Learning Error codes.
For a list of API exceptions, see Machine Learning REST API Error Codes.