Data Transformation - Manipulation

This article describes the modules in Azure Machine Learning Studio that you can use for basic data manipulation.

Note

Applies to: Machine Learning Studio

This content pertains only to Studio. Similar drag-and-drop modules have been added to the visual interface in the Machine Learning service. Learn more in this article comparing the two versions.

Machine Learning Studio also supports tasks that are specific to machine learning, such as normalization or feature selection. The modules in this category, by contrast, are intended for more general data manipulation tasks.

Tip

You can use Azure Machine Learning Workbench to perform more sophisticated data cleanup and preparation tasks by using "learn by example" functions. For examples, see the Microsoft Machine Learning team blog post Data transformations “by example” in Machine Learning Workbench.

Data manipulation tasks

The modules in this category support core data management tasks that you might need to perform in Machine Learning Studio, such as the following:

  • Combine two datasets, either by using joins, or by merging columns or rows.
  • Create new categories to use in grouping data.
  • Modify column headings, change column data types, or flag columns as features or labels.
  • Check for missing values, and then replace them with appropriate values.
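For readers who think in code, the four tasks above can be sketched outside Studio with the pandas library. This is only an illustrative analogy, not how the Studio modules are implemented; the datasets, column names (customer_id, Age, amount), bin edges, and the choice of mean imputation are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sample data; none of these names come from Studio.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "Age": [25, 47, 61]})
orders = pd.DataFrame({"customer_id": [1, 2, 3], "amount": [20.0, None, 40.0]})

# 1. Combine two datasets with a join on a shared key column.
combined = customers.merge(orders, on="customer_id", how="inner")

# 2. Create new categories to use in grouping (bin edges are arbitrary).
combined["age_group"] = pd.cut(
    combined["Age"], bins=[0, 40, 120], labels=["under_40", "40_plus"]
)

# 3. Modify a column heading and change a column data type.
combined = combined.rename(columns={"Age": "age"})
combined["customer_id"] = combined["customer_id"].astype(str)

# pandas has no built-in feature/label flag, so this sketch simply
# tracks the column roles in ordinary variables.
label_column = "amount"
feature_columns = [
    c for c in combined.columns if c not in (label_column, "customer_id")
]

# 4. Check for missing values, then replace them (here, with the column mean).
missing_count = int(combined["amount"].isna().sum())
combined["amount"] = combined["amount"].fillna(combined["amount"].mean())
```

Whether the mean is an appropriate replacement depends on the data; a median, a constant, or dropping the row may be better choices in practice.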

Examples

For examples of how to work with complex data in machine learning experiments, see these samples in the Azure AI Gallery:

Modules in this category

The Data Transformation - Manipulation category includes the following modules:

See also