Data Transformation

This article lists the modules that Azure Machine Learning Studio provides for data transformation. In machine learning, data transformation includes general tasks, such as joining datasets or changing column names, as well as tasks that are specific to machine learning, such as normalization, binning and grouping, and inference of missing values.
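To make the machine-learning-specific tasks concrete, the following minimal sketch shows one common form of each: mean imputation of missing values, min-max normalization, and equal-width binning. It is plain Python for illustration only and is not the implementation used by any Studio module; the function names and the sample `ages` column are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical sample column with one missing value.
ages = [20, None, 40, 60]
filled = impute_mean(ages)            # missing value replaced by the mean
scaled = min_max_normalize(filled)    # values rescaled to [0, 1]
binned = equal_width_bins(filled, 2)  # values grouped into two bins
```

Other choices are equally valid, for example median imputation, z-score normalization, or quantile binning; which one fits depends on the data and the model.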


Data is generally expected to be "tidy" before you import it into Machine Learning Studio. Data preparation might include, for example, ensuring that the data uses the correct encoding and checking that the data has a consistent schema.
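As an illustration of the two checks mentioned above, the following sketch validates a CSV payload before import: it confirms that the bytes decode in the expected encoding and that every record has the same fields as the header. It uses only the Python standard library; the function name `check_csv`, the UTF-8 default, and the sample data are assumptions made for this example and are not part of Machine Learning Studio.

```python
import csv
import io


def check_csv(raw_bytes, expected_columns, encoding="utf-8"):
    """Return a list of problems found in a CSV payload (empty if none)."""
    try:
        text = raw_bytes.decode(encoding)
    except UnicodeDecodeError as exc:
        return [f"not valid {encoding}: {exc.reason}"]

    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    problems = []
    if rows[0] != expected_columns:
        problems.append(f"unexpected header: {rows[0]}")
    # Record numbers start at 2 because record 1 is the header.
    for record_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(expected_columns):
            problems.append(
                f"record {record_no}: expected {len(expected_columns)} "
                f"fields, got {len(row)}"
            )
    return problems


# Hypothetical sample payloads.
good = b"id,age\n1,20\n2,40\n"
bad = b"id,age\n1,20,extra\n"

good_problems = check_csv(good, ["id", "age"])
bad_problems = check_csv(bad, ["id", "age"])
encoding_problems = check_csv(b"\xff\xfe", ["id", "age"])
```

A check like this only covers structure. Column types, value ranges, and duplicate rows would need separate validation.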

You can use Azure Machine Learning Workbench to transform and prepare all kinds of data. For examples, see Data transformations “by example” in Machine Learning Workbench.

Modules for data transformation are grouped into the following task-based categories:

List of modules

The following module categories are included in the Data Transformation category:

See also