Statistical Functions

This article describes the modules in Azure Machine Learning Studio that support mathematical and statistical operations critical for machine learning. If you need to perform tasks such as the following in your experiment, look in the Statistical Functions category:

  • Perform ad hoc computations on column values, such as rounding or taking the absolute value.
  • Compute means, logarithms, and other statistics commonly used in machine learning.
  • Calculate correlation and probability scores.
  • Compute z-scores.
  • Compute widely used statistical distributions, such as Weibull, gamma, and beta.
  • Generate statistical reports over a set of columns or a dataset.
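
To make the first few tasks concrete, here is a minimal sketch in plain Python of what rounding, absolute values, logarithms, and z-scores look like on a single column. It is an illustration only, not how the Studio modules are implemented: the column values are hypothetical, and the z-score uses the sample standard deviation, which is an assumption rather than a documented Studio behavior.

```python
import math
import statistics

# Hypothetical column of numeric values (not from any Studio dataset).
column = [-3.0, 1.2, 2.7, 7.1]

# Ad hoc element-wise operations on the column.
absolute = [abs(x) for x in column]
rounded = [round(x) for x in column]
# Natural log is only defined for positive inputs, so it is applied
# to the absolute values here.
logs = [math.log(x) for x in absolute]

# Z-scores: subtract the column mean, then divide by the standard deviation.
# Assumption: the sample standard deviation (n - 1 denominator) is used.
mean = statistics.mean(column)
stdev = statistics.stdev(column)
z_scores = [(x - mean) / stdev for x in column]

print(absolute)
print(rounded)
print(z_scores)
```

After the transformation, the z-scores have a mean of zero and a sample standard deviation of one, which is what makes columns measured on different scales comparable.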

Note

Applies to: Machine Learning Studio

This content pertains only to Studio. Similar drag-and-drop modules have been added to the visual interface in Machine Learning service. Learn more in this article comparing the two versions.

For example, if you have a new dataset, you might use the Summarize Data module first. It generates a report for an entire dataset that includes standard statistical measures, such as mean and standard deviation.

If you need more advanced statistics, such as sample skewness or interquartile distance, use the Compute Elementary Statistics module to generate additional descriptive statistics.
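
As a rough picture of the measures mentioned above, the following sketch computes the mean, standard deviation, sample skewness, and interquartile distance (the third quartile minus the first) for one column in plain Python. The `describe` function is a hypothetical helper, not a Studio API. The skewness formula (adjusted Fisher-Pearson) and the quartile method (the default of `statistics.quantiles`) are assumptions, so the numbers may differ slightly from what the modules report.

```python
import statistics


def describe(values):
    """Return a few descriptive statistics for a list of numbers.

    Hypothetical helper for illustration. Requires at least three
    values that are not all identical.
    """
    n = len(values)
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)

    # Sample skewness, adjusted Fisher-Pearson form (assumed definition).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    g1 = m3 / m2 ** 1.5
    skewness = g1 * (n * (n - 1)) ** 0.5 / (n - 2)

    # Quartiles using the default "exclusive" method of statistics.quantiles.
    q1, _median, q3 = statistics.quantiles(values, n=4)

    return {
        "mean": mean,
        "stdev": stdev,
        "skewness": skewness,
        "iqr": q3 - q1,
    }


# Hypothetical column with one large value, so the skewness is positive.
print(describe([1, 1, 2, 10]))
```

A symmetric column such as 1, 2, 3, 4, 5 yields a skewness of zero under this definition, while a long right tail pushes it above zero.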

Because the modules regenerate their results each time you run the experiment, the results stay current when your data changes.

List of modules

The Statistical Functions category includes the following modules:

See also