What is automated machine learning?

Automated machine learning, also referred to as automated ML, is the process of automating the time-consuming, iterative tasks of machine learning model development. It allows data scientists, analysts, and developers to build ML models with high scale, efficiency, and productivity, all while sustaining model quality. Automated ML is based on a breakthrough from our Microsoft Research division.

Traditional machine learning model development is resource-intensive, requiring significant domain knowledge and time to produce and compare dozens of models. Apply automated ML when you want Azure Machine Learning to train and tune a model for you using the target metric you specify. The service then iterates through ML algorithms paired with feature selections, where each iteration produces a model with a training score. The higher the score, the better the model is considered to "fit" your data.

With automated machine learning, you accelerate the time it takes to get production-ready ML models.

When to use automated ML

Automated ML democratizes the machine learning model development process, and empowers its users, no matter their data science expertise, to identify an end-to-end machine learning pipeline for any problem.

Data scientists, analysts, and developers across industries can use automated ML to:

  • Implement machine learning solutions without extensive programming knowledge
  • Save time and resources
  • Leverage data science best practices
  • Provide agile problem-solving

The following table lists common automated ML use cases.

Classification | Regression | Time series forecasting
Fraud detection | CPU performance prediction | Demand forecasting
Marketing prediction | Material durability prediction | Sales forecasting

How automated ML works

Using Azure Machine Learning, you can design and run your automated ML training experiments with these steps:

  1. Identify the ML problem to be solved: classification, forecasting, or regression

  2. Specify the source and format of the labeled training data: NumPy arrays or a pandas DataFrame

  3. Configure the compute target for model training, such as your local computer, Azure Machine Learning compute, remote VMs, or Azure Databricks. Learn about automated training on a remote resource.

  4. Configure the automated machine learning parameters that determine the number of iterations over different models, the hyperparameter settings, advanced preprocessing/featurization, and the metrics to use when determining the best model. You can configure the settings for an automated training experiment in Azure Machine Learning studio or with the SDK.


    The functionality in the studio, https://ml.azure.com, is accessible from Enterprise workspaces only. Learn more about editions and upgrading.

  5. Submit the training run.
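The steps above can be sketched with the Python SDK. This is a minimal sketch, not a runnable recipe: the workspace config file, the dataset name, the label column, and the compute target name are all assumptions you'd replace with your own.

```python
# Sketch of steps 1-5 with the Azure ML Python SDK. Assumes a workspace
# config file, a registered tabular dataset named "training-data", and an
# existing compute target named "cpu-cluster" -- replace with your own.
from azureml.core import Workspace, Experiment, Dataset
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()                              # connect to the workspace
training_data = Dataset.get_by_name(ws, "training-data")  # step 2: labeled data

automl_config = AutoMLConfig(         # step 4: experiment settings
    task="classification",            # step 1: the ML problem type
    primary_metric="AUC_weighted",    # target metric used to rank models
    training_data=training_data,
    label_column_name="label",        # assumed label column name
    n_cross_validations=5,
    iterations=30,                    # how many model pipelines to try
    featurization="auto",             # advanced preprocessing/featurization
    compute_target="cpu-cluster")     # step 3: where training runs

experiment = Experiment(ws, "automl-example")
run = experiment.submit(automl_config, show_output=True)  # step 5: submit
```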


During training, Azure Machine Learning creates a number of pipelines in parallel that try different algorithms and parameters. The experiment stops once it hits the exit criteria you defined.

You can also inspect the logged run information, which contains metrics gathered during the run. The training run produces a Python serialized object (.pkl file) that contains the model and data preprocessing.

While model building is automated, you can also learn how important or relevant features are to the generated models.


In every automated machine learning experiment, your data is preprocessed using the default methods and optionally through advanced preprocessing.


Automated machine learning preprocessing steps (feature normalization, handling missing data, converting text to numeric, and so on) become part of the underlying model. When you use the model for predictions, the same preprocessing steps applied during training are applied to your input data automatically.

Automatic preprocessing (standard)

In every automated machine learning experiment, your data is automatically scaled or normalized to help algorithms perform well. During model training, one of the following scaling or normalization techniques will be applied to each model.

Scaling & normalization | Description
StandardScaleWrapper | Standardizes features by removing the mean and scaling to unit variance
MinMaxScalar | Transforms features by scaling each feature by that column's minimum and maximum
MaxAbsScaler | Scales each feature by its maximum absolute value
RobustScalar | Scales features by their quantile range
PCA | Linear dimensionality reduction using singular value decomposition (SVD) of the data to project it to a lower-dimensional space
TruncatedSVDWrapper | Performs linear dimensionality reduction by means of truncated SVD. Unlike PCA, this estimator doesn't center the data before computing the decomposition, which means it can work with scipy.sparse matrices efficiently
SparseNormalizer | Rescales each sample (that is, each row of the data matrix) with at least one non-zero component independently of other samples, so that its norm (l1 or l2) equals one
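To make the first two techniques in the table concrete, here is an illustrative re-implementation in plain Python. The helper names are hypothetical; Azure ML applies its own wrappers internally.

```python
def standard_scale(values):
    """Standardize a column: remove the mean and scale to unit variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return [(v - mean) / variance ** 0.5 for v in values]

def min_max_scale(values):
    """Rescale a column into [0, 1] using its minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

column = [2.0, 4.0, 6.0, 8.0]
print(min_max_scale(column))    # smallest value maps to 0.0, largest to 1.0
```

After standard scaling, the column has mean 0 and unit variance; after min-max scaling, its values span exactly [0, 1].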

Advanced preprocessing: optional featurization

Advanced preprocessing and featurization are also available, such as missing-value imputation, encoding, and transforms. Learn more about what featurization is included. Enable this setting with:

  • Azure Machine Learning studio: Select View featurization settings in the Configuration Run section.

  • Python SDK: Specify featurization: 'auto' / 'off' / FeaturizationConfig for the AutoMLConfig class.

Time-series forecasting

Building forecasts is an integral part of any business, whether it’s revenue, inventory, sales, or customer demand. You can use automated ML to combine techniques and approaches and get a recommended, high-quality time-series forecast.

An automated time-series experiment is treated as a multivariate regression problem. Past time-series values are "pivoted" to become additional dimensions for the regressor, together with other predictors. This approach, unlike classical time-series methods, has the advantage of naturally incorporating multiple contextual variables and their relationship to one another during training. Automated ML learns a single, but often internally branched, model for all items in the dataset and prediction horizons. More data is thus available to estimate model parameters, and generalization to unseen series becomes possible.
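The "pivoting" of past values into regressor dimensions can be sketched in plain Python. This is a hypothetical helper to illustrate the idea, not Azure ML's internal implementation.

```python
def make_lag_features(series, n_lags):
    """Turn a 1-D time series into (X, y) pairs, where each row of X holds
    the n_lags previous observations and y is the value to predict."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])  # previous n_lags observations
        y.append(series[t])             # target at time t
    return X, y

demand = [10, 12, 13, 15, 18, 21]
X, y = make_lag_features(demand, n_lags=2)
# X = [[10, 12], [12, 13], [13, 15], [15, 18]]
# y = [13, 15, 18, 21]
```

Any standard regressor can now be trained on (X, y), and extra contextual columns (price, holidays, weather) can simply be appended to each row of X.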

Learn more and see an example of automated machine learning for time series forecasting. Or, see the energy demand notebook for detailed code examples of advanced forecasting configuration including:

  • holiday detection and featurization
  • time-series and DNN learners (Auto-ARIMA, Prophet, ForecastTCN)
  • many model support through grouping
  • rolling-origin cross validation
  • configurable lags
  • rolling window aggregate features

Ensemble models

Automated machine learning supports ensemble models, which are enabled by default. Ensemble learning improves machine learning results and predictive performance by combining multiple models as opposed to using single models. The ensemble iterations appear as the final iterations of your run. Automated machine learning uses both voting and stacking ensemble methods for combining models:

  • Voting: predicts based on the weighted average of predicted class probabilities (for classification tasks) or predicted regression targets (for regression tasks).
  • Stacking: combines heterogeneous models and trains a meta-model based on the output of the individual models. The current default meta-models are LogisticRegression for classification tasks and ElasticNet for regression/forecasting tasks.
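The voting method for a classification task can be sketched in a few lines of plain Python. The helper and the example probabilities are hypothetical; Azure ML's actual ensembles wrap fully trained pipelines.

```python
def soft_vote(probas, weights):
    """Weighted average of per-model predicted class probabilities."""
    n_classes = len(probas[0])
    total = sum(weights)
    return [
        sum(w * p[c] for p, w in zip(probas, weights)) / total
        for c in range(n_classes)
    ]

# Three models' class probabilities for one sample, two classes.
# The third model gets twice the weight of the others.
model_probas = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
combined = soft_vote(model_probas, weights=[1, 1, 2])  # ~[0.475, 0.525]
```

Stacking differs in that, instead of fixed weights, a meta-model (LogisticRegression or ElasticNet by default) learns how to combine the individual models' outputs.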

The Caruana ensemble selection algorithm with sorted ensemble initialization is used to decide which models to use within the ensemble. At a high level, this algorithm initializes the ensemble with up to five models with the best individual scores, and verifies that these models are within a 5% threshold of the best score to avoid a poor initial ensemble. Then, for each ensemble iteration, a new model is added to the existing ensemble and the resulting score is calculated. If the new model improves the existing ensemble score, the ensemble is updated to include the new model.
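The selection procedure above can be sketched as a greedy loop. This is a simplified stand-in: each "model" here is represented only by its validation score, and the ensemble score is taken as the mean of its members' scores, whereas the real algorithm re-scores the ensemble's averaged predictions.

```python
def greedy_ensemble(model_scores, init_size=5, tolerance=0.05, rounds=10):
    """Caruana-style greedy selection over per-model validation scores."""
    ranked = sorted(model_scores, reverse=True)
    best = ranked[0]
    # Initialize with up to init_size models within tolerance of the best score.
    ensemble = [s for s in ranked[:init_size] if s >= best * (1 - tolerance)]
    score = sum(ensemble) / len(ensemble)
    for _ in range(rounds):
        # Try each candidate (selection is with replacement) and keep the best.
        candidate = max(model_scores,
                        key=lambda s: (sum(ensemble) + s) / (len(ensemble) + 1))
        new_score = (sum(ensemble) + candidate) / (len(ensemble) + 1)
        if new_score <= score:
            break  # no candidate improves the ensemble; stop
        ensemble.append(candidate)
        score = new_score
    return ensemble, score
```

With this stand-in scoring, weak models (outside the 5% threshold, or not improving the mean) are never added to the ensemble.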

See the how-to for changing default ensemble settings in automated machine learning.

Imbalanced data

Imbalanced data is commonly found in data for machine learning classification scenarios, and refers to data that contains a disproportionate ratio of observations in each class. This imbalance can lead to a falsely perceived positive effect of a model's accuracy, because the input data has bias towards one class, which causes the trained model to mimic that bias.

As part of its goal of simplifying the machine learning workflow, automated ML has built-in capabilities to help deal with imbalanced data, such as:

  • A weight column: automated ML supports a weight column as input, causing rows in the data to be weighted up or down, which can make a class more or less "important". See this notebook example.

  • The algorithms used by automated ML can properly handle imbalance of up to 20:1, meaning the most common class can have 20 times more rows in the data than the least common class.
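One common way to fill such a weight column is to weight each row inversely to its class frequency, so that every class contributes equally to training. The helper below is a hypothetical illustration, not part of automated ML itself.

```python
from collections import Counter

def balanced_weights(labels):
    """Weight each row inversely to its class frequency, so each class
    contributes the same total weight."""
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return [total / (n_classes * counts[label]) for label in labels]

labels = ["ok"] * 8 + ["fraud"] * 2   # a 4:1 imbalance
weights = balanced_weights(labels)
# "ok" rows get weight 0.625 each, "fraud" rows get weight 2.5 each,
# so both classes sum to the same total weight (5.0).
```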

Identify models with imbalanced data

Because classification algorithms are commonly evaluated by accuracy, checking a model's accuracy score is a good way to identify whether it was impacted by imbalanced data. Did it have really high accuracy for some classes and really low accuracy for others?

In addition, automated ML runs generate the following charts automatically, which can help you understand the correctness of the classifications of your model, and identify models potentially impacted by imbalanced data.

Chart | Description
Confusion Matrix | Evaluates the correctly classified labels against the actual labels of the data.
Precision-Recall | Evaluates the ratio of correct labels against the ratio of found label instances of the data.
ROC Curves | Evaluates the ratio of correct labels against the ratio of false-positive labels.

Handle imbalanced data

The following techniques are additional options to handle imbalanced data outside of automated ML.

  • Resampling to even the class imbalance, either by up-sampling the smaller classes or down-sampling the larger classes. These methods require expertise to process and analyze.

  • Use a performance metric that deals better with imbalanced data. For example, the F1 score is a weighted average of precision and recall. Precision measures a classifier's exactness (low precision indicates a high number of false positives), while recall measures a classifier's completeness (low recall indicates a high number of false negatives).
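A short worked example shows why these metrics matter on imbalanced data. The computation below is plain Python; the same metrics are available from any ML library.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# A classifier that always predicts the majority class scores 90% accuracy
# here, but precision/recall/F1 on the rare class expose the failure:
y_true = [0] * 9 + [1]
y_pred = [0] * 10
print(precision_recall_f1(y_true, y_pred))   # (0.0, 0.0, 0.0)
```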

Use with ONNX in C# apps

With Azure Machine Learning, you can use automated ML to build a Python model and have it converted to the ONNX format. The ONNX runtime supports C#, so you can use the model built automatically in your C# apps without any need for recoding or any of the network latencies that REST endpoints introduce. Try an example of this flow in this Jupyter notebook.

Automated ML across Microsoft

Automated ML is also available in other Microsoft solutions such as:

Integrations | Description
ML.NET | Automatic model selection and training in .NET apps using Visual Studio and Visual Studio Code with ML.NET automated ML (preview).
HDInsight | Scale out your automated ML training jobs on Spark in HDInsight clusters in parallel.
Power BI | Invoke machine learning models directly in Power BI (preview).
SQL Server | Create new machine learning models over your data in SQL Server 2019 big data clusters.

Next steps

See examples and learn how to build models using automated machine learning: