Bayesian Linear Regression

05/06/2019

Important

Support for Machine Learning Studio (classic) will end on 31 August 2024. We recommend you transition to Azure Machine Learning by that date.

Beginning 1 December 2021, you will not be able to create new Machine Learning Studio (classic) resources. Through 31 August 2024, you can continue to use the existing Machine Learning Studio (classic) resources.

See information on moving machine learning projects from ML Studio (classic) to Azure Machine Learning.
Learn more about Azure Machine Learning.

ML Studio (classic) documentation is being retired and may not be updated in the future.

Creates a Bayesian linear regression model.

Category: Machine Learning / Initialize Model / Regression

Note

Applies to: Machine Learning Studio (classic) only

Similar drag-and-drop modules are available in Azure Machine Learning designer.

Module overview

This article describes how to use the Bayesian Linear Regression module in Machine Learning Studio (classic), to define a regression model based on Bayesian statistics.

After you have defined the model parameters, you must train the model using a labeled dataset and the Train Model module. The trained model can then be used to make predictions. Alternatively, the untrained model can be passed to Cross-Validate Model for cross-validation against a labeled dataset.

More about Bayesian regression

In statistics, the Bayesian approach to regression is often contrasted with the frequentist approach.

The Bayesian approach uses linear regression supplemented by additional information in the form of a prior probability distribution. Prior information about the parameters is combined with a likelihood function to generate estimates for the parameters.

In contrast, the frequentist approach, represented by standard least-squares linear regression, assumes that the data contains sufficient measurements to create a meaningful model.
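The contrast can be illustrated with a minimal sketch, which is not the module's implementation. It assumes a single feature with no intercept, a zero-mean Gaussian prior on the slope, and Gaussian noise. The function names and the `prior_precision` and `noise_precision` parameters are hypothetical and chosen only for this illustration.

```python
def least_squares_slope(xs, ys):
    """Frequentist estimate of w in y = w * x (no intercept)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx


def bayesian_slope(xs, ys, prior_precision=1.0, noise_precision=1.0):
    """Posterior mean and variance of w, assuming the prior
    w ~ N(0, 1 / prior_precision) and Gaussian noise with the
    given precision. Both precisions are illustrative assumptions."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # The prior and the likelihood combine by adding their precisions.
    posterior_precision = prior_precision + noise_precision * sxx
    posterior_mean = noise_precision * sxy / posterior_precision
    return posterior_mean, 1.0 / posterior_precision


# Two observations that lie exactly on the line y = 2x.
xs = [1.0, 2.0]
ys = [2.0, 4.0]

ols = least_squares_slope(xs, ys)
mean, variance = bayesian_slope(xs, ys)

print(ols)             # 2.0
print(round(mean, 3))  # 1.667, pulled toward the prior mean of 0
```

With only two observations, the prior visibly pulls the Bayesian estimate toward zero. As more data is added, the likelihood term dominates and the two estimates converge.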

For more information about the research behind this algorithm, see the links in the Technical Notes section.

How to configure Bayesian Regression

Add the Bayesian Linear Regression module to your experiment. You can find this module under Machine Learning, Initialize Model, in the Regression category.
Regularization weight: Type a value to use for regularization. Regularization is used to prevent overfitting. This weight corresponds to L2 regularization. For more information, see the Technical Notes section.
Allow unknown categorical levels: Select this option to create a grouping for unknown values. If you deselect this option, the model can accept only the values contained in the training data. If you select it, the model might be less precise on known values but provide better predictions for new (unknown) values.
Connect a training dataset and one of the training modules. This model type has no parameters that can be changed in a parameter sweep, so although you can train the model using Tune Model Hyperparameters, that module cannot automatically optimize the model.
Select the single numeric column that you want to model or predict.
Run the experiment.
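The effect of the Allow unknown categorical levels option can be sketched as follows. This is an illustration of the idea only, not how Studio (classic) encodes columns internally. The `UNKNOWN` marker, `build_levels`, and `encode` are hypothetical names, and raising an error for unseen values when the option is off is a choice made for this sketch, not documented behavior.

```python
UNKNOWN = "<unknown>"  # hypothetical name for the additional level


def build_levels(training_values, allow_unknown=True):
    """Collect the categorical levels seen in training, optionally
    adding one extra level that stands in for unseen values."""
    levels = sorted(set(training_values))
    if allow_unknown:
        levels.append(UNKNOWN)
    return levels


def encode(value, levels):
    """One-hot encode a value. Unseen values map to UNKNOWN if that
    level exists; otherwise this sketch rejects them."""
    if value not in levels:
        if UNKNOWN not in levels:
            raise ValueError(f"unseen categorical level: {value!r}")
        value = UNKNOWN
    return [1 if level == value else 0 for level in levels]


levels = build_levels(["red", "green", "red"])
print(levels)                  # ['green', 'red', '<unknown>']
print(encode("red", levels))   # [0, 1, 0]
print(encode("blue", levels))  # [0, 0, 1], unseen value uses the extra level
```

The extra level adds one more column per categorical feature, which is one way to see why the model might be slightly less precise on known values while still handling new ones.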

Results

After training is complete:

To see a summary of the model's parameters, right-click the output of the Train Model module and select Visualize.
To create predictions, use the trained model as an input to Score Model.

Examples

For examples of regression models, see the Azure AI Gallery.

Compare Regression Models sample: Contrasts several different kinds of regression models.

Technical notes

The use of the lambda coefficient is described in detail in this textbook on machine learning: Pattern Recognition and Machine Learning, Christopher Bishop, Springer-Verlag, 2007.
A related article, Bayesian Regression and Classification, is available as a PDF download from the Microsoft Research site.
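As a sketch of how such a weight typically enters the model, consider the standard Gaussian formulation of Bayesian linear regression. That the module uses exactly this parameterization is an assumption. The symbols are introduced here for illustration: \alpha is the precision of a zero-mean Gaussian prior over the weights \mathbf{w}, \beta is the precision of the Gaussian noise, \mathbf{X} is the matrix of training features, and \mathbf{y} is the vector of labels.

```latex
\begin{aligned}
p(\mathbf{w}) &= \mathcal{N}\!\left(\mathbf{w} \mid \mathbf{0},\, \alpha^{-1}\mathbf{I}\right), \\
p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}) &= \mathcal{N}\!\left(\mathbf{y} \mid \mathbf{X}\mathbf{w},\, \beta^{-1}\mathbf{I}\right), \\
\mathbf{m}_N &= \left(\mathbf{X}^\top \mathbf{X} + \lambda \mathbf{I}\right)^{-1} \mathbf{X}^\top \mathbf{y},
\qquad \lambda = \frac{\alpha}{\beta}.
\end{aligned}
```

Under these assumptions the posterior mean \mathbf{m}_N has the same form as an L2-regularized least-squares solution, with \lambda equal to the ratio of the prior precision to the noise precision.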

Module parameters

Name	Range	Type	Default	Description
Regularization weight	>=double.Epsilon	Float	1.0	Type a constant to use in regularization. The constant represents the ratio of the precision of the weight prior to the precision of the noise.
Allow unknown categorical levels	Any	Boolean	true	If true, creates an additional level for each categorical column. Any levels in the test dataset that are not available in the training dataset are mapped to this additional level.

Outputs

Name	Type	Description
Untrained model	ILearner interface	An untrained Bayesian linear regression model