# LbfgsLogisticRegressionBinaryTrainer Class

## Definition

The IEstimator&lt;TTransformer&gt; to predict a target using a linear logistic regression model trained with the L-BFGS method.

```csharp
public sealed class LbfgsLogisticRegressionBinaryTrainer : Microsoft.ML.Trainers.LbfgsTrainerBase<Microsoft.ML.Trainers.LbfgsLogisticRegressionBinaryTrainer.Options,Microsoft.ML.Data.BinaryPredictionTransformer<Microsoft.ML.Calibrators.CalibratedModelParametersBase<Microsoft.ML.Trainers.LinearBinaryModelParameters,Microsoft.ML.Calibrators.PlattCalibrator>>,Microsoft.ML.Calibrators.CalibratedModelParametersBase<Microsoft.ML.Trainers.LinearBinaryModelParameters,Microsoft.ML.Calibrators.PlattCalibrator>>
```

```fsharp
type LbfgsLogisticRegressionBinaryTrainer = class
    inherit LbfgsTrainerBase<LbfgsLogisticRegressionBinaryTrainer.Options, BinaryPredictionTransformer<CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator>>, CalibratedModelParametersBase<LinearBinaryModelParameters, PlattCalibrator>>
```

```vb
Public NotInheritable Class LbfgsLogisticRegressionBinaryTrainer
Inherits LbfgsTrainerBase(Of LbfgsLogisticRegressionBinaryTrainer.Options, BinaryPredictionTransformer(Of CalibratedModelParametersBase(Of LinearBinaryModelParameters, PlattCalibrator)), CalibratedModelParametersBase(Of LinearBinaryModelParameters, PlattCalibrator))
```

## Remarks

To create this trainer, use LbfgsLogisticRegression or LbfgsLogisticRegression(Options).

### Input and Output Columns

The input label column data must be Boolean. The input features column data must be a known-sized vector of Single.

This trainer outputs the following columns:

| Output Column Name | Column Type | Description |
| --- | --- | --- |
| `Score` | Single | The unbounded score that was calculated by the model. |
| `PredictedLabel` | Boolean | The predicted label, based on the sign of the score. A negative score maps to `false` and a positive score maps to `true`. |
| `Probability` | Single | The probability calculated by calibrating the score of having `true` as the label. The probability value is in the range [0, 1]. |
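The relationship between the three output columns can be sketched numerically. This is a hedged illustration, not ML.NET code: the `outputs_from_score` helper is hypothetical, and a plain sigmoid stands in for the calibrated probability (the actual PlattCalibrator fits the sigmoid's slope and offset from data).

```python
import math

def outputs_from_score(score: float):
    """Illustrates how PredictedLabel and Probability relate to Score.

    PredictedLabel follows the sign of the unbounded score; the
    probability shown here is a plain sigmoid of the score, a stand-in
    for the fitted Platt calibration.
    """
    predicted_label = score > 0                    # positive score -> true
    probability = 1.0 / (1.0 + math.exp(-score))   # always in [0, 1]
    return predicted_label, probability

print(outputs_from_score(2.0))   # label True, probability ~0.88
print(outputs_from_score(-1.0))  # label False, probability ~0.27
```

Note that the label flips exactly where the sigmoid crosses 0.5, so the two outputs are always consistent with each other.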

### Scoring Function

Linear logistic regression is a variant of the linear model. It maps a feature vector $\textbf{x} \in {\mathbb R}^n$ to a scalar via $\hat{y}\left( \textbf{x} \right) = \textbf{w}^T \textbf{x} + b = \sum_{j=1}^n w_j x_j + b$, where $x_j$ is the $j$-th feature's value, the $j$-th element of $\textbf{w}$ is the $j$-th feature's coefficient, and $b$ is a learnable bias. The corresponding probability of getting a true label is $\frac{1}{1 + e^{-\hat{y}\left( \textbf{x} \right)}}$.

### Training Algorithm Details

The optimization technique implemented is based on the limited memory Broyden-Fletcher-Goldfarb-Shanno method (L-BFGS). L-BFGS is a quasi-Newton method which replaces the expensive computation of the Hessian matrix with an approximation, but still enjoys a fast convergence rate like Newton's method, where the full Hessian matrix is computed. Since the L-BFGS approximation uses only a limited amount of historical states to compute the next step direction, it is especially suited for problems with high-dimensional feature vectors. The number of historical states is a user-specified parameter; using a larger number may lead to a better approximation of the Hessian matrix but also a higher computation cost per step.
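The same idea can be sketched outside ML.NET with SciPy's L-BFGS implementation, minimizing an L2-regularized logistic loss on a toy dataset. The data, the regularization strength, and the use of SciPy are all assumptions for illustration; `maxcor` is SciPy's name for the number of stored correction pairs, i.e., the "historical states" described above.

```python
import numpy as np
from scipy.optimize import minimize

# Toy data: four samples, two features, labels encoded as +/-1.
X = np.array([[0.0, 1.0],
              [1.0, 1.0],
              [2.0, 0.0],
              [3.0, 0.5]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
lam = 0.1  # hypothetical L2 regularization strength

def loss(v):
    """L2-regularized logistic loss; v holds two weights plus a bias."""
    w, b = v[:2], v[2]
    margins = y * (X @ w + b)                       # y_i * (w^T x_i + b)
    return np.log1p(np.exp(-margins)).sum() + lam * (w @ w)

# maxcor = number of stored (s, y) correction pairs kept by L-BFGS.
res = minimize(loss, x0=np.zeros(3), method="L-BFGS-B",
               options={"maxcor": 10})
print(res.success, res.fun)
```

Because only `maxcor` vector pairs are kept, memory grows linearly with the feature dimension rather than quadratically, which is what makes the method practical for high-dimensional problems.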

Regularization is a method that can render an ill-posed problem more tractable by imposing constraints that provide information to supplement the data, and it prevents overfitting by penalizing the model's magnitude, usually measured by some norm function. This can improve the generalization of the learned model by selecting the optimal complexity in the bias-variance tradeoff. Regularization works by adding a penalty associated with the coefficient values to the error of the hypothesis. An accurate model with extreme coefficient values is penalized more, while a less accurate model with more conservative values is penalized less.

This learner supports elastic net regularization: a linear combination of the L1-norm (LASSO), $|| \textbf{w} ||_1$, and L2-norm (ridge), $|| \textbf{w} ||_2^2$, regularizations. L1-norm and L2-norm regularizations have different effects and uses that are complementary in certain respects. Using the L1-norm can increase the sparsity of the trained $\textbf{w}$. When working with high-dimensional data, it shrinks the small weights of irrelevant features to 0, and therefore no resources will be spent on those bad features when making predictions. If L1-norm regularization is used, the training algorithm is OWL-QN. L2-norm regularization is preferable for data that is not sparse, and it largely penalizes the existence of large weights.
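The elastic-net term added to the training loss is just the weighted sum of the two norms. A minimal sketch, where `elastic_net_penalty` and its `l1`/`l2` knobs are hypothetical names standing in for the trainer's regularization coefficients:

```python
import numpy as np

def elastic_net_penalty(w, l1, l2):
    """Elastic-net term: l1 * ||w||_1 + l2 * ||w||_2^2."""
    return l1 * np.abs(w).sum() + l2 * np.dot(w, w)

w = np.array([0.0, -3.0, 0.5])
print(elastic_net_penalty(w, l1=1.0, l2=0.0))  # pure LASSO: 3.5
print(elastic_net_penalty(w, l1=0.0, l2=1.0))  # pure ridge: 9.25
```

Note how the L1 term grows linearly with each coefficient's magnitude while the L2 term grows quadratically, which is why L2 punishes a single large weight far more heavily than several small ones.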

An aggressive regularization (that is, assigning large coefficients to the L1-norm or L2-norm regularization terms) can harm predictive capacity by excluding important variables from the model. Therefore, choosing the right regularization coefficients is important when applying logistic regression.
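The shrinking effect of a growing L2 coefficient can be seen in a one-feature least-squares problem, which has the closed form $w = \frac{\textbf{x} \cdot \textbf{y}}{\textbf{x} \cdot \textbf{x} + \lambda}$. This is a least-squares stand-in for the effect, not the logistic loss itself, and the data is made up:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])   # the true slope is 2

# Larger lambda shrinks the learned weight toward zero; an overly
# aggressive value effectively removes the variable from the model.
for lam in (0.0, 1.0, 10.0, 100.0):
    w = x.dot(y) / (x.dot(x) + lam)
    print(lam, round(w, 3))   # 2.0, then progressively smaller
```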

## Fields

- The feature column that the trainer expects. (Inherited from TrainerEstimatorBase)
- The label column that the trainer expects. Can be null, which indicates that label is not used for training. (Inherited from TrainerEstimatorBase)
- The weight column that the trainer expects. Can be null, which indicates that weight is not used for training. (Inherited from TrainerEstimatorBase)

## Properties

(Inherited from LbfgsTrainerBase)

## Methods

- Trains and returns an ITransformer. (Inherited from TrainerEstimatorBase)
- Continues the training of a LbfgsLogisticRegressionBinaryTrainer using an already trained modelParameters and returns a BinaryPredictionTransformer.

## Extension Methods

- Appends a "caching checkpoint" to the estimator chain. This will ensure that the downstream estimators will be trained against cached data. It is helpful to have a caching checkpoint before trainers that take multiple passes over the data.
- Given an estimator, returns a wrapping object that will call a delegate once Fit(IDataView) is called. It is often important for an estimator to return information about what was fit, which is why the Fit(IDataView) method returns a specifically typed object rather than just a general ITransformer. However, at the same time, IEstimator instances are often formed into pipelines with many objects, so we may need to build a chain of estimators via EstimatorChain where the estimator for which we want to get the transformer is buried somewhere in this chain. For that scenario, this method lets us attach a delegate that will be called once Fit(IDataView) is called.