AveragedPerceptronBinaryClassifier Class


Trains an averaged perceptron.

public sealed class AveragedPerceptronBinaryClassifier : Microsoft.ML.ILearningPipelineItem, Microsoft.ML.Runtime.EntryPoints.CommonInputs.ITrainerInputWithLabel
type AveragedPerceptronBinaryClassifier = class
    interface CommonInputs.ITrainerInputWithLabel
    interface CommonInputs.ITrainerInput
    interface ILearningPipelineItem
Public NotInheritable Class AveragedPerceptronBinaryClassifier
Implements CommonInputs.ITrainerInputWithLabel, ILearningPipelineItem


Perceptron is a classification algorithm that makes its predictions based on a linear function. That is, for an instance with feature values f_0, f_1, ..., f_{D-1}, the prediction is given by the sign of sigma[0, D-1] (w_i * f_i), where w_0, w_1, ..., w_{D-1} are the weights computed by the algorithm.
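The prediction rule above can be sketched in a few lines of C#. This is an illustrative sketch only, not the trainer's actual implementation: the class name LinearScoreSketch and its methods are hypothetical, and mapping a score of exactly 0 to the negative class is an assumption, since the description does not say how ties are handled.

```csharp
using System;

// Hypothetical helper for illustration; not part of Microsoft.ML.
public static class LinearScoreSketch
{
    // Returns the raw score: the sum over i = 0..D-1 of weights[i] * features[i].
    public static float Score(float[] weights, float[] features)
    {
        if (weights.Length != features.Length)
            throw new ArgumentException("weights and features must have the same length D.");

        float sum = 0f;
        for (int i = 0; i < weights.Length; i++)
            sum += weights[i] * features[i];
        return sum;
    }

    // Predicts +1 when the score is positive and -1 otherwise.
    // Assumption: a score of exactly 0 is treated as negative.
    public static int Predict(float[] weights, float[] features)
        => Score(weights, features) > 0f ? 1 : -1;
}
```

For example, with weights (1, -2) and features (3, 1) the score is 1, so the sketch predicts the positive class.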

Perceptron is an online algorithm, i.e., it processes the instances in the training set one at a time. The weights are initialized to 0 or to some random values. Then, for each example in the training set, the value of sigma[0, D-1] (w_i * f_i) is computed. If this value has the same sign as the label of the current example, the weights remain the same. If they have opposite signs, the weight vector is updated by adding (if the label is positive) or subtracting (if the label is negative) the feature vector of the current example, multiplied by a factor 0 < a <= 1 called the learning rate. In a generalization of this algorithm, the weights are updated by adding the feature vector multiplied by the learning rate and by the gradient of some loss function (in the specific case described above, the loss is hinge loss, whose gradient is 1 when it is non-zero).
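The update rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the trainer's actual code: the class OnlinePerceptronSketch and its Train method are hypothetical, labels are assumed to be encoded as +1 and -1, weights start at 0, and a score of exactly 0 is counted as a mistake so that training can leave the all-zero starting point.

```csharp
using System;

// Hypothetical helper for illustration; not part of Microsoft.ML.
public static class OnlinePerceptronSketch
{
    // Trains a weight vector with the mistake-driven perceptron rule.
    // labels[n] is assumed to be +1 or -1; learningRate is the factor a, with 0 < a <= 1.
    public static float[] Train(float[][] features, int[] labels, float learningRate, int numIterations)
    {
        if (features.Length == 0 || features.Length != labels.Length)
            throw new ArgumentException("features and labels must be non-empty and of equal length.");

        int d = features[0].Length;
        var w = new float[d]; // weights initialized to 0

        for (int iter = 0; iter < numIterations; iter++)
        {
            // Process the training examples one at a time.
            for (int n = 0; n < features.Length; n++)
            {
                float score = 0f;
                for (int i = 0; i < d; i++)
                    score += w[i] * features[n][i];

                // Update only when the score and the label disagree in sign
                // (assumption: a score of exactly 0 also triggers an update).
                if (score * labels[n] <= 0f)
                {
                    // Adds the feature vector for a positive label, subtracts it for a negative one.
                    for (int i = 0; i < d; i++)
                        w[i] += learningRate * labels[n] * features[n][i];
                }
            }
        }
        return w;
    }
}
```

Multiplying the feature vector by the label (+1 or -1) folds the "add or subtract" cases into a single expression.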

In Averaged Perceptron (also known as voted perceptron), each weight vector is stored together with a weight that counts the number of iterations it survived (this is equivalent to storing the weight vector after every iteration, regardless of whether it was updated or not). The prediction is then calculated by taking the weighted average of all the sums sigma[0, D-1] (w_i * f_i) of the different weight vectors.
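One way to picture the averaging step is the sketch below. It relies on the fact that the score is linear in the weights, so the weighted average of the per-vector scores equals the score computed with the averaged weight vector; keeping a running sum of the weights after every example is therefore enough. The class AveragedPerceptronSketch is hypothetical, labels are assumed to be +1 and -1, and options such as lazy updates, recency gain, or regularization are not modeled.

```csharp
using System;

// Hypothetical helper for illustration; not part of Microsoft.ML.
public static class AveragedPerceptronSketch
{
    // Returns the average of the weight vector taken after every processed example.
    // labels[n] is assumed to be +1 or -1.
    public static float[] Train(float[][] features, int[] labels, float learningRate, int numIterations)
    {
        if (features.Length == 0 || features.Length != labels.Length)
            throw new ArgumentException("features and labels must be non-empty and of equal length.");

        int d = features[0].Length;
        var w = new float[d];    // current weights, initialized to 0
        var sum = new double[d]; // running sum of the weights after each example
        long steps = 0;

        for (int iter = 0; iter < numIterations; iter++)
        {
            for (int n = 0; n < features.Length; n++)
            {
                float score = 0f;
                for (int i = 0; i < d; i++)
                    score += w[i] * features[n][i];

                // Standard perceptron update on a mistake
                // (assumption: a score of exactly 0 counts as a mistake).
                if (score * labels[n] <= 0f)
                {
                    for (int i = 0; i < d; i++)
                        w[i] += learningRate * labels[n] * features[n][i];
                }

                // Accumulate the weights whether or not they changed, so a vector
                // that survives k examples contributes k times to the average.
                for (int i = 0; i < d; i++)
                    sum[i] += w[i];
                steps++;
            }
        }

        var averaged = new float[d];
        if (steps == 0)
            return averaged; // no iterations ran: return the zero vector

        for (int i = 0; i < d; i++)
            averaged[i] = (float)(sum[i] / steps);
        return averaged;
    }
}
```

The returned vector can be used with the same sign-of-the-sum prediction rule as a plain perceptron.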

Wikipedia entry for Perceptron
Large Margin Classification Using the Perceptron Algorithm


AveragedPerceptronBinaryClassifier()


Averaged

Whether to do averaging

AveragedTolerance

The inexactness tolerance for averaging

Caching

Whether the learner should cache input training data

Calibrator

The calibrator kind to apply to the predictor. Specify null for no calibration

DecreaseLearningRate

Whether to decrease the learning rate

DoLazyUpdates

Instead of updating averaged weights on every example, only update when loss is nonzero

FeatureColumn

Column to use for features

InitialWeights

Initial weights and bias, comma-separated

InitWtsDiameter

Initial weights diameter

L2RegularizerWeight

L2 regularization weight

LabelColumn

Column to use for labels

LearningRate

Learning rate

LossFunction

Loss function

MaxCalibrationExamples

The maximum number of examples to use when training the calibrator

NormalizeFeatures

Normalize option for the feature column

NumIterations

Number of iterations

RecencyGain

Extra weight given to more recent updates

RecencyGainMulti

Whether the recency gain is multiplicative (vs. additive)

ResetWeightsAfterXExamples

Number of examples after which weights will be reset to the current average

Shuffle

Whether to shuffle for each training iteration

StreamingCacheSize

Size of cache when trained in Scope

TrainingData

The data to be used for training


ApplyStep(ILearningPipelineStep, Experiment)
GetInputData()
