SgdBinaryClassifier Class

Machine Learning Hogwild Stochastic Gradient Descent Binary Classifier

Inheritance
SgdBinaryClassifier derives from:

  • nimbusml.internal.core.linear_model._sgdbinaryclassifier.SgdBinaryClassifier
  • nimbusml.base_predictor.BasePredictor
  • sklearn.base.ClassifierMixin

Constructor

SgdBinaryClassifier(normalize='Auto', caching='Auto', loss='log', l2_regularization=1e-06, number_of_threads=None, convergence_tolerance=0.0001, number_of_iterations=20, initial_learning_rate=0.01, shuffle=True, positive_instance_weight=1.0, check_frequency=None, feature=None, label=None, weight=None, **params)

Parameters

feature

see Columns.

label

see Columns.

weight

see Columns.

normalize

Specifies the type of automatic normalization used:

  • "Auto": if normalization is needed, it is performed automatically. This is the default choice.

  • "No": no normalization is performed.

  • "Yes": normalization is performed.

  • "Warn": if normalization is needed, a warning message is displayed, but normalization is not performed.

Normalization rescales disparate data ranges to a standard scale. Feature scaling ensures that the distances between data points are proportional and enables various optimization methods, such as gradient descent, to converge much faster. If normalization is performed, a MaxMin normalizer is used. It normalizes values to an interval [a, b] where -1 <= a <= 0, 0 <= b <= 1, and b - a = 1. This normalizer preserves sparsity by mapping zero to zero.
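
A minimal numpy sketch of the zero-preserving rescaling described above (not the nimbusml normalizer itself); the divide-by-range rule is an assumption chosen to satisfy the stated properties, namely a rescaled interval of length 1 and zero mapping to zero.

   import numpy as np

   # Illustrative zero-preserving rescaling: divide by the observed range so the
   # rescaled values span an interval of length 1 and zero stays exactly zero.
   def rescale(column):
       span = column.max() - column.min()
       return column if span == 0 else column / span

   parity = np.array([6.0, 1.0, 6.0, 4.0, 0.0, 3.0])
   print(rescale(parity))   # [1.0, 0.166..., 1.0, 0.666..., 0.0, 0.5]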

caching

Whether the trainer should cache the input training data.

loss

The loss function to optimize. The default is Log. Other choices are Exp, Hinge, and SmoothedHinge. For more information, please see the documentation page about losses, Loss.
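
A short, hedged sketch of selecting a different loss at construction time. The string form 'log' comes from the constructor signature above; the string 'hinge' below is an assumption about how the other losses are named, and the Loss documentation page has the authoritative names and the class-based alternative.

   from nimbusml.linear_model import SgdBinaryClassifier

   # Default: logistic (log) loss, as in the constructor signature above.
   clf_log = SgdBinaryClassifier(loss='log')

   # Assumed: hinge loss selected by its lowercase string name; see the Loss
   # documentation page for the authoritative names and the class-based form.
   clf_hinge = SgdBinaryClassifier(loss='hinge')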

l2_regularization

L2 regularization constant.

number_of_threads

Degree of lock-free parallelism. Defaults to an automatic setting based on data sparseness. Determinism is not guaranteed.

convergence_tolerance

Tolerance on the exponential moving average of the loss improvement; training is considered converged when the averaged improvement falls below this value.
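
As a rough illustration of the stopping rule described above (assumed logic, not nimbusml's internal implementation), the sketch below smooths successive loss improvements with an exponential moving average and declares convergence once the smoothed improvement falls below the tolerance; the smoothing factor is an arbitrary assumption.

   # Assumed illustration of an EMA-based convergence check; not nimbusml internals.
   def has_converged(loss_history, tolerance=1e-4, smoothing=0.9):
       ema = None
       for prev, curr in zip(loss_history, loss_history[1:]):
           improvement = prev - curr
           ema = improvement if ema is None else smoothing * improvement + (1 - smoothing) * ema
       return ema is not None and ema < tolerance

   losses = [1.0, 0.8, 0.7, 0.69995, 0.69991, 0.69988, 0.69986, 0.69985]
   print(has_converged(losses))   # True: recent improvements have stalled below the tolerance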

number_of_iterations

Maximum number of iterations; set to 1 to simulate online learning.

initial_learning_rate

Initial learning rate (only used by SGD).

shuffle

Whether to shuffle the data every epoch.

positive_instance_weight

Weight applied to instances of the positive class; useful for imbalanced data.
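
A hedged constructor sketch for imbalanced data; the feature and label names reuse the dataset from the Examples section below, and the weight 5.0 is an arbitrary illustrative value, not a recommendation.

   from nimbusml.linear_model import SgdBinaryClassifier

   # Up-weight the positive (minority) class; 5.0 is an arbitrary illustrative value,
   # and the feature/label names reuse the dataset from the Examples section below.
   clf = SgdBinaryClassifier(feature=['parity', 'edu'], label='case',
                             positive_instance_weight=5.0)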

check_frequency

Convergence check frequency (in terms of number of iterations). Default equals number of threads.

params

Additional arguments sent to compute engine.

Examples


   ###############################################################################
   # SgdBinaryClassifier
   from nimbusml import Pipeline, FileDataStream
   from nimbusml.datasets import get_dataset
   from nimbusml.feature_extraction.categorical import OneHotVectorizer
   from nimbusml.linear_model import SgdBinaryClassifier

   # data input (as a FileDataStream)
   path = get_dataset('infert').as_filepath()

   data = FileDataStream.read_csv(path)
   print(data.head())
   #    age  case education  induced  parity ... row_num  spontaneous  ...
   # 0   26     1    0-5yrs        1       6 ...       1            2  ...
   # 1   42     1    0-5yrs        1       1 ...       2            0  ...
   # 2   39     1    0-5yrs        2       6 ...       3            0  ...
   # 3   34     1    0-5yrs        2       4 ...       4            0  ...
   # 4   35     1   6-11yrs        1       3 ...       5            1  ...

   # define the training pipeline
   pipeline = Pipeline([
       OneHotVectorizer(columns={'edu': 'education'}),
       SgdBinaryClassifier(feature=['parity', 'edu'], label='case')
   ])

   # train, predict, and evaluate
   metrics, predictions = pipeline.fit(data).test(data, output_scores=True)

   # print predictions
   print(predictions.head())
   #   PredictedLabel  Probability     Score
   # 0               0     0.363427 -0.560521
   # 1               0     0.378848 -0.494439
   # 2               0     0.363427 -0.560521
   # 3               0     0.369564 -0.534088
   # 4               0     0.336350 -0.679603
   # print evaluation metrics
   print(metrics)
   #        AUC  Accuracy  Positive precision  Positive recall  ...
   # 0  0.497006  0.665323                   0                0  ...

Remarks

Stochastic Gradient Descent (SGD) is one of the most popular stochastic optimization procedures and can be integrated into several machine learning tasks to achieve state-of-the-art performance. The Hogwild SGD binary classification learner implements SGD for binary classification with support for multi-threading without any locking. If the associated optimization problem is sparse, Hogwild SGD achieves a nearly optimal rate of convergence. For a detailed reference, please refer to https://arxiv.org/pdf/1106.5730v2.pdf.
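
To make the lock-free idea concrete, here is a minimal Hogwild-style sketch in plain numpy and Python threads. It is not nimbusml's implementation: the log-loss update, learning rate, thread count, and synthetic data are all assumptions for illustration, and Python's GIL means the threads show the structure of unsynchronized updates rather than a real parallel speed-up.

   import numpy as np
   from threading import Thread

   # Toy sparse binary classification data with labels in {-1, +1}.
   rng = np.random.default_rng(0)
   n, d = 1000, 50
   X = rng.random((n, d)) * (rng.random((n, d)) < 0.1)   # ~90% zeros, loosely "sparse"
   true_w = rng.standard_normal(d)
   y = np.where(X @ true_w >= 0, 1.0, -1.0)

   w = np.zeros(d)          # shared weight vector, updated by all threads without locks
   learning_rate = 0.05

   def worker(indices, w):
       # Plain SGD with the log loss on this thread's slice of the data; the shared
       # weight vector is read and written with no synchronization (the Hogwild idea).
       for i in indices:
           xi, yi = X[i], y[i]
           margin = yi * (xi @ w)
           grad = -yi * xi / (1.0 + np.exp(margin))   # gradient of log(1 + exp(-margin))
           w -= learning_rate * grad                  # in-place, unsynchronized update

   parts = np.array_split(rng.permutation(n), 4)
   threads = [Thread(target=worker, args=(part, w)) for part in parts]
   for t in threads:
       t.start()
   for t in threads:
       t.join()

   accuracy = np.mean(np.where(X @ w >= 0, 1.0, -1.0) == y)
   print(f"training accuracy: {accuracy:.3f}")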

Reference

https://arxiv.org/pdf/1106.5730v2.pdf

Methods

decision_function

Returns score values

get_params

Get the parameters for this operator.

predict_proba

Returns probabilities

decision_function

Returns score values

decision_function(X, **params)

get_params

Get the parameters for this operator.

get_params(deep=False)

Parameters

deep
default value: False

predict_proba

Returns probabilities

predict_proba(X, **params)
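
A hedged usage sketch of decision_function and predict_proba, assuming the scikit-learn-style fit(X, y) entry point that nimbusml predictors expose (as suggested by the sklearn.base.ClassifierMixin base class above); the tiny in-memory frame and its column names are fabricated for illustration.

   import pandas as pd
   from nimbusml.linear_model import SgdBinaryClassifier

   # Small in-memory frame; the column names here are illustrative only.
   df = pd.DataFrame({'x1': [0.1, 0.9, 0.2, 0.8, 0.3, 0.7],
                      'x2': [1.0, 0.1, 0.9, 0.2, 0.8, 0.0],
                      'y':  [0, 1, 0, 1, 0, 1]})

   clf = SgdBinaryClassifier()
   clf.fit(df[['x1', 'x2']], df['y'])

   print(clf.decision_function(df[['x1', 'x2']]))   # raw scores, one per row
   print(clf.predict_proba(df[['x1', 'x2']]))       # class probabilities, one row per example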