FieldAwareFactorizationMachineTrainer
Class
Definition
The IEstimator<TTransformer> to predict a target using a field-aware factorization machine model trained using a stochastic gradient method.
public sealed class FieldAwareFactorizationMachineTrainer : Microsoft.ML.IEstimator<Microsoft.ML.Trainers.FieldAwareFactorizationMachinePredictionTransformer>
type FieldAwareFactorizationMachineTrainer = class
interface IEstimator<FieldAwareFactorizationMachinePredictionTransformer>
Public NotInheritable Class FieldAwareFactorizationMachineTrainer
Implements IEstimator(Of FieldAwareFactorizationMachinePredictionTransformer)
Remarks
Input and Output Columns
The input label column data must be Boolean. The input features column data must be a known-sized vector of Single.
This trainer outputs the following columns:
Output Column Name  Column Type  Description
Score  Single  The unbounded score that was calculated by the model.
PredictedLabel  Boolean  The predicted label, based on the sign of the score. A negative score maps to false and a positive score maps to true.
Probability  Single  The probability of having true as the label, calculated by calibrating the score. The probability value is in the range [0, 1].
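The relationship between the three output columns can be illustrated with a minimal Python sketch. It is not the library's code: the function names are hypothetical, and the logistic sigmoid is only an assumed calibrator, since the table above does not say which calibration is applied.

```python
import math


def predicted_label(score: float) -> bool:
    """Map an unbounded score to a Boolean label by its sign."""
    return score > 0.0


def sigmoid_probability(score: float) -> float:
    """ASSUMPTION: calibrate the score with a logistic sigmoid.

    The two branches avoid overflow in exp() for large |score|.
    """
    if score >= 0.0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


for score in (-2.0, 0.5):
    print(score, predicted_label(score), round(sigmoid_probability(score), 4))
```

Under this assumption a positive score always yields a probability above 0.5, so PredictedLabel and Probability agree.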
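To create this trainer, use one of the FieldAwareFactorizationMachine overloads: FieldAwareFactorizationMachine(String, String, String), FieldAwareFactorizationMachine(String[], String, String), or FieldAwareFactorizationMachine(Options).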
In contrast to other binary classifiers, which support only one feature column, a field-aware factorization machine can consume multiple feature columns. Each column is viewed as a container of some features, and such a container is called a field. Note that all feature columns must be float vectors, but their dimensions can differ. The motivation for splitting features into different fields is to model features from different distributions independently. For example, in an online game store, features created from the user profile and those created from the game profile can be assigned to two different fields.
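As an illustration of the field idea, the following Python sketch flattens several feature columns into one long vector and records which field each feature came from. The function name and the example values are hypothetical and are not part of the ML.NET API.

```python
from typing import List, Sequence, Tuple


def concatenate_fields(
    columns: Sequence[Sequence[float]],
) -> Tuple[List[float], List[int]]:
    """Flatten per-field feature vectors into one vector plus a field id per feature."""
    x: List[float] = []
    field_of: List[int] = []
    for field_id, column in enumerate(columns):
        for value in column:
            x.append(float(value))
            field_of.append(field_id)  # every feature remembers its source column
    return x, field_of


# Hypothetical example: a 3-dimensional user-profile field and a
# 2-dimensional game-profile field. The dimensions are allowed to differ.
user_features = [1.0, 0.0, 0.5]
game_features = [0.0, 1.0]
x, field_of = concatenate_fields([user_features, game_features])
print(x)         # [1.0, 0.0, 0.5, 0.0, 1.0]
print(field_of)  # [0, 0, 0, 1, 1]
```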
Trainer Characteristics
Machine learning task  Binary classification 
Is normalization required?  Yes 
Is caching required?  No 
Required NuGet in addition to Microsoft.ML  None 
Background
The factorization machine family is a powerful group of models for supervised learning problems. It was first introduced in Steffen Rendle's Factorization Machines paper in 2010. Later, one of its generalized versions, the field-aware factorization machine, became an important predictive module in recommender systems and click-through rate prediction contests. For examples, see the winning solutions in Steffen Rendle's KDD Cup 2012 (Track 1 and Track 2), and in Criteo's, Avazu's, and Outbrain's click prediction challenges on Kaggle.
Factorization machines are especially powerful when feature conjunctions are strongly correlated with the signal you want to predict. An example of a feature pair that can form an important conjunction is user ID and music ID in music recommendation. When a dataset consists only of dense numerical features, using a factorization machine is not recommended unless some featurization is performed first.
Scoring Function
A field-aware factorization machine is a scoring function that maps feature vectors from different fields to a scalar score. Assume that all $m$ feature columns are concatenated into a long feature vector $\textbf{x} \in {\mathbb R}^n$ and ${\mathcal F}(j)$ denotes the $j$-th feature's field identifier. The corresponding score is $\hat{y}\left(\textbf{x}\right) = \left\langle \textbf{w}, \textbf{x} \right\rangle + \sum_{j = 1}^n \sum_{j' = j + 1}^n \left\langle \textbf{v}_{j, {\mathcal F}(j')} , \textbf{v}_{j', {\mathcal F}(j)} \right\rangle x_j x_{j'}$, where $\left\langle \cdot, \cdot \right\rangle$ is the inner product operator, $\textbf{w} \in {\mathbb R}^n$ stores the linear coefficients, and $\textbf{v}_{j, f}\in {\mathbb R}^k$ is the $j$-th feature's representation in the $f$-th field's latent space. Note that $k$ is the latent dimension specified by the user.
The predicted label is the sign of $\hat{y}$. If $\hat{y} > 0$, this model predicts true. Otherwise, it predicts false.
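The scoring formula can be transcribed directly into a short Python sketch. It illustrates the math and is not the trainer's implementation. The names `ffm_score` and `field_of`, the nested-list layout `v[j][f]`, and all numeric values are assumptions made for this example.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    return sum(p * q for p, q in zip(a, b))


def ffm_score(
    x: Sequence[float],
    field_of: Sequence[int],
    w: Sequence[float],
    v: Sequence[Sequence[Sequence[float]]],
) -> float:
    """x: n features; field_of[j] = F(j); w: n linear coefficients;
    v[j][f]: length-k latent vector of feature j in field f's latent space."""
    n = len(x)
    score = dot(w, x)  # linear term <w, x>
    for j in range(n):
        for jp in range(j + 1, n):  # each unordered pair (j, j') once
            interaction = dot(v[j][field_of[jp]], v[jp][field_of[j]])
            score += interaction * x[j] * x[jp]
    return score


# Hypothetical toy model: n = 3 features, 2 fields, latent dimension k = 2.
x = [1.0, 2.0, 3.0]
field_of = [0, 0, 1]
w = [0.5, -1.0, 0.25]
v = [
    [[1.0, 0.0], [0.0, 1.0]],  # feature 0 in field 0 and field 1
    [[1.0, 1.0], [2.0, 0.0]],  # feature 1
    [[0.0, 2.0], [1.0, 1.0]],  # feature 2
]
score = ffm_score(x, field_of, w, v)
print(score)      # 7.25 = -0.75 (linear) + 2 + 6 + 0 (pairwise terms)
print(score > 0)  # True, so the predicted label is true
```

The double loop costs O(n^2 k) per example. Skipping features whose value is zero is a common optimization for sparse inputs and does not change the result.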
For a systematic introduction to field-aware factorization machines, please see this paper.
Training Algorithm Details
The algorithm implemented in FieldAwareFactorizationMachineTrainer is based on a stochastic gradient method. The algorithm details are described in Algorithm 3 in this online document. The minimized loss function is logistic loss, so the trained model can be viewed as a nonlinear logistic regression.
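To make the role of the logistic loss concrete, here is a minimal Python sketch of the loss, its derivative with respect to the score, and one plain stochastic gradient step on the linear coefficients only. This is not Algorithm 3 from the referenced document. The function names, the encoding of the Boolean label as +1 (true) or -1 (false), and the learning rate are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def logistic_loss(y: int, score: float) -> float:
    """log(1 + exp(-y * score)) for y in {-1, +1}, computed stably."""
    z = -y * score
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def _sigmoid(z: float) -> float:
    """Logistic sigmoid, with two branches to avoid overflow in exp()."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss_gradient_wrt_score(y: int, score: float) -> float:
    """d/ds log(1 + exp(-y * s)) = -y * sigmoid(-y * s)."""
    return -y * _sigmoid(-y * score)


def sgd_step_linear(
    w: Sequence[float], x: Sequence[float], y: int, score: float, learning_rate: float
) -> List[float]:
    """One plain SGD update of the linear coefficients: w_j -= lr * dL/ds * x_j."""
    g = loss_gradient_wrt_score(y, score)
    return [w_j - learning_rate * g * x_j for w_j, x_j in zip(w, x)]


# Hypothetical example: the label is true (+1) and the current score is 0.
print(logistic_loss(1, 0.0))                               # log(2), about 0.693
print(sgd_step_linear([0.0, 0.0], [1.0, 2.0], 1, 0.0, 0.1))  # weights move toward the label
```

The latent vectors would be updated the same way, using the derivative of the score with respect to each latent coordinate. The sketch leaves that part out for brevity.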
Check the See Also section for links to usage examples.
Methods
Fit(IDataView)
Trains and returns a FieldAwareFactorizationMachinePredictionTransformer. 
Fit(IDataView, IDataView, FieldAwareFactorizationMachineModelParameters)
Continues the training of a FieldAwareFactorizationMachineTrainer using an already trained 
GetOutputSchema(SchemaShape)
Schema propagation for transformers. Returns the output schema of the data, if the input schema is like the one provided. 
Extension Methods
WithOnFitDelegate<TTransformer>(IEstimator<TTransformer>, Action<TTransformer>)
Given an estimator, returns a wrapping object that calls a delegate once Fit(IDataView) is called. It is often important for an estimator to return information about what was fit, which is why the Fit(IDataView) method returns a specifically typed object rather than just a general ITransformer. At the same time, IEstimator<TTransformer> instances are often combined into pipelines with many objects, so you may need to build a chain of estimators via EstimatorChain<TLastTransformer> where the estimator whose transformer you want is buried somewhere in the chain. For that scenario, this method lets you attach a delegate that is called once Fit is called.
See also
 FieldAwareFactorizationMachine(BinaryClassificationCatalog+BinaryClassificationTrainers, String, String, String)
 FieldAwareFactorizationMachine(BinaryClassificationCatalog+BinaryClassificationTrainers, String[], String, String)
 FieldAwareFactorizationMachine(BinaryClassificationCatalog+BinaryClassificationTrainers, FieldAwareFactorizationMachineTrainer+Options)
 FieldAwareFactorizationMachineTrainer.Options