WordHashBagEstimator Class

Definition

public sealed class WordHashBagEstimator : Microsoft.ML.IEstimator<Microsoft.ML.ITransformer>
type WordHashBagEstimator = class
    interface IEstimator<ITransformer>
Public NotInheritable Class WordHashBagEstimator
Implements IEstimator(Of ITransformer)
Inheritance
WordHashBagEstimator
Implements
IEstimator<ITransformer>

Remarks

Estimator Characteristics

Does this estimator need to look at the data to train its parameters?    Yes
Input column data type                                                   Vector of Text
Output column data type                                                  Vector of known-size of Single
Exportable to ONNX                                                       No

The resulting ITransformer creates a new column, named as specified in the output column name parameter, and produces a vector of n-gram counts from the input text, where an n-gram is a sequence of n consecutive words. It does so by hashing each n-gram and using the hash value as the index in the bag.
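As a minimal sketch of the behavior described above, the following creates the estimator through the ProduceHashedWordBags method of the ML.NET text catalog, fits it, and reads back the hashed bag. The TextData and TransformedTextData classes, the column names, and the sample sentences are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

var mlContext = new MLContext();

// Hypothetical in-memory data; any IDataView with a text column would do.
var samples = new List<TextData>
{
    new TextData { Text = "the quick brown fox" },
    new TextData { Text = "the lazy dog" },
};
IDataView data = mlContext.Data.LoadFromEnumerable(samples);

// numberOfBits: 5 requests a bag with 2^5 = 32 slots.
// ngramLength: 1 counts single words (unigrams).
var estimator = mlContext.Transforms.Text.ProduceHashedWordBags(
    outputColumnName: "BagOfWords",
    inputColumnName: "Text",
    numberOfBits: 5,
    ngramLength: 1);

// Fit returns an ITransformer; Transform adds the "BagOfWords" column.
ITransformer transformer = estimator.Fit(data);
IDataView transformed = transformer.Transform(data);

var rows = mlContext.Data.CreateEnumerable<TransformedTextData>(
    transformed, reuseRowObject: false);
foreach (var row in rows)
{
    // Each slot holds the count of the n-grams whose hash maps to that index.
    Console.WriteLine($"{row.Text}: [{string.Join(", ", row.BagOfWords)}]");
}

// Hypothetical row types used only by this sketch.
public class TextData
{
    public string Text { get; set; }
}

public class TransformedTextData : TextData
{
    public float[] BagOfWords { get; set; }
}
```

Because distinct n-grams can hash to the same slot, a small numberOfBits value such as the one used here trades accuracy for a compact vector.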

WordHashBagEstimator differs from NgramHashingEstimator in that the former tokenizes text internally, while the latter takes tokenized text as input.
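To make the contrast concrete, here is a sketch of a pipeline that does the tokenization explicitly before calling ProduceHashedNgrams, which creates an NgramHashingEstimator. It assumes, as in the ML.NET samples, that the tokens must be mapped to keys first. The TextData class and column names are hypothetical, and the sketch is not claimed to assign n-grams to the same slots as WordHashBagEstimator.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

var mlContext = new MLContext();

var samples = new List<TextData>
{
    new TextData { Text = "the quick brown fox" },
    new TextData { Text = "the lazy dog" },
};
IDataView data = mlContext.Data.LoadFromEnumerable(samples);

// Step 1: split the text into words.
// Step 2: map each word to a key, the input type NgramHashingEstimator expects.
// Step 3: hash the n-grams of keys into a 2^5 = 32 slot vector.
var pipeline = mlContext.Transforms.Text.TokenizeIntoWords("Tokens", "Text")
    .Append(mlContext.Transforms.Conversion.MapValueToKey("Tokens"))
    .Append(mlContext.Transforms.Text.ProduceHashedNgrams(
        outputColumnName: "NgramFeatures",
        inputColumnName: "Tokens",
        numberOfBits: 5,
        ngramLength: 1));

IDataView transformed = pipeline.Fit(data).Transform(data);

var rows = mlContext.Data.CreateEnumerable<TransformedTextData>(
    transformed, reuseRowObject: false);
foreach (var row in rows)
{
    Console.WriteLine($"{row.Text}: {row.NgramFeatures.Length} slots");
}

// Hypothetical row types used only by this sketch.
public class TextData
{
    public string Text { get; set; }
}

public class TransformedTextData : TextData
{
    public float[] NgramFeatures { get; set; }
}
```

With WordHashBagEstimator, the three steps collapse into a single ProduceHashedWordBags call on the raw text column.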

Check the See Also section for links to usage examples.

Methods

Fit(IDataView)

Trains and returns an ITransformer.

GetOutputSchema(SchemaShape)

Schema propagation for estimators. Returns the output schema shape the estimator would produce, given an input schema shape like the one provided.

Extension Methods

AppendCacheCheckpoint<TTrans>(IEstimator<TTrans>, IHostEnvironment)

Append a 'caching checkpoint' to the estimator chain. This will ensure that the downstream estimators will be trained against cached data. It is helpful to have a caching checkpoint before trainers that take multiple data passes.
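A sketch of where a caching checkpoint might sit: after the hashed bag is produced and before a trainer that makes several passes over the data. The Review class, its values, and the choice of SdcaLogisticRegression as the downstream trainer are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

var mlContext = new MLContext(seed: 0);

// Hypothetical labeled examples.
var samples = new List<Review>
{
    new Review { Text = "great product works well", Label = true },
    new Review { Text = "really great value", Label = true },
    new Review { Text = "terrible product broke quickly", Label = false },
    new Review { Text = "really terrible support", Label = false },
};
IDataView data = mlContext.Data.LoadFromEnumerable(samples);

var pipeline = mlContext.Transforms.Text
    .ProduceHashedWordBags("Features", "Text", numberOfBits: 10)
    // Everything after this point trains against the cached featurized data,
    // so the text is not re-tokenized and re-hashed on every pass.
    .AppendCacheCheckpoint(mlContext)
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(
        labelColumnName: "Label", featureColumnName: "Features"));

ITransformer model = pipeline.Fit(data);
Console.WriteLine($"Trained model type: {model.GetType().Name}");

// Hypothetical row type used only by this sketch.
public class Review
{
    public string Text { get; set; }
    public bool Label { get; set; }
}
```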

WithOnFitDelegate<TTransformer>(IEstimator<TTransformer>, Action<TTransformer>)

Given an estimator, returns a wrapping object that will call a delegate once Fit(IDataView) is called. It is often important for an estimator to return information about what was fit, which is why the Fit(IDataView) method returns a specifically typed object rather than just a general ITransformer. At the same time, IEstimator<TTransformer> instances are often formed into pipelines with many objects, so we may need to build a chain of estimators via EstimatorChain<TLastTransformer> in which the estimator whose transformer we want is buried somewhere in the chain. For that scenario, this method lets us attach a delegate that will be called once Fit is called.
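The scenario above can be sketched as follows: the hashed-bag estimator sits at the start of a chain, and a delegate captures its fitted transformer when the whole chain is fit. The TextData class, the column names, and the NormalizeMinMax step appended afterwards are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

var mlContext = new MLContext();

var samples = new List<TextData>
{
    new TextData { Text = "the quick brown fox" },
    new TextData { Text = "the lazy dog" },
};
IDataView data = mlContext.Data.LoadFromEnumerable(samples);

// Will hold the transformer produced by the hashed-bag step alone.
ITransformer hashedBagTransformer = null;

var pipeline = mlContext.Transforms.Text
    .ProduceHashedWordBags("BagOfWords", "Text", numberOfBits: 5)
    // The delegate runs when this estimator is fit as part of the chain.
    .WithOnFitDelegate(fitted => hashedBagTransformer = fitted)
    .Append(mlContext.Transforms.NormalizeMinMax("BagOfWords"));

// Fitting the chain fits every estimator in it, which triggers the delegate.
ITransformer model = pipeline.Fit(data);

Console.WriteLine(hashedBagTransformer != null
    ? "Captured the hashed-bag transformer."
    : "Delegate was not called.");

// Hypothetical row type used only by this sketch.
public class TextData
{
    public string Text { get; set; }
}
```

WordHashBagEstimator implements IEstimator<ITransformer>, so the delegate here receives a plain ITransformer. For estimators with a more specific transformer type, the delegate receives that type instead.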

Applies to

See also