Microsoft.ML.Transforms.Text Namespace


Important

Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.

Namespace containing text data transformation components.

Classes

CustomStopWordsRemovingEstimator	IEstimator<TTransformer> for the CustomStopWordsRemovingTransformer.
CustomStopWordsRemovingEstimator.Options	Use a stop words remover that can remove a language-specific list of stop words (most common words) already defined in the system.
CustomStopWordsRemovingTransformer	ITransformer resulting from fitting a CustomStopWordsRemovingEstimator.
LatentDirichletAllocationEstimator	The LDA transform implements LightLDA, a state-of-the-art implementation of Latent Dirichlet Allocation.
LatentDirichletAllocationTransformer	ITransformer resulting from fitting a LatentDirichletAllocationEstimator.
LatentDirichletAllocationTransformer.ModelParameters	Provide details about the topics discovered by LightLDA.
NgramExtractingEstimator	Produces a vector of counts of n-grams (sequences of consecutive words) encountered in the input text.
NgramExtractingTransformer	ITransformer resulting from fitting an NgramExtractingEstimator.
NgramHashingEstimator	IEstimator<TTransformer> for the NgramHashingTransformer.
NgramHashingTransformer
StopWordsRemovingEstimator	IEstimator<TTransformer> for the StopWordsRemovingTransformer.
StopWordsRemovingEstimator.Options	Use a stop words remover that can remove a language-specific list of stop words (most common words) already defined in the system.
StopWordsRemovingTransformer	ITransformer resulting from fitting a StopWordsRemovingEstimator.
TextFeaturizingEstimator	An estimator that turns a collection of text documents into numerical feature vectors. The feature vectors are normalized counts of word and/or character n-grams (based on the options supplied).
TextFeaturizingEstimator.Options	Advanced options for the TextFeaturizingEstimator.
TextNormalizingEstimator	IEstimator<TTransformer> for the TextNormalizingTransformer.
TextNormalizingTransformer	ITransformer resulting from fitting a TextNormalizingEstimator.
TokenizingByCharactersEstimator	IEstimator<TTransformer> for the TokenizingByCharactersTransformer.
TokenizingByCharactersTransformer	ITransformer resulting from fitting a TokenizingByCharactersEstimator.
WordBagEstimator	IEstimator<TTransformer> for the ITransformer.
WordBagEstimator.Options	Options for how the n-grams are extracted.
WordEmbeddingEstimator	Text featurizer which converts vectors of text tokens into a numerical vector using a pre-trained embeddings model.
WordEmbeddingTransformer	ITransformer resulting from fitting a WordEmbeddingEstimator.
WordHashBagEstimator	IEstimator<TTransformer> for the ITransformer.
WordTokenizingEstimator	Tokenizes input text using specified delimiters.
WordTokenizingTransformer	ITransformer resulting from fitting a WordTokenizingEstimator.
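
Example

The estimators listed above are normally created through the mlContext.Transforms.Text catalog rather than constructed directly. The following is a minimal sketch, not a definitive implementation, of chaining several of them: text normalization, word tokenization, default stop word removal, and n-gram counting. It assumes the Microsoft.ML NuGet package is referenced. The TextData class and the column names (Normalized, Tokens, CleanTokens, TokenKeys, Ngrams) are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;

// Hypothetical input row type, used only for this illustration.
public class TextData
{
    public string Text { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // A tiny in-memory data set (made-up sentences).
        var samples = new List<TextData>
        {
            new TextData { Text = "ML.NET turns text documents into feature vectors." },
            new TextData { Text = "Stop words are the most common words in a language." },
        };
        IDataView data = mlContext.Data.LoadFromEnumerable(samples);

        // Each step creates one of the estimators from this namespace:
        //   NormalizeText          -> TextNormalizingEstimator
        //   TokenizeIntoWords      -> WordTokenizingEstimator
        //   RemoveDefaultStopWords -> StopWordsRemovingEstimator
        //   ProduceNgrams          -> NgramExtractingEstimator
        // ProduceNgrams expects a vector of keys, so the tokens are first
        // mapped to keys with MapValueToKey.
        var pipeline = mlContext.Transforms.Text.NormalizeText("Normalized", "Text")
            .Append(mlContext.Transforms.Text.TokenizeIntoWords("Tokens", "Normalized"))
            .Append(mlContext.Transforms.Text.RemoveDefaultStopWords("CleanTokens", "Tokens"))
            .Append(mlContext.Transforms.Conversion.MapValueToKey("TokenKeys", "CleanTokens"))
            .Append(mlContext.Transforms.Text.ProduceNgrams("Ngrams", "TokenKeys"));

        // Fitting the chain yields the corresponding transformers.
        ITransformer model = pipeline.Fit(data);
        IDataView transformed = model.Transform(data);

        // Print the type of the resulting n-gram count column.
        Console.WriteLine(transformed.Schema["Ngrams"].Type);
    }
}
```

When only a single feature vector is needed, mlContext.Transforms.Text.FeaturizeText (which creates a TextFeaturizingEstimator) performs a comparable sequence of steps in one call.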

Structs

LatentDirichletAllocationTransformer.ModelParameters.ItemScore
LatentDirichletAllocationTransformer.ModelParameters.WordItemScore

Interfaces

IStopWordsRemoverOptions	Defines the different types of stop words removers supported.

Enums

NgramExtractingEstimator.WeightingCriteria	A statistical measure used to evaluate how important a word is to a document in a corpus. This enumeration is serialized.
StopWordsRemovingEstimator.Language	Stopwords language. This enumeration is serialized.
TextFeaturizingEstimator.Language	Text language. This enumeration is serialized.
TextFeaturizingEstimator.NormFunction	Text vector normalizer kind.
TextNormalizingEstimator.CaseMode	Case normalization mode of text. This enumeration is serialized.
WordEmbeddingEstimator.PretrainedModelKind	Specifies which word embeddings to use.
