Microsoft.ML.Data Namespace

Namespace containing data loading and saving, data schema definitions, and model training metrics components.

Classes

AnomalyDetectionMetrics

Evaluation results for anomaly detection (unsupervised learning algorithms).

AnomalyPredictionTransformer<TModel>

Base class for the ISingleFeaturePredictionTransformer<TModel> working on anomaly detection tasks.

BinaryClassificationMetrics

Evaluation results for binary classifiers, excluding probabilistic metrics.

BinaryClassificationMetricsStatistics

The BinaryClassificationMetricsStatistics class holds summary statistics over multiple observations of BinaryClassificationMetrics.

BinaryPrecisionRecallDataPoint

This class represents one data point on the precision-recall curve for binary classification.

BinaryPredictionTransformer<TModel>

Base class for the ISingleFeaturePredictionTransformer<TModel> working on binary classification tasks.

BooleanDataViewType

The standard boolean type. Its representation type is Boolean. Only one instance of this type exists, accessible through the static singleton property Instance.

CalibratedBinaryClassificationMetrics

Evaluation results for binary classifiers, including probabilistic metrics.

ClusteringMetrics

The metrics generated after evaluating the clustering predictions.

ClusteringPredictionTransformer<TModel>

Base class for the ISingleFeaturePredictionTransformer<TModel> working on clustering tasks.

ColumnConcatenatingTransformer

ITransformer resulting from fitting a ColumnConcatenatingEstimator.

ColumnCursorExtensions

Extension methods for extracting the values of a single column of an IDataView as an IEnumerable<T>.

ColumnNameAttribute

Allows a member to specify its IDataView column name directly, as opposed to the default behavior of using the member name as the column name.

CompositeDataLoader<TSource,TLastTransformer>

This class represents a data loader that applies a transformer chain after loading. It also has methods to save itself to a repository.

CompositeLoaderEstimator<TSource,TLastTransformer>

An estimator class for composite data loader. It can be used to build a 'trainable smart data loader', although this pattern is not very common.

ConfusionMatrix

Represents the confusion matrix of the classification results.

DataDebuggerPreview

This class represents an eager 'preview' of an IDataView.

DataDebuggerPreview.ColumnInfo

DataDebuggerPreview.RowInfo

DataViewType

This is the abstract base class for all types in the IDataView type system.

DataViewTypeAttribute

DataViewTypeAttribute should be used to decorate class properties and fields if instances of that class will be loaded as an ML.NET IDataView. The Register() function is called to register a DataViewType for a Type together with its attributes. Whenever a value is typed as the registered Type with those attributes, that value's type in the IDataView is the associated DataViewType.

DataViewTypeManager

A singleton class for managing the map between ML.NET DataViewType and C# Type. To support a custom column type in an IDataView, the column's underlying type (e.g., a C# class's type) should be registered with a class derived from DataViewType.

DateTimeDataViewType

The standard date time type. Its representation type is DateTime. Only one instance of this type exists, accessible through the static singleton property Instance.

DateTimeOffsetDataViewType

The standard date time offset type. Its representation type is DateTimeOffset. Only one instance of this type exists, accessible through the static singleton property Instance.

EstimatorChain<TLastTransformer>

Represents a chain (potentially empty) of estimators that ends with a TLastTransformer. If the chain is empty, TLastTransformer is always ITransformer.

FileHandleSource

Wraps an IFileHandle as an IMultiStreamSource.

ImageLoadingEstimator

IEstimator<TTransformer> for the ImageLoadingTransformer.

ImageLoadingTransformer

ITransformer resulting from fitting an ImageLoadingEstimator.

KeyCount

Defines the cardinality, or count, of valid values of a KeyDataViewType column. This needs to be strictly positive. It is used by TextLoader and TypeConvertingEstimator.

KeyDataViewType

This type is for data representing some enumerated value. This is an enumeration over a defined, known cardinality set, as expressed through Count. The underlying .NET type is one of the unsigned integer types. Most commonly this will be UInt32, but could alternately be Byte, UInt16, or UInt64. Despite this, the information is not inherently numeric, so, typically, arithmetic is not meaningful. For example, in multi-class classification, the label is typically a class number which is naturally a KeyDataViewType.

Note that for data of this type, a value of 0, being the default value of the representation type, indicates the missing value since it would not be sensible for the default value to correspond to any one particular specific value of the set. The first non-missing value for the enumeration of the set is always 1.

For instance, if you had a key value with a Count of 3, then the UInt32 value 0 would correspond to the missing key value, the values 1, 2, and 3 would be the valid values, and no other values should in principle be used.

Note that in intended usage and structure, this is quite close to so-called "factor variables" in R.
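The encoding described above (0 for missing, 1 through Count for valid values) can be illustrated with a small language-neutral sketch. This is Python written only to show the arithmetic, not ML.NET code; the function names key_to_index and index_to_key are hypothetical, and raising an error for out-of-range raw values is a choice made for this sketch rather than documented library behavior.

```python
from typing import Optional


def key_to_index(raw: int, count: int) -> Optional[int]:
    """Map a raw key value to a zero-based index into the set of valid values.

    Returns None for the missing value (raw == 0).
    """
    if raw == 0:
        return None  # 0 is the default of the representation type: missing
    if 1 <= raw <= count:
        return raw - 1  # the first non-missing key value is always 1
    # Assumption for this sketch: reject values outside 0..count.
    raise ValueError(f"raw key {raw} is outside 0..{count}")


def index_to_key(index: Optional[int], count: int) -> int:
    """Inverse mapping: zero-based index (or None for missing) to raw key value."""
    if index is None:
        return 0
    if not 0 <= index < count:
        raise ValueError(f"index {index} is outside 0..{count - 1}")
    return index + 1


if __name__ == "__main__":
    # With a Count of 3, raw values 1, 2, 3 are valid and 0 is missing.
    for raw in range(4):
        print(raw, "->", key_to_index(raw, count=3))
```

Keeping 0 reserved for the missing value means a freshly zero-initialized buffer of keys reads as "all missing" rather than as a valid class.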

KeyTypeAttribute

Allows a member to be marked as a KeyDataViewType.

LoadColumnAttribute

Allows a member to specify its mapping to one or more fields in a text file. To override the name of the IDataView column, use ColumnNameAttribute.

MetricStatistics

The MetricStatistics class computes summary statistics over multiple observations of a metric.

MulticlassClassificationMetrics

Evaluation results for multi-class classification trainers.

MulticlassClassificationMetricsStatistics

The MulticlassClassificationMetricsStatistics class holds summary statistics over multiple observations of MulticlassClassificationMetrics.

MulticlassPredictionTransformer<TModel>

Base class for the ISingleFeaturePredictionTransformer<TModel> working on multi-class classification tasks.

MultiFileSource

Wraps a potentially compound path as an IMultiStreamSource.

NoColumnAttribute

Marks this member as not being exposed as an IDataView column in the DataViewSchema.

NumberDataViewType

The standard number type. This class is not directly instantiable. All allowed instances of this type are singletons, and are accessible as static properties on this class.

OneToOneTransformerBase

Base class for transformers that operate on pairs of input and output columns.

PredictionTransformerBase<TModel>

Base class for transformers with no feature column, or with more than one feature column.

PrimitiveDataViewType

The abstract base class for all primitive types. Values of these types can be freely copied without concern for ownership, mutation, or disposing.

RankingMetrics

Evaluation results for rankers.

RankingMetricsStatistics

The RankingMetricsStatistics class holds summary statistics over multiple observations of RankingMetrics.

RankingPredictionTransformer<TModel>

Base class for the ISingleFeaturePredictionTransformer<TModel> working on ranking tasks.

RegressionMetrics

Evaluation results for regression algorithms (supervised learning algorithms).

RegressionMetricsStatistics

The RegressionMetricsStatistics class holds summary statistics over multiple observations of RegressionMetrics.

RegressionPredictionTransformer<TModel>

Base class for the ISingleFeaturePredictionTransformer<TModel> working on regression tasks.

RowIdDataViewType

The RowIdDataViewType type. Its representation type is DataViewRowId. Only one instance of this type exists, accessible through the static singleton property Instance.

RowToRowTransformerBase

Base class for transformers that produce new columns but don't affect existing ones.

SchemaAnnotationsExtensions

Extension methods to facilitate easy consumption of popular contents of Annotations.

SchemaDefinition

This class defines a schema of a typed data view.

SchemaDefinition.Column

One column of the data view.

SimpleFileHandle

A simple disk-based file handle.

SingleFeaturePredictionTransformerBase<TModel>

The base class for all the transformers implementing the ISingleFeaturePredictionTransformer<TModel>. Those are all the transformers that work with one feature column.

StructuredDataViewType

The abstract base class for all non-primitive types.

TextDataViewType

The standard text type. Its representation type is ReadOnlyMemory<T> with type parameter Char. Only one instance of this type exists, accessible through the static singleton property Instance.

TextLoader

Loads a text file into an IDataView. Supports basic mapping from input columns to IDataView columns.

TextLoader.Column

Describes how an input column should be mapped to an IDataView column.

TextLoader.Options

The settings for TextLoader.

TextLoader.Range

Specifies the range of indices of input columns that should be mapped to an output column.

TimeSpanDataViewType

The standard timespan type. Its representation type is TimeSpan. Only one instance of this type exists, accessible through the static singleton property Instance.

TransformerChain<TLastTransformer>

A chain of transformers (possibly empty) that ends with a TLastTransformer. For an empty chain, TLastTransformer is always ITransformer.

TrivialEstimator<TTransformer>

The trivial implementation of IEstimator<TTransformer> that already has the transformer and returns it on every call to Fit(IDataView).

Concrete implementations still have to provide the schema propagation mechanism, since there is no easy way to infer it from the transformer.

VBufferEditor

Various methods for creating VBufferEditor<T> instances.

VectorDataViewType

The standard vector type. The representation type of this is VBuffer<T>, where the type parameter is in ItemType.

VectorTypeAttribute

Allows a member to be marked as a VectorDataViewType, primarily allowing one to set the dimensionality of the resulting array.

Structs

DataViewRowId

A structure serving as the identifier of a row of an IDataView. For datasets with millions of records, those IDs need to be unique, hence the need for such a large structure to hold the values. Those IDs are derived from the IDs of previous components of the pipeline, and dividing the structure in two (high-order and low-order bits) reduces the chances of collisions even further.

VBuffer<T>

A buffer that supports both dense and sparse representations. This is the representation type for all VectorDataViewType instances. The explicitly defined values of this vector are exposed through GetValues() and, if not dense, GetIndices().
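The relationship between the explicitly defined values, their indices, and the logical vector can be shown with a short sketch. It is written in Python purely as an illustration of the dense and sparse layouts and does not call any ML.NET API; the function name to_dense is hypothetical, and the sketch assumes a float vector whose implicit (not explicitly stored) entries are 0.0.

```python
from typing import List, Optional, Sequence


def to_dense(length: int,
             values: Sequence[float],
             indices: Optional[Sequence[int]] = None) -> List[float]:
    """Expand a dense or sparse (values, indices) pair into a full list.

    indices=None models the dense case: one explicit value per logical slot.
    Otherwise values[k] is the entry at logical position indices[k].
    """
    if indices is None:
        if len(values) != length:
            raise ValueError("a dense buffer needs exactly `length` values")
        return list(values)

    if len(indices) != len(values):
        raise ValueError("sparse buffers need one index per explicit value")

    # Assumption for this sketch: entries that are not stored are 0.0.
    dense = [0.0] * length
    for position, value in zip(indices, values):
        if not 0 <= position < length:
            raise ValueError(f"index {position} is outside 0..{length - 1}")
        dense[position] = value
    return dense


if __name__ == "__main__":
    # Sparse: logical length 5, two explicit values at positions 1 and 4.
    print(to_dense(5, [1.5, 2.0], [1, 4]))
    # Dense: every slot is explicit, so no indices are needed.
    print(to_dense(3, [1.0, 2.0, 3.0]))
```

The sparse form stores only the explicit entries, which is why the indices are needed exactly when the buffer is not dense.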

VBufferEditor<T>

An object capable of editing a VBuffer<T> by filling out Values (and Indices if the buffer is not dense).

Interfaces

IFileHandle

A file handle.

IMultiStreamSource

An interface for exposing some number of items that can be opened for reading.

IRowToRowMapper

This interface maps an input DataViewRow to an output DataViewRow. Typically, the output contains both the input columns and new columns added by the implementing class, although some implementations may return a subset of the input columns. This interface is similar to Microsoft.ML.Data.ISchemaBoundRowMapper, except it does not have any input role mappings, so to rebind, the same input column names must be used. Implementations of this interface are typically created over a defined input DataViewSchema.

Enums

DataKind

Specifies a simple data type.

SchemaDefinition.Direction

TransformerScope

This enum allows for 'tagging' the estimators (and subsequently transformers) in the chain to be used 'only for training', 'for training and evaluation', etc. The most notable example is that transformations over the label column should not be used for scoring, so their scope should be Training or TrainTest.