ClusteringCatalog.CrossValidate Method

Definition

Runs cross-validation over numberOfFolds folds of data by fitting estimator on each training split, respecting samplingKeyColumnName if provided. Each fitted sub-model is then evaluated on its held-out fold (against labelColumnName when one is given), and the per-fold metrics are returned.

public System.Collections.Generic.IReadOnlyList<Microsoft.ML.TrainCatalogBase.CrossValidationResult<Microsoft.ML.Data.ClusteringMetrics>> CrossValidate (Microsoft.ML.IDataView data, Microsoft.ML.IEstimator<Microsoft.ML.ITransformer> estimator, int numberOfFolds = 5, string labelColumnName = default, string featuresColumnName = default, string samplingKeyColumnName = default, int? seed = default);
member this.CrossValidate : Microsoft.ML.IDataView * Microsoft.ML.IEstimator<Microsoft.ML.ITransformer> * int * string * string * string * Nullable<int> -> System.Collections.Generic.IReadOnlyList<Microsoft.ML.TrainCatalogBase.CrossValidationResult<Microsoft.ML.Data.ClusteringMetrics>>
Public Function CrossValidate (data As IDataView, estimator As IEstimator(Of ITransformer), Optional numberOfFolds As Integer = 5, Optional labelColumnName As String = Nothing, Optional featuresColumnName As String = Nothing, Optional samplingKeyColumnName As String = Nothing, Optional seed As Nullable(Of Integer) = Nothing) As IReadOnlyList(Of TrainCatalogBase.CrossValidationResult(Of ClusteringMetrics))

Parameters

data
IDataView

The data to run cross-validation on.

estimator
IEstimator<ITransformer>

The estimator to fit.

numberOfFolds
Int32

Number of cross-validation folds.

labelColumnName
String

Optional label column for evaluation (clustering tasks may not always have a label).

featuresColumnName
String

Optional features column for evaluation (needed to calculate the Davies-Bouldin Index (DBI) metric).

samplingKeyColumnName
String

Name of a column to use for grouping rows. If two examples share the same value of samplingKeyColumnName, they are guaranteed to appear in the same subset (train or test). This can be used to ensure no label leakage from the train set to the test set. If null, no row grouping is performed.

seed
Nullable<Int32>

Seed for the random number generator used to select rows for cross-validation folds.
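
The parameters above can be combined as in the following minimal sketch. It is an illustration under stated assumptions, not an official sample: the Point class, its GroupId column, and the synthetic data are hypothetical, and K-means is used only as one possible clustering estimator. A string column is used as the sampling key here; how other column types are handled may vary by ML.NET version.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class Program
{
    // Hypothetical input row: a 2-D feature vector plus a group identifier
    // that serves as the sampling key.
    public class Point
    {
        [VectorType(2)]
        public float[] Features { get; set; }

        public string GroupId { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Synthetic data (an assumption for illustration): two blobs of points,
        // with every four consecutive rows sharing one GroupId.
        var random = new Random(0);
        var points = new List<Point>();
        for (int i = 0; i < 200; i++)
        {
            float center = i % 2 == 0 ? 0f : 5f;
            points.Add(new Point
            {
                Features = new[]
                {
                    center + (float)random.NextDouble(),
                    center + (float)random.NextDouble()
                },
                GroupId = (i / 4).ToString()
            });
        }

        IDataView data = mlContext.Data.LoadFromEnumerable(points);

        // Any clustering estimator can be passed; K-means is just an example.
        var estimator = mlContext.Clustering.Trainers.KMeans(
            featureColumnName: "Features",
            numberOfClusters: 2);

        // Rows that share a GroupId stay together in the same train or test split.
        // No label column is supplied, since this data has none.
        var results = mlContext.Clustering.CrossValidate(
            data,
            estimator,
            numberOfFolds: 4,
            featuresColumnName: "Features",
            samplingKeyColumnName: "GroupId",
            seed: 1);

        foreach (var result in results)
        {
            Console.WriteLine(
                $"Fold {result.Fold}: average distance = {result.Metrics.AverageDistance}");
        }
    }
}
```

Passing a fixed seed should make the fold assignment repeatable across runs, which helps when comparing estimators on the same splits.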

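Returns

IReadOnlyList<TrainCatalogBase.CrossValidationResult<ClusteringMetrics>>

Per-fold results: metrics, models, scored datasets.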
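
The returned list holds one CrossValidationResult<ClusteringMetrics> per fold. The sketch below shows one way the per-fold results might be consumed: averaging metrics across folds and inspecting the model and scored hold-out data of a single fold. The Point class and the synthetic data are hypothetical, and choosing the fold with the lowest average distance is only an illustrative convention, not a recommendation from the library.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class Program
{
    // Hypothetical input row with a 2-D feature vector.
    public class Point
    {
        [VectorType(2)]
        public float[] Features { get; set; }
    }

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Synthetic data (an assumption for illustration): two blobs of points.
        var random = new Random(0);
        var points = Enumerable.Range(0, 200).Select(i =>
        {
            float center = i % 2 == 0 ? 0f : 5f;
            return new Point
            {
                Features = new[]
                {
                    center + (float)random.NextDouble(),
                    center + (float)random.NextDouble()
                }
            };
        }).ToList();

        IDataView data = mlContext.Data.LoadFromEnumerable(points);
        var estimator = mlContext.Clustering.Trainers.KMeans(
            featureColumnName: "Features",
            numberOfClusters: 2);

        // featuresColumnName is passed so the Davies-Bouldin Index can be computed.
        var results = mlContext.Clustering.CrossValidate(
            data,
            estimator,
            numberOfFolds: 5,
            featuresColumnName: "Features");

        // Aggregate the per-fold metrics.
        double meanDistance = results.Average(r => r.Metrics.AverageDistance);
        double meanDbi = results.Average(r => r.Metrics.DaviesBouldinIndex);
        Console.WriteLine($"Mean average distance: {meanDistance}");
        Console.WriteLine($"Mean Davies-Bouldin Index: {meanDbi}");

        // Each result also exposes the fitted model and the scored hold-out set.
        var best = results.OrderBy(r => r.Metrics.AverageDistance).First();
        ITransformer bestModel = best.Model;
        Console.WriteLine($"Fold {best.Fold} model type: {bestModel.GetType().Name}");
        foreach (var column in best.ScoredHoldOutSet.Schema)
        {
            Console.WriteLine($"Scored column: {column.Name}");
        }
    }
}
```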
