@Brian Barbieri Thanks for the question. Could you please add more details, in particular which Azure ML SDK version you are using?
Here is the documentation on cross-validation data folds.
Using "cv_splits_indices" in AutoMLConfig
Brian Barbieri
When training a regression model with AutoMLConfig and n_cross_validations set to a plain int, I have no problems.
Now I want to use TimeSeriesSplit as the cross-validation method for training a model with AutoMLConfig. For this there is a "cv_splits_indices" argument, to which I pass a list of lists of indices like the following (produced by TimeSeriesSplit with n_splits=5):
array([[array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10]),
        array([11, 12, 13, 14])],
       [array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14]),
        array([15, 16, 17, 18])],
       [array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
               17, 18]),
        array([19, 20, 21, 22])],
       [array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
               17, 18, 19, 20, 21, 22]),
        array([23, 24, 25, 26])],
       [array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
               17, 18, 19, 20, 21, 22, 23, 24, 25, 26]),
        array([27, 28, 29, 30])]], dtype=object)
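For reference, here is a minimal sketch of how folds with these indices can be built as a plain Python list of [train, validation] pairs instead of a NumPy array with dtype=object. The 31-row dummy input and test_size=4 are assumptions inferred from the printed indices (test_size needs scikit-learn 0.24 or later); the name idxs simply matches the cell below.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Assumption: 31 rows, inferred from the highest index (30) printed above.
n_rows = 31
dummy = np.zeros((n_rows, 1))

# Assumption: test_size=4 reproduces the four-element validation folds shown.
tscv = TimeSeriesSplit(n_splits=5, test_size=4)

# Build a list of two-element lists; each element is a numpy.ndarray of row positions.
idxs = [[train_idx, valid_idx] for train_idx, valid_idx in tscv.split(dummy)]

for train_idx, valid_idx in idxs:
    print(len(train_idx), valid_idx.tolist())
```

Note that type(idxs) is list here, and each fold is itself a list holding exactly two arrays. Wrapping the result in np.array(...) would turn it back into an object array like the one printed above.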
Unfortunately when running the following cell:
import logging

from azureml.train.automl import AutoMLConfig

# idxs holds the TimeSeriesSplit indices shown above
automl_settings = {
    "iteration_timeout_minutes": 15,
    "experiment_timeout_hours": 0.3,
    "max_cores_per_iteration": -1,
    "enable_early_stopping": True,
    "primary_metric": 'normalized_root_mean_squared_error',
    "featurization": 'auto',
    "verbosity": logging.INFO,
    "cv_splits_indices": idxs
}
automl_config = AutoMLConfig(task='regression',
                             debug_log=f'automated_ml_errors_.log',
                             training_data=train,
                             validation_data=train,
                             label_column_name=y_var,
                             **automl_settings)
I receive the following error:
ConfigException: ConfigException:
Message: cv_splits_indices should be a List of List[numpy.ndarray]. Each List[numpy.ndarray] corresponds to a CV fold and should have just 2 elements: The indices for training set and for the validation set.
InnerException: None
ErrorResponse
{
    "error": {
        "code": "UserError",
        "message": "cv_splits_indices should be a List of List[numpy.ndarray]. Each List[numpy.ndarray] corresponds to a CV fold and should have just 2 elements: The indices for training set and for the validation set.",
        "details_uri": "https://aka.ms/AutoMLConfig",
        "target": "cv_splits_indices",
        "inner_error": {
            "code": "BadArgument",
            "inner_error": {
                "code": "ArgumentInvalid"
            }
        },
        "reference_code": "XXXXXXREDACTEDXXXX"
    }
}
What is going wrong here? As far as I can tell, my input has the required structure.
Thank you
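One possible reading of the error message is that AutoML expects a plain Python list of two-element lists, whereas the value printed in the question is a NumPy array with dtype=object. The sketch below is not AutoML's own validation code: check_cv_splits is a hypothetical helper that only mirrors the wording of the message. Whether converting to nested lists resolves the exception in a given SDK version is an assumption to verify.

```python
import numpy as np


def check_cv_splits(splits):
    """Hypothetical check: list of [train, validation] lists of numpy arrays."""
    if not isinstance(splits, list):
        return [f"outer container is {type(splits).__name__}, expected list"]
    problems = []
    for i, fold in enumerate(splits):
        if not isinstance(fold, list):
            problems.append(f"fold {i} is {type(fold).__name__}, expected list")
        elif len(fold) != 2:
            problems.append(f"fold {i} has {len(fold)} elements, expected 2")
        elif not all(isinstance(part, np.ndarray) for part in fold):
            problems.append(f"fold {i} contains a non-ndarray element")
    return problems


# Two folds copied from the question, stored the way the printed value is:
# a 2-D NumPy array with dtype=object whose cells are index arrays.
folds = [(np.arange(0, 11), np.arange(11, 15)),
         (np.arange(0, 15), np.arange(15, 19))]
as_object_array = np.empty((2, 2), dtype=object)
for i, (train_idx, valid_idx) in enumerate(folds):
    as_object_array[i, 0] = train_idx
    as_object_array[i, 1] = valid_idx

# Convert the object array into nested Python lists of ndarrays.
as_list = [[train_idx, valid_idx] for train_idx, valid_idx in as_object_array]

print(check_cv_splits(as_object_array))  # reports the outer container type
print(check_cv_splits(as_list))          # no problems reported
```

If the helper reports a problem for your own idxs, the nested-list conversion shown for as_list is a cheap thing to try before looking further.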
1 answer
Ramr-msft 17,616 Reputation points
2021-03-04T14:48:46.463+00:00