dataset module

Manages the interaction with Azure Machine Learning Datasets.

This module provides functionality for consuming raw data, managing data, and performing actions on data in Azure Machine Learning. Use the Dataset class in this module to create datasets, together with the functionality in the data package, which contains the supporting classes FileDataset and TabularDataset.

To get started with datasets, see the article Add & register datasets.



Dataset class

Represents a resource for exploring, transforming, and managing data in Azure Machine Learning.

A Dataset is a reference to data in a Datastore. The following Dataset types are supported:

  • TabularDataset represents data in a tabular format created by parsing the provided file or list of files.

  • FileDataset references single or multiple files in datastores or from public URLs.
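The two dataset types above might be created as in the following minimal sketch. It assumes the azureml-core package is installed and that a Datastore object is already available; the helper name build_datasets and its parameters csv_path and files_glob are hypothetical and exist only for illustration.

```python
def build_datasets(datastore, csv_path, files_glob):
    """Return (TabularDataset, FileDataset) that reference data in `datastore`.

    `build_datasets`, `csv_path`, and `files_glob` are illustrative names,
    not part of the SDK.
    """
    # Imported inside the function so the sketch can be read (and the helper
    # defined) on a machine where azureml-core is not installed.
    from azureml.core import Dataset

    # TabularDataset: parses the delimited file(s) into a tabular format.
    tabular = Dataset.Tabular.from_delimited_files(path=[(datastore, csv_path)])

    # FileDataset: references the matching files without parsing them.
    files = Dataset.File.from_files(path=[(datastore, files_glob)])

    return tabular, files
```

A call could look like build_datasets(datastore, "data/train.csv", "images/**"), where both paths are placeholders relative to the datastore root. Neither dataset copies the underlying data; each one only records where it lives.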

You can explore data in a Dataset with summary statistics and transform it using intelligent transforms. When you are ready to use the data for training, you can save the Dataset to your Azure Machine Learning workspace as a versioned Dataset.
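Saving a Dataset to the workspace as a versioned Dataset might look like the sketch below. It assumes `dataset` is an existing TabularDataset or FileDataset and `workspace` is an azureml.core Workspace; the helper name register_new_version is hypothetical.

```python
def register_new_version(workspace, dataset, name, description=None):
    """Register `dataset` in `workspace` under `name` and return the result.

    `register_new_version` is an illustrative helper, not part of the SDK.
    """
    # create_new_version=True asks the service to add a new version under
    # the same name when that name is already registered, instead of failing.
    return dataset.register(
        workspace=workspace,
        name=name,
        description=description,
        create_new_version=True,
    )
```

A registered dataset can later be retrieved by name with Dataset.get_by_name(workspace, name), which returns the latest version unless a specific version is requested.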

To get started with datasets, see the article Add & register datasets, or see the notebooks and