az ml dataset

Note

This reference is part of the ml extension for the Azure CLI (version 2.15.0 or higher). The extension will automatically install the first time you run an az ml dataset command. Learn more about extensions.

Manage Azure ML dataset assets.

Azure ML dataset assets are references to file(s) in your storage services or public URLs along with any corresponding metadata. They are not copies of your data. You can use these dataset assets to access relevant data during model training and mount or download the referenced data to your compute target.

Commands

az ml dataset create

Create a dataset asset.

az ml dataset list

List dataset assets in a workspace.

az ml dataset show

Show details for a dataset asset.

az ml dataset update

Update a dataset asset.

az ml dataset create

Create a dataset asset.

Dataset assets can be defined from files on your local machine or as references to files in cloud storage. The created dataset asset will be tracked in the workspace under the specified name and version.

To create a dataset asset from file(s) on your local machine, specify the 'local_path' field in your YAML config. Azure ML will upload these file(s) to the blob container that backs the workspace's default datastore (named 'workspaceblobstore'). The created dataset asset will then point to that uploaded data.
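As a rough sketch of the local-file case, the script below writes a small YAML spec and passes it to the create command. Only the 'local_path' field is named in this reference; the other keys ('name', 'version', 'description'), every value, and the `run` dry-run helper with its DRY_RUN variable are assumptions made for illustration. By default the helper only prints the command, so nothing is sent to Azure unless DRY_RUN=0 is set.

```shell
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Write a minimal spec. Only 'local_path' is named in this reference;
# the other keys and all values are illustrative assumptions.
cat > data.yml <<'EOF'
name: my-data
version: 1
description: Sample files uploaded from the local machine
local_path: ./sample-data
EOF

# Create the dataset asset from the spec file.
run az ml dataset create --file data.yml \
  --resource-group my-resource-group \
  --workspace-name my-workspace
```

Keeping the spec in a file makes the asset definition easy to review and version alongside the data preparation code.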

To create a dataset asset that references file(s) in cloud storage, specify the 'datastore' that corresponds to the storage service and the 'path' to the file(s) in storage in your YAML config.

You can also create a dataset asset directly from a storage URL or public URL. To do so, set the 'path' field in your YAML config to that URL.

az ml dataset create --resource-group
                     --workspace-name
                     [--description]
                     [--file]
                     [--local-path]
                     [--name]
                     [--paths]
                     [--set]
                     [--tags]
                     [--version]

Examples

Create a dataset asset from a YAML specification file

az ml dataset create --file data.yml --resource-group my-resource-group --workspace-name my-workspace

Required Parameters

--resource-group -g

Name of resource group. You can configure the default group using az configure --defaults group=<name>.

--workspace-name -w

Name of the Azure ML workspace. You can configure the default workspace using az configure --defaults workspace=<name>.
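The two defaults described above can be stored in one call, after which later commands may omit both required parameters. This is a minimal sketch: the resource group and workspace names are placeholders, and the `run` helper with its DRY_RUN variable is a hypothetical wrapper that only prints each command unless DRY_RUN=0 is set.

```shell
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Store both defaults once; 'group' and 'workspace' are the keys named above.
run az configure --defaults group=my-resource-group workspace=my-workspace

# With the defaults in place, the required parameters can be left out.
run az ml dataset list
```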

Optional Parameters

--description -d

Description of the dataset.

--file -f

Local path to the YAML file containing the Azure ML dataset specification.

--local-path -l

Path to the local file or folder from which to create the dataset.

--name -n

Name of the dataset.

--paths -p

Path of data in supported URI formats to create the dataset. Examples: 'folder:azureml://datastores/mydatastore/paths/path_to_data/', 'file:azureml://datastores/mydatastore/paths/path_to_data/myfile.csv'.

--set

Update an object by specifying a property path and value to set. Example: --set property1.property2=<value>.

--tags

Space-separated tags for the dataset.

--version -v

Version of the dataset.
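The optional parameters above suggest that a dataset asset can also be described entirely on the command line. The sketch below assumes that --name, --version, and either --paths or --local-path can be combined without a YAML file; the asset names, the datastore name 'mydatastore', and the local folder are placeholders. The `run` helper and its DRY_RUN variable are hypothetical and only print each command unless DRY_RUN=0 is set.

```shell
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
# The printed form drops quoting, so treat it as a preview, not as input.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Reference a folder on an existing datastore, using the 'folder:' URI form
# shown for --paths ('mydatastore' is a placeholder).
run az ml dataset create --name my-cloud-data --version 1 \
  --description "Folder referenced on a datastore" \
  --paths "folder:azureml://datastores/mydatastore/paths/path_to_data/" \
  --resource-group my-resource-group --workspace-name my-workspace

# Upload a local folder instead ('./sample-data' is a placeholder).
run az ml dataset create --name my-local-data --version 1 \
  --local-path ./sample-data \
  --resource-group my-resource-group --workspace-name my-workspace
```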

az ml dataset list

List dataset assets in a workspace.

az ml dataset list --resource-group
                   --workspace-name
                   [--max-results]
                   [--name]

Examples

List all the dataset assets in a workspace

az ml dataset list --resource-group my-resource-group --workspace-name my-workspace

List all the dataset asset versions for the specified name in a workspace

az ml dataset list --name my-data --resource-group my-resource-group --workspace-name my-workspace

List all the dataset assets in a workspace, using the --query argument to run a JMESPath query on the command results

az ml dataset list --query "[].{Name:name}" --output table --resource-group my-resource-group --workspace-name my-workspace

Required Parameters

--resource-group -g

Name of resource group. You can configure the default group using az configure --defaults group=<name>.

--workspace-name -w

Name of the Azure ML workspace. You can configure the default workspace using az configure --defaults workspace=<name>.

Optional Parameters

--max-results -r

Max number of results to return.

--name -n

Name of the data asset. If provided, all the data versions under this name will be returned.
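The two optional parameters can be combined to cap how many versions of one named asset are returned. The following is a sketch only: the asset name and the limit of 5 are arbitrary, and the `run` helper with its DRY_RUN variable is a hypothetical wrapper that prints the command unless DRY_RUN=0 is set.

```shell
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Return at most five versions registered under the name 'my-data'.
run az ml dataset list --name my-data --max-results 5 \
  --resource-group my-resource-group --workspace-name my-workspace
```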

az ml dataset show

Show details for a dataset asset.

az ml dataset show --name
                   --resource-group
                   --workspace-name
                   [--label]
                   [--version]

Examples

Show details for a dataset asset with the specified name and version

az ml dataset show --name my-data --version 1 --resource-group my-resource-group --workspace-name my-workspace

Required Parameters

--name -n

Name of the data asset.

--resource-group -g

Name of resource group. You can configure the default group using az configure --defaults group=<name>.

--workspace-name -w

Name of the Azure ML workspace. You can configure the default workspace using az configure --defaults workspace=<name>.

Optional Parameters

--label -l

Label of the data asset.

--version -v

Version of the data asset.
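In scripts it is often convenient to read a single property rather than the full output. The sketch below pairs the show command with the global --query and --output arguments to print only the description. It assumes the returned object exposes a 'description' property, as implied by the update command; the `run` helper and its DRY_RUN variable are hypothetical and only print the command unless DRY_RUN=0 is set.

```shell
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Print only the description of version 1 of 'my-data' as plain text.
run az ml dataset show --name my-data --version 1 \
  --query description --output tsv \
  --resource-group my-resource-group --workspace-name my-workspace
```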

az ml dataset update

Update a dataset asset.

Only the 'description' and 'tags' properties can be updated.

az ml dataset update --resource-group
                     --workspace-name
                     [--add]
                     [--force-string]
                     [--label]
                     [--name]
                     [--remove]
                     [--set]
                     [--version]
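Because only 'description' and 'tags' can be changed, a typical update sets one of them through --set. The sketch below assumes that 'description' is addressable as a top-level property path; the asset name, version, and new text are placeholders. The `run` helper and its DRY_RUN variable are hypothetical and only print the command unless DRY_RUN=0 is set.

```shell
# Hypothetical dry-run helper: prints the command unless DRY_RUN=0 is set.
# The printed form drops quoting, so treat it as a preview, not as input.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Replace the description of version 1 of 'my-data'.
run az ml dataset update --name my-data --version 1 \
  --set description="Cleaned training split" \
  --resource-group my-resource-group --workspace-name my-workspace
```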

Required Parameters

--resource-group -g

Name of resource group. You can configure the default group using az configure --defaults group=<name>.

--workspace-name -w

Name of the Azure ML workspace. You can configure the default workspace using az configure --defaults workspace=<name>.

Optional Parameters

--add

Add an object to a list of objects by specifying a path and key value pairs. Example: --add property.listProperty <key=value, string or JSON string>.

--force-string

When using 'set' or 'add', preserve string literals instead of attempting to convert to JSON.

--label -l

Label of the data asset.

--name -n

Name of the data asset.

--remove

Remove a property or an element from a list. Example: --remove property.list OR --remove propertyToRemove.

--set

Update an object by specifying a property path and value to set. Example: --set property1.property2=<value>.

--version -v

Version of the data asset.