Create datastores

APPLIES TO: Azure CLI ml extension v2 (current) Python SDK azure-ai-ml v2 (current)

In this article, you learn how to connect to Azure data storage services with Azure Machine Learning datastores.

Prerequisites

Note

Machine Learning datastores do not create the underlying storage account resources. Instead, they link an existing storage account for Machine Learning use. Machine Learning datastores aren't required. If you have access to the underlying data, you can use storage URIs directly.
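For example, a supported storage URI can be passed directly as a job or data asset input, without a datastore. The following is a minimal sketch, assuming a hypothetical blob account, container, and file path:

from azure.ai.ml import Input

# Hypothetical example: reference data directly by storage URI instead of a datastore.
# The account, container, and path shown here are placeholders.
blob_input = Input(
    type="uri_file",
    path="wasbs://data-container@mytestblobstore.blob.core.windows.net/example/titanic.csv",
)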

Create an Azure Blob datastore

from azure.ai.ml.entities import AzureBlobDatastore
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Connect to the workspace defined in the local config.json file.
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Define a datastore that points to an existing blob container.
store = AzureBlobDatastore(
    name="blob_example",
    description="Datastore pointing to a blob container.",
    account_name="mytestblobstore",
    container_name="data-container",
)

ml_client.create_or_update(store)
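
After the datastore exists, you can reference files in the container through the datastore URI format, for example as a job input. The following is a minimal sketch, assuming the blob_example datastore above and a hypothetical file path:

from azure.ai.ml import Input

# Hypothetical path on the datastore created above.
data_input = Input(
    type="uri_file",
    path="azureml://datastores/blob_example/paths/example/titanic.csv",
)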

Create an Azure Data Lake Storage Gen2 datastore

from azure.ai.ml.entities import AzureDataLakeGen2Datastore
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Connect to the workspace defined in the local config.json file.
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Define a datastore that points to an existing Data Lake Storage Gen2 file system.
store = AzureDataLakeGen2Datastore(
    name="adls_gen2_example",
    description="Datastore pointing to an Azure Data Lake Storage Gen2 file system.",
    account_name="mytestdatalakegen2",
    filesystem="my-gen2-container",
)

ml_client.create_or_update(store)

Create an Azure Files datastore

from azure.ai.ml.entities import AzureFileDatastore
from azure.ai.ml.entities import AccountKeyConfiguration
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Connect to the workspace defined in the local config.json file.
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Define a datastore that points to an existing Azure file share.
# Azure file shares require account key or SAS token credentials.
store = AzureFileDatastore(
    name="file_example",
    description="Datastore pointing to an Azure file share.",
    account_name="mytestfilestore",
    file_share_name="my-share",
    credentials=AccountKeyConfiguration(
        # Placeholder value; replace with your storage account key.
        account_key="XXXxxxXXXxXXXXxxXXXXXxXXXXXxXxxXxXXXxXXXxXXxxxXXxxXXXxXxXXXxxXxxXXXXxxxxxXXxxxxxxXXXxXXX"
    ),
)

ml_client.create_or_update(store)
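
To avoid hard-coding the account key in source, you could read it from the environment or a secret store when you create the datastore. The following is a minimal sketch, assuming a hypothetical AZURE_STORAGE_ACCOUNT_KEY environment variable:

import os

from azure.ai.ml.entities import AccountKeyConfiguration

# Hypothetical environment variable that holds the storage account key.
credentials = AccountKeyConfiguration(account_key=os.environ["AZURE_STORAGE_ACCOUNT_KEY"])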

Create an Azure Data Lake Storage Gen1 datastore

from azure.ai.ml.entities import AzureDataLakeGen1Datastore
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Connect to the workspace defined in the local config.json file.
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Define a datastore that points to an existing Data Lake Storage Gen1 account.
store = AzureDataLakeGen1Datastore(
    name="adls_gen1_example",
    store_name="mytestdatalakegen1",
    description="Datastore pointing to an Azure Data Lake Storage Gen1 account.",
)

ml_client.create_or_update(store)

Create a OneLake (Microsoft Fabric) datastore (preview)

This section describes various options to create a OneLake datastore. The OneLake datastore is part of Microsoft Fabric. At this time, Machine Learning supports connection to Microsoft Fabric lakehouse artifacts that include folders or files and Amazon S3 shortcuts. For more information about lakehouses, see What is a lakehouse in Microsoft Fabric?.

OneLake datastore creation requires the following information from your Microsoft Fabric instance:

  • Endpoint
  • Fabric workspace name or GUID
  • Artifact name or GUID

The following sections describe how to find this required information in your Microsoft Fabric instance.

OneLake workspace name

In your Microsoft Fabric instance, you can find the workspace information, as shown in this screenshot. You can use either a GUID value or a "friendly name" to create a Machine Learning OneLake datastore.

Screenshot that shows Microsoft Fabric workspace details in the Microsoft Fabric UI.

OneLake endpoint

This screenshot shows how you can find endpoint information in your Microsoft Fabric instance.

Screenshot that shows Microsoft Fabric endpoint details in the Microsoft Fabric UI.

OneLake artifact name

This screenshot shows how you can find the artifact information in your Microsoft Fabric instance. The screenshot also shows how you can use either a GUID value or a friendly name to create a Machine Learning OneLake datastore.

Screenshot that shows how to get Microsoft Fabric lakehouse artifact details in the Microsoft Fabric UI.

Create a OneLake datastore

from azure.ai.ml.entities import OneLakeDatastore, OneLakeArtifact
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Connect to the workspace defined in the local config.json file.
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Define a datastore that points to a Microsoft Fabric lakehouse artifact.
store = OneLakeDatastore(
    name="onelake_example_id",
    description="Datastore pointing to a Microsoft Fabric artifact.",
    one_lake_workspace_name="AzureML_Sample_OneLakeWS",
    endpoint="msit-onelake.dfs.fabric.microsoft.com",
    artifact=OneLakeArtifact(
        name="AzML_Sample_LH",
        type="lake_house",
    ),
)

ml_client.create_or_update(store)

Next steps