Use blob index tags to manage and find data with Python

This article shows how to use blob index tags to manage and find data using the Azure Storage client library for Python.

To learn about setting blob index tags using asynchronous APIs, see Set blob index tags asynchronously.

Prerequisites

About blob index tags

Blob index tags categorize data in your storage account using key-value tag attributes. These tags are automatically indexed and exposed as a searchable multi-dimensional index, which makes data easier to find. This article shows you how to set and get blob index tags, and how to use them to find data.

Blob index tags aren't supported for storage accounts with hierarchical namespace enabled. To learn more about the blob index tag feature along with known issues and limitations, see Manage and find Azure Blob data with blob index tags.

Set tags

To set index tags, your code must be authorized to access blob data. For the supported authorization mechanisms, see Setting blob index tags.

You can set tags by using the set_blob_tags method of the blob client:

The tags you pass to this method replace all existing tags on the blob. To preserve existing values, retrieve them first and include them in the call. The following example shows how to set tags:

def set_blob_tags(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")

    # Get any existing tags for the blob if they need to be preserved
    tags = blob_client.get_blob_tags()

    # Add or modify tags
    updated_tags = {'Sealed': 'false', 'Content': 'image', 'Date': '2022-01-01'}
    tags.update(updated_tags)

    blob_client.set_blob_tags(tags)
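
Because the set operation replaces the whole tag set, it can help to isolate the merge step in a small pure function that is easy to test without a storage account. The following is a minimal sketch, not part of the client library: merge_tags and MAX_TAGS_PER_BLOB are hypothetical names, and the limit of 10 tags per blob is an assumption you should check against the current service limits.

```python
from typing import Dict, Mapping

# Assumed service limit on tags per blob; verify against the current documentation.
MAX_TAGS_PER_BLOB = 10


def merge_tags(existing: Mapping[str, str], updates: Mapping[str, str]) -> Dict[str, str]:
    """Return a new dict with `updates` applied on top of `existing`.

    Neither input is modified, so the original tags stay available if the
    later call to set_blob_tags fails.
    """
    merged = dict(existing)
    merged.update(updates)
    if len(merged) > MAX_TAGS_PER_BLOB:
        raise ValueError(
            f"{len(merged)} tags exceed the assumed limit of {MAX_TAGS_PER_BLOB} per blob"
        )
    return merged


existing = {"Project": "Contoso", "Sealed": "true"}
updates = {"Sealed": "false", "Content": "image"}
print(merge_tags(existing, updates))
```

With a real blob client, the result would be passed along as blob_client.set_blob_tags(merge_tags(blob_client.get_blob_tags(), updates)).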

You can delete all tags by passing an empty dict object into the set_blob_tags method:

def clear_blob_tags(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")

    # Pass in empty dict object to clear tags
    tags = dict()
    blob_client.set_blob_tags(tags)
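
Removing only some tags follows the same replace-everything pattern: read the current tags, drop the unwanted keys, and write back what remains. The helper below is a hedged sketch of that filtering step; remove_tags is a hypothetical name and not a client library method.

```python
from typing import Dict, Iterable, Mapping


def remove_tags(existing: Mapping[str, str], keys_to_remove: Iterable[str]) -> Dict[str, str]:
    """Return a copy of `existing` without the given keys.

    Keys that aren't present are ignored. Tag keys are compared exactly,
    so 'sealed' and 'Sealed' are treated as different keys.
    """
    drop = set(keys_to_remove)
    return {key: value for key, value in existing.items() if key not in drop}


current = {"Sealed": "false", "Content": "image", "Date": "2022-01-01"}
print(remove_tags(current, ["Sealed", "Missing"]))
```

With a real blob client, the remaining tags would be written back with blob_client.set_blob_tags(remove_tags(blob_client.get_blob_tags(), ["Sealed"])).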

Get tags

To get index tags, your code must be authorized to access blob data. For the supported authorization mechanisms, see Getting and listing blob index tags.

You can get tags by using the get_blob_tags method of the blob client:

The following example shows how to retrieve and iterate over the blob's tags:

def get_blob_tags(self, blob_service_client: BlobServiceClient, container_name):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")

    tags = blob_client.get_blob_tags()
    print("Blob tags: ")
    for k, v in tags.items():
        print(k, v)

Filter and find data with blob index tags

To find and filter data using index tags, your code must be authorized to access blob data. For the supported authorization mechanisms, see Finding data using blob index tags.

Note

You can't use index tags to retrieve previous versions. Tags for previous versions aren't passed to the blob index engine. For more information, see Conditions and known issues.

You can find data by using the find_blobs_by_tags method of the container client:

The following example finds and lists all blobs tagged as an image:

def find_blobs_by_tags(self, blob_service_client: BlobServiceClient, container_name):
    container_client = blob_service_client.get_container_client(container=container_name)

    query = "\"Content\"='image'"
    blob_list = container_client.find_blobs_by_tags(filter_expression=query)
    
    print("Blobs tagged as images")
    for blob in blob_list:
        print(blob.name)
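
Hand-writing filter strings with nested quotes is error-prone, so a small builder can help when matching on several tags at once. The following is a sketch under stated assumptions: build_tag_filter is a hypothetical helper, it only produces equality conditions joined with AND, and it rejects quote characters rather than attempting to escape them.

```python
from typing import Mapping


def build_tag_filter(tags: Mapping[str, str]) -> str:
    """Build a filter expression that matches blobs having all the given tags.

    Keys are wrapped in double quotes and values in single quotes, the same
    shape as the query string in the example above.
    """
    if not tags:
        raise ValueError("at least one tag condition is required")
    conditions = []
    for key, value in tags.items():
        # Assumption: reject quote characters instead of guessing at escaping rules.
        if '"' in key or "'" in value:
            raise ValueError(f"unsupported quote character in tag {key!r}")
        conditions.append(f"\"{key}\"='{value}'")
    return " AND ".join(conditions)


print(build_tag_filter({"Content": "image", "Sealed": "false"}))
```

The resulting string can be passed as the filter_expression argument in the example above.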

Set blob index tags asynchronously

The Azure Blob Storage client library for Python supports working with blob index tags asynchronously. To learn more about project setup requirements, see Asynchronous programming.

Follow these steps to set blob index tags using asynchronous APIs:

  1. Add the following import statements:

    import asyncio
    
    from azure.identity.aio import DefaultAzureCredential
    from azure.storage.blob.aio import BlobServiceClient
    
  2. Add code to run the program using asyncio.run. This function runs the passed coroutine, main() in this example, and manages the asyncio event loop. Coroutines are declared with the async/await syntax. In this example, the main() coroutine first creates the top-level BlobServiceClient using async with, then calls the method that sets the blob index tags. Only the top-level client needs to use async with, because other clients created from it share the same connection pool.

    async def main():
        sample = BlobSamples()
    
        # TODO: Replace <storage-account-name> with your actual storage account name
        account_url = "https://<storage-account-name>.blob.core.windows.net"
        credential = DefaultAzureCredential()
    
        async with BlobServiceClient(account_url, credential=credential) as blob_service_client:
            await sample.set_blob_tags(blob_service_client, "sample-container")
    
    if __name__ == '__main__':
        asyncio.run(main())
    
  3. Add code to set the blob index tags. The code is the same as the synchronous example, except that the method is declared with the async keyword and the await keyword is used when calling the get_blob_tags and set_blob_tags methods.

    async def set_blob_tags(self, blob_service_client: BlobServiceClient, container_name):
        blob_client = blob_service_client.get_blob_client(container=container_name, blob="sample-blob.txt")
    
        # Get any existing tags for the blob if they need to be preserved
        tags = await blob_client.get_blob_tags()
    
        # Add or modify tags
        updated_tags = {'Sealed': 'false', 'Content': 'image', 'Date': '2022-01-01'}
        tags.update(updated_tags)
    
        await blob_client.set_blob_tags(tags)
    

With this basic setup in place, you can implement other examples in this article as coroutines using async/await syntax.
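
As one illustration, finding blobs by tags differs slightly from the set example: the asynchronous client is expected to return an async iterable, so the results are consumed with async for instead of awaiting the call. The sketch below assumes that behavior. find_blob_names_by_tags is a hypothetical helper, and FakeContainerClient is an in-memory stand-in included only so the sketch runs without a storage account.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, List


async def find_blob_names_by_tags(container_client: Any, query: str) -> List[str]:
    """Collect the names of blobs whose index tags match `query`.

    `container_client` is assumed to behave like the asynchronous container
    client: find_blobs_by_tags returns an async iterable of objects that
    have a `name` attribute.
    """
    names: List[str] = []
    # The call itself isn't awaited; each result is fetched by `async for`.
    async for blob in container_client.find_blobs_by_tags(filter_expression=query):
        names.append(blob.name)
    return names


class FakeContainerClient:
    """Hypothetical stand-in for offline runs; not part of the client library."""

    def find_blobs_by_tags(self, filter_expression: str):
        async def results():
            for name in ("photo-1.png", "photo-2.png"):
                yield SimpleNamespace(name=name)

        return results()


if __name__ == "__main__":
    query = "\"Content\"='image'"
    print(asyncio.run(find_blob_names_by_tags(FakeContainerClient(), query)))
```

In the main() coroutine from step 2, the stand-in would be replaced by blob_service_client.get_container_client("sample-container").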

Resources

To learn more about how to use index tags to manage and find data using the Azure Blob Storage client library for Python, see the following resources.

REST API operations

The Azure SDK for Python contains libraries that build on top of the Azure REST API, allowing you to interact with REST API operations through familiar Python paradigms. The client library methods for managing and using blob index tags use the following REST API operations: Get Blob Tags, Set Blob Tags, and Find Blobs by Tags.
