Use .NET to manage ACLs in Azure Data Lake Storage Gen2

This article shows you how to use .NET to get, set, and update the access control lists of directories and files.

ACL inheritance is already available for new child items that are created under a parent directory. But you can also add, update, and remove ACLs recursively on the existing child items of a parent directory without having to make these changes individually for each child item.

Package (NuGet) | Samples | Recursive ACL Sample | API reference | Gen1 to Gen2 mapping | Give Feedback

Prerequisites

  • An Azure subscription. See Get Azure free trial.

  • A storage account that has hierarchical namespace (HNS) enabled. Follow these instructions to create one.

  • Azure CLI version 2.6.0 or higher.

  • One of the following security permissions:

    • A provisioned Azure Active Directory (AD) security principal that has been assigned the Storage Blob Data Owner role in the scope of either the target container, parent resource group, or subscription.

    • Owning user of the target container or directory to which you plan to apply ACL settings. To set ACLs recursively, this includes all child items in the target container or directory.

    • Storage account key.

Set up your project

To get started, install the Azure.Storage.Files.DataLake NuGet package.

  1. Open a command window (for example, Windows PowerShell).

  2. From your project directory, install the Azure.Storage.Files.DataLake preview package by using the dotnet add package command.

    dotnet add package Azure.Storage.Files.DataLake -v 12.6.0 -s https://pkgs.dev.azure.com/azure-sdk/public/_packaging/azure-sdk-for-net/nuget/v3/index.json
    

    Then, add these using statements to the top of your code file.

    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using Azure;
    using Azure.Core;
    using Azure.Storage;
    using Azure.Storage.Files.DataLake;
    using Azure.Storage.Files.DataLake.Models;
    

Connect to the account

To use the snippets in this article, you'll need to create a DataLakeServiceClient instance that represents the storage account.

Connect by using Azure Active Directory (AD)

Note

If you're using Azure Active Directory (Azure AD) to authorize access, then make sure that your security principal has been assigned the Storage Blob Data Owner role. To learn more about how ACL permissions are applied and the effects of changing them, see Access control model in Azure Data Lake Storage Gen2.

You can use the Azure identity client library for .NET to authenticate your application with Azure AD.

After you install the Azure.Identity package, add this using statement to the top of your code file.

using Azure.Identity;

Get a client ID, a client secret, and a tenant ID. To do this, see Acquire a token from Azure AD for authorizing requests from a client application. As part of that process, you'll have to assign one of the following Azure role-based access control (Azure RBAC) roles to your security principal.

Role | ACL setting capability
Storage Blob Data Owner | All directories and files in the account.
Storage Blob Data Contributor | Only directories and files owned by the security principal.

This example creates a DataLakeServiceClient instance by using a client ID, a client secret, and a tenant ID.

public static void GetDataLakeServiceClient(ref DataLakeServiceClient dataLakeServiceClient,
    string accountName, string clientID, string clientSecret, string tenantID)
{
    // Authenticate as the Azure AD application (service principal).
    TokenCredential credential = new ClientSecretCredential(
        tenantID, clientID, clientSecret, new TokenCredentialOptions());

    // Data Lake Storage Gen2 operations use the dfs endpoint of the account.
    string dfsUri = "https://" + accountName + ".dfs.core.windows.net";

    dataLakeServiceClient = new DataLakeServiceClient(new Uri(dfsUri), credential);
}

Note

For more examples, see the Azure identity client library for .NET documentation.

Connect by using an account key

This is the easiest way to connect to an account.

This example creates a DataLakeServiceClient instance by using an account key.

public static void GetDataLakeServiceClient(ref DataLakeServiceClient dataLakeServiceClient,
    string accountName, string accountKey)
{
    StorageSharedKeyCredential sharedKeyCredential =
        new StorageSharedKeyCredential(accountName, accountKey);

    string dfsUri = "https://" + accountName + ".dfs.core.windows.net";

    dataLakeServiceClient = new DataLakeServiceClient
        (new Uri(dfsUri), sharedKeyCredential);
}
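Several examples later in this article take a DataLakeFileSystemClient rather than a DataLakeServiceClient. The following is a minimal sketch of one way to obtain it with an account key. The ConnectionExample class and the GetFileSystemClient helper are hypothetical names used only for illustration, and the sketch assumes that the account key is a valid Base64 string.

```csharp
using System;
using Azure.Storage;
using Azure.Storage.Files.DataLake;

public static class ConnectionExample
{
    // Hypothetical helper: returns a client for one container (file system)
    // in the account, authorized with the account key.
    public static DataLakeFileSystemClient GetFileSystemClient(
        string accountName, string accountKey, string containerName)
    {
        var credential = new StorageSharedKeyCredential(accountName, accountKey);
        var serviceUri = new Uri($"https://{accountName}.dfs.core.windows.net");
        var serviceClient = new DataLakeServiceClient(serviceUri, credential);

        // Creating the clients is a local operation. No request is sent
        // until you call a method such as GetAccessControlAsync.
        return serviceClient.GetFileSystemClient(containerName);
    }
}
```

You could then pass the result of ConnectionExample.GetFileSystemClient(accountName, accountKey, "my-container") to the methods shown in the following sections.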

Set ACLs

When you set an ACL, you replace the entire ACL including all of its entries. If you want to change the permission level of a security principal or add a new security principal to the ACL without affecting other existing entries, you should update the ACL instead. To update an ACL instead of replacing it, see the Update ACLs section of this article.

If you choose to set the ACL, you must add an entry for the owning user, an entry for the owning group, and an entry for all other users. To learn more about the owning user, the owning group, and all other users, see Users and identities.
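Because a set operation replaces the whole ACL, it can help to confirm that the three required entries are present before you send the request. The following is a sketch of such a check, not part of the SDK. The AclChecks class and its methods are hypothetical, and the sketch assumes that the owning user, owning group, and other entries are the access-scope entries that have no entity ID.

```csharp
using System.Collections.Generic;
using System.Linq;
using Azure.Storage.Files.DataLake.Models;

public static class AclChecks
{
    // Returns true if the list has access-scope entries for the owning user,
    // the owning group, and all other users.
    public static bool HasRequiredEntries(IEnumerable<PathAccessControlItem> acl)
    {
        List<PathAccessControlItem> entries = acl.ToList();

        return HasBaseEntry(entries, AccessControlType.User)
            && HasBaseEntry(entries, AccessControlType.Group)
            && HasBaseEntry(entries, AccessControlType.Other);
    }

    // A base entry is an access (non-default) entry without an entity ID.
    private static bool HasBaseEntry(
        List<PathAccessControlItem> entries, AccessControlType type)
    {
        return entries.Any(e =>
            !e.DefaultScope &&
            e.AccessControlType == type &&
            string.IsNullOrEmpty(e.EntityId));
    }
}
```

For example, a list parsed from user::rwx,group::r-x,other::rw- passes this check, and a list parsed from user::rwx,group::r-x does not.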

This section shows you how to:

  • Set the ACL of a directory
  • Set the ACL of a file
  • Set ACLs recursively

Set the ACL of a directory

Get the access control list (ACL) of a directory by calling the DataLakeDirectoryClient.GetAccessControlAsync method and set the ACL by calling the DataLakeDirectoryClient.SetAccessControlList method.

This example gets and sets the ACL of a directory named my-directory. The string user::rwx,group::r-x,other::rw- gives the owning user read, write, and execute permissions, gives the owning group only read and execute permissions, and gives all others read and write permission.

public async Task ManageDirectoryACLs(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory");

    // Get the current ACL and print each entry.
    PathAccessControl directoryAccessControl =
        await directoryClient.GetAccessControlAsync();

    foreach (var item in directoryAccessControl.AccessControlList)
    {
        Console.WriteLine(item.ToString());
    }

    // Replace the entire ACL with the entries in the short-form string.
    IList<PathAccessControlItem> accessControlList
        = PathAccessControlExtensions.ParseAccessControlList
        ("user::rwx,group::r-x,other::rw-");

    directoryClient.SetAccessControlList(accessControlList);
}

You can also get and set the ACL of the root directory of a container. To get the root directory, pass an empty string ("") into the DataLakeFileSystemClient.GetDirectoryClient method.
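The short-form strings used in this article, such as user::rwx,group::r-x,other::rw-, are comma-separated entries of the form type:id:permissions, with an optional default: prefix for default ACL entries. The following self-contained sketch spells out what a single entry means. It doesn't use the SDK, the AclShortForm class is a hypothetical name, and the parsing is deliberately simplified (for example, it ignores special bits such as the sticky bit).

```csharp
using System;
using System.Collections.Generic;

public static class AclShortForm
{
    // Describes one short-form ACL entry, for example "user::rwx" or
    // "default:user:<object-id>:r-x".
    public static string Describe(string entry)
    {
        string[] parts = entry.Split(':');
        bool isDefault = parts.Length == 4 && parts[0] == "default";
        int offset = isDefault ? 1 : 0;

        if (parts.Length - offset != 3 || parts[offset + 2].Length != 3)
        {
            throw new FormatException($"Unrecognized ACL entry: {entry}");
        }

        string type = parts[offset];
        string id = parts[offset + 1];
        string permissions = parts[offset + 2];

        // An empty ID means the entry applies to the owner, owning group, or others.
        string who;
        if (id.Length > 0) who = $"{type} {id}";
        else if (type == "user") who = "owning user";
        else if (type == "group") who = "owning group";
        else if (type == "other") who = "all other users";
        else who = type; // for example, "mask"

        var granted = new List<string>();
        if (permissions[0] == 'r') granted.Add("read");
        if (permissions[1] == 'w') granted.Add("write");
        if (permissions[2] == 'x') granted.Add("execute");

        string scope = isDefault ? "default" : "access";
        string list = granted.Count > 0 ? string.Join(", ", granted) : "none";
        return $"{who} ({scope} ACL): {list}";
    }
}
```

With this sketch, AclShortForm.Describe("user::rwx") returns "owning user (access ACL): read, write, execute", and AclShortForm.Describe("other::rw-") returns "all other users (access ACL): read, write".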

Set the ACL of a file

Get the access control list (ACL) of a file by calling the DataLakeFileClient.GetAccessControlAsync method and set the ACL by calling the DataLakeFileClient.SetAccessControlList method.

This example gets and sets the ACL of a file named my-file.txt. The string user::rwx,group::r-x,other::rw- gives the owning user read, write, and execute permissions, gives the owning group only read and execute permissions, and gives all others read and write permission.

public async Task ManageFileACLs(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory");

    DataLakeFileClient fileClient =
        directoryClient.GetFileClient("my-file.txt");

    // Get the current ACL and print each entry.
    PathAccessControl fileAccessControl =
        await fileClient.GetAccessControlAsync();

    foreach (var item in fileAccessControl.AccessControlList)
    {
        Console.WriteLine(item.ToString());
    }

    // Replace the entire ACL with the entries in the short-form string.
    IList<PathAccessControlItem> accessControlList
        = PathAccessControlExtensions.ParseAccessControlList
        ("user::rwx,group::r-x,other::rw-");

    fileClient.SetAccessControlList(accessControlList);
}

Set ACLs recursively

Set ACLs recursively by calling the DataLakeDirectoryClient.SetAccessControlRecursiveAsync method. Pass this method a List of PathAccessControlItem. Each PathAccessControlItem defines an ACL entry.

If you want to set a default ACL entry, then you can set the PathAccessControlItem.DefaultScope property of the PathAccessControlItem to true.

This example sets the ACL of a directory named my-parent-directory. This method accepts a boolean parameter named isDefaultScope that specifies whether to set the default ACL. That parameter is used in the constructor of the PathAccessControlItem. The entries of the ACL give the owning user read, write, and execute permissions, give the owning group only read and execute permissions, and give all others no access. The last ACL entry in this example gives a specific user with the object ID "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" read and execute permissions.

public async Task SetACLRecursively(DataLakeServiceClient serviceClient, bool isDefaultScope)
{
    DataLakeDirectoryClient directoryClient =
        serviceClient.GetFileSystemClient("my-container").
            GetDirectoryClient("my-parent-directory");

    List<PathAccessControlItem> accessControlList =
        new List<PathAccessControlItem>()
        {
            new PathAccessControlItem(AccessControlType.User,
                RolePermissions.Read |
                RolePermissions.Write |
                RolePermissions.Execute, isDefaultScope),

            new PathAccessControlItem(AccessControlType.Group,
                RolePermissions.Read |
                RolePermissions.Execute, isDefaultScope),

            new PathAccessControlItem(AccessControlType.Other,
                RolePermissions.None, isDefaultScope),

            new PathAccessControlItem(AccessControlType.User,
                RolePermissions.Read |
                RolePermissions.Execute, isDefaultScope,
                entityId: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"),
        };

    await directoryClient.SetAccessControlRecursiveAsync
        (accessControlList, null);
}

To see an example that sets ACLs recursively in batches by specifying a batch size, see the .NET sample.
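As a rough illustration of the batch-size idea, the following sketch passes an AccessControlChangeOptions object with its BatchSize property set. This is not the linked sample. The BatchedAclExample class and its method names are hypothetical, and the batch size that you pass is an arbitrary value that you would tune for your own data set.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;

public static class BatchedAclExample
{
    // Builds options that ask the client to split the operation into
    // requests that each cover at most batchSize paths.
    public static AccessControlChangeOptions CreateBatchOptions(int batchSize)
    {
        return new AccessControlChangeOptions { BatchSize = batchSize };
    }

    // Sets the ACL recursively, starting from the beginning (no continuation token).
    public static async Task<AccessControlChangeResult> SetInBatchesAsync(
        DataLakeDirectoryClient directoryClient,
        IList<PathAccessControlItem> accessControlList,
        int batchSize)
    {
        Response<AccessControlChangeResult> response =
            await directoryClient.SetAccessControlRecursiveAsync(
                accessControlList, null, CreateBatchOptions(batchSize));

        return response.Value;
    }
}
```

Smaller batches mean more requests, but they also mean that less work is repeated if the operation is interrupted and you resume it.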

Update ACLs

When you update an ACL, you modify the ACL instead of replacing the ACL. For example, you can add a new security principal to the ACL without affecting other security principals listed in the ACL. To replace the ACL instead of updating it, see the Set ACLs section of this article.

This section shows you how to:

  • Update an ACL
  • Update ACLs recursively

Update an ACL

First, get the ACL of a directory by calling the DataLakeDirectoryClient.GetAccessControlAsync method. Copy the list of ACL entries to a new List of PathAccessControlItem objects. Then locate the entry that you want to update and replace it in the list. Set the ACL by calling the DataLakeDirectoryClient.SetAccessControlList method.

This example updates the root ACL of a container by replacing the ACL entry for all other users.

public async Task UpdateDirectoryACLs(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("");

    PathAccessControl directoryAccessControl =
        await directoryClient.GetAccessControlAsync();

    // Copy the existing entries to a new list that can be modified.
    List<PathAccessControlItem> accessControlListUpdate
        = new List<PathAccessControlItem>(directoryAccessControl.AccessControlList);

    int index = -1;

    // Locate the entry for all other users.
    foreach (var item in accessControlListUpdate)
    {
        if (item.AccessControlType == AccessControlType.Other)
        {
            index = accessControlListUpdate.IndexOf(item);
            break;
        }
    }

    if (index > -1)
    {
        accessControlListUpdate[index] = new PathAccessControlItem(AccessControlType.Other,
            RolePermissions.Read |
            RolePermissions.Execute);

        directoryClient.SetAccessControlList(accessControlListUpdate);
    }
}
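The locate-and-replace step generalizes to any entry. The following sketch shows one possible helper that replaces a matching entry or appends the entry if no match exists. The AclListHelpers class and its Upsert method are hypothetical names, and the sketch assumes that two entries refer to the same principal when their type, scope, and entity ID all match.

```csharp
using System.Collections.Generic;
using Azure.Storage.Files.DataLake.Models;

public static class AclListHelpers
{
    // Replaces the entry that has the same type, scope, and entity ID as
    // newEntry. If the list has no such entry, appends newEntry instead.
    public static void Upsert(
        List<PathAccessControlItem> acl, PathAccessControlItem newEntry)
    {
        int index = acl.FindIndex(e =>
            e.AccessControlType == newEntry.AccessControlType &&
            e.DefaultScope == newEntry.DefaultScope &&
            (e.EntityId ?? "") == (newEntry.EntityId ?? ""));

        if (index >= 0)
        {
            acl[index] = newEntry;
        }
        else
        {
            acl.Add(newEntry);
        }
    }
}
```

After calling Upsert on the copied list, you would still send the whole list to the service by calling SetAccessControlList, as in the preceding example.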

Update ACLs recursively

To update an ACL recursively, create a new ACL object with the ACL entry that you want to update, and then use that object in the update ACL operation. Don't get the existing ACL; just provide the ACL entries to be updated.

Update an ACL recursively by calling the DataLakeDirectoryClient.UpdateAccessControlRecursiveAsync method. Pass this method a List of PathAccessControlItem. Each PathAccessControlItem defines an ACL entry.

If you want to update a default ACL entry, then you can set the PathAccessControlItem.DefaultScope property of the PathAccessControlItem to true.

This example updates an ACL entry with write permission. This method accepts a boolean parameter named isDefaultScope that specifies whether to update the default ACL. That parameter is used in the constructor of the PathAccessControlItem.

public async Task UpdateACLsRecursively(DataLakeServiceClient serviceClient, bool isDefaultScope)
{
    DataLakeDirectoryClient directoryClient =
        serviceClient.GetFileSystemClient("my-container").
            GetDirectoryClient("my-parent-directory");

    List<PathAccessControlItem> accessControlListUpdate =
        new List<PathAccessControlItem>()
        {
            new PathAccessControlItem(AccessControlType.User,
                RolePermissions.Read |
                RolePermissions.Write |
                RolePermissions.Execute, isDefaultScope,
                entityId: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"),
        };

    await directoryClient.UpdateAccessControlRecursiveAsync
        (accessControlListUpdate, null);
}

To see an example that updates ACLs recursively in batches by specifying a batch size, see the .NET sample.

Remove ACL entries

You can remove one or more ACL entries. This section shows you how to:

  • Remove an ACL entry
  • Remove ACL entries recursively

Remove an ACL entry

First, get the ACL of a directory by calling the DataLakeDirectoryClient.GetAccessControlAsync method. Copy the list of ACL entries to a new List of PathAccessControlItem objects. Then locate the entry that you want to remove and call the Remove method of the collection. Set the updated ACL by calling the DataLakeDirectoryClient.SetAccessControlList method.

This example updates the root ACL of a container by removing the ACL entry for the security principal that has the object ID xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.

public async Task RemoveDirectoryACLEntry
    (DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("");

    PathAccessControl directoryAccessControl =
        await directoryClient.GetAccessControlAsync();

    // Copy the existing entries to a new list that can be modified.
    List<PathAccessControlItem> accessControlListUpdate
        = new List<PathAccessControlItem>(directoryAccessControl.AccessControlList);

    PathAccessControlItem entryToRemove = null;

    // Locate the entry for the security principal.
    foreach (var item in accessControlListUpdate)
    {
        if (item.EntityId == "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
        {
            entryToRemove = item;
            break;
        }
    }

    if (entryToRemove != null)
    {
        accessControlListUpdate.Remove(entryToRemove);
        directoryClient.SetAccessControlList(accessControlListUpdate);
    }
}

Remove ACL entries recursively

To remove ACL entries recursively, create a new ACL object for the ACL entry to be removed, and then use that object in the remove ACL operation. Don't get the existing ACL; just provide the ACL entries to be removed.

Remove ACL entries by calling the DataLakeDirectoryClient.RemoveAccessControlRecursiveAsync method. Pass this method a List of RemovePathAccessControlItem. Each RemovePathAccessControlItem identifies an ACL entry to remove.

If you want to remove a default ACL entry, then specify true for the default scope when you create the RemovePathAccessControlItem.

This example removes an ACL entry from the ACL of the directory named my-parent-directory. This method accepts a boolean parameter named isDefaultScope that specifies whether to remove the entry from the default ACL. That parameter is used in the constructor of the RemovePathAccessControlItem.

public async Task RemoveACLsRecursively(DataLakeServiceClient serviceClient, bool isDefaultScope)
{
    DataLakeDirectoryClient directoryClient =
        serviceClient.GetFileSystemClient("my-container").
            GetDirectoryClient("my-parent-directory");

    List<RemovePathAccessControlItem> accessControlListForRemoval =
        new List<RemovePathAccessControlItem>()
        {
            new RemovePathAccessControlItem(AccessControlType.User, isDefaultScope,
                entityId: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"),
        };

    await directoryClient.RemoveAccessControlRecursiveAsync
        (accessControlListForRemoval, null);
}

To see an example that removes ACLs recursively in batches by specifying a batch size, see the .NET sample.

Recover from failures

You might encounter runtime or permission errors when modifying ACLs recursively. For runtime errors, restart the process from the beginning. Permission errors can occur if the security principal doesn't have sufficient permission to modify the ACL of a directory or file that is in the directory hierarchy being modified. Address the permission issue, and then choose to either resume the process from the point of failure by using a continuation token, or restart the process from the beginning. You don't have to use the continuation token if you prefer to restart from the beginning. You can reapply ACL entries without any negative impact.

This example returns a continuation token in the event of a failure. The application can call this example method again after the error has been addressed, and pass in the continuation token. If this example method is called for the first time, the application can pass in a value of null for the continuation token parameter.

public async Task<string> ResumeAsync(DataLakeServiceClient serviceClient,
    DataLakeDirectoryClient directoryClient,
    List<PathAccessControlItem> accessControlList,
    string continuationToken)
{
    try
    {
        var accessControlChangeResult =
            await directoryClient.SetAccessControlRecursiveAsync(
                accessControlList, continuationToken, null);

        if (accessControlChangeResult.Value.Counters.FailedChangesCount > 0)
        {
            continuationToken =
                accessControlChangeResult.Value.ContinuationToken;
        }

        return continuationToken;
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.ToString());
        return continuationToken;
    }

}

To see an example that sets ACLs recursively in batches by specifying a batch size, see the .NET sample.

If you want the process to complete uninterrupted by permission errors, pass in an AccessControlChangeOptions object and set the ContinueOnFailure property of that object to true.

This example sets ACL entries recursively. If this code encounters a permission error, it records that failure and continues execution. This example prints the number of failures to the console.

public async Task ContinueOnFailureAsync(DataLakeServiceClient serviceClient,
    DataLakeDirectoryClient directoryClient,
    List<PathAccessControlItem> accessControlList)
{
    var accessControlChangeResult =
        await directoryClient.SetAccessControlRecursiveAsync(
            accessControlList, null, new AccessControlChangeOptions()
            { ContinueOnFailure = true });

    var counters = accessControlChangeResult.Value.Counters;

    Console.WriteLine("Number of directories changed: " +
        counters.ChangedDirectoriesCount.ToString());

    Console.WriteLine("Number of files changed: " +
        counters.ChangedFilesCount.ToString());

    Console.WriteLine("Number of failures: " +
        counters.FailedChangesCount.ToString());
}

To see an example that sets ACLs recursively in batches by specifying a batch size, see the .NET sample.

Best practices

This section provides some best practice guidelines for setting ACLs recursively.

Handling runtime errors

A runtime error can occur for many reasons (for example, an outage or a client connectivity issue). If you encounter a runtime error, restart the recursive ACL process. ACLs can be reapplied to items without causing a negative impact.

Handling permission errors (403)

If you encounter an access control exception while running a recursive ACL process, your AD security principal might not have sufficient permission to apply an ACL to one or more of the child items in the directory hierarchy. When a permission error occurs, the process stops and a continuation token is provided. Fix the permission issue, and then use the continuation token to process the remaining dataset. The directories and files that have already been successfully processed won't have to be processed again. You can also choose to restart the recursive ACL process. ACLs can be reapplied to items without causing a negative impact.

Credentials

We recommend that you provision an Azure AD security principal that has been assigned the Storage Blob Data Owner role in the scope of the target storage account or container.

Performance

To reduce latency, we recommend that you run the recursive ACL process in an Azure Virtual Machine (VM) that is located in the same region as your storage account.

ACL limits

The maximum number of ACL entries that you can apply to a directory or file is 32 access ACL entries and 32 default ACL entries. For more information, see Access control in Azure Data Lake Storage Gen2.
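If you build ACLs programmatically, a simple client-side check can catch a list that exceeds these limits before you send it. The following is a sketch only. The AclLimits class is a hypothetical name, and the sketch assumes that the limit of 32 applies separately to access entries and to default entries, as described above.

```csharp
using System.Collections.Generic;
using System.Linq;
using Azure.Storage.Files.DataLake.Models;

public static class AclLimits
{
    // Limit stated in this article for each of the access and default ACLs.
    public const int MaxEntriesPerAcl = 32;

    // Returns true if neither the access entries nor the default entries
    // in the list exceed the limit.
    public static bool IsWithinLimits(IEnumerable<PathAccessControlItem> acl)
    {
        List<PathAccessControlItem> entries = acl.ToList();
        int defaultCount = entries.Count(e => e.DefaultScope);
        int accessCount = entries.Count - defaultCount;

        return accessCount <= MaxEntriesPerAcl
            && defaultCount <= MaxEntriesPerAcl;
    }
}
```

The service remains the authority on what it accepts, so treat a check like this as an early warning rather than as validation.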
