Use .NET to manage directories, files, and ACLs in Azure Data Lake Storage Gen2

This article shows you how to use .NET to create and manage directories, files, and permissions in storage accounts that have hierarchical namespace (HNS) enabled.

Package (NuGet) | Samples | API reference | Gen1 to Gen2 mapping | Give Feedback

Prerequisites

  • An Azure subscription. See Get Azure free trial.
  • A storage account that has hierarchical namespace (HNS) enabled. Follow these instructions to create one.

Set up your project

To get started, install the Azure.Storage.Files.DataLake NuGet package.

For more information about how to install NuGet packages, see Install and manage packages in Visual Studio using the NuGet Package Manager.

Then, add these using statements to the top of your code file.

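using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;
using Azure.Storage;
using System.IO;
using Azure;
using Azure.Core;                 // TokenCredential
using Azure.Identity;             // ClientSecretCredential (Azure.Identity NuGet package)
using System;                     // Uri, Console
using System.Collections.Generic; // IList<T>
using System.Threading.Tasks;     // Task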

Connect to the account

To use the snippets in this article, you'll need to create a DataLakeServiceClient instance that represents the storage account.

Connect by using an account key

This is the easiest way to connect to an account.

This example creates a DataLakeServiceClient instance by using an account key.

public void GetDataLakeServiceClient(ref DataLakeServiceClient dataLakeServiceClient,
    string accountName, string accountKey)
{
    StorageSharedKeyCredential sharedKeyCredential =
        new StorageSharedKeyCredential(accountName, accountKey);

    string dfsUri = "https://" + accountName + ".dfs.core.windows.net";

    dataLakeServiceClient = new DataLakeServiceClient
        (new Uri(dfsUri), sharedKeyCredential);
}

Connect by using Azure Active Directory (Azure AD)

You can use the Azure identity client library for .NET to authenticate your application with Azure AD.

This example creates a DataLakeServiceClient instance by using a client ID, a client secret, and a tenant ID. To get these values, see Acquire a token from Azure AD for authorizing requests from a client application.

public void GetDataLakeServiceClient(ref DataLakeServiceClient dataLakeServiceClient,
    string accountName, string clientID, string clientSecret, string tenantID)
{
    TokenCredential credential = new ClientSecretCredential(
        tenantID, clientID, clientSecret, new TokenCredentialOptions());

    string dfsUri = "https://" + accountName + ".dfs.core.windows.net";

    dataLakeServiceClient = new DataLakeServiceClient(new Uri(dfsUri), credential);
}

Note

For more examples, see the Azure identity client library for .NET documentation.

Create a file system

A file system acts as a container for your files. You can create one by calling the DataLakeServiceClient.CreateFileSystemAsync method.

This example creates a file system named my-file-system.

public async Task<DataLakeFileSystemClient> CreateFileSystem
    (DataLakeServiceClient serviceClient)
{
    return await serviceClient.CreateFileSystemAsync("my-file-system");
}

Create a directory

Create a directory reference by calling the DataLakeFileSystemClient.CreateDirectoryAsync method.

This example adds a directory named my-directory to a file system, and then adds a sub-directory named my-subdirectory.

public async Task<DataLakeDirectoryClient> CreateDirectory
    (DataLakeServiceClient serviceClient, string fileSystemName)
{
    DataLakeFileSystemClient fileSystemClient =
        serviceClient.GetFileSystemClient(fileSystemName);

    DataLakeDirectoryClient directoryClient =
        await fileSystemClient.CreateDirectoryAsync("my-directory");

    return await directoryClient.CreateSubDirectoryAsync("my-subdirectory");
}

Rename or move a directory

Rename or move a directory by calling the DataLakeDirectoryClient.RenameAsync method. Pass the destination path of the directory as a parameter.

This example renames a sub-directory to the name my-subdirectory-renamed.

public async Task<DataLakeDirectoryClient> 
    RenameDirectory(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory/my-subdirectory");

    return await directoryClient.RenameAsync("my-directory/my-subdirectory-renamed");
}

This example moves a directory named my-subdirectory-renamed to a sub-directory of a directory named my-directory-2.

public async Task<DataLakeDirectoryClient> MoveDirectory
    (DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory/my-subdirectory-renamed");

    return await directoryClient.RenameAsync("my-directory-2/my-subdirectory-renamed");
}
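
Renaming and moving differ only in the destination path that is passed to RenameAsync. As a small illustration, the following sketch builds the destination path for a move by keeping the directory's own name and swapping its parent. PathHelper.MoveUnder is a hypothetical helper, not part of the Azure SDK, and it assumes forward-slash-separated paths like the ones in the examples above.

```csharp
using System;

public static class PathHelper
{
    // Hypothetical helper (not an SDK API): returns the path that results from
    // placing the last segment of sourcePath under newParent.
    // Assumes '/'-separated paths and a non-empty newParent.
    public static string MoveUnder(string sourcePath, string newParent)
    {
        // Keep only the final segment, for example "my-subdirectory-renamed".
        string name = sourcePath.TrimEnd('/');
        int slash = name.LastIndexOf('/');
        if (slash >= 0)
        {
            name = name.Substring(slash + 1);
        }

        return newParent.TrimEnd('/') + "/" + name;
    }
}
```

For example, PathHelper.MoveUnder("my-directory/my-subdirectory-renamed", "my-directory-2") returns "my-directory-2/my-subdirectory-renamed", which is the value passed to RenameAsync in the move example.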

Delete a directory

Delete a directory by calling the DataLakeDirectoryClient.Delete method.

This example deletes a directory named my-directory.

public void DeleteDirectory(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory");

    directoryClient.Delete();
}

Manage a directory ACL

Get the access control list (ACL) of a directory by calling the DataLakeDirectoryClient.GetAccessControlAsync method and set the ACL by calling the DataLakeDirectoryClient.SetAccessControlList method.

Note

If your application authorizes access by using Azure Active Directory (Azure AD), then make sure that the security principal that your application uses to authorize access has been assigned the Storage Blob Data Owner role. To learn more about how ACL permissions are applied and the effects of changing them, see Access control in Azure Data Lake Storage Gen2.

This example gets and sets the ACL of a directory named my-directory. The string user::rwx,group::r-x,other::rw- gives the owning user read, write, and execute permissions, gives the owning group only read and execute permissions, and gives all others read and write permission.

public async Task ManageDirectoryACLs(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory");

    PathAccessControl directoryAccessControl =
        await directoryClient.GetAccessControlAsync();

    Console.WriteLine(directoryAccessControl.AccessControlList);

    IList<PathAccessControlItem> accessControlList
        = PathAccessControlExtensions.ParseAccessControlList
        ("user::rwx,group::r-x,other::rw-");

    directoryClient.SetAccessControlList(accessControlList);

}
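
To make the short-form ACL string easier to read, the following sketch splits it into entries and spells out the permissions in each one. It does not call the storage SDK. AclText.Describe is a hypothetical helper written for illustration only, and it assumes every entry has exactly three colon-separated fields (type, optional qualifier, permissions), as in the string used above.

```csharp
using System;
using System.Collections.Generic;

public static class AclText
{
    // Hypothetical helper: turns "user::rwx,group::r-x" into readable lines.
    // Assumes three colon-separated fields per entry and a three-character
    // permission field in r, w, x order.
    public static List<string> Describe(string acl)
    {
        var lines = new List<string>();

        foreach (string entry in acl.Split(','))
        {
            string[] fields = entry.Split(':');
            if (fields.Length != 3 || fields[2].Length != 3)
            {
                throw new FormatException("Unexpected ACL entry: " + entry);
            }

            var granted = new List<string>();
            if (fields[2][0] == 'r') granted.Add("read");
            if (fields[2][1] == 'w') granted.Add("write");
            if (fields[2][2] == 'x') granted.Add("execute");

            // An empty qualifier refers to the owning user or owning group.
            string who = fields[1].Length == 0 ? fields[0] : fields[0] + " " + fields[1];
            string what = granted.Count == 0 ? "none" : string.Join(", ", granted);

            lines.Add(who + ": " + what);
        }

        return lines;
    }
}
```

Calling AclText.Describe("user::rwx,group::r-x,other::rw-") yields three lines: "user: read, write, execute", "group: read, execute", and "other: read, write".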

Upload a file to a directory

First, create a file reference in the target directory by creating an instance of the DataLakeFileClient class. Upload a file by calling the DataLakeFileClient.AppendAsync method. Make sure to complete the upload by calling the DataLakeFileClient.FlushAsync method.

This example uploads a text file to a directory named my-directory.

public async Task UploadFile(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory");

    DataLakeFileClient fileClient = await directoryClient.CreateFileAsync("uploaded-file.txt");

    // The using declaration (C# 8 or later) closes the stream when the method exits.
    using FileStream fileStream =
        File.OpenRead("C:\\file-to-upload.txt");

    long fileSize = fileStream.Length;

    await fileClient.AppendAsync(fileStream, offset: 0);

    await fileClient.FlushAsync(position: fileSize);

}
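
The example above sends the whole file in one append call. When a file is appended in several pieces instead, each piece needs the offset at which it starts, and the final flush needs the total length. The following sketch shows only that bookkeeping. ChunkedAppend.AppendInChunksAsync and its appendAsync parameter are hypothetical names; the delegate stands in for a call such as DataLakeFileClient.AppendAsync so that the sketch doesn't depend on the SDK.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class ChunkedAppend
{
    // Reads source in pieces of up to chunkSize bytes and hands each piece,
    // together with its starting offset, to appendAsync. Returns the total
    // number of bytes read, which is the position a final flush would need.
    // appendAsync is an assumption of this sketch, not an SDK type.
    public static async Task<long> AppendInChunksAsync(
        Stream source, int chunkSize, Func<Stream, long, Task> appendAsync)
    {
        byte[] buffer = new byte[chunkSize];
        long offset = 0;
        int read;

        while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Wrap only the bytes that were actually read.
            using (var chunk = new MemoryStream(buffer, 0, read, writable: false))
            {
                await appendAsync(chunk, offset);
            }

            offset += read;
        }

        return offset;
    }
}
```

With a DataLakeFileClient, the delegate could be written as (chunk, offset) => fileClient.AppendAsync(chunk, offset), and the returned total could then be passed to fileClient.FlushAsync as the position. The chunk size is left to the caller; this sketch does not suggest a particular value.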

Tip

If your file is large, your code will have to make multiple calls to the DataLakeFileClient.AppendAsync method. Consider using the DataLakeFileClient.UploadAsync method instead. That way, you can upload the entire file in a single call.

See the next section for an example.

Upload a large file to a directory

Use the DataLakeFileClient.UploadAsync method to upload large files without having to make multiple calls to the DataLakeFileClient.AppendAsync method.

public async Task UploadFileBulk(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory");

    DataLakeFileClient fileClient = directoryClient.GetFileClient("uploaded-file.txt");

    // The using declaration (C# 8 or later) closes the stream when the method exits.
    using FileStream fileStream =
        File.OpenRead("C:\\file-to-upload.txt");

    await fileClient.UploadAsync(fileStream);

}

Manage a file ACL

Get the access control list (ACL) of a file by calling the DataLakeFileClient.GetAccessControlAsync method and set the ACL by calling the DataLakeFileClient.SetAccessControlList method.

Note

If your application authorizes access by using Azure Active Directory (Azure AD), then make sure that the security principal that your application uses to authorize access has been assigned the Storage Blob Data Owner role. To learn more about how ACL permissions are applied and the effects of changing them, see Access control in Azure Data Lake Storage Gen2.

This example gets and sets the ACL of a file named my-file.txt. The string user::rwx,group::r-x,other::rw- gives the owning user read, write, and execute permissions, gives the owning group only read and execute permissions, and gives all others read and write permission.

public async Task ManageFileACLs(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory");

    DataLakeFileClient fileClient =
        directoryClient.GetFileClient("my-file.txt");

    PathAccessControl fileAccessControl =
        await fileClient.GetAccessControlAsync();

    Console.WriteLine(fileAccessControl.AccessControlList);

    IList<PathAccessControlItem> accessControlList
        = PathAccessControlExtensions.ParseAccessControlList
        ("user::rwx,group::r-x,other::rw-");

    fileClient.SetAccessControlList(accessControlList);
}

Download from a directory

First, create a DataLakeFileClient instance that represents the file that you want to download. Use the DataLakeFileClient.ReadAsync method, and parse the return value to obtain a Stream object. Use any .NET file processing API to save bytes from the stream to a file.

This example uses a BinaryReader and a FileStream to save bytes to a file.

public async Task DownloadFile(DataLakeFileSystemClient fileSystemClient)
{
    DataLakeDirectoryClient directoryClient =
        fileSystemClient.GetDirectoryClient("my-directory");

    DataLakeFileClient fileClient = 
        directoryClient.GetFileClient("my-image.png");

    Response<FileDownloadInfo> downloadResponse = await fileClient.ReadAsync();

    BinaryReader reader = new BinaryReader(downloadResponse.Value.Content);

    FileStream fileStream = 
        File.OpenWrite("C:\\my-image-downloaded.png");

    int bufferSize = 4096;

    byte[] buffer = new byte[bufferSize];

    int count;

    while ((count = reader.Read(buffer, 0, buffer.Length)) != 0)
    {
        fileStream.Write(buffer, 0, count);
    }

    await fileStream.FlushAsync();

    fileStream.Close();
}
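
Because the downloaded content is exposed as a Stream, the manual buffer loop can also be replaced by Stream.CopyToAsync. The following sketch shows that alternative in isolation. StreamSaver.SaveToFileAsync is a hypothetical helper, not part of the Azure SDK; it works with any readable stream.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class StreamSaver
{
    // Hypothetical helper: copies everything from content into a file at
    // destinationPath and returns the number of bytes in the file.
    public static async Task<long> SaveToFileAsync(Stream content, string destinationPath)
    {
        // File.Create truncates an existing file, and the using block closes
        // the file even if the copy throws.
        using (FileStream fileStream = File.Create(destinationPath))
        {
            await content.CopyToAsync(fileStream);
            return fileStream.Length;
        }
    }
}
```

In the download example, this could be called as await StreamSaver.SaveToFileAsync(downloadResponse.Value.Content, "C:\\my-image-downloaded.png"). Unlike File.OpenWrite, File.Create does not leave trailing bytes behind when the target file already exists and is longer than the download.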

List directory contents

List directory contents by calling the DataLakeFileSystemClient.GetPathsAsync method, and then enumerating through the results.

This example prints the name of each file that is located in a directory named my-directory.

public async Task ListFilesInDirectory(DataLakeFileSystemClient fileSystemClient)
{
    // GetPathsAsync returns an async sequence, so await foreach (C# 8 or later)
    // can enumerate it directly. An empty directory simply produces no output.
    await foreach (PathItem item in fileSystemClient.GetPathsAsync("my-directory"))
    {
        Console.WriteLine(item.Name);
    }
}

See also