Use JavaScript SDK in Node.js to manage directories and files in Azure Data Lake Storage Gen2

This article shows you how to use Node.js to create and manage directories and files in storage accounts that have a hierarchical namespace.

To learn about how to get, set, and update the access control lists (ACL) of directories and files, see Use JavaScript SDK in Node.js to manage ACLs in Azure Data Lake Storage Gen2.

Prerequisites

  • An Azure subscription. For more information, see Get Azure free trial.

  • A storage account that has hierarchical namespace enabled. Follow these instructions to create one.

  • If you're using this package in a Node.js application, you'll need Node.js 8.0.0 or higher.

Set up your project

Install the Data Lake client library for JavaScript by opening a terminal window and running the following command.

npm install @azure/storage-file-datalake

Import the storage-file-datalake package by placing this statement at the top of your code file.

const {
  DataLakeServiceClient,
  StorageSharedKeyCredential
} = require("@azure/storage-file-datalake");

Note

Multi-protocol access on Data Lake Storage enables applications to use both Blob APIs and Data Lake Storage Gen2 APIs to work with data in storage accounts with hierarchical namespace (HNS) enabled. When working with capabilities unique to Data Lake Storage Gen2, such as directory operations and ACLs, use the Data Lake Storage Gen2 APIs, as shown in this article.

When choosing which APIs to use in a given scenario, consider the workload and the needs of your application, along with the known issues and impact of HNS on workloads and applications.

Connect to the account

To use the snippets in this article, you'll need to create a DataLakeServiceClient instance that represents the storage account.

Connect by using Microsoft Entra ID

You can use the Azure identity client library for JS to authenticate your application with Microsoft Entra ID.

Create a DataLakeServiceClient instance and pass in a new instance of the DefaultAzureCredential class.

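// DefaultAzureCredential comes from the @azure/identity package
// (install it with: npm install @azure/identity).
const { DefaultAzureCredential } = require("@azure/identity");

function GetDataLakeServiceClientAD(accountName) {

  const dataLakeServiceClient = new DataLakeServiceClient(
      `https://${accountName}.dfs.core.windows.net`,
      new DefaultAzureCredential());

  return dataLakeServiceClient;
}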

To learn more about using DefaultAzureCredential to authorize access to data, see Overview: Authenticate JavaScript apps to Azure using the Azure SDK.

Connect by using an account key

You can authorize access to data using your account access keys (Shared Key). This example creates a DataLakeServiceClient instance that is authorized with the account key.


function GetDataLakeServiceClient(accountName, accountKey) {

  const sharedKeyCredential =
     new StorageSharedKeyCredential(accountName, accountKey);

  const dataLakeServiceClient = new DataLakeServiceClient(
      `https://${accountName}.dfs.core.windows.net`, sharedKeyCredential);

  return dataLakeServiceClient;
}

This method of authorization works only for Node.js applications. If you plan to run your code in a browser, you can authorize by using Microsoft Entra ID.

Caution

Authorization with Shared Key is not recommended as it may be less secure. For optimal security, disable authorization via Shared Key for your storage account, as described in Prevent Shared Key authorization for an Azure Storage account.

Use of access keys and connection strings should be limited to initial proof of concept apps or development prototypes that don't access production or sensitive data. Otherwise, the token-based authentication classes available in the Azure SDK should always be preferred when authenticating to Azure resources.

Microsoft recommends that clients use either Microsoft Entra ID or a shared access signature (SAS) to authorize access to data in Azure Storage. For more information, see Authorize operations for data access.

Create a container

A container acts as a file system for your files. You can create one by getting a FileSystemClient instance, and then calling the FileSystemClient.create method.

This example creates a container named my-file-system.

async function CreateFileSystem(dataLakeServiceClient) {

  const fileSystemName = "my-file-system";

  const fileSystemClient = dataLakeServiceClient.getFileSystemClient(fileSystemName);

  const createResponse = await fileSystemClient.create();

}
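
Calling create on a container that already exists typically fails with an error. The following is a minimal sketch of an idempotent variant. It assumes the file system client exposes a createIfNotExists method whose response carries a boolean succeeded flag; the helper name EnsureFileSystem is hypothetical and not part of the SDK.

```javascript
// Hypothetical helper: creates the container only if it isn't there yet.
// Assumption: createIfNotExists() resolves to an object with a boolean
// `succeeded` flag that is true when a new container was created.
async function EnsureFileSystem(dataLakeServiceClient, fileSystemName) {
  const fileSystemClient = dataLakeServiceClient.getFileSystemClient(fileSystemName);
  const response = await fileSystemClient.createIfNotExists();
  return { fileSystemClient, created: Boolean(response.succeeded) };
}
```

The returned created flag lets the caller tell a fresh container apart from one that was already in place.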

Create a directory

Create a directory reference by getting a DirectoryClient instance, and then calling the DirectoryClient.create method.

This example adds a directory named my-directory to a container.

async function CreateDirectory(fileSystemClient) {

  const directoryClient = fileSystemClient.getDirectoryClient("my-directory");

  await directoryClient.create();

}
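
Directories can be nested. As a rough sketch, the helper below creates a parent directory and then one subdirectory inside it, assuming the directory client offers a getSubdirectoryClient method. The helper name CreateNestedDirectory and its parameters are hypothetical.

```javascript
// Hypothetical helper: creates a directory and then one subdirectory inside it.
async function CreateNestedDirectory(fileSystemClient, parentName, childName) {
  const parentClient = fileSystemClient.getDirectoryClient(parentName);
  await parentClient.create();

  // Assumption: getSubdirectoryClient returns a client scoped to "<parent>/<child>".
  const childClient = parentClient.getSubdirectoryClient(childName);
  await childClient.create();

  return childClient;
}
```

For example, CreateNestedDirectory(fileSystemClient, "my-directory", "my-subdirectory") would target the path my-directory/my-subdirectory.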

Rename or move a directory

Rename or move a directory by calling the DirectoryClient.move method. Pass the path of the desired directory as a parameter.

This example renames a directory named my-directory to my-directory-renamed.

async function RenameDirectory(fileSystemClient) {

  const directoryClient = fileSystemClient.getDirectoryClient("my-directory");
  await directoryClient.move("my-directory-renamed");

}

This example moves a directory named my-directory-renamed to a subdirectory of a directory named my-directory-2.

async function MoveDirectory(fileSystemClient) {

  const directoryClient = fileSystemClient.getDirectoryClient("my-directory-renamed");
  await directoryClient.move("my-directory-2/my-directory-renamed");

}
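
Because move takes the full destination path, moving a directory under a new parent means joining the parent path and the directory name. The sketch below derives that path with the Node.js path module; the helper name MoveDirectoryInto and its parameters are hypothetical.

```javascript
const path = require("path");

// Hypothetical helper: moves a directory under another directory, keeping its name.
async function MoveDirectoryInto(fileSystemClient, sourcePath, destinationParent) {
  // Use POSIX-style joins because Data Lake paths use forward slashes on every OS.
  const directoryName = path.posix.basename(sourcePath);
  const destinationPath = path.posix.join(destinationParent, directoryName);

  const directoryClient = fileSystemClient.getDirectoryClient(sourcePath);
  await directoryClient.move(destinationPath);

  return destinationPath;
}
```

Calling MoveDirectoryInto(fileSystemClient, "my-directory-renamed", "my-directory-2") performs the same move as the example above.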

Delete a directory

Delete a directory by calling the DirectoryClient.delete method.

This example deletes a directory named my-directory.

async function DeleteDirectory(fileSystemClient) {

  const directoryClient = fileSystemClient.getDirectoryClient("my-directory");
  await directoryClient.delete();

}
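
A delete call can fail when the directory is missing. The following sketch treats a missing directory as a non-error, under two assumptions: that delete accepts a boolean recursive argument, and that errors thrown by the SDK expose the HTTP status code as statusCode. The helper name DeleteDirectoryIfPresent is hypothetical.

```javascript
// Hypothetical helper: deletes a directory and reports whether it existed.
async function DeleteDirectoryIfPresent(fileSystemClient, directoryPath) {
  const directoryClient = fileSystemClient.getDirectoryClient(directoryPath);
  try {
    // Assumption: passing true asks the service to delete the contents as well.
    await directoryClient.delete(true);
    return true;
  } catch (err) {
    // Assumption: SDK errors carry the HTTP status code in `statusCode`.
    if (err && err.statusCode === 404) {
      return false;
    }
    // Anything else (authorization, network, and so on) is rethrown.
    throw err;
  }
}
```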

Upload a file to a directory

First, read a file. This example uses the Node.js fs module. Then, create a file reference in the target directory by creating a FileClient instance, and then calling the FileClient.create method. Upload a file by calling the FileClient.append method. Make sure to complete the upload by calling the FileClient.flush method.

This example uploads a text file to a directory named my-directory.

async function UploadFile(fileSystemClient) {

  const fs = require('fs');

  // Read the file synchronously so the content is available before the upload starts.
  // readFileSync returns a Buffer, so content.length is the size in bytes.
  const content = fs.readFileSync('mytestfile.txt');

  const fileClient = fileSystemClient.getFileClient("my-directory/uploaded-file.txt");

  // Create the file, append the data at offset 0, and then flush to commit it.
  await fileClient.create();
  await fileClient.append(content, 0, content.length);
  await fileClient.flush(content.length);

}
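
The append and flush calls work with byte offsets, which makes it possible to send larger content in several pieces. The following is a minimal sketch of that pattern, not the SDK's own bulk upload logic. The helper name UploadInChunks and the chunkSize parameter are hypothetical.

```javascript
// Hypothetical helper: uploads a Buffer in fixed-size chunks.
async function UploadInChunks(fileClient, content, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }

  await fileClient.create();

  let offset = 0;
  while (offset < content.length) {
    const chunk = content.subarray(offset, offset + chunkSize);
    // Each append states the byte offset at which this chunk starts.
    await fileClient.append(chunk, offset, chunk.length);
    offset += chunk.length;
  }

  // flush commits the appended data; the position is the total byte length.
  await fileClient.flush(content.length);
  return offset;
}
```

Appended data stays uncommitted until flush is called, so a failure partway through the loop leaves the file without the new content.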

Download from a directory

First, create a FileClient instance that represents the file that you want to download. Use the FileClient.read method to read the file. Then, write the file. This example uses the Node.js fs module to do that.

Note

This method of downloading a file works only for Node.js applications. If you plan to run your code in a browser, see the Azure Storage File Data Lake client library for JavaScript readme file for an example of how to do this in a browser.

async function DownloadFile(fileSystemClient) {

  const fs = require('fs');

  const fileClient = fileSystemClient.getFileClient("my-directory/uploaded-file.txt");

  const downloadResponse = await fileClient.read();

  // In Node.js, the file content is exposed as a readable stream.
  const downloaded = await streamToString(downloadResponse.readableStreamBody);

  // Write synchronously so any error surfaces to the caller of DownloadFile.
  fs.writeFileSync('mytestfiledownloaded.txt', downloaded);
}

// Collects a readable stream into a single string.
function streamToString(readableStream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    readableStream.on("data", (data) => {
      // Keep raw bytes so characters split across chunks decode correctly.
      chunks.push(Buffer.isBuffer(data) ? data : Buffer.from(data));
    });
    readableStream.on("end", () => {
      resolve(Buffer.concat(chunks).toString());
    });
    readableStream.on("error", reject);
  });
}
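
Buffering the whole file in memory is fine for small text files, but larger files may be better streamed straight to disk. The sketch below pipes the readableStreamBody returned by read into a file write stream. The helper name DownloadFileToDisk and the localPath parameter are hypothetical, and like the example above it works only in Node.js.

```javascript
const fs = require("fs");
const { pipeline } = require("stream");
const { promisify } = require("util");

// Promise-based pipeline: resolves when the copy finishes, rejects on any stream error.
const pipelineAsync = promisify(pipeline);

// Hypothetical helper: streams a Data Lake file to a local path without buffering it all.
async function DownloadFileToDisk(fileClient, localPath) {
  const downloadResponse = await fileClient.read();
  await pipelineAsync(
    downloadResponse.readableStreamBody,
    fs.createWriteStream(localPath)
  );
  return localPath;
}
```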

List directory contents

This example prints the name of each directory and file located in a directory named my-directory. Because the recursive option is set to true, the contents of subdirectories are listed as well.

async function ListFilesInDirectory(fileSystemClient) {

  let i = 1;

  // listPaths returns an async iterator, so there is nothing to await here.
  const iter = fileSystemClient.listPaths({path: "my-directory", recursive: true});

  for await (const path of iter) {

    console.log(`Path ${i++}: ${path.name}, is directory: ${path.isDirectory}`);
  }

}
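
The same listing can be collected into a data structure instead of being printed. As a sketch, the helper below separates directories from files by using the name and isDirectory properties shown above; the helper name SummarizeDirectory and the shape of its return value are hypothetical.

```javascript
// Hypothetical helper: groups the paths under a directory into directories and files.
async function SummarizeDirectory(fileSystemClient, directoryPath) {
  const summary = { directories: [], files: [] };

  const iter = fileSystemClient.listPaths({ path: directoryPath, recursive: true });

  for await (const item of iter) {
    if (item.isDirectory) {
      summary.directories.push(item.name);
    } else {
      summary.files.push(item.name);
    }
  }

  return summary;
}
```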

See also