Access control model in Azure Data Lake Storage Gen2

Data Lake Storage Gen2 supports the following authorization mechanisms:

  • Shared Key authorization
  • Shared access signature (SAS) authorization
  • Role-based access control (Azure RBAC)
  • Access control lists (ACL)

Shared Key and SAS authorization grant access to a user (or application) without requiring them to have an identity in Azure Active Directory (Azure AD). With these two forms of authentication, Azure RBAC and ACLs have no effect.

Azure RBAC and ACLs both require the user (or application) to have an identity in Azure AD. Azure RBAC lets you grant "coarse-grain" access to storage account data, such as read or write access to all of the data in a storage account, while ACLs let you grant "fine-grained" access, such as write access to a specific directory or file.

This article focuses on Azure RBAC and ACLs, and how the system evaluates them together to make authorization decisions for storage account resources.

Role-based access control (Azure RBAC)

Azure RBAC uses role assignments to apply sets of permissions to security principals. A security principal is an object that represents a user, group, service principal, or managed identity that is defined in Azure AD. A permission set can give a security principal a "coarse-grain" level of access, such as read or write access to all of the data in a storage account or all of the data in a container.

The following roles permit a security principal to access data in a storage account.

Role | Description
Storage Blob Data Owner | Full access to Blob storage containers and data. This access permits the security principal to set the owner of an item, and to modify the ACLs of all items.
Storage Blob Data Contributor | Read, write, and delete access to Blob storage containers and blobs. This access does not permit the security principal to set the ownership of an item, but it can modify the ACL of items that are owned by the security principal.
Storage Blob Data Reader | Read and list Blob storage containers and blobs.

Roles such as Owner, Contributor, Reader, and Storage Account Contributor permit a security principal to manage a storage account, but do not provide access to the data within that account. However, these roles (excluding Reader) can obtain access to the storage keys, which can be used in various client tools to access the data.

Access control lists (ACLs)

ACLs give you the ability to apply a "finer grain" level of access to directories and files. An ACL is a permission construct that contains a series of ACL entries. Each ACL entry associates a security principal with an access level. To learn more, see Access control lists (ACLs) in Azure Data Lake Storage Gen2.

How permissions are evaluated

During security principal-based authorization, permissions are evaluated in the following order.

1️⃣   Azure role assignments are evaluated first and take priority over any ACL assignments.

2️⃣   If the operation is fully authorized based on Azure role assignment, then ACLs are not evaluated at all.

3️⃣   If the operation is not fully authorized, then ACLs are evaluated.

[Figure: data lake storage permission flow]

Because of the way that access permissions are evaluated by the system, you cannot use an ACL to restrict access that has already been granted by a role assignment. That's because the system evaluates Azure role assignments first, and if the assignment grants sufficient access permission, ACLs are ignored.
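The evaluation order described above can be sketched in a few lines of Python. This is only an illustrative model, not the service's implementation: the `ROLE_GRANTS` table, the operation names, and the `is_authorized` function are hypothetical simplifications (real roles are defined as sets of Azure data actions).

```python
from typing import Optional

# Hypothetical, simplified view of which kinds of data operations each
# built-in data role fully authorizes. Illustration only.
ROLE_GRANTS = {
    "Storage Blob Data Owner": {"read", "write", "delete", "list"},
    "Storage Blob Data Contributor": {"read", "write", "delete", "list"},
    "Storage Blob Data Reader": {"read", "list"},
}


def is_authorized(operation: str, role: Optional[str], acl_allows: bool) -> bool:
    """Model the evaluation order: role assignment first, then ACLs.

    `acl_allows` stands in for the result of the ACL check, which is only
    consulted when the role assignment does not fully authorize the operation.
    """
    if role is not None and operation in ROLE_GRANTS.get(role, set()):
        return True  # fully authorized by the role; ACLs are not evaluated
    return acl_allows  # not fully authorized by a role, so ACLs decide


# A Reader can read even when the ACL would deny it, because the ACL is ignored.
print(is_authorized("read", "Storage Blob Data Reader", acl_allows=False))
# A Reader can write only if an ACL entry grants the missing permission.
print(is_authorized("write", "Storage Blob Data Reader", acl_allows=True))
```

The first call illustrates why an ACL cannot be used to take away access that a role assignment already grants: the function returns before the ACL result is ever looked at.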

The following diagram shows the permission flow for three common operations: listing directory contents, reading a file, and writing a file.

[Figure: data lake storage permission flow example]

Permissions table: Combining Azure RBAC and ACL

The following table shows you how to combine Azure roles and ACL entries so that a security principal can perform the operations listed in the Operation column. This table shows a column that represents each level of a fictitious directory hierarchy. There's a column for the root directory of the container (/), a subdirectory named Oregon, a subdirectory of the Oregon directory named Portland, and a text file in the Portland directory named Data.txt. Appearing in those columns are short form representations of the ACL entry required to grant permissions. N/A (Not applicable) appears in the column if an ACL entry is not required to perform the operation.

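Operation | Assigned Azure role | / | Oregon/ | Portland/ | Data.txt
Read Data.txt | Storage Blob Data Owner | N/A | N/A | N/A | N/A
Read Data.txt | Storage Blob Data Contributor | N/A | N/A | N/A | N/A
Read Data.txt | Storage Blob Data Reader | N/A | N/A | N/A | N/A
Read Data.txt | None | --X | --X | --X | R--
Append to Data.txt | Storage Blob Data Owner | N/A | N/A | N/A | N/A
Append to Data.txt | Storage Blob Data Contributor | N/A | N/A | N/A | N/A
Append to Data.txt | Storage Blob Data Reader | --X | --X | --X | -W-
Append to Data.txt | None | --X | --X | --X | RW-
Delete Data.txt | Storage Blob Data Owner | N/A | N/A | N/A | N/A
Delete Data.txt | Storage Blob Data Contributor | N/A | N/A | N/A | N/A
Delete Data.txt | Storage Blob Data Reader | --X | --X | -WX | N/A
Delete Data.txt | None | --X | --X | -WX | N/A
Create Data.txt | Storage Blob Data Owner | N/A | N/A | N/A | N/A
Create Data.txt | Storage Blob Data Contributor | N/A | N/A | N/A | N/A
Create Data.txt | Storage Blob Data Reader | --X | --X | -WX | N/A
Create Data.txt | None | --X | --X | -WX | N/A
List / | Storage Blob Data Owner | N/A | N/A | N/A | N/A
List / | Storage Blob Data Contributor | N/A | N/A | N/A | N/A
List / | Storage Blob Data Reader | N/A | N/A | N/A | N/A
List / | None | R-X | N/A | N/A | N/A
List /Oregon/ | Storage Blob Data Owner | N/A | N/A | N/A | N/A
List /Oregon/ | Storage Blob Data Contributor | N/A | N/A | N/A | N/A
List /Oregon/ | Storage Blob Data Reader | N/A | N/A | N/A | N/A
List /Oregon/ | None | --X | R-X | N/A | N/A
List /Oregon/Portland/ | Storage Blob Data Owner | N/A | N/A | N/A | N/A
List /Oregon/Portland/ | Storage Blob Data Contributor | N/A | N/A | N/A | N/A
List /Oregon/Portland/ | Storage Blob Data Reader | N/A | N/A | N/A | N/A
List /Oregon/Portland/ | None | --X | --X | R-X | N/A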
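The "None" rows of the table follow a regular pattern: every ancestor directory needs execute (--X), and the target (or, for create and delete, its parent directory) needs the operation-specific permission. The sketch below reproduces those rows for a principal with no Azure role. The function name `required_acl_entries` and the operation labels are hypothetical and exist only for this illustration.

```python
from typing import Dict


def required_acl_entries(path: str, operation: str) -> Dict[str, str]:
    """Return the short-form ACL permission needed at each level of `path`
    for a principal that has no Azure role assignment.

    Keys are full paths from the container root; values use the table's
    short form (for example "--X" or "R-X"), or "N/A" when no entry is needed.
    """
    names = [n for n in path.strip("/").split("/") if n]

    # Build the chain of paths from the root down to the target.
    chain = ["/"]
    current = ""
    for name in names:
        current += "/" + name
        chain.append(current)
    target, ancestors = chain[-1], chain[:-1]

    # Traversing any directory on the way to the target requires execute.
    entries = {p: "--X" for p in ancestors}

    if operation == "read":
        entries[target] = "R--"
    elif operation == "append":
        entries[target] = "RW-"
    elif operation == "list":
        entries[target] = "R-X"
    elif operation in ("create", "delete"):
        if not ancestors:
            raise ValueError("create/delete needs a parent directory")
        entries[ancestors[-1]] = "-WX"  # write + execute on the parent
        entries[target] = "N/A"         # no entry needed on the file itself
    else:
        raise ValueError(f"unknown operation: {operation}")
    return entries


print(required_acl_entries("/Oregon/Portland/Data.txt", "delete"))
```

For principals that hold one of the data roles, the table shows that fewer (or no) ACL entries are needed, so this helper deliberately covers only the no-role case.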

Note

To view the contents of a container in Azure Storage Explorer, security principals must sign in to Storage Explorer by using Azure AD, and (at a minimum) have read access (R--) to the root folder (/) of a container. This level of permission does give them the ability to list the contents of the root folder. If you don't want the contents of the root folder to be visible, you can assign them the Reader role. With that role, they'll be able to list the containers in the account, but not container contents. You can then grant access to specific directories and files by using ACLs.

Security groups

Always use Azure AD security groups as the assigned principal in an ACL entry. Resist the temptation to directly assign individual users or service principals. Using this structure allows you to add and remove users or service principals without the need to reapply ACLs to an entire directory structure. Instead, you can just add or remove users and service principals from the appropriate Azure AD security group.

There are many different ways to set up groups. For example, imagine that you have a directory named /LogData that holds log data generated by your server. Azure Data Factory (ADF) ingests data into that folder. Specific users from the service engineering team will upload logs and manage other users of this folder, and various Databricks clusters will analyze logs from that folder.

To enable these activities, you could create a LogsWriter group and a LogsReader group. Then, you could assign permissions as follows:

  • Add the LogsWriter group to the ACL of the /LogData directory with rwx permissions.
  • Add the LogsReader group to the ACL of the /LogData directory with r-x permissions.
  • Add the service principal object or Managed Service Identity (MSI) for ADF to the LogsWriter group.
  • Add users in the service engineering team to the LogsWriter group.
  • Add the service principal object or MSI for Databricks to the LogsReader group.

If a user in the service engineering team leaves the company, you could just remove them from the LogsWriter group. If you did not add that user to a group, but instead, you added a dedicated ACL entry for that user, you would have to remove that ACL entry from the /LogData directory. You would also have to remove the entry from all subdirectories and files in the entire directory hierarchy of the /LogData directory.
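As a rough illustration of the group-based setup above, the following sketch builds the ACL entries for the /LogData directory as strings of the form `group:<object-id>:<permissions>`, the short form used for named-group ACL entries. The helper `group_acl_entry` and the two object IDs are hypothetical placeholders; in practice you would use the object IDs of your own LogsWriter and LogsReader groups and apply the resulting ACL with the tool or SDK of your choice.

```python
import re

# Hypothetical placeholder object IDs for the two Azure AD security groups.
LOGS_WRITER_GROUP_ID = "00000000-0000-0000-0000-000000000001"
LOGS_READER_GROUP_ID = "00000000-0000-0000-0000-000000000002"


def group_acl_entry(group_object_id: str, permissions: str, default: bool = False) -> str:
    """Build one named-group ACL entry such as 'group:<id>:r-x'.

    `permissions` must be a three-character string over r/w/x/-.
    Set `default=True` to produce a default ACL entry ('default:group:...').
    """
    if not re.fullmatch(r"[r-][w-][x-]", permissions):
        raise ValueError(f"invalid permission string: {permissions!r}")
    prefix = "default:" if default else ""
    return f"{prefix}group:{group_object_id}:{permissions}"


# Entries for /LogData: writers get rwx, readers get r-x.
log_data_entries = [
    group_acl_entry(LOGS_WRITER_GROUP_ID, "rwx"),
    group_acl_entry(LOGS_READER_GROUP_ID, "r-x"),
]
print(",".join(log_data_entries))
```

Because the entries name only the two groups, a change in team membership is handled in Azure AD and the ACL on /LogData and its children stays untouched.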

To create a group and add members, see Create a basic group and add members using Azure Active Directory.

Limits on Azure role assignments and ACL entries

By using groups, you're less likely to exceed the maximum number of role assignments per subscription and the maximum number of ACL entries per file or directory. The following table describes these limits.

Mechanism | Scope | Limits | Supported level of permission
Azure RBAC | Storage accounts, containers. Cross resource Azure role assignments at subscription or resource group level. | 2000 Azure role assignments in a subscription | Azure roles (built-in or custom)
ACL | Directory, file | 32 ACL entries (effectively 28 ACL entries) per file and per directory. Access and default ACLs each have their own 32 ACL entry limit. | ACL permission
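A quick way to stay under the ACL limit is to count entries before applying an ACL. The sketch below is a hypothetical helper, not part of any Azure SDK. It assumes a comma-separated short-form ACL string in which default entries start with `default:`, and it counts access and default entries separately against the 32-entry limit from the table.

```python
MAX_ACL_ENTRIES = 32  # per file or directory, counted separately for access and default ACLs


def acl_within_limits(acl: str) -> bool:
    """Return True if neither the access ACL nor the default ACL in `acl`
    exceeds the 32-entry limit."""
    entries = [e.strip() for e in acl.split(",") if e.strip()]
    default_entries = [e for e in entries if e.startswith("default:")]
    access_entries = [e for e in entries if not e.startswith("default:")]
    return (
        len(access_entries) <= MAX_ACL_ENTRIES
        and len(default_entries) <= MAX_ACL_ENTRIES
    )


# Hypothetical ACL with the base entries plus one named group.
sample_acl = "user::rwx,group::r-x,other::---,group:00000000-0000-0000-0000-000000000001:r-x"
print(acl_within_limits(sample_acl))
```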

Shared Key and Shared Access Signature (SAS) authorization

Azure Data Lake Storage Gen2 also supports Shared Key and SAS methods for authentication. A characteristic of these authentication methods is that no identity is associated with the caller and therefore security principal permission-based authorization cannot be performed.

In the case of Shared Key, the caller effectively gains 'super-user' access, meaning full access to all operations on all resources including data, setting owner, and changing ACLs.

SAS tokens include allowed permissions as part of the token. The permissions included in the SAS token are effectively applied to all authorization decisions, but no additional ACL checks are performed.
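The difference between the credential types can be summarized in a small decision function. This is a conceptual sketch only: the function name `authorize`, the method labels, and the boolean inputs are hypothetical stand-ins for checks that the service performs internally.

```python
from typing import FrozenSet


def authorize(
    auth_method: str,
    operation: str,
    sas_permissions: FrozenSet[str] = frozenset(),
    rbac_allows: bool = False,
    acl_allows: bool = False,
) -> bool:
    """Model how the authorization decision depends on the credential type."""
    if auth_method == "shared_key":
        # Shared Key: effectively 'super-user'; no identity, no RBAC or ACL checks.
        return True
    if auth_method == "sas":
        # SAS: only the permissions carried in the token matter; ACLs are not checked.
        return operation in sas_permissions
    if auth_method == "azure_ad":
        # Identity-based: role assignments first, ACLs only if the role is not enough.
        return rbac_allows or acl_allows
    raise ValueError(f"unknown auth method: {auth_method}")


# A SAS token that carries only read permission cannot be used to write.
print(authorize("sas", "write", sas_permissions=frozenset({"read"})))
```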

Next steps

To learn more about access control lists, see Access control lists (ACLs) in Azure Data Lake Storage Gen2.