question

JohnnyHumphrey-3892 avatar image
0 Votes"
JohnnyHumphrey-3892 asked PaulHasell-5165 commented

What is the Synapse security setup to allow Reader access to databases and tables created by a Spark pool through the serverless SQL poolpublic endpoint?

Goal: Connect client apps (e.g., SSMS, Tableau, Power BI) to the Synapse workspace serverless SQL endpoint and query databases and tables created by an Apache Spark pool in the same Synapse workspace.

Setup:
What we have:

  1. Active Directory users with Azure Reader role assignment on the Resource group which includes the Synapse workspace, Apache Spark pool, and storage account.

  2. Inherited Azure Reader role assignment on the Synapse workspace and storage account.

  3. Firewall rules to allow access from client machines.

  4. Synapse Administrator Synapse RBAC role assignment to be able to query the information. (This role looks overly powerful. See question below.)

At that point one user got an error:
[Microsoft][ODBC Driver 17 for SQL Server][SQL Server]External table 'dbo' is not accessible because content of directory cannot be listed.
The table "[dbo].[billing_all_prod]" does not exist.

Looking at the document on setting up Synapse access control (https://docs.microsoft.com/en-us/azure/synapse-analytics/security/how-to-set-up-access-control),
we assigned the Storage Blob Data Reader role to the AD user. We haven't heard back yet about whether this resolved all of the access issues, but the end-users (managers for this project) are tired of being used as debuggers, and we would like to be sure everything that needs to be done is completed.

That Synapse access control document, in STEP 7:
By default, all users assigned the Synapse Administrator role are also assigned the SQL db_owner role on the serverless SQL pool, 'Built-in', and all its databases.
Access to SQL pools for other users and for the workspace MSI is controlled using SQL permissions.

When I proceed to STEP 7.1: Serverless SQL pool, Built-in and run the scripts, when trying to create users on the Spark-created databases I get an error:
Operation CREATE USER is not allowed for a replicated database.

That sort of makes sense, since the Spark pool actually created the database and the serverless SQL pool is just using a copy of some metadata. But my questions from above are:

  1. What is the minimum (most restrictive) Synapse RBAC role that I would need to assign so that a client could use the serverless SQL pool public endpoint to query a database and table created by the Spark pool?

  2. Is there any SQL-level access that needs to be granted, or can it all be done with various Azure and Synapse RBAC roles? (Again, we have Reader on the resource group, workspace, and storage account, and we also have Storage Blob Data reader on the storage account, and Synapse Administrator at the workspace scope.)

Thanks,
Johnny

azure-synapse-analyticsazure-rbac
· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Johnny,

I'm taking a look to see what the least permissions is for your scenario. The product team is very aware that the permissions in Synapse are overly complicated and is working towards solving this.

1 Vote 1 ·
SamaraSoucy-MSFT avatar image
0 Votes"
SamaraSoucy-MSFT answered JohnnyHumphrey-3892 commented

Adding Storage Blob Data Reader should resolve that error.

There are two levels of authentication required to do what you are trying to do:

  1. SQL Server level. Outside of admin it isn't currently possible to handle this purely with RBAC, but you can create a server login that is am AAD security group and then assign your users to that group. This avoids having to manage them individually within SQL and will resolve needing to give everyone admin.
    CREATE LOGIN [Security Group Name] FROM EXTERNAL PROVIDER; This must be a security group in AAD, not a Microsoft 365/Distribution group

  2. As a Spark database is not actually a SQL database but a set of parquet files, adding SQL database level permissions doesn't work, as you have discovered. The Storage Blob Data Reader role takes its place as your serverless pool is reading directly from the storage account.

· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Thanks for the update.

0 Votes 0 ·
alycke avatar image
0 Votes"
alycke answered PaulHasell-5165 commented

Hi! I pretty much have the exact same case, but the suggested solution does not resolve the issue.

I've tried creating SQL logins (CREATE LOGIN), which makes me login and see the databases on the server. But when I try to expand "tables", I get this:
81904-image.png

Which makes me think the SQL login works, but the data access is limited. I get the same result when I create a login for the user who are to access the data based on their email: CREATE LOGIN email@example.com FROM EXTERNAL PROVIDER;.

However, the user is granted Storage Blob Data Reader access on the resource group that contains the data lake/storage account, without this changing the outcome. The user is supposed to connect through Power BI, and the error looks like this:
81883-image.png

What am I doing wrong here?



image.png (35.9 KiB)
image.png (249.8 KiB)
· 4
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Since a SQL login can't be given storage account permissions, it doesn't work with accessing the built-in external tables for Spark. You could set up an additional copy of external tables that use SAS authentication, but if you have AAD login available to you, I would recommend going that route.

Is the issue service-wide or is it limited to this particular user?

0 Votes 0 ·
alycke avatar image alycke SamaraSoucy-MSFT ·

The problem is not limited to this user only. As mentioned, the user is Storage Blob Data Reader on the resource group, so it is not limited to SQL access.

0 Votes 0 ·

II'm using an AAD account and get the same issue, although you can add at the Server level there is no way to provide authentication at the Database level so you appear to be stuck with the accounts created on provisioning...which is unhelpful

0 Votes 0 ·

I have given the service account the 'Synapse SQL Administrator' role in Azure as well as the 'Storage Blob Data Reader' and it's now able to access the database instance. I guess that's because it has implicit access from being an admin so bypasses the need for a database user

0 Votes 0 ·