msftsyn7270 asked msftsyn7270 commented

Using Apache Spark to access data from a serverless (on-demand) SQL database in Synapse?

Looking for a way to access already created serverless on-demand external tables through Apache Spark in Azure Synapse. Any ideas/thoughts?

azure-synapse-analytics azure-data-lake-storage dotnet-ml-big-data

SamaraSoucy-MSFT answered msftsyn7270 commented

Hello!

You can find instructions in this post: https://docs.microsoft.com/en-us/answers/questions/428775/connect-synapse-spark-to-synapse-serverless-sql-po.html

Basically, the JDBC driver will allow access to SQL, but the version pre-installed in your Spark pool doesn't have AAD support. You need to update the driver and add some libraries to your pool to get this to work.
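To make the JDBC route a bit more concrete, here is a minimal sketch rather than a tested recipe. It assumes the pool already has an updated SQL Server JDBC driver that accepts an `accessToken` connection property, and that you obtain the AAD token yourself. The function names, workspace name, database name, and table name are all hypothetical placeholders; only the `<workspace>-ondemand.sql.azuresynapse.net` endpoint pattern and the standard Spark `jdbc` reader options are taken as given.

```python
def serverless_jdbc_url(workspace: str, database: str) -> str:
    """Build a JDBC URL for a workspace's serverless SQL endpoint."""
    # Serverless endpoints follow the <workspace>-ondemand naming pattern.
    host = f"{workspace}-ondemand.sql.azuresynapse.net"
    return (
        f"jdbc:sqlserver://{host}:1433;"
        f"database={database};"
        "encrypt=true;trustServerCertificate=false;"
        "hostNameInCertificate=*.sql.azuresynapse.net;loginTimeout=30"
    )


def read_external_table(spark, url: str, table: str, access_token: str):
    """Read one external table or view over JDBC with an AAD access token.

    Assumption: the driver installed on the pool supports `accessToken`.
    """
    return (
        spark.read.format("jdbc")
        .option("url", url)
        .option("dbtable", table)
        .option("accessToken", access_token)
        .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
        .load()
    )


# Hypothetical usage inside a Synapse notebook, where `spark` already exists:
# url = serverless_jdbc_url("myworkspace", "mydb")
# df = read_external_table(spark, url, "dbo.my_external_table", token)
```

If the pre-installed driver is still in use, the `accessToken` option is the part most likely to fail, which is why the linked post has you update the libraries first.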

It's worth noting that it's not usually necessary to connect to the serverless pool from Spark. All the data in your serverless pool is in storage, so you can go straight to the storage account; you would need access permissions to storage anyway. That being said, I can see a scenario where you have some complex views or something similar that you want to tap into from Spark.
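Going straight to storage could look roughly like the sketch below. It assumes the files behind the external table live in an ADLS Gen2 account and are stored as Parquet; both the format and the container, account, and folder names are placeholders I made up, so substitute whatever your external table's `LOCATION` actually points at.

```python
def abfss_path(container: str, account: str, relative_path: str = "") -> str:
    """Build an abfss:// URI for a folder or file in an ADLS Gen2 account."""
    base = f"abfss://{container}@{account}.dfs.core.windows.net"
    # Drop leading slashes so the URI has exactly one slash after the host.
    return f"{base}/{relative_path.lstrip('/')}"


def read_table_files(spark, path: str, file_format: str = "parquet"):
    """Load the files behind an external table directly from storage.

    Assumption: the files are Parquet unless another format is passed in.
    """
    return spark.read.format(file_format).load(path)


# Hypothetical usage inside a Synapse notebook, where `spark` already exists:
# path = abfss_path("data", "mylake", "sales/2021/")
# df = read_table_files(spark, path)
```

This skips the serverless pool entirely, so any views or computed columns defined there will not be applied.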


Thank you for your response @SamaraSoucy-MSFT! When accessing the storage account, is there a way to limit access to specific folders/directories or files within ADLS?


Similar to how you would control access in the serverless pool, both Access Control Lists combined with AAD users and SAS tokens are available for scoped access to ADLS.

The TokenLibrary library available in your Spark pool will work with either option: https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-secure-credentials-with-tokenlibrary?pivots=programming-language-python
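As a rough sketch of the SAS option: as far as I recall from the TokenLibrary documentation, you point Spark at a linked service that holds the SAS token and switch the ABFS driver to SAS authentication. Please treat the configuration keys below as my recollection of that page and verify them there. The helper names and the linked service name are hypothetical.

```python
# SAS provider class as I recall it from the Synapse TokenLibrary docs
# (assumption: verify against the linked page).
SAS_PROVIDER = (
    "com.microsoft.azure.synapse.tokenlibrary.LinkedServiceBasedSASProvider"
)


def sas_settings(linked_service_name: str):
    """Return (spark_conf, hadoop_conf) dicts for SAS access via a linked service."""
    spark_conf = {
        "spark.storage.synapse.linkedServiceName": linked_service_name,
    }
    hadoop_conf = {
        "fs.azure.account.auth.type": "SAS",
        "fs.azure.sas.token.provider.type": SAS_PROVIDER,
    }
    return spark_conf, hadoop_conf


def apply_sas_settings(spark, linked_service_name: str) -> None:
    """Apply the settings to a live session (only meaningful on a Synapse pool)."""
    spark_conf, hadoop_conf = sas_settings(linked_service_name)
    for key, value in spark_conf.items():
        spark.conf.set(key, value)
    hadoop = spark.sparkContext._jsc.hadoopConfiguration()
    for key, value in hadoop_conf.items():
        hadoop.set(key, value)


# Hypothetical usage inside a Synapse notebook, where `spark` already exists:
# apply_sas_settings(spark, "MyAdlsSasLinkedService")
# df = spark.read.parquet("abfss://data@mylake.dfs.core.windows.net/sales/")
```

Because a SAS token can be scoped to a container or directory, reads outside that scope should be rejected by storage rather than by Spark.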



Hi @msftsyn7270
I wanted to check in with you to see if you have any follow up questions for me on this issue.

arvindmalav-7458 answered msftsyn7270 commented

You can find instructions here: https://docs.microsoft.com/en-us/answers/questions/428775/connect-synapse-spark-to-synapse-serverless-sql-po.html

The JDBC driver will allow access to SQL, but the version pre-installed in your Spark pool doesn't have AAD support. You need to update the driver and add some libraries to your pool to get this to work. It's worth noting that it's not usually necessary to connect to the serverless pool from Spark. All the data in your serverless pool is in storage, so you can go straight to the storage account; you would need access permissions to storage anyway.


Thank you for your help @arvindmalav-7458
