Looking for a way to access already created serverless on-demand external tables through Apache Spark in Azure Synapse. Any ideas/thoughts?
Hello!
You can find instructions in this post: https://docs.microsoft.com/en-us/answers/questions/428775/connect-synapse-spark-to-synapse-serverless-sql-po.html
Basically, the JDBC driver will allow access to SQL, but the version pre-installed in your Spark pool doesn't have AAD support. You need to update it and add some libraries to your pool to get this to work.
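As a rough illustration of that approach, here is a minimal sketch of reading a serverless pool table over JDBC from a Synapse Spark notebook. It assumes the updated Microsoft JDBC driver with AAD support has already been added to the pool; the workspace, database, table, and credential names are all placeholders.

```python
# Sketch: read a serverless (on-demand) SQL pool table over JDBC.
# Assumes an AAD-capable mssql-jdbc driver is installed on the Spark pool.

def build_serverless_jdbc_url(workspace: str, database: str) -> str:
    """Build the JDBC URL for a Synapse serverless SQL endpoint."""
    return (
        f"jdbc:sqlserver://{workspace}-ondemand.sql.azuresynapse.net:1433;"
        f"database={database};encrypt=true;trustServerCertificate=false;"
        "authentication=ActiveDirectoryPassword"
    )

def read_external_table(spark, workspace, database, table, user, password):
    """Load an external table from the serverless pool into a DataFrame."""
    return (
        spark.read.format("jdbc")
        .option("url", build_serverless_jdbc_url(workspace, database))
        .option("dbtable", table)      # e.g. "dbo.MyExternalTable" (placeholder)
        .option("user", user)          # AAD user principal name
        .option("password", password)
        .load()
    )

# Usage inside a Synapse notebook, where `spark` already exists:
# df = read_external_table(spark, "<workspace>", "<db>",
#                          "dbo.MyExternalTable", "<user@tenant>", "<password>")
```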
It's worth noting that it's usually not necessary to connect to the serverless pool from Spark. All the data behind your serverless pool's external tables lives in storage, so you can go straight to the storage account (you would need access permissions to storage anyway). That said, I can see a scenario where you have some complex views or something similar that you want to tap into from Spark.
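The "go straight to storage" option can be sketched as follows, assuming the Spark pool's AAD identity (or the notebook user) has permissions on the storage account; the container, account, and folder names are placeholders.

```python
# Sketch: skip the serverless pool and read the underlying files
# directly from ADLS Gen2 via an abfss:// URI.

def abfss_path(container: str, account: str, folder: str = "") -> str:
    """Build an abfss:// URI for a location in an ADLS Gen2 account."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{folder}"

# Usage inside a Synapse notebook, where `spark` already exists:
# df = spark.read.parquet(abfss_path("<container>", "<account>", "sales/2021/"))
# df.show()
```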
Thank you for your response @SamaraSoucy-MSFT ! When accessing the storage account, is there a way to restrict access to specific folders/directories or files within ADLS?
Similar to how you would control access in the serverless pool, both Access Control Lists (combined with AAD users) and SAS tokens are available for scoped access to ADLS.
The TokenLibrary available in your Spark pool will work with either option: https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-secure-credentials-with-tokenlibrary?pivots=programming-language-python
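For the SAS-token route, a minimal sketch looks like the following. It assumes a linked service holding the SAS credential exists in the workspace; the linked-service, container, and account names are placeholders, and the config-key pattern follows the TokenLibrary documentation linked above.

```python
# Sketch: fetch a SAS token via TokenLibrary and scope Spark's
# access to a single container in the storage account.

def sas_conf_key(container: str, account: str) -> str:
    """Spark config key that attaches a SAS token to one blob container."""
    return f"fs.azure.sas.{container}.{account}.blob.core.windows.net"

# Usage inside a Synapse notebook, where `spark` already exists:
# token_library = spark._jvm.com.microsoft.azure.synapse.tokenlibrary.TokenLibrary
# sas_token = token_library.getConnectionString("<linked-service-name>")
# spark.conf.set(sas_conf_key("<container>", "<account>"), sas_token)
# df = spark.read.parquet(
#     "wasbs://<container>@<account>.blob.core.windows.net/<folder>")
```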
Hi @msftsyn7270
I wanted to check in with you to see if you have any follow up questions for me on this issue.