question

TylerRomeo-1102 avatar image
0 Votes"
TylerRomeo-1102 asked PRADEEPCHEEKATLA-MSFT commented

Issues writing to SQl synapse Via polybase on azure databricks

I am writing to a table in sql synapse using Polybase as recommended by the documentation here https://docs.databricks.com/data/data-sources/azure/synapse-analytics.html and guidance from this post https://medium.com/microsoftazure/azure-synapse-data-load-using-polybase-or-copy-command-from-vnet-protected-azure-storage-da8aa6a9ac68

In pyspark, we set up our configuration with the following code:

 spark.conf.set("fs.azure.account.auth.type", "OAuth")
 spark.conf.set("fs.azure.account.oauth.provider.type",
                "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
 spark.conf.set("fs.azure.account.oauth2.client.id", os.environ['AZURE_CLIENT_ID'])
 spark.conf.set("fs.azure.account.oauth2.client.secret", os.environ['AZURE_CLIENT_SECRET'])
 spark.conf.set("fs.azure.account.oauth2.client.endpoint",
                f"https://login.microsoftonline.com/{os.environ['AZURE_TENANT_ID']}/oauth2/token")

and then write a dataframe of data to sql synapse through spark with:


 tempDir = "abfss://<container_name>@us6intdev001.dfs.core.windows.net/tempDirs/temp1"
    
 df.write \
             .format("com.databricks.spark.sqldw") \
             .option("useAzureMSI", "true") \
             .mode("append") \
             .option("url", "jdbc:sqlserver://<sql synapse hostname>;database=<db_name>;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;") \
             .option("dbTable", "test_staging") \
             .option("user", "myuser_dev") \
             .option("password", "<redacted>") \
             .option("tempDir", tempDir) \
             .save()

Our synapse workspace is set up so that it has the "Storage Blob Contributor" role for the container used, which is ADLS gen2. However we get an error response when attempting to write:

 Py4JJavaError: An error occurred while calling o333.save.
 : com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector.
 Underlying SQLException(s):
   - com.microsoft.sqlserver.jdbc.SQLServerException: An internal error occurred while authenticating against Managed Service Identity. Please contact support if this problem persists. [ErrorCode = 105098] [SQLState = S0001]


This exact code/configuration worked until 2 weeks ago, when it started returning the error. We have checked and the Storage Blob Contributor Role is still present on the managed identity.

Is there a way to get more information about the authentication error, or other settings to check to see if they have changed?




azure-synapse-analyticsazure-databricksazure-data-lake-storage
· 2
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Hello @TylerRomeo-1102

Have you added a try catch block in your notebook

It will give you detailed error

0 Votes 0 ·

I'm in the same team as the OP - we actually have the detailed error; OP just highlighted the most relevant part. It is not terribly helpful in pinpointing the cause beyond the fact that there is an authentication issue. Our puzzle is more on the cause, rather than how to get more detailed stack trace of this error.

0 Votes 0 ·

1 Answer

PRADEEPCHEEKATLA-MSFT avatar image
2 Votes"
PRADEEPCHEEKATLA-MSFT answered PRADEEPCHEEKATLA-MSFT commented

Hello @TylerRomeo-1102 & @Tania-6701,

Thanks for the question and using MS Q&A platform.

Could you please try to pause and resume the Synapse SQL pool and then try to run the notebook and see if that works.

In case if you are still facing the issue, please do share the complete stack trace of the error message?

Hope this will help. Please let us know if any further queries.


  • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how

  • Want a reminder to come back and check responses? Here is how to subscribe to a notification

  • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators

· 5
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Thanks @PRADEEPCHEEKATLA-MSFT - that actually did the trick.

0 Votes 0 ·

Thanks for the help @PRADEEPCHEEKATLA-MSFT! Is this a common occurrence with sql synapse? I'm wondering if we should automate a regular restart of the synapse pool, or if there are other preventative measures we could take to avoid this issue in future.

0 Votes 0 ·

Hello @TylerRomeo-1102,

This should not be a common occurrence with Synapse SQL. I'm just wondering if the above issue occur every time you run the notebook?

0 Votes 0 ·
TylerRomeo-1102 avatar image TylerRomeo-1102 PRADEEPCHEEKATLA-MSFT ·

It was happening every time until we did a pause+resume, and it hasn't happened again since

0 Votes 0 ·
Show more comments