I am writing to a table in Azure Synapse SQL using PolyBase, as recommended by the documentation at https://docs.databricks.com/data/data-sources/azure/synapse-analytics.html and by the guidance in this post: https://medium.com/microsoftazure/azure-synapse-data-load-using-polybase-or-copy-command-from-vnet-protected-azure-storage-da8aa6a9ac68
In PySpark, we configure storage authentication with the following code:
import os

# OAuth client-credentials settings for ABFS access; values come from environment variables.
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", os.environ['AZURE_CLIENT_ID'])
spark.conf.set("fs.azure.account.oauth2.client.secret", os.environ['AZURE_CLIENT_SECRET'])
spark.conf.set("fs.azure.account.oauth2.client.endpoint",
               f"https://login.microsoftonline.com/{os.environ['AZURE_TENANT_ID']}/oauth2/token")
We then write a DataFrame to Synapse through Spark with:
tempDir = "abfss://<container_name>@us6intdev001.dfs.core.windows.net/tempDirs/temp1"
df.write \
    .format("com.databricks.spark.sqldw") \
    .option("useAzureMSI", "true") \
    .mode("append") \
    .option("url", "jdbc:sqlserver://<sql synapse hostname>;database=<db_name>;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;") \
    .option("dbTable", "test_staging") \
    .option("user", "myuser_dev") \
    .option("password", "<redacted>") \
    .option("tempDir", tempDir) \
    .save()
Our Synapse workspace has the "Storage Blob Contributor" role on the container used, which is in ADLS Gen2 storage. However, we get an error response when attempting to write:
Py4JJavaError: An error occurred while calling o333.save.
: com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector.
Underlying SQLException(s):
- com.microsoft.sqlserver.jdbc.SQLServerException: An internal error occurred while authenticating against Managed Service Identity. Please contact support if this problem persists. [ErrorCode = 105098] [SQLState = S0001]
This exact code and configuration worked until two weeks ago, when it started returning the error. We have checked, and the Storage Blob Contributor role is still assigned to the managed identity.
Is there a way to get more information about the authentication error, or are there other settings we should check to see whether they have changed?
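For anyone trying to log or search on this failure, a minimal sketch of pulling the bracketed fields out of the exception text might look like the following. The helper name `parse_synapse_error` is hypothetical and not part of the connector; it only assumes the message keeps the `[Key = value]` layout shown in the error above.

```python
import re

# Hypothetical helper, not part of the Synapse connector: it extracts
# "[Key = value]" pairs from the text of a JDBC error message.
_FIELD = re.compile(r"\[(\w+)\s*=\s*([^\]]+)\]")


def parse_synapse_error(message: str) -> dict:
    """Return the bracketed fields (for example ErrorCode, SQLState) as a dict."""
    return {key: value.strip() for key, value in _FIELD.findall(message)}


# Message text copied from the error reported above.
message = (
    "com.microsoft.sqlserver.jdbc.SQLServerException: An internal error occurred "
    "while authenticating against Managed Service Identity. Please contact support "
    "if this problem persists. [ErrorCode = 105098] [SQLState = S0001]"
)

fields = parse_synapse_error(message)
print(fields)
```

In a notebook, the same function could be given the string form of the exception caught around `.save()`, on the assumption that it contains the same bracketed fields.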