Error creating a Spark database in a Synapse workspace notebook run from an Orchestrate pipeline

Rishabh 11 Reputation points
2020-07-21T07:01:54.63+00:00

I'm getting the error below when running a Synapse notebook from Orchestrate in a Synapse workspace. The notebook runs fine when run on its own from Develop. Before running the pipeline, I gave myself the Owner and Storage Blob Data Owner roles, as mentioned in https://github.com/MicrosoftDocs/azure-docs/issues/55550, but I still get the same error.

Error
org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: Operation failed: \"This request is not authorized to perform this operation using this permission", 403, HEAD, https://storage_account.dfs.core.windows.net/synapse/tmp/hive?upn=false&action=getStatus&timeout=90;",
"traceback": [
" at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)\n",
" at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214)\n",
" at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114)\n",
" at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102)\n",
" at org.apache.spark.sql.internal.SharedState.globalTempViewManager$lzycompute(SharedState.scala:141)\n",
" at org.apache.spark.sql.internal.SharedState.globalTempViewManager(SharedState.scala:136)\n",
" at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anonfun$2.apply(HiveSessionStateBuilder.scala:55)\n",
" at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anonfun$2.apply(HiveSessionStateBuilder.scala:55)\n",
" at org.apache.spark.sql.catalyst.catalog.SessionCatalog.globalTempViewManager$lzycompute(SessionCatalog.scala:91)\n",
" at org.apache.spark.sql.catalyst.catalog.SessionCatalog.globalTempViewManager(SessionCatalog.scala:91)\n",
" at org.apache.spark.sql.catalyst.catalog.SessionCatalog.setCurrentDatabase(SessionCatalog.scala:258)\n",
" at org.apache.spark.sql.execution.command.SetDatabaseCommand.run(databases.scala:59)\n",
" at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)\n",
" at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)\n",
" at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)\n",
" at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)\n",
" at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)\n",
" at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)\n",
" at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)\n",
" at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)\n",
" at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)\n",
" at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)\n",
" at org.apache.spark.sql.Dataset.<init>(Dataset.scala:194)\n",
" at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)\n",
" at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)\n",
" ... 52 elided\n",
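
For anyone reading a similar trace: the URL in the 403 message shows which storage account, container, and path the request was made against. A minimal sketch for pulling those pieces apart, assuming the usual `https://<account>.dfs.core.windows.net/<container>/<path>` layout (the function name `describe_adls_request` is hypothetical, not part of any Azure library):

```python
from urllib.parse import urlsplit, parse_qs


def describe_adls_request(url):
    """Split an ADLS Gen2 (dfs endpoint) URL into its parts.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # The storage account name is the first label of the host name.
    account = parts.hostname.split(".")[0]
    # The first path segment is the container (filesystem); the rest is the path.
    segments = parts.path.lstrip("/").split("/", 1)
    container = segments[0]
    path = segments[1] if len(segments) > 1 else ""
    query = parse_qs(parts.query)
    return {
        "account": account,
        "container": container,
        "path": path,
        "action": query.get("action", [None])[0],
    }


# URL taken from the error above, without the trailing ';'.
info = describe_adls_request(
    "https://storage_account.dfs.core.windows.net/synapse/tmp/hive"
    "?upn=false&action=getStatus&timeout=90"
)
print(info)
```

Here the denied call is a status check on `tmp/hive` inside the `synapse` container, so that container is where the identity running the pipeline needs access.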

Azure Data Lake Storage
Azure Synapse Analytics

1 answer

  1. Rishabh 11 Reputation points
    2020-07-23T07:35:15.327+00:00

    @MartinJaffer-MSFT Found the solution: the workspace's managed identity (MSI) didn't have permission on the container.

    2 people found this answer helpful.
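
As a rough illustration of the fix described above, the sketch below builds the container-level scope and an `az role assignment create` command for the workspace's managed identity. Everything here is an assumption rather than what the poster actually ran: the helper names are hypothetical, the IDs and resource group are placeholders, and the role `Storage Blob Data Contributor` is assumed because the answer does not say which role was granted.

```python
import shlex


def container_scope(subscription_id, resource_group, account, container):
    """Build the ARM resource ID of a blob container (hypothetical helper)."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        f"/blobServices/default/containers/{container}"
    )


def role_assignment_command(principal_id, scope,
                            role="Storage Blob Data Contributor"):
    """Return an Azure CLI command string that grants `role` at `scope`.

    The role name is an assumption; use whichever role your setup requires.
    """
    args = [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_id,
        "--assignee-principal-type", "ServicePrincipal",
        "--role", role,
        "--scope", scope,
    ]
    return shlex.join(args)


# Placeholder values; replace with your own subscription, resource group,
# and the object ID of the Synapse workspace managed identity.
scope = container_scope(
    "00000000-0000-0000-0000-000000000000",
    "my-resource-group",
    "storage_account",
    "synapse",
)
command = role_assignment_command("11111111-1111-1111-1111-111111111111", scope)
print(command)
```

The printed command can be reviewed and then run in a shell with the Azure CLI signed in. Granting the same role from the container's Access control (IAM) page in the portal should have the same effect.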