question

AdelineBornatico-0453 asked PRADEEPCHEEKATLA-MSFT commented

Databricks Delta Live Tables: "String index out of range: 89"

In a DLT pipeline in Azure Databricks, when I try to create a table from JSON files in Blob Storage, I get the error "shaded.databricks.org.apache.hadoop.fs.azure.AzureException: java.lang.StringIndexOutOfBoundsException: String index out of range: 89". This doesn't happen when I create a table from CSV. The JSON files are single-line.

The command looks like this:

CREATE OR REFRESH STREAMING LIVE TABLE neon_raw
AS SELECT * FROM cloud_files("wasbs://landing@sthvchddwhnoeu002.blob.core.windows.net/powerbi-tenant/activityevents/2021/11/","json")

I don't really get what this error means in this situation. Any ideas?

Best,
Adeline

(Detailed error messages below)




org.apache.spark.sql.AnalysisException: Unable to process statement for table 'neon_raw'.
at com.databricks.sql.transaction.tahoe.DeltaErrors$.analysisException(DeltaErrors.scala:244)
at com.databricks.pipelines.execution.languages.SQLPipeline.com$databricks$pipelines$execution$languages$SQLPipeline$$analyze(SQLPipeline.scala:195)
at com.databricks.pipelines.execution.languages.SQLPipeline$$anonfun$1.$anonfun$applyOrElse$4(SQLPipeline.scala:133)
at com.databricks.pipelines.Pipeline$$anon$1.$anonfun$call$3(Pipeline.scala:529)
at com.databricks.pipelines.Pipeline$.withContext(Pipeline.scala:83)
at com.databricks.pipelines.Pipeline$$anon$1.$anonfun$call$2(Pipeline.scala:529)
at scala.util.Try$.apply(Try.scala:213)
at com.databricks.pipelines.Pipeline$$anon$1.call(Pipeline.scala:528)
at com.databricks.pipelines.graph.Flow.flowFuncResult(elements.scala:370)
at com.databricks.pipelines.graph.Flow.flowFuncResult$(elements.scala:368)
at com.databricks.pipelines.graph.BasicFlow.flowFuncResult$lzycompute(elements.scala:443)
at com.databricks.pipelines.graph.BasicFlow.flowFuncResult(elements.scala:443)
at com.databricks.pipelines.graph.Flow.failure(elements.scala:419)
at com.databricks.pipelines.graph.Flow.failure$(elements.scala:418)
at com.databricks.pipelines.graph.BasicFlow.failure(elements.scala:443)
at com.databricks.pipelines.graph.Flow.resolved(elements.scala:427)
at com.databricks.pipelines.graph.Flow.resolved$(elements.scala:427)
at com.databricks.pipelines.graph.BasicFlow.resolved(elements.scala:443)
at com.databricks.pipelines.graph.DataflowGraph.$anonfun$resolve$1(DataflowGraph.scala:261)
at com.databricks.pipelines.Pipeline$.withGraphToConnect(Pipeline.scala:113)
at com.databricks.pipelines.graph.DataflowGraph.resolve(DataflowGraph.scala:211)
at com.databricks.pipelines.graph.DataflowGraph.connect(DataflowGraph.scala:201)
at com.databricks.pipelines.execution.TableManager.materializeTables(TableManager.scala:114)
at com.databricks.pipelines.execution.UpdateExecution.$anonfun$setupTables$1(UpdateExecution.scala:349)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.$anonfun$recordPipelinesOperation$2(DeltaPipelinesUsageLogging.scala:114)
at com.databricks.pipelines.common.monitoring.OperationStatusReporter.executeWithPeriodicReporting(OperationStatusReporter.scala:119)
at com.databricks.pipelines.common.monitoring.OperationStatusReporter$.executeWithPeriodicReporting(OperationStatusReporter.scala:159)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.$anonfun$recordPipelinesOperation$5(DeltaPipelinesUsageLogging.scala:133)
at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:341)
at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:435)
at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:455)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:215)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:95)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:213)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:210)
at com.databricks.pipelines.execution.monitoring.PublicLogging.withAttributionContext(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:251)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:243)
at com.databricks.pipelines.execution.monitoring.PublicLogging.withAttributionTags(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:430)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:350)
at com.databricks.pipelines.execution.monitoring.PublicLogging.recordOperationWithResultTags(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:341)
at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:313)
at com.databricks.pipelines.execution.monitoring.PublicLogging.recordOperation(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.pipelines.execution.monitoring.PublicLogging.recordOperation0(DeltaPipelinesUsageLogging.scala:59)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.recordPipelinesOperation(DeltaPipelinesUsageLogging.scala:126)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.recordPipelinesOperation$(DeltaPipelinesUsageLogging.scala:98)
at com.databricks.pipelines.execution.PipelineRunnable.recordPipelinesOperation(PipelineRunnable.scala:34)
at com.databricks.pipelines.execution.UpdateExecution.executeStage(UpdateExecution.scala:191)
at com.databricks.pipelines.execution.UpdateExecution.setupTables(UpdateExecution.scala:340)
at com.databricks.pipelines.execution.UpdateExecution.executeUpdate(UpdateExecution.scala:262)
at com.databricks.pipelines.execution.UpdateExecution.start(UpdateExecution.scala:113)
at com.databricks.pipelines.execution.service.ExecutionBackend$$anon$1.run(ExecutionBackend.scala:278)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.$anonfun$run$1(SparkThreadLocalForwardingThreadPoolExecutor.scala:104)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:68)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured$(SparkThreadLocalForwardingThreadPoolExecutor.scala:54)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:101)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.run(SparkThreadLocalForwardingThreadPoolExecutor.scala:104)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
shaded.databricks.org.apache.hadoop.fs.azure.AzureException: java.lang.StringIndexOutOfBoundsException: String index out of range: 89
at shaded.databricks.org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:1063)
at shaded.databricks.org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:512)
at shaded.databricks.org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:537)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
at com.databricks.sql.fileNotification.autoIngest.SchemaInferenceUtils$.sampleData(SchemaInferenceUtils.scala:59)
at com.databricks.sql.fileNotification.autoIngest.SchemaInferenceUtils$.getOrUpdatePersistedSchema(SchemaInferenceUtils.scala:173)
at com.databricks.sql.fileNotification.autoIngest.SchemaInferenceUtils$.inferSchema(SchemaInferenceUtils.scala:559)
at com.databricks.sql.fileNotification.autoIngest.CloudFilesSourceProvider.determineSchema(CloudFilesSourceProvider.scala:77)
at com.databricks.sql.fileNotification.autoIngest.CloudFilesSourceProvider.sourceSchema(CloudFilesSourceProvider.scala:114)
at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:259)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:140)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:140)
at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:34)
at com.databricks.sql.streaming.CloudFilesAnalysis$$anonfun$rewrite$1.applyOrElse(CloudFilesAnalysis.scala:87)
at com.databricks.sql.streaming.CloudFilesAnalysis$$anonfun$rewrite$1.applyOrElse(CloudFilesAnalysis.scala:47)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:94)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:324)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1143)
at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1142)
at org.apache.spark.sql.catalyst.plans.logical.UnaryNode.mapChildren(LogicalPlan.scala:188)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:324)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp(AnalysisHelper.scala:111)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp$(AnalysisHelper.scala:110)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:30)
at com.databricks.sql.streaming.CloudFilesAnalysis.rewrite(CloudFilesAnalysis.scala:47)
at com.databricks.sql.streaming.CloudFilesAnalysis.rewrite(CloudFilesAnalysis.scala:43)
at com.databricks.sql.optimizer.DatabricksEdgeRule.apply(DatabricksEdgeRule.scala:36)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$3(RuleExecutor.scala:216)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:216)
at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
at scala.collection.immutable.List.foldLeft(List.scala:91)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:213)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:205)
at scala.collection.immutable.List.foreach(List.scala:431)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:205)
at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:297)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:290)
at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:192)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:290)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:218)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:184)
at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:109)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:184)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:270)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:331)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:269)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:112)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:134)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:246)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:958)
at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:246)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:113)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:110)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:102)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:95)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:958)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:93)
at com.databricks.pipelines.execution.languages.SQLPipeline$.analyze(SQLPipeline.scala:332)
at com.databricks.pipelines.execution.languages.SQLPipeline.com$databricks$pipelines$execution$languages$SQLPipeline$$analyze(SQLPipeline.scala:187)
at com.databricks.pipelines.execution.languages.SQLPipeline$$anonfun$1.$anonfun$applyOrElse$4(SQLPipeline.scala:133)
at com.databricks.pipelines.Pipeline$$anon$1.$anonfun$call$3(Pipeline.scala:529)
at com.databricks.pipelines.Pipeline$.withContext(Pipeline.scala:83)
at com.databricks.pipelines.Pipeline$$anon$1.$anonfun$call$2(Pipeline.scala:529)
at scala.util.Try$.apply(Try.scala:213)
at com.databricks.pipelines.Pipeline$$anon$1.call(Pipeline.scala:528)
at com.databricks.pipelines.graph.Flow.flowFuncResult(elements.scala:370)
at com.databricks.pipelines.graph.Flow.flowFuncResult$(elements.scala:368)
at com.databricks.pipelines.graph.BasicFlow.flowFuncResult$lzycompute(elements.scala:443)
at com.databricks.pipelines.graph.BasicFlow.flowFuncResult(elements.scala:443)
at com.databricks.pipelines.graph.Flow.failure(elements.scala:419)
at com.databricks.pipelines.graph.Flow.failure$(elements.scala:418)
at com.databricks.pipelines.graph.BasicFlow.failure(elements.scala:443)
at com.databricks.pipelines.graph.Flow.resolved(elements.scala:427)
at com.databricks.pipelines.graph.Flow.resolved$(elements.scala:427)
at com.databricks.pipelines.graph.BasicFlow.resolved(elements.scala:443)
at com.databricks.pipelines.graph.DataflowGraph.$anonfun$resolve$1(DataflowGraph.scala:261)
at com.databricks.pipelines.Pipeline$.withGraphToConnect(Pipeline.scala:113)
at com.databricks.pipelines.graph.DataflowGraph.resolve(DataflowGraph.scala:211)
at com.databricks.pipelines.graph.DataflowGraph.connect(DataflowGraph.scala:201)
at com.databricks.pipelines.execution.TableManager.materializeTables(TableManager.scala:114)
at com.databricks.pipelines.execution.UpdateExecution.$anonfun$setupTables$1(UpdateExecution.scala:349)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.$anonfun$recordPipelinesOperation$2(DeltaPipelinesUsageLogging.scala:114)
at com.databricks.pipelines.common.monitoring.OperationStatusReporter.executeWithPeriodicReporting(OperationStatusReporter.scala:119)
at com.databricks.pipelines.common.monitoring.OperationStatusReporter$.executeWithPeriodicReporting(OperationStatusReporter.scala:159)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.$anonfun$recordPipelinesOperation$5(DeltaPipelinesUsageLogging.scala:133)
at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:341)
at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:435)
at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:455)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:215)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:95)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:213)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:210)
at com.databricks.pipelines.execution.monitoring.PublicLogging.withAttributionContext(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:251)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:243)
at com.databricks.pipelines.execution.monitoring.PublicLogging.withAttributionTags(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:430)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:350)
at com.databricks.pipelines.execution.monitoring.PublicLogging.recordOperationWithResultTags(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:341)
at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:313)
at com.databricks.pipelines.execution.monitoring.PublicLogging.recordOperation(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.pipelines.execution.monitoring.PublicLogging.recordOperation0(DeltaPipelinesUsageLogging.scala:59)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.recordPipelinesOperation(DeltaPipelinesUsageLogging.scala:126)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.recordPipelinesOperation$(DeltaPipelinesUsageLogging.scala:98)
at com.databricks.pipelines.execution.PipelineRunnable.recordPipelinesOperation(PipelineRunnable.scala:34)
at com.databricks.pipelines.execution.UpdateExecution.executeStage(UpdateExecution.scala:191)
at com.databricks.pipelines.execution.UpdateExecution.setupTables(UpdateExecution.scala:340)
at com.databricks.pipelines.execution.UpdateExecution.executeUpdate(UpdateExecution.scala:262)
at com.databricks.pipelines.execution.UpdateExecution.start(UpdateExecution.scala:113)
at com.databricks.pipelines.execution.service.ExecutionBackend$$anon$1.run(ExecutionBackend.scala:278)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.$anonfun$run$1(SparkThreadLocalForwardingThreadPoolExecutor.scala:104)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:68)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured$(SparkThreadLocalForwardingThreadPoolExecutor.scala:54)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:101)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.run(SparkThreadLocalForwardingThreadPoolExecutor.scala:104)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
java.lang.StringIndexOutOfBoundsException: String index out of range: 89
at java.lang.String.charAt(String.java:658)
at hadoop_azure_shaded.com.microsoft.azure.storage.core.Base64.decode(Base64.java:82)
at hadoop_azure_shaded.com.microsoft.azure.storage.StorageCredentialsAccountAndKey.<init>(StorageCredentialsAccountAndKey.java:81)
at shaded.databricks.org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.connectUsingConnectionStringCredentials(AzureNativeFileSystemStore.java:903)
at shaded.databricks.org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.createAzureStorageSession(AzureNativeFileSystemStore.java:1048)
at shaded.databricks.org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.initialize(AzureNativeFileSystemStore.java:512)
at shaded.databricks.org.apache.hadoop.fs.azure.NativeAzureFileSystem.initialize(NativeAzureFileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:537)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
at com.databricks.sql.fileNotification.autoIngest.SchemaInferenceUtils$.sampleData(SchemaInferenceUtils.scala:59)
at com.databricks.sql.fileNotification.autoIngest.SchemaInferenceUtils$.getOrUpdatePersistedSchema(SchemaInferenceUtils.scala:173)
at com.databricks.sql.fileNotification.autoIngest.SchemaInferenceUtils$.inferSchema(SchemaInferenceUtils.scala:559)
at com.databricks.sql.fileNotification.autoIngest.CloudFilesSourceProvider.determineSchema(CloudFilesSourceProvider.scala:77)
at com.databricks.sql.fileNotification.autoIngest.CloudFilesSourceProvider.sourceSchema(CloudFilesSourceProvider.scala:114)
at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:259)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:140)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:140)
at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:34)
at com.databricks.sql.streaming.CloudFilesAnalysis$$anonfun$rewrite$1.applyOrElse(CloudFilesAnalysis.scala:87)
at com.databricks.sql.streaming.CloudFilesAnalysis$$anonfun$rewrite$1.applyOrElse(CloudFilesAnalysis.scala:47)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:94)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:324)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1143)
at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1142)
at org.apache.spark.sql.catalyst.plans.logical.UnaryNode.mapChildren(LogicalPlan.scala:188)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:324)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp(AnalysisHelper.scala:111)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp$(AnalysisHelper.scala:110)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:30)
at com.databricks.sql.streaming.CloudFilesAnalysis.rewrite(CloudFilesAnalysis.scala:47)
at com.databricks.sql.streaming.CloudFilesAnalysis.rewrite(CloudFilesAnalysis.scala:43)
at com.databricks.sql.optimizer.DatabricksEdgeRule.apply(DatabricksEdgeRule.scala:36)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$3(RuleExecutor.scala:216)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:216)
at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
at scala.collection.immutable.List.foldLeft(List.scala:91)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:213)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:205)
at scala.collection.immutable.List.foreach(List.scala:431)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:205)
at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:297)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:290)
at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:192)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:290)
at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:218)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:184)
at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:109)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:184)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:270)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:331)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:269)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:112)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:134)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:246)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:958)
at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:246)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:113)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:110)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:102)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:95)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:958)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:93)
at com.databricks.pipelines.execution.languages.SQLPipeline$.analyze(SQLPipeline.scala:332)
at com.databricks.pipelines.execution.languages.SQLPipeline.com$databricks$pipelines$execution$languages$SQLPipeline$$analyze(SQLPipeline.scala:187)
at com.databricks.pipelines.execution.languages.SQLPipeline$$anonfun$1.$anonfun$applyOrElse$4(SQLPipeline.scala:133)
at com.databricks.pipelines.Pipeline$$anon$1.$anonfun$call$3(Pipeline.scala:529)
at com.databricks.pipelines.Pipeline$.withContext(Pipeline.scala:83)
at com.databricks.pipelines.Pipeline$$anon$1.$anonfun$call$2(Pipeline.scala:529)
at scala.util.Try$.apply(Try.scala:213)
at com.databricks.pipelines.Pipeline$$anon$1.call(Pipeline.scala:528)
at com.databricks.pipelines.graph.Flow.flowFuncResult(elements.scala:370)
at com.databricks.pipelines.graph.Flow.flowFuncResult$(elements.scala:368)
at com.databricks.pipelines.graph.BasicFlow.flowFuncResult$lzycompute(elements.scala:443)
at com.databricks.pipelines.graph.BasicFlow.flowFuncResult(elements.scala:443)
at com.databricks.pipelines.graph.Flow.failure(elements.scala:419)
at com.databricks.pipelines.graph.Flow.failure$(elements.scala:418)
at com.databricks.pipelines.graph.BasicFlow.failure(elements.scala:443)
at com.databricks.pipelines.graph.Flow.resolved(elements.scala:427)
at com.databricks.pipelines.graph.Flow.resolved$(elements.scala:427)
at com.databricks.pipelines.graph.BasicFlow.resolved(elements.scala:443)
at com.databricks.pipelines.graph.DataflowGraph.$anonfun$resolve$1(DataflowGraph.scala:261)
at com.databricks.pipelines.Pipeline$.withGraphToConnect(Pipeline.scala:113)
at com.databricks.pipelines.graph.DataflowGraph.resolve(DataflowGraph.scala:211)
at com.databricks.pipelines.graph.DataflowGraph.connect(DataflowGraph.scala:201)
at com.databricks.pipelines.execution.TableManager.materializeTables(TableManager.scala:114)
at com.databricks.pipelines.execution.UpdateExecution.$anonfun$setupTables$1(UpdateExecution.scala:349)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.$anonfun$recordPipelinesOperation$2(DeltaPipelinesUsageLogging.scala:114)
at com.databricks.pipelines.common.monitoring.OperationStatusReporter.executeWithPeriodicReporting(OperationStatusReporter.scala:119)
at com.databricks.pipelines.common.monitoring.OperationStatusReporter$.executeWithPeriodicReporting(OperationStatusReporter.scala:159)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.$anonfun$recordPipelinesOperation$5(DeltaPipelinesUsageLogging.scala:133)
at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:341)
at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:435)
at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:455)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:215)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:95)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:213)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:210)
at com.databricks.pipelines.execution.monitoring.PublicLogging.withAttributionContext(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:251)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:243)
at com.databricks.pipelines.execution.monitoring.PublicLogging.withAttributionTags(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:430)
at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:350)
at com.databricks.pipelines.execution.monitoring.PublicLogging.recordOperationWithResultTags(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:341)
at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:313)
at com.databricks.pipelines.execution.monitoring.PublicLogging.recordOperation(DeltaPipelinesUsageLogging.scala:24)
at com.databricks.pipelines.execution.monitoring.PublicLogging.recordOperation0(DeltaPipelinesUsageLogging.scala:59)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.recordPipelinesOperation(DeltaPipelinesUsageLogging.scala:126)
at com.databricks.pipelines.execution.monitoring.DeltaPipelinesUsageLogging.recordPipelinesOperation$(DeltaPipelinesUsageLogging.scala:98)
at com.databricks.pipelines.execution.PipelineRunnable.recordPipelinesOperation(PipelineRunnable.scala:34)
at com.databricks.pipelines.execution.UpdateExecution.executeStage(UpdateExecution.scala:191)
at com.databricks.pipelines.execution.UpdateExecution.setupTables(UpdateExecution.scala:340)
at com.databricks.pipelines.execution.UpdateExecution.executeUpdate(UpdateExecution.scala:262)
at com.databricks.pipelines.execution.UpdateExecution.start(UpdateExecution.scala:113)
at com.databricks.pipelines.execution.service.ExecutionBackend$$anon$1.run(ExecutionBackend.scala:278)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.$anonfun$run$1(SparkThreadLocalForwardingThreadPoolExecutor.scala:104)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:68)
at org.apache.spark.util.threads.SparkThreadLocalCapturingHelper.runWithCaptured$(SparkThreadLocalForwardingThreadPoolExecutor.scala:54)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.runWithCaptured(SparkThreadLocalForwardingThreadPoolExecutor.scala:101)
at org.apache.spark.util.threads.SparkThreadLocalCapturingRunnable.run(SparkThreadLocalForwardingThreadPoolExecutor.scala:104)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

azure-databricks

Hello @AdelineBornatico-0453,

Just checking in to see if you have had a chance to look at the previous response. We need the requested information to understand and investigate this issue further.


Hello @AdelineBornatico-0453,

Following up to see if the suggestion below was helpful. If you have any further queries, do let us know.


  • Please don't forget to click Accept Answer or the Up-Vote button whenever the information provided helps you.


1 Answer

PRADEEPCHEEKATLA-MSFT answered PRADEEPCHEEKATLA-MSFT commented

Hello @AdelineBornatico-0453,

Welcome to the MS Q&A platform.

How exactly are you authenticating to the Azure Blob Storage account?
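
For reference, direct wasbs:// access normally requires the storage account key to be set in the Spark configuration, and the trace above fails while Base64-decoding that key (Base64.decode inside StorageCredentialsAccountAndKey), which can happen if the configured key is truncated or mistyped. Below is a minimal Python sketch of the usual account-key setup; the secret scope and key names are hypothetical placeholders:

# Minimal sketch, assuming account-key authentication for wasbs:// access.
# The storage account name comes from the question; "my-scope" and
# "storage-account-key" are hypothetical placeholders for your secret scope.
spark.conf.set(
    "fs.azure.account.key.sthvchddwhnoeu002.blob.core.windows.net",
    dbutils.secrets.get(scope="my-scope", key="storage-account-key"),
)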

Could you please clarify where exactly you are running the Delta Live Tables pipeline? Please share a screenshot to help us understand.


I would suggest mounting a blob storage container of your own and then using the mount path (e.g., /mnt/mountname) as the source, as in the sketch below.
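
A minimal sketch of such a mount, run once from a regular (non-DLT) notebook; again, the secret scope and key names are hypothetical placeholders:

# Sketch: mount the "landing" container from the question under /mnt/landing.
# "my-scope" and "storage-account-key" are placeholders for wherever the
# storage account key is kept.
dbutils.fs.mount(
    source="wasbs://landing@sthvchddwhnoeu002.blob.core.windows.net",
    mount_point="/mnt/landing",
    extra_configs={
        "fs.azure.account.key.sthvchddwhnoeu002.blob.core.windows.net":
            dbutils.secrets.get(scope="my-scope", key="storage-account-key")
    },
)

After mounting, the cloud_files source path in your CREATE statement would become something like "/mnt/landing/powerbi-tenant/activityevents/2021/11/".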

Meanwhile, you can go through the Delta Live Tables Quickstart.

Hope this helps. Please let us know if you have any further queries.


  • Please don't forget to click Accept Answer or the Up-Vote button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how

  • Want a reminder to come back and check responses? Here is how to subscribe to a notification

  • If you are interested in joining the VM program and helping shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators



Hello @AdelineBornatico-0453,

Just checking in to see if the above answer helped. If it answers your query, do click Accept Answer and Up-Vote. And if you have any further queries, do let us know.
