question

BosROYRoy-3914 asked · PRADEEPCHEEKATLA-MSFT commented

Cannot access Databricks sample data sets

On our Databricks instances, when we try to access the Databricks sample data sets using display(dbutils.fs.ls('/databricks-datasets/')) or an equivalent command, as described here, the command keeps running without producing any output. Normally these data sets should be available even on new clusters. Data in the file store can be accessed normally. Here are some settings we use that might be relevant:

 spark_conf = {
   "spark.databricks.cluster.profile"                = "serverless"
   "spark.databricks.passthrough.enabled"            = true
   "spark.databricks.delta.preview.enabled"          = true
   "spark.databricks.pyspark.enableProcessIsolation" = true
   "spark.databricks.repl.allowedLanguages"          = "python,sql"
 }



azure-databricks, dotnet-ml-big-data

Check the error log of the cluster. It is a known issue. Turns out you need to whitelist the S3 bucket on which the sample data resides.
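If blocked outbound access is the suspected cause, as this comment suggests, a quick reachability test from a notebook cell can help confirm it before changing firewall rules. The sketch below is only an illustration: `can_connect` is a hypothetical helper, and the hostname of the storage endpoint is not given in this thread, so it is left as a placeholder to be taken from the cluster's error log.

```python
import socket


def can_connect(host, port=443, timeout_s=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Usage (placeholder host; replace with the endpoint named in your error log):
# print(can_connect("<storage-endpoint-hostname>"))
```

A `False` result only shows that the TCP connection failed from the machine running the cell; it does not identify which firewall or routing rule is responsible.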

BosROYRoy-3914 GopinathRajee-8127 ·

Thanks for the suggestion. I will look into this.


1 Answer

PRADEEPCHEEKATLA-MSFT answered · PRADEEPCHEEKATLA-MSFT commented

Hello @BosROYRoy-3914,

Thanks for the question and for using the Microsoft Q&A platform.

When you say "commands keep running", how long do they keep running? Could you please try creating a new cluster and see whether you experience the same behaviour?
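One way to put a bound on "keeps running" is to wrap the listing call in a timeout so the cell fails fast instead of waiting on the notebook. This is a minimal sketch, not a Databricks feature: `run_with_timeout` is a hypothetical helper, and the commented usage line assumes a notebook where `dbutils` is already defined.

```python
import threading


def run_with_timeout(fn, timeout_s):
    """Run fn() in a background thread; raise TimeoutError if it takes too long."""
    result = {}

    def target():
        try:
            result["value"] = fn()
        except Exception as exc:  # hand the failure back to the caller
            result["error"] = exc

    # Daemon thread, so a stuck call does not keep the interpreter alive.
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        raise TimeoutError(f"call did not finish within {timeout_s} seconds")
    if "error" in result:
        raise result["error"]
    return result["value"]


# Usage inside a notebook (assumes dbutils exists there):
# files = run_with_timeout(lambda: dbutils.fs.ls("/databricks-datasets/"), 60)
```

Note that the wrapper only stops waiting; the underlying call keeps running in the background thread until it returns or fails on its own.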

In the repro on our end, it works as expected without any issue.

202681-adb-databricks-datasets.gif

Hope this helps. Please let us know if you have any further queries.



I tried with a new cluster, but the command consistently runs for 14.49 minutes, and then gives the following error:

 ExecutionError: An error occurred while calling z:com.databricks.backend.daemon.dbutils.FSUtils.ls.
 : com.databricks.backend.daemon.data.common.InvalidMountException: Error while using path /databricks-datasets for resolving path '/' within mount at '/databricks-datasets'.
     at com.databricks.backend.daemon.data.common.InvalidMountException$.apply(DataMessages.scala:681)
     at com.databricks.backend.daemon.data.filesystem.MountEntryResolver.resolve(MountEntryResolver.scala:84)
     at com.databricks.backend.daemon.data.client.DBFSV2.resolve(DatabricksFileSystemV2.scala:81)
     ...
 Caused by: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out with exception after 1 attempts
     at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
     at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
     at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
     ...
 Caused by: java.lang.Throwable: Timed out with exception after 1 attempts
     at com.databricks.backend.common.util.TimeUtils$.retryWithExponentialBackoff0(TimeUtils.scala:221)
     at com.databricks.backend.common.util.TimeUtils$.retryWithExponentialBackoff(TimeUtils.scala:145)
     at com.databricks.backend.common.util.TimeUtils$.retryWithTimeout(TimeUtils.scala:94)
     ...
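
The trace above nests several exceptions, and the innermost `Caused by:` line shows that the mount resolution ended in a timeout. For long traces, a small helper can pull out that chain from pasted text. This is a sketch for illustration only; `cause_chain` is a made-up name, and the sample is an abbreviated copy of the trace in this comment.

```python
def cause_chain(trace):
    """Return the messages of all 'Caused by:' lines, outermost first."""
    prefix = "Caused by:"
    causes = []
    for line in trace.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            causes.append(stripped[len(prefix):].strip())
    return causes


sample = """\
ExecutionError: An error occurred while calling z:com.databricks.backend.daemon.dbutils.FSUtils.ls.
Caused by: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out with exception after 1 attempts
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
Caused by: java.lang.Throwable: Timed out with exception after 1 attempts
    at com.databricks.backend.common.util.TimeUtils$.retryWithTimeout(TimeUtils.scala:94)
"""

# The last entry is the innermost cause.
print(cause_chain(sample)[-1])
# prints: java.lang.Throwable: Timed out with exception after 1 attempts
```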




Hello @BosROYRoy-3914,

Apologies for the delay in responding.

Are you still experiencing the above error message?

If yes, could you please share the Databricks Runtime version details?
