Prepare Storage for Data Loading and Model Checkpointing

Data loading and model checkpointing are crucial to deep learning workloads, especially distributed deep learning.

In Databricks Runtime 6.0 and above, Azure Databricks provides a high-performance FUSE mount.

In Databricks Runtime 5.3 to Databricks Runtime 5.5, Azure Databricks provides dbfs:/ml, a special folder that offers high-performance I/O for deep learning workloads and maps to file:/dbfs/ml on driver and worker nodes. Azure Databricks recommends using Databricks Runtime 5.3 or above and saving data under /dbfs/ml. This FUSE mount also alleviates the limitation of the local file I/O API in Databricks Runtime, which supports only files smaller than 2 GB.
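Because /dbfs/ml behaves like a local directory, checkpoints can be written with ordinary file I/O. The sketch below illustrates this with a small helper; the `save_checkpoint`/`load_checkpoint` names and the `/dbfs/ml/checkpoints` subdirectory are illustrative assumptions, not a Databricks API.

```python
import os
import pickle

def save_checkpoint(state, name, base_dir="/dbfs/ml/checkpoints"):
    """Save a checkpoint dict using plain local-file APIs.

    base_dir defaults to a path under the /dbfs/ml FUSE mount
    (hypothetical subdirectory; choose your own layout).
    """
    os.makedirs(base_dir, exist_ok=True)
    path = os.path.join(base_dir, name)
    with open(path, "wb") as f:  # ordinary file I/O works on the FUSE mount
        pickle.dump(state, f)
    return path

def load_checkpoint(path):
    """Load a checkpoint previously written by save_checkpoint."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

In practice you would pass a /dbfs/ml path directly to your framework's own checkpoint API (for example, a TensorFlow or PyTorch save call) rather than pickling by hand; the point is that no special storage client is needed.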

If you use a Databricks Runtime version lower than 5.3, or you want to use your own storage, Azure Databricks recommends the blobfuse client, an open source project that provides a virtual filesystem backed by Azure Blob storage. To mount an Azure Blob storage container as a file system with blobfuse, you can use an init script. The following notebook explains how to generate an init script and configure a cluster to run the script.
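As a rough sketch of what such an init script does, the fragment below writes a blobfuse connection config and mounts a container. The storage account, container, key placeholder, and all paths are assumptions you must replace; it also assumes blobfuse is already installed on the node (the generated notebook handles installation as well).

```shell
#!/bin/bash
# Hypothetical blobfuse mount sketch -- replace the placeholders below.
STORAGE_ACCOUNT="mystorageaccount"            # assumption: your storage account
CONTAINER="mycontainer"                       # assumption: your container name
CFG_DIR="${CFG_DIR:-/tmp/blobfuse}"           # where the config file is written
MOUNT_POINT="${MOUNT_POINT:-/tmp/blobfuse-mnt}"
TMP_PATH="${TMP_PATH:-/tmp/blobfuse-cache}"   # blobfuse local cache directory

mkdir -p "$CFG_DIR" "$MOUNT_POINT" "$TMP_PATH"

# blobfuse reads credentials from a config file; keep it readable only by root.
cat > "$CFG_DIR/connection.cfg" <<EOF
accountName $STORAGE_ACCOUNT
accountKey <your-account-key>
containerName $CONTAINER
EOF
chmod 600 "$CFG_DIR/connection.cfg"

# Mount the container (only attempted if blobfuse is installed).
if command -v blobfuse >/dev/null 2>&1; then
  blobfuse "$MOUNT_POINT" \
    --tmp-path="$TMP_PATH" \
    --config-file="$CFG_DIR/connection.cfg" \
    -o attr_timeout=240 -o entry_timeout=240 -o negative_timeout=120
fi
```

An init script like this runs on every node at cluster start, so the mount point is available to both the driver and the workers; in production you would source the account key from a secret rather than hard-coding it.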

blobfuse init script notebook
