TensorFlow

TensorFlow is an open-source framework for machine learning created by Google. It supports deep learning and general numerical computation on CPUs, GPUs, and clusters of GPUs, and is distributed under the Apache 2.0 License.

In the sections below, we provide guidance on installing TensorFlow on Azure Databricks and give an example of running TensorFlow programs.

Note

This guide is not a comprehensive introduction to TensorFlow. For full documentation, see the TensorFlow website.

Install TensorFlow

TensorFlow versions included in Databricks Runtime ML

Databricks Runtime ML includes TensorFlow and TensorBoard so you can use these libraries without installing any packages. Here are the TensorFlow versions included:

Databricks Runtime ML version    TensorFlow version
6.1                              1.14.0
5.5 LTS - 6.0                    1.13.1
5.1 - 5.4                        1.12.0
5.0                              1.10.0

Install TensorFlow on Databricks Runtime ML and Databricks Runtime

Azure Databricks provides instructions for installing newer releases of TensorFlow on Databricks Runtime ML and Databricks Runtime, so that you can try out the latest features in TensorFlow. Due to package dependencies, there might be compatibility issues with other pre-installed packages. After installation, you can verify the installed version by executing the command below in a Python notebook:

import tensorflow as tf
print([tf.__version__, tf.test.is_gpu_available()])
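
The check above raises an ImportError when TensorFlow is not installed. As a minimal sketch, the helper below reports that case cleanly instead. The function name tensorflow_status and its module_name parameter are hypothetical and exist only for illustration; the GPU check is the same tf.test.is_gpu_available() call used above.

```python
import importlib
import importlib.util


def tensorflow_status(module_name="tensorflow"):
    """Report whether TensorFlow is importable, its version, and GPU visibility.

    module_name is a hypothetical parameter, added only so the helper can be
    exercised on a machine that does not have TensorFlow installed.
    """
    status = {"installed": False, "version": None, "gpu": None}
    if importlib.util.find_spec(module_name) is None:
        return status  # The package is not installed in this environment.

    tf = importlib.import_module(module_name)
    status["installed"] = True
    status["version"] = getattr(tf, "__version__", None)
    # Same GPU check as the notebook command above, guarded in case it is absent.
    test_module = getattr(tf, "test", None)
    if test_module is not None and hasattr(test_module, "is_gpu_available"):
        status["gpu"] = bool(test_module.is_gpu_available())
    return status
```

Calling print(tensorflow_status()) in a Python notebook prints a dictionary with the installed flag, the version string, and the GPU flag.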

Install TensorFlow 2.0 on Databricks Runtime 6.1 ML

Azure Databricks recommends installing TensorFlow 2.0 on Databricks Runtime 6.1 ML using an init script.

  • Init script for CPU clusters:

    #!/bin/bash
    
    set -e
    
    conda install -y tensorflow-mkl=2.0 setuptools=41
    
  • Init script for GPU clusters:

    #!/bin/bash
    
    set -e
    
    conda install -y tensorflow-gpu=2.0 setuptools=41
    

Install TensorFlow 2.0 on Databricks Runtime 6.1

On Databricks Runtime 6.1 CPU clusters, install TensorFlow 2.0 as a Databricks PyPI library: tensorflow==2.0.0.

Install TensorFlow 2.0 on Databricks Runtime 5.5 LTS ML

Azure Databricks recommends installing TensorFlow 2.0 on Databricks Runtime 5.5 LTS ML using an init script.

  • Init script for CPU clusters:

    #!/bin/bash
    
    set -e
    
    /databricks/python/bin/python -V
    . /databricks/conda/etc/profile.d/conda.sh
    conda install -y conda=4.6
    conda activate /databricks/python
    
    conda install -y tensorflow-mkl=2.0 setuptools=41
    
  • Init script for GPU clusters:

    #!/bin/bash
    
    set -e
    
    /databricks/python/bin/python -V
    . /databricks/conda/etc/profile.d/conda.sh
    conda install -y conda=4.6
    conda activate /databricks/python
    
    conda install -y tensorflow-gpu=2.0 setuptools=41
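
An init script has to be stored somewhere the cluster can read it, and one possible way to do that is to write it to DBFS from a notebook. The sketch below assumes the notebook-provided dbutils object and its dbutils.fs.put(path, contents, overwrite) call. The helper names and the DBFS path in the final comment are hypothetical. The commands are copied from the CPU init script for Databricks Runtime 5.5 LTS ML above.

```python
def render_init_script(commands):
    """Join shell commands into a bash init script that stops on the first error."""
    lines = ["#!/bin/bash", "", "set -e", ""] + list(commands)
    return "\n".join(lines) + "\n"


def write_init_script(dbutils, path, commands):
    """Write the rendered script to path, overwriting any existing file."""
    script = render_init_script(commands)
    dbutils.fs.put(path, script, True)  # third argument: overwrite
    return script


# Commands copied from the CPU init script for Databricks Runtime 5.5 LTS ML.
TF2_CPU_COMMANDS = [
    "/databricks/python/bin/python -V",
    ". /databricks/conda/etc/profile.d/conda.sh",
    "conda install -y conda=4.6",
    "conda activate /databricks/python",
    "",
    "conda install -y tensorflow-mkl=2.0 setuptools=41",
]

# In a Databricks notebook, where dbutils is predefined (hypothetical path):
# write_init_script(dbutils, "dbfs:/databricks/init-scripts/install-tf2-cpu.sh", TF2_CPU_COMMANDS)
```

After the file is written, reference its path in the cluster's init script configuration and restart the cluster so that the script runs on every node.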
    

Install TensorFlow 1.14 on Databricks Runtime 5.5 LTS ML

Install TensorFlow 1.14 on Databricks Runtime 5.5 LTS ML as a Databricks PyPI library:

  • CPU: tensorflow==1.14.0
  • GPU: tensorflow-gpu==1.14.0

Install TensorFlow 2.0 on Databricks Runtime 5.5 LTS

Azure Databricks recommends installing TensorFlow 2.0 on Databricks Runtime 5.5 LTS using an init script.

  • Init script for CPU clusters:

    #!/bin/bash
    
    set -e
    
    /databricks/python/bin/python -V
    /databricks/python/bin/pip install pip==19.*
    /databricks/python/bin/pip install tensorflow==2.0.0
    
  • Init script for GPU clusters:

    #!/bin/bash
    
    set -e
    
    apt-get update
    apt-get install -y gnupg-curl
    
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
    dpkg -i cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
    apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
    
    wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
    dpkg -i nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
    
    apt-get update
    apt-get install -y --no-install-recommends cuda-libraries-10-0 libcudnn7=7.4.2.24-1+cuda10.0
    
    /databricks/python/bin/python -V
    /databricks/python/bin/pip install pip==19.*
    /databricks/python/bin/pip install tensorflow-gpu==2.0.0
    

Install TensorFlow 1.14 on Databricks Runtime 5.5 LTS

  • CPU clusters: install TensorFlow 1.14 on the cluster as a Databricks PyPI library (tensorflow==1.14.0), or use Databricks library utilities to install it in a Python notebook session:

    dbutils.library.installPyPI("tensorflow", version="1.14.0")
    dbutils.library.restartPython()
    
  • GPU clusters: install TensorFlow 1.14 using the following init script:

    #!/bin/bash
    
    set -e
    
    apt-get update
    apt-get install -y gnupg-curl
    
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
    dpkg -i cuda-repo-ubuntu1604_10.0.130-1_amd64.deb
    apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
    
    wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
    dpkg -i nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
    
    apt-get update
    apt-get install -y --no-install-recommends cuda-libraries-10-0 libcudnn7=7.4.2.24-1+cuda10.0
    
    /databricks/python/bin/python -V
    /databricks/python/bin/pip install tensorflow-gpu==1.14.0
    

Known issues

TensorFlow 2.0.0 has a known incompatibility with Python pickling. You might encounter it if you use PySpark, HorovodRunner, Hyperopt, or any other packages that depend on pickling. The workaround is to explicitly import TensorFlow modules inside your functions. Here is an example:

import tensorflow as tf

def bad_func(_):
  tf.keras.Sequential()

# You might see an error.
sc.parallelize(range(0)).foreach(bad_func)

def good_func(_):
  import tensorflow as tf
  tf.keras.Sequential()

# No error.
sc.parallelize(range(0)).foreach(good_func)

TensorBoard

TensorBoard is TensorFlow’s suite of visualization tools for debugging, optimizing, and understanding TensorFlow programs.

Note

  • TensorBoard is supported in Databricks Runtime versions 5.0 and above. Earlier versions, including Databricks Runtime 4.3 and Databricks Runtime 4.1 ML, do not include TensorBoard support.
  • You must install TensorFlow as a Databricks PyPI library.

Using TensorBoard

To start TensorBoard from your notebook, use the dbutils.tensorboard utility.

dbutils.tensorboard.start("/tmp/tensorflow_log_dir")

This command displays a link that, when clicked, opens TensorBoard in a new tab.

TensorBoard reads from the same log directory that you write to in TensorFlow (for example, tf.summary.FileWriter("/tmp/tensorflow_log_dir", graph=sess.graph)). For the best performance, we recommend that you store your log files in a local directory on the driver, for example, /tmp/tensorflow_log_dir, and copy them to persistent storage as needed.
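
Copying the local log directory to persistent storage could look like the following sketch. The helper name copy_logs and the destination path in the usage note are hypothetical. The sketch assumes the destination is reachable as an ordinary file path from the driver, for example through a /dbfs mount.

```python
import os
import shutil


def copy_logs(local_log_dir, persistent_dir):
    """Copy every file under local_log_dir into persistent_dir, keeping subfolders.

    Returns the number of files copied.
    """
    copied = 0
    for root, _dirs, files in os.walk(local_log_dir):
        # Recreate the same relative folder layout under the destination.
        rel = os.path.relpath(root, local_log_dir)
        target = os.path.normpath(os.path.join(persistent_dir, rel))
        os.makedirs(target, exist_ok=True)
        for name in files:
            shutil.copyfile(os.path.join(root, name), os.path.join(target, name))
            copied += 1
    return copied
```

For example, copy_logs("/tmp/tensorflow_log_dir", "/dbfs/tmp/tensorflow_log_dir_backup") would copy the event files once training has finished. The destination here is only an illustration.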

TensorBoard continues to run until you either stop it with dbutils.tensorboard.stop() or you shut down your cluster. Only one instance of TensorBoard can run on a cluster at a time.
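
Because only one instance can run per cluster, it can help to pair every start with a stop. The sketch below wraps the two dbutils.tensorboard calls from this section in a context manager. The name tensorboard_session is hypothetical, and dbutils is passed in explicitly so that the helper does not depend on notebook globals.

```python
from contextlib import contextmanager


@contextmanager
def tensorboard_session(dbutils, log_dir):
    """Start TensorBoard on log_dir and stop it when the block exits."""
    dbutils.tensorboard.start(log_dir)
    try:
        yield log_dir
    finally:
        # Release the single per-cluster TensorBoard instance, even on errors.
        dbutils.tensorboard.stop()


# In a Databricks notebook, where dbutils is predefined:
# with tensorboard_session(dbutils, "/tmp/tensorflow_log_dir"):
#     ...  # run training that writes summaries to the log directory
```

This stops TensorBoard as soon as the block ends. If you want to keep browsing results after training, call dbutils.tensorboard.start and dbutils.tensorboard.stop directly instead.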

Note

If you attach TensorFlow to your cluster as a Databricks library, you may need to reattach your notebook before starting TensorBoard.

Use TensorFlow on a single node

To test and migrate single-machine TensorFlow workflows, you can start with a driver-only cluster on Azure Databricks by setting the number of workers to zero. Although Apache Spark is not functional in this configuration, it is a cost-effective way to run single-machine TensorFlow workflows. The following example shows how to run TensorFlow with TensorBoard monitoring on a driver-only cluster.

TensorFlow notebook

Get notebook

Spark-TensorFlow data conversion

spark-tensorflow-connector is a library within the TensorFlow ecosystem that enables conversion between Spark DataFrames and TFRecords (a popular format for storing data for TensorFlow). With spark-tensorflow-connector, you can use Spark DataFrame APIs to read TFRecords files into DataFrames and write DataFrames as TFRecords.
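
A minimal sketch of the read and write paths is shown here. It assumes the connector is available on the cluster, and it uses the "tfrecords" format name and "recordType" option as described in the connector's README. The helper names are hypothetical, and the DataFrame and SparkSession are passed in rather than created.

```python
def write_tfrecords(df, path, record_type="Example"):
    """Write a Spark DataFrame to path as TFRecords."""
    df.write.format("tfrecords").option("recordType", record_type).save(path)


def read_tfrecords(spark, path, record_type="Example"):
    """Read the TFRecords files at path into a Spark DataFrame."""
    return spark.read.format("tfrecords").option("recordType", record_type).load(path)


# In a Databricks notebook, where spark is predefined (hypothetical path):
# write_tfrecords(df, "/tmp/example.tfrecord")
# restored = read_tfrecords(spark, "/tmp/example.tfrecord")
```

Passing the same record_type to both functions keeps the written and read schemas consistent.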

Installation

Note

The spark-tensorflow-connector library is included in Databricks Runtime ML, a machine learning runtime that provides a ready-to-go environment for machine learning and data science. Instead of installing the library using the instructions below, you can simply create a cluster using Databricks Runtime ML. See Databricks Runtime for Machine Learning.

To use spark-tensorflow-connector on Azure Databricks, you’ll need to build the project JAR locally, upload it to Azure Databricks, and attach it to your cluster as a library.

  1. Ensure you have Maven in your PATH (see the Maven installation instructions if needed).

  2. Clone the TensorFlow ecosystem repository and cd into the spark-tensorflow-connector subdirectory:

    git clone https://github.com/tensorflow/ecosystem
    cd ecosystem/spark/spark-tensorflow-connector
    
  3. Follow the instructions in the README to build the project locally. For the build to succeed, you may need to modify the test configuration so that tests run serially. You can do this by adding a <configuration> tag to the scalatest plugin in ecosystem/spark/spark-tensorflow-connector/pom.xml:

    <configuration>
       <parallel>false</parallel>
    </configuration>
    

    The build command prints the path of the spark-tensorflow-connector JAR, for example:

    Installing /Users/<yourusername>/ecosystem/spark/spark-tensorflow-connector/target/spark-tensorflow-connector_2.11-1.6.0.jar
    to /Users/<yourusername>/.m2/repository/org/tensorflow/spark-tensorflow-connector_2.11/1.6.0/spark-tensorflow-connector_2.11-1.6.0.jar
    
  4. Upload this JAR to Azure Databricks as a library and attach it to your cluster. You should now be able to run the example notebook (adapted from the spark-tensorflow-connector usage examples):

spark-tensorflow-connector notebook

Get notebook