Databricks Runtime 4.1 ML (Beta)

Databricks Runtime 4.1 ML provides a ready-to-go environment for machine learning and data science. It includes popular libraries such as TensorFlow, Keras, and XGBoost, and it supports distributed TensorFlow training using Horovod.

Note

This release was deprecated on January 17, 2019. We recommend that you use a newer version of Databricks Runtime ML, depending on which library versions you want to use.

For more information, including instructions for creating a Databricks Runtime ML cluster, see Databricks Runtime for Machine Learning.

Note

Databricks Runtime ML releases pick up all maintenance updates to the base Databricks Runtime release. For a list of all maintenance updates, see Databricks Runtime Maintenance Updates.

Libraries

Databricks Runtime 4.1 ML is built on top of Databricks Runtime 4.1. For information on what’s new in Databricks Runtime 4.1, see the Databricks Runtime 4.1 release notes. In addition to the new features in Databricks Runtime 4.1, Databricks Runtime 4.1 ML includes the following libraries to support machine learning. Libraries that are also included in the base Databricks Runtime 4.1 are noted as such.

Category: Distributed Deep Learning

Distributed training with Horovod and Spark:

* HorovodEstimator
* horovod 0.12.1
* openmpi 3.0.0
* paramiko 2.4.1
* cloudpickle 0.5.2

Distributed TensorFlow and Keras prediction:

* spark-deep-learning 1.0 pre-release
* tensorframes 0.3.0

Category: Deep Learning

Keras:

* keras 2.1.5
* h5py 2.7.1

TensorFlow:

* (CPU clusters) tensorflow 1.7.1
* (GPU clusters) tensorflow-gpu 1.7.1

GPU libraries:

* CUDA 9.0 (also installed in base Databricks Runtime)
* cuDNN 7.0 (also installed in base Databricks Runtime)
* NCCL 2.0.5-3

Category: XGBoost

* XGBoost4j 0.8-spark2.3-s_2.11

Category: Other machine learning libraries

* numpy 1.14.2 (also installed in base Databricks Runtime; version may differ)
* scikit-learn 0.18.1 (also installed in base Databricks Runtime)
* scipy (also installed in base Databricks Runtime)
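
To confirm that a cluster actually carries the Python library versions listed above, a notebook cell along the following lines may help. This is a minimal sketch, not part of the release: the helper names `installed_version` and `compare_versions` and the `EXPECTED` mapping are hypothetical, and the check assumes each package exposes a `__version__` attribute, which is a convention rather than a guarantee. Note that scikit-learn is imported as `sklearn` and that both tensorflow and tensorflow-gpu are imported as `tensorflow`.

```python
import importlib

# Versions copied from the library list above, keyed by import name
# (hypothetical mapping for illustration; extend or trim as needed).
EXPECTED = {
    "horovod": "0.12.1",
    "keras": "2.1.5",
    "h5py": "2.7.1",
    "tensorflow": "1.7.1",
    "numpy": "1.14.2",
    "sklearn": "0.18.1",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Assumption: the package follows the __version__ convention.
    return getattr(module, "__version__", None)


def compare_versions(expected, lookup=installed_version):
    """Map each module name to (expected, found, matches)."""
    report = {}
    for name in sorted(expected):
        want = expected[name]
        have = lookup(name)
        report[name] = (want, have, have == want)
    return report
```

Calling `compare_versions(EXPECTED)` returns one tuple per library. A `False` in the last position means the installed version differs from the listed one or could not be determined, for example because the numpy version in the base runtime may differ.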