Databricks runtimes are the set of core components that run on Azure Databricks clusters. Azure Databricks offers several types of runtimes:
Databricks Runtime
Includes Apache Spark and adds components and updates that substantially improve the usability, performance, and security of big data analytics.
Databricks Runtime with Conda
An experimental version of Databricks Runtime based on Conda. Databricks Runtime with Conda provides an updated and optimized list of default packages and a flexible Python environment for advanced users who require maximum control over packages and environments.
Databricks Runtime for Machine Learning
Built on Databricks Runtime, it provides a ready-to-go environment for machine learning and data science. It contains multiple popular libraries, including TensorFlow, Keras, PyTorch, and XGBoost.
Databricks Runtime for Health and Life Sciences
A version of Databricks Runtime optimized for working with genomic and biomedical data.
Databricks Light
The Databricks packaging of the open source Apache Spark runtime. It provides a runtime option for jobs that don’t need the advanced performance, reliability, or autoscaling benefits provided by Databricks Runtime. You can select Databricks Light only when you create a cluster to run a JAR, Python, or spark-submit job; you cannot select this runtime for clusters on which you run interactive or notebook job workloads.
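The Databricks Light restriction above can be expressed as a small pre-flight check. The following is a minimal sketch, not an official validation routine: it assumes job settings are a dictionary whose task keys follow the Jobs API 2.0 naming (`spark_jar_task`, `spark_python_task`, `spark_submit_task`, `notebook_task`), and the helper name `light_is_allowed` is hypothetical.

```python
# Task types that may run on Databricks Light, per the description above.
# Key names are assumed to follow the Jobs API 2.0 job settings format.
LIGHT_COMPATIBLE_TASKS = {
    "spark_jar_task",
    "spark_python_task",
    "spark_submit_task",
}


def light_is_allowed(job_settings: dict) -> bool:
    """Return True if every task key in the settings is Light-compatible.

    Hypothetical helper: it only inspects top-level keys ending in "_task"
    and returns False when no task is defined at all.
    """
    task_keys = {key for key in job_settings if key.endswith("_task")}
    return bool(task_keys) and task_keys <= LIGHT_COMPATIBLE_TASKS


# A JAR job qualifies; a notebook job does not.
jar_job = {"name": "nightly-etl", "spark_jar_task": {"main_class_name": "com.example.Main"}}
notebook_job = {"name": "report", "notebook_task": {"notebook_path": "/Users/someone/report"}}
```

A check like this only mirrors the documented rule on the client side; the service itself remains the authority on which runtimes a given job can use.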
You can choose from among many supported runtime versions when you create a cluster.
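When a cluster is created programmatically, the runtime version is typically supplied as a version key in the cluster specification. The sketch below assumes the field names of the Clusters API 2.0 create request (`cluster_name`, `spark_version`, `node_type_id`, `num_workers`); the version key `7.3.x-scala2.12` and node type `Standard_DS3_v2` are illustrative placeholders, and the helper `build_cluster_spec` is hypothetical. Valid version keys for a workspace should be taken from the runtime versions list rather than from this example.

```python
import json


def build_cluster_spec(cluster_name: str, runtime_version: str,
                       node_type_id: str, num_workers: int) -> dict:
    """Build a cluster specification that pins a specific runtime version.

    Hypothetical helper: it only assembles the request body and does not
    call any Databricks endpoint.
    """
    return {
        "cluster_name": cluster_name,
        # The runtime is selected through this key (placeholder value below).
        "spark_version": runtime_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
    }


spec = build_cluster_spec(
    cluster_name="example-cluster",
    runtime_version="7.3.x-scala2.12",  # assumed example version key
    node_type_id="Standard_DS3_v2",     # assumed example Azure node type
    num_workers=2,
)

# Serialize the specification as it would appear in a request body.
print(json.dumps(spec, indent=2))
```

Keeping the runtime version as an explicit parameter makes it easy to move a cluster definition to a newer supported runtime without touching the rest of the specification.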
For details on each runtime type, see:
- Databricks Runtime
- Databricks Runtime with Conda
- Databricks Runtime for Machine Learning
- Databricks Runtime for Genomics
- Databricks Light
For information about the contents of each runtime version, see the release notes.