Git integration with Databricks Repos
Support for arbitrary files in Databricks Repos is now in Public Preview. For details, see Work with non-notebook files in an Azure Databricks repo and Import Python and R modules.
To support best practices for data science and engineering code development, Databricks Repos provides repository-level integration with Git providers. You can develop code in an Azure Databricks notebook and sync it with a remote Git repository. Databricks Repos lets you use Git functionality such as cloning a remote repo, managing branches, pushing and pulling changes, and visually comparing differences upon commit.
Databricks Repos also provides an API that you can integrate with your CI/CD pipeline. For example, you can programmatically update a Databricks repo so that it always has the most recent code version.
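As a minimal sketch of the programmatic update described above, the following Python snippet builds and sends a request that points a Databricks repo at the head of a branch. It assumes the Repos REST endpoint `PATCH /api/2.0/repos/{repo_id}` with a `branch` field in the JSON body, and authentication with a personal access token. The function names and the workspace URL, token, and repo ID values are placeholders for illustration, not part of the product.

```python
import json
import urllib.request


def build_update_request(host, token, repo_id, branch):
    """Build a PATCH request asking the workspace to check out the head of `branch`.

    Assumption: the endpoint path and the `branch` body field follow the
    Repos REST API; verify them against the API reference for your workspace.
    """
    url = f"{host.rstrip('/')}/api/2.0/repos/{repo_id}"
    body = json.dumps({"branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def update_repo(host, token, repo_id, branch):
    """Send the update request and return the decoded JSON response."""
    request = build_update_request(host, token, repo_id, branch)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A CI/CD job might call `update_repo(workspace_url, token, repo_id, "main")` after each merge, reading the token from the pipeline's secret store rather than hard-coding it.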
When audit logging is enabled, audit events are logged when you interact with a Databricks repo. For example, an audit event is logged when you create, update, or delete a Databricks repo, when you list all Databricks Repos associated with a workspace, and when you sync changes between your Databricks repo and the Git remote.
For more information about best practices for code development using Databricks Repos, see CI/CD workflows with Databricks Repos and Git integration.
Azure Databricks supports these Git providers:
- Bitbucket Cloud and Server
- Azure DevOps (not available in Azure China regions)
- AWS CodeCommit
- GitHub AE
Databricks Repos also supports integration with Bitbucket Server, GitHub Enterprise Server, and GitLab self-managed subscription instances, provided the server is internet-accessible.
To integrate with a private Git server instance that is not internet-accessible, contact your Databricks representative.
Support for arbitrary files in Databricks Repos is available in Databricks Runtime 8.4 and above.