This guide describes how to set up version control for notebooks using Bitbucket Cloud through the UI. You can also use the Databricks CLI or Workspace API to import and export notebooks and manage notebook versions with Bitbucket tools.
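As a rough illustration of the Workspace API route, the sketch below exports a notebook's source so that it can be committed with ordinary Git tooling. It assumes the `GET /api/2.0/workspace/export` endpoint and a personal access token sent as a bearer token. The host, token, notebook path, and function names are placeholders introduced for this example, not values from this guide.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_export_request(host, token, notebook_path, fmt="SOURCE"):
    """Build (but do not send) a Workspace API export request."""
    query = urllib.parse.urlencode({"path": notebook_path, "format": fmt})
    url = f"{host.rstrip('/')}/api/2.0/workspace/export?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def decode_export_response(body):
    """Decode the base64 `content` field of an export response into bytes."""
    return base64.b64decode(json.loads(body)["content"])


def export_notebook(host, token, notebook_path, dest_file, fmt="SOURCE"):
    """Download a notebook and write it to a local file inside a Git checkout."""
    request = build_export_request(host, token, notebook_path, fmt)
    with urllib.request.urlopen(request) as response:
        source = decode_export_response(response.read())
    with open(dest_file, "wb") as handle:
        handle.write(source)
```

Under these assumptions, calling `export_notebook(host, token, "/Users/you@example.com/my-notebook", "my-notebook.py")` writes the notebook source to a local file that you can then add, commit, and push to Bitbucket with your usual Git client.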
Enable and disable Git versioning
By default, version control is enabled. To toggle this setting, see Manage the ability to version notebooks in Git. If Git versioning is disabled, the Git Integration tab is not visible in the User Settings screen.
Configure version control
Configuring version control involves creating access credentials in your version control provider and adding those credentials to Azure Databricks.
Get an app password
- Go to Bitbucket Cloud and create an app password that allows access to your repositories. See the Bitbucket Cloud documentation.
- Record the password. You enter this password in Azure Databricks in the next step.
Save your app password and username to Azure Databricks
Click the User icon at the top right of your screen and select User Settings.
Click the Git Integration tab.
If you have previously entered credentials, click the Change token or app password button.
In the Git provider drop-down, select Bitbucket Cloud.
Paste your password and username into the respective fields and click Save.
Work with notebook revisions
You work with notebook revisions in the History panel. Open the History panel by clicking Revision history at the top right of the notebook.
You cannot modify a notebook while the History panel is open.
Link a notebook to Bitbucket Cloud
Open the History panel. The Git status bar displays Git: Not linked.
Click Git: Not linked.
The Git Preferences dialog displays. The first time you open your notebook, the Status is Unlink, because the notebook is not in Bitbucket Cloud.
In the Status field, click Link.
In the Link field, paste the URL of the Bitbucket Cloud repository.
Click the Branch drop-down and select a branch.
In the Path in Git Repo field, specify where in the repository to store your file.
Python notebooks have the suggested default file extension .py. If you use .ipynb, your notebook is saved in IPython notebook format. If the file already exists on Bitbucket Cloud, you can directly copy and paste the URL of the file.
Click Save to finish linking your notebook. If this file did not previously exist, a prompt with the option Save this file to your Bitbucket Cloud repo displays.
Type a message and click Save.
Save a notebook to Bitbucket Cloud
While the changes that you make to your notebook are saved automatically to the Azure Databricks revision history, changes do not automatically persist to Bitbucket Cloud.
Open the History panel.
Click Save Now to save your notebook to Bitbucket Cloud. The Save Notebook Revision dialog displays.
Optionally, enter a message to describe your change.
Make sure that Also commit to Git is selected.
Revert or update a notebook to a version from Bitbucket Cloud
Once you link a notebook, Azure Databricks syncs your history with Git every time you re-open the History panel. Versions that sync to Git have commit hashes as part of the entry.
Open the History panel.
Choose an entry in the History panel. Azure Databricks displays that version.
Click Restore this version.
Click Confirm to confirm that you want to restore that version.
Unlink a notebook
Open the History panel.
The Git status bar displays Git: Synced.
Click Git: Synced.
In the Git Preferences dialog, click Unlink.
Click Confirm to confirm that you want to unlink the notebook from version control.
Create a pull request
Open the History panel.
Click the Git status bar to open the Git Preferences dialog.
Click Create PR. Bitbucket Cloud opens to a pull request page for the branch.
Best practice for code reviews
Azure Databricks supports Git branching.
- You can link a notebook to your own fork and choose a branch.
- We recommend using separate branches for each notebook.
- Once you are happy with your changes, you can use the Create PR link in the Git Preferences dialog to take you to Bitbucket Cloud’s pull request page.
- The Create PR link displays only if you’re not working on the default branch of the parent repository.
Bitbucket Server integration is not supported. However, you can use the Workspace API to programmatically create notebooks and manage the code base in Bitbucket Server.
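A minimal sketch of creating a notebook programmatically from a source file kept in a Bitbucket Server checkout might look like the following. It assumes the `POST /api/2.0/workspace/import` endpoint with a base64-encoded `content` field and a bearer token. The function names, host, token, and paths are hypothetical placeholders.

```python
import base64
import json
import urllib.request


def build_import_request(host, token, notebook_path, source_bytes,
                         language="PYTHON", overwrite=True):
    """Build (but do not send) a Workspace API import request."""
    payload = {
        "path": notebook_path,      # destination path in the workspace
        "format": "SOURCE",         # plain source file, for example .py
        "language": language,
        "content": base64.b64encode(source_bytes).decode("ascii"),
        "overwrite": overwrite,
    }
    return urllib.request.Request(
        f"{host.rstrip('/')}/api/2.0/workspace/import",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def import_notebook(host, token, notebook_path, source_file):
    """Upload a local source file as a notebook and return the HTTP status."""
    with open(source_file, "rb") as handle:
        request = build_import_request(host, token, notebook_path, handle.read())
    with urllib.request.urlopen(request) as response:
        return response.status
```

In this arrangement the repository on Bitbucket Server remains the source of truth, and a script or CI job pushes the checked-out files into the workspace.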
If you receive errors related to Bitbucket Cloud history sync, verify the following:
- You have initialized the repository on Bitbucket Cloud, and it isn’t empty. Try the URL that you entered and verify that it forwards to your Bitbucket Cloud repository.
- Your app password is active and your username is correct.
- If the repository is private, you have read and write access to it in Bitbucket Cloud.
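Some of these checks can be scripted. The sketch below is one possible approach, assuming the Bitbucket Cloud REST API 2.0 repository endpoint and HTTP Basic authentication with your username and app password. The function names and the interpretation of each status code are assumptions made for illustration. The check confirms only that the credentials can read the repository; it does not verify write access or that the repository is non-empty.

```python
import base64
import urllib.error
import urllib.request

BITBUCKET_API = "https://api.bitbucket.org/2.0"


def build_repo_check_request(username, app_password, workspace, repo_slug):
    """Build (but do not send) an authenticated repository lookup request."""
    credentials = f"{username}:{app_password}".encode("utf-8")
    auth = base64.b64encode(credentials).decode("ascii")
    url = f"{BITBUCKET_API}/repositories/{workspace}/{repo_slug}"
    return urllib.request.Request(url, headers={"Authorization": f"Basic {auth}"})


def explain_status(status):
    """Map an HTTP status code to a likely cause (assumed interpretation)."""
    if status == 200:
        return "OK: credentials accepted and repository is readable"
    if status == 401:
        return "Unauthorized: check the username and that the app password is active"
    if status == 403:
        return "Forbidden: the app password lacks permission for this repository"
    if status == 404:
        return "Not found: check the workspace and repository names in the URL"
    return f"Unexpected HTTP status {status}"


def check_repo_access(username, app_password, workspace, repo_slug):
    """Send the lookup request and return a human-readable diagnosis."""
    request = build_repo_check_request(username, app_password, workspace, repo_slug)
    try:
        with urllib.request.urlopen(request) as response:
            return explain_status(response.status)
    except urllib.error.HTTPError as err:
        return explain_status(err.code)
```

Here `workspace` and `repo_slug` correspond to the two path segments of the repository URL that you pasted into the Link field.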