What is Databricks Assistant?

05/22/2024

Important

This feature is currently in Public Preview.

Databricks Assistant works as an AI-based companion pair-programmer to make you more efficient as you create notebooks, queries, and files. It can help you rapidly answer questions by generating, optimizing, completing, explaining, and fixing code and queries.

This page provides general information about the Assistant in the form of frequently asked questions. For questions about privacy and security, see Privacy and security.

Enable or disable Databricks Assistant

Databricks Assistant is enabled by default. You can manage enablement for all workspaces in an account or individual workspaces.

Enablement of Databricks Assistant for your account is captured as an account event in your audit logs. See Account events.

Note

An improved Databricks Assistant experience that tracks query threads and history throughout editor contexts is also available on an opt-in basis.

To enable this experience, open the Assistant, set the New Assistant toggle to On, and then reload the page.

Manage the account setting

To enable or disable Databricks Assistant for all workspaces in an account, follow these instructions:

As an account admin, log in to the account console.

Important

If no users in your Microsoft Entra ID (formerly Azure Active Directory) tenant have yet logged in to the account console, you or another user in your tenant must log in as the first account admin. To do this, you must be a Microsoft Entra ID Global Administrator, but only when you first log in to the Azure Databricks Account Console. Upon first login, you become an Azure Databricks account admin and no longer need the Microsoft Entra ID Global Administrator role to access the Azure Databricks account. As the first account admin, you can assign users in the Microsoft Entra ID tenant as additional account admins (who can themselves assign more account admins). Additional account admins do not require specific roles in Microsoft Entra ID. See Manage users, service principals, and groups.
Click Settings.
Click the Feature enablement tab.
For the Azure AI services-powered AI assistive features option, select Enabled or Disabled, and then click Save.

Manage the workspace setting

If the account setting permits workspace setting overrides, workspace admins can enable or disable Databricks Assistant for specific workspaces. To do this, use a workspace setting to override the default setting in the account console as follows:

Go to the workspace admin settings page.
Click the Advanced tab.
Use the Azure AI services-powered AI assistive features drop-down menu to make your selection.
Click Save.

Get coding help from Databricks Assistant

To access Databricks Assistant, click the Assistant icon in the left sidebar of the notebook, the file editor, the SQL Editor, or the dashboard Data tab.

The Assistant pane can open on the left or right side of the screen.

Some capabilities of Databricks Assistant are the following:

Generate: Use natural language to generate a SQL query.
Explain: Highlight a query or a block of code and have Databricks Assistant walk through the logic in clear, concise English.
Fix: Explain and fix syntax and runtime errors with a single click.
Transform and optimize: Convert Pandas code to PySpark for faster execution.

Any code generated by Databricks Assistant is intended to run in a Databricks compute environment. The Assistant is optimized to create code in Databricks-supported programming languages, frameworks, and dialects, and is not intended to be a general-purpose programming assistant. It often uses information from Databricks resources, such as the Databricks Documentation website or Knowledge Base, to better answer user queries. It performs best when the question can be answered with knowledge from Databricks documentation, Unity Catalog, and user code in the workspace.

Always review any code generated by the Assistant before running it, because the Assistant can make mistakes.

Create data visualizations using the Databricks Assistant

You can use Databricks Assistant when drafting dashboards. As you create visualizations on an existing dashboard dataset, prompt the Assistant with questions to receive responses in the form of generated charts. To use the Assistant in a dashboard, first create one or more datasets, then add a visualization widget to the Canvas. The visualization widget includes a prompt to describe your new chart. Type a description of the chart you want to see, and the Assistant generates it. You can approve or reject the chart, or modify the description to generate something new.

For details and examples of using the Assistant with dashboards, see Create visualizations with Databricks Assistant.

Services used by Databricks Assistant

Databricks Assistant uses Azure OpenAI services to provide responses.

The Azure OpenAI service is operated by Microsoft, not OpenAI, and is subject to Microsoft's data management policies. Data sent to this service is not used for any model training. For details, see Azure data management policy.

For Azure OpenAI, Azure Databricks has opted out of Abuse Monitoring, so no prompts or responses are stored with Azure OpenAI.

Tips for improving the accuracy of results

Use the prompt “Find Tables” for better responses. Before you ask questions about data in a table, ask the Assistant to find related tables by subject matter or other characteristics. Example: Find tables related to NFL games.
Specify the structure of the response you want. The structure and detail that Databricks Assistant provides vary, even for the same prompt. Databricks Assistant knows about your table and column schema and metadata, so you can use natural language to ask your question. Example: List active and retired NFL quarterbacks' passing completion rate, for those who had over 500 attempts in a season. The Assistant answers using data from columns such as s.player_id and s.attempts.
Provide examples of your row-level data values. Databricks Assistant doesn’t have access to row-level data, so for more accurate answers, provide examples of the data. Example: List the average height for each position in inches. This returns an error because the data set shows height in feet and inches, as in 6-2.
Test code snippets by running them in the Assistant pane. Use the Assistant pane as a scratchpad that saves iterations of your queries and the Assistant's answers. You can run code and edit it in the pane until you are ready to add it to a notebook.
Use cell actions in a notebook. Cell actions include shortcuts to common tasks, such as documenting (commenting), fixing, and explaining code.
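
The row-level data tip can be illustrated with a small sketch. It assumes heights are stored as feet-inches strings such as 6-2, as in the example above, and shows the kind of conversion that is needed before averaging in inches. The function names and the sample rows are hypothetical and are not taken from any real dataset or from Assistant output.

```python
from collections import defaultdict


def height_to_inches(height: str) -> int:
    """Convert a feet-inches string such as '6-2' to total inches."""
    feet_text, inches_text = height.strip().split("-")
    feet, inches = int(feet_text), int(inches_text)
    if not 0 <= inches < 12:
        raise ValueError(f"inches part out of range in {height!r}")
    return feet * 12 + inches


def average_height_by_position(rows):
    """Average height in inches for each position.

    `rows` is an iterable of (position, height_string) pairs.
    """
    heights = defaultdict(list)
    for position, height in rows:
        heights[position].append(height_to_inches(height))
    return {pos: sum(vals) / len(vals) for pos, vals in heights.items()}


# Hypothetical sample rows, for illustration only.
sample_rows = [("QB", "6-2"), ("QB", "6-4"), ("K", "5-11")]
averages = average_height_by_position(sample_rows)
print(averages)
```

Including one or two sample values like 6-2 in the prompt gives the Assistant the information it needs to generate a comparable conversion instead of averaging the raw strings.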

For fully illustrated examples, see 5 tips for Databricks Assistant.

Databricks Assistant considers the history of the conversation so you can refine your questions as you go.

Give feedback

The best way to send feedback is to use the Provide Feedback links in the notebook and SQL editor. You can also send an email to assistant-feedback@databricks.com or to your account team.

Share product improvement suggestions and user experience issues rather than feedback about prompt accuracy. If you receive an unhelpful suggestion from the Assistant, click the “Not useful” button.

Privacy and security

Q: What data is being sent to the models?

Databricks Assistant sends code and metadata to the models on each API request. This helps return more relevant results for your data. Examples include:

Code/queries in the current notebook cell or SQL Editor tab
Table and column names and descriptions
Previous questions
Favorite tables

Q: Does the metadata sent to the models respect the user’s Unity Catalog permissions?

Yes. All data sent to the model respects the user’s Unity Catalog permissions, so the Assistant does not send metadata for tables that the user does not have permission to see.

Q: If I execute a query with results, and then ask a question, do the results of my query get sent to the model?

No. Only the code contents in cells, metadata about tables, and the user-entered text are shared with the model. For the “fix error” feature, Databricks also shares the stack trace from the error output.

Q: Will Databricks Assistant execute dangerous code?

No. Databricks Assistant does not automatically run code on your behalf. AI models can make mistakes, misunderstand intent, and hallucinate or give incorrect answers. Review and test AI-generated code before you run it.

Q: Has Databricks done any assessment to evaluate the accuracy and appropriateness of the Assistant responses?

Yes. Databricks has mitigations to prevent the Assistant from generating harmful responses such as hate speech, insecure code, prompt jailbreaks, and third-party copyright content. Databricks has done extensive testing of all its AI assistive features with thousands of simulated user inputs to assess the robustness of these mitigations. These assessments focused on the expected use cases for the Assistant, such as code generation in the Python, Databricks SQL, R, and Scala languages.

Q: Can I use Databricks Assistant with tables that process regulated data (PHI, PCI, IRAP, FedRAMP)?

Yes. To do so, you must comply with requirements such as enabling the compliance security profile and adding the relevant compliance standard as part of the compliance security profile configuration.
