Monitor usage with system tables

Important

This feature is in Public Preview. There are currently no charges to use this feature. In the future, some of this usage might incur a charge.

This article explains the concept of system tables in Azure Databricks and highlights resources you can use to get the most out of your system tables data.

What are system tables?

System tables are an Azure Databricks-hosted analytical store of your account’s operational data found in the system catalog. System tables can be used for historical observability across your account.

Note

For documentation on system.information_schema, see Information schema.

Requirements

To access system tables, your workspace must be enabled for Unity Catalog. For more information, see Enable system table schemas.

Which system tables are available?

Currently, Azure Databricks hosts the following system tables:

| Table | Description | Location | Supports streaming | Retention | Includes global or regional data |
| --- | --- | --- | --- | --- | --- |
| Audit logs | Includes records for all audit events from workspaces in your region. For a list of available audit events, see Diagnostic log reference. | system.access.audit | Yes | 365 days | Regional for workspace-level events. Global for account-level events. |
| Table lineage | Includes a record for each read or write event on a Unity Catalog table or path. | system.access.table_lineage | Yes | 365 days | Regional |
| Column lineage | Includes a record for each read or write event on a Unity Catalog column (but does not include events that do not have a source). | system.access.column_lineage | Yes | 365 days | Regional |
| Billable usage | Includes records for all billable usage across your account. Each usage record is an hourly aggregate of a resource’s billable usage. | system.billing.usage | Yes | 365 days | Global |
| Pricing | A historical log of SKU pricing. A record gets added each time there is a change to a SKU price. | system.billing.list_prices | No | N/A | Global |
| Clusters | A slow-changing dimension table that contains the full history of cluster configurations over time for any cluster. | system.compute.clusters | Yes | None | Regional |
| Node types | Captures the currently available node types with their basic hardware information. | system.compute.node_types | No | N/A | Regional |
| SQL warehouse events | Captures events related to SQL warehouses. For example, starting, stopping, running, scaling up and down. | system.compute.warehouse_events | Yes | 365 days | Regional |
| Marketplace funnel events | Includes consumer impression and funnel data for your listings. | system.marketplace.listing_funnel_events | Yes | 365 days | Regional |
| Marketplace listing access | Includes consumer info for completed request data or get data events on your listings. | system.marketplace.listing_access_events | Yes | 365 days | Regional |
| Predictive optimization | Tracks the operation history of the predictive optimization feature. | system.storage.predictive_optimization_operations_history | No | 180 days | Regional |

Note

You might see other system tables in your account besides the ones listed above. Those tables are currently in Private Preview and are empty by default. If you are interested in using any of them, contact your Databricks account team.

Enable system table schemas

Since system tables are governed by Unity Catalog, you need to have at least one Unity Catalog-enabled workspace in your account to enable and access system tables. System tables include data from all workspaces in your account but they can only be accessed from a Unity Catalog-enabled workspace.

System tables are enabled at the schema level. If you enable a system schema, you enable all the tables within that schema. When a new schema is released, an account admin needs to enable it manually.

System tables must be enabled by an account admin. You can enable system tables using the SystemSchemas API.

List available system schemas

Use the following curl command to list available system schemas:

curl -v -X GET -H "Authorization: Bearer <PAT Token>" "https://adb-<xxx>.azuredatabricks.net/api/2.0/unity-catalog/metastores/<metastore-id>/systemschemas"

The following is an example output of the GET command:

{"schemas":[{"schema":"access","state":"<AVAILABLE OR EnableCompleted>"},{"schema":"billing","state":"<AVAILABLE OR EnableCompleted>"},{"schema":"information_schema","state":"<AVAILABLE OR EnableCompleted>"}]}

Each schema’s state field has one of the following values:

  • AVAILABLE: The system schema is available but has not yet been enabled.
  • EnableCompleted: You have enabled the system schema and it is visible in Catalog Explorer.
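As a rough sketch of the same call in Python, the snippet below builds the GET request with the standard library and groups the schemas in a response by state. The function names, the `host` argument, and the sample body are illustrative assumptions, not part of the SystemSchemas API. Only the URL path and the response shape come from the curl example and output above.

```python
import json
import urllib.request


def build_list_request(host, metastore_id, token):
    """Build (but do not send) the GET request that lists system schemas."""
    url = (
        f"https://{host}/api/2.0/unity-catalog/metastores/"
        f"{metastore_id}/systemschemas"
    )
    return urllib.request.Request(
        url, method="GET", headers={"Authorization": f"Bearer {token}"}
    )


def group_schemas_by_state(response_body):
    """Return a dict that maps each reported state to a list of schema names."""
    grouped = {}
    for entry in json.loads(response_body).get("schemas", []):
        grouped.setdefault(entry["state"], []).append(entry["schema"])
    return grouped


# Hypothetical response body, shaped like the example output above.
sample = (
    '{"schemas":[{"schema":"access","state":"AVAILABLE"},'
    '{"schema":"billing","state":"EnableCompleted"}]}'
)
states = group_schemas_by_state(sample)
```

To send the request, pass the object to `urllib.request.urlopen` and feed the decoded response body to `group_schemas_by_state`. The schemas listed under AVAILABLE are the ones an account admin can still enable.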

Enable a system schema

Use the following curl command to enable a system schema:

curl -v -X PUT -H "Authorization: Bearer <PAT Token>" "https://adb-<xxx>.azuredatabricks.net/api/2.0/unity-catalog/metastores/<metastore-id>/systemschemas/<SCHEMA_NAME>"

If the system schema is enabled successfully, the request returns HTTP status code 200.

If you attempt to re-enable a system schema, the following is returned: "error_code":"SCHEMA_ALREADY_EXISTS","message":"Schema <schema-name> already exists".
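The two outcomes described above can be told apart in a script. The following is a minimal sketch, assuming you already have the response status code and body in hand. The function names and the returned labels are hypothetical, and the sketch makes no assumption about which status code accompanies the SCHEMA_ALREADY_EXISTS error.

```python
import json
import urllib.request


def build_enable_request(host, metastore_id, schema_name, token):
    """Build (but do not send) the PUT request that enables one system schema."""
    url = (
        f"https://{host}/api/2.0/unity-catalog/metastores/"
        f"{metastore_id}/systemschemas/{schema_name}"
    )
    return urllib.request.Request(
        url, method="PUT", headers={"Authorization": f"Bearer {token}"}
    )


def interpret_enable_response(status_code, body):
    """Map a response to 'enabled', 'already_enabled', or 'failed'."""
    if status_code == 200:
        return "enabled"
    try:
        error_code = json.loads(body).get("error_code")
    except (json.JSONDecodeError, AttributeError):
        # The body is not a JSON object, so no error code can be read from it.
        error_code = None
    if error_code == "SCHEMA_ALREADY_EXISTS":
        return "already_enabled"
    return "failed"
```

When sending the request with `urllib.request.urlopen`, non-success responses are raised as `urllib.error.HTTPError`. In that case the status code is available as the exception's `code` attribute and the body through its `read()` method. Treating "already_enabled" as a non-error makes an enablement script safe to rerun.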

Disable a system schema

Use the following curl command to disable a system schema:

curl -v -X DELETE -H "Authorization: Bearer <PAT Token>" "https://adb-<xxx>.azuredatabricks.net/api/2.0/unity-catalog/metastores/<metastore-id>/systemschemas/<SCHEMA_NAME>"

Grant access to system tables

System table access is governed by Unity Catalog. By default, no users have access to system tables. To grant access, a metastore admin or other privileged user must grant USE and SELECT permissions on the system schemas. See Manage privileges in Unity Catalog.
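As an illustration, the helper below generates the SQL GRANT statements for a principal. It is a sketch under one assumption: that the USE permission mentioned above corresponds to the Unity Catalog privileges USE CATALOG (on the system catalog) and USE SCHEMA (on each system schema). The function name and the example principal are hypothetical.

```python
def build_grant_statements(principal, schemas):
    """Return GRANT statements giving `principal` read access to system schemas."""
    # Backtick-quote the principal so names with special characters stay valid.
    quoted = "`" + principal.replace("`", "``") + "`"
    statements = [f"GRANT USE CATALOG ON CATALOG system TO {quoted}"]
    for schema in schemas:
        statements.append(f"GRANT USE SCHEMA ON SCHEMA system.{schema} TO {quoted}")
        statements.append(f"GRANT SELECT ON SCHEMA system.{schema} TO {quoted}")
    return statements


# Hypothetical group name; replace it with a principal from your account.
grants = build_grant_statements("data-analysts", ["access", "billing"])
```

In a notebook attached to a Unity Catalog-enabled workspace, each statement could then be run with `spark.sql(statement)` by a user who is allowed to grant these privileges.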

System tables are read-only and cannot be modified.

Note

If your account was created after November 9, 2023, you might not have a metastore admin by default. For more information, see Set up and manage Unity Catalog.

Do system tables contain data for all workspaces in your account?

The audit log and lineage tables contain operational data for all workspaces in your account deployed within the same cloud region. The billing system table (system.billing.usage) contains data for all workspaces in your account, no matter what region they are deployed in.

Even though system tables can only be accessed through a Unity Catalog workspace, the tables also include operational data for non-Unity Catalog workspaces in your account.

Where are the system tables located?

The system tables in your account are located in a catalog called system, which is included in every Unity Catalog metastore. In the system catalog you’ll see schemas such as access and billing that contain the system tables.

Note

During the system tables Public Preview, Azure Databricks will retain all your system tables data.

Considerations for streaming system tables

Access to system tables is supported by Delta Sharing. Be aware of the following considerations when streaming with Delta Sharing:

  • If you are using streaming with system tables, set the skipChangeCommits option to true. This ensures the streaming job is not disrupted by deletes in the system tables. See Ignore updates and deletes.
  • Trigger.AvailableNow is not supported with Delta Sharing streaming. It will be converted to Trigger.Once.
  • If you use a trigger in your streaming job and find the job isn’t catching up to the latest system table version, Databricks recommends increasing the scheduled frequency of the job.
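The first consideration can be sketched in PySpark as follows. The function name is hypothetical, and `spark` is assumed to be an active SparkSession, such as the one a Databricks notebook provides. The option is spelled `skipChangeCommits`, the name used by Delta Lake structured streaming.

```python
def read_system_table_stream(spark, table_name):
    """Return a streaming DataFrame over a system table.

    `spark` is expected to be an active SparkSession.
    """
    return (
        spark.readStream
        # Ignore commits that delete or modify existing rows, so that deletes
        # in the system table do not stop the stream.
        .option("skipChangeCommits", "true")
        .table(table_name)
    )
```

For example, `read_system_table_stream(spark, "system.access.audit")` returns a stream over the audit log table. Because Trigger.AvailableNow is converted to Trigger.Once here, a job written with either trigger behaves as a run-once job, so how current the output stays depends on how often the job is scheduled.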

Known issues

  • System tables do not currently support real-time monitoring. Data is updated throughout the day. If you don’t see a log for a recent event, check back later.

  • To enable system tables, you might need to grant network access to the system tables Blob storage endpoint. For the system tables storage endpoint in each region, see Storage endpoint IP addresses.

  • The system schemas system.operational_data and system.lineage are deprecated and will contain empty tables.