event_log table-valued function

Applies to: Databricks SQL, Databricks Runtime 13.3 LTS and above

Returns the event log for materialized views, streaming tables, and Delta Live Tables (DLT) pipelines.

Learn more about the Delta Live Tables event log.

Syntax

event_log( { TABLE ( table_name ) | pipeline_id } )

Arguments

  • table_name: The name of a materialized view or streaming table. The name must not include a temporal specification. If the name is not qualified, the current catalog and schema are used to qualify the identifier.
  • pipeline_id: The string identifier of a Delta Live Tables pipeline.
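Both invocation forms are shown below; the table name and pipeline ID are placeholders.

-- By table name (qualified or unqualified)
> SELECT * FROM event_log(TABLE(my_catalog.my_schema.my_mv));

-- By pipeline ID
> SELECT * FROM event_log('<pipeline-ID>');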

Returns

  • id STRING NOT NULL: A unique identifier for the event log record.
  • sequence STRING NOT NULL: A JSON object containing metadata to identify and order events.
  • origin STRING NOT NULL: A JSON object containing metadata for the origin of the event, for example, cloud provider, region, user_id, or pipeline_id.
  • timestamp TIMESTAMP NOT NULL: The time the event was recorded in UTC.
  • message STRING NOT NULL: A human-readable message describing the event.
  • level STRING NOT NULL: The level of logging, for example, INFO, WARN, ERROR, or METRICS.
  • maturity_level STRING NOT NULL: The stability of the event schema. The possible values are:
    • STABLE: The schema is stable and will not change.
    • NULL: Equivalent to STABLE. The value may be NULL for records created before the maturity_level field was added (release 2022.37).
    • EVOLVING: The schema is not stable and may change.
    • DEPRECATED: The schema is deprecated and the Delta Live Tables runtime may stop producing this event at any time.
  • error STRING: If an error occurred, details describing the error.
  • details STRING NOT NULL: A JSON object containing structured details of the event. This is the primary field used for analyzing events.
  • event_type STRING NOT NULL: The event type.
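As a quick way to explore this schema, the following sketch (against a hypothetical materialized view my_mv) counts records per event_type and level:

> SELECT event_type, level, count(*) AS events
  FROM event_log(TABLE(my_mv))
  GROUP BY event_type, level
  ORDER BY events DESC;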

Usage

Only owners of the pipeline, streaming table, or materialized view can view the event log. To let other users query the event log, create a view over the function and grant them SELECT on the view:

> CREATE VIEW event_log_raw AS SELECT * FROM event_log(table(my_mv));
> GRANT SELECT ON VIEW event_log_raw TO `user@databricks.com`;
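Users granted SELECT can then query the view like any other, for example:

> SELECT timestamp, message FROM event_log_raw WHERE level = 'ERROR';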

Examples

For more examples, refer to Querying the event log.

-- View the events on a materialized view
> SELECT timestamp, message, details
  FROM event_log(table(my_mv))
  WHERE level IN ('INFO', 'WARN', 'ERROR')
  ORDER BY timestamp;

timestamp, message, details
---------------------------
2023-08-12 01:03:05.000, 'Flow "my_mv" is STARTING.', '{"flow_progress":{"status":"STARTING"}}'

-- Create a temp view holding the ID of the latest update to the pipeline
> CREATE OR REPLACE TEMP VIEW latest_update AS
  SELECT origin.update_id AS id FROM event_log('<pipeline-ID>')
  WHERE event_type = 'create_update' ORDER BY timestamp DESC LIMIT 1;

-- Query lineage information
> SELECT
    details:flow_definition.output_dataset AS output_dataset,
    details:flow_definition.input_datasets AS input_dataset
  FROM
    event_log('<pipeline-ID>'),
    latest_update
  WHERE
    event_type = 'flow_definition' AND origin.update_id = latest_update.id;

output_dataset, input_dataset
-----------------------------
customers, null
sales_orders_raw, null
sales_orders_cleaned, ["customers", "sales_orders_raw"]
sales_order_in_la, ["sales_orders_cleaned"]
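To get one row per lineage edge rather than a JSON array per flow, a minimal variant of the query above (assuming input_datasets is a JSON array of strings, as in the output shown) can explode the parsed array:

-- One row per (input, output) lineage edge
> SELECT
    explode(from_json(details:flow_definition.input_datasets, 'array<string>')) AS input_dataset,
    details:flow_definition.output_dataset AS output_dataset
  FROM
    event_log('<pipeline-ID>'),
    latest_update
  WHERE
    event_type = 'flow_definition' AND origin.update_id = latest_update.id;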

-- Query data quality expectation history for a streaming table
> WITH expectations_parsed AS (
    SELECT
      explode(
        from_json(
          details:flow_progress.data_quality.expectations,
          "array<struct<name: string, dataset: string, passed_records: int, failed_records: int>>"
        )
      ) AS row_expectations
    FROM
      event_log(table(my_st)),
      latest_update
    WHERE
      event_type = 'flow_progress'
      AND origin.update_id = latest_update.id
  )
  SELECT
    row_expectations.dataset AS dataset,
    row_expectations.name AS expectation,
    SUM(row_expectations.passed_records) AS passing_records,
    SUM(row_expectations.failed_records) AS failing_records
  FROM expectations_parsed
  GROUP BY
    row_expectations.dataset,
    row_expectations.name;

dataset, expectation, passing_records, failing_records
------------------------------------------------------
sales_orders_cleaned, valid_order_number, 4083, 0
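The same aggregation can be reshaped into a pass rate per expectation. This self-contained sketch repeats the CTE from the example above and uses nullif to guard against division by zero:

-- Report the percentage of records passing each expectation
> WITH expectations_parsed AS (
    SELECT
      explode(
        from_json(
          details:flow_progress.data_quality.expectations,
          "array<struct<name: string, dataset: string, passed_records: int, failed_records: int>>"
        )
      ) AS row_expectations
    FROM
      event_log(table(my_st)),
      latest_update
    WHERE
      event_type = 'flow_progress'
      AND origin.update_id = latest_update.id
  )
  SELECT
    row_expectations.dataset AS dataset,
    row_expectations.name AS expectation,
    100.0 * SUM(row_expectations.passed_records) /
      nullif(SUM(row_expectations.passed_records) + SUM(row_expectations.failed_records), 0) AS pass_pct
  FROM expectations_parsed
  GROUP BY
    row_expectations.dataset,
    row_expectations.name;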