Azure SQL Database Data Discovery and Classification

Data Discovery & Classification (currently in preview) provides advanced capabilities built into Azure SQL Database for discovering, classifying, labeling, and protecting the sensitive data in your databases. Discovering and classifying your most sensitive data (business, financial, healthcare, PII, and so on) can play a pivotal role in your organization's information protection posture. It can serve as infrastructure for:

  • Helping meet data privacy standards and regulatory compliance requirements.
  • Various security scenarios, such as monitoring (auditing) and alerting on anomalous access to sensitive data.
  • Controlling access to and hardening the security of databases containing highly sensitive data.

Data Discovery & Classification is part of the SQL Advanced Threat Protection (ATP) offering, which is a unified package for advanced SQL security capabilities. Data Discovery & Classification can be accessed and managed via the central SQL ATP portal.

Note

This document relates to Azure SQL Database only. For SQL Server (on-premises), see SQL Data Discovery and Classification.

What is Data Discovery and Classification?

Data Discovery & Classification introduces a set of advanced services and new SQL capabilities, forming a new SQL Information Protection paradigm aimed at protecting the data, not just the database:

  • Discovery & recommendations – The classification engine scans your database and identifies columns containing potentially sensitive data. It then provides you with an easy way to review and apply the appropriate classification recommendations via the Azure portal.
  • Labeling – Sensitivity classification labels can be persistently tagged on columns using new classification metadata attributes introduced into the SQL Engine. This metadata can then be utilized for advanced sensitivity-based auditing and protection scenarios.
  • Query result set sensitivity – The sensitivity of a query result set is calculated in real time for auditing purposes.
  • Visibility – The database classification state can be viewed in a detailed dashboard in the portal. Additionally, you can download a report (in Excel format) to be used for compliance and auditing purposes, as well as other needs.

Discover, classify & label sensitive columns

The following section describes the steps for discovering, classifying, and labeling columns containing sensitive data in your database, as well as viewing the current classification state of your database and exporting reports.

The classification includes two metadata attributes:

  • Labels – The main classification attributes, used to define the sensitivity level of the data stored in the column.
  • Information Types – Provide additional granularity into the type of data stored in the column.

Classify your SQL Database

  1. Go to the Azure portal.

  2. Navigate to Advanced Threat Protection under the Security heading in your Azure SQL Database pane. Click to enable Advanced Threat Protection, and then click on the Data discovery & classification (preview) card.

    Scan a database

  3. The Overview tab includes a summary of the current classification state of the database, including a detailed list of all classified columns, which you can also filter to view only specific schema parts, information types and labels. If you haven’t yet classified any columns, skip to step 5.

    Summary of current classification state

  4. To download a report in Excel format, click on the Export option in the top menu of the window.

    Export to Excel

  5. To begin classifying your data, click on the Classification tab at the top of the window.

    Classify your data

  6. The classification engine scans your database for columns containing potentially sensitive data and provides a list of recommended column classifications. To view and apply classification recommendations:

    • To view the list of recommended column classifications, click on the recommendations panel at the bottom of the window:

      Classify your data

    • Review the list of recommendations – to accept a recommendation for a specific column, check the checkbox in the left column of the relevant row. You can also mark all recommendations as accepted by checking the checkbox in the recommendations table header.

      Review recommendation list

    • To apply the selected recommendations, click on the blue Accept selected recommendations button.

      Apply recommendations

  7. You can also manually classify columns as an alternative, or in addition, to the recommendation-based classification:

    • Click on Add classification in the top menu of the window.

      Manually add classification

    • In the context window that opens, select the schema > table > column that you want to classify, and the information type and sensitivity label. Then click on the blue Add classification button at the bottom of the context window.

      Select column to classify

  8. To complete your classification and persistently label (tag) the database columns with the new classification metadata, click on Save in the top menu of the window.

    Save

Auditing access to sensitive data

An important aspect of the information protection paradigm is the ability to monitor access to sensitive data. Azure SQL Database Auditing has been enhanced to include a new field in the audit log called data_sensitivity_information, which logs the sensitivity classifications (labels) of the actual data that was returned by the query.

Audit log
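
To illustrate how the data_sensitivity_information field might be used, here is a minimal Python sketch that builds a T-SQL query returning only the audit records that touched classified data. It assumes the audit logs are read with the sys.fn_get_audit_file function and that the field is empty or NULL when no classified data was returned. The helper name and the storage URL in the usage line are hypothetical placeholders.

```python
def sensitive_access_audit_sql(audit_file_pattern: str) -> str:
    """Build a T-SQL query listing audit records that returned classified data.

    audit_file_pattern is the audit log location passed to
    sys.fn_get_audit_file (for Azure SQL Database, a blob storage URL).
    """
    # Escape single quotes so the pattern is a valid T-SQL string literal.
    pattern = audit_file_pattern.replace("'", "''")
    return (
        "SELECT event_time, server_principal_name, statement, "
        "data_sensitivity_information\n"
        f"FROM sys.fn_get_audit_file('{pattern}', DEFAULT, DEFAULT)\n"
        # Assumption: the field is NULL or empty for non-sensitive results.
        "WHERE data_sensitivity_information IS NOT NULL\n"
        "  AND data_sensitivity_information <> '';"
    )


# Hypothetical storage account and container names, for illustration only.
query = sensitive_access_audit_sql(
    "https://example.blob.core.windows.net/sqldbauditlogs/"
)
print(query)
```

The resulting query text can be run from any SQL client connected to the database, provided that client's login is allowed to read the audit logs.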

Automated/Programmatic classification

You can use T-SQL to add/remove column classifications, as well as retrieve all classifications for the entire database.

Note

When using T-SQL to manage labels, there is no validation that labels added to a column exist in the organizational information protection policy (the set of labels that appear in the portal recommendations). It is therefore up to you to validate this.
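
As a rough illustration of the programmatic route, the following Python sketch builds the T-SQL text for adding, removing, and listing classifications. It assumes the ADD SENSITIVITY CLASSIFICATION and DROP SENSITIVITY CLASSIFICATION statements and the sys.sensitivity_classifications catalog view. The helper names and the allowed_labels parameter are hypothetical. The optional label check is one way to do the validation that T-SQL itself does not perform, and the caller must supply the label set from their own policy.

```python
from typing import Iterable, Optional


def quote_ident(name: str) -> str:
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value: str) -> str:
    """Quote a string literal, doubling any single quote."""
    return "'" + value.replace("'", "''") + "'"


def column_target(schema: str, table: str, column: str) -> str:
    """Return the three-part schema.table.column name, bracket-quoted."""
    return ".".join(quote_ident(part) for part in (schema, table, column))


def add_classification_sql(
    schema: str,
    table: str,
    column: str,
    label: str,
    information_type: str,
    allowed_labels: Optional[Iterable[str]] = None,
) -> str:
    """Build an ADD SENSITIVITY CLASSIFICATION statement for one column."""
    # Hypothetical client-side check: reject labels outside the policy set.
    if allowed_labels is not None and label not in set(allowed_labels):
        raise ValueError(f"Label {label!r} is not in the allowed label set")
    return (
        f"ADD SENSITIVITY CLASSIFICATION TO {column_target(schema, table, column)} "
        f"WITH (LABEL = {quote_literal(label)}, "
        f"INFORMATION_TYPE = {quote_literal(information_type)});"
    )


def drop_classification_sql(schema: str, table: str, column: str) -> str:
    """Build a DROP SENSITIVITY CLASSIFICATION statement for one column."""
    return (
        f"DROP SENSITIVITY CLASSIFICATION FROM "
        f"{column_target(schema, table, column)};"
    )


# Lists every classified column in the current database.
LIST_CLASSIFICATIONS_SQL = """\
SELECT SCHEMA_NAME(o.schema_id) AS schema_name,
       o.name AS table_name,
       c.name AS column_name,
       sc.label,
       sc.information_type
FROM sys.sensitivity_classifications AS sc
JOIN sys.all_objects AS o ON sc.major_id = o.object_id
JOIN sys.all_columns AS c ON sc.major_id = c.object_id
                         AND sc.minor_id = c.column_id;"""


# Example table, column, and label names are placeholders.
print(add_classification_sql(
    "dbo", "Customers", "Email", "Confidential", "Contact Info",
    allowed_labels={"Confidential", "Highly Confidential"},
))
print(drop_classification_sql("dbo", "Customers", "Email"))
```

The sketch only produces statement text. Executing it, and obtaining the current set of policy labels, is left to whichever SQL client and process you already use.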

Next steps