Register and scan an Azure SQL Database

This article outlines how to register an Azure SQL Database data source in Purview and set up a scan on it.

Supported capabilities

The Azure SQL Database data source supports the following functionality:

  • Full and incremental scans to capture metadata and classification in Azure SQL Database.

  • Lineage between data assets for Azure Data Factory (ADF) copy and data flow activities.

Known limitations

  • The Schema tab in Azure Purview shows at most 300 columns. For tables with more columns, it shows "Additional-Columns-Truncated".
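
To find out ahead of a scan which tables would hit this limit, a query along these lines could be run against the database. It is only a sketch: it counts columns per user table through the sys.tables, sys.schemas and sys.columns catalog views, and it assumes that views do not need to be checked.

```sql
-- List user tables with more than 300 columns (the Schema tab limit noted above).
SELECT s.name   AS schema_name,
       t.name   AS table_name,
       COUNT(*) AS column_count
FROM sys.tables  AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
JOIN sys.columns AS c ON c.object_id = t.object_id
GROUP BY s.name, t.name
HAVING COUNT(*) > 300
ORDER BY column_count DESC;
```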

Prerequisites

  1. Create a new Purview account if you don't already have one.

  2. Ensure there is network access between the Purview account and Azure SQL Database.

Set up authentication for a scan

Purview needs a way to authenticate to Azure SQL Database in order to scan it. If you need to create new authentication, you must authorize database access to SQL Database. Purview currently supports three authentication methods:

  • SQL authentication
  • Service Principal
  • Managed Identity

SQL authentication

Note

Only the server-level principal login (created by the provisioning process) or members of the loginmanager database role in the master database can create new logins. After you grant permission, it takes about 15 minutes before the Purview account has the permissions it needs to scan the resource(s).

If you don't already have a login for Azure SQL Database, follow the instructions in CREATE LOGIN to create one. You will need the username and password for the next steps.
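
As a rough sketch of what the CREATE LOGIN instructions amount to, the statements below create a SQL login and a matching database user. The names purview_scan_login and purview_scan_user and the password placeholder are hypothetical. Granting db_datareader is an assumption, carried over from the role this article suggests for service principals and managed identities.

```sql
-- Step 1: connect to the master database as the server-level principal
-- or as a member of the loginmanager role, then create the login.
-- purview_scan_login and the password are placeholders (hypothetical).
CREATE LOGIN purview_scan_login WITH PASSWORD = '<strong-password>';
GO

-- Step 2: open a new connection to the user database that Purview will scan
-- (Azure SQL Database does not allow switching databases with USE).
CREATE USER purview_scan_user FOR LOGIN purview_scan_login;
GO

-- Assumption: read access through db_datareader is sufficient for the scan.
ALTER ROLE db_datareader ADD MEMBER purview_scan_user;
GO
```

The login name and password used here are the values to store in the key vault secret and the Purview credential.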

  1. Navigate to your key vault in the Azure portal.
  2. Select Settings > Secrets.
  3. Select + Generate/Import and enter a Name of your choice and, as the Value, the password of your Azure SQL Database login.
  4. Select Create to complete.
  5. If your key vault is not connected to Purview yet, you will need to create a new key vault connection.
  6. Finally, create a new credential using the username and password to set up your scan.

Service principal and managed identity

Several steps are required before Purview can use a service principal or its own managed identity to scan your Azure SQL Database.

Note

To scan with a service principal, Purview will need the Application (client) ID and the client secret.

Create or use an existing service principal

Note

Skip this step if you are using Purview's managed identity.

To use a service principal, you can use an existing one or create a new one.

Note

If you need to create a new service principal, follow these steps:

  1. Navigate to the Azure portal.
  2. Select Azure Active Directory from the left-hand side menu.
  3. Select App registrations.
  4. Select + New application registration.
  5. Enter a name for the application (the service principal name).
  6. Select Accounts in this organizational directory only.
  7. For Redirect URI select Web and enter any URL you want; it doesn't have to be real or work.
  8. Then select Register.

Configure Azure AD authentication in the database account

The service principal or managed identity must have permission to get metadata for the database, schemas and tables. It must also be able to query the tables to sample for classification.

  • Enable Azure AD authentication on the server by following Configure and manage Azure AD authentication with Azure SQL.

  • If you are using managed identity, note that your Purview account has its own managed identity, whose name is the Purview account name you chose at creation. You must create an Azure AD user in Azure SQL Database whose name exactly matches either Purview's managed identity or your own service principal, by following the tutorial Create the service principal user in Azure SQL Database. You must also assign the identity an appropriate permission (for example, db_datareader). Example SQL syntax to create the user and grant the permission:

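    -- Replace [Username] with your service principal name or, for managed
    -- identity, your Purview account name.
    CREATE USER [Username] FROM EXTERNAL PROVIDER
    GO
    
    -- Add the new user to db_datareader so the scan can read the tables.
    EXEC sp_addrolemember 'db_datareader', [Username]
    GO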
    

    Note

    The Username is your own service principal or Purview's managed identity. You can read more about fixed-database roles and their capabilities.
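
To confirm that the user was created and added to the role, a query like the following could be run in the same database. It is a sketch that joins the sys.database_role_members and sys.database_principals catalog views. The role name db_datareader is taken from the example above, so adjust it if you granted a different role.

```sql
-- List the members of db_datareader; the identity created above should appear.
SELECT r.name      AS role_name,
       m.name      AS member_name,
       m.type_desc AS member_type
FROM sys.database_role_members AS rm
JOIN sys.database_principals   AS r ON r.principal_id = rm.role_principal_id
JOIN sys.database_principals   AS m ON m.principal_id = rm.member_principal_id
WHERE r.name = 'db_datareader'
ORDER BY m.name;
```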

Add service principal to key vault and Purview's credential

Note

If you are planning to use Purview's managed identity, you can skip this step because Purview's default managed identity is already available in the Purview-MSI credential.

You need the service principal's application ID and secret:

  1. Navigate to your service principal in the Azure portal.
  2. Copy the Application (client) ID from Overview and the Client secret from Certificates & secrets.
  3. Navigate to your key vault.
  4. Select Settings > Secrets.
  5. Select + Generate/Import and enter a Name of your choice and, as the Value, the Client secret from your service principal.
  6. Select Create to complete.
  7. If your key vault is not connected to Purview yet, you will need to create a new key vault connection.
  8. Finally, create a new credential using the service principal to set up your scan.

Firewall settings

If your database server has a firewall enabled, you will need to update the firewall to allow access in one of two ways:

  1. Allow Azure connections through the firewall.
  2. Install a Self-Hosted Integration Runtime and give it access through the firewall.

Allow Azure Connections

Enabling Azure connections allows Azure Purview to reach and connect to the server without updating the firewall itself. You can follow the How-to guide for Connections from inside Azure.

  1. Navigate to your database account

  2. Select the server name in the Overview page

  3. Select Security > Firewalls and virtual networks

  4. Select Yes for Allow Azure services and resources to access this server
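
The portal setting is stored as a server-level firewall rule whose start and end IP addresses are both 0.0.0.0. Assuming you can connect to the master database of the logical server, a query like this sketch should return a row once the setting is on:

```sql
-- Run in the master database of the logical server.
-- A rule spanning 0.0.0.0 to 0.0.0.0 means Azure services are allowed.
SELECT name, start_ip_address, end_ip_address
FROM sys.firewall_rules
WHERE start_ip_address = '0.0.0.0'
  AND end_ip_address   = '0.0.0.0';
```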

Self-Hosted Integration Runtime

A self-hosted integration runtime (SHIR) can be installed on a machine to connect with a resource in a private network.

  1. Create and install a self-hosted integration runtime on a personal machine, or a machine inside the same VNet as your database server.
  2. Check your database server firewall to confirm that the SHIR machine has access through the firewall. Add the IP of the machine if it does not already have access.
  3. If your Azure SQL Server is behind a private endpoint or in a VNet, you can use an ingestion private endpoint to ensure end-to-end network isolation.
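
Adding the SHIR machine's IP address in step 2 can be done in the portal or with T-SQL. The sketch below uses the sp_set_firewall_rule stored procedure in the master database. The rule name PurviewSHIR and the address 203.0.113.10 are hypothetical placeholders for your own values.

```sql
-- Run in the master database of the logical server.
-- Rule name and IP address are placeholders (hypothetical); use the public
-- IP address of the machine hosting the self-hosted integration runtime.
EXECUTE sp_set_firewall_rule
    @name             = N'PurviewSHIR',
    @start_ip_address = '203.0.113.10',
    @end_ip_address   = '203.0.113.10';

-- Confirm that the rule now exists.
SELECT name, start_ip_address, end_ip_address
FROM sys.firewall_rules
WHERE name = N'PurviewSHIR';
```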

Register an Azure SQL Database data source

To register a new Azure SQL Database in your data catalog, do the following:

  1. Navigate to your Purview account.

  2. Select Data Map on the left navigation.

  3. Select Register.

  4. On Register sources, select Azure SQL Database. Select Continue.

On the Register sources (Azure SQL Database) screen, do the following:

  1. Enter a Name under which the data source will be listed in the Catalog.
  2. Select From Azure subscription, then select the appropriate subscription from the Azure subscription drop-down box and the appropriate server from the Server name drop-down box.
  3. Select Register to register the data source.

Creating and running a scan

Note

The steps and screenshots shown below illustrate the general process for managing scans across different data source types. Your options may differ slightly depending on the types of data sources that you are working with.

To create and run a new scan, do the following:

  1. Select the Data Map tab on the left pane in Purview Studio.

  2. Select the data source that you registered.

  3. Select New scan.

  4. Select the credential to connect to your data source.

  5. You can scope your scan to specific parts of the data source such as folders, collections or schemas by checking the appropriate items in the list.

  6. Then select a scan rule set for your scan. You can choose the system default, choose an existing custom rule set, or create a new one inline.

  7. Choose your scan trigger. You can set up a schedule or run the scan once.

  8. Review your scan and select Save and run.

Viewing your scans and scan runs

To view existing scans, do the following:

  1. Navigate to the management center. Select Data sources under the Sources and scanning section.

  2. Select the desired data source. You will see a list of existing scans on that data source.

  3. Select the scan whose results you want to view.

  4. This page will show you all of the previous scan runs along with metrics and status for each scan run. It will also display whether your scan was scheduled or manual, how many assets had classifications applied, how many total assets were discovered, the start and end time of the scan, and the total scan duration.

Manage your scans - edit, delete, or cancel

To manage or delete a scan, do the following:

  1. Navigate to the management center. Select Data sources under the Sources and scanning section, then select the desired data source.

  2. Select the scan you would like to manage. You can edit the scan by selecting Edit.

  3. You can delete your scan by selecting Delete.

Note

  • Deleting your scan does not delete your assets from previous Azure SQL Database scans.
  • If you edit a description in the Schema tab in Purview, the asset will no longer be updated with schema changes when the source table changes and you rescan it.

Next steps