Register and scan Dedicated SQL pools (formerly SQL DW)

Note

If you are looking to register and scan a dedicated SQL database within a Synapse workspace, you must follow instructions here.

This article discusses how to register and scan an instance of Dedicated SQL pool (formerly SQL DW) in Purview.

Supported capabilities

Azure Synapse Analytics (formerly SQL DW) supports full and incremental scans to capture the metadata and schema. Scans also classify the data automatically based on system and custom classification rules.

Known limitations

  • Azure Purview doesn't support over 300 columns in the Schema tab and it will show "Additional-Columns-Truncated".

Prerequisites

  • Before registering data sources, create an Azure Purview account. For more information on creating a Purview account, see Quickstart: Create an Azure Purview account.
  • You need to be an Azure Purview Data Source Admin
  • Networking access between the Purview account and Azure Synapse Analytics.

Setting up authentication for a scan

There are three ways to set up authentication for Azure Synapse Analytics:

  • Managed Identity

  • SQL Authentication

  • Service Principal

    Note

    Only the server-level principal login (created by the provisioning process) or members of the loginmanager database role in the master database can create new logins. It takes about 15 minutes after granting permission, the Purview account should have the appropriate permissions to be able to scan the resource(s).

Your Purview account has its own Managed Identity which is basically your Purview name when you created it. You must create an Azure AD user in Azure Synapse Analytics (formerly SQL DW) with the exact Purview's Managed Identity name by following the prerequisites and tutorial on Create Azure AD users using Azure AD applications.

Example SQL syntax to create user and grant permission:

CREATE USER [PurviewManagedIdentity] FROM EXTERNAL PROVIDER
GO

EXEC sp_addrolemember 'db_datareader', [PurviewManagedIdentity]
GO

The authentication must have permission to get metadata for the database, schemas and tables. It must also be able to query the tables to sample for classification. The recommendation is to assign db_datareader permission to the identity.

Service Principal

To use service principal authentication for scans, you can use an existing one or create a new one.

Note

If you have to create a new Service Principal, please follow these steps:

  1. Navigate to the Azure portal.
  2. Select Azure Active Directory from the left-hand side menu.
  3. Select App registrations.
  4. Select + New application registration.
  5. Enter a name for the application (the service principal name).
  6. Select Accounts in this organizational directory only.
  7. For Redirect URI select Web and enter any URL you want; it doesn't have to be real or work.
  8. Then select Register.

It is required to get the Service Principal's application ID and secret:

  1. Navigate to your Service Principal in the Azure portal
  2. Copy the values the Application (client) ID from Overview and Client secret from Certificates & secrets.
  3. Navigate to your key vault
  4. Select Settings > Secrets
  5. Select + Generate/Import and enter the Name of your choice and Value as the Client secret from your Service Principal
  6. Select Create to complete
  7. If your key vault is not connected to Purview yet, you will need to create a new key vault connection
  8. Finally, create a new credential using the Service Principal to setup your scan

Granting the Service Principal access to your Azure Synapse Analytics (formerly SQL DW)

In addition, you must also create an Azure AD user in Azure Synapse Analytics by following the prerequisites and tutorial on Create Azure AD users using Azure AD applications. Example SQL syntax to create user and grant permission:

CREATE USER [ServicePrincipalName] FROM EXTERNAL PROVIDER
GO

ALTER ROLE db_datareader ADD MEMBER [ServicePrincipalName]
GO

Note

Purview will need the Application (client) ID and the client secret in order to scan.

SQL authentication

You can follow the instructions in CREATE LOGIN to create a login for Azure Synapse Analytics (formerly SQL DW) if you don't already have one.

When authentication method selected is SQL Authentication, you need to get your password and store in the key vault:

  1. Get the password for your SQL login
  2. Navigate to your key vault
  3. Select Settings > Secrets
  4. Select + Generate/Import and enter the Name and Value as the password for your SQL login
  5. Select Create to complete
  6. If your key vault is not connected to Purview yet, you will need to create a new key vault connection
  7. Finally, create a new credential using the key to setup your scan

Register a SQL dedicated pool (formerly SQL DW)

To register a new SQL dedicated pool in Purview, do the following:

  1. Navigate to your Purview account.
  2. Select Data Map on the left navigation.
  3. Select Register
  4. On Register sources, select SQL dedicated pool (formerly SQL DW).
  5. Select Continue

On the Register sources (Azure Synapse Analytics) screen, do the following:

  1. Enter a Name that the data source will be listed with in the Catalog.
  2. Choose your Azure subscription to filter down Azure Synapse workspaces.
  3. Select an Azure Synapse workspace.
  4. Select a collection or create a new one (Optional).
  5. Select Register to register the data source.

register sources options

Creating and running a scan

To create and run a new scan, do the following:

  1. Select the Data Map tab on the left pane in the Purview Studio.

  2. Select the SQL dedicated pool source that you registered.

  3. Select New scan

  4. Select the credential to connect to your data source.

    Set up scan

  5. You can scope your scan to specific tables by choosing the appropriate items in the list.

    Scope your scan

  6. Then select a scan rule set. You can choose between the system default, existing custom rule sets, or create a new rule set inline.

    Scan rule set

  7. Choose your scan trigger. You can set up a schedule or run the scan once.

    trigger

  8. Review your scan and select Save and run.

Viewing your scans and scan runs

To view existing scans, do the following:

  1. Go to the Purview studio. Select the Data Map tab under the left pane.

  2. Select the desired data source. You will see a list of existing scans on that data source under Recent scans, or can view all scans under the Scans tab.

  3. Select the scan that has results you want to view.

  4. This page will show you all of the previous scan runs along with the status and metrics for each scan run. It will also display whether your scan was scheduled or manual, how many assets had classifications applied, how many total assets were discovered, the start and end time of the scan, and the total scan duration.

Manage your scans - edit, delete, or cancel

To manage or delete a scan, do the following:

  1. Go to the Purview studio. Select the Data Map tab under the left pane.

  2. Select the desired data source. You will see a list of existing scans on that data source under Recent scans, or can view all scans under the Scans tab.

  3. Select the scan you would like to manage. You can edit the scan by selecting Edit scan.

  4. You can cancel an in progress scan by selecting Cancel scan run.

  5. You can delete your scan by selecting Delete scan.

Next steps