Connect to Azure SQL Database in Microsoft Purview
This article outlines the process to register an Azure SQL data source in Microsoft Purview including instructions to authenticate and interact with the Azure SQL database source
Supported capabilities
Metadata Extraction | Full Scan | Incremental Scan | Scoped Scan | Classification | Access Policy | Lineage |
---|---|---|---|---|---|---|
Yes | Yes | Yes | Yes | Yes | Yes (Preview) | Yes(Preview)** |
** Lineage is also supported if Azure SQL tables/views used as source/sink in Data Factory Copy and Data Flow activities
- Data lineage extraction is currently supported only for Stored procedure runs
When scanning Azure SQL Database, Microsoft Purview supports:
Extracting technical metadata including:
- Server
- Database
- Schemas
- Tables including the columns
- Views including the columns
- Store procedures (with lineage extraction enabled)
- Store procedure runs (with lineage extraction enabled)
When setting up scan, you can further scope the scan after providing the database name by selecting tables and views as needed.
Known limitations
- Microsoft Purview doesn't support over 300 columns in the Schema tab and it will show "Additional-Columns-Truncated" if there are more than 300 columns.
- Column level lineage is currently not supported in the lineage tab. However, the columnMapping attribute in properties tab of Azure SQL Stored Procedure Run captures column lineage in plain text.
- Stored procedures running remotely from data integration tools like Azure Data Factory is currently not supported
- Data lineage extraction is currently not supported for Functions, Triggers.
- Lineage extraction scan is scheduled and defaulted to run every six hours. Frequency can't be changed.
- If sql views are referenced in stored procedures, they're captured as sql tables currently.
- Lineage extraction is currently not supported, if Azure SQL Server is configured behind a private endpoint.
Prerequisites
An Azure account with an active subscription. Create an account for free.
An active Microsoft Purview account.
You'll need to be a Data Source Administrator and Data Reader to register a source and manage it in the Microsoft Purview governance portal. See our Microsoft Purview Permissions page for details.
Register
This section will enable you to register the Azure SQL DB data source and set up authentication to scan.
Steps to register
It's important to register the data source in Microsoft Purview before setting up a scan.
Go to the Azure portal, and navigate to the Microsoft Purview accounts page and select your Purview account
Open Microsoft Purview governance portal and navigate to the Data Map
Create the Collection hierarchy using the Collections menu and assign permissions to individual subcollections, as required
Navigate to the appropriate collection under the Sources menu and select the Register icon to register a new Azure SQL DB
Select the Azure SQL Database data source and select Continue
Provide a suitable Name for the data source, select the relevant Azure subscription, Server name for the SQL server and the collection and select on Apply
The Azure SQL Server Database will be shown under the selected Collection
Scan
Tip
To troubleshoot any issues with scanning:
- Confirm you have followed all prerequisites.
- Check network by confirming firewall, Azure connections, or integration runtime settings.
- Confirm authentication is properly set up.
- Review our scan troubleshooting documentation.
Firewall settings
If your database server has a firewall enabled, you'll need to update the firewall to allow access in one of two ways:
- Allow Azure connections through the firewall - a straightforward option to route traffic through Azure networking, without needing to manage virtual machines.
- Install a Self-Hosted Integration Runtime on a machine in your network and give it access through the firewall - if you have a private VNet set up within Azure, or have any other closed network set up, using a self-hosted integration runtime on a machine within that network will allow you to fully manage traffic flow and utilize your existing network.
- Use a managed virtual network - setting up a managed virtual network with your Microsoft Purview account will allow you to connect to Azure SQL using the Azure integration runtime in a closed network.
For more information about the Azure SQL Firewall, see the SQL Database firewall documentation. To connect Microsoft Purview through the firewall, follow the steps below.
Allow Azure Connections
Enabling Azure connections will allow Microsoft Purview to reach and connect the server without updating the firewall itself. You can follow the How-to guide for Connections from inside Azure.
- Navigate to your database account
- Select the server name in the Overview page
- Select Security > Firewalls and virtual networks
- Select Yes for Allow Azure services and resources to access this server
Self-Hosted Integration Runtime
A self-hosted integration runtime (SHIR) can be installed on a machine to connect with a resource in a private network.
- Create and install a self-hosted integration runtime on a personal machine, or a machine inside the same VNet as your database server.
- Check your database server networking configuration to confirm that there's a private endpoint accessible to the SHIR machine. Add the IP of the machine if it doesn't already have access.
- If your Azure SQL Server is behind a private endpoint or in a VNet, you can use an ingestion private endpoint to ensure end-to-end network isolation.
Authentication for a scan
To scan your data source, you'll need to configure an authentication method in the Azure SQL Database.
Important
If you are using a self-hosted integration runtime to connect to your resource, system-assigned and user-assigned managed identities will not work. You need to use Service Principal authentication or SQL authentication.
The following options are supported:
System-assigned managed identity (Recommended) - This is an identity associated directly with your Microsoft Purview account that allows you to authenticate directly with other Azure resources without needing to manage a go-between user or credential set. The system-assigned managed identity is created when your Microsoft Purview resource is created, is managed by Azure, and uses your Microsoft Purview account's name. The SAMI can't currently be used with a self-hosted integration runtime for Azure SQL. For more information, see the managed identity overview.
User-assigned managed identity (preview) - Similar to a SAMI, a user-assigned managed identity (UAMI) is a credential resource that allows Microsoft Purview to authenticate against Azure Active Directory. The user-assigned managed by users in Azure, rather than by Azure itself, which gives you more control over security. The UAMI can't currently be used with a self-hosted integration runtime for Azure SQL. For more information, see our guide for user-assigned managed identities.
Service Principal- A service principal is an application that can be assigned permissions like any other group or user, without being associated directly with a person. Their authentication has an expiration date, and so can be useful for temporary projects. For more information, see the service principal documentation.
SQL Authentication - connect to the SQL database with a username and password. For more information about SQL Authentication, you can follow the SQL authentication documentation. If you need to create a login, follow this guide to query an Azure SQL database, and use this guide to create a login using T-SQL.
Note
Be sure to select the Azure SQL Database option on the page.
Select your chosen method of authentication from the tabs below for steps to authenticate with your Azure SQL Database.
Note
Only the server-level principal login (created by the provisioning process) or members of the loginmanager
database role in the master database can create new logins. It takes about 15 minutes after granting permission, the Microsoft Purview account should have the appropriate permissions to be able to scan the resource(s).
You'll need a SQL login with at least
db_datareader
permissions to be able to access the information Microsoft Purview needs to scan the database. You can follow the instructions in CREATE LOGIN to create a sign-in for Azure SQL Database. You'll need to save the username and password for the next steps.Navigate to your key vault in the Azure portal.
Select Settings > Secrets and select + Generate/Import
Enter the Name and Value as the password from your Azure SQL Database.
Select Create to complete
If your key vault isn't connected to Microsoft Purview yet, you'll need to create a new key vault connection
Finally, create a new credential using the key to set up your scan.
Creating the scan
Open your Microsoft Purview account and select the Open Microsoft Purview governance portal
Navigate to the Data map --> Sources to view the collection hierarchy
Select the New Scan icon under the Azure SQL DB registered earlier
Navigate to lineage section to learn more about data lineage from Azure SQL DB
Select your method of authentication from the tabs below for scanning steps.
Provide a Name for the scan, select Database selection method as Enter manually, enter the Database name and the Credential created earlier, choose the appropriate collection for the scan and select Test connection to validate the connection. Once the connection is successful, select Continue
Scoping and running the scan
You can scope your scan to specific database objects by choosing the appropriate items in the list.
Then select a scan rule set. You can choose between the system default, existing custom rule sets, or create a new rule set inline.
If creating a new scan rule set
You can select the classification rules to be included in the scan rule
Choose your scan trigger. You can set up a schedule or run the scan once.
Review your scan and select Save and run.
View Scan
Navigate to the data source in the Collection and select View Details to check the status of the scan
The scan details indicate the progress of the scan in the Last run status and the number of assets scanned and classified
The Last run status will be updated to In progress and then Completed once the entire scan has run successfully
Manage Scan
Scans can be managed or run again on completion
Select the Scan name to manage the scan
You can run the scan again, edit the scan, delete the scan
You can run an incremental scan or a full scan again
Access policy
To create an access policy for Azure Data Lake Storage Gen 2, follow these guides:
- Single SQL account - This guide will allow you to enable access policies on a single Azure SQL Database account in your subscription.
- All sources in a subscription or resource group - This guide will allow you to enable access policies on all enabled and available sources in a resource group, or across an Azure subscription.
Lineage (Preview)
Microsoft Purview supports lineage from Azure SQL Database. At the time of setting up a scan, enable lineage extraction toggle button to extract lineage.
Prerequisites for setting up scan with Lineage extraction
Follow steps under authentication for a scan using Managed Identity section to authorize Microsoft Purview scan your Azure SQL Database
Sign in to Azure SQL Database with Azure AD account and assign db_owner permissions to the Microsoft Purview Managed identity. Use below example SQL syntax to create user and grant permission by replacing 'purview-account' with your Account name:
Create user <purview-account> FROM EXTERNAL PROVIDER GO EXEC sp_addrolemember 'db_owner', <purview-account> GO
Run below command on your Azure SQL Database to create master Key
Create master key Go
Creating scan with lineage extraction toggle turned on
Enable lineage extraction toggle in the scan screen
Select your method of authentication by following steps in the scan section
Once the scan is successfully set up from previous step, a new scan type called Lineage extraction will run incremental scans every 6 hours to extract lineage from Azure SQL Database. Lineage is extracted based on the actual stored procedure runs in the Azure SQL Database
Search Azure SQL Database assets and view runtime lineage
You can browse data catalog or search data catalog to view asset details for Azure SQL Database. The following steps describe how-to view runtime lineage details.
Go to asset -> lineage tab, you can see the asset lineage when applicable. Refer to the supported capabilities section on the supported Azure SQL Database lineage scenarios. For more information about lineage in general, see data lineage and lineage user guide.
Go to stored procedure asset -> Properties -> Related assets to see the latest run details of stored procedures.
Select the stored procedure hyperlink next to Runs to see Azure SQL Stored Procedure Run Overview. Go to properties tab to see enhanced run time information from stored procedure. For example: executedTime, rowcount, Client Connection, and so on.
Troubleshooting steps
- If no lineage is captured after a successful Lineage extraction run, it's possible that no stored procedures have run at least once since the scan is set up.
- Lineage is captured for stored procedure runs that happened after a successful scan is set up. Lineage from past Stored procedure runs isn't captured.
- If your database is processing heavy workloads with lots of stored procedure runs, lineage extraction will filter only the most recent runs. Stored procedure runs early in the 6 hour window or the run instances that create heavy query load won't be extracted. Contact support if you're missing lineage from any stored procedure runs.
Next steps
Now that you've registered your source, follow the below guides to learn more about Microsoft Purview and your data.
Feedback
Submit and view feedback for