Connect to and manage Azure Files in Microsoft Purview

This article outlines how to register Azure Files, and how to authenticate and interact with Azure Files in Microsoft Purview. For more information about Microsoft Purview, read the introductory article.

Supported capabilities

Capability            Supported
Metadata Extraction   Yes
Full Scan             Yes
Incremental Scan      Yes
Scoped Scan           Yes
Classification        Yes
Access Policy         No
Lineage               Limited**
Data Sharing          No

** Lineage is supported if the dataset is used as a source or sink in a Data Factory Copy activity.

Azure Files supports full and incremental scans to capture the metadata and classifications, based on system default and custom classification rules.

For delimited file types such as csv, tsv, psv, and ssv, the schema is extracted when all of the following conditions are met:

  1. First row values are non-empty
  2. First row values are unique
  3. First row values are neither a date nor a number
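
The three conditions above can be sketched as a small check on the first row of a delimited file. This is an illustrative approximation only, not how Microsoft Purview implements it: the function name first_row_is_header, the list of date formats, the numeric pattern, and the case-sensitive uniqueness comparison are all assumptions made for the example.

```python
import csv
import io
import re
from datetime import datetime

# Assumed date formats for illustration; the formats Purview recognizes may differ.
_DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d-%m-%Y")

# Assumed numeric pattern: optional sign, digits, optional fraction and exponent.
_NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def _is_number(value: str) -> bool:
    return _NUMBER.fullmatch(value) is not None


def _is_date(value: str) -> bool:
    for fmt in _DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def first_row_is_header(first_row: list[str]) -> bool:
    """Return True when the row satisfies all three conditions."""
    values = [v.strip() for v in first_row]
    # 1. First row values are non-empty.
    if not values or any(v == "" for v in values):
        return False
    # 2. First row values are unique (case-sensitive here, by assumption).
    if len(set(values)) != len(values):
        return False
    # 3. First row values are neither a date nor a number.
    return not any(_is_number(v) or _is_date(v) for v in values)


# Usage with a made-up csv sample.
sample = "id,name,created\n1,Ada,2021-01-01\n"
header = next(csv.reader(io.StringIO(sample)))
print(first_row_is_header(header))  # True
```

A first row such as "2021-01-01,name" or "id,id" fails the check, so under these conditions no schema would be extracted from it.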

Prerequisites

Register

This section describes how to register Azure Files in Microsoft Purview using the Microsoft Purview governance portal.

Authentication for registration

Currently there's only one way to set up authentication for Azure file shares:

  • Account Key

Account Key to register

When the selected authentication method is Account Key, you need to get your storage account access key and store it in a key vault:

  1. Navigate to your storage account
  2. Select Settings > Access keys
  3. Copy your key and save it somewhere for the next steps
  4. Navigate to your key vault
  5. Select Settings > Secrets
  6. Select + Generate/Import, enter a Name for the secret, and paste the key from your storage account as the Value
  7. Select Create to complete
  8. If your key vault isn't connected to Microsoft Purview yet, you will need to create a new key vault connection
  9. Finally, create a new credential using the key to set up your scan
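
Steps 1 through 7 above can also be scripted. The following is a minimal sketch that assumes the Azure CLI (az) is installed and signed in; it builds the two az commands as argument lists and runs them with subprocess. The function names and all resource names are hypothetical placeholders. Steps 8 and 9 (connecting the key vault to Microsoft Purview and creating the credential) are not covered here.

```python
import subprocess


def list_key_command(storage_account: str, resource_group: str) -> list[str]:
    """az command that prints the first access key of the storage account."""
    return [
        "az", "storage", "account", "keys", "list",
        "--account-name", storage_account,
        "--resource-group", resource_group,
        "--query", "[0].value",
        "--output", "tsv",
    ]


def set_secret_command(key_vault: str, secret_name: str, value: str) -> list[str]:
    """az command that stores a value as a secret in the key vault."""
    return [
        "az", "keyvault", "secret", "set",
        "--vault-name", key_vault,
        "--name", secret_name,
        "--value", value,
    ]


def store_account_key(storage_account: str, resource_group: str,
                      key_vault: str, secret_name: str) -> None:
    """Read the storage account key and save it as a key vault secret."""
    result = subprocess.run(
        list_key_command(storage_account, resource_group),
        check=True, capture_output=True, text=True,
    )
    account_key = result.stdout.strip()
    # Output is captured so the secret value is not echoed to the terminal.
    subprocess.run(
        set_secret_command(key_vault, secret_name, account_key),
        check=True, capture_output=True, text=True,
    )


# Example call with placeholder names (not executed here):
# store_account_key("mystorageacct", "my-rg", "my-key-vault", "azure-files-key")
```

Passing the key with --value exposes it to the local process list while the command runs, so this approach is best kept to a trusted workstation. On Windows, the az executable may need to be invoked as az.cmd.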

Steps to register

To register a new Azure Files account in your data catalog, follow these steps:

  1. Open the Microsoft Purview governance portal.
  2. Select Data Map on the left navigation.
  3. Select Register.
  4. On Register sources, select Azure Files.
  5. Select Continue.

On the Register sources (Azure Files) screen, follow these steps:

  1. Enter a Name that the data source will be listed with in the Catalog.
  2. Choose your Azure subscription to filter down Azure Storage Accounts.
  3. Select an Azure Storage Account.
  4. Select a collection or create a new one (Optional).
  5. Select Register to register the data source.
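
The portal steps above have a programmatic counterpart. The sketch below only builds the request URL and JSON body for registering a data source and does not send anything. It rests on several assumptions that should be checked against the Microsoft Purview Scanning REST API reference before use: the account endpoint form, the datasources path, the kind value AzureFileService, and the property names. The function name and every value in the usage example are hypothetical, and the API version is left as a required parameter rather than guessed.

```python
import json


def build_register_request(purview_account: str, source_name: str,
                           storage_account: str, subscription_id: str,
                           resource_group: str, collection: str,
                           api_version: str) -> tuple[str, dict]:
    """Build (url, body) for a PUT request that registers an Azure Files source.

    Assumed, not confirmed by this article: endpoint form, path, kind value,
    and property names. Verify them against the REST API reference.
    """
    url = (
        f"https://{purview_account}.purview.azure.com/scan/"
        f"datasources/{source_name}?api-version={api_version}"
    )
    body = {
        "kind": "AzureFileService",  # assumed kind name for Azure Files
        "properties": {
            "endpoint": f"https://{storage_account}.file.core.windows.net/",
            "subscriptionId": subscription_id,
            "resourceGroup": resource_group,
            "resourceName": storage_account,
            "collection": {
                "referenceName": collection,
                "type": "CollectionReference",
            },
        },
    }
    return url, body


# Usage with placeholder values; api_version must come from the API reference.
url, body = build_register_request(
    purview_account="my-purview",
    source_name="AzureFiles-demo",
    storage_account="mystorageacct",
    subscription_id="00000000-0000-0000-0000-000000000000",
    resource_group="my-rg",
    collection="my-collection",
    api_version="<api-version>",
)
print(url)
print(json.dumps(body, indent=2))
```

Sending the request additionally requires a Microsoft Entra ID access token for the Microsoft Purview account, which is outside the scope of this sketch.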

Scan

Follow the steps below to scan Azure Files to automatically identify assets and classify your data. For more information about scanning in general, see our introduction to scans and ingestion.

Create and run scan

To create and run a new scan, follow these steps:

  1. Select the Data Map tab on the left pane in the Microsoft Purview governance portal.

  2. Select the Azure Files source that you registered.

  3. Select New scan

  4. Select the account key credential to connect to your data source.

  5. You can scope your scan to specific file shares and folders by choosing the appropriate items in the list.

  6. Then select a scan rule set. You can choose between the system default, existing custom rule sets, or create a new rule set inline.

  7. Choose your scan trigger. You can set up a recurring schedule or run the scan once.

  8. Review your scan and select Save and run.

View your scans and scan runs

To view existing scans, do the following:

  1. Go to the Microsoft Purview governance portal. Select the Data Map tab on the left pane.

  2. Select the desired data source. You will see a list of existing scans on that data source under Recent scans, or you can view all scans under the Scans tab.

  3. Select the scan that has results you want to view.

  4. This page will show you all of the previous scan runs along with the status and metrics for each scan run. It will also display whether your scan was scheduled or manual, how many assets had classifications applied, how many total assets were discovered, the start and end time of the scan, and the total scan duration.
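
The fields listed in the last step relate to each other in a simple way: the total duration is the end time minus the start time, and the classified count is a subset of the discovered count. A minimal sketch of such a record follows. The class ScanRunSummary and its field names are hypothetical and do not reflect a Microsoft Purview data model; the sample values are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ScanRunSummary:
    """Hypothetical record mirroring the fields shown for a scan run."""
    status: str
    trigger: str              # "scheduled" or "manual"
    assets_discovered: int
    assets_classified: int
    start_time: datetime
    end_time: datetime

    @property
    def duration(self) -> timedelta:
        # Total scan duration is end time minus start time.
        return self.end_time - self.start_time

    @property
    def classified_ratio(self) -> float:
        # Share of discovered assets that had classifications applied.
        if self.assets_discovered == 0:
            return 0.0
        return self.assets_classified / self.assets_discovered


# Usage with made-up values.
run = ScanRunSummary(
    status="Completed",
    trigger="manual",
    assets_discovered=200,
    assets_classified=50,
    start_time=datetime(2024, 1, 1, 10, 0, 0),
    end_time=datetime(2024, 1, 1, 10, 12, 30),
)
print(run.duration)          # 0:12:30
print(run.classified_ratio)  # 0.25
```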

Manage your scans - edit, delete, or cancel

To edit, cancel, or delete a scan, do the following:

  1. Go to the Microsoft Purview governance portal. Select the Data Map tab on the left pane.

  2. Select the desired data source. You will see a list of existing scans on that data source under Recent scans, or you can view all scans under the Scans tab.

  3. Select the scan you would like to manage. You can edit the scan by selecting Edit scan.

  4. You can cancel an in-progress scan by selecting Cancel scan run.

  5. You can delete your scan by selecting Delete scan.

Note

  • Deleting your scan does not delete catalog assets created from previous scans.
  • If you edit an asset's description on the Schema tab in Microsoft Purview, the asset is no longer updated with schema changes when the source later changes and you rescan it.

Next steps

Now that you have registered your source, follow the guides below to learn more about Microsoft Purview and your data.