What is the Azure Information Protection unified labeling scanner?

Applies to: Azure Information Protection, Windows Server 2019, Windows Server 2016, Windows Server 2012 R2


If you're using the classic scanner, see What is the Azure Information Protection classic scanner?

To scan and label files on cloud repositories, use Cloud App Security instead of the scanner.

Use the information in this section to learn about the Azure Information Protection unified labeling scanner, and then how to install, configure, and run it, and troubleshoot it if necessary.

The AIP scanner runs as a service on Windows Server and lets you discover, classify, and protect files on the following data stores:

  • UNC paths for network shares that use the Server Message Block (SMB) protocol.

  • SharePoint document libraries and folders for SharePoint Server 2013 through SharePoint Server 2019. SharePoint 2010 is also supported for customers who have extended support for this version of SharePoint.

Azure Information Protection unified labeling scanner overview

The AIP scanner can inspect any files that Windows can index. If you've configured labels that apply automatic classification, the scanner can label discovered files to apply that classification, and optionally apply or remove protection.

The following image shows the AIP scanner architecture, where the scanner discovers files across your on-premises and SharePoint servers.

Azure Information Protection unified labeling scanner architecture

To inspect your files, the scanner uses IFilters installed on the computer. To determine whether the files need labeling, the scanner uses the Office 365 built-in data loss prevention (DLP) sensitive information types and pattern detection, or Office 365 regex patterns.
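As a rough illustration of regex-based pattern detection, the following Python sketch counts matches of a single pattern in text that has already been extracted from a file. The pattern, the function names, and the threshold are assumptions made for this example; they are not Office 365 sensitive information type definitions, and the scanner's own matching logic is not shown here.

```python
import re

# Hypothetical pattern: 16 digits in groups of four, optionally separated
# by spaces or hyphens. This is NOT an Office 365 sensitive information
# type definition; it only illustrates regex-based detection.
CARD_LIKE = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")


def count_matches(text: str, pattern: "re.Pattern[str]" = CARD_LIKE) -> int:
    """Return how many non-overlapping matches of pattern occur in text."""
    return len(pattern.findall(text))


def needs_label(text: str, min_count: int = 1) -> bool:
    """Flag text for labeling when it contains at least min_count matches."""
    return count_matches(text) >= min_count
```

A real sensitive information type typically combines a pattern with supporting evidence and confidence levels, so a single regex like this one would produce more false positives.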

The scanner uses the Azure Information Protection client, and can classify and protect the same types of files as the client. For more information, see File types supported by the Azure Information Protection unified labeling client.

Do any of the following to configure your scans as needed:

  • Run the scanner in discovery mode only, to create reports that show what would happen if your files were labeled.
  • Run the scanner to discover files with sensitive information, without configuring labels that apply automatic classification.
  • Run the scanner automatically to apply labels as configured.
  • Define a file types list to specify which file types to scan or to exclude.


The scanner does not discover and label in real time. It systematically crawls through files on the data stores that you specify. Configure this cycle to run once, or repeatedly.


The unified labeling scanner supports scanner clusters with multiple nodes, so your organization can scale out to achieve faster scan times and broader scope.

Deploy multiple nodes right from the start, or start with a single-node cluster and add more nodes later as you grow. To deploy multiple nodes, use the same cluster name and database with the Install-AIPScanner cmdlet on each node.

AIP scanning process

When scanning files, the AIP scanner runs through the following steps:

1. Determine whether files are included or excluded for scanning

2. Inspect and label files

3. Label files that can't be inspected

For more information, see Files not labeled by the scanner.

1. Determine whether files are included or excluded for scanning

The scanner automatically skips files that are excluded from classification and protection, such as executable files and system files. For more information, see File types that are excluded from classification and protection.

The scanner also considers any file lists that are explicitly defined for scanning or for exclusion from scanning. File lists apply to all data repositories by default, and can also be defined for specific repositories only.

To define file lists for scanning or exclusion, use the File types to scan setting in the content scan job. For example:

Configuring file types to scan for the Azure Information Protection scanner

For more information, see Deploying the Azure Information Protection scanner to automatically classify and protect files.
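The interaction between a default file list and a repository-specific file list can be sketched as follows. This is a minimal Python model, not the scanner's implementation: the extension sets, the repository path, and the rule that a repository-specific list replaces the default list are all assumptions made for illustration.

```python
from pathlib import PureWindowsPath
from typing import Dict, Optional, Set

# Hypothetical exclusion lists. In a real deployment these values come from
# the "File types to scan" setting in the content scan job.
DEFAULT_EXCLUDED: Set[str] = {".tmp", ".log"}
REPO_EXCLUDED: Dict[str, Set[str]] = {
    r"\\server\finance": {".bak"},  # hypothetical repository-specific list
}


def is_scanned(path: str, repository: Optional[str] = None) -> bool:
    """Return True when the file's extension is not on the effective list.

    Assumption: a list defined for a specific repository replaces the
    default list for that repository instead of being merged with it.
    """
    ext = PureWindowsPath(path).suffix.lower()
    excluded = REPO_EXCLUDED.get(repository, DEFAULT_EXCLUDED)
    return ext not in excluded
```

For instance, `is_scanned(r"\\server\share\debug.LOG")` uses the default list, while passing `r"\\server\finance"` as the repository switches to that repository's own list.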

2. Inspect and label files

After identifying excluded files, the scanner filters again to identify the files that are supported for inspection.

These additional filters are the same ones used by the operating system for Windows Search and indexing, and require no additional configuration. Windows IFilter is also used to scan file types that are used by Word, Excel, and PowerPoint, and for PDF documents and text files.

For a full list of file types supported for inspection, and additional instructions for configuring filters to include .zip and .tiff files, see File types supported for inspection.

After inspection, supported file types are labeled using the conditions specified for your labels. If you're using discovery mode, these files can be reported either as containing the conditions specified for your labels, or as containing any known sensitive information types.

3. Label files that can't be inspected

For any file types that can't be inspected, the AIP scanner applies the default label in the Azure Information Protection policy, or the default label configured for the scanner.
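The three steps of the scanning process can be summarized as a single decision function. The Python sketch below is a simplified model under stated assumptions: `process_file`, `Outcome`, and all parameters are hypothetical names, file types are reduced to extensions, and an inspectable file that matches no condition is assumed to stay unlabeled, which the text above does not specify.

```python
from enum import Enum
from typing import Callable, Optional, Set, Tuple


class Outcome(Enum):
    SKIPPED = "skipped"                 # step 1: excluded from scanning
    LABELED_BY_CONDITION = "condition"  # step 2: a label condition matched
    LABELED_BY_DEFAULT = "default"      # step 3: default label applied
    UNLABELED = "unlabeled"             # no label applied


def process_file(
    ext: str,
    text: str,
    excluded: Set[str],
    inspectable: Set[str],
    match_label: Callable[[str], Optional[str]],
    default_label: Optional[str],
) -> Tuple[Outcome, Optional[str]]:
    """Model the include/exclude, inspect, and default-label steps."""
    ext = ext.lower()

    # Step 1: skip files that are excluded from scanning.
    if ext in excluded:
        return Outcome.SKIPPED, None

    # Step 2: inspect supported file types and apply a matching label.
    if ext in inspectable:
        label = match_label(text)
        if label is not None:
            return Outcome.LABELED_BY_CONDITION, label
        # Assumption: an inspected file with no match stays unlabeled.
        return Outcome.UNLABELED, None

    # Step 3: file types that can't be inspected get the default label,
    # if one is configured.
    if default_label is not None:
        return Outcome.LABELED_BY_DEFAULT, default_label
    return Outcome.UNLABELED, None
```

The sketch ignores whether the resulting label can actually be written to the file; those limits are a separate check.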

Files not labeled by the scanner

The AIP scanner cannot label files under the following circumstances:

  • When the label applies classification but not protection, and the client does not support classification only for the file type. For more information, see Unified labeling client file types.

  • When the label applies classification and protection, but the scanner does not support protection for the file type.

    By default, the scanner protects only Office file types, and PDF files when they are protected by using the ISO standard for PDF encryption.

    Other types of files can be added for protection when you change the types of files to protect.

Example: After inspecting .txt files, the scanner can't apply a label that's configured for classification only, because the .txt file type doesn't support classification only.

However, if the label is configured for both classification and protection, and the .txt file type is included for the scanner to protect, the scanner can label the file.
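The .txt example can be expressed as a small rule. The following Python sketch is an illustration only: the `Label` class, the `can_label` function, and both extension sets are hypothetical, and the sets are placeholders rather than the supported file type lists from the product documentation.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Label:
    name: str
    applies_protection: bool  # False means classification only


# Placeholder sets for illustration; not the documented supported lists.
CLASSIFICATION_ONLY_TYPES: Set[str] = {".docx", ".xlsx", ".pptx", ".pdf"}
DEFAULT_PROTECTED_TYPES: Set[str] = {".docx", ".xlsx", ".pptx", ".pdf"}


def can_label(
    ext: str,
    label: Label,
    classification_only_types: Set[str],
    protected_types: Set[str],
) -> bool:
    """Return True when this simplified rule allows labeling the file."""
    ext = ext.lower()
    if label.applies_protection:
        # Classification and protection: the type must be configured
        # for the scanner to protect.
        return ext in protected_types
    # Classification only: the type must support classification only.
    return ext in classification_only_types
```

With these placeholder sets, a classification-only label is rejected for `.txt`, while a protecting label is accepted once `.txt` is added to the protected set, for example `DEFAULT_PROTECTED_TYPES | {".txt"}`.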

