Deploying previous versions of the Azure Information Protection scanner

Applies to: Azure Information Protection, Windows Server 2019, Windows Server 2016, Windows Server 2012 R2

Instructions for: Azure Information Protection client for Windows

Note

To provide a unified and streamlined customer experience, the Azure Information Protection client (classic) and Label Management in the Azure portal are being deprecated as of March 31, 2021. This time frame allows all current Azure Information Protection customers to transition to the unified labeling solution by using the Microsoft Information Protection Unified Labeling platform. Learn more in the official deprecation notice.

Note

This article is for versions of the Azure Information Protection scanner that are earlier than version 1.48.204.0 but still in support. To upgrade earlier versions to the current version, see Upgrading the Azure Information Protection scanner.

If you are looking for deployment instructions for the current version of the scanner, which includes configuration from the Azure portal, see Deploying the Azure Information Protection scanner to automatically classify and protect files.

Use this information to learn about the Azure Information Protection scanner, and then how to successfully install, configure, and run it.

This scanner runs as a service on Windows Server and lets you discover, classify, and protect files on the following data stores:

  • UNC paths for network shares that use the Server Message Block (SMB) protocol.

  • Document libraries and folders for SharePoint Server 2013 through SharePoint Server 2019. SharePoint 2010 is also supported for customers who have extended support for this version of SharePoint.

To scan and label files on cloud repositories, use Cloud App Security instead of the scanner.

Overview of the Azure Information Protection scanner

When you have configured your Azure Information Protection policy for labels that apply automatic classification, files that this scanner discovers can then be labeled. Labels apply classification, and optionally, apply protection or remove protection:

[Figure: Azure Information Protection scanner architecture overview]

The scanner can inspect any files that Windows can index, by using IFilters that are installed on the computer. Then, to determine whether the files need labeling, the scanner uses the Office 365 built-in data loss prevention (DLP) sensitive information types and pattern detection, or Office 365 regex patterns. Because the scanner uses the Azure Information Protection client, it can classify and protect the same file types as the client.

You can run the scanner in discovery mode only, where you use the reports to check what would happen if the files were labeled. Or, you can run the scanner to automatically apply the labels. You can also run the scanner to discover files that contain sensitive information types, without configuring labels for conditions that apply automatic classification.

Note that the scanner does not discover and label in real time. It systematically crawls through files on data stores that you specify, and you can configure this cycle to run once, or repeatedly.

You can specify which file types to scan, or exclude from scanning. To restrict which files the scanner inspects, define a file types list by using Set-AIPScannerScannedFileTypes.

Prerequisites for the Azure Information Protection scanner

Before you install the Azure Information Protection scanner, make sure that the following requirements are in place.

Requirement | More information

Requirement: Windows Server computer to run the scanner service:

- 4 core processors

- 8 GB of RAM

- 10 GB free space (average) for temporary files

More information: Windows Server 2019, Windows Server 2016, or Windows Server 2012 R2.

Note: For testing or evaluation purposes in a non-production environment, you can use a Windows client operating system that is supported by the Azure Information Protection client.

This computer can be a physical or virtual computer that has a fast and reliable network connection to the data stores to be scanned.

The scanner requires sufficient disk space to create temporary files for each file that it scans, four files per core. The recommended disk space of 10 GB lets a 4-core processor scan 16 files at a time, each with a file size of 625 MB.

If internet connectivity is not possible because of your organization policies, see the Deploying the scanner with alternative configurations section. Otherwise, make sure that this computer has internet connectivity that allows the following URLs over HTTPS (port 443):
*.aadrm.com
*.azurerms.com
*.informationprotection.azure.com
informationprotection.hosting.portal.azure.net
*.aria.microsoft.com

Requirement: Service account to run the scanner service

More information: In addition to running the scanner service on the Windows Server computer, this Windows account authenticates to Azure AD and downloads the Azure Information Protection policy. This account must be an Active Directory account that is synchronized to Azure AD. If you cannot synchronize this account because of your organization policies, see the Deploying the scanner with alternative configurations section.

This service account has the following requirements:

- Log on locally user right assignment. This right is required for the installation and configuration of the scanner, but not for operation. You must grant this right to the service account, but you can remove it after you have confirmed that the scanner can discover, classify, and protect files. If granting this right even for a short period of time is not possible because of your organization policies, see the Deploying the scanner with alternative configurations section.

- Log on as a service user right assignment. This right is automatically granted to the service account during the scanner installation, and it is required for the installation, configuration, and operation of the scanner.

- Permissions to the data repositories: For data repositories on SharePoint on-premises, always grant the Edit permission if Add and Customize Pages is selected for the site, or grant the Design permission. For other data repositories, grant Read and Write permissions for scanning the files and then applying classification and protection to the files that meet the conditions in the Azure Information Protection policy. To run the scanner in discovery mode only for these other data repositories, Read permission is sufficient.

- For labels that reprotect or remove protection: To ensure that the scanner always has access to protected files, make this account a super user for the Azure Rights Management service, and ensure that the super user feature is enabled. For more information about the account requirements for applying protection, see Preparing users and groups for Azure Information Protection. In addition, if you have implemented onboarding controls for a phased deployment, make sure that this account is included in the onboarding controls that you've configured.

Requirement: SQL Server to store the scanner configuration:

- Local or remote instance

- Case insensitive collation

- Sysadmin role to install the scanner

More information: SQL Server 2012 is the minimum version for the following editions:

- SQL Server Enterprise

- SQL Server Standard

- SQL Server Express

If you install more than one instance of the scanner, each scanner instance requires its own SQL Server instance.

When you install the scanner and your account has the Sysadmin role, the installation process automatically creates the AzInfoProtectionScanner database and grants the required db_owner role to the service account that runs the scanner. If you cannot be granted the Sysadmin role or your organization policies require databases to be created and configured manually, see the Deploying the scanner with alternative configurations section.

The size of the configuration database varies for each deployment, but we recommend that you allocate 500 MB for every 1,000,000 files that you want to scan.

Requirement: The Azure Information Protection client (classic) is installed on the Windows Server computer

More information: You must install the full client for the scanner. Do not install the client with just the PowerShell module.

For client installation instructions, see the admin guide. If you have previously installed the scanner and now need to upgrade it to a later version, see Upgrading the Azure Information Protection scanner.

Requirement: Configured labels that apply automatic classification, and optionally, protection

More information: For more information about how to configure a label for conditions and to apply protection:
- How to configure conditions for automatic and recommended classification
- How to configure a label for Rights Management protection

Tip: You can use the instructions from the tutorial to test the scanner with a label that looks for credit card numbers in a prepared Word document. However, you will need to change the label configuration so that Select how this label is applied is set to Automatic rather than Recommended. Then remove the label from the document (if it is applied) and copy the file to a data repository for the scanner.

Although you can run the scanner even if you haven't configured labels that apply automatic classification, this scenario is not covered by these instructions. More information

Requirement: For SharePoint document libraries and folders to be scanned:

- SharePoint 2019

- SharePoint 2016

- SharePoint 2013

- SharePoint 2010

More information: Other versions of SharePoint are not supported for the scanner.

When you use versioning, the scanner inspects and labels the last published version. If the scanner labels a file and content approval is required, that labeled file must be approved before it is available to users.

For large SharePoint farms, check whether you need to increase the list view threshold (by default, 5,000) for the scanner to access all files. For more information, see the following SharePoint documentation: Manage large lists and libraries in SharePoint

Requirement: For Office documents to be scanned:

- 97-2003 file formats and Office Open XML formats for Word, Excel, and PowerPoint

More information: For more information about the file types that the scanner supports for these file formats, see File types supported by the Azure Information Protection client

Requirement: For long paths:

- Maximum of 260 characters, unless the scanner is installed on Windows Server 2016 and the computer is configured to support long paths

More information: Windows 10 and Windows Server 2016 support path lengths greater than 260 characters with the following group policy setting: Local Computer Policy > Computer Configuration > Administrative Templates > All Settings > Enable Win32 long paths

For more information about supporting long file paths, see the Maximum Path Length Limitation section from the Windows 10 developer documentation.
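
As an illustration of the long-path requirement, the following Python sketch flags paths that exceed the default 260-character maximum before a scan. The function name and the second sample path are hypothetical and not part of the scanner; the sketch only compares string lengths and does not check whether Win32 long paths are enabled on the computer.

```python
from typing import Iterable, List

MAX_PATH = 260  # default maximum path length stated in the prerequisites


def paths_over_limit(paths: Iterable[str], limit: int = MAX_PATH) -> List[str]:
    """Return the paths whose length exceeds the limit (hypothetical helper)."""
    return [p for p in paths if len(p) > limit]


# Sample paths: the first is the UNC form used in this article,
# the second is an artificially long, made-up path.
short_path = r"\\NAS\Documents\report.docx"
long_path = "\\\\NAS\\Documents\\" + "a" * 300 + ".docx"

for path in paths_over_limit([short_path, long_path]):
    print(f"{len(path)} characters: {path[:40]}...")
```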

If you can't meet all the requirements in the table because they are prohibited by your organization policies, see the next section for alternatives.

If all the requirements are met, go straight to the installation section.
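
The sizing figures in the prerequisites (four temporary files per core, 10 GB of temporary disk space, and 500 MB of database space per 1,000,000 files) can be turned into a quick estimate. The following Python sketch is only an illustration of that arithmetic; the function names are hypothetical, and it assumes 1 GB = 1,000 MB, which is what the 16 × 625 MB example implies.

```python
FILES_PER_CORE = 4             # the scanner creates temporary files for four files per core
DB_MB_PER_MILLION_FILES = 500  # recommended configuration database allocation


def avg_file_size_mb(disk_space_mb: float, cores: int) -> float:
    """Average file size (MB) that fits when all concurrent scan slots are in use."""
    concurrent_files = cores * FILES_PER_CORE
    return disk_space_mb / concurrent_files


def config_db_size_mb(file_count: int) -> float:
    """Recommended configuration database size (MB) for a number of files to scan."""
    return file_count / 1_000_000 * DB_MB_PER_MILLION_FILES


# 10 GB of temporary space on a 4-core computer: 16 files of 625 MB each.
print(avg_file_size_mb(10_000, 4))
# Three million files to scan (an example figure, not from this article).
print(config_db_size_mb(3_000_000))
```

Treat the results as starting points only; the article describes the 10 GB figure as an average.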

Deploying the scanner with alternative configurations

The prerequisites listed in the table are the default requirements for the scanner. They are recommended because they are the simplest configuration for the scanner deployment, and they should be suitable for initial testing, so that you can check the capabilities of the scanner. However, in a production environment, your organization policies might prohibit these default requirements because of one or more of the following restrictions:

  • Servers are not allowed internet connectivity

  • You cannot be granted Sysadmin or databases must be created and configured manually

  • Service accounts cannot be granted the Log on locally right

  • Service accounts cannot be synchronized to Azure Active Directory but servers have internet connectivity

The scanner can accommodate these restrictions, but they require additional configuration.

Restriction: The scanner server cannot have internet connectivity

Follow the instructions for a disconnected computer.

Note that in this configuration, the scanner cannot apply protection (or remove protection) by using your organization's cloud-based key. Instead, the scanner is limited to using labels that apply classification only, or protection that uses HYOK.

Restriction: You cannot be granted Sysadmin or databases must be created and configured manually

If you can be granted the Sysadmin role temporarily to install the scanner, you can remove this role when the scanner installation is complete. When you use this configuration, the database is automatically created for you and the service account for the scanner is automatically granted the required permissions. However, the user account that configures the scanner requires the db_owner role for the AzInfoProtectionScanner database, and you must manually grant this role to the user account.

If you cannot be granted the Sysadmin role even temporarily, you must ask a user with Sysadmin rights to manually create a database named AzInfoProtectionScanner before you install the scanner. For this configuration, the following roles must be assigned:

Account | Database-level role
Service account for the scanner | db_owner
User account for scanner installation | db_owner
User account for scanner configuration | db_owner

Typically, you will use the same user account to install and configure the scanner. But if you use different accounts, they both require the db_owner role for the AzInfoProtectionScanner database.

To create a user and grant db_owner rights on this database, ask the Sysadmin to run the following SQL script twice: the first time for the service account that runs the scanner, and the second time for the account that you use to install and manage the scanner. Before running the script, replace domain\user with the domain name and user account name of the service account or user account:

-- Create the Windows login if it does not already exist.
IF NOT EXISTS (SELECT * FROM master.sys.server_principals WHERE sid = SUSER_SID('domain\user'))
BEGIN
    DECLARE @T nvarchar(500)
    SET @T = 'CREATE LOGIN ' + QUOTENAME('domain\user') + ' FROM WINDOWS'
    EXEC(@T)
END

USE AzInfoProtectionScanner
-- Create the database user first, and then add it to the db_owner role.
IF NOT EXISTS (SELECT * FROM sys.database_principals WHERE sid = SUSER_SID('domain\user'))
BEGIN
    DECLARE @X nvarchar(500)
    SET @X = 'CREATE USER ' + QUOTENAME('domain\user') + ' FROM LOGIN ' + QUOTENAME('domain\user')
    EXEC(@X)
    EXEC sp_addrolemember 'db_owner', 'domain\user'
END
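
Because the script is run once per account, it can help to generate both copies from a single template instead of editing domain\user by hand. The following Python sketch is one possible way to do that, not part of the scanner tooling: the function name and the CONTOSO account names are hypothetical, and the template creates the database user before adding it to db_owner.

```python
SQL_TEMPLATE = """IF NOT EXISTS (SELECT * FROM master.sys.server_principals WHERE sid = SUSER_SID('{account}'))
BEGIN
    DECLARE @T nvarchar(500)
    SET @T = 'CREATE LOGIN ' + QUOTENAME('{account}') + ' FROM WINDOWS'
    EXEC(@T)
END
USE AzInfoProtectionScanner
IF NOT EXISTS (SELECT * FROM sys.database_principals WHERE sid = SUSER_SID('{account}'))
BEGIN
    DECLARE @X nvarchar(500)
    SET @X = 'CREATE USER ' + QUOTENAME('{account}') + ' FROM LOGIN ' + QUOTENAME('{account}')
    EXEC(@X)
    EXEC sp_addrolemember 'db_owner', '{account}'
END
"""


def render_grant_script(account: str) -> str:
    """Fill the template for one account given as domain\\user (hypothetical helper)."""
    domain, separator, user = account.partition("\\")
    if not separator or not domain or not user:
        raise ValueError("expected an account in the form domain\\user")
    # Double any single quotes so the name stays inside the SQL string literals.
    return SQL_TEMPLATE.format(account=account.replace("'", "''"))


# Hypothetical accounts: the scanner service account and the installing user.
for account in (r"CONTOSO\aipscannersvc", r"CONTOSO\scanneradmin"):
    print(render_grant_script(account))
    print("GO")  # batch separator for sqlcmd or SSMS, so @T and @X can be declared again
```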

Additionally:

  • You must be a local administrator on the server that will run the scanner

  • The service account that will run the scanner must be granted Full Control permissions to the following registry keys:

    • HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\MSIPC\Server
    • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSIPC\Server

If, after configuring these permissions, you see an error when you install the scanner, you can ignore the error and manually start the scanner service.

Restriction: The service account for the scanner cannot be granted the Log on locally right

If your organization policies prohibit the Log on locally right for service accounts but allow the Log on as a batch job right, follow the instructions for Specify and use the Token parameter for Set-AIPAuthentication from the admin guide.

Restriction: The scanner service account cannot be synchronized to Azure Active Directory but the server has internet connectivity

You can have one account to run the scanner service and use another account to authenticate to Azure Active Directory.

Install the scanner

  1. Sign in to the Windows Server computer that will run the scanner. Use an account that has local administrator rights and that has permissions to write to the SQL Server master database.

  2. Open a Windows PowerShell session with the Run as an administrator option.

  3. Run the Install-AIPScanner cmdlet, specifying your SQL Server instance on which to create a database for the Azure Information Protection scanner:

    Install-AIPScanner -SqlServerInstance <name>
    

    For example:

    • For a default instance: Install-AIPScanner -SqlServerInstance SQLSERVER1

    • For a named instance: Install-AIPScanner -SqlServerInstance SQLSERVER1\AIPSCANNER

    • For SQL Server Express: Install-AIPScanner -SqlServerInstance SQLSERVER1\SQLEXPRESS

    When you are prompted, provide the credentials for the scanner service account (<domain\user name>) and password.

  4. Verify that the service is now installed by using Administrative Tools > Services.

    The installed service is named Azure Information Protection Scanner and is configured to run by using the scanner service account that you created.

Now that you have installed the scanner, you need to get an Azure AD token for the scanner service account to authenticate so that it can run unattended.

Get an Azure AD token for the scanner

The Azure AD token lets the scanner service account authenticate to the Azure Information Protection service.

  1. From the same Windows Server computer, or from your desktop, sign in to the Azure portal to create two Azure AD applications that are needed to specify an access token for authentication. After an initial interactive sign-in, this token lets the scanner run non-interactively.

    To create these applications, follow the instructions in How to label files non-interactively for Azure Information Protection from the admin guide.

  2. From the Windows Server computer, if your scanner service account has been granted the Log on locally right for the installation: Sign in with this account and start a PowerShell session. Run Set-AIPAuthentication, specifying the values that you copied from the previous step:

    Set-AIPAuthentication -webAppId <ID of the "Web app / API" application> -webAppKey <key value generated in the "Web app / API" application> -nativeAppId <ID of the "Native" application>
    

    When prompted, specify the password for your service account credentials for Azure AD, and then click Accept.

    If your scanner service account cannot be granted the Log on locally right for the installation: Follow the instructions in the Specify and use the Token parameter for Set-AIPAuthentication section from the admin guide.

The scanner now has a token to authenticate to Azure AD. Depending on your configuration of the Web app / API in Azure AD, this token is valid for one year, two years, or never expires. When the token expires, you must repeat steps 1 and 2.

You're now ready to specify the data stores to scan.

Specify data stores for the scanner

Use the Add-AIPScannerRepository cmdlet to specify the data stores to be scanned by the Azure Information Protection scanner. You can specify UNC paths, and SharePoint Server URLs for SharePoint document libraries and folders.

Supported versions for SharePoint: SharePoint Server 2019, SharePoint Server 2016, and SharePoint Server 2013. SharePoint Server 2010 is also supported for customers who have extended support for this version of SharePoint.

  1. From the same Windows Server computer, in your PowerShell session, add your first data store by running the following command:

    Add-AIPScannerRepository -Path <path>
    

    For example: Add-AIPScannerRepository -Path \\NAS\Documents

    For other examples, run the PowerShell help command for this cmdlet: Get-Help Add-AIPScannerRepository -examples

  2. Repeat this command for all the data stores that you want to scan. If you need to remove a data store that you added, use the Remove-AIPScannerRepository cmdlet.

  3. Confirm that you have specified all the data stores correctly, by running the Get-AIPScannerRepository cmdlet:

    Get-AIPScannerRepository
    

With the scanner's default configuration, you're now ready to run your first scan in discovery mode.

Run a discovery cycle and view reports for the scanner

  1. In your PowerShell session, start the scanner by running the following command:

    Start-AIPScan
    

    Alternatively, you can start the scanner from the Azure portal. From the Azure Information Protection - Nodes pane, select your scanner node, and then the Scan now option:

    [Figure: Starting a scan for the Azure Information Protection scanner]

  2. Wait for the scanner to complete its cycle by running the following command:

    Get-AIPScannerStatus
    

    Alternatively, you can view the status from the Azure Information Protection - Nodes pane in the Azure portal, by checking the STATUS column.

    Look for the status to show Idle rather than Scanning.

    When the scanner has crawled through all the files in the data stores that you specified, the scanner stops, although the scanner service remains running.

    Check the Azure Information Protection log in the local Windows Applications and Services event logs. This log also reports when the scanner has finished scanning, with a summary of results. Look for the informational event ID 911.

  3. Review the reports that are stored in %localappdata%\Microsoft\MSIP\Scanner\Reports. The .txt summary files include the time taken to scan, the number of scanned files, and how many files had a match for the information types. The .csv files have more details for each file. This folder stores up to 60 reports for each scanning cycle, and all but the latest report are compressed to help minimize the required disk space.

    Note

    You can change the level of logging by using the ReportLevel parameter with Set-AIPScannerConfiguration, but you can't change the report folder location or name. Consider using a directory junction for the folder if you want to store the reports on a different volume or partition.

    For example, using the Mklink command: mklink /j D:\Scanner_reports C:\Users\aipscannersvc\AppData\Local\Microsoft\MSIP\Scanner\Reports

    With the default setting, only files that meet the conditions you've configured for automatic classification are included in the detailed reports. If you don't see any labels applied in these reports, check that your label configuration includes automatic rather than recommended classification.

    Tip

    Scanners send this information to Azure Information Protection every five minutes, so that you can view the results in near real time from the Azure portal. For more information, see Reporting for Azure Information Protection.

    If the results are not as you expect, you might need to fine-tune the conditions that you specified in your Azure Information Protection policy. If that's the case, repeat steps 1 through 3 until you are ready to change the configuration to apply the classification and optionally, protection.

The Azure portal displays information about the last scan only. If you need to see the results of previous scans, return to the reports that are stored on the scanner computer, in the %localappdata%\Microsoft\MSIP\Scanner\Reports folder.
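
When you go back through stored reports, it can be convenient to list them newest first. The following Python sketch is a minimal illustration that relies only on file extensions and modification times; the helper name is hypothetical, no report file names or columns are assumed, and compressed older reports are not matched by the .txt or .csv filter. Because %localappdata% resolves per user, run it as the scanner service account or point it at that account's Reports folder.

```python
import os
from pathlib import Path
from typing import List


def latest_reports(reports_dir: Path, suffix: str, limit: int = 5) -> List[Path]:
    """Return up to `limit` files with the given extension, newest first."""
    files = [
        p for p in reports_dir.iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    ]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)[:limit]


if __name__ == "__main__":
    local_app_data = os.environ.get("LOCALAPPDATA")
    if local_app_data:
        reports = Path(local_app_data) / "Microsoft" / "MSIP" / "Scanner" / "Reports"
        if reports.is_dir():
            for report in latest_reports(reports, ".txt"):
                print(report.name)
```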

When you're ready to automatically label the files that the scanner discovers, continue to the next procedure.

Configure the scanner to apply classification and protection

In its default setting, the scanner runs one time and in the reporting-only mode. To change these settings, use the Set-AIPScannerConfiguration cmdlet:

  1. On the Windows Server computer, in the PowerShell session, run the following command:

    Set-AIPScannerConfiguration -Enforce On -Schedule Always
    

    你可能还希望更改其他配置设置。There are other configuration settings that you might want to change. 例如,是否更改文件属性,以及报告中应记录的内容。For example, whether file attributes are changed and what is logged in the reports. 此外,如果 Azure 信息保护策略包括需要理由信息以降低分类级别或移除保护的设置,请使用此 cmdlet 指定该信息。In addition, if your Azure Information Protection policy includes the setting that requires a justification message to lower the classification level or remove protection, specify that message by using this cmdlet. 有关每个配置设置的详细信息,请使用以下 PowerShell 帮助命令: Get-Help Set-AIPScannerConfiguration -detailedUse the following PowerShell help command for more information about each configuration setting: Get-Help Set-AIPScannerConfiguration -detailed

  2. Make a note of the current time and start the scanner again by running the following command:

    Start-AIPScan
    

    Alternatively, you can start the scanner from the Azure portal. From the Azure Information Protection - Nodes pane, select your scanner node, and then the Scan now option:

    Image: Initiate scan for the Azure Information Protection scanner

  3. Monitor the event log for the informational type 911 again, with a time stamp later than when you started the scan in the previous step.

    Then check the reports to see details of which files were labeled, what classification was applied to each file, and whether protection was applied to them. Or, use the Azure portal to more easily see this information.

Because the schedule is configured to run continuously, when the scanner has worked its way through all the files, it starts a new cycle so that any new and changed files are discovered.
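The three steps above can be sketched as a single PowerShell sequence. This is an illustration rather than a definitive script: it assumes the scanner cmdlets are installed on the computer, and it assumes the event log is named Azure Information Protection, as described in the event log section of this article. The query returns nothing until the scan cycle has finished.

```powershell
# Note the start time so that only later events are matched.
$scanStart = Get-Date

# Switch from reporting-only mode to enforcement, on a continuous schedule.
Set-AIPScannerConfiguration -Enforce On -Schedule Always
Start-AIPScan

# Assumption: the log name below matches the one on your scanner computer.
# -ErrorAction SilentlyContinue suppresses the error raised when no event matches yet.
Get-WinEvent -FilterHashtable @{
    LogName   = 'Azure Information Protection'
    Id        = 911
    StartTime = $scanStart
} -ErrorAction SilentlyContinue |
    Select-Object -Property TimeCreated, Id, Message
```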

How files are scanned

The scanner runs through the following processes when it scans files.

1. Determine whether files are included or excluded for scanning

The scanner automatically skips files that are excluded from classification and protection, such as executable files and system files.

You can change this behavior by defining a list of file types to scan, or to exclude from scanning. When you specify this list and do not specify a data repository, the list applies to all data repositories that do not have their own list specified. To specify this list, use Set-AIPScannerScannedFileTypes.

After you have specified your file types list, you can add a new file type to the list by using Add-AIPScannerScannedFileTypes, and remove a file type from the list by using Remove-AIPScannerScannedFileTypes.

2. Inspect and label files

The scanner then uses filters to scan supported file types. These same filters are used by the operating system for Windows Search and indexing. Without any additional configuration, Windows IFilter is used to scan file types that are used by Word, Excel, and PowerPoint, and to scan PDF documents and text files.

For a full list of file types that are supported by default, and additional information about how to configure existing filters that include .zip files and .tiff files, see File types supported for inspection.

After inspection, these file types can be labeled by using the conditions that you specified for your labels. Or, if you're using discovery mode, these files can be reported to contain the conditions that you specified for your labels, or all known sensitive information types.

However, the scanner cannot label the files under the following circumstances:

  • If the label applies classification and not protection, and the file type does not support classification only.

  • If the label applies classification and protection, but the scanner does not protect the file type.

    By default, the scanner protects only Office file types, and PDF files when they are protected by using the ISO standard for PDF encryption. Other file types can be protected when you edit the registry as described in a following section.

For example, after inspecting files that have a file name extension of .txt, the scanner can't apply a label that's configured for classification but not protection, because the .txt file type doesn't support classification only. If the label is configured for classification and protection, and the registry is edited for the .txt file type, the scanner can label the file.

Tip

During this process, if the scanner stops and doesn't complete scanning a large number of the files in a repository:

  • You might need to increase the number of dynamic ports for the operating system hosting the files. Server hardening for SharePoint can be one reason why the scanner exceeds the number of allowed network connections, and therefore stops.

    To check whether this is the cause of the scanner stopping, look to see if the following error message is logged for the scanner in %localappdata%\Microsoft\MSIP\Logs\MSIPScanner.iplog (zipped if there are multiple logs): Unable to connect to the remote server ---> System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted IP:port

    For more information about how to view the current port range and increase the range, see Settings that can be Modified to Improve Network Performance.

  • For large SharePoint farms, you might need to increase the list view threshold (by default, 5,000). For more information, see the following SharePoint documentation: Manage large lists and libraries in SharePoint.
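The log check described in this tip can be scripted. The following is a minimal sketch: Find-ScannerSocketError is a hypothetical helper name, and it searches only an unzipped log file that you point it at.

```powershell
# Hypothetical helper (not an AIP cmdlet): searches a scanner log for the
# socket exhaustion error quoted in this article.
function Find-ScannerSocketError {
    param([Parameter(Mandatory)][string]$LogPath)
    # -SimpleMatch treats the pattern as literal text, not a regular expression.
    Select-String -Path $LogPath -SimpleMatch -Pattern 'Only one usage of each socket address'
}

# Usage on the scanner computer (log path from this article):
# Find-ScannerSocketError -LogPath "$env:LOCALAPPDATA\Microsoft\MSIP\Logs\MSIPScanner.iplog"

# On the computer that hosts the files, the current TCP dynamic port range
# can be viewed with the built-in netsh tool:
# netsh int ipv4 show dynamicport tcp
```

Any returned match suggests that port exhaustion is worth investigating; no output means the message was not found in that log file.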

3. Label files that can't be inspected

For the file types that can't be inspected, the scanner applies the default label in the Azure Information Protection policy, or the default label that you configure for the scanner.

As in the preceding step, the scanner cannot label the files under the following circumstances:

  • If the label applies classification and not protection, and the file type does not support classification only.

  • If the label applies classification and protection, but the scanner does not protect the file type.

    By default, the scanner protects only Office file types, and PDF files when they are protected by using the ISO standard for PDF encryption. Other file types can be protected when you edit the registry as described next.

Editing the registry for the scanner

To change the default scanner behavior for protecting file types other than Office files and PDFs, you must manually edit the registry and specify the additional file types that you want to be protected, and the type of protection (native or generic). For instructions, see File API configuration from the developer guidance. In this documentation for developers, generic protection is referred to as "PFile". In addition, specific to the scanner:

  • The scanner has its own default behavior: Only Office file formats and PDF documents are protected by default. If the registry is not modified, any other file types will not be labeled or protected by the scanner.

  • If you want the same default protection behavior as the Azure Information Protection client, where all files are automatically protected with native or generic protection: Specify the * wildcard as a registry key, Encryption as the value (REG_SZ), and Default as the value data.

When you edit the registry, manually create the MSIPC key and FileProtection key if they do not exist, as well as a key for each file name extension.

For example, for the scanner to protect TIFF images in addition to Office files and PDFs, the registry after you have edited it will look like the following picture. As an image file, TIFF files support native protection and the resulting file name extension is .ptiff.

Image: Editing the registry for the scanner to apply protection

For a list of text and image file types that similarly support native protection but must be specified in the registry, see Supported file types for classification and protection from the admin guide.

For files that don't support native protection, specify the file name extension as a new key, and PFile for generic protection. The resulting file name extension for the protected file is .pfile.
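If you prefer to script the registry change rather than edit it by hand, the general shape might look like the following sketch. Set-ScannerFileProtection is a hypothetical helper, not a scanner cmdlet. The key location and the exact key name for the extension in the commented usage line are assumptions that this article does not state; confirm both against the File API configuration article before running anything.

```powershell
# Hypothetical helper: creates <FileProtectionKey>\<ExtensionKey> if needed
# and sets its Encryption value.
function Set-ScannerFileProtection {
    param(
        [Parameter(Mandatory)][string]$FileProtectionKey,  # path to the FileProtection key
        [Parameter(Mandatory)][string]$ExtensionKey,       # key name for the file name extension
        [Parameter(Mandatory)][string]$Encryption          # value data, for example PFile
    )
    $keyPath = Join-Path $FileProtectionKey $ExtensionKey
    # -Force also creates missing parent keys (such as MSIPC and FileProtection).
    # The Test-Path guard avoids re-creating, and so clearing, an existing key.
    if (-not (Test-Path -Path $keyPath)) {
        New-Item -Path $keyPath -Force | Out-Null
    }
    # Encryption is a REG_SZ (string) value.
    New-ItemProperty -Path $keyPath -Name 'Encryption' -Value $Encryption -PropertyType String -Force | Out-Null
}

# Assumed usage (verify the key path and key naming first):
# Set-ScannerFileProtection -FileProtectionKey 'HKLM:\SOFTWARE\Microsoft\MSIPC\FileProtection' -ExtensionKey '.txt' -Encryption 'PFile'
```

Run such a script from an elevated session, and export the existing key first so that the change can be reverted.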

When files are rescanned

For the first scan cycle, the scanner inspects all files in the configured data stores. For subsequent scans, only new or modified files are inspected.

You can force the scanner to inspect all files again by running Start-AIPScan with the Reset parameter. The scanner must be configured for a manual schedule, which requires the Schedule parameter to be set to Manual with Set-AIPScannerConfiguration.

Alternatively, you can force the scanner to inspect all files again from the Azure Information Protection - Nodes pane in the Azure portal. Select your scanner from the list, and then select the Rescan all files option:

Image: Initiate rescan for the Azure Information Protection scanner

Inspecting all files again is useful when you want the reports to include all files. This configuration choice is typically used when the scanner runs in discovery mode. When a full scan is complete, the scan type automatically changes to incremental so that for subsequent scans, only new or modified files are scanned.
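As a brief illustration of the PowerShell route described above, the two cmdlets can be run in sequence. This sketch assumes the scanner cmdlets are available in the session and that you are willing to change the schedule to manual.

```powershell
# A full rescan with the Reset parameter requires a manual schedule.
Set-AIPScannerConfiguration -Schedule Manual

# Inspect all files again, not only new or modified ones.
Start-AIPScan -Reset
```

If the scanner previously ran on a continuous schedule, set the Schedule parameter back to its earlier value after the full scan finishes.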

In addition, all files are inspected when the scanner downloads an Azure Information Protection policy that has new or changed conditions. The scanner refreshes the policy every hour, and also when the service starts and the policy is older than one hour.

Tip

If you need to refresh the policy sooner than this one-hour interval, for example, during a testing period: Manually delete the policy file, Policy.msip, from both %LocalAppData%\Microsoft\MSIP\Policy.msip and %LocalAppData%\Microsoft\MSIP\Scanner. Then restart the Azure Information Protection Scanner service.

If you changed protection settings in the policy, also wait 15 minutes from when you saved the protection settings before you restart the service.

If the scanner downloaded a policy that had no automatic conditions configured, the copy of the policy file in the scanner folder does not update. In this scenario, you must delete the policy file, Policy.msip, from both %LocalAppData%\Microsoft\MSIP\Policy.msip and %LocalAppData%\Microsoft\MSIP\Scanner before the scanner can use a newly downloaded policy file that has labels correctly configured for automatic conditions.
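The manual policy refresh in this tip can be sketched in PowerShell as follows. Two assumptions apply: the session runs as the scanner service account, so that %LocalAppData% resolves to that account's profile, and the service display name shown is the one on your server (check it in the Services console).

```powershell
# Assumption: $env:LOCALAPPDATA belongs to the scanner service account.
$msipFolder = Join-Path $env:LOCALAPPDATA 'Microsoft\MSIP'
$policyFiles = @(
    (Join-Path $msipFolder 'Policy.msip'),
    (Join-Path $msipFolder 'Scanner\Policy.msip')
)

# Delete both copies of the policy file if they exist.
foreach ($file in $policyFiles) {
    if (Test-Path -Path $file) {
        Remove-Item -Path $file
    }
}

# Assumption: the display name of the scanner service.
Get-Service -DisplayName 'Azure Information Protection Scanner' | Restart-Service
```

If protection settings changed, leave the 15-minute wait in place before the restart step.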

Using the scanner with alternative configurations

The Azure Information Protection scanner supports two alternative scenarios in which labels do not need to be configured for any conditions:

  • Apply a default label to all files in a data repository.

    For this configuration, use the Set-AIPScannerRepository cmdlet, and set the MatchPolicy parameter to Off.

    The contents of the files are not inspected and all files in the data repository are labeled according to the default label that you specify for the data repository (with the SetDefaultLabel parameter) or, if this is not specified, the default label that is specified as a policy setting for the scanner account.

  • Identify all custom conditions and known sensitive information types.

    For this configuration, use the Set-AIPScannerConfiguration cmdlet, and set the DiscoverInformationTypes parameter to All.

    The scanner uses any custom conditions that you have specified for labels in the Azure Information Protection policy, and the list of information types that are available to specify for labels in the Azure Information Protection policy.

    The following quickstart uses this configuration, although it's for the current version of the scanner: Quickstart: Find what sensitive information you have.
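The two alternative configurations might be set as follows. This is a hedged sketch: the repository path is a placeholder, and the use of a Path parameter to identify the repository is an assumption, so check Get-Help Set-AIPScannerRepository -detailed for the exact syntax in your version. The MatchPolicy and DiscoverInformationTypes parameters and their values come from this article.

```powershell
# Alternative 1: label every file in one repository without inspecting content.
# Assumption: '\\server\share' stands in for a repository that you already added,
# and -Path is the parameter that identifies it.
Set-AIPScannerRepository -Path '\\server\share' -MatchPolicy Off

# Alternative 2: report all custom conditions and known sensitive information types.
Set-AIPScannerConfiguration -DiscoverInformationTypes All
```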

Optimizing the performance of the scanner

Use the following guidance to help you optimize the performance of the scanner. However, if your priority is the responsiveness of the scanner computer rather than the scanner performance, you can use an advanced client setting to limit the number of threads used by the scanner.

To maximize the scanner performance:

  • Have a high-speed and reliable network connection between the scanner computer and the scanned data store

    For example, place the scanner computer in the same LAN, or (preferred) in the same network segment as the scanned data store.

    The quality of the network connection affects the scanner performance because to inspect the files, the scanner transfers the contents of the files to the computer running the scanner service. When you reduce (or eliminate) the number of network hops this data has to travel, you also reduce the load on your network.

  • Make sure the scanner computer has available processor resources

    Inspecting the file contents for a match against your configured conditions, and encrypting and decrypting files, are processor-intensive actions. Monitor typical scanning cycles for your specified data stores to identify whether a lack of processor resources is negatively affecting the scanner performance.
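One way to watch processor use during a scan cycle is the built-in Get-Counter cmdlet. The following sketch is only one possible approach; it assumes an English-language Windows installation, because performance counter paths are localized, and the sampling interval and count are arbitrary choices.

```powershell
# Sample total CPU use every 5 seconds for one minute while a scan cycle runs.
# Assumption: English counter names; localized systems use translated paths.
Get-Counter -Counter '\Processor(_Total)\% Processor Time' -SampleInterval 5 -MaxSamples 12 |
    ForEach-Object { $_.CounterSamples } |
    Select-Object -Property Timestamp, CookedValue
```

Sustained values near 100 during typical cycles suggest that processor resources are limiting the scanner.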

Other factors that affect the scanner performance:

  • The current load and response times of the data stores that contain the files to scan

  • Whether the scanner runs in discovery mode or enforce mode

    Discovery mode typically has a higher scanning rate than enforce mode because discovery requires a single file read action, whereas enforce mode requires read and write actions.

  • Whether you change the conditions in the Azure Information Protection policy

    Your first scan cycle, when the scanner must inspect every file, takes longer than subsequent scan cycles that, by default, inspect only new and changed files. However, if you change the conditions in the Azure Information Protection policy, all files are scanned again, as described in the preceding section.

  • The construction of regular expressions for custom conditions

    To avoid heavy memory consumption and the risk of timeouts (15 minutes per file), review your regular expressions for efficient pattern matching. For example:

    • Avoid greedy quantifiers

    • Use non-capturing groups such as (?:expression) instead of (expression)

  • Your chosen logging level

    You can choose between Debug, Info, Error, and Off for the scanner reports. Off results in the best performance; Debug considerably slows down the scanner and should be used only for troubleshooting. For more information, see the ReportLevel parameter for the Set-AIPScannerConfiguration cmdlet by running Get-Help Set-AIPScannerConfiguration -detailed.

  • The files themselves:

    • With the exception of Excel files, Office files are more quickly scanned than PDF files.

    • Unprotected files are quicker to scan than protected files.

    • Large files take longer to scan than small files.

  • Additionally:

    • Confirm that the service account that runs the scanner has only the rights documented in the scanner prerequisites section, and then configure the advanced client setting to disable the low integrity level for the scanner.

    • The scanner runs more quickly when you use the alternative configuration to apply a default label to all files because the scanner does not inspect the file contents.

    • The scanner runs more slowly when you use the alternative configuration to identify all custom conditions and known sensitive information types.

    • You can decrease the scanner timeouts with advanced client settings for better scanning rates and lower memory consumption, but with the acknowledgment that some files might be skipped.
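To make the regular expression guidance in the list above concrete, the following sketch compares a capturing group with a non-capturing group, and a greedy quantifier with a lazy one. It uses the .NET regex class from PowerShell purely for illustration; the sample text and patterns are invented, and the article does not state that custom conditions are evaluated with exactly this engine.

```powershell
$text = 'Card: 1234-5678-9012-3456; tags: [A1] [B2]'

# Capturing group: the engine also stores what the group matched as group 1.
$capturing    = [regex]::Match($text, '(\d{4}-){3}\d{4}')
# Non-capturing group: same overall match, but no extra group is stored.
$nonCapturing = [regex]::Match($text, '(?:\d{4}-){3}\d{4}')

# Greedy quantifier: .* consumes the rest of the text, then backtracks to the last ']'.
$greedy = [regex]::Match($text, '\[.*\]')
# Lazy quantifier: .*? stops at the first ']'.
$lazy   = [regex]::Match($text, '\[.*?\]')

$capturing.Value             # 1234-5678-9012-3456
$capturing.Groups.Count      # 2 (whole match plus one captured group)
$nonCapturing.Groups.Count   # 1 (whole match only)
$greedy.Value                # [A1] [B2]
$lazy.Value                  # [A1]
```

On large files, the extra stored groups and the backtracking shown here are the kind of overhead that the guidance aims to reduce.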

List of cmdlets for the scanner

Other cmdlets for the scanner let you change the service account and database for the scanner, get the current settings for the scanner, and uninstall the scanner service. The scanner uses the following cmdlets:

  • Add-AIPScannerScannedFileTypes

  • Add-AIPScannerRepository

  • Get-AIPScannerConfiguration

  • Get-AIPScannerRepository

  • Get-AIPScannerStatus

  • Install-AIPScanner

  • Remove-AIPScannerRepository

  • Remove-AIPScannerScannedFileTypes

  • Set-AIPScanner

  • Set-AIPScannerConfiguration

  • Set-AIPScannerScannedFileTypes

  • Set-AIPScannerRepository

  • Start-AIPScan

  • Uninstall-AIPScanner

  • Update-AIPScanner

Note

Many of these cmdlets are now deprecated in the current version of the scanner, and the online help for the scanner cmdlets reflects this change. For cmdlet help for scanner versions earlier than 1.48.204.0, use the built-in Get-Help <cmdlet name> command in your PowerShell session.
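To see which of these cmdlets are present on a given computer, and to read their built-in help, something like the following can be used. The wildcard on the noun is a convenience chosen for this sketch; it relies on all scanner cmdlet nouns starting with AIPScan, as in the list above.

```powershell
# List the scanner cmdlets that are installed on this computer.
Get-Command -Noun 'AIPScan*' |
    Sort-Object -Property Name |
    Select-Object -Property Name, ModuleName

# Show the built-in help, including parameter descriptions, for one cmdlet.
Get-Help Set-AIPScannerConfiguration -Detailed
```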

Event log IDs and descriptions for the scanner

Use the following sections to identify the possible event IDs and descriptions for the scanner. These events are logged on the server that runs the scanner service, in the Windows Applications and Services event log named Azure Information Protection.

Information 910

Scanner cycle started.

This event is logged when the scanner service is started and begins to scan for files in the data repositories that you specified.

Information 911

Scanner cycle finished.

This event is logged when the scanner has finished a manual scan, or the scanner has finished a cycle for a continuous schedule.

If the scanner was configured to run manually rather than continuously, to run a new scan, use the Start-AIPScan cmdlet. To change the schedule, use the Set-AIPScannerConfiguration cmdlet and the Schedule parameter.
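Both event IDs can be read from PowerShell instead of Event Viewer. The following is a minimal sketch that assumes the log name given above; the limit of 20 events is an arbitrary choice.

```powershell
# Read the most recent cycle start (910) and cycle finish (911) events.
# -ErrorAction SilentlyContinue suppresses the error raised when no event matches.
Get-WinEvent -FilterHashtable @{
    LogName = 'Azure Information Protection'
    Id      = 910, 911
} -MaxEvents 20 -ErrorAction SilentlyContinue |
    Sort-Object -Property TimeCreated |
    Select-Object -Property TimeCreated, Id, Message
```

Pairing each 910 event with the next 911 event gives a rough duration for each scan cycle.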


Next steps

Interested in how the Core Services Engineering and Operations team in Microsoft implemented this scanner? Read the technical case study: Automating data protection with Azure Information Protection scanner.

You might be wondering: What's the difference between Windows Server FCI and the Azure Information Protection scanner?

You can also use PowerShell to interactively classify and protect files from your desktop computer. For more information about this and other scenarios that use PowerShell, see Using PowerShell with the Azure Information Protection client.