Configuring and installing the Azure Information Protection unified labeling scanner

Applies to: Azure Information Protection, Windows Server 2019, Windows Server 2016, Windows Server 2012 R2

Note

If you're working with the AIP classic scanner, see Installing and configuring the Azure Information Protection classic scanner.

Before you start configuring and installing the Azure Information Protection scanner, verify that your system complies with the required prerequisites.

When you're ready, continue with the following steps:

  1. Configure the scanner in the Azure portal

  2. Install the scanner

  3. Get an Azure AD token for the scanner

  4. Configure the scanner to apply classification and protection

Perform the following additional configuration procedures as needed for your system:

Procedure | Description
Change which file types to protect | You may want to scan, classify, or protect different file types than the default. For more information, see AIP scanning process.
Upgrading your scanner | Upgrade your scanner to leverage the latest features and improvements.
Editing data repository settings in bulk | Use import and export options to make changes in bulk for multiple data repositories.
Use the scanner with alternative configurations | Use the scanner without configuring labels with any conditions.
Optimize performance | Guidance to optimize your scanner performance.

For more information, see also List of cmdlets for the scanner.

Configure the scanner in the Azure portal

Before you install the scanner, or upgrade it from an older general availability version, configure or verify your scanner settings in the Azure Information Protection area of the Azure portal.

To configure your scanner:

  1. Sign in to the Azure portal with one of the following roles:

    • Compliance administrator
    • Compliance data administrator
    • Security administrator
    • Global administrator

    Then, navigate to the Azure Information Protection pane.

    For example, in the search box for resources, services, and docs, start typing Information and select Azure Information Protection.

  2. Create a scanner cluster. This cluster defines your scanner and is used to identify the scanner instance, such as during installation, upgrades, and other processes.

  3. (Optional) Scan your network for risky repositories. Create a network scan job to scan a specified IP address or range, and provide a list of risky repositories that may contain sensitive content you'll want to secure.

    Run your network scan job and then analyze any risky repositories found.

  4. Create a content scan job to define the repositories you want to scan.

Create a scanner cluster

  1. From the Scanner menu on the left, select Clusters.

  2. On the Azure Information Protection - Clusters pane, select Add.

  3. On the Add a new cluster pane, enter a meaningful name for the scanner, and an optional description.

    The cluster name is used to identify the scanner's configurations and repositories. For example, you might enter Europe to identify the geographical location of the data repositories you want to scan.

    You'll use this name later on to identify where you want to install or upgrade your scanner.

  4. Select Save to save your changes.

Create a network scan job (public preview)

Starting in version 2.8.85, you can scan your network for risky repositories. Add one or more of the repositories found to a content scan job to scan them for sensitive content.

Note

The network discovery interface is currently in gradual deployment and will be available in all regions by September 15, 2020.

Network discovery prerequisites

Prerequisite | Description
Install the Network Discovery service | If you've recently upgraded your scanner, you may still need to install the Network Discovery service. Run the Install-MIPNetworkDiscovery cmdlet to enable network scan jobs.
Azure Information Protection analytics | Make sure that you have Azure Information Protection analytics enabled. In the Azure portal, go to Azure Information Protection > Manage > Configure analytics (Preview). For more information, see Central reporting for Azure Information Protection (public preview).

Creating a network scan job

  1. Sign in to the Azure portal, and go to Azure Information Protection. Under the Scanner menu on the left, select Network scan jobs (Preview).

  2. On the Azure Information Protection - Network scan jobs pane, select Add.

  3. On the Add a new network scan job page, define the following settings:

    Setting | Description
    Network scan job name | Enter a meaningful name for this job. This field is required.
    Description | Enter a meaningful description.
    Select the cluster | From the dropdown, select the cluster you want to use to scan the configured network locations.
        Tip: When selecting a cluster, make sure that the nodes in the cluster you assign can access the configured IP ranges via SMB.
    Configure IP ranges to discover | Click to define an IP address or range.
        In the Choose IP ranges pane, enter an optional name, and then a start IP address and end IP address for your range.
        Tip: To scan a specific IP address only, enter the identical IP address in both the Start IP and End IP fields.
    Set schedule | Define how often you want this network scan job to run.
        If you select Weekly, the Run network scan job on setting appears. Select the days of the week on which you want the network scan job to run.
    Set start time (UTC) | Define the date and time that you want this network scan job to start running. If you've selected to run the job daily, weekly, or monthly, the job will run at the defined time, at the recurrence you've selected.
        Note: Be careful when setting the date to any day at the end of the month. If you select 31, the network scan job will not run in any month that has 30 days or fewer.

  4. Select Save to save your changes.
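The two scheduling and range rules in the settings above can be checked ahead of time. The following Python sketch is illustrative only and is not part of the scanner or the Azure portal: the function names are hypothetical, and the rule that the End IP must not be lower than the Start IP is an assumption, not something the portal documentation states.

```python
import calendar
import ipaddress


def months_skipped(day_of_month: int, year: int) -> list[int]:
    """Return the month numbers in `year` that have fewer days than `day_of_month`.

    A monthly job pinned to that day would not run in these months.
    """
    skipped = []
    for month in range(1, 13):
        days_in_month = calendar.monthrange(year, month)[1]
        if days_in_month < day_of_month:
            skipped.append(month)
    return skipped


def addresses_in_range(start_ip: str, end_ip: str) -> int:
    """Count the addresses between Start IP and End IP, inclusive.

    Assumption: the end address must not be lower than the start address.
    """
    start = ipaddress.ip_address(start_ip)
    end = ipaddress.ip_address(end_ip)
    if int(end) < int(start):
        raise ValueError("End IP must not be lower than Start IP")
    return int(end) - int(start) + 1


if __name__ == "__main__":
    # Months in which a job scheduled for day 31 would be skipped.
    print(months_skipped(31, 2021))
    # Identical start and end addresses describe a single-address scan.
    print(addresses_in_range("10.0.0.5", "10.0.0.5"))
```

Choosing a start date on or before the 28th avoids skipped months entirely.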

Tip

If you want to run the same network scan using a different scanner, change the cluster defined in the network scan job.

Return to the Network scan jobs pane, and select Assign to cluster to select a different cluster now, or Unassign cluster to make additional changes later.

Analyze risky repositories found (public preview)

Repositories found, either by a network scan job, a content scan job, or by user access detected in log files, are aggregated and listed on the Scanner > Repositories pane.

If you've defined a network scan job and have set it to run at a specific date and time, wait until it's finished running to check for results. You can also return here after running a content scan job to view updated data.

  1. Under the Scanner menu on the left, select Repositories.

    The repositories found are shown as follows:

    • The Repositories by status graph shows how many repositories are already configured for a content scan job, and how many are not.
    • The Top 10 unmanaged repositories by access graph lists the top 10 repositories that are not currently assigned to a content scan job, as well as details about their access levels. Access levels can indicate how risky your repositories are.
    • The table below the graphs lists each repository found and its details.

  2. Do any of the following:

    Option | Description
    Columns icon | Select Columns to change the table columns displayed.
    Refresh icon | If your scanner has recently run a network scan, select Refresh to refresh the page.
    Add icon | Select one or more repositories listed in the table, and then select Assign Selected Items to assign them to a content scan job.
    Filter | The filter row shows any filtering criteria currently applied. Select any of the criteria shown to modify its settings, or select Add Filter to add new filtering criteria.
        Select Filter to apply your changes and refresh the table with the updated filter.
    Log Analytics icon | In the top-right corner of the unmanaged repositories graph, click the Log Analytics icon to jump to Log Analytics data for these repositories.

Repositories with public access

Repositories where Public access is found to have read or read/write capabilities may have sensitive content that must be secured. If Public access is false, the repository is not accessible by the public at all.

Public access to a repository is only reported if you've set a weak account in the StandardDomainsUserAccount parameter of the Install-MIPNetworkDiscovery or Set-MIPNetworkDiscovery cmdlets.

  • The accounts defined in these parameters are used to simulate the access of a weak user to the repository. If the weak user defined there can access the repository, this means that the repository can be accessed publicly.

  • To ensure that public access is reported correctly, make sure that the user specified in these parameters is a member of the Domain Users group only.

Create a content scan job

Deep dive into your content to scan specific repositories for sensitive content.

You may want to do this only after running a network scan job to analyze the repositories in your network, but you can also define your repositories yourself.

  1. Under the Scanner menu on the left, select Content scan jobs.

  2. On the Azure Information Protection - Content scan jobs pane, select Add.

  3. For this initial configuration, configure the following settings, and then select Save but do not close the pane.

    Setting | Description
    Content scan job settings | - Schedule: Keep the default of Manual
        - Info types to be discovered: Change to Policy only
        - Configure repositories: Do not configure at this time because the content scan job must first be saved.
    Policy enforcement | - Enforce: Select Off
        - Label files based on content: Keep the default of On
        - Default label: Keep the default of Policy default
        - Relabel files: Keep the default of Off
    Configure file settings | - Preserve "Date modified", "Last modified" and "Modified by": Keep the default of On
        - File types to scan: Keep the default file types for Exclude
        - Default owner: Keep the default of Scanner Account
  4. Now that the content scan job is created and saved, you're ready to return to the Configure repositories option to specify the data stores to be scanned.

    Specify UNC paths, and SharePoint Server URLs for SharePoint on-premises document libraries and folders.

    Note

    SharePoint Server 2019, SharePoint Server 2016, and SharePoint Server 2013 are supported for SharePoint. SharePoint Server 2010 is also supported when you have extended support for this version of SharePoint.

    To add your first data store, while on the Add a new content scan job pane, select Configure repositories to open the Repositories pane.

    1. On the Repositories pane, select Add.

    2. On the Repository pane, specify the path for the data repository, and then select Save.

      For example:

      • For a network share, use \\Server\Folder.
      • For a SharePoint library, use http://sharepoint.contoso.com/Shared%20Documents/Folder.

      Note

      Wildcards are not supported and WebDav locations are not supported.

      For the remaining settings on this pane, do not change them for this initial configuration, but keep them as Content scan job default. The default setting means that the data repository inherits the settings from the content scan job.

      Use the following syntax when adding SharePoint paths:

      Path | Syntax
      Root path | http://<SharePoint server name>
          Scans all sites, including any site collections allowed for the scanner user.
          Requires additional permissions to automatically discover root content.
      Specific SharePoint subsite or collection | One of the following:
          - http://<SharePoint server name>/<subsite name>
          - http://<SharePoint server name>/<site collection name>/<site name>
          Requires additional permissions to automatically discover site collection content.
      Specific SharePoint library | One of the following:
          - http://<SharePoint server name>/<library name>
          - http://<SharePoint server name>/.../<library name>
      Specific SharePoint folder | http://<SharePoint server name>/.../<folder name>

  5. Repeat the previous steps to add as many repositories as needed.

    When you're done, close both the Repositories and Content scan job panes.
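Because wildcards are not supported and repositories are given either as UNC paths or as SharePoint Server URLs, it can help to check a list of candidate paths before entering them. The following Python sketch is a hypothetical pre-check, not a scanner feature: the function name and the returned labels are made up for illustration, only the `*` wildcard is tested, and accepting `https` alongside the `http` form shown above is an assumption.

```python
from urllib.parse import urlsplit


def classify_repository_path(path: str) -> str:
    """Classify a repository path as a network share or a SharePoint URL.

    Raises ValueError for paths that contain a wildcard or match neither form.
    """
    # Wildcards are not supported for repository paths (only "*" is checked here).
    if "*" in path:
        raise ValueError(f"wildcards are not supported: {path}")

    # UNC paths start with two backslashes, for example \\Server\Folder.
    if path.startswith("\\\\"):
        return "network share"

    url = urlsplit(path)
    if url.scheme in ("http", "https") and url.netloc:
        # No path segments after the server name means the SharePoint root.
        segments = [segment for segment in url.path.split("/") if segment]
        if not segments:
            return "SharePoint root"
        return "SharePoint subsite, library, or folder"

    raise ValueError(f"not a UNC path or SharePoint URL: {path}")


if __name__ == "__main__":
    for candidate in (
        r"\\Server\Folder",
        "http://sharepoint.contoso.com/Shared%20Documents/Folder",
        "http://sharepoint.contoso.com",
    ):
        print(candidate, "->", classify_repository_path(candidate))
```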

Back on the Azure Information Protection - Content scan job pane, your content scan name is displayed, together with the SCHEDULE column showing Manual and a blank ENFORCE column.

You're now ready to install the scanner with the content scan job that you've created. Continue with Install the scanner.

Install the scanner

After you've configured the Azure Information Protection scanner in the Azure portal, perform the steps below to install the scanner:

  1. Sign in to the Windows Server computer that will run the scanner. Use an account that has local administrator rights and that has permissions to write to the SQL Server master database.

  2. Open a Windows PowerShell session with the Run as an administrator option.

  3. Run the Install-AIPScanner cmdlet, specifying the SQL Server instance on which to create a database for the Azure Information Protection scanner, and the scanner cluster name that you specified in the preceding section:

    Install-AIPScanner -SqlServerInstance <name> -Profile <cluster name>
    

    Examples, using the profile name of Europe:

    • For a default instance: Install-AIPScanner -SqlServerInstance SQLSERVER1 -Profile Europe

    • For a named instance: Install-AIPScanner -SqlServerInstance SQLSERVER1\AIPSCANNER -Profile Europe

    • For SQL Server Express: Install-AIPScanner -SqlServerInstance SQLSERVER1\SQLEXPRESS -Profile Europe

    When you are prompted, provide the credentials for the scanner service account (<domain\user name>) and password.

  4. Verify that the service is now installed by using Administrative Tools > Services.

    The installed service is named Azure Information Protection Scanner and is configured to run by using the scanner service account that you created.

Now that you have installed the scanner, you need to get an Azure AD token for the scanner service account to authenticate, so that the scanner can run unattended.

Get an Azure AD token for the scanner

An Azure AD token allows the scanner to authenticate to the Azure Information Protection service, enabling the scanner to run non-interactively.

For more information, see How to label files non-interactively for Azure Information Protection.

To get an Azure AD token:

  1. Return to the Azure portal to create an Azure AD application to specify an access token for authentication.

  2. From the Windows Server computer, if your scanner service account has been granted the Log on locally right for the installation, sign in with this account and start a PowerShell session.

    Run Set-AIPAuthentication, specifying the values that you copied from the previous step:

    Set-AIPAuthentication -AppId <ID of the registered app> -AppSecret <client secret string> -TenantId <your tenant ID> -DelegatedUser <Azure AD account>
    

    For example:

    $pscreds = Get-Credential CONTOSO\scanner
    Set-AIPAuthentication -AppId "77c3c1c3-abf9-404e-8b2b-4652836c8c66" -AppSecret "OAkk+rnuYc/u+]ah2kNxVbtrDGbS47L4" -DelegatedUser scanner@contoso.com -TenantId "9c11c87a-ac8b-46a3-8d5c-f4d0b72ee29a" -OnBehalfOf $pscreds
    Acquired application access token on behalf of CONTOSO\scanner.
    

Tip

If your scanner service account cannot be granted the Log on locally right for the installation, use the OnBehalfOf parameter with Set-AIPAuthentication, as described in How to label files non-interactively for Azure Information Protection.

The scanner now has a token to authenticate to Azure AD. This token is valid for one year, two years, or never expires, according to your configuration of the Web app/API client secret in Azure AD.

When the token expires, you must repeat this procedure.

You're now ready to run your first scan in discovery mode. For more information, see Run a discovery cycle and view reports for the scanner.

Once you've run your initial discovery scan, continue with Configure the scanner to apply classification and protection.

Configure the scanner to apply classification and protection

The default settings configure the scanner to run once, and in reporting-only mode.

To change these settings, edit the content scan job:

  1. In the Azure portal, on the Azure Information Protection - Content scan jobs pane, select the cluster and content scan job to edit it.

  2. On the Content scan job pane, change the following, and then select Save:

    • From the Content scan job section: Change the Schedule to Always
    • From the Policy enforcement section: Change Enforce to On

    Tip

    You may want to change other settings on this pane, such as whether file attributes are changed and whether the scanner can relabel files. Use the information popup help to learn more about each configuration setting.

  3. Make a note of the current time and start the scanner again from the Azure Information Protection - Content scan jobs pane.

    Alternatively, run the following command in your PowerShell session:

    Start-AIPScan
    

The scanner is now scheduled to run continuously. When the scanner works its way through all configured files, it automatically starts a new cycle so that any new and changed files are discovered.

Change which file types to protect

By default, the AIP scanner protects Office file types and PDF files only.

Use PowerShell commands to change this behavior as needed, such as to configure the scanner to protect all file types, just as the client does, or to protect additional, specific file types.

For a label policy that applies to the user account downloading labels for the scanner, specify a PowerShell advanced setting named PFileSupportedExtensions.

For a scanner that has access to the internet, this user account is the account that you specify for the DelegatedUser parameter with the Set-AIPAuthentication command.

Example 1: PowerShell command for the scanner to protect all file types, where your label policy is named "Scanner":

Set-LabelPolicy -Identity Scanner -AdvancedSettings @{PFileSupportedExtensions="*"}

Example 2: PowerShell command for the scanner to protect .xml files and .tiff files in addition to Office files and PDF files, where your label policy is named "Scanner":

Set-LabelPolicy -Identity Scanner -AdvancedSettings @{PFileSupportedExtensions=ConvertTo-Json(".xml", ".tiff")}

For more information, see Change which file types to protect.
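Example 2 passes the extension list as a JSON array. As a rough illustration of the shape of that value, the Python sketch below builds a comparable JSON array from a list of extensions. The helper name is hypothetical, the leading-dot normalization and duplicate removal are assumptions made for tidiness, and the whitespace produced by PowerShell's ConvertTo-Json will differ from Python's output.

```python
import json


def pfile_extensions_value(extensions: list[str]) -> str:
    """Build a JSON array string of file extensions, each with a leading dot.

    Assumption: extensions are written with a leading dot, as in the
    examples above; duplicates are dropped while preserving order.
    """
    normalized = []
    for extension in extensions:
        extension = extension.strip()
        if not extension.startswith("."):
            extension = "." + extension
        if extension not in normalized:
            normalized.append(extension)
    return json.dumps(normalized)


if __name__ == "__main__":
    print(pfile_extensions_value(["xml", ".tiff", ".xml"]))
```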

Upgrading your scanner

If you have previously installed the scanner and want to upgrade, use the instructions described in Upgrading the Azure Information Protection scanner.

Then, configure and use your scanner as usual, skipping the steps to install your scanner.

Editing data repository settings in bulk

Use the Export and Import buttons to make changes for your scanner across several repositories.

This way, you don't need to make the same changes several times, manually, in the Azure portal.

For example, if you have a new file type on several SharePoint data repositories, you may want to update the settings for those repositories in bulk.

To make changes in bulk across repositories:

  1. In the Azure portal on the Repositories pane, select the Export option.

  2. Manually edit the exported file to make your change.

  3. Use the Import option on the same page to import the updates back across your repositories.

Using the scanner with alternative configurations

The Azure Information Protection scanner usually looks for conditions specified for your labels in order to classify and protect your content as needed.

In the following scenarios, the Azure Information Protection scanner is also able to scan your content and manage labels, without any conditions configured:

Apply a default label to all files in a data repository

In this configuration, all unlabeled files in the repository are labeled with the default label specified for the repository or the content scan job. Files are labeled without inspection.

Configure the following settings:

Setting | Description
Label files based on content | Set to Off
Default label | Set to Custom, and then select the label to use
Enforce default label | Select to have the default label applied to all files, even if they are already labeled.

Remove existing labels from all files in a data repository

In this configuration, all existing labels are removed, including protection, if protection was applied with the label. Protection applied independently of a label is retained.

Configure the following settings:

Setting | Description
Label files based on content | Set to Off
Default label | Set to None
Relabel files | Set to On, with the Enforce default label checkbox selected

Identify all custom conditions and known sensitive information types

This configuration enables you to find sensitive information that you might not realize you had, at the expense of scanning rates for the scanner.

Set the Info types to be discovered to All.

To identify conditions and information types for labeling, the scanner uses any custom sensitive information types specified, and the list of built-in sensitive information types that are available to select, as defined in your labeling management center.

Optimizing scanner performance

Note

If you are looking to improve the responsiveness of the scanner computer rather than the scanner performance, use an advanced client setting to limit the number of threads used by the scanner.

Use the following options and guidance to help you optimize scanner performance:

Option | Description
Have a high speed and reliable network connection between the scanner computer and the scanned data store | For example, place the scanner computer in the same LAN, or preferably, in the same network segment as the scanned data store.
    The quality of the network connection affects the scanner performance because, to inspect the files, the scanner transfers the contents of the files to the computer running the scanner service.
    Reducing or eliminating the network hops required for the data to travel also reduces the load on your network.
Make sure the scanner computer has available processor resources | Inspecting the file contents and encrypting and decrypting files are processor-intensive actions.
    Monitor the typical scanning cycles for your specified data stores to identify whether a lack of processor resources is negatively affecting the scanner performance.
Install multiple instances of the scanner | The Azure Information Protection scanner supports multiple configuration databases on the same SQL Server instance when you specify a custom cluster (profile) name for the scanner.
    Multiple scanners can also share the same cluster (profile), resulting in quicker scanning times.
Check your alternative configuration usage | The scanner runs more quickly when you use the alternative configuration to apply a default label to all files because the scanner does not inspect the file contents.
    The scanner runs more slowly when you use the alternative configuration to identify all custom conditions and known sensitive information types.

Additional factors that affect performance

Additional factors that affect the scanner performance include:

Factor | Description
Load/response times | The current load and response times of the data stores that contain the files to scan will also affect scanner performance.
Scanner mode (Discovery / Enforce) | Discovery mode typically has a higher scanning rate than enforce mode.
    Discovery requires a single file read action, whereas enforce mode requires read and write actions.
Policy changes | Your scanner performance may be affected if you've made changes to the autolabeling in the label policy.
    Your first scan cycle, when the scanner must inspect every file, will take longer than subsequent scan cycles that, by default, inspect only new and changed files.
    If you change the conditions or autolabeling settings, all files are scanned again. For more information, see Rescanning files.
Regex constructions | Scanner performance is affected by how your regex expressions for custom conditions are constructed.
    To avoid heavy memory consumption and the risk of timeouts (15 minutes per file), review your regex expressions for efficient pattern matching.
    For example:
    - Avoid greedy quantifiers
    - Use non-capturing groups such as (?:expression) instead of (expression)
Log level | Log level options include Debug, Info, Error and Off for the scanner reports.
    - Off results in the best performance
    - Debug considerably slows down the scanner and should be used only for troubleshooting.
    For more information, see the ReportLevel parameter for the Set-AIPScannerConfiguration cmdlet.
Files being scanned | - With the exception of Excel files, Office files are more quickly scanned than PDF files.
    - Unprotected files are quicker to scan than protected files.
    - Large files take longer to scan than small files.
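The regex guidance in the table above can be illustrated with a small Python sketch. This is an illustration of the two pattern-writing habits only: the sample text and patterns are made up, and Python's `re` module is used as a stand-in because it shares the `(?:...)` and quantifier syntax, not because the scanner uses it.

```python
import re

SAMPLE = "card 4111-1111-1111-1111 and card 5500-0000-0000-0004"

# Capturing groups: the engine stores four submatches per match.
capturing = re.compile(r"(\d{4})-(\d{4})-(\d{4})-(\d{4})")

# Non-capturing group: the same text is matched, but nothing extra is stored.
non_capturing = re.compile(r"(?:\d{4}-){3}\d{4}")

capturing_matches = capturing.findall(SAMPLE)          # list of 4-tuples
non_capturing_matches = non_capturing.findall(SAMPLE)  # list of whole matches

# Greedy quantifier: ".*" runs to the last ">" and then backtracks.
greedy = re.search(r"<.*>", "<a><b>").group()

# Bounded alternative: "[^>]*" stops at the first ">" without backtracking.
bounded = re.search(r"<[^>]*>", "<a><b>").group()

if __name__ == "__main__":
    print(capturing.groups, non_capturing.groups)
    print(non_capturing_matches)
    print(greedy, bounded)
```

Replacing a greedy `.*` with a negated character class, as in the last pattern, is one common way to keep matching time predictable on large files.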

List of cmdlets for the scanner

This section lists PowerShell cmdlets supported for the Azure Information Protection scanner.

Note

The Azure Information Protection scanner is configured from the Azure portal. Therefore, cmdlets used in previous versions to configure data repositories and the scanned file types list are now deprecated.

Supported cmdlets for the scanner include:

Next steps

Once you've installed and configured your scanner, start scanning your files.

See also: Deploying the Azure Information Protection scanner to automatically classify and protect files.

More information: