Modify Exact Data Match schema to use configurable match

Exact Data Match (EDM) based classification enables you to create custom sensitive information types that refer to exact values in a database of sensitive information. When you need to allow for variants of an exact string, you can use configurable match to tell Microsoft 365 to ignore case and some delimiters.

Important

Use this procedure to modify an existing EDM schema and data file.

  1. Uninstall EdmUploadAgent.exe from the computer that you use to connect to Microsoft 365 for EDM schema and data file uploads.

  2. Download the appropriate EdmUploadAgent.exe file for your subscription using the links below:

    • Commercial + GCC - most commercial customers should use this
    • GCC-High - this is specifically for high security government cloud subscribers
    • DoD - this is specifically for United States Department of Defense cloud customers
  3. Authorize the EDM Upload Agent: open a Command Prompt window (as an administrator) and run the following command:

    EdmUploadAgent.exe /Authorize

  4. If you don't have a current copy of the existing schema, download one by running this command:

    EdmUploadAgent.exe /SaveSchema /DataStoreName <dataStoreName> [/OutputDir [Output dir location]]

  5. Customize the schema so that each column uses "caseInsensitive" and/or "ignoredDelimiters" as needed. The default value for "caseInsensitive" is "false", and the default for "ignoredDelimiters" is an empty string.

Note

The underlying custom or built-in sensitive information type used to detect the general regex pattern must support detection of the input variations listed in ignoredDelimiters. For example, the built-in U.S. social security number (SSN) sensitive information type can detect variations in the data that include dashes, spaces, or no spaces between the grouped numbers that make up the SSN. As a result, the only delimiters that are relevant to include in EDM's ignoredDelimiters for SSN data are dash and space.
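As a rough mental model of what the two settings do, the sketch below compares two values after removing ignored delimiters and, optionally, folding case. This is only a conceptual illustration, not how Microsoft 365 implements matching. The function names `normalize` and `values_match` are hypothetical, and the sketch assumes the same normalization is applied to both the stored value and the scanned content.

```python
def normalize(value: str, case_insensitive: bool = False,
              ignored_delimiters: str = "") -> str:
    """Drop every character listed in ignored_delimiters, then optionally fold case."""
    stripped = "".join(ch for ch in value if ch not in ignored_delimiters)
    return stripped.casefold() if case_insensitive else stripped


def values_match(candidate: str, stored: str, case_insensitive: bool = False,
                 ignored_delimiters: str = "") -> bool:
    """Compare two values after applying the same normalization to both."""
    return (normalize(candidate, case_insensitive, ignored_delimiters)
            == normalize(stored, case_insensitive, ignored_delimiters))


# SSN-style data: dash and space are the relevant delimiters.
ssn_delimiters = "- "
print(values_match("123-45-6789", "123 45 6789",
                   ignored_delimiters=ssn_delimiters))

# Policy-number-style data: ignore case and dashes together.
print(values_match("abc-123", "ABC123",
                   case_insensitive=True, ignored_delimiters="-"))

# With the defaults (case-sensitive, no ignored delimiters) the same pair differs.
print(values_match("abc-123", "ABC123"))
```

With the defaults from step 5 ("false" and an empty string), the comparison stays exact, which is why existing schemas behave the same until the attributes are added.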

Here is a sample schema that simulates case-insensitive match by creating the extra columns needed to recognize case variations in the sensitive data.

<EdmSchema xmlns="http://schemas.microsoft.com/office/2018/edm">
  <DataStore name="PatientRecords" description="Schema for patient records policy" version="1">
    <Field name="PolicyNumber" searchable="true" />
    <Field name="PolicyNumberLowerCase" searchable="true" />
    <Field name="PolicyNumberUpperCase" searchable="true" />
    <Field name="PolicyNumberCapitalLetters" searchable="true" />
  </DataStore>
</EdmSchema>

In the above example, the variations of the original PolicyNumber column are no longer needed once both caseInsensitive and ignoredDelimiters are added.

To update this schema so that EDM uses configurable match, use the caseInsensitive and ignoredDelimiters flags. Here's how that looks:

<EdmSchema xmlns="http://schemas.microsoft.com/office/2018/edm">
  <DataStore name="PatientRecords" description="Schema for patient records policy" version="1">
    <Field name="PolicyNumber" searchable="true" caseInsensitive="true" ignoredDelimiters="-,/,*,#,^" />
  </DataStore>
</EdmSchema>
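If you prefer to script the change rather than edit the XML by hand, the transformation above could be sketched as follows. This is a minimal illustration using Python's standard `xml.etree.ElementTree`. The helper name `enable_configurable_match` is hypothetical, and the comma-joined attribute value simply mirrors the format shown in the example schema.

```python
import xml.etree.ElementTree as ET

EDM_NS = "http://schemas.microsoft.com/office/2018/edm"

SAMPLE_SCHEMA = """<EdmSchema xmlns="http://schemas.microsoft.com/office/2018/edm">
  <DataStore name="PatientRecords" description="Schema for patient records policy" version="1">
    <Field name="PolicyNumber" searchable="true" />
    <Field name="PolicyNumberLowerCase" searchable="true" />
    <Field name="PolicyNumberUpperCase" searchable="true" />
    <Field name="PolicyNumberCapitalLetters" searchable="true" />
  </DataStore>
</EdmSchema>"""


def enable_configurable_match(schema_xml, field_name, delimiters, drop_fields=()):
    """Set both attributes on one field and remove the now-redundant variant fields."""
    ET.register_namespace("", EDM_NS)  # keep the default xmlns on output
    root = ET.fromstring(schema_xml)
    for store in root.findall(f"{{{EDM_NS}}}DataStore"):
        # Copy the list so that removing elements does not disturb iteration.
        for field in list(store.findall(f"{{{EDM_NS}}}Field")):
            name = field.get("name")
            if name in drop_fields:
                store.remove(field)
            elif name == field_name:
                field.set("caseInsensitive", "true")
                field.set("ignoredDelimiters", ",".join(delimiters))
    return ET.tostring(root, encoding="unicode")


updated = enable_configurable_match(
    SAMPLE_SCHEMA,
    field_name="PolicyNumber",
    delimiters=["-", "/", "*", "#", "^"],
    drop_fields=("PolicyNumberLowerCase", "PolicyNumberUpperCase",
                 "PolicyNumberCapitalLetters"),
)
print(updated)
```

Review the output before uploading it; the script only rewrites attributes and removes the fields you name, so any other schema content passes through unchanged.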

The ignoredDelimiters flag supports any non-alphanumeric character. Here are some examples:

  • .
  • -
  • /
  • _
  • *
  • ^
  • #
  • !
  • ?
  • [
  • ]
  • {
  • }
  • \
  • ~
  • ;

The ignoredDelimiters flag doesn't support:

  • characters 0-9
  • A-Z
  • a-z
  • "
  • ,
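The two lists above can be turned into a small pre-flight check for a planned ignoredDelimiters value. The sketch below is an illustration only; the helper names are hypothetical, and it treats any character for which Python's `str.isalnum()` is true as alphanumeric, which is a slightly broader reading than the 0-9, A-Z, and a-z ranges listed.

```python
# Characters the source lists as unsupported besides letters and digits.
UNSUPPORTED_SYMBOLS = {'"', ","}


def is_supported_delimiter(ch: str) -> bool:
    """Return True for a single non-alphanumeric character other than '"' or ','."""
    if len(ch) != 1:
        return False
    if ch.isalnum():  # assumption: covers 0-9, A-Z, a-z and other letters/digits
        return False
    return ch not in UNSUPPORTED_SYMBOLS


def invalid_delimiters(delimiters):
    """Return the entries that should not appear in ignoredDelimiters, in order."""
    return [d for d in delimiters if not is_supported_delimiter(d)]


print(invalid_delimiters(["-", "/", "*", "#", "^"]))
print(invalid_delimiters(["-", "a", ",", '"', "7"]))
```

The first call uses the delimiters from the example schema; the second mixes in entries from the unsupported list.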
  6. Connect to the Security & Compliance center using the procedures in Connect to Security & Compliance Center PowerShell.

  7. Update your schema by running these cmdlets one at a time:

$edmSchemaXml=Get-Content .\edm.xml -Encoding Byte -ReadCount 0

Set-DlpEdmSchema -FileData $edmSchemaXml -Confirm:$true

  8. If necessary, update the data file to match the new schema version.

Tip

Optionally, you can validate your CSV file before uploading by running:

EdmUploadAgent.exe /ValidateData /DataFile [data file] /Schema [schema file]

For more information on all the supported EdmUploadAgent.exe parameters, run:

EdmUploadAgent.exe /?
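Before running the agent's own validation, a quick local comparison of the CSV header against the schema can reveal columns left over from the old layout, such as the case-variant columns removed earlier. The sketch below is only a rough pre-check, not a replacement for /ValidateData. It assumes the data file has a header row whose column names correspond to the schema's Field names, and the helper names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

EDM_NS = "http://schemas.microsoft.com/office/2018/edm"

NEW_SCHEMA = """<EdmSchema xmlns="http://schemas.microsoft.com/office/2018/edm">
  <DataStore name="PatientRecords" description="Schema for patient records policy" version="1">
    <Field name="PolicyNumber" searchable="true" caseInsensitive="true" ignoredDelimiters="-,/,*,#,^" />
  </DataStore>
</EdmSchema>"""


def schema_field_names(schema_xml):
    """List the name attribute of every Field element in the schema."""
    root = ET.fromstring(schema_xml)
    return [field.get("name") for field in root.iter(f"{{{EDM_NS}}}Field")]


def header_mismatch(schema_xml, csv_text):
    """Return (columns not in the schema, schema fields missing from the CSV)."""
    fields = set(schema_field_names(schema_xml))
    header = next(csv.reader(io.StringIO(csv_text)), [])
    columns = set(header)
    return sorted(columns - fields), sorted(fields - columns)


# Made-up sample row; a real data file would be read from disk instead.
old_style_csv = "PolicyNumber,PolicyNumberLowerCase\nAB-123,ab-123\n"
extra, missing = header_mismatch(NEW_SCHEMA, old_style_csv)
print("Columns not in schema:", extra)
print("Schema fields missing from CSV:", missing)
```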

  9. Open a Command Prompt window (as an administrator) and run the following command to hash and upload your sensitive data:

    EdmUploadAgent.exe /UploadData /DataStoreName [DS Name] /DataFile [data file] /HashLocation [hash file location] /Salt [custom salt] /Schema [Schema file]
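If the upload is run from a script, assembling the arguments as a list avoids quoting problems with paths that contain spaces. The sketch below only builds the argument list using the switches shown in the command above. The function name and all example values are hypothetical placeholders, and actually launching the agent (for example with `subprocess.run`) is left to the caller.

```python
def build_upload_command(data_store, data_file, hash_location, salt, schema_file,
                         agent="EdmUploadAgent.exe"):
    """Return the /UploadData invocation as an argument list, one item per token."""
    return [
        agent, "/UploadData",
        "/DataStoreName", data_store,
        "/DataFile", data_file,
        "/HashLocation", hash_location,
        "/Salt", salt,
        "/Schema", schema_file,
    ]


# Placeholder values for illustration only.
command = build_upload_command(
    data_store="PatientRecords",
    data_file=r"C:\edm\data\patients.csv",
    hash_location=r"C:\edm\hash",
    salt="example-salt-value",
    schema_file=r"C:\edm\data\edm.xml",
)
print(command)
```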