Create custom sensitive information types with Exact Data Match based classification

Custom sensitive information types are used to help identify sensitive items so that you can prevent them from being inadvertently or inappropriately shared. You define a custom sensitive information type based on:

  • patterns
  • keyword evidence such as employee, badge, or ID
  • character proximity to evidence in a particular pattern
  • confidence levels

Such custom sensitive information types meet business needs for many organizations.

But what if you wanted a custom sensitive information type that uses exact data values, instead of one that finds matches based on generic patterns? With Exact Data Match (EDM)-based classification, you can create a custom sensitive information type that is designed to:

  • be dynamic and refreshable
  • be more scalable
  • result in fewer false positives
  • work with structured sensitive data
  • handle sensitive information more securely
  • be used with several Microsoft cloud services

EDM-based classification

EDM-based classification enables you to create custom sensitive information types that refer to exact values in a database of sensitive information. The database can be refreshed daily and can contain up to 100 million rows of data. So as employees, patients, or clients come and go, and records change, your custom sensitive information types remain current and applicable. You can also use EDM-based classification with policies, such as data loss prevention (DLP) policies or Microsoft Cloud App Security file policies.

Note

Microsoft 365 Information Protection now supports, in preview, double-byte character set languages for:

  • Chinese (simplified)
  • Chinese (traditional)
  • Korean
  • Japanese

This support is available for sensitive information types. For more information, see Information protection support for double byte character sets release notes (preview).

Required licenses and permissions

You must be a global admin, compliance administrator, or Exchange Online administrator to perform the tasks described in this article. To learn more about DLP permissions, see Permissions.

EDM-based classification is included in these subscriptions:

  • Office 365 E5
  • Microsoft 365 E5
  • Microsoft 365 E5 Compliance
  • Microsoft E5/A5 Information Protection and Governance

Portal | World Wide/GCC | GCC-High | DOD
Office SCC | protection.office.com | scc.office365.us | scc.protection.apps.mil
Microsoft 365 Security center | security.microsoft.com | security.microsoft.us | security.apps.mil
Microsoft 365 Compliance center | compliance.microsoft.com | compliance.microsoft.us | compliance.apps.mil

The workflow at a glance

Phase | What's needed
Part 1: Set up EDM-based classification (as needed: edit the database schema, remove the schema) | Read access to the sensitive data; database schema in XML format (example provided); rule package in XML format (example provided); admin permissions to the Security & Compliance Center (using PowerShell)
Part 2: Hash and upload the sensitive data (as needed: refresh the data) | Custom security group and user account; local admin access to the machine with the EDM Upload Agent; read access to the sensitive data; process and schedule for refreshing the data
Part 3: Use EDM-based classification with your Microsoft cloud services | Microsoft 365 subscription with DLP; EDM-based classification feature enabled

Part 1: Set up EDM-based classification

Setting up and configuring EDM-based classification involves:

  1. Saving sensitive data in .csv format
  2. Defining your sensitive information database schema
  3. Creating a rule package

Save sensitive data in .csv format

  1. Identify the sensitive information you want to use. Export the data to an app, such as Microsoft Excel, and save the file in .csv format. The data file can include:

    • Up to 100 million rows of sensitive data
    • Up to 32 columns (fields) per data source
    • Up to 5 columns (fields) marked as searchable
  2. Structure the sensitive data in the .csv file such that the first row includes the names of the fields used for EDM-based classification. In your .csv file, you might have field names such as "ssn", "birthdate", "firstname", and "lastname". The column header names can't include spaces or underscores. For example, the sample .csv file that we use in this article is named PatientRecords.csv, and its columns include PatientID, MRN, LastName, FirstName, SSN, and more.
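
The header rules above can be checked before you go any further. The following is a minimal Python sketch, not part of any Microsoft tooling: the function name check_edm_header is hypothetical, and it only tests the column limit and the "no spaces or underscores" rule stated in this section.

```python
import csv
import io

MAX_COLUMNS = 32  # column (field) limit per data source, as listed above


def check_edm_header(csv_text):
    """Return a list of problems found in the header (first) row of csv_text."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    problems = []
    if not header:
        problems.append("file has no header row")
    if len(header) > MAX_COLUMNS:
        problems.append(f"{len(header)} columns exceeds the limit of {MAX_COLUMNS}")
    for name in header:
        # Column header names can't include spaces or underscores.
        if " " in name or "_" in name:
            problems.append(f"column name {name!r} contains a space or underscore")
    return problems


if __name__ == "__main__":
    sample = "PatientID,MRN,LastName,FirstName,SSN\n1,A-100,Doe,Jane,123-45-6789\n"
    print(check_edm_header(sample))
```

An empty list means the header passed these checks. To check a real file, read its text first, for example with open("PatientRecords.csv", newline="").read().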

Define the schema for your database of sensitive information

  1. Define the schema for the database of sensitive information in XML format (similar to our example below). Name this schema file edm.xml, and configure it such that for each column in the database, there is a line that uses the syntax:

    <Field name="" searchable=""/>

    • Use column names for Field name values.
    • Use searchable="true" for the fields that you want to be searchable, up to a maximum of 5 fields. At least one field must be searchable.

    As an example, the following XML file defines the schema for a patient records database, with five fields specified as searchable: PatientID, MRN, SSN, Phone, and DOB.

    (You can copy, modify, and use our example.)

    <EdmSchema xmlns="http://schemas.microsoft.com/office/2018/edm">
          <DataStore name="PatientRecords" description="Schema for patient records" version="1">
                <Field name="PatientID" searchable="true" />
                <Field name="MRN" searchable="true" />
                <Field name="FirstName" />
                <Field name="LastName" />
                <Field name="SSN" searchable="true" />
                <Field name="Phone" searchable="true" />
                <Field name="DOB" searchable="true" />
                <Field name="Gender" />
                <Field name="Address" />
          </DataStore>
    </EdmSchema>
    
  2. Connect to the Security & Compliance Center using the procedures in Connect to Security & Compliance Center PowerShell.

  3. To upload the database schema, run the following cmdlets, one at a time:

    $edmSchemaXml=Get-Content .\edm.xml -Encoding Byte -ReadCount 0
    New-DlpEdmSchema -FileData $edmSchemaXml -Confirm:$true
    

    You will be prompted to confirm, as follows:

    Confirm

    Are you sure you want to perform this action?

    New EDM Schema for the data store 'patientrecords' will be imported.

    [Y] Yes [A] Yes to All [N] No [L] No to All [?] Help (default is "Y"):

Tip

If you want your changes to occur without confirmation, in step 3, use this cmdlet instead: New-DlpEdmSchema -FileData $edmSchemaXml

Note

It can take 10 to 60 minutes to update the EDMSchema with additions. The update must complete before you execute steps that use the additions.
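
If your .csv file has many columns, you may prefer to generate edm.xml rather than type it by hand. The following is a minimal Python sketch under stated assumptions: the function name build_edm_schema is hypothetical, the element and attribute names are copied from the sample schema in this section, and the only rules enforced are the ones stated above (at least one and at most five searchable fields).

```python
import xml.etree.ElementTree as ET

EDM_NS = "http://schemas.microsoft.com/office/2018/edm"  # namespace from the sample schema
MAX_SEARCHABLE = 5


def build_edm_schema(store_name, description, columns, searchable):
    """Return schema XML text with one Field element per column."""
    searchable = set(searchable)
    unknown = searchable - set(columns)
    if unknown:
        raise ValueError(f"searchable fields not in columns: {sorted(unknown)}")
    if not 1 <= len(searchable) <= MAX_SEARCHABLE:
        raise ValueError(f"need between 1 and {MAX_SEARCHABLE} searchable fields")

    root = ET.Element("EdmSchema", {"xmlns": EDM_NS})
    store = ET.SubElement(
        root,
        "DataStore",
        {"name": store_name, "description": description, "version": "1"},
    )
    for column in columns:
        attrs = {"name": column}
        if column in searchable:
            attrs["searchable"] = "true"
        ET.SubElement(store, "Field", attrs)
    if hasattr(ET, "indent"):  # pretty-printing is available in Python 3.9 and later
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_edm_schema(
        "PatientRecords",
        "Schema for patient records",
        ["PatientID", "MRN", "FirstName", "LastName", "SSN", "Phone", "DOB", "Gender", "Address"],
        ["PatientID", "MRN", "SSN", "Phone", "DOB"],
    ))
```

Write the returned text to edm.xml and review it before uploading it with the cmdlets shown in step 3.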

Set up a rule package

  1. Create a rule package in XML format (with Unicode encoding), similar to the following example. (You can copy, modify, and use our example.)

    When you set up your rule package, make sure to correctly reference your .csv file and edm.xml file. In this sample XML, the following fields need to be customized to create your EDM sensitive type:

    • RulePack id & ExactMatch id: Use New-GUID to generate a GUID.

    • Datastore: This field specifies the EDM lookup data store to be used. You provide the data source name of a configured EDM schema.

    • idMatch: This field points to the primary element for EDM.

      • Matches: Specifies the field to be used in exact lookup. You provide a searchable field name from the EDM schema for the DataStore.
      • Classification: This field specifies the sensitive type match that triggers the EDM lookup. You can provide the name or GUID of an existing built-in or custom classification.
    • Match: This field points to additional evidence found in proximity of idMatch.

      • Matches: You provide any field name from the EDM schema for the DataStore.
    • Resource: This section specifies the name and description for the sensitive type in multiple locales.

      • idRef: You provide the GUID of the ExactMatch id.
      • Name & descriptions: Customize as required.

    <RulePackage xmlns="http://schemas.microsoft.com/office/2018/edm">
      <RulePack id="fd098e03-1796-41a5-8ab6-198c93c62b11">
        <Version build="0" major="2" minor="0" revision="0" />
        <Publisher id="eb553734-8306-44b4-9ad5-c388ad970528" />
        <Details defaultLangCode="en-us">
          <LocalizedDetails langcode="en-us">
            <PublisherName>IP DLP</PublisherName>
            <Name>Health Care EDM Rulepack</Name>
            <Description>This rule package contains the EDM sensitive type for health care sensitive types.</Description>
          </LocalizedDetails>
        </Details>
      </RulePack>
      <Rules>
        <ExactMatch id = "E1CC861E-3FE9-4A58-82DF-4BD259EAB371" patternsProximity = "300" dataStore ="PatientRecords" recommendedConfidence = "65" >
          <Pattern confidenceLevel="65">
            <idMatch matches = "SSN" classification = "U.S. Social Security Number (SSN)" />
          </Pattern>
          <Pattern confidenceLevel="75">
            <idMatch matches = "SSN" classification = "U.S. Social Security Number (SSN)" />
            <Any minMatches ="3" maxMatches ="6">
              <match matches="PatientID" />
              <match matches="MRN"/>
              <match matches="FirstName"/>
              <match matches="LastName"/>
              <match matches="Phone"/>
              <match matches="DOB"/>
            </Any>
          </Pattern>
        </ExactMatch>
        <LocalizedStrings>
          <Resource idRef="E1CC861E-3FE9-4A58-82DF-4BD259EAB371">
            <Name default="true" langcode="en-us">Patient SSN Exact Match.</Name>
            <Description default="true" langcode="en-us">EDM Sensitive type for detecting Patient SSN.</Description>
          </Resource>
        </LocalizedStrings>
      </Rules>
    </RulePackage>
    
  2. Upload the rule package by running the following PowerShell cmdlets, one at a time:

    $rulepack=Get-Content .\rulepack.xml -Encoding Byte -ReadCount 0
    New-DlpSensitiveInformationTypeRulePackage -FileData $rulepack
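
Because the rule package must reference the schema correctly, it can help to cross-check the two files before uploading. The following is a minimal Python sketch, not an official validator: the function name check_rule_package is hypothetical, and it only checks the relationships described above (dataStore names a data store in the schema, idMatch uses a searchable field, match uses any schema field, and Resource idRef points at an ExactMatch id).

```python
import xml.etree.ElementTree as ET

EDM_NS = "http://schemas.microsoft.com/office/2018/edm"  # namespace used by both sample files


def _tag(name):
    """Qualify an element name with the EDM namespace."""
    return f"{{{EDM_NS}}}{name}"


def check_rule_package(schema_xml, rulepack_xml):
    """Return a list of inconsistencies between the rule package and the schema."""
    stores = {}
    for store in ET.fromstring(schema_xml).iter(_tag("DataStore")):
        fields = list(store.iter(_tag("Field")))
        stores[store.get("name")] = {
            "all": {f.get("name") for f in fields},
            "searchable": {f.get("name") for f in fields if f.get("searchable") == "true"},
        }

    rules = ET.fromstring(rulepack_xml)
    problems = []
    exact_ids = set()
    for exact in rules.iter(_tag("ExactMatch")):
        exact_ids.add(exact.get("id", "").lower())
        store = stores.get(exact.get("dataStore"))
        if store is None:
            problems.append(f"dataStore {exact.get('dataStore')!r} is not in the schema")
            continue
        for id_match in exact.iter(_tag("idMatch")):
            if id_match.get("matches") not in store["searchable"]:
                problems.append(f"idMatch field {id_match.get('matches')!r} is not searchable")
        for match in exact.iter(_tag("match")):
            if match.get("matches") not in store["all"]:
                problems.append(f"match field {match.get('matches')!r} is not in the schema")
    for resource in rules.iter(_tag("Resource")):
        if resource.get("idRef", "").lower() not in exact_ids:
            problems.append(f"Resource idRef {resource.get('idRef')!r} has no ExactMatch")
    return problems
```

Pass the text of edm.xml and rulepack.xml to the function; an empty list means none of these inconsistencies were found. GUIDs are compared case-insensitively, which is a choice made for this sketch.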
    

At this point, you have set up EDM-based classification. The next step is to hash the sensitive data, and then upload the hashes for indexing.

Recall from the previous procedure that our PatientRecords schema defines five fields as searchable: PatientID, MRN, SSN, Phone, and DOB. Our example rule package includes those fields and references the database schema file (edm.xml), with one ExactMatch item per searchable field. Consider the following ExactMatch item:

<ExactMatch id = "E1CC861E-3FE9-4A58-82DF-4BD259EAB371" patternsProximity = "300" dataStore ="PatientRecords" recommendedConfidence = "65" >
      <Pattern confidenceLevel="65">
        <idMatch matches = "SSN" classification = "U.S. Social Security Number (SSN)" />
      </Pattern>
      <Pattern confidenceLevel="75">
        <idMatch matches = "SSN" classification = "U.S. Social Security Number (SSN)" />
        <Any minMatches ="3" maxMatches ="100">
          <match matches="PatientID" />
          <match matches="MRN"/>
          <match matches="FirstName"/>
          <match matches="LastName"/>
          <match matches="Phone"/>
          <match matches="DOB"/>
        </Any>
      </Pattern>
    </ExactMatch>

In this example, note that:

  • The dataStore name references the .csv file we created earlier: dataStore = "PatientRecords".

  • The idMatch value references a searchable field that is listed in the database schema file: idMatch matches = "SSN".

  • The classification value references an existing or custom sensitive information type: classification = "U.S. Social Security Number (SSN)". (In this case, we use the existing sensitive information type of U.S. Social Security Number.)
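
To see how the two Pattern elements in the sample relate, here is a small Python sketch. It is an illustration only and rests on assumptions the article does not state: that a pattern is satisfied when the idMatch field is found together with at least minMatches of the supporting fields, and that the highest confidence level among satisfied patterns is the one reported. The maxMatches attribute is ignored, and the names SAMPLE_PATTERNS and best_confidence are hypothetical.

```python
# (confidenceLevel, minMatches of supporting fields) for the two sample patterns.
# The first pattern has no <Any> element, so it needs no supporting fields.
SAMPLE_PATTERNS = [(65, 0), (75, 3)]


def best_confidence(id_match_found, supporting_found, patterns=SAMPLE_PATTERNS):
    """Return the highest confidence level whose pattern is satisfied, or None."""
    if not id_match_found:
        return None  # every pattern in the sample requires the idMatch (SSN) field
    satisfied = [level for level, min_matches in patterns if supporting_found >= min_matches]
    return max(satisfied, default=None)


if __name__ == "__main__":
    print(best_confidence(True, 0))   # SSN alone
    print(best_confidence(True, 4))   # SSN plus four supporting fields
    print(best_confidence(False, 6))  # supporting fields without the SSN
```

Under these assumptions, an SSN on its own satisfies only the 65 pattern, while an SSN near three or more of PatientID, MRN, FirstName, LastName, Phone, and DOB also satisfies the 75 pattern.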

Note

It can take 10 to 60 minutes to update the EDMSchema with additions. The update must complete before you execute steps that use the additions.

Editing the schema for EDM-based classification

If you want to make changes to your edm.xml file, such as changing which fields are used for EDM-based classification, follow these steps:

  1. Edit your edm.xml file (this is the file discussed in the Define the schema section of this article).

  2. Connect to the Security & Compliance Center using the procedures in Connect to Security & Compliance Center PowerShell.

  3. To update your database schema, run the following cmdlets, one at a time:

    $edmSchemaXml=Get-Content .\edm.xml -Encoding Byte -ReadCount 0
    Set-DlpEdmSchema -FileData $edmSchemaXml -Confirm:$true
    

    You will be prompted to confirm, as follows:

    Confirm

    Are you sure you want to perform this action?

    EDM Schema for the data store 'patientrecords' will be updated.

    [Y] Yes [A] Yes to All [N] No [L] No to All [?] Help (default is "Y"):

    Tip

    If you want your changes to occur without confirmation, in step 3, use this cmdlet instead: Set-DlpEdmSchema -FileData $edmSchemaXml

    Note

    It can take 10 to 60 minutes to update the EDMSchema with additions. The update must complete before you execute steps that use the additions.

Removing the schema for EDM-based classification

(As needed) If you want to remove the schema you're using for EDM-based classification, follow these steps:

  1. Connect to the Security & Compliance Center using the procedures in Connect to Security & Compliance Center PowerShell.

  2. Run the following PowerShell cmdlet, substituting the data store name "patientrecords" with the one you want to remove:

    Remove-DlpEdmSchema -Identity patientrecords
    

    You will be prompted to confirm:

    Confirm

    Are you sure you want to perform this action?

    EDM Schema for the data store 'patientrecords' will be removed.

    [Y] Yes [A] Yes to All [N] No [L] No to All [?] Help (default is "Y"):

    Tip

    If you want your changes to occur without confirmation, in step 2, use this cmdlet instead: Remove-DlpEdmSchema -Identity patientrecords -Confirm:$false

Part 2: Hash and upload the sensitive data

In this phase, you set up a custom security group and user account, and set up the EDM Upload Agent tool. Then, you use the tool to hash the sensitive data with a salt value and upload it.

The hashing and uploading can be done using one computer, or you can separate the hashing step from the upload step for greater security.

If you want to hash and upload from one computer, you need to do it from a computer that can directly connect to your Microsoft 365 tenant. This requires that your clear text sensitive data files are on that computer for hashing.

If you do not want to expose your clear text sensitive data file, you can hash it on a computer in a secure location and then copy the hash file and the salt file to a computer that can directly connect to your Microsoft 365 tenant for upload. In this scenario, you will need the EDMUploadAgent on both computers.
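
The idea of hashing with a salt value can be illustrated with a few lines of Python. This is a conceptual sketch only: the article does not say which hash algorithm the EDM Upload Agent uses or how it combines the salt with the data, so SHA-256 and simple concatenation are assumptions made purely for illustration, and salted_hash is a hypothetical name.

```python
import hashlib


def salted_hash(value, salt):
    """Hash one data value together with a salt (illustrative scheme, see assumptions above)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    salt_a = "a" * 64
    salt_b = "b" * 64
    ssn = "123-45-6789"
    # The same value and salt always give the same hash, so hashed values can be compared.
    print(salted_hash(ssn, salt_a) == salted_hash(ssn, salt_a))
    # A different salt gives a different hash for the same value.
    print(salted_hash(ssn, salt_a) == salted_hash(ssn, salt_b))
```

In a scheme like this, a hash is only comparable with other hashes produced with the same salt, and the clear text value itself never needs to leave the secure computer.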

Prerequisites

  • a work or school account for Microsoft 365 that will be added to the EDM_DataUploaders security group
  • a Windows 10 or Windows Server 2016 machine with .NET version 4.6.2 for running the EDMUploadAgent
  • a directory on your upload machine for the:
    • EDMUploadAgent
    • your sensitive item file in .csv format (PatientRecords.csv in our examples)
    • the output hash and salt files
    • the datastore name from the edm.xml file (for this example, it's PatientRecords)

Set up the security group and user account

  1. As a global administrator, go to the admin center using the appropriate link for your subscription and create a security group called EDM_DataUploaders.

  2. Add one or more users to the EDM_DataUploaders security group. (These users will manage the database of sensitive information.)

Hash and upload from one computer

This computer must have direct access to your Microsoft 365 tenant.

Note

Before you begin this procedure, make sure that you are a member of the EDM_DataUploaders security group.

  1. Create a working directory for the EDMUploadAgent. For example, C:\EDM\Data. Place the PatientRecords.csv file there.

  2. Download and install the appropriate EDM Upload Agent for your subscription into the directory you created in step 1.

Note

The EDMUploadAgent at the above links has been updated to automatically add a salt value to the hashed data. Alternatively, you can provide your own salt value. Once you have used this version, you will not be able to use the previous version of the EDMUploadAgent.

You can upload data with the EDMUploadAgent to any given data store only twice per day.

Tip

To get a list of the supported command parameters, run the agent with no arguments. For example, 'EdmUploadAgent.exe'.

  3. To authorize the EDM Upload Agent, open a Command Prompt window (as an administrator), switch to the C:\EDM\Data directory, and then run the following command:

    EdmUploadAgent.exe /Authorize

  4. Sign in with your work or school account for Microsoft 365 that was added to the EDM_DataUploaders security group. Your tenant information is extracted from the user account to make the connection.

  5. To hash and upload the sensitive data, run the following command in the Command Prompt window:

    EdmUploadAgent.exe /UploadData /DataStoreName <DataStoreName> /DataFile <DataFilePath> /HashLocation <HashedFileLocation>

    Example: EdmUploadAgent.exe /UploadData /DataStoreName PatientRecords /DataFile C:\Edm\Hash\PatientRecords.csv /HashLocation C:\Edm\Hash

    This automatically adds a randomly generated salt value to the hash for greater security. Optionally, if you want to use your own salt value, add the /Salt option to the command. This value must be 64 characters in length and can only contain the characters a-z and 0-9.

  6. Check the upload status by running this command:

    EdmUploadAgent.exe /GetSession /DataStoreName <DataStoreName>

    Example: EdmUploadAgent.exe /GetSession /DataStoreName PatientRecords

    Look for the status ProcessingInProgress. Check again every few minutes until the status changes to Completed. Once the status is Completed, your EDM data is ready for use.
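
If you choose to supply your own salt value, it has to meet the format stated in the hash and upload step: 64 characters, using only a-z and 0-9. The following Python sketch generates and checks such a value. The function names new_salt and is_valid_salt are hypothetical, and using the secrets module for randomness is a choice made for this sketch rather than a documented requirement.

```python
import secrets
import string

SALT_ALPHABET = string.ascii_lowercase + string.digits  # a-z and 0-9 only
SALT_LENGTH = 64


def new_salt():
    """Return a random 64-character salt drawn from a-z and 0-9."""
    return "".join(secrets.choice(SALT_ALPHABET) for _ in range(SALT_LENGTH))


def is_valid_salt(value):
    """Check a candidate salt against the length and character rules."""
    return len(value) == SALT_LENGTH and all(ch in SALT_ALPHABET for ch in value)


if __name__ == "__main__":
    salt = new_salt()
    print(salt, is_valid_salt(salt))
```

Treat a custom salt like any other secret: store it securely, because the same value is needed wherever the hashes are created or refreshed.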

Separate hash and upload

Perform the hash on a computer in a secure environment.

  1. Run the following command in a Command Prompt window:

    EdmUploadAgent.exe /CreateHash /DataFile <DataFilePath> /HashLocation <HashedFileLocation>

    For example:

    EdmUploadAgent.exe /CreateHash /DataFile C:\Edm\Data\PatientRecords.csv /HashLocation C:\Edm\Hash

    If you didn't specify the /Salt option, this outputs a hashed file and a salt file with these extensions:

    • .EdmHash
    • .EdmSalt
  2. Copy these files in a secure fashion to the computer you will use to upload your sensitive items .csv file (PatientRecords) to your tenant.

To upload the hashed data, run the following command in Windows Command Prompt:

EdmUploadAgent.exe /UploadHash /DataStoreName <DataStoreName> /HashFile <HashedSourceFilePath>

For example:

EdmUploadAgent.exe /UploadHash /DataStoreName PatientRecords /HashFile C:\Edm\Hash\PatientRecords.EdmHash

To verify that your sensitive data has been uploaded, run the following command in a Command Prompt window:

EdmUploadAgent.exe /GetDataStore

You'll see a list of data stores and when they were last updated.

If you want to see all the data uploads to a particular store, run the following command in a Windows command prompt:

EdmUploadAgent.exe /GetSession /DataStoreName <DataStoreName>
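
When the hash and upload steps run on different computers, it is easy to mistype a path between them. The following Python sketch builds both argument lists from one set of inputs. It uses only the switches shown in the commands above; the function name is hypothetical, and deriving the hash file name from the .csv file name (PatientRecords.csv to PatientRecords.EdmHash) is an assumption based on the examples in this article.

```python
from pathlib import PureWindowsPath


def separate_hash_upload_commands(data_file, hash_location, data_store_name,
                                  agent="EdmUploadAgent.exe"):
    """Return (create_hash_args, upload_hash_args) as lists of command-line arguments."""
    data_path = PureWindowsPath(data_file)
    hash_dir = PureWindowsPath(hash_location)
    # Assumption: the hash file keeps the .csv file's base name with the .EdmHash extension.
    hash_file = hash_dir / (data_path.stem + ".EdmHash")
    create_hash = [agent, "/CreateHash",
                   "/DataFile", str(data_path),
                   "/HashLocation", str(hash_dir)]
    upload_hash = [agent, "/UploadHash",
                   "/DataStoreName", data_store_name,
                   "/HashFile", str(hash_file)]
    return create_hash, upload_hash


if __name__ == "__main__":
    create, upload = separate_hash_upload_commands(
        r"C:\Edm\Data\PatientRecords.csv", r"C:\Edm\Hash", "PatientRecords")
    print(" ".join(create))
    print(" ".join(upload))
```

Each list can be passed to subprocess.run on the appropriate computer: the first on the secure computer that holds the clear text file, the second on the computer that connects to your tenant.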

Proceed to set up your process and schedule for Refreshing your sensitive information database.

At this point, you are ready to use EDM-based classification with your Microsoft cloud services. For example, you can set up a DLP policy using EDM-based classification.

Refreshing your sensitive information database

You can refresh your sensitive information database daily, and the EDM Upload Tool can reindex the sensitive data and then reupload the indexed data.

  1. Determine your process and frequency (daily or weekly) for refreshing the database of sensitive information.

  2. Re-export the sensitive data to an app, such as Microsoft Excel, and save the file in .csv format. Keep the same file name and location you used when you followed the steps described in Hash and upload the sensitive data.

    Note

    If there are no changes to the structure (field names) of the .csv file, you won't need to make any changes to your database schema file when you refresh the data. But if you must make changes, make sure to edit the database schema and your rule package accordingly.

  3. Use Task Scheduler to automate steps 2 and 3 in the Hash and upload the sensitive data procedure. You can schedule tasks using several methods:

    Method | What to do
    Windows PowerShell | See the ScheduledTasks documentation and the example PowerShell script in this article
    Task Scheduler API | See the Task Scheduler documentation
    Windows user interface | In Windows, click Start, and type Task Scheduler. Then, in the list of results, right-click Task Scheduler, and choose Run as administrator.
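
Before each refresh, you can check whether the re-exported .csv file still has the field names that the schema expects, which tells you whether the schema and rule package need editing. The following is a minimal Python sketch; the function name schema_drift is hypothetical, and it compares names only, not column order or data.

```python
import csv
import io
import xml.etree.ElementTree as ET

EDM_NS = "http://schemas.microsoft.com/office/2018/edm"  # namespace from the sample schema


def schema_drift(csv_text, schema_xml, store_name):
    """Compare the .csv header row with the Field names of one DataStore in the schema."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    fields = None
    for store in ET.fromstring(schema_xml).iter(f"{{{EDM_NS}}}DataStore"):
        if store.get("name") == store_name:
            fields = [f.get("name") for f in store.iter(f"{{{EDM_NS}}}Field")]
    if fields is None:
        raise ValueError(f"data store {store_name!r} not found in schema")
    return {
        "missing_from_schema": sorted(set(header) - set(fields)),
        "missing_from_csv": sorted(set(fields) - set(header)),
    }
```

If both lists in the result are empty, the structure is unchanged and the data can be refreshed without touching edm.xml.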

Example PowerShell script for Task Scheduler

This section includes example PowerShell scripts you can use to schedule your tasks for hashing data and uploading the hashed data:

To schedule hashing and upload in a combined step

param([string]$dataStoreName,[string]$fileLocation)
# Assuming current user is also the user context to run the task
$user = "$env:USERDOMAIN\$env:USERNAME"
$edminstallpath = 'C:\Program Files\Microsoft\EdmUploadAgent\'
$edmuploader = $edminstallpath + 'EdmUploadAgent.exe'
$csvext = '.csv'
# Assuming CSV file name is same as data store name
$dataFile = "$fileLocation\$dataStoreName$csvext"
# Assuming location to store hash file is same as the location of csv file
$hashLocation = $fileLocation
$uploadDataArgs = '/UploadData /DataStoreName ' + $dataStoreName + ' /DataFile ' + $dataFile + ' /HashLocation ' + $hashLocation
# Set up actions associated with the task
$actions = @()
$actions += New-ScheduledTaskAction -Execute $edmuploader -Argument $uploadDataArgs -WorkingDirectory $edminstallpath
# Set up trigger for the task
$trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Sunday -At 2am
# Set up task settings
$principal = New-ScheduledTaskPrincipal -UserId $user -LogonType S4U -RunLevel Highest
$settings = New-ScheduledTaskSettingsSet -RunOnlyIfNetworkAvailable -StartWhenAvailable -WakeToRun
# Create the scheduled task
$scheduledTask = New-ScheduledTask -Action $actions -Principal $principal -Trigger $trigger -Settings $settings
# Get credentials to run the task
$creds = Get-Credential -UserName $user -Message "Enter credentials to run the task"
$password=[Runtime.InteropServices.Marshal]::PtrToStringAuto([Runtime.InteropServices.Marshal]::SecureStringToBSTR($creds.Password))
# Register the scheduled task
$taskName = 'EDMUpload_' + $dataStoreName
Register-ScheduledTask -TaskName $taskName -InputObject $scheduledTask -User $user -Password $password

To schedule hashing and upload as separate steps

param([string]$dataStoreName,[string]$fileLocation)
# Assuming current user is also the user context to run the task
$user = "$env:USERDOMAIN\$env:USERNAME"
$edminstallpath = 'C:\Program Files\Microsoft\EdmUploadAgent\'
$edmuploader = $edminstallpath + 'EdmUploadAgent.exe'
$csvext = '.csv'
$edmext = '.EdmHash'
# Assuming CSV file name is same as data store name
$dataFile = "$fileLocation\$dataStoreName$csvext"
$hashFile = "$fileLocation\$dataStoreName$edmext"
# Assuming location to store hash file is same as the location of csv file
$hashLocation = $fileLocation
$createHashArgs = '/CreateHash' + ' /DataFile ' + $dataFile + ' /HashLocation ' + $hashLocation
$uploadHashArgs = '/UploadHash /DataStoreName ' + $dataStoreName + ' /HashFile ' + $hashFile
# Set up actions associated with the task
$actions = @()
$actions += New-ScheduledTaskAction -Execute $edmuploader -Argument $createHashArgs -WorkingDirectory $edminstallpath
$actions += New-ScheduledTaskAction -Execute $edmuploader -Argument $uploadHashArgs -WorkingDirectory $edminstallpath
# Set up trigger for the task
$trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Sunday -At 2am
# Set up task settings
$principal = New-ScheduledTaskPrincipal -UserId $user -LogonType S4U -RunLevel Highest
$settings = New-ScheduledTaskSettingsSet -RunOnlyIfNetworkAvailable -StartWhenAvailable -WakeToRun
# Create the scheduled task
$scheduledTask = New-ScheduledTask -Action $actions -Principal $principal -Trigger $trigger -Settings $settings
# Get credentials to run the task
$creds = Get-Credential -UserName $user -Message "Enter credentials to run the task"
$password=[Runtime.InteropServices.Marshal]::PtrToStringAuto([Runtime.InteropServices.Marshal]::SecureStringToBSTR($creds.Password))
# Register the scheduled task
$taskName = 'EDMUpload_' + $dataStoreName
Register-ScheduledTask -TaskName $taskName -InputObject $scheduledTask -User $user -Password $password

Part 3: Use EDM-based classification with your Microsoft cloud services

These locations support EDM sensitive information types:

  • DLP for Exchange Online (email)
  • OneDrive for Business (files)
  • Microsoft Teams (conversations)
  • DLP for SharePoint (files)
  • Microsoft Cloud App Security DLP policies

EDM sensitive information types for the following scenarios are currently in development, but not yet available:

  • Auto-classification of sensitivity labels and retention labels

To create a DLP policy with EDM

  1. Go to the Security & Compliance Center using the appropriate link for your subscription.

  2. Choose Data loss prevention > Policy.

  3. Choose Create a policy > Custom > Next.

  4. On the Name your policy tab, specify a name and description, and then choose Next.

  5. On the Choose locations tab, select Let me choose specific locations, and then choose Next.

  6. In the Status column, select Exchange email, OneDrive accounts, Teams chat and channel message, and then choose Next.

  7. On the Policy settings tab, choose Use advanced settings, and then choose Next.

  8. Choose + New rule.

  9. In the Name section, specify a name and description for the rule.

  10. In the Conditions section, in the + Add a condition list, choose Content contains sensitive type.

  11. Search for the sensitive information type you created when you set up your rule package, and then choose + Add.
    Then choose Done.

  12. Finish selecting options for your rule, such as User notifications, User overrides, Incident reports, and so on, and then choose Save.

  13. On the Policy settings tab, review your rules, and then choose Next.

  14. Specify whether to turn on the policy right away, test it out, or keep it turned off. Then choose Next.

  15. On the Review your settings tab, review your policy. Make any needed changes. When you're ready, choose Create.

Note

Allow approximately one hour for your new DLP policy to work its way through your data center.