Configure search and analytics settings in Advanced eDiscovery

You can configure settings for each Advanced eDiscovery case to control the following functionality:

  • Near duplicates and email threading

  • Themes

  • Autogenerated review set query

  • Ignore text

  • Optical character recognition

To configure search and analytics settings for a case:

  1. On the Advanced eDiscovery page, select the case.

  2. On the Settings tab, under Search & analytics, click Select.

    The case settings page is displayed. These settings are applied to all review sets in the case.

    [Image: Configure analytics and search settings for an Advanced eDiscovery case]

Near duplicates and email threading

In this section, you can set parameters for duplicate detection, near duplicate detection, and email threading. For more information, see Near duplicate detection and Email threading.

  • Near duplicates/email threading: When turned on, duplicate detection, near duplicate detection, and email threading are included as part of the workflow when you run analytics on the data in a review set.

  • Document and email similarity threshold: If the similarity level for two documents is above the threshold, both documents are put in the same near duplicate set.

  • Minimum/maximum number of words: These settings specify that near duplicate and email threading analysis is performed only on documents that have at least the minimum number of words and at most the maximum number of words.
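
The interaction between the similarity threshold and the word-count limits can be sketched as follows. This is only an illustration under stated assumptions: Jaccard overlap of word sets stands in for the similarity measure, which this article does not describe, and the function names, the 0.65 threshold, and the word limits are hypothetical values chosen for the example rather than product defaults.

```python
def word_count(text):
    """Count whitespace-separated words in a document."""
    return len(text.split())


def is_eligible(text, min_words, max_words):
    """A document is analyzed only if its word count is within both limits."""
    return min_words <= word_count(text) <= max_words


def similarity(a, b):
    """Stand-in similarity: Jaccard overlap of lowercase word sets (assumption)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def near_duplicate_sets(docs, threshold, min_words, max_words):
    """Group eligible documents into sets; the first member acts as the pivot.

    `docs` maps a document ID to its text. A document joins the first set
    whose pivot it resembles with a similarity above `threshold`; otherwise
    it starts a new set.
    """
    sets = []
    for doc_id, text in docs.items():
        if not is_eligible(text, min_words, max_words):
            continue  # too short or too long: skipped by the analysis
        for group in sets:
            if similarity(text, docs[group[0]]) > threshold:
                group.append(doc_id)
                break
        else:
            sets.append([doc_id])
    return sets


docs = {
    "a": "the quick brown fox jumps over the lazy dog",
    "b": "the quick brown fox jumps over the lazy cat",
    "c": "completely different content about quarterly budget numbers",
    "d": "hi",
}
groups = near_duplicate_sets(docs, threshold=0.65, min_words=3, max_words=100)
print(groups)
```

In this sketch, raising the threshold produces smaller, tighter sets, and a document outside the word-count limits (such as "d") never appears in any set.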

Themes

In this section, you can set parameters for themes. For more information, see Themes.

  • Themes: When turned on, themes clustering is performed as part of the workflow when you run analytics on the data in a review set.

  • Maximum number of themes: Specifies the maximum number of themes that can be generated when you run analytics on the data in a review set.

  • Include numbers in themes: When turned on, numbers (that identify a theme) are included when generating themes.

  • Adjust maximum number of themes dynamically: In certain situations, there may not be enough documents in a review set to produce the desired number of themes. When this setting is enabled, Advanced eDiscovery adjusts the maximum number of themes dynamically rather than attempting to enforce the maximum number of themes.

Review set query

If you select the Automatically create a For Review saved search after analytics checkbox, Advanced eDiscovery autogenerates a review set query named For Review.

[Image: Autogenerated For Review query]

This query filters out duplicate items from the review set, which lets you review only the unique items in the review set. The query is created only when you run analytics for a review set in the case. For more information about review set queries, see Query the data in a review set.
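
The effect of filtering out duplicates can be pictured with a short sketch. It assumes that analytics has already assigned each item to a duplicate group; the `Item` type and its `duplicate_group` field are hypothetical names for illustration and are not the actual conditions of the For Review query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    duplicate_group: str  # hypothetical field: items with equal values are duplicates


def unique_items(items):
    """Keep the first item from each duplicate group, preserving order."""
    seen = set()
    result = []
    for item in items:
        if item.duplicate_group in seen:
            continue  # a representative of this group is already kept
        seen.add(item.duplicate_group)
        result.append(item)
    return result


review_set = [
    Item("1", "g1"),
    Item("2", "g1"),  # duplicate of item 1
    Item("3", "g2"),
]
for_review = unique_items(review_set)
print([item.item_id for item in for_review])
```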

Ignore text

In some situations, certain text diminishes the quality of analytics, such as lengthy disclaimers that are added to email messages regardless of the content of the email. If you know of text that should be ignored, you can exclude it from analytics by specifying the text string and the analytics functionality (Near-duplicates, Email threading, Themes, and Relevance) from which the text should be excluded. Using regular expressions (RegEx) as ignored text is also supported.
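
As a rough illustration of the idea, the sketch below strips ignored text from a message before a given analytics feature sees it. It uses Python's `re` module; the rule structure, the feature names, the helper functions, and the sample disclaimer are all assumptions made for the example, and the regular expression syntax accepted by Advanced eDiscovery may differ.

```python
import re


def literal(text):
    """Build a rule pattern that matches a plain text string exactly."""
    return re.compile(re.escape(text))


# Each rule pairs a pattern with the analytics features it applies to.
# The rule layout and feature names are hypothetical.
IGNORE_RULES = [
    (
        re.compile(r"CONFIDENTIALITY NOTICE:.*?END OF NOTICE", re.DOTALL),
        {"near_duplicates", "email_threading", "themes", "relevance"},
    ),
    (literal("Sent from my phone"), {"themes"}),
]


def text_for(feature, text, rules):
    """Return the text with every rule that applies to `feature` removed."""
    for pattern, features in rules:
        if feature in features:
            text = pattern.sub("", text)
    return text


message = (
    "Lunch at noon? Sent from my phone\n\n"
    "CONFIDENTIALITY NOTICE: This message is private.\n"
    "END OF NOTICE"
)
print(text_for("themes", message, IGNORE_RULES).strip())
print(text_for("relevance", message, IGNORE_RULES).strip())
```

Here the disclaimer is removed for every feature, while the literal string is removed only for Themes, mirroring the per-feature choice described above.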

Optical character recognition (OCR)

When this setting is turned on, OCR processing is run on image files. OCR processing is run in the following situations:

  • When custodians and non-custodial data sources are added to a case. OCR processing is performed during the Advanced indexing process. This means that text in image files that matches the search criteria is returned in a collection search.

  • When content from other data sources (that aren't associated with a custodian and added to the case in a non-custodial data source) is added to a review set.

After data is added to a review set, image text can be reviewed, searched, tagged, and analyzed. You can view the extracted text in the Text viewer of the selected image file in the review set. For more information, see: