Create and deploy custom entity extractors in SharePoint Server

Summary: Learn how to create custom entity extractors and use them to set up custom refiners. Create one or more custom entity extraction dictionaries and connect them to managed properties.

You create and maintain the custom entity extractor file in a system external to SharePoint Server, and then import it into SharePoint Server to make the custom entity extractor available to the search system.

To use custom entities as refiners, you first create a custom entity extraction dictionary and deploy it. Then, you configure a managed property to use a custom entity extractor and run a full crawl. After that, you can configure the Refinement Web Part on the search results page to use the custom entity as a refiner.

Before you begin

Before you begin this operation, you must have the following in place:

  • A Search service application

  • One or more fully crawled content sources

  • A search results page

Create a custom entity extraction dictionary

To create a custom entity extraction dictionary

  1. Determine which type of custom entity extraction dictionary you want to create: Word, Word Part, Word Exact, or Word Part Exact. See Overview of custom entity extractor types.

  2. Create a .csv file with the columns Key and Display form. Make sure you use a comma as the column separator. If the file contains non-ASCII characters such as diacritics, you must encode it in UTF-8. Save the file to a location that is accessible from the server from which you will run the Microsoft PowerShell cmdlet to deploy the custom entity extraction dictionary.

    • In the Key column, enter the term (single or multiple words) that you want to include as a custom entity. You can use more than one line per key. Make sure there are no leading or trailing spaces around the terms.

    • (Optional) In the Display form column, enter a refiner name. If you leave this column empty, the term that is extracted from the content is displayed as the refiner in the same case as it occurs in the content. Use the Display form column to control and standardize the way in which the refiner is displayed.

For example, an organization named Contoso has a certification system with three levels: Contoso Beginner, Contoso Professional, and Contoso Expert. Contoso wants to extract these entities and be able to refine on all of them. Regardless of the case in which the words "Contoso", "beginner", "professional", or "expert" are written, Contoso wants the refiners to be displayed as Contoso Beginner, Contoso Professional, and Contoso Expert. For this example, the custom entity extraction dictionary file could look like this:

Key,Display form
Contoso Beginner,Contoso Beginner
Contoso B1,Contoso Beginner
Contoso Professional,Contoso Professional
Contoso prof,Contoso Professional
Contoso Expert,Contoso Expert
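
Because the dictionary file is maintained outside SharePoint Server, it can be generated with a small script. The following Python sketch is one possible way to do this and is not part of SharePoint: the function name `write_dictionary` and the sample rows are illustrative assumptions. It writes the Key and Display form columns with a comma separator and rejects keys that are empty or have leading or trailing spaces, as required above.

```python
import csv
import io


def write_dictionary(rows, stream):
    """Write (key, display_form) pairs as a comma-separated dictionary file.

    write_dictionary is a hypothetical helper name used only for this sketch.
    """
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["Key", "Display form"])
    for key, display_form in rows:
        # The procedure above requires keys without leading or trailing spaces.
        if not key or key != key.strip():
            raise ValueError(f"Invalid key (empty or padded with spaces): {key!r}")
        writer.writerow([key, display_form])


# Sample rows taken from the Contoso example above.
rows = [
    ("Contoso Beginner", "Contoso Beginner"),
    ("Contoso B1", "Contoso Beginner"),
    ("Contoso Professional", "Contoso Professional"),
    ("Contoso prof", "Contoso Professional"),
    ("Contoso Expert", "Contoso Expert"),
]

buffer = io.StringIO()
write_dictionary(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To save the result to disk, one option is to open the target file with `open(path, "w", encoding="utf-8", newline="")` and pass that file object to `write_dictionary` instead of the in-memory buffer, so that non-ASCII characters are stored as UTF-8.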

Deploy a custom entity extraction dictionary

To deploy the custom entity extraction dictionary, you must import it into SharePoint Server.

To import a custom entity extraction dictionary

  1. Verify that the user account that is importing the custom entity extraction dictionary is an administrator for the Search service application.

  2. Start the SharePoint Management Shell.

  3. At the Windows PowerShell command prompt, type the following command:

    $searchApp = Get-SPEnterpriseSearchServiceApplication
    Import-SPEnterpriseSearchCustomExtractionDictionary -SearchApplication $searchApp -Filename <Path> -DictionaryName <Dictionary name> 
    

    Where:

    • <Path> specifies the full UNC path of the .csv file (the custom extraction dictionary) to be imported.

    • <Dictionary name> is the name of the type of the custom extraction dictionary.

      Depending on which type of dictionary you are importing, enter one of the following:

      • Microsoft.UserDictionaries.EntityExtraction.Custom.Word.n [where n = 1, 2, 3, 4, or 5]

      • Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWord.1

      • Microsoft.UserDictionaries.EntityExtraction.Custom.WordPart.n [where n = 1, 2, 3, 4, or 5]

      • Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWordPart.1
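
Since the dictionary name must follow one of the four patterns above, it can help to build and check the name in a script before passing it to the import cmdlet. The following Python sketch is only an illustration: the helper name `dictionary_name` and the `MAX_SLOTS` table are assumptions made for this example and are not part of any SharePoint API. The slot limits mirror the list above (n from 1 to 5 for Word and WordPart, only 1 for the exact variants).

```python
# Common prefix of all custom extraction dictionary names listed above.
PREFIX = "Microsoft.UserDictionaries.EntityExtraction.Custom"

# Highest slot number allowed per dictionary type, taken from the list above.
MAX_SLOTS = {"Word": 5, "ExactWord": 1, "WordPart": 5, "ExactWordPart": 1}


def dictionary_name(kind, n=1):
    """Return the dictionary name for a type and slot (hypothetical helper)."""
    if kind not in MAX_SLOTS:
        raise ValueError(f"Unknown dictionary type: {kind!r}")
    if not 1 <= n <= MAX_SLOTS[kind]:
        raise ValueError(f"{kind} supports slots 1 to {MAX_SLOTS[kind]}, got {n}")
    return f"{PREFIX}.{kind}.{n}"


print(dictionary_name("Word", 3))
print(dictionary_name("ExactWordPart"))
```

The returned string is what would be supplied as <Dictionary name> in the import command shown earlier.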

Configure a managed property for custom entity extraction

The following procedure describes how to associate the custom entity extraction dictionary with an existing managed property from which you want to extract custom entities. Typically, this is a managed property that you expect to contain these entities, such as the managed properties Title or Body. Custom entities are extracted from the full contents of the managed property they are associated with, even if sections in those contents are tagged as <no index>.

To specify from which existing managed property custom entities should be extracted, you edit the existing managed property. For more information about managing crawled and managed properties, see Manage the search schema in SharePoint Server.

To edit a managed property for custom entity extraction

  1. Verify that the user account is an administrator for the Search service application.

  2. In Central Administration, in the Application Management section, click Manage service applications.

  3. Click the Search service application.

  4. On the Search Administration page, in the Quick Launch, under Queries and Results, click Search Schema.

  5. On the Managed Properties page, find the managed property that contains the words (or word parts) that you want to extract and that you want to associate the custom entity extraction dictionary with. You can also enter the name of the managed property in the Filter box.

  6. Point to the managed property, click the arrow, and then click Edit/Map property.

  7. On the Edit Managed Property page, edit the settings under Custom entity extraction. Select the custom entity extraction dictionary that you have imported, and then click OK.

After the next full crawl has completed, the custom entity extractor is enabled. The original managed property content is saved unchanged in the search index. In addition, depending on the type of custom entity extractor you have enabled, the extracted entities are copied to one or more of the following managed properties:

  • WordCustomRefiner1, WordCustomRefiner2, WordCustomRefiner3, WordCustomRefiner4, WordCustomRefiner5

  • WordExactCustomRefiner

  • WordPartCustomRefiner1, WordPartCustomRefiner2, WordPartCustomRefiner3, WordPartCustomRefiner4, WordPartCustomRefiner5

  • WordPartExactCustomRefiner

These managed properties are automatically configured to be searchable, queryable, retrievable, sortable, and refinable.
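
To keep track of which managed property receives the entities from a given dictionary, the relationship can be written down as a small lookup. The Python sketch below is illustrative only: `refiner_for_dictionary` is a hypothetical helper, and it assumes that the slot number n of a Word or WordPart dictionary carries over to the number in the managed property name, as the ordering of the property names suggests.

```python
PREFIX = "Microsoft.UserDictionaries.EntityExtraction.Custom."


def refiner_for_dictionary(name):
    """Map a dictionary name to its target managed property (hypothetical helper).

    Assumption: slot n of a Word/WordPart dictionary corresponds to
    WordCustomRefiner<n> / WordPartCustomRefiner<n>.
    """
    if not name.startswith(PREFIX):
        raise ValueError(f"Not a custom extraction dictionary name: {name!r}")
    kind, _, slot = name[len(PREFIX):].partition(".")
    if kind in ("Word", "WordPart") and slot in ("1", "2", "3", "4", "5"):
        return f"{kind}CustomRefiner{slot}"
    if kind == "ExactWord" and slot == "1":
        return "WordExactCustomRefiner"
    if kind == "ExactWordPart" and slot == "1":
        return "WordPartExactCustomRefiner"
    raise ValueError(f"Unsupported dictionary type or slot: {name!r}")


print(refiner_for_dictionary(PREFIX + "Word.2"))
print(refiner_for_dictionary(PREFIX + "ExactWordPart.1"))
```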

Configure a refiner in the Web Part

You can use the extracted custom entities as refiners in the search results page. The refiners based on the custom entities are available in the Refinement Web Part.

To add a refiner based on a custom entity extractor

  1. Verify that the user account that performs this procedure is a member of the Designers SharePoint group on the Enterprise Search Center site.

  2. Browse to the page that contains the Refinement Web Part that you want to configure, click the Settings menu, and then click Edit Page.

  3. Edit the Refinement Web Part. Click the Refinement Web Part Menu arrow, and then click Edit Web Part.

    • In the Web Part tool pane, in the Properties for Search Refinement section, verify that Choose Refiners in this Web Part is selected.

    • Click Choose Refiners.

    • On the Refinement configuration page, in the Available refiners section, select one or more managed properties that contain the extracted entities you want to show as refiners, and then click Add. For example, if you have deployed a word extraction dictionary, choose WordCustomRefiner1.

    • In the Configure for section, configure how you want each refiner to appear.

  4. Click OK.

Overview of custom entity extractor types

The following table shows which types of custom extraction dictionaries you can create, how the dictionary entries are matched with content in the search index, which dictionary name to use when you deploy the dictionary, and which managed property will contain the extracted entities.

| Custom entity extractor / custom entity extractor dictionary | Description | Example | Dictionary name to use in Windows PowerShell | Managed property that will contain the extracted entity |
| --- | --- | --- | --- | --- |
| Word Extraction | Case-insensitive, dictionary entries matching tokenized content, maximum 5 dictionaries. | The entry "anchor" matches "anchor" and "Anchor," but not "anchorage" | Microsoft.UserDictionaries.EntityExtraction.Custom.Word.n [where n = 1, 2, 3, 4, or 5] | WordCustomRefiner1, WordCustomRefiner2, WordCustomRefiner3, WordCustomRefiner4, WordCustomRefiner5 |
| Word Part Extraction | Case-insensitive, dictionary entries matching un-tokenized content, maximum 5 dictionaries. | The entry "anchor" matches "anchor," "Anchor," and "anchorage" | Microsoft.UserDictionaries.EntityExtraction.Custom.WordPart.n [where n = 1, 2, 3, 4, or 5] | WordPartCustomRefiner1, WordPartCustomRefiner2, WordPartCustomRefiner3, WordPartCustomRefiner4, WordPartCustomRefiner5 |
| Word Exact Extraction | Case-sensitive, dictionary entries matching tokenized content, maximum 1 dictionary. | The entry "anchor" matches "anchor," but not "Anchor" or "Anchorage" | Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWord.1 | WordExactCustomRefiner |
| Word Part Exact Extraction | Case-sensitive, dictionary entries matching un-tokenized content, maximum 1 dictionary. | The entry "anchor" matches "anchor" and "anchorage," but not "Anchor" | Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWordPart.1 | WordPartExactCustomRefiner |
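
The difference between the four extractor types can be approximated in a few lines of Python. This is a rough sketch for building intuition only: it does not reproduce the tokenizer that SharePoint Server uses, and it assumes that "tokenized" matching behaves like whole-word matching. The function `matches` and the `EXTRACTOR_TYPES` table are names invented for this example.

```python
import re


def matches(entry, text, *, whole_word, case_sensitive):
    """Approximate whether a dictionary entry would match a piece of content."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.escape(entry)
    if whole_word:
        # Require that the entry is not embedded in a longer word.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    return re.search(pattern, text, flags) is not None


# Assumed mapping of the four extractor types to the two matching switches.
EXTRACTOR_TYPES = {
    "Word": {"whole_word": True, "case_sensitive": False},
    "WordPart": {"whole_word": False, "case_sensitive": False},
    "ExactWord": {"whole_word": True, "case_sensitive": True},
    "ExactWordPart": {"whole_word": False, "case_sensitive": True},
}

for name, options in EXTRACTOR_TYPES.items():
    hits = [w for w in ("anchor", "Anchor", "anchorage")
            if matches("anchor", w, **options)]
    print(name, hits)
```

Under these assumptions, the sketch gives the same results as the Example column for the entry "anchor" and the test words "anchor", "Anchor", and "anchorage".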

See also

Import-SPEnterpriseSearchCustomExtractionDictionary