Redesign enterprise search topology for specific performance requirements in SharePoint 2016

Summary: Learn how to redesign your enterprise search topology so that you can scale search performance to meet specific performance requirements.

If your search environment has specific performance requirements that weren't met by following the guidance in Plan enterprise search architecture in SharePoint Server 2016, the solution is to scale the topology of your enterprise search architecture:

  1. Redesign your topology (this article)

  2. Implement the redesigned topology (Manage the search topology in SharePoint Server)

Are you familiar with the components of the search system in SharePoint Server 2016 and how they interact? Before you get going, read Overview of search architecture in SharePoint Server and Search architectures for SharePoint Server 2016 (or Search architectures for SharePoint Server 2013) to become familiar with the search architecture, search components, search databases, and the search topology.

In this article, we'll show you step by step how to redesign your search topology to meet specific performance requirements.

After you've followed these steps, you'll know:

  • How many of each type of search component and search database your topology needs.

  • Which application servers and database servers to deploy each search component on.

  • What hardware resources each application server and database server needs.

Step 1: What are the specific performance requirements?

Ensure that you understand the business needs behind the specific performance requirements. For example, news and financial search require fresh data that is indexed in near real time, while litigation support services require ingestion of batches of data that are indexed once. Express the performance requirements in one or more of these ways:

  • The number of indexed items.

  • How many items the search solution must crawl per second and with what latency.

  • How many queries the search solution must serve per second and with what latency.

In addition to these performance requirements, your environment might also have requirements for the relevancy of query results and for the search topology to be redundant. Sometimes you don't have a specific performance requirement, but you've identified a bottleneck in the search architecture that might affect performance. We'll cover that too.

Step 2: Which search components should I scale?

To deliver higher performance or to remove a bottleneck, you can add more search components to do the job, or you can add more resources to the servers hosting search components. Adding more search components is known as scaling out, while adding more resources to the servers is known as scaling up. Which search components to scale out, or which servers to scale up, depends on the performance metric to improve or the bottleneck to remove. Here are some examples:

  • If the environment requires a higher query rate and the CPU resources for indexing are a bottleneck, add another index replica to each partition of the index. This lets search serve more queries in parallel.

  • If CPU resources for processing crawled content are a bottleneck, scale out the number of content processing components. You can also scale up the content processing components by running them on servers with more or faster CPUs. Either way of scaling provides more CPU resources for processing content.

  • If the analytics components don't complete their analyses quickly enough, scale up the processor resources, disk IOPS, or network bandwidth of the servers hosting analytics components.

Note that we don't support unlimited scale-out of the number of search components or databases. Look up the maximum limits in Search limits and stay within these limits to ensure timely and robust communication between the search components and databases. If necessary, reduce the capacity of your search architecture by reducing the number of search components.

In the following sections, we provide guidelines on which search components or databases to scale to satisfy each requirement:

How to handle more items in the index

When the number of indexed items increases while the indexed items change at the same rate as before, increase the capacity of your search topology by scaling out these search components and databases:

Search component or database | Guideline
Index component | Use one index partition for each 20 million¹ indexed items. Each partition contains one or more replicas of the partition, and all partitions must have the same number of replicas. An index component represents one index replica, so if you want two replicas of the index, you need twice as many index components as index partitions. For example, a redundant index with 80 million² items requires four partitions; with two replicas for each partition, eight index components represent the four partitions.
Crawl database | Use one crawl database for each 20 million items in the content corpus. For example, an index with 100 million items requires five crawl databases. If the increased number of indexed items implies a higher crawl rate, you also need more IOPS resources to serve the crawl databases. If your crawl rate is one document per second, the crawl database needs about 10 IOPS.
Link database | Use one link database for each 60 million items in the content corpus. For example, an index with 100 million items requires two link databases. If the added content implies a higher crawl rate, you might need more IOPS resources to serve the link databases.
Analytics reporting database | How many analytics reporting databases you need depends on how the search environment uses analytics, and how often. In general, add an analytics reporting database when the analytics performance starts to decrease, for example when the nightly update of the database starts to take more time. This might happen when the database reaches a size of 250 GB, or 20 million rows in total, or when the number of views per day reaches 500,000 unique items.

¹ 10 million items with SharePoint Server 2013, or with SharePoint Server 2016 running with fewer resources than 500 GB storage, 32 GB RAM, and eight CPU cores.

² 40 million items with SharePoint Server 2013, or with SharePoint Server 2016 running with fewer resources than 500 GB storage, 32 GB RAM, and eight CPU cores.
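The sizing guidelines in the table above reduce to a few ceiling divisions. The following Python sketch is only an illustration of that arithmetic; the function name, the dictionary keys, and the default of two replicas per partition are assumptions made for this example, not part of any SharePoint tooling.

```python
import math

# Figures taken from the guidelines above.
ITEMS_PER_PARTITION = 20_000_000   # use 10_000_000 for the cases in footnote 1
ITEMS_PER_CRAWL_DB = 20_000_000
ITEMS_PER_LINK_DB = 60_000_000


def size_for_item_count(items, replicas_per_partition=2,
                        items_per_partition=ITEMS_PER_PARTITION):
    """Estimate component and database counts for a given number of items.

    Hypothetical helper: it only applies the per-item ratios from the table
    and does not check the maximum limits listed in Search limits.
    """
    partitions = max(1, math.ceil(items / items_per_partition))
    return {
        "index_partitions": partitions,
        # One index component represents one replica of one partition.
        "index_components": partitions * replicas_per_partition,
        "crawl_databases": max(1, math.ceil(items / ITEMS_PER_CRAWL_DB)),
        "link_databases": max(1, math.ceil(items / ITEMS_PER_LINK_DB)),
    }


if __name__ == "__main__":
    print(size_for_item_count(80_000_000))
    print(size_for_item_count(100_000_000))
```

For 80 million items and two replicas per partition, the sketch yields four partitions and eight index components, which matches the example in the table.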

How to increase the ingestion rate and the freshness of results

There are some situations where you might need to increase the ingestion rate. One example is if your environment requires very fresh results and the content volume is close to the upper item limit for your search architecture, or the content changes often. Content might change often if people used to archive files on a team site, but now store their files on OneDrive for Business while they work on them. Search indexes all the changes people make to their files.

It's useful to understand which factors influence how quickly search can ingest items:

  • How quickly search can crawl items. This depends on:

    • The speed of the connection between the crawl components and the content sources.

    • The type and average size of the items to crawl.

    • The performance of the SQL server hosting the crawl databases.

    • The amount of CPU and memory resources that the crawl components have.

  • How much content processing each item requires before indexing.

  • How many partitions the index has. More partitions let search spread the load of indexing.

Here's what to do:

  1. Check the freshness of the results in your farm by looking at the age distribution of the crawled items. On the SharePoint Central Administration website, go to Crawl Health Reports and select Crawl Freshness. Which age distribution is acceptable for your farm depends on your business requirements. For example, if the Crawl Freshness page shows that it takes four hours to index 90% of the content, but your requirement is 30 minutes, increase the ingestion rate.

  2. On the Crawl Freshness page, identify the periods of the day in which results aren't fresh enough.

  3. Follow the guidelines to increase the ingestion speed in these time periods.
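The freshness check in step 1 can be expressed as a percentile comparison. The sketch below assumes you have a list of sample item ages in minutes (time from change to indexed), for example noted down from the Crawl Freshness page; the data, function names, and the nearest-rank percentile method are all assumptions for illustration.

```python
def time_to_index_percent(ages_minutes, percent=90):
    """Return the age (minutes) within which `percent` of the items were indexed.

    Uses the nearest-rank percentile on a hypothetical list of sample ages.
    """
    if not ages_minutes:
        raise ValueError("at least one sample age is required")
    ordered = sorted(ages_minutes)
    # Integer ceiling of percent * n / 100 avoids floating-point surprises.
    rank = -(-percent * len(ordered) // 100)
    return ordered[max(rank, 1) - 1]


def needs_higher_ingestion_rate(ages_minutes, required_minutes, percent=90):
    """True when the measured freshness misses the business requirement."""
    return time_to_index_percent(ages_minutes, percent) > required_minutes


if __name__ == "__main__":
    sample_ages = [10, 20, 30, 40, 50, 60, 120, 180, 240, 600]  # made-up data
    print(time_to_index_percent(sample_ages))
    print(needs_higher_ingestion_rate(sample_ages, required_minutes=30))
```

With the made-up sample, 90% of the items are indexed within 240 minutes (four hours), so a 30-minute requirement is not met.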

Guideline
  • Improve freshness for a specific content source
  • Increase processing resources for crawling
  • Increase processing resources for the crawl database
  • Increase processing and memory resources for content processing
  • Increase the number of index partitions

Improve freshness for a specific content source

Check the crawl schedule, and identify which content sources search crawls in the time periods where freshness is low. If the freshness is low for a specific content source, consider the following:

  • Increase the speed of the connection between the server hosting the crawl component and that content source. The crawl component's need for network bandwidth is driven by the crawl rate, by downloading items from content sources, and by passing items to the content processing component.

  • If the content source is SharePoint, that farm might need more, and dedicated, crawl targets. Read about crawl targets in Manage crawl load (SharePoint 2010).

  • Improve the performance of the content database. Learn how in Best practices for SQL Server in a SharePoint Server farm.

Increase processing resources for crawling

If the crawl component often uses 100% of the processor resources, consider adding another crawl component or adding more processor resources to the servers hosting crawl components. The need for processor resources is driven by the crawl rate, the discovery of links, and the management of crawling. Normally, crawling is fast enough when you use two crawl components in search architectures like the small and medium sample search architectures that Microsoft has estimated. Search architectures like the large and extra-large samples might need more than two crawl components.

Increase processing resources for the crawl database

Check whether the SQL servers hosting crawl databases have enough resources. Read how to do this in Best practices for SQL Server in a SharePoint Server farm.

If all the crawl databases use a lot of processor resources, consider adding more processor resources to the SQL server hosting the databases, or add another SQL server with the same number of crawl databases as the existing SQL servers. If, for example, you have two SQL servers that each host three crawl databases, add another SQL server with three crawl databases.

If only one or a few crawl databases use a lot of processor resources, the load is uneven across the crawl databases. Consider rebalancing the content across all crawl databases. Note that search pauses crawling during rebalancing, so results are less fresh while rebalancing and until crawling has caught up with the changes that took place during the pause. To trigger rebalancing, go to Crawl Log in Search Administration, select Databases, and then select the Balance button on the Databases page.
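The choice between adding SQL Server resources and rebalancing can be sketched as a small decision function. The source does not define "a lot of processor resources", so the 0.8 threshold below is a placeholder assumption, as are the function name, the return values, and the idea of summarizing load as one CPU fraction per crawl database.

```python
def crawl_db_action(cpu_by_db, busy_threshold=0.8):
    """Suggest an action from per-crawl-database CPU load (fractions 0.0 to 1.0).

    Hypothetical helper; busy_threshold is an arbitrary placeholder value.
    """
    busy = [name for name, cpu in cpu_by_db.items() if cpu >= busy_threshold]
    if not busy:
        return "ok"
    if len(busy) == len(cpu_by_db):
        # All crawl databases are busy: add processor resources or another
        # SQL server with the same number of crawl databases.
        return "add_sql_resources"
    # Only one or a few are busy: the load is uneven across crawl databases.
    return "rebalance"


if __name__ == "__main__":
    print(crawl_db_action({"CrawlDB1": 0.95, "CrawlDB2": 0.20, "CrawlDB3": 0.30}))
```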

Increase processing and memory resources for content processing

If the content processing component uses close to 100% of the CPU resources, consider adding more content processing components, or adding more CPU resources to the servers hosting content processing components.

If you notice that memory restarts often, consider increasing the amount of memory on the servers hosting content processing components. A good rule of thumb is 2 GB of working memory per CPU core.

Increase the number of index partitions

Check the content processing activity. You find this by going to Search Administration, selecting Crawl Health Reports, and then selecting Content Processing Activity. If indexing is the activity that takes the most time, consider dividing the index into more partitions. More index partitions let search spread the load of indexing.

If you add more partitions on a running installation, the index repartitions itself. It can take several hours, or days, for the index to repartition. How long it takes depends on the state of the farm when repartitioning begins.

How to reduce query latency and increase query throughput

The number of queries that search can serve per second is known as query throughput. Query throughput depends on the time search uses to process a query and any time the query waits because a processing resource isn't available. The sum of the processing time and the waiting time is known as query latency. Reducing query latency increases query throughput. To reduce query latency, follow one or both of these guidelines:

Guideline
  • Reduce the processing time for queries
  • Reduce the waiting time for queries

Reduce the processing time for queries

Consider adding more partitions to the index. More partitions mean fewer items in each partition, and fewer items mean that each partition responds faster to queries. But too many partitions aren't good either. Because the query processing component has to merge the responses from each partition to produce an answer to a query, a merge takes more time when the index has more partitions. All partitions must have the same number of replicas.

When you add more partitions on a running installation, the index repartitions itself. It can take several hours, or days, for the index to repartition. How long it takes depends on the state of the farm when repartitioning begins.

Reduce the waiting time for queries

Consider these actions:

  • Add more replicas of the index. When you add more replicas, search distributes queries across the replicas and works on them in parallel. An index component represents one index replica. All partitions must have the same number of replicas, so add one index component to each partition of the index. When you add index components as replicas to existing partitions on a running installation, search automatically seeds the new replicas with data from the index partition. It can take several hours before the new replicas are operational.

  • Add more memory to the servers hosting index components.

  • On the servers hosting index components, switch to faster storage for the index, for example a solid-state drive (SSD).

  • Add more processor resources to the servers hosting index components. The components then handle more queries per second. For example, if the server has a 2 GHz CPU, one core can handle:

    • 5 queries per second when you have 1 million items in the index.

    • 2 queries per second when you have 5 million items in the index.

    • 1 query per second when you have 10 million items in the index.

  • Add more processor resources to the servers hosting query processing components. The components then handle more queries per second, especially when queries are infrequent and complex. The need for processor resources for the query processing component is driven by the query rate and the number of query transforms. A query processing component typically needs one CPU core for every 4 queries per second.
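The per-core figures in the list above can be turned into a rough core estimate. This Python sketch is an assumption-laden illustration: the function names are invented, item counts between the listed tiers are rounded up to the next tier (a conservative choice the source does not make), and the sketch refuses to extrapolate beyond 10 million items because the source gives no figure there.

```python
import math

# (maximum items in the index, queries per second per 2 GHz core), from the list above.
INDEX_QPS_PER_CORE = [(1_000_000, 5), (5_000_000, 2), (10_000_000, 1)]
QUERIES_PER_SECOND_PER_QP_CORE = 4  # query processing component rule of thumb


def index_cores_needed(queries_per_second, items_in_index):
    """Estimate CPU cores for index components at a target query rate."""
    for max_items, qps_per_core in INDEX_QPS_PER_CORE:
        if items_in_index <= max_items:
            return math.ceil(queries_per_second / qps_per_core)
    raise ValueError("no per-core figure is given above 10 million items")


def query_processing_cores_needed(queries_per_second):
    """Estimate CPU cores for query processing components."""
    return math.ceil(queries_per_second / QUERIES_PER_SECOND_PER_QP_CORE)


if __name__ == "__main__":
    print(index_cores_needed(20, 5_000_000))
    print(query_processing_cores_needed(20))
```

Treat the results as starting points for testing, not as guaranteed capacity: the figures assume a 2 GHz CPU and say nothing about query complexity.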

How to decrease analytics processing time

Analytics processing takes place every night. The analytics processing component stores intermediate data on the server hosting the component, and stores the results of the analysis in the analytics reporting database. If a fault hinders analytics processing, this doesn't affect document crawling or answering queries, but the query results won't have optimal relevance.

Consider these actions:

  • If your environment requires optimal relevance for query results and analytics processing isn't fast enough to satisfy this, add more disks (spindles) or faster disks.

  • If the analytics processing starts to take more time than usual, add an analytics reporting database. You might see such an increase when the database reaches a size of 250 GB, or 20 million rows in total, or when the number of views per day reaches 500,000 unique items.

  • If the analytics processing takes more than 24 hours to complete, either add more analytics processing components, or add more processor resources to the servers hosting analytics processing components. The need for processor resources is driven by the number of items in the index and the activity on the site.

  • If the analytics processing never completes, or you get health alerts for the disks on the servers hosting analytics components, add more disk space to the servers. For the analytics component to process the larger amount of intermediate data faster, consider adding more analytics processing components or more processor resources to the server hosting the analytics processing component.
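The three indicators for adding an analytics reporting database can be checked together. In this sketch the function name and parameters are hypothetical, and how you obtain the three measurements is outside its scope; only the threshold values come from the text above.

```python
# Thresholds from the guideline above.
MAX_SIZE_GB = 250
MAX_ROWS = 20_000_000
MAX_UNIQUE_ITEMS_VIEWED_PER_DAY = 500_000


def reporting_db_warnings(size_gb, total_rows, unique_items_viewed_per_day):
    """Return the reasons why another analytics reporting database may help."""
    reasons = []
    if size_gb >= MAX_SIZE_GB:
        reasons.append("database size has reached 250 GB")
    if total_rows >= MAX_ROWS:
        reasons.append("database has reached 20 million rows")
    if unique_items_viewed_per_day >= MAX_UNIQUE_ITEMS_VIEWED_PER_DAY:
        reasons.append("views per day have reached 500,000 unique items")
    return reasons


if __name__ == "__main__":
    print(reporting_db_warnings(260, 12_000_000, 100_000))
```

An empty list means none of the indicators has been reached; analytics processing can still slow down for other reasons.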

How to make your search components and databases redundant

Your search architecture supports high availability when you host redundant search components and databases on separate fault domains. We recommend designing your search topology with redundant search databases and components. All the sample search architectures that Microsoft tested have redundant search components and databases, so you might find it useful to study these samples when working on your own topology (see Enterprise Search Architectures for SharePoint 2016).

Follow these guidelines:

Guidelines
  • Make the index redundant
  • Make crawling, content processing, query processing, analytics processing, and search administration redundant
  • Make search databases redundant

Make the index redundant

Your index is redundant if it has two or more index replicas per index partition. If a server hosting an index replica fails, this might reduce performance, but search can still serve queries and index items. But if the environment requires the same performance at all times, search needs more redundant index components. For example, suppose you designed your search topology with two replicas per partition to reduce the waiting time for queries, and your environment requires a short waiting time for queries all the time. In that case, increase the number of index replicas per partition.

All partitions must have the same number of replicas. An index component represents one index replica. So, if you want two replicas of the index, you need twice as many index components as index partitions. For example, with SharePoint Server 2016, a redundant index with 80 million items requires four partitions. Eight index components represent the four partitions when using two replicas for each partition.

If you add index components as replicas to existing partitions on a running installation, search automatically seeds the new replicas with data from the index partition. It can take several hours before the new replicas are operational.

Make crawling, content processing, query processing, analytics processing, and search administration redundant

Let's use the crawl component as an example. If you need to take down one of the servers hosting a crawl component for maintenance, this might reduce the freshness of results, but search can still crawl all the content. But if the environment requires the same freshness of results all the time, search needs more redundant crawl components. For example, suppose you designed your search topology with three crawl components and you want the same freshness of results even if two crawl component servers fail. In that case, add two more crawl components.

The search administration component is an exception to this principle. One search administration component has enough capacity for a search topology of any size. So, two search administration components are enough for redundancy.

Content processing components balance the load among each other, so redundant content processing components increase the capacity to process items.
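The redundancy rule from the crawl component example is simple addition, shown here as a minimal Python sketch. The function name and the component type strings are assumptions, and the sketch assumes each component of a given type runs on its own server, so that one server failure removes one component.

```python
def components_to_deploy(needed_for_performance, tolerated_server_failures,
                         component_type="crawl"):
    """Number of components to deploy so performance survives server failures.

    Hypothetical helper. The search administration component is the exception:
    one has enough capacity for any topology, so two are enough for redundancy.
    """
    if component_type == "search_administration":
        return 2
    return needed_for_performance + tolerated_server_failures


if __name__ == "__main__":
    # Three crawl components, same freshness even if two servers fail.
    print(components_to_deploy(3, 2))
```

For the example above, the sketch returns five crawl components, that is, the original three plus two more.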

Make search databases redundant

To make your search databases redundant, use the high availability alternatives that SQL Server offers (see Create a high availability architecture and strategy for SharePoint Server).

Step 3: Choose to run the servers physically or virtually

When you originally planned your search architecture, you decided to use physical servers, virtual machines, or a mix. Consider whether that decision is still valid. If you now have many more search components, you might want to use virtual machines to make managing the architecture easier. For example, it's easier to replace a faulted virtual machine than a physical machine. Note also that although a virtual environment is easier to manage, its performance level can sometimes be slightly lower than that of a physical environment. A physical server can host more search components on the same server than a virtual server. You'll find useful guidance in Overview of farm virtualization and architectures for SharePoint 2013.

Step 4: Which server to host which search component or database?

Now that you've redesigned your search topology, your next step is to assign the search and database components to physical or virtual servers. There isn't one optimal way to assign search components to physical servers or virtual machines, but we've got guidelines for you:

One search component type per server

Each physical server or virtual machine can host only one search component of each type. The index component is an exception: a physical server or virtual machine can host up to four index components. You can read about these limits in Search limits.

Separate bulk processing and real-time components from each other

Avoid mixing bulk processing and real-time processing search components on the same physical server or virtual machine. Crawl, content processing, and analytics processing components perform bulk processing. Index and query processing components perform real-time processing.
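The two placement guidelines above lend themselves to a quick check of a planned server assignment. The following Python sketch is illustrative only: the component type strings and the function name are invented for this example, and the search administration component is treated as neither bulk nor real-time because the text does not classify it.

```python
from collections import Counter

BULK = {"crawl", "content_processing", "analytics_processing"}
REAL_TIME = {"index", "query_processing"}
MAX_INDEX_COMPONENTS_PER_HOST = 4


def placement_violations(components):
    """List guideline violations for the components planned on one host.

    `components` is a list of type strings, for example ["index", "index"].
    """
    counts = Counter(components)
    problems = []
    for kind, count in counts.items():
        limit = MAX_INDEX_COMPONENTS_PER_HOST if kind == "index" else 1
        if count > limit:
            problems.append(
                f"{count} {kind} components exceed the limit of {limit} per host")
    kinds = set(counts)
    # Mixing is something to avoid rather than a hard limit, so report it too.
    if kinds & BULK and kinds & REAL_TIME:
        problems.append("bulk and real-time components share this host")
    return problems


if __name__ == "__main__":
    print(placement_violations(["index", "index", "query_processing"]))
    print(placement_violations(["crawl", "index"]))
```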

Don't mix competing search components

Avoid mixing search components on a physical server or virtual machine if the components will compete for the same resources. Here's a table that illustrates the relative amount of resources that each component needs.

Figure: Overview of the resources that each search component requires

For example, it might not be a good idea to put a crawl component and an analytics processing component on the same server, because they both use a lot of network bandwidth. But if the physical server or virtual machine has enough network capacity, the components won't compete.

Another example is the extra-large search architecture sample that Microsoft has estimated. Here we've put the crawl and search administration components on separate virtual machines. This is good for the speed of crawling, because the two components might otherwise compete for processor resources.

Use failure domains

Assign redundant search components to hosts in separate failure domains.

Step 5: Which hardware requirements should I be aware of?

The next step is to plan the hardware you'll need:

Choose amount of hardware resources for the host servers

Each search component and search database requires a minimum amount of hardware resources from the host server to perform well. But the more hardware resources you have, the better the performance of your search architecture will be. So it's a good idea to have more than the minimum amount of hardware resources. The resources each search component requires depend on the workload, which is mostly determined by the crawl rate, the query rate, and the number of indexed items.

For example, when hosting virtual machines on Windows Server 2008 R2 Service Pack 1 (SP1), you can't use more than four CPU cores per virtual machine. With Windows Server 2012 or newer, you can use eight or more CPU cores per virtual machine. Then you can scale up with more CPU cores for each virtual machine instead of scaling out with more virtual machines. Set up the servers or virtual machines that host the same search components with the same hardware resources. Let's use the index component as an example. When you host index partitions on virtual machines, the virtual machine with the weakest performance determines the performance of the overall search architecture.

General storage

Make sure that each host server has sufficient disk space for the base installation of the Windows Server operating system and for the SharePoint Server 2016 program files. The host server also needs free hard disk space for diagnostics such as logging, debugging, and creating memory dumps, for daily operations, and for the page file. Normally, 80 GB of disk space is enough for the Windows Server operating system and for the SharePoint Server 2016 program files.

Add storage for the SQL log space for each database server. If you don't set the database server to back up the databases often, the SQL log space uses lots of storage. For more information about how to plan SQL databases, see Storage and SQL Server capacity planning and configuration (SharePoint Server).

The minimum storage that the analytics reporting database requires can vary. This is because the amount of storage depends on how users interact with SharePoint Server 2016. When users interact frequently, there are usually more events to store. Check the amount of storage your current search architecture uses for the analytics database, and assign at least this amount for your redesigned topology.

Minimum resources for the index component

These are the minimum resources a server or virtual machine must have to host one index component, or to host one index component and one query processing component:

Storage | Memory | Processor | Network bandwidth
500 GB for the index¹ | 32 GB¹ | 64-bit, 8 cores minimum¹ ² | 2 Gbps

¹ With SharePoint Server 2013, the minimum resources are 500 GB of storage, 16 GB of RAM, and four CPU cores.

² You can use 16 GB of RAM and four CPU cores with SharePoint Server 2016, but each index component can then hold at most 10 million items (instead of 20 million items).
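The item limits in these footnotes translate directly into a partition count. The following Python sketch is only a planning aid under stated assumptions: the function names are hypothetical, and the default of two replicas per partition is an assumption for redundancy, not a figure from this section.

```python
# Item limits per index component, taken from the footnotes above.
ITEMS_PER_COMPONENT_FULL = 20_000_000     # 32 GB RAM and 8 CPU cores
ITEMS_PER_COMPONENT_REDUCED = 10_000_000  # 16 GB RAM and 4 CPU cores


def index_partitions_needed(total_items: int, reduced_hardware: bool = False) -> int:
    """Return how many index partitions are needed to hold total_items."""
    if total_items < 0:
        raise ValueError("total_items must not be negative")
    capacity = ITEMS_PER_COMPONENT_REDUCED if reduced_hardware else ITEMS_PER_COMPONENT_FULL
    # Integer ceiling division; always keep at least one partition.
    return max(1, (total_items + capacity - 1) // capacity)


def index_components_needed(total_items: int,
                            replicas_per_partition: int = 2,
                            reduced_hardware: bool = False) -> int:
    """Return the total number of index components (partitions x replicas).

    replicas_per_partition=2 is an assumed default, not a documented value.
    """
    if replicas_per_partition < 1:
        raise ValueError("replicas_per_partition must be at least 1")
    return index_partitions_needed(total_items, reduced_hardware) * replicas_per_partition


if __name__ == "__main__":
    print(index_partitions_needed(45_000_000))                         # 3
    print(index_partitions_needed(45_000_000, reduced_hardware=True))  # 5
    print(index_components_needed(45_000_000))                         # 6
```

With 45 million items, the sketch yields three partitions on hosts with the full resources and five partitions on hosts with the reduced resources.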

Minimum resources for the analytics processing component

These are the minimum resources a server or virtual machine must have to host one analytics processing component:

Storage | Memory | Processor | Network bandwidth
300 GB for local processing of analytics | 8 GB | 64-bit, 4 cores minimum, 8 cores recommended | 2 Gbps

If the server hosts one analytics processing component and one or more bulk processing components, increase memory to 16 GB.

Minimum resources for the crawl, content processing, query processing, and search administration components

These are the minimum resources a server or virtual machine must have to host one of these components:

Storage | Memory | Processor | Network bandwidth
Not required | 8 GB | 64-bit, 4 cores minimum, 8 cores recommended | 2 Gbps

If the server hosts two or more of these components, increase memory to 16 GB.
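The memory rule for these four component types can be written as a small lookup. This is a minimal sketch: the component identifiers and the function name are hypothetical labels, and the function deliberately rejects index and analytics processing components because the section doesn't state how their minimums combine with these.

```python
# Hypothetical identifiers for the four component types covered by this table.
LIGHT_COMPONENTS = {
    "crawl",
    "content_processing",
    "query_processing",
    "search_administration",
}


def minimum_memory_gb(components: list[str]) -> int:
    """Return the minimum RAM (GB) for a host running only the listed components."""
    unknown = set(components) - LIGHT_COMPONENTS
    if unknown:
        raise ValueError(f"not covered by this rule: {sorted(unknown)}")
    if not components:
        return 0  # nothing hosted, so no search-specific minimum applies
    # One component needs 8 GB; two or more need 16 GB.
    return 8 if len(components) == 1 else 16


if __name__ == "__main__":
    print(minimum_memory_gb(["crawl"]))                        # 8
    print(minimum_memory_gb(["crawl", "content_processing"]))  # 16
```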

The query processing component requires good network bandwidth. The number of index partitions and the size of queries and results drive this need. For example, at 20 queries per second per query processing component (20 QPS/QPC) with an index of 20 index partitions, the server or virtual machine hosting the query processing component sees 200 Mbps of incoming traffic and 100 Mbps of outgoing traffic.
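To get a rough feel for other workloads, the single data point above can be scaled. The sketch below assumes that traffic grows linearly with both the query rate and the number of index partitions. That linear model is an assumption made here for illustration, not something this section states, so treat the results as order-of-magnitude estimates and measure real traffic before you size the network.

```python
# Reference point given in the text: 20 QPS/QPC with 20 index partitions.
REFERENCE_QPS = 20
REFERENCE_PARTITIONS = 20
REFERENCE_IN_MBPS = 200
REFERENCE_OUT_MBPS = 100


def estimate_qpc_traffic_mbps(qps: float, partitions: int) -> tuple[float, float]:
    """Estimate (incoming, outgoing) Mbps for one query processing component.

    ASSUMPTION: traffic scales linearly with qps * partitions, anchored
    at the one reference point from the text.
    """
    if qps < 0 or partitions < 0:
        raise ValueError("qps and partitions must not be negative")
    scale = (qps * partitions) / (REFERENCE_QPS * REFERENCE_PARTITIONS)
    return REFERENCE_IN_MBPS * scale, REFERENCE_OUT_MBPS * scale


if __name__ == "__main__":
    incoming, outgoing = estimate_qpc_traffic_mbps(qps=10, partitions=20)
    print(f"incoming ~{incoming:.0f} Mbps, outgoing ~{outgoing:.0f} Mbps")
```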

Minimum resources for search databases

These are the minimum resources a server or virtual machine must have to host one or more search databases:

Storage | Memory | Processor | Network bandwidth
The storage that the analytics reporting database requires varies with how, and how often, the search environment uses analytics. Use the current amount of storage for the analytics reporting database as a guideline. | 8 GB for small deployments; 16 GB for medium deployments | 64-bit, 4 cores | 2 Gbps

Plan storage performance

The speed of the storage affects search performance. Make sure that your storage is fast enough to handle the traffic from the search components and databases. Disk speed is measured in I/O operations per second (IOPS).

How you distribute data from the search components and from the operating system across your storage also affects search performance. It's a good idea to:

  • Split the Windows Server operating system files, the SharePoint Server 2016 program files, and the diagnostic logs across three separate storage volumes or partitions with normal performance.

  • Store the search component data on a separate storage volume or partition. For index components, this storage must also have high performance.

    Note

    You can set a custom location for search component data when you install SharePoint Server 2016 on a host. Any search component on the host that needs to store data stores it in this location. To change this location later, you have to reinstall SharePoint Server 2016.

Choose type of storage

For an overview of storage architectures and disk types, see Storage and SQL Server capacity planning and configuration (SharePoint Server 2016). The servers that host the index, analytics processing, and search administration components, or the search databases, require storage that maintains low latency while providing sufficient I/O operations per second (IOPS). The following tables show how many IOPS each of these search components and databases requires.

If you deploy shared storage such as SAN or NAS, the peak disk load of one search component typically coincides with the peak disk load of another search component. To get the number of IOPS that search requires from the shared storage, add up the IOPS requirements of each of these components.

Search component IOPS requirements

Component name | Component details | IOPS requirements | Use of separate storage volume/partition
Index component | Uses storage when merging the index and when handling and responding to queries. | 300 IOPS for 64 KB random reads; 100 IOPS for 256 KB random writes; 200 MB/s for sequential reads; 200 MB/s for sequential writes | Yes
Analytics component | Analyzes data locally, in bulk processing. | No | Yes
Crawl component | Stores downloaded content locally before it sends the content to a content processing component. Storage is limited by network bandwidth. | No | Yes
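Because peak loads tend to coincide on shared storage, the per-component figures are added together. The sketch below sums the random-read and random-write IOPS of the index components and lets you add any other IOPS figures you have measured. Adding reads and writes into a single per-component number is a simplifying assumption, and the function and parameter names are hypothetical.

```python
from typing import Iterable

# Random I/O figures for one index component, from the table above.
INDEX_RANDOM_READ_IOPS = 300   # 64 KB random reads
INDEX_RANDOM_WRITE_IOPS = 100  # 256 KB random writes


def shared_storage_iops(index_components: int,
                        other_iops: Iterable[float] = ()) -> float:
    """Return the total IOPS to request from shared storage.

    ASSUMPTION: read and write peaks of each index component coincide,
    so they are simply added. other_iops holds any extra requirements
    (for example, search databases) that you have estimated separately.
    """
    if index_components < 0:
        raise ValueError("index_components must not be negative")
    per_index = INDEX_RANDOM_READ_IOPS + INDEX_RANDOM_WRITE_IOPS
    return index_components * per_index + sum(other_iops)


if __name__ == "__main__":
    print(shared_storage_iops(4))                    # 1600
    print(shared_storage_iops(4, other_iops=[500]))  # 2100
```

The sequential throughput figures (200 MB/s for reads and for writes) are not IOPS and need to be budgeted separately.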

Search database IOPS requirements

Database name | IOPS requirements | Typical load on I/O subsystem
Crawl database | Medium to high IOPS | 10 IOPS per 1 document per second (DPS) crawl rate.
Link database | Medium IOPS | 10 IOPS per 1 million items in the search index.
Search administration database | Low IOPS | Not applicable.
Analytics reporting database | Medium IOPS | Not applicable.
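The two quantified rows of the database table can be turned into a quick estimate. This is a minimal sketch with hypothetical function names. It covers only the crawl and link databases, because the table gives no numeric load for the search administration and analytics reporting databases.

```python
def crawl_db_iops(crawl_rate_dps: float) -> float:
    """Typical crawl database load: 10 IOPS per document per second crawled."""
    if crawl_rate_dps < 0:
        raise ValueError("crawl_rate_dps must not be negative")
    return 10 * crawl_rate_dps


def link_db_iops(indexed_items: int) -> float:
    """Typical link database load: 10 IOPS per 1 million indexed items."""
    if indexed_items < 0:
        raise ValueError("indexed_items must not be negative")
    return 10 * indexed_items / 1_000_000


if __name__ == "__main__":
    # Example inputs are illustrative only: 50 DPS crawl rate, 40 million items.
    print(crawl_db_iops(50))         # 500
    print(link_db_iops(40_000_000))  # 400.0
```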

Choose how your search architecture supports high availability

If you aren't familiar with high availability strategies, this article will get you started: Create a high availability architecture and strategy for SharePoint Server. When you host redundant search components and databases on separate fault domains, an outage in one part of the farm doesn't take down the complete service. Search performance does degrade, however, because the search components can no longer share the load. To reduce the chance of losing a single server, it's a good idea to improve local redundancy. For each host server in your search architecture:

  • Use RAID storage on each server.

  • Install multiple redundant network connections on each server.

  • Install multiple redundant power supplies with independent wiring, or an uninterruptible power supply (UPS), for each server.

All of the sample search architectures host redundant search components on independent servers. In the sample search architectures, the rightmost host in each host pair is redundant. Here's the large search architecture with the redundant hosts outlined:

Diagram of a large enterprise search farm that shows which servers host redundant search components.