Estimate capacity and performance for enterprise managed metadata in SharePoint Server 2010

Applies to: SharePoint Server 2010

This article provides recommendations related to sizing and performance optimization of the managed metadata service in Microsoft SharePoint Server 2010. This article also provides best practices on how to configure the service and structure the service application databases for maximum performance.

The information in this article can help you understand the tested performance and capacity limits of the managed metadata service. Use this information to determine whether your planned deployment falls within acceptable performance and capacity limits.

In this article:

  • Test farm characteristics

  • Test results and recommendations

For general information about capacity management and how to plan for SharePoint Server 2010, see Capacity management and sizing for SharePoint Server 2010.

Test farm characteristics

Dataset

The tests were first run against the baseline dataset, which simulates a typical customer dataset. Then, a single variable was changed and the same tests were run again to determine the effect that changing that variable had on performance. In most cases, variables were tested independently. However, in some cases, certain important variables were tested in combination.

Baseline dataset

The baseline dataset contains the data shown in the following table.

Data                                                    Detail
Term set groups                                         100
Term sets                                               1,000 (10 per group)
Managed terms (excludes enterprise keywords)            20,000 (20 per term set)
Enterprise keywords                                     80,000
Total terms (managed terms plus enterprise keywords)    100,000
Labels                                                  100,000 (1 per term)
Term label length                                       250 characters per label

The number of terms in the baseline dataset is shown in the following graph.

[Graph: Ratio of keywords to terms]

In these tests, the ratio of enterprise keywords to managed terms is 4:1 in all datasets.
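
A term store of the same shape can be generated programmatically. The following sketch uses the SharePoint server object model (Microsoft.SharePoint.Taxonomy) to build the baseline proportions; the site URL and the service application name are placeholders for your environment, and this is an illustration rather than the harness that produced these results.

  using System;
  using Microsoft.SharePoint;
  using Microsoft.SharePoint.Taxonomy;

  class BaselineDatasetBuilder
  {
      static void Main()
      {
          // Placeholder site URL and service application name; adjust for your farm.
          using (SPSite site = new SPSite("http://sharepoint"))
          {
              TaxonomySession session = new TaxonomySession(site);
              TermStore termStore = session.TermStores["Managed Metadata Service"];

              // 100 groups x 10 term sets x 20 terms = 20,000 managed terms.
              for (int g = 0; g < 100; g++)
              {
                  Group group = termStore.CreateGroup("Group " + g);
                  for (int s = 0; s < 10; s++)
                  {
                      TermSet termSet = group.CreateTermSet("TermSet " + g + "-" + s);
                      for (int t = 0; t < 20; t++)
                      {
                          // One default label per term, padded to 250 characters.
                          termSet.CreateTerm(MakeLabel(g, s, t), 1033);
                      }
                  }
                  termStore.CommitAll(); // Commit per group to keep batches small.
              }

              // 80,000 enterprise keywords in the flat Keywords term set (4:1 ratio).
              TermSet keywords = termStore.KeywordsTermSet;
              for (int k = 0; k < 80000; k++)
              {
                  keywords.CreateTerm("keyword-" + k, 1033);
                  if (k % 1000 == 999)
                  {
                      termStore.CommitAll();
                  }
              }
              termStore.CommitAll();
          }
      }

      // Builds a unique label padded to the 250-character length used in the tests.
      static string MakeLabel(int g, int s, int t)
      {
          return ("Term-" + g + "-" + s + "-" + t).PadRight(250, 'x');
      }
  }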

Workload

Several key characteristics of the managed metadata service can potentially affect the performance of the service. These characteristics include the following:

  • Characteristics of the data in the service

    • Term label length

    • Number of terms per term store

    • Number of term labels per term store

  • Characteristics of the load on the service

    • Read/write mix

    • Size of term store cache

    • Number of service applications per database server

  • Performance of service timer jobs (Content Type Hub, Content Type Subscriber, Enterprise Metadata site data update, Taxonomy Update Scheduler)

The specific capacity and performance test results presented in this article might differ from the test results in real-world environments, and are intended to provide a starting point for the design of an appropriately scaled environment. After you have completed your initial system design, test the configuration to determine whether the system will support how you have configured the managed metadata service in your environment.

Test scenarios

The following tests were used for each test scenario (a code sketch of these operations appears after the list):

  • Create a term (Write test)

    This test creates a term in an existing term set.

  • Get suggestions (Read-only test)

    This test searches for terms that begin with a single-character string, as used in the suggestions retrieval of the keywords field.

  • Get matches (Read-only test)

    This test searches for terms that match a provided string, as in the value matching of the keywords field or metadata field.

  • Get child terms in a term set by using paging (Read-only test)

    This test retrieves child terms in a term set by using paging.

  • Validate a term (Read-only test)

    This test validates a single term, as in the value validation of the keywords field or metadata field.
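
For reference, the following sketch approximates these read and write operations by using the server object model. The test harness calls the service directly, so treat this only as an illustration; the site URL and term labels are placeholders, and the Skip/Take paging shown here stands in for the paging that the service performs internally.

  using System;
  using System.Linq;
  using Microsoft.SharePoint;
  using Microsoft.SharePoint.Taxonomy;

  class TermStoreOperations
  {
      static void Main()
      {
          using (SPSite site = new SPSite("http://sharepoint")) // placeholder URL
          {
              TaxonomySession session = new TaxonomySession(site);
              TermStore termStore = session.TermStores[0];
              TermSet termSet = termStore.KeywordsTermSet;

              // Write test: create a term in an existing term set.
              termSet.CreateTerm("new keyword", 1033);
              termStore.CommitAll();

              // Read tests (matching and validation): find terms whose labels
              // match the provided string; a non-empty result validates the term.
              TermCollection matches = session.GetTerms("new keyword", true);
              bool isValid = matches.Count > 0;

              // Read test (paging): retrieve child terms one page at a time.
              const int pageSize = 50;
              var firstPage = termSet.Terms.Cast<Term>()
                                     .Skip(0).Take(pageSize).ToList();

              Console.WriteLine("Valid: {0}; first page: {1} terms",
                  isValid, firstPage.Count);
          }
      }
  }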

Test mix

Most tests (except tests of the effects of write operations) used the same mix of read and write operations, which is shown in the following table.

Test                                             Percentage of mix
Create a term                                    0.125%
Get suggestions                                  72.875%
Get matches                                      11%
Get child terms in a term set by using paging    5%
Validate a term                                  11%

"Get suggestions" is the most frequently used end-user operation. This is why it is heavily weighted in the test mix.

Hardware, settings, and topology

The test farm is a three-server farm in which the Web server, the application server, and the database server each run on a separate computer.

Web server and application server

The Web server and the application server used identical hardware and were configured as shown in the following table.

Component                     Web server and application server configuration
Processors                    Two quad-core, 2.33 GHz each
RAM                           8 GB
Operating system              Windows Server 2008, 64-bit
Size of the system drive      300 GB
Number of network adapters    Two
Network adapter speed         1 gigabit per second
Authentication                Windows Basic
Software version              SharePoint Server 2010
Services that run locally     Central Administration
                              Microsoft SharePoint Foundation Incoming E-Mail
                              Microsoft SharePoint Foundation Web Application
                              Microsoft SharePoint Foundation Workflow Timer Service

Note
Results may vary depending on which product version is used.

Database server

The database server was configured as shown in the following table.

Component                     Database server configuration
Processors                    Four quad-core, 3.19 GHz each
RAM                           16 GB
Operating system              Windows Server 2008, 64-bit
Storage                       15 disks, 300 GB and 15,000 RPM each
Number of network adapters    Two
Network adapter speed         1 gigabit per second
Authentication                Windows NTLM
Software version              Microsoft SQL Server 2008

Test results and recommendations

This section describes test results and gives recommendations for the following characteristics:

  • Data characteristics

  • Load characteristics

  • Performance of timer jobs

Data characteristics

Effect of term label length

These tests were performed on the baseline term store, first by using a term label length of 5 characters and again by using a term label length of 250 characters. In this test mix, write operations represent a much greater percentage of the total than in the mix of read and write operations that was used for most other tests.

Test                                             Percentage of mix
Create a term                                    5%
Get suggestions                                  70%
Get matches                                      10%
Get child terms in a term set by using paging    5%
Validate a term                                  10%

The requests-per-second (RPS) results for different term label lengths are shown in the following graph. This data suggests that term label length has an insignificant effect on average RPS for both loads.

[Graph: RPS versus term label length]

CPU and memory usage are shown in the following graphs.

[Graph: CPU utilization]

[Graph: RAM utilization]

As shown by the results, the effect of term label length on CPU and memory usage for the Web server and application server is insignificant. However, the load on the database server increases as the term label length increases.

Conclusions and recommendations: Term label length

Term label length does not have a significant effect on overall system performance, although load on the database server increases somewhat as labels grow longer.

Terms per term store

These tests were performed on the baseline term store, and then the term store was scaled up to 1 million terms by increasing the number of managed terms and keywords proportionally.

With the keywords term set removed from the term store for testing, performance did not differ significantly among term stores that contained 100,000 terms, 500,000 terms, and 1 million terms, as shown in the following two graphs.

[Graph: RPS]

[Graph: CPU utilization]

When the system is under the specified test load, the time that is required to create a keyword increases significantly as the number of keywords increases from 16,000 to 800,000. This trend can be seen in the next graph.

[Graph: Time to create a keyword]

Conclusions and recommendations: Terms per term store

The number of terms in a term store does not significantly affect system performance when very few users create keywords or when the number of keywords is small.

The keywords term set is stored as a flat list, unlike other term sets, which can have a more complex structure. The larger the flat list grows, the longer it takes to check whether a keyword that has the same name already exists. Therefore, it takes longer to create a keyword in a large keywords term set.

The term store administrator should limit the size of the keywords term set to prevent latency when users create keyword terms. One approach is to periodically move keywords into a regular term set, which can improve performance and contribute to better organization of term data; a code sketch of this approach appears below.

Any term set that contains more than 150,000 terms in a flat list is subject to latency and performance issues. One alternative is to use a managed term set, which typically holds a structured collection of terms. For more information about term sets, see Managed metadata overview (SharePoint Server 2010).
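
The following sketch shows one hedged way to move keywords into a managed term set by using the server object model. The destination group and term set names are placeholders, Term.Move operates within the same term store, and handling of name collisions in the destination is omitted.

  using System.Collections.Generic;
  using System.Linq;
  using Microsoft.SharePoint;
  using Microsoft.SharePoint.Taxonomy;

  class KeywordPromotion
  {
      static void Main()
      {
          using (SPSite site = new SPSite("http://sharepoint")) // placeholder URL
          {
              TaxonomySession session = new TaxonomySession(site);
              TermStore termStore = session.TermStores[0];

              // Placeholder destination group and term set for promoted keywords.
              TermSet destination = termStore.Groups["Keyword Archive"]
                                             .TermSets["Promoted Keywords"];

              // Snapshot the collection first, because moving terms while
              // enumerating the live collection can invalidate the enumerator.
              List<Term> keywords = termStore.KeywordsTermSet.Terms
                                             .Cast<Term>().ToList();
              foreach (Term keyword in keywords)
              {
                  // Name collisions in the destination are not handled here.
                  keyword.Move(destination);
              }
              termStore.CommitAll();
          }
      }
  }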

Common errors

As the total number of terms in the term store approaches 500,000, users might experience various exceptions when they attempt to access the term store. By checking the related Unified Logging Service (ULS) log, the farm administrator can find the exception and determine whether it originates on the client or the server.

TimeoutException

When TimeoutException errors occur, you can modify the time-out value in the client.config file or in the web.config file for the managed metadata service. The client.config file can be found in the %PROGRAMFILES%\Microsoft Office Servers\14.0\WebClients\Metadata folder. The web.config file can be found in the %PROGRAMFILES%\Microsoft Office Servers\14.0\WebServices\Metadata folder. There are four time-out values:

  • receiveTimeout A time-out value that specifies the interval of time provided for a receive operation to be completed.

  • sendTimeout A time-out value that specifies the interval of time provided for a send operation to be completed.

  • openTimeout A time-out value that specifies the interval of time provided for an open operation to be completed.

  • closeTimeout A time-out value that specifies the interval of time provided for a close operation to be completed.

These time-out values are defined in the customBinding section. You can increase the time-out value based on the specific operation that is timing out. For example, if the time-out occurs when messages are received, you only need to increase the value of receiveTimeout.
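
For example, the binding entry might look like the following after the receive time-out has been raised. This is a sketch only: the binding name shown here is a placeholder, so locate the customBinding entry that already exists in your client.config or web.config file and change only the attribute that corresponds to the failing operation.

  <customBinding>
    <binding name="MetadataWebServiceHttp"
             closeTimeout="00:02:00"
             openTimeout="00:02:00"
             receiveTimeout="00:10:00"
             sendTimeout="00:02:00">
      <!-- Keep the existing binding elements unchanged. -->
    </binding>
  </customBinding>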

Note

Separate time-out values exist for HTTP and HTTPS; modify the values for the protocol that your deployment uses.

For more information about time-out values, see <customBinding> (https://go.microsoft.com/fwlink/p/?LinkId=214213).

ThreadAbortException

When ThreadAbortException errors occur, you can increase the execution time-out value in the web.config file for the specific Web application. The web.config file is located in the %inetpub%\wwwroot\wss\VirtualDirectories\<Application Port Number> folder. For example, if the request is for TaxonomyInternalService on a Web application, first identify the web.config file for the Web application, and then add the following code into the configuration node.

  <location path="_vti_bin/TaxonomyInternalService.json">
    <system.web>
      <httpRuntime executionTimeout="3600" />
    </system.web>
  </location>

Note

The default executionTimeout value is 360 seconds.

Term labels per term store

This test was performed on a baseline term store that had 100,000 terms. During the test, the number of labels was incremented for each term, as shown in the following graph.

[Graph: Average RPS]

The average RPS decreases only slightly as the number of labels increases. CPU and memory usage on the Web server, application server, and database server increase only slightly, as shown in the following graphs.

[Graph: Average CPU utilization]

[Graph: Average RAM utilization]

Conclusions and recommendations: Term labels per term store

The number of labels does not have a significant effect on system performance when the average number of labels per term is less than four.

Summary: Data characteristics

This section reviews the test results for three different characteristics of the term store data: term label length, the number of terms per term store, and the number of term labels per term store. Trends revealed by these tests include the following:

  • Increasing the term label length to 250 characters does not have a significant effect on term store performance.

  • Increasing the average number of labels per term to four does not have a significant effect on term store performance.

  • Increasing the number of terms to 1 million does not have a significant effect on term store performance.

  • When the term store contains more than 150,000 terms in a term set that uses a flat list, it can take a long time to add new terms to the term store.

Load characteristics

Impact of changes in the read/write mix

These tests were performed by using the baseline read/write operation test mix, with the percentage of "Create taxonomy item" operations as the variable. The following table shows the specific operations that were used in the baseline test mix and their associated percentages.

Test                                   Percentage of load
Get suggestions                        73%
Create taxonomy item                   0%
Get matches                            11%
Get paged child terms in a term set    5%
Validate a term                        11%

For each successive test, the rate at which terms were created was increased. The following table shows, for each of the three tests, the average number of terms created per minute and the resulting average RPS.

Average terms created per minute    Average RPS
0                                   182
8.4                                 157
20                                  139

As shown in the following graph, RPS decreases as the average number of terms created per minute increases.

[Graph: RPS versus average terms created per minute]

CPU and memory usage are displayed in the following two graphs.

[Graph: CPU versus average number of terms created per minute]

[Graph: RAM versus average terms created per minute]

Conclusions and recommendations: Impact of changes in the read/write mix

Term store performance is expected to decrease as the percentage of write operations increases, because write operations hold exclusive locks on the data, which delays the execution of read operations. Based on the test data, RPS does not decrease significantly until the average number of terms created reaches 20 per minute. However, an average term creation rate of 20 per minute is fairly high and does not ordinarily occur, especially in a mature term set. Making a term set read-only can improve performance by eliminating write operations, as in the sketch that follows.
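
For example, a term store administrator can close a term set so that users can no longer create terms in it, which is the closest server object model switch to making it read-only. This is a minimal sketch; the site URL, group name, and term set name are placeholders.

  using Microsoft.SharePoint;
  using Microsoft.SharePoint.Taxonomy;

  class CloseTermSet
  {
      static void Main()
      {
          using (SPSite site = new SPSite("http://sharepoint")) // placeholder URL
          {
              TaxonomySession session = new TaxonomySession(site);
              TermStore termStore = session.TermStores[0];

              // Placeholder group and term set names.
              TermSet termSet = termStore.Groups["Group"].TermSets["Mature Term Set"];

              // Close the term set so that users can no longer add terms to it;
              // read operations are unaffected.
              termSet.IsOpenForTermCreation = false;
              termStore.CommitAll();
          }
      }
  }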

Term store cache

The term store cache exists on all Web servers in a farm. It can contain term set groups, term sets, and terms. These tests were performed to show how the memory footprint of the cache object changes as the number of terms increases. Other factors also affect the cache size, for example, term descriptions, the number of labels, and custom properties. To simplify the test, every term in the baseline term store has no description or custom properties and has only one label of 250 characters.

The following graph shows how the memory footprint changes as the number of terms in the cache increases.

[Graph: Cache size versus number of terms viewed]

Conclusions and recommendations: Term store cache

Memory usage on the Web server increases linearly as the number of terms in the cache increases. This makes it possible to estimate the cache size if the number of terms is known. Based on the test data, memory usage should not be a performance issue for most systems.
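
As a purely hypothetical illustration of the linear relationship (the per-term cost here is not a measured value from these tests), if each cached term occupied roughly 1 KB, a fully populated cache for the 100,000-term baseline term store would consume on the order of 100 MB of Web server memory.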

Service applications consumed by a farm

This test shows the difference in performance between one and two managed metadata service applications that have their databases hosted on the same database server.

As shown in the following graph, under the same load, RPS decreases when a second service application is added, and it is expected to decrease further as more service applications are added.

[Graph: RPS for two service applications]

Latency for most operations is not significantly affected when additional service applications are added. However, unlike other operations, the "get suggestions" operation interacts with all available service applications. Therefore, latency for this operation increases as the number of service applications increases, as shown in the following graph. It is expected that this trend will continue as the number of service applications increases.

[Graph: Keyword suggestions latency]

As shown in the following graphs, database server CPU usage increases significantly when there are two service applications that have databases residing on the same server, but memory usage is not significantly increased.

[Graph: Average CPU utilization]

[Graph: Average RAM utilization]

Conclusions and recommendations: Service applications consumed by a farm

If you must maintain more than one managed metadata service application, make sure that latency for keyword suggestion operations is at an acceptable level. Note that network latency also contributes to total effective latency. We recommend that managed metadata service applications be consolidated as much as possible.

If a single SQL Server computer is used to host all service applications, the server must have enough CPU and memory resources to support acceptable performance targets.

Performance of timer jobs

This section shows the performance characteristics of two timer jobs in the managed metadata service: the Content Type Subscriber timer job and the Taxonomy Update Scheduler timer job. Both timer jobs enumerate the site collections in a given Web application, and can potentially run for a long time and consume significant system resources in a large farm.

Content Type Subscriber timer job

The Content Type Subscriber timer job distributes the published content types to all appropriate site collections of a Web application. The overall time that this timer job takes to run depends on many factors, such as the number of content types that need to be distributed, the number and type of fields in the content type, and the number of site collections. This test shows how the following scaling factors affect the overall time to distribute a content type:

  • The number of site collections in a Web application

  • The number of content types

The first test was performed by publishing 10 content types and distributing them to varying numbers of site collections. As shown in the following graph, the relationship between the time to distribute content types and the number of site collections is almost linear.

[Graph: Syndication time versus number of site collections]

In the second test, one content type and then ten content types were published to 1,000 site collections. The distribution time for ten content types is approximately 10 times the distribution time for one content type, again showing an almost linear increase.

[Graph: Syndication time versus number of content types]

Conclusions and recommendations: Content Type Subscriber timer job

Test results show that the average time for a single content type to be distributed to a single site collection is almost constant. Therefore, it is safe to run this timer job on a large collection of site collections. You can use the average distribution time to estimate how long the timer job will take to execute, given the number of site collections and the number of content types to distribute. If those numbers are extremely large, you might find that it takes hours or even days to run the timer job. Nevertheless, you can pause and resume this timer job, and content type publishing is not a frequent activity.
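
As a hypothetical illustration (the per-operation time here is not a measured value from these tests), if distribution averaged 1 second per content type per site collection, publishing 10 content types to 5,000 site collections would take roughly 10 × 5,000 × 1 second = 50,000 seconds, or about 14 hours.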

Note that the time that is required to execute this timer job can increase significantly if a content type pushdown occurs during the timer job, especially if many lists are involved. For more information about content type pushdown, see the Managed Metadata Connections section in the Managed metadata service application overview (SharePoint Server 2010).

Tip

When you try to publish a very large content type, you might see the following error:
WebException: The request was aborted.
The cause is that the size of the content type exceeds the 4 MB default maximum HTTP request size for the service application. To prevent this error, you can increase the maxRequestLength value in the web.config file for the service application.
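
For example, the setting might look like the following sketch. The maxRequestLength attribute of the httpRuntime element is specified in kilobytes, and the 20,480 KB (20 MB) value shown here is illustrative.

  <system.web>
    <!-- maxRequestLength is in KB; the ASP.NET default is 4096 KB (4 MB). -->
    <httpRuntime maxRequestLength="20480" />
  </system.web>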

Taxonomy Update Scheduler timer job

The Taxonomy Update Scheduler timer job keeps the hidden taxonomy list on every site collection of a Web application in sync with the term store. The overall time that this timer job takes to run depends on the number of items that need to be updated and the number of site collections that contain updated items. This test shows how the size of the hidden list and the number of site collections in the Web application affect the average time to update a single item for a site collection.

The following graph shows the relationship between the number of site collections and the average time to update one term in one site collection.

[Graph: Average time to update a term]

As shown in the following graph, the average time to update one term in one site collection increases slightly as the size of the hidden list increases.

[Graph: Average time to update a term in a hidden list]

Conclusions and recommendations: Taxonomy Update Scheduler timer job

An increase in the number of site collections does not have a significant effect on the average time to update a term in a site collection. Therefore, it is safe to run this timer job on a Web application that has a large number of site collections. You can estimate the overall execution time of the timer job by multiplying the average time to update a term in a site collection by the number of site collections and the average number of updated terms in each site collection. You can also pause and resume this timer job.
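
As a hypothetical illustration (the per-term time here is not a measured value from these tests), if updating one term in one site collection averaged 0.5 seconds, a Web application that has 1,000 site collections averaging 10 updated terms each would need roughly 1,000 × 10 × 0.5 seconds = 5,000 seconds, or about 83 minutes.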

The size of the taxonomy hidden list increases over time as more and more terms are used by the site collection. The timer job might take longer to execute as the hidden list grows in size.

See Also

Other Resources

Resource Center: Managed Metadata and Taxonomy in SharePoint Server 2010