Step 7: Design the Farm

02/26/2008

Published: February 25, 2008

Terminal servers in a farm must serve the same published applications and be configured identically. This ensures that users receive the same experience no matter which terminal server they connect to.

Each terminal server farm contains a unique set of applications and users, and the behavior of those users in the applications can vary widely. For these reasons, capacity planning and testing are necessary for each terminal server farm.

There are many variables involved in moving a set of applications from client computers to terminal servers. The goal here is to arrive at a reasonable estimate of capacity without requiring excessive modeling and precision. Implement the first set of users, and record the lessons learned; then repeat for the next set of users, adjusting the approach to reflect those lessons.

This step determines the form factor of the server in each farm and, therefore, the number of servers required to deliver the applications from the farm. Additional servers may then be specified for Terminal Services Web Access (TS Web Access), fault tolerance, load balancing, and maintenance.

Perform the following tasks for each farm, and record the results in a job aid like the sample in Appendix C.

Task 1: Select a Form Factor for the Server

The goal of this task is to determine the most appropriate type of hardware on which to deploy the terminal servers.

Form factor in this guide refers to the combination of the servers’ characteristics including:

Processor architecture (32-bit versus 64-bit)
Number of CPUs and their speed
Amount of memory installed
Disk storage capacity and disk subsystem design
Number of network card ports configured

To make this selection, start with the prerequisites for Windows Server 2008 as a minimum requirement, and then determine which of the following purchasing options will be used:

Use existing hardware. Organizations may already have server hardware resources that can be reconfigured and redeployed to publish applications through Terminal Services. The primary drawback of using existing machines is that the hardware configuration might not match the “ideal” configuration for the terminal server farm. These machines can differ significantly based on the age of the hardware, the system specifications, and other capabilities. Often, using existing hardware can decrease overall costs of implementation (cost avoidance) but may result in sacrificing standardization. Since the servers in each farm must publish the same applications, they should be of the same form factor in order to provide a consistent user experience. More powerful servers can be assigned a higher server weight in the farm so that the TS Session Broker directs more sessions to them. Where there is significant variation in the form factors of the available servers, additional farms may need to be added to the design in order to improve uniformity within each farm, which in turn drives up implementation and management costs.
Purchase new hardware with the organization’s standard form factor. The success of this approach will depend on how close the organization’s standard form factor is to the “ideal” configuration for the terminal server farm. Using a volume-purchased hardware configuration may lower overall costs of implementation (cost avoidance); but if it is not a close match to the ideal configuration, additional hardware units may need to be purchased in order to deliver the required performance and throughput from a farm.
Purchase new hardware with a form factor that can be selected to best fit the requirements of the farms. In this case, the organization has the opportunity to procure a hardware configuration that optimizes the Terminal Services implementation, but if it differs from previous standard configurations, the per unit acquisition cost and initial support costs may be higher.
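Where servers of different form factors must share a farm, the effect of the server weight mentioned under the existing-hardware option can be approximated with simple proportions. The following Python sketch assumes that sessions are distributed in proportion to each server's relative weight, which is a simplification of how TS Session Broker load balancing behaves. The function name, server names, and weight values are hypothetical.

```python
def expected_sessions(weights: dict, total_sessions: int) -> dict:
    """Split total_sessions across servers in proportion to their weights.

    Assumption: distribution is strictly proportional to weight.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one server needs a positive weight")
    return {name: total_sessions * w / total_weight for name, w in weights.items()}


# Hypothetical farm: one newer server weighted twice as heavily as two older ones.
farm_weights = {"ts-old-1": 100, "ts-old-2": 100, "ts-new-1": 200}
shares = expected_sessions(farm_weights, total_sessions=120)
```

Under these assumptions the newer server would carry about half of the sessions, which is only appropriate if it really has about twice the capacity of each older server.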

When purchasing new hardware, there is no precise practical means of choosing a form factor. In the best case, the testing in Task 2 of this step will be conducted across several form factors, and the configuration that performs best with the most users will be selected. It may be necessary to review this step and the defined strategy when applications are actually allocated to each farm. It is not uncommon to iterate several times through form factor decisions to find the optimal configuration.

Follow these guidelines, which are presented in priority order, to determine the ideal form factor for the servers:

64-bit versus 32-bit architecture. The 64-bit architecture removes the kernel address space limitations that restrict the number of terminal server sessions the operating system can support on the 32-bit architecture, so a 64-bit server can support more user sessions. The main drawback of migrating to the 64-bit architecture is significantly higher memory usage.
Use large memory. In the Terminal Services environment, where many users are sharing the same application, large memory can significantly improve performance because it allows each user’s work to remain in-memory rather than being swapped out to a physical disk. Additional memory will also reduce the disk I/O demand for this reason. If 64-bit architecture is being deployed instead of 32-bit, consider implementing more memory, perhaps twice as much memory.
Use more numerous, smaller disk spindles rather than fewer, larger disks. Consider using a RAID configuration for disk fault tolerance. The key to performance, though, is the total disk I/O operations per second (IOPS). To increase the IOPS of the disk subsystem, consider using more numerous, smaller disks rather than fewer but larger disks. This approach distributes the workload across more spindles, thereby increasing overall performance. The number of spindles used has a significant effect on the response times for file access. The disk activity generated on a typical terminal server system affects the following three areas:
- System files and application binaries
- Page files
- User profiles and user data
Ideally, these three areas should be supported by distinct storage devices. Use storage adapters with a battery-backed cache that allows write-back optimizations; controllers with write-back cache support offer improved support for synchronous disk writes. Synchronous disk writes are significantly more common on a terminal server system because each terminal server user has an individual registry hive, and registry hives are periodically saved to disk by using synchronous write operations.
Use multi-core CPUs. In the multi-user environment of terminal server, multiple processors and cores can help reduce CPU congestion.
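The spindle guideline above can be illustrated with simple arithmetic. The following Python sketch assumes that aggregate random IOPS scales roughly linearly with the number of spindles and ignores RAID write penalties and controller cache effects. The per-disk IOPS figure and the disk sizes are hypothetical placeholders, not measured values.

```python
def aggregate_iops(spindles: int, iops_per_spindle: int) -> int:
    """Rough aggregate IOPS, assuming linear scaling across spindles."""
    if spindles < 0 or iops_per_spindle < 0:
        raise ValueError("spindles and iops_per_spindle must be non-negative")
    return spindles * iops_per_spindle


# Hypothetical comparison of two layouts with the same raw capacity (600 GB).
few_large = aggregate_iops(spindles=2, iops_per_spindle=150)   # 2 x 300 GB disks
many_small = aggregate_iops(spindles=8, iops_per_spindle=150)  # 8 x 75 GB disks
```

Real results depend on the RAID level, the controller, and the workload, so an estimate like this is only a starting point for the load tests in the next task.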

Refer to the Performance Tuning Guidelines for Windows Server 2008 in the “Additional Reading” section of this step for more specific guidance on form factor for Terminal Services.

Whatever approach is chosen, the goal of this step is to select the server size and number that best meets the business requirements for security, service delivery, and fault tolerance with the fewest machines possible.

Document the selected form factor in the farm design job aid like the example in Appendix C, and then proceed to the scaling assessment in the next task.

Task 2: Determine the Number of Terminal Servers Required in the Farm

The optimal number of servers in a terminal server farm delivers a consistent, responsive user experience while balancing system utilization, growth capacity, and cost. In order to consistently arrive at that optimal number for each farm, start with the implementation of a small farm, follow one of the methodologies presented below, and measure the results to determine whether the farm delivers the expected performance. Learn from the experience and use it to adjust the method accordingly when planning the next farm.

Two methods are presented for sizing the terminal server farm:

The first method involves running multiple load tests using a fully configured server to see how many heavy, normal, and light users, respectively, a selected form factor can handle. The form factor could be adjusted after each test—for example, by adding more memory—and then re-running the test to optimize that form factor.
The second method uses measurements of current loads on the client computers and extrapolates them to determine the server load. A load test suite should then be run against a server of the selected form factor to confirm the results.

Choose the approach below that works best for your organization. Method 2 involves additional work at the beginning in order to gather the data from the workstation environment. If the load test scenarios can be readily set up, method 1 may be shorter and more effective.

Server Sizing Method 1: Estimating Load

A practical way to estimate server capacity before putting it into production is to load test a server of the selected form factor with a complete set of applications and a full complement of users.

While monitoring the performance of the server using the guidelines in Appendix D: “Server Performance Analyzing and Scaling,” use a load testing product or live tests. The Windows Server 2003 Deployment Kit provides terminal server capacity planning tools—Roboserver (Robosrv.exe) and Roboclient (Robocli.exe)—which include application scripting support. You can use these tools, which are available on the Windows Server 2003 Deployment Kit companion CD, to easily place and manage simulated loads on a server.

When performing a load test, load the server with consistent groups of users in blocks, and increase the load over time. Each group should be running a typical set of applications. Monitor the system until the utilization of the processor, memory, disk, or network interface card (NIC) exceeds acceptable limits. When the acceptable limit has been exceeded, subtract a comfortable measure of users from the number in that last test. This reduced number of users is the total number of users per server for the form factor. Consider removing an additional percentage of users to leave spare capacity on the server, which is sometimes referred to as buffer or headroom. The resulting number is the users per server capacity value. Record it in a job aid like the example in Appendix C.
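The ramp-up procedure above can be sketched in Python as follows. This is a minimal illustration, not a prescribed tool: the 80 percent utilization limit, the 10 percent headroom, the choice to fall back to the last block of users that stayed within limits, and all sample figures are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Utilization recorded after one block of users was added (hypothetical data)."""
    users: int
    cpu_pct: float
    mem_pct: float
    disk_pct: float
    nic_pct: float


def users_per_server(samples, limit_pct=80.0, headroom_pct=10):
    """Return the last user count within limits, reduced by the headroom percentage."""
    last_ok = 0
    for s in sorted(samples, key=lambda s: s.users):
        # Stop at the first block where any resource exceeds the limit.
        if max(s.cpu_pct, s.mem_pct, s.disk_pct, s.nic_pct) > limit_pct:
            break
        last_ok = s.users
    # Integer arithmetic rounds the result down, which is the conservative choice.
    return last_ok * (100 - headroom_pct) // 100


ramp = [
    Sample(10, 20, 30, 10, 5),
    Sample(20, 40, 45, 20, 10),
    Sample(30, 65, 60, 30, 15),
    Sample(40, 85, 75, 45, 20),  # processor exceeds the 80 percent limit here
]
capacity = users_per_server(ramp)
```

With this sample data the processor is the first resource to exceed the limit, so the estimate falls back to the 30-user block before headroom is removed.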

Run four suites of this test:

For heavy users, to establish the per-user cost of a heavy user.
For normal users, to establish the per-user cost of a normal user.
For light users, to establish the per-user cost of a light user.
For a mix of users, who represent the expected user population since that population will typically be a mixture of heavy, normal, and light users.

In each test, add users and monitor the system until it reaches the acceptable performance threshold. Record in the job aid (Appendix C) how many users of each type the server can hold and the unit cost of each single user. Confirm that the results of the last test, completed with a mix of users, match what would have been expected for that combination of heavy, normal, and light users.
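One way to form the expectation for the mixed test is to treat each user type as having a fixed, additive per-user cost equal to 1 divided by the number of such users the server can hold. The following Python sketch makes that assumption, which real workloads only approximate; the capacities and the mix percentages are hypothetical.

```python
from fractions import Fraction
from math import floor


def expected_mixed_capacity(capacity_by_type: dict, mix_pct: dict) -> int:
    """Estimate users per server for a mix of user types.

    Assumption: a user of a given type costs 1/capacity of the server,
    and costs add linearly across types.
    """
    if sum(mix_pct.values()) != 100:
        raise ValueError("mix percentages must sum to 100")
    cost_per_average_user = sum(
        Fraction(pct, 100) / capacity_by_type[kind] for kind, pct in mix_pct.items()
    )
    return floor(1 / cost_per_average_user)


# Hypothetical results from the heavy, normal, and light test suites.
capacities = {"heavy": 20, "normal": 40, "light": 80}
mix = {"heavy": 20, "normal": 50, "light": 30}
expected = expected_mixed_capacity(capacities, mix)
```

If the measured result of the mixed suite differs substantially from this estimate, the per-user costs are probably not additive for that workload, and the measured figure should take precedence.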

Now test with a logon load. Multiply the users per server by the percentage of users expected to log on at peak logon time. Load the server with this value of simultaneous logons. The processor usage will reach 100 percent, so the important indicator is logon time. If logon time is unacceptable under the SLA, the users per server value must be reduced and retested.
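The logon test size described above is a simple product. The following Python sketch shows the calculation and the comparison against the SLA; the 25 percent peak logon share, the logon times, and the function names are hypothetical.

```python
def peak_simultaneous_logons(users_per_server: int, peak_logon_pct: int) -> int:
    """Users per server multiplied by the peak logon percentage, rounded up."""
    return (users_per_server * peak_logon_pct + 99) // 100


def logon_test_passes(measured_logon_seconds: float, sla_logon_seconds: float) -> bool:
    """Logon time, not processor usage, is the indicator to compare with the SLA."""
    return measured_logon_seconds <= sla_logon_seconds


# Hypothetical values: 39 users per server, 25 percent logging on at peak.
logons_to_simulate = peak_simultaneous_logons(39, 25)
acceptable = logon_test_passes(measured_logon_seconds=42.0, sla_logon_seconds=30.0)
```

In this hypothetical run the measured logon time misses the SLA, so the users per server value would be reduced and the test repeated.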

Once the users per server value is determined, divide the total number of users likely to be connected to a terminal server farm at peak times by the users per server value to arrive at the base number of servers for the farm.

Total number of users connected at peak load / users per server = number of servers in the farm

Record this number in the job aid (Appendix C). Repeat the process for each defined terminal server farm.
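The division above should be rounded up, because a fractional server still has to be deployed as a whole server. The following Python sketch applies the formula to several farms; the farm names and user counts are hypothetical.

```python
def base_servers(peak_users: int, users_per_server: int) -> int:
    """Total users connected at peak load / users per server, rounded up."""
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    return -(-peak_users // users_per_server)  # ceiling division on integers


# Hypothetical farms: (users connected at peak, users per server).
farms = {"farm-a": (500, 39), "farm-b": (120, 60)}
plan = {name: base_servers(peak, per_server) for name, (peak, per_server) in farms.items()}
```

These are base numbers only; they do not yet include servers for fault tolerance or TS Web Access.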

Proceed to Task 3 to determine the number of additional servers required for fault tolerance.

Server Sizing Method 2: Client-based Calculations

A second approach to sizing is to monitor the existing client computer resource usage and attempt to extrapolate the results and apply them to the server systems.

Use Windows Performance Monitor and Windows Task Manager to measure the resources that the applications consume on the client computers where they are currently running, assuming that they are running on existing machines. Once this data has been gathered for all the applications that will run together on a server farm, it can be used to determine the amount of memory, processor, disk IO, and network resources that will be required to deliver a satisfactory Terminal Services user experience.

Ideally, run these tests on actual Windows Server 2008 systems running the applications in Terminal Services sessions. However, for purposes of information gathering or writing an estimate, a stand-alone user workstation can be used. Appendix D: “Server Performance Analyzing and Scaling” details the most important performance counters for determining how much capacity to plan for based on the resources required to process a workload. To measure an application’s load, the following information needs to be recorded using the named counter so as to extrapolate the required capacity:

Processor usage. The % Processor Time counter shows how much of the processor capacity is being used.
Memory usage. The Memory\Available Mbytes counter indicates how much memory is available.
Disk usage. Record the disk I/O operations per second (IOPS). Disk free space should not normally be a concern, but if starting an application changes this value significantly, it becomes important.
Network. The Bytes Total/sec measurement becomes more relevant if these tests are on a terminal server being accessed by an RDC client.

Use the instructions in Appendix D to take measurements under each of the conditions listed below in this section. Record the information for each application in the list produced from step 3.

Also, treat a remote desktop session to a terminal server as an application, and complete the same steps. A remote desktop session is functionally equivalent to a Terminal Services published application in the sense that the same amount of data is sent for each screen update whether the screen is a remote desktop session or an application. The remaining resources can be measured in the same way as for any published application, under each of the following conditions:

Baseline. Gather the system’s average processor, memory, disk access activity, and network activity for comparison.
Initial logon cost. When a user connects to the server to start a session, there is a logon process that can be resource-intensive. Measure the resource cost of a user connecting to a server to begin a terminal server session. Also record the logon time so that peak logon time can be planned for in step 8.
Startup cost. How large a spike in system memory, processor, disk access, and network activity does starting the application cause? Does the resource utilization stay at the startup levels, increase, or fall to a steady-state value?
Operating levels. After the application finishes starting up and is ready to be used, measure the difference between the baseline measurements and the resources the application uses during operation.
Successive user startup. Record the change in resources when a second user starts the application. This may be significantly different from the first user and needs to be factored in when sizing the terminal server farms.
Resource release. Reverse the steps named earlier to measure the resources returned to the system when a user closes the applications.
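The conditions above all reduce to differences between two sets of measurements. The following Python sketch shows one way to record them; the resource names, the use of memory in use rather than available memory, and every figure are hypothetical.

```python
RESOURCES = ("cpu_pct", "memory_used_mb", "disk_iops", "network_bytes_per_sec")


def cost(before: dict, after: dict) -> dict:
    """Resource cost of a condition: the 'after' reading minus the 'before' reading."""
    return {r: after[r] - before[r] for r in RESOURCES}


# Hypothetical readings for one application.
baseline = {"cpu_pct": 3, "memory_used_mb": 900, "disk_iops": 5, "network_bytes_per_sec": 2000}
startup_peak = {"cpu_pct": 45, "memory_used_mb": 1010, "disk_iops": 120, "network_bytes_per_sec": 9000}
one_user = {"cpu_pct": 5, "memory_used_mb": 980, "disk_iops": 8, "network_bytes_per_sec": 6000}
two_users = {"cpu_pct": 6, "memory_used_mb": 1020, "disk_iops": 10, "network_bytes_per_sec": 10000}

startup_cost = cost(baseline, startup_peak)   # spike caused by starting the application
operating_cost = cost(baseline, one_user)     # steady-state cost of the first user
second_user_cost = cost(one_user, two_users)  # incremental cost of a successive user
```

In this made-up data the second user costs half as much memory as the first, which is the kind of difference the successive user startup measurement is meant to capture.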

Repeat these steps for each application, and record them.

Note: If there are different RDC client versions in scope and if the tests are being performed on a terminal server, then these tests must be run on each RDC client version. This is necessary because each newer version delivers improvements in performance and efficiency.

Using the resource usage information collected above, along with the mapping of users to farms that was determined in step 6, calculate the number of users that a server of the selected form factor could support.

Complete the following steps for both processor use and memory use as determined above:

Add together usage percentages of a typical first user’s application set, and subtract that number from 100.
Divide this result by the sum of the usage percentages of typical subsequent users.

The quotient plus 1 (for the first user) is the maximum users per server at normal workload. Compare the results for processor and server memory capacity, and then take the lower number.

See the following example.

Application for typical user	First startup: processor usage (%)	First startup: memory usage (%)	Subsequent startup: processor usage (%)	Subsequent startup: memory usage (%)
Microsoft Office Word 2007	2.0	1.2	1.5	0.5
Adobe Acrobat 8.0	3.0	1.6	1.0	0.7
Total	5.0	2.8	2.5	1.2
Processor capacity = ((100 - first startup processor percentage) ÷ subsequent startup processor percentage) + 1 = ((100 - 5.0) ÷ 2.5) + 1 = 39 users
Server memory capacity = ((100 - first startup memory percentage) ÷ subsequent startup memory percentage) + 1 = ((100 - 2.8) ÷ 1.2) + 1 = 82 users
Capacity: Choose 39 as the users per server value, since that is the lower number.
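The calculation in the example above can be reproduced with a short Python sketch. The variable and function names are hypothetical, and rounding the quotient down when it is not a whole number is an assumption made here as the conservative choice. Decimal arithmetic is used so that values such as 2.8 and 1.2 are handled exactly.

```python
from decimal import Decimal

# Percentages from the example table, per application:
# (first CPU, first memory, subsequent CPU, subsequent memory)
APPS = {
    "Microsoft Office Word 2007": ("2.0", "1.2", "1.5", "0.5"),
    "Adobe Acrobat 8.0": ("3.0", "1.6", "1.0", "0.7"),
}


def column_total(index: int) -> Decimal:
    """Sum one column of the table across all applications."""
    return sum((Decimal(row[index]) for row in APPS.values()), Decimal(0))


def max_users(first_total: Decimal, subsequent_total: Decimal) -> int:
    """((100 - first user's usage) / subsequent user's usage) + 1, rounded down."""
    return int((Decimal(100) - first_total) // subsequent_total) + 1


cpu_users = max_users(column_total(0), column_total(2))  # processor capacity
mem_users = max_users(column_total(1), column_total(3))  # memory capacity
users_per_server = min(cpu_users, mem_users)             # take the lower number
```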

Realize that as the number of simultaneous logons approaches the maximum users per server, the time to log on increases. If the population consistently arrives and logs on at, or near, the same time every morning, such as at a call center or an accounting firm, then reduce the users per server value, possibly as much as 10 percent, to compensate.

Now divide the number of users who will be assigned to that farm and who will be active at any given time by this users per server value to arrive at the number of servers that will be required in the farm.

Active users on farm / users per server = number of servers required in the farm
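The logon adjustment and the division above can be combined in a short Python sketch. The 10 percent reduction is the upper figure suggested in the text; the 500 active users and the function names are hypothetical, and rounding up to whole servers is an assumption that follows from servers being indivisible.

```python
def adjusted_users_per_server(users_per_server: int, logon_reduction_pct: int = 10) -> int:
    """Reduce users per server to compensate for concentrated morning logons."""
    return users_per_server * (100 - logon_reduction_pct) // 100


def servers_required(active_users: int, users_per_server: int) -> int:
    """Active users on farm / users per server, rounded up to whole servers."""
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    return -(-active_users // users_per_server)


per_server = adjusted_users_per_server(39)               # 39 from the example table
with_adjustment = servers_required(500, per_server)      # hypothetical 500 active users
without_adjustment = servers_required(500, 39)
```

For this hypothetical farm the logon adjustment adds two servers, which shows why the reduction is worth applying only where users really do log on at about the same time.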

Record this number in the job aid (Appendix C).

Task 3: Determine the Number of Additional Servers Required for Fault Tolerance

The number of servers per farm, determined in the two previous tasks, does not account for fault tolerance. It’s therefore necessary to determine the number of additional servers needed to provide a fault tolerant implementation.

Terminal servers cannot be placed in a Microsoft Cluster Service (MSCS) cluster, so fault tolerance, if desired, must be provided by additional servers with a load balancing solution. If a server goes down, the users on that server will experience a session interruption and will need to restart their sessions, which may be instantiated on one of the remaining servers. Incoming session requests will be load balanced onto the remaining servers by TS Session Broker.

Total server capacity will be reduced during the downtime, and the remaining servers will be more heavily loaded. Plan for one or more additional servers so that capacity levels are maintained during planned maintenance or unplanned server downtime.

Determine the number of additional servers required to meet the fault tolerance requirements, and record this in the job aid (Appendix C).
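A simple way to check a fault tolerance plan is to confirm that the farm can still hold the peak user population with the chosen number of servers offline. The following Python sketch assumes one spare server per tolerated failure; the function names and figures are hypothetical.

```python
def servers_with_fault_tolerance(base_servers: int, tolerated_failures: int) -> int:
    """Base servers plus one spare per failure the farm should absorb (an assumption)."""
    return base_servers + tolerated_failures


def capacity_during_outage(total_servers: int, failed_servers: int, users_per_server: int) -> int:
    """Users the farm can hold while some servers are offline."""
    return max(total_servers - failed_servers, 0) * users_per_server


# Hypothetical farm: 500 users at peak, 39 users per server, 13 base servers.
total = servers_with_fault_tolerance(base_servers=13, tolerated_failures=1)
shortfall_without_spare = 500 - capacity_during_outage(13, 1, 39)
holds_with_spare = capacity_during_outage(total, 1, 39) >= 500
```

Without a spare, a single failure in this example leaves 32 users without capacity; with one spare, the remaining servers can still hold the full peak population.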

Task 4: Determine the Number of Servers Required for TS Web Access

TS Web Access is the only Terminal Services role service that cannot be shared across farms, and for that reason it is considered here as part of the farm.

Note: The TS Web Access role service cannot be shared with a Windows Server 2003 Terminal Services implementation.

The TS Web Access role service can be used to present applications on a Web site that is accessed by the user with a browser. However, when the user selects an application from the menu and launches it, he or she is connected directly to the terminal server, and TS Web Access is no longer involved.

At the time of writing, there is no authoritative product guidance available on the server sizing requirements for the TS Web Access role service. TS Web Access is a lightweight application and can be hosted on any hardware configuration that supports a lightweight Internet Information Services (IIS) application. In a terminal server farm of a few servers, the TS Web Access role service may be hosted on an IIS Web server farm that is performing other work. It is therefore prudent to install the TS Web Access application on one or more existing servers and monitor the system usage to determine whether dedicated hardware is required.

The TS Web Access role service can be made fault tolerant by duplicating the role service on additional server(s) and load balancing them using Network Load Balancing or a hardware load balancing solution. If the TS Web Access role service were to become unavailable, users who are accustomed to accessing their applications from its menu may not be able to access their applications. If these users can access their applications in another manner, balance the cost of providing high availability of the TS Web Access role service against the business benefit that it delivers in terms of user convenience. If TS Web Access is the only way for important external users, such as customers, to access applications, high availability may be imperative.

TS Web Access may be used to make a menu of applications available to both internal and external users. Consider whether the organization’s security posture will require that TS Web Access for internal users and TS Web Access for external users be hosted on separate systems.

Record if, where, and how many TS Web Access role services will be hosted for each farm in the job aid (Appendix C).

Decision Summary

In this step, the design of the farm has been completed. The form factor of the terminal servers in the farm has been decided upon, the servers have been sized, and the fault tolerance approach has been determined. The TS Web Access role service, if it will be used, has also been designed since it will be a part of the farm.

Before proceeding to the next step, repeat the tasks in this step for any other farms that will be implemented. In the next step, decisions will be made on the storage of user profiles and data.

Additional Reading

“Performance Tuning Guidelines for Windows Server 2008,” available at https://www.microsoft.com/whdc/system/sysperf/Perf_tun_srv.mspx.
Windows Server 2008 Terminal Services RemoteApp Step-by-Step Guide, available at https://go.microsoft.com/fwlink/?LinkID=84895.

This accelerator is part of a larger series of tools and guidance from Solution Accelerators.