Chapter 5 - Load Balancing

This chapter provides detailed information about each of the Microsoft Application Center 2000 (Application Center) load balancing features—including setup and configuration tips. The chapter uses simple cluster models to illustrate how the different load balancing options work and interact with each other. You'll also get an inside look at the sequence of events and processing activities that occur when you use either Microsoft Windows 2000 Network Load Balancing or Application Center Component Load Balancing (CLB).

On This Page

Load Balancing Options
Network Load Balancing
Maintaining Session State with Network Load Balancing
Request Handling Examples in a Network Load Balancing Cluster
Network Load Balancing Administration
Component Load Balancing
Component Load Balancing Scenarios
Resources

Load Balancing Options

There are two load-balancing options that you can use on any cluster you create with Application Center:

  • Network Load Balancing (NLB)—This native Windows 2000 Advanced Server load-balancing mechanism distributes IP-based client requests across a cluster of Web servers. 

    Note Although it's not included with Windows 2000 Server, NLB is automatically installed on a system running Windows 2000 Server when Application Center is installed on the system. 

  • Component Load Balancing (CLB)—This native Application Center load-balancing mechanism distributes client requests for component instance creations across a cluster of COM+ application servers. 

And, of course, there's also the option of not using load balancing on a cluster. In some scenarios—for example, a small testing and staging environment—you can disable load balancing because it really isn't needed. There are also times during synchronization and monitoring when it's desirable to have load balancing temporarily disabled for a member.

Note You can choose to not use either NLB or CLB if you're running a third-party load balancing solution. You can continue to use your existing load balancer, but Application Center doesn't support managing these devices out-of-the-box. However, device-enabling solutions have been developed for selected load balancers. These solutions are documented in Chapter 13, "Third Party Load Balancer Support." If the load balancer that you're using isn't currently supported, you will have to develop custom scripts to enable communication between the load balancer and Application Center.

One of the guiding principles of the Application Center design is robust support for arm's length administration—which is to say, an administrator does not have to physically go to a member to set up or re-configure its load balancing behavior. The Application Center user interface simplifies configuration by abstracting away the details of the different load balancing options.

Network Load Balancing

NLB is Application Center's default option for redistributing IP requests among Web servers. In addition to balancing loads across a cluster, NLB also ensures high availability; if a member is taken offline or becomes unavailable, the load is automatically redistributed among the remaining cluster hosts.

Let's examine the NLB architecture in detail before looking at load balancing algorithms and configuration.

Network Load Balancing Architecture

NLB is a fully distributed IP-level load-balancing solution. It works by having every cluster member concurrently detect incoming traffic that's directed to a cluster IP address. (See Figure 5.1.) You can have several load-balanced IP addresses, and traffic to any IP address—except the dedicated, or management, IP address—on an NLB-enabled adapter will get load balanced.

Note It's important to understand that NLB itself has no notion of a cluster or cluster members—these concepts are specific to Application Center. For the sake of documentation consistency we've used the terms cluster and member when dealing with NLB concepts and technology.

Low-Level Architecture and Integration

Application Center provides a tightly integrated interface to NLB by using Windows Management Instrumentation (WMI) to communicate with NLB and the network adapters through the Windows 2000 kernel and TCP/IP network layers. NLB runs as a network driver logically situated beneath higher-level applications protocols such as HTTP and FTP. As you can see in Figure 5.1, this driver is an intermediate driver in the Windows 2000 network stack.

Figure 5.1 Low-level architecture for NLB as implemented by Application Center 

As shown in Figure 5.1, the NLB driver can be bound to only one network adapter in Windows 2000. Figure 5.1 also illustrates how Application Center abstracts a significant amount of low-level network detail, which simplifies configuring and managing NLB on a cluster.

Application Center uses the NLB WMI provider to interact with NLB and fires WMI events for the cases shown in Table 5.1.

Table 5.1 WMI Events Related to Load Balancing 

Activity                                      Event

Member starting to go offline                 MicrosoftAC_Cluster_LoadBalancing_ServerOfflineRequest_Event
Member starting to drain                      MicrosoftAC_Cluster_LoadBalancing_ServerDrainStart_Event
Member finished draining                      MicrosoftAC_Cluster_LoadBalancing_DrainStop_Event
Member offline                                MicrosoftAC_Cluster_LoadBalancing_ServerOffline_Event
Member starting to go online                  MicrosoftAC_Cluster_LoadBalancing_ServerOnlineRequest_Event
Member online                                 MicrosoftAC_Cluster_LoadBalancing_ServerOnline_Event
Member failed, has left the cluster (1)       MicrosoftAC_Cluster_Membership_ServerFailed_Event
Member alive, has rejoined the cluster (1)    MicrosoftAC_Cluster_Membership_Server_Live_Event

1 These are cluster membership events, rather than load balancing events.
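
If you want to react to these events from your own monitoring scripts, you can subscribe to them through WMI. The following is a minimal sketch in Python (using the pywin32 package); the Application Center WMI namespace name shown here is an assumption, so substitute the namespace documented for your installation.

import win32com.client  # pywin32

# Assumption: the Application Center provider namespace; verify this name
# against your installation before relying on it.
services = win32com.client.GetObject(
    r"winmgmts:{impersonationLevel=impersonate}!\\.\root\MicrosoftApplicationCenter")

# Block until a member is reported offline for load balancing.
watcher = services.ExecNotificationQuery(
    "SELECT * FROM MicrosoftAC_Cluster_LoadBalancing_ServerOffline_Event")
event = watcher.NextEvent()
print("Load balancing event received:", event.Path_.Class)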

Network-Level Architecture

As indicated in Chapter 4, "Cluster Services," the front-end adapter on the controller has at least one static IP address that's used for load balancing.

This IP address is the cluster (or virtual) IP address. The cluster IP address—common to every cluster member—is where all inbound packets are sent. And, since the IP addresses on the front-end adapter are mapped to the same media access control (MAC) address, each member is able to receive the TCP packets. (See "Adding a Server" in Chapter 4.)

Note IP address mapping to the same media access control address is only true for the default Unicast NLB mode that Application Center uses. This is not the case in a Multicast environment.

After receiving an incoming packet, each cluster host passes the packet up to the NLB driver for filtering and distribution using a hashing algorithm. Figure 5.2 shows the network level of the NLB architecture for a two-node cluster.

Figure 5.2 TCP and Ethernet architecture for an Application Center cluster 

NLB Performance

NLB's architecture maximizes throughput by eliminating the need to route incoming traffic to individual cluster hosts. Rather than routing packets (which involves receiving, examining, rewriting, and resending them), NLB filters out unwanted packets, which increases throughput.

During packet reception, NLB, which is fully pipelined, overlaps the delivery of incoming packets to TCP/IP and the reception of other packets by the network adapter driver. Because TCP/IP can process a packet while the NDIS driver receives a subsequent packet, overall processing is sped up and latency is reduced. This implementation of NLB also reduces TCP/IP and NDIS overhead and eliminates the need for an extra copy of packet data in memory.

During packet sending, NLB enhances throughput and reduces latency and overhead by increasing the number of packets that TCP/IP can send with one NDIS call.

See also: Appendix B, "Network Load Balancing Technical Overview," for detailed information about NLB performance and scalability. 

Load Balancing Distribution

Load distribution on a cluster is based on one of three algorithms that are determined by the client affinity, which is part of the port rules that can be configured for the cluster.

No Affinity

In the case where no client affinity is specified, load distribution on a cluster is based on a distributed filtering algorithm that maps incoming client requests to cluster hosts.

Note The load-balancing algorithm does not respond to changes in the load on each cluster host (such as CPU load or memory usage), but you can adjust load-balancing weights on a per-server basis, which causes the load distribution map to be recalculated. The map is also recalculated when the cluster membership changes—which is to say, when a member is taken out of the load balancing loop, members are added or removed, or port rules are changed.

When inspecting an arriving packet, the hosts simultaneously perform a statistical mapping to determine which host should handle the request. This mapping technique uses a randomization function that calculates which cluster member should process the packet based on the client's IP address and port number. The appropriate host then forwards the packet up the network stack to TCP/IP, and the other hosts discard it.

Note The Single affinity setting assumes that client IP addresses are statistically independent. This assumption can break down if a firewall is used that proxies client requests with one IP address. In this scenario, one host will handle all client requests from that proxy server, defeating the load balancing. However, if No affinity is enabled, the distribution of client ports within a firewall will usually suffice to give good load balancing results.

The simplicity and speed of this algorithm allows it to deliver very high performance (including high throughput and low overhead) for a broad range of applications. The algorithm is optimized to deliver statistically even distribution for a large client population that is making numerous, relatively small requests.

Single Affinity

Single affinity is the default setting that the wizard uses when you create an NLB cluster. This affinity is used primarily when the bulk of the client traffic originates from intranet addresses. Single affinity is also useful for stateful Internet (or intranet) applications where session stickiness is important.

When Single affinity is enabled, the client's port number isn't used and the mapping algorithm uses the client's full IP address to determine load distribution. As a result, all requests from the same client always map to the same host within the cluster. Because there is no time-out value (typical of dispatcher-based implementations), this condition persists until cluster membership changes.

Note If you have a stateful application, you should use Single affinity and enable request forwarding; otherwise, especially if you're using No affinity, you should not have request forwarding enabled for HTTP requests. For more information on request forwarding, refer to "Maintaining Session State with Network Load Balancing" later in this chapter.

Class C Affinity

As in the case of Single affinity, client port numbers aren't used to calculate load distribution. When Class C affinity is enabled, the mapping algorithm bases load distribution on the Class C portion (the upper 24 bits) of the client's IP address.

IP address basics 

IP addresses are 32-bit numbers, most commonly represented in dotted decimal notation (xxx.xxx.xxx.xxx). Because each of the four decimal numbers represents 8 bits of binary data, each can have a value from 0 through 255. IP addresses most commonly come as class A, B, or C. (Class D addresses are used for multicast applications, and class E addresses are reserved for future use.)

It's the value of the first number of the IP address that determines the class to which a given IP address belongs.

The range of values for each class is given below, using the notation N=network and H=host for allocation.

Class    Range      Allocation

A        1-126      N.H.H.H
B        128-191    N.N.H.H
C        192-223    N.N.N.H

Using these ranges, an example of a class C address would be 200.200.200.0.

This ensures that all clients within the same class C address space map to the same host. Because clients on an internal network typically share the same class C address space, this setting isn't very useful for load balancing internal network, or intranet, traffic.

Class C affinity is typically used when the bulk of the client traffic originates on the Internet.
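
To make the three settings concrete, the following Python sketch shows which parts of the client address feed the mapping under each affinity setting. The hashing here is deliberately simplified; NLB uses its own randomization function together with port rules and per-host load weights, so this is an illustration of the inputs, not of the actual algorithm.

import ipaddress

def pick_host(client_ip, client_port, hosts, affinity="single"):
    ip = int(ipaddress.IPv4Address(client_ip))
    if affinity == "none":           # No affinity: full IP address and port
        key = (ip, client_port)
    elif affinity == "single":       # Single affinity: full IP address only
        key = ip
    elif affinity == "classc":       # Class C affinity: upper 24 bits only
        key = ip >> 8
    else:
        raise ValueError("unknown affinity: %s" % affinity)
    return hosts[hash(key) % len(hosts)]

hosts = ["ServerA", "ServerB", "ServerC"]
# Two requests from the same proxied IP address but different source ports:
print(pick_host("207.46.130.14", 1184, hosts, "none"))    # may land on different hosts
print(pick_host("207.46.130.14", 2071, hosts, "none"))
print(pick_host("207.46.130.14", 2071, hosts, "single"))  # always the same host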

Figure 5.3 displays the Network Load Balancing Properties dialog box for the component that's configured for a front-end adapter. In this particular example, load balancing is configured for multiple hosts with Single affinity. By default, each member is configured to handle an equal load on the cluster. These settings are automatically created by Application Center when you create a cluster (or add a member) and enable NLB.

Figure 5.3 Affinity settings for an NLB-enabled network adapter 

Convergence—Redistributing the Load on an NLB Cluster

Cluster traffic has to be remapped whenever the cluster membership changes: when a member leaves the cluster or when a member joins it. Either event triggers convergence, which involves computing a new cluster membership list—from NLB's perspective—and recalculating the statistical mapping of client requests to the cluster members. Convergence can also be initiated by several other events on a cluster, such as setting a member online or offline for load balancing, changing the load-balancing weight on a member, or implementing port rule changes.

Note Adjusting load balancing weight for a member or members makes it necessary to recalculate the cluster's load mapping, which also forces a convergence.

Removing a Member

Two situations cause a member to leave the cluster or go offline in the context of load balancing. First, the member can fail, an event that is detected by the NLB heartbeat. Second, the system administrator can explicitly take a member out of the load-balancing loop or remove it from the cluster.

The NLB heartbeat 

Like Application Center, NLB uses a heartbeat mechanism to determine the state of the members that are load balanced. This message is an Ethernet-level broadcast that goes to every load-balanced cluster member.

The default period between sending heartbeat messages is one second, and you can change this value by altering the AliveMsgPeriod parameter (time in milliseconds) in the registry. All NLB registry parameters are located in the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\WLBS\Parameters key.

NLB assumes that a member is functioning normally within a cluster as long as it participates in the normal exchange of heartbeat messages between it and the other members. If the other members do not receive a message from a member for several periods of heartbeat exchange, they initiate convergence. The number of missed messages required to initiate convergence is set to five by default. You can change this by editing the AliveMsgTolerance parameter in the registry.

You should choose your AliveMsgPeriod and AliveMsgTolerance settings according to your failover requirements. A longer message exchange period reduces the networking overhead, but it increases the failover delay. Likewise, increasing the number of message exchanges prior to convergence will reduce the number of unnecessary convergence initiations due to network congestion, but it will also increase the failover delay.

Based on the default values, five seconds are needed to discover that a member is unavailable for load balancing and another five seconds are needed to redistribute the cluster load.

During convergence, NLB reduces the heartbeat period by one-half to expedite completion of the convergence process.
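
If you want to check these settings on a member, the parameters can be read directly from the registry. The following Python sketch reads them (falling back to the documented defaults when a value isn't present) and estimates the failure-detection delay; it is illustrative only and should be run with appropriate privileges on the member itself.

import winreg

KEY_PATH = r"SYSTEM\CurrentControlSet\Services\WLBS\Parameters"

def read_value(name, default):
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _type = winreg.QueryValueEx(key, name)
            return int(value)
    except OSError:
        return default

period_ms = read_value("AliveMsgPeriod", 1000)   # default: one second
tolerance = read_value("AliveMsgTolerance", 5)   # default: five missed messages

print("A failed member is detected after roughly %.1f seconds"
      % (period_ms * tolerance / 1000.0))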

Server Failure

When a cluster member fails, the client sessions associated with the member are dropped. However, NLB does attempt to preserve as many of a failed member's client sessions as possible.

After convergence occurs, client connections to the failed host are remapped among the remaining cluster members, who are unaffected by the failure and continue to satisfy existing client requests during convergence. Convergence ends when all the members report a consistent view of the cluster membership and distribution map for several heartbeat periods.

Set Offline/Remove Server

An administrator can take a specific member offline either via the Set Offline command in the user interface or with the command-line command AC /LOADBALANCE. This action removes the member from the load-balancing loop, but it remains in the cluster. Application Center extends NLB's connection persistence by implementing draining. Draining describes a server state where existing TCP connections are resolved and the member does not accept any new TCP connection requests. Draining is primarily a feature that allows existing sessions to complete.

The default draining time is 20 minutes and is configurable via the Application Center user interface. (See Figure 6.6 in Chapter 6.)

Note Whenever you initiate the action of either setting a member offline or removing it from the cluster, you are prompted to drain the member first. You can, of course, ignore this prompt and force completion of the action. See Figure 5.4.

Figure 5.4 Draining prompt when taking a member offline 

Adding a Member

Convergence is also initiated when a new member is added to a cluster.

Note In an Application Center NLB cluster, convergence takes place after the new member is online in the context of load balancing. If the new member is in the synchronization loop, it is set online after it's synchronized to the cluster controller; otherwise, it's set online immediately. The latter option is enabled by default and you can change it on the last page of the Add Cluster Member wizard.

After convergence is completed, NLB remaps the appropriate portion of the clients to the new member. NLB tracks TCP connections on each host, and after their current TCP connection completes, the new member can handle the next connection from the affected clients. In the case of User Datagram Protocol (UDP) data streams, the new host can handle connections immediately. This can potentially break client sessions that span multiple connections or that consist of UDP streams. This problem is avoided by having the member manage the session state so that it can be reconstructed or retrieved from any cluster member. The Application Center request forwarder manages the session state through the use of client cookies.

Note The Generic Routing Encapsulation (GRE) stream within the Point-to-Point Tunneling Protocol (PPTP) is a special case of a session that is not affected by adding a new member. Because the GRE stream is contained within the duration of its TCP control connection, NLB tracks this GRE stream along with its corresponding control connection. This prevents disruption to the PPTP tunnel.

Network and NLB settings propagation 

In order for an NLB cluster to function correctly, all cluster-wide NLB settings (for example, the IP address bound to the network adapter, default gateway settings, and client affinity) have to be replicated from the cluster controller to cluster members. Only network settings on the NLB-bound network adapter are replicated. If NLB is not used on the cluster, these settings are not replicated.

In addition to the settings that have to be replicated, there are some NLB and network settings that need to be configured automatically on a per-node basis. The Application Center Cluster Service, rather than the replication drivers, handles these particular settings.

See also: Chapter 6, "Synchronization and Deployment." 

Maintaining Session State with Network Load Balancing

Even with a single Web server, one of the biggest challenges faced by application developers is maintaining session state or coherency, which is to say, ensuring that client session information is not lost.

ASP Application state versus Session state 

Application state is global and applies to all users of an application. You can use it, for example, to initialize variables and update application-wide data. Application state is typically used for Active Server Pages (ASP) output/input caching.

Session state refers to each user of the application. Session state is a temporary store for user session information such as user preferences.

In a load-balanced, multi-server environment, managing and resolving session state for individual clients is more complex.

In certain scenarios a client request can get re-mapped to the wrong cluster member, which is to say the client requires state information that was created on a different member during an earlier part of the session. This typically happens when a client connection originates from a proxy server environment that consists of several servers, and different servers may be used to proxy requests from a single client. Because NLB maps requests according to the client's IP address, successive requests that are really coming from the same client appear to have a different IP address. As a result, the most recent request may get sent to a different cluster member. This situation is particularly problematic for server applications that maintain state locally on the server (in memory or in persistent storage), such as ASP Session state. There are alternative ways of keeping state, depending on your business or application needs (which determine how important session state really is). Among these alternatives are client-side cookies and databases.

Note The issue of maintaining session state is not unique to NLB; all load-balancing solutions have the potential to map a new client request to the wrong server.

The Application Center request forwarder is a mechanism designed to resolve session state issues when handling HTTP client requests.

The Request Forwarder

The request forwarder design is based on these principles:

  • Transparent to clients—Clients are not aware of request forwarding and, beyond the requirement of accepting cookies, do not have to participate in request forwarding programmatically. 

  • Transparent to server applications—Internet Information Services version 5.0 (IIS) applications, such as ISAPI extensions and ASP pages, are unaware of any request-handling problems caused by load balancing, whether they run on the member that originally handled a client request or on a member that received a misdirected request. 

  • Generic—Generic policies, rather than hard-coded publishing verbs and HTTP URL formats, are applied to specific HTTP verbs and parts of the URL namespace. 

  • Easy to configure—Minimal configuration is required to use request forwarding. 

  • Distributed—Processing on the forwarding server is reduced by having the bulk of the processing take place on the original server that handled the client request. 

The request forwarder, implemented as an ISAPI filter and extension, sits between the HTTP client and server applications (for example, ISAPI and ASP pages). It stores information that identifies the sticky server—the first server that handled the client request—in an HTTP cookie.

Note The cookie, which is only generated for sessions that require coherency, consists of a single name-value pair of the form RQFW={server instance GUID}. The server instance globally unique identifier (GUID) is known by every member and is used to ensure that requests are forwarded only within a cluster or to members in the out of cluster forwarding (OCF) list.

This cookie is returned to the client on its first trip, and on subsequent client requests the cookie is checked to see which server first handled the client during a given session. If NLB sends the request to a different server than the original, the request forwarder sends the request to the sticky server. The request forwarder behaves like a proxy server without caching—if a request needs to be forwarded, the server holding the request opens a connection to the sticky server and forwards the HTTP request over the back-end network. The original server responds to the HTTP request, and the request forwarder pipes the response back out to the client. The next two drawings, Figures 5.5 and 5.6, show the processing logic that the request forwarder uses when determining how to handle an incoming client request.

Note The following abbreviations are used in Figures 5.5 and 5.6:

  • RF—request forwarder 

  • CC—cluster controller 

  • FP—FrontPage 

  • DAV—Distributed Authoring and Versioning 

  • ASP—Active Server Page 

  • VDir—virtual directory 

  • HTMLA—HTML Admin 

By default, request forwarding is enabled for any site where ASP is enabled, which is by default all sites, but you can change this option or disable session coherency if you don't want to use this feature. The configuration options for advanced load balancing are described in detail in "Network Load Balancing Administration" later in this chapter.

Let's look at the request forwarder architecture before covering the different scenarios that require this feature, and seeing how request forwarding works in an Application Center cluster.

Figure 5.5 Request forwarding process flow chart, part 1 

Figure 5.6 Request forwarding process flow chart, part 2 
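
As a complement to the flow charts in Figures 5.5 and 5.6, the following Python sketch condenses the core routing decision: examine the RQFW cookie, handle the request locally if this member owns the session (or if no valid owner can be found), and otherwise forward it over the back-end network. The GUIDs and back-end host names are placeholders.

LOCAL_GUID = "{11111111-1111-1111-1111-111111111111}"   # this member
CLUSTER = {
    "{11111111-1111-1111-1111-111111111111}": "webserver1-backend",
    "{22222222-2222-2222-2222-222222222222}": "webserver2-backend",
}

def route_request(cookies):
    sticky_guid = cookies.get("RQFW")
    if sticky_guid is None:
        # First trip: handle locally and hand out a cookie naming this member.
        return "handle locally", {"RQFW": LOCAL_GUID}
    if sticky_guid == LOCAL_GUID or sticky_guid not in CLUSTER:
        # Already on the sticky server, or the target is unknown/invalid:
        # handle locally (and re-issue the cookie in the invalid case).
        return "handle locally", {"RQFW": LOCAL_GUID}
    # Forward over the back-end network to the member that owns the session.
    return "forward to " + CLUSTER[sticky_guid], {}

print(route_request({}))                                                   # first pass
print(route_request({"RQFW": "{22222222-2222-2222-2222-222222222222}"}))   # second pass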

The Request Forwarder Architecture

From an architectural perspective, the request forwarder, which runs in-process with IIS, consists of two parts: the request forwarder filter and the request forwarder engine.

The Request Forwarder Filter

The request forwarder filter is an ISAPI filter that monitors incoming requests and decides whether to let a request pass through and execute locally or to forward it to another server; if the request must be forwarded, the filter does the necessary preparatory work. The filter also issues and interprets routing cookies, which provide session coherency (sticky sessions) by attaching (sticking) a client to a single server for the lifetime of its browser session.

The request forwarder filter is similar in design to the ISAPI application provided with Microsoft Proxy Server. It is installed as a high-priority SF_NOTIFY_PREPROC_HEADERS filter, which means that it will execute before all other filters except for READ_RAW filters.

Note It's important to ensure that the Security Support Provider Interface (SSPI) filter precedes the request forwarder filter in the filter list.

When the filter receives a request, the URL is retrieved and the directory, file name, and type are extracted. Most of the forwarding decision-making can be made from this information alone. One of the primary objectives at this stage is to ensure maximum performance for static file processing, because static files take the least amount of time to process and, as a consequence, the relative overhead of request forwarding is greatest for them.

Note For security reasons, the request forwarder filter ensures that no URLs are permitted to address the request forwarder engine directly; therefore, URLs that access files with an .rqrw file extension are not allowed through.
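
The first-pass decision can be summarized in a few lines of Python. The static extension list below is only illustrative; the real list is configurable, as described in "Scenarios that Require Request Forwarding" later in this chapter.

import os

STATIC_EXTENSIONS = {".htm", ".html", ".gif", ".jpg", ".css", ".txt"}  # illustrative only

def classify(url):
    path = url.split("?", 1)[0]
    extension = os.path.splitext(path)[1].lower()
    if extension == ".rqfw":
        return "reject"             # the engine may not be addressed directly
    if extension in STATIC_EXTENSIONS:
        return "serve locally"      # no cookie work; maximum throughput
    return "check forwarding"       # dynamic content: inspect the routing cookie

for url in ("/images/logo.gif", "/shop/basket.asp", "/private/test.rqfw"):
    print(url, "->", classify(url))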

The Request Forwarder Engine

The request forwarder engine is an ISAPI extension that uses a COM component to forward requests. Because the request forwarder engine resides in the same DLL as the request forwarder filter, it is easier to maintain connection state and support connections for special cases such as NTLM authentication.

Note For authenticated requests, the request forwarder engine passes the authentication HTTP headers through to the target. It does not attempt to do any authentication itself. Because NTLM authentication requires the same connection for the lifetime of the authenticated request, connections are kept alive for the duration of the requests.

The request forwarder engine prepares each client request and does the actual forwarding. The engine is script-mapped to handle GETs for URLs that have an .rqfw file extension. After a packet is prepared for transmission to the target server, the request forwarder pipe COM component is called to perform the actual transfer.

Note The request is dispatched to a thread pool for an asynchronous connection to the target. After a connection is established, control passes to a thread pool, which in turn sends the request to the target. The thread pool streams data from the target's response back to the client.

After the data is sent and a response received, it is the engine's responsibility to handle all error cases, including those summarized in Table 5.2. These error cases will always return a 502.1 error.

Request forwarding is generic, and allows forwarding to any cluster member. However, for security reasons destination members for sticky sessions are always validated against the directory of known members before allowing forwarding, and forwarding, by default, is restricted to the members within the cluster that received the client request.

Note There may be special cases (a cluster farm, for example) in which request forwarding between clusters is required, and OCF needs to be enabled.

You can create an MD_RF_OCF_SERVERS list that the request forwarder will check whenever it receives a request for a sticky session where the server GUID is not a cluster member. This list is a MULTI_SZ metabase entry where items take the form of IP, {guid}. The metabase path to this entry is: /LM/Appcenter/Cluster. The IP address referenced here is the IP address of the back-end adapter on the member to which you want to forward the request.
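
As a small illustration of the entry format, the following Python sketch builds the strings that would populate the MULTI_SZ value; the addresses and GUIDs are placeholders, and writing the value to the metabase (with an ADSI script or similar tool) is not shown.

ocf_targets = [
    ("10.0.0.21", "{33333333-3333-3333-3333-333333333333}"),   # placeholder back-end IP and GUID
    ("10.0.0.22", "{44444444-4444-4444-4444-444444444444}"),
]

# Each MULTI_SZ item takes the documented "IP, {guid}" form.
md_rf_ocf_servers = ["%s, %s" % (ip, guid) for ip, guid in ocf_targets]
print(md_rf_ocf_servers)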

Table 5.2 Request Forwarding Error Cases and Their Resolution 

Target server busy
    The response is returned to the user as is, so they have the option of clicking the Refresh button.

Target server unavailable (1)
    This occurs when the request forwarder fails to connect to the target, or if a failure occurs during the forwarding process. Server state cannot be recovered in this case, so the current server has to handle the request. The existing cookie is deleted and the request is forwarded back to the current server for processing.

1 If the request fails once the first packet has been sent, the request forwarder is unable to recover the original request information and will not be able to forward the request back to the forwarding server for local processing. In this case a 502.1 error will be returned and an error code or explanation of the failure will be present in the body of the message.

Request Forwarding Notes

This section contains miscellaneous information related to request forwarding.

Header Information

Because the target sees the forwarding server as a regular client, data from the originating client needs to be stored and passed along. This data is stored as header information that's added to the request. These headers are summarized in Table 5.3.

Table 5.3 Custom Header Information 

MS-RFServer (example: WebServer1)
    The name of the server from which the request is coming.

MS-RFClient (example: 192.168.100.001)
    The IP address of the client that made the original request.

MS-RFUsername (example: Username)
    The user name, if any, of the client that made the original request.

MS-RFTTL (example: 4)
    The number of times that this request has been forwarded. By default a request can be forwarded only once.

MS-RFHostHeader (example: Microsoft.com)
    The original host header requested by the client. See the "Sites bound by IP address" sidebar.

Sites bound by IP address 

Normally, sites are identified by a combination of ip:port:hostheader. A site that is bound by IP address will have a binding of the form ip:port:. The problem that the request forwarder faces is that if a request is forwarded to another server, the IP address that is forwarded will be the IP address on the back-end adapter, not the front-end adapter that the binding is for. The request forwarder automatically adds new headers to each IP-bound site to enable request forwarding. The new header takes the form port:ACv1VSite#, where # is the InstanceID of the virtual site. Because the host header has to be altered to enable forwarding—overwriting the original header information—the request forwarder adds the MS-RFHostHeader to store the original host header information. When the request forwarder detects the presence of the MS-RFHostHeader, it restores the originating header information on the target.

Note Because IIS is unaware that forwarding has occurred, if an ISAPI or ASP application is to retrieve the client information correctly, it must check for these special headers first. If the headers are not present, the application sends a request to IIS for the relevant information.

Security alert—IP address spoofing using HTTP headers 

If the IIS application uses any information in the request forwarder headers to identify a client, it opens itself up to spoofing attacks. For example, the client sends a direct request to Server A by using the GUID or dedicated IP address of a cluster member and spoofs the following values in the HTTP header fields:

  • MS-RFServer:ServerB 

  • MS-RFClient:192.168.001.001 

  • MS-RFUserName:Administrator 

  • MS-RFTTL:1 

This makes Server A think that the request was forwarded by Server B and that the client's source IP address is 192.168.001.001. As a result, ASP and ISAPI filters will perceive the request as coming from Server B, with the client source IP address specified, and authenticated as an administrator.

For Internet clients, IIS applications should treat the request forwarder headers as unreliable information and, where available, use reliable sources (such as IIS itself) for this information—use the request forwarder headers only as a last resort.
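
One way to apply this guidance is to fall back to the request forwarder headers only when the request demonstrably arrived from a cluster member over the back-end network. The following Python sketch illustrates the idea; the header dictionary and the list of back-end addresses are placeholders for whatever your application framework and cluster configuration provide.

KNOWN_BACKEND_ADDRESSES = {"10.0.0.21", "10.0.0.22"}   # placeholder back-end adapter addresses

def effective_client_ip(headers):
    # REMOTE_ADDR is supplied by IIS itself and cannot be set by the client.
    remote_addr = headers.get("REMOTE_ADDR")
    forwarded_client = headers.get("MS-RFClient")
    if forwarded_client and remote_addr in KNOWN_BACKEND_ADDRESSES:
        # Trust MS-RFClient only when the request arrived from a cluster
        # member over the back-end network.
        return forwarded_client
    return remote_addr

print(effective_client_ip({"REMOTE_ADDR": "10.0.0.21",
                           "MS-RFClient": "192.168.100.1"}))   # forwarded request
print(effective_client_ip({"REMOTE_ADDR": "203.0.113.9",
                           "MS-RFClient": "192.168.1.1"}))     # spoof attempt ignored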

The request forwarder filter has to modify the appropriate headers to ensure that the request gets forwarded to the correct cluster member. The forwarding server also has to parse returning responses to ensure that notifications, the "Connection:close" notification in particular, are caught.

HTTPS

Because HTTPS requests are encrypted, the receiving server has to perform a full Secure Sockets Layer (SSL) handshake before it can examine the routing cookie. After the routing information is decrypted, the forwarding server encrypts the request and sends it to the correct server. We recommend that you configure NLB with either Single or Class C affinity because this ensures the session stability that's needed for encryption.

Note Because the request forwarder uses host-headers to determine the destination server, and that information is encrypted in the packet, the request forwarder supports SSL on a per-port basis only.

Application Center supports only one SSL site per cluster on port 443 because encrypted requests cannot be forwarded to multiple sites. You can overcome this limitation by binding SSL to non-standard ports (that is, ports other than 443) for the other sites. However, most public sites will want to use port 443 for obvious reasons.

Request Forwarder Filter Positioning

There are several instances where the request forwarder filter has to be positioned correctly in the filter priority chain to ensure that request forwarding is handled correctly. These instances are:

  • In order to perform the SSL handshake required for HTTPS, the filter must be installed after the SSPI. 

  • If there are ISAPI filters in use that remap URLs, and authors want requests to these URLs to be handled by the request forwarder, these ISAPI filters must be ahead of the request forwarder filter. 

Performance Counters and Error Messages

Application Center contains a collection of performance counters and error messages that are specific to the request forwarder.

Performance Counters

Table 5.4 lists the counters that are available and can be accessed through the Windows Performance Monitor.

Table 5.4 Available Request Forwarder Performance Counters 

Each counter reports the total number of the items described, in the unit shown in parentheses.

Total Application Center Administration Requests (Integer)
    Requests received by the Application Center Administration site.

Total Application Center Administration Requests/sec (Integer/time)
    Requests received by the Application Center Administration site, expressed on a per-second basis for a given period of time.

Total Coherent Session Requests (Integer)
    Requests for pages requiring session coherency.

Total Coherent Session Requests/sec (Integer/time)
    Requests for pages requiring session coherency, expressed on a per-second basis for a given period of time.

Total Dynamic Requests (Integer)
    Requests for dynamic content.

Total Dynamic Requests/sec (Integer/time)
    Requests for dynamic content, expressed on a per-second basis for a given period of time.

Total Failed Requests (Integer)
    Failed requests.

Total Failed Requests/sec (Integer/time)
    Failed requests, expressed on a per-second basis over a given period of time.

Total Forwarded Requests (Integer)
    Requests forwarded.

Total Forwarded Requests/sec (Integer/time)
    Requests forwarded, expressed on a per-second basis over a given period of time.

Total Publishing Requests (Integer)
    Requests submitted by a publishing tool such as Microsoft FrontPage 2000 (FrontPage).

Total Publishing Requests/sec (Integer/time)
    Requests submitted by a publishing tool, such as FrontPage, expressed on a per-second basis over a given period of time.

Total Requests (Integer)
    Requests received by the request forwarder.

Total Requests/sec (Integer/time)
    Requests received by the request forwarder, expressed on a per-second basis over a given period of time.

Total Web Administration Requests (Integer)
    Requests received by the Web Administration site.

Total Web Administration Requests/sec (Integer/time)
    Requests received by the Web Administration site, expressed on a per-second basis over a given period of time.
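
Because the exact performance object and counter names can vary, a simple way to locate them on a member is to enumerate the installed performance objects. The following Python sketch uses the pywin32 PDH wrapper to list any objects whose names mention Application Center, together with their counters.

import win32pdh  # pywin32

objects = win32pdh.EnumObjects(None, None, win32pdh.PERF_DETAIL_WIZARD, 0)
for obj in objects:
    if "Application Center" in obj:
        counters, instances = win32pdh.EnumObjectItems(
            None, None, obj, win32pdh.PERF_DETAIL_WIZARD)
        print(obj)
        for counter in counters:
            print("   ", counter)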

Error Messages

The request forwarder generates a single custom error on a per-virtual site basis (a 502.1 error) for the events listed in Table 5.5, which also lists the corresponding error message for each event.

Table 5.5 Request Forwarder Error Events and Messages 

Event                   Message

Server offline          Unable to forward request to an offline member.
Memory error            Unable to process request; out of memory.
Incorrect extension     Extension .rqfw reserved for use by Application Center. (1)
No controller error     Unable to forward request to controller.
Instance error          Unable to retrieve the InstanceID of the request.
Metabase error          Unable to retrieve the property from the metabase.
Port error              Unable to retrieve, or invalid, port.
IP error                Unable to retrieve the IP number.
Initialize error        Initialization error prior to forwarding.
Get hop count error     Unable to retrieve hop count from IIS.
Set hop count error     Unable to set hop count header.
Hop count error         Too many hops, not forwarding.
HRESULT error           Windows HRESULT error.

1 The request forwarder returns error 404 when it receives a request to an incorrect extension (.rqfw).

Each error is accompanied by the following information:

"While acting as a transparent gateway, the server attempted to contact an upstream content server and received a response that was not valid."

The error file is installed in the IIS custom errors directory, and the filter transmits the file by using the IIS TransmitFile support function. The actual response code sent to IIS reads "502.1 Transparent Gateway Error". There is a special metabase entry in the root node of each virtual site that specifies the location of this file. Setting this entry, ID 57615, is supported at the per-virtual site level only. In the event that this file cannot be found in the custom file list for the Web site, or if the file is missing, there is an abbreviated version of the file's messages stored as a resource string.

Scenarios that Require Request Forwarding

In an Application Center cluster, the following scenarios require request forwarding:

  • Applications with server-side state—Applications that have server-side state, typically ASP applications that use session variables, must have their clients return to the same member for every request in a single session. The request forwarder works with any type of dynamic content. In this scenario, you should evaluate the appropriateness of the request forwarder as an architectural decision—a high volume of forwarded requests could hurt scalability and performance. You should configure your load balancer for session stickiness in this scenario. For example, in an NLB cluster, configure load balancing to Single affinity. The request forwarder is there to catch the small percentage of requests that fall through. 

    Note A list of static file types (shown in the Advanced Load Balancing Options dialog box) is maintained as a global table stored at the AppCenter/Cluster level. If an incoming file extension is not found in this table, it's considered to be dynamic and will be treated as a sticky file. You can reduce or expand this list according to the file types handled at your site. Remember that while there is a performance cost as the list increases in size, there is an even greater performance cost associated with forwarding files unnecessarily. 

  • Cluster publishing with FrontPage or WebDAV—These HTTP-based publishing requests need to be directed to the cluster controller to ensure that information is synchronized across the cluster correctly. Any processing overhead associated with synchronizing FrontPage or WebDAV-supported directories and files, which change infrequently, is offset by two gains. First, consistent behavior is ensured when using these publishing tools in a cluster environment; and second, the presence of these directories and files on every member ensures that it is possible to promote a reliable member to controller. Forwarding for FrontPage and WebDAV publishing is disabled by default. You can change these settings on the Request Forwarding tab on the clustername Properties dialog box. (See Figure 5.10.) 

    Note Typically, there is a delay in getting new content pushed out from the controller to all the cluster members. This lag may result in a user seeing old content when they browse a site simply because the new pages haven't been synchronized to the server that received the client request. 

  • IIS/Application Center Web-based administration—Because all administrative actions for IIS and Application Center HTML administration must be executed on the controller, these requests must be forwarded to the controller to ensure that changes are correctly synchronized to members. When the request forwarder starts, it caches the site identifier for the IIS and Application Center Administration sites and forwards requests accordingly. 

Request Handling Examples in a Network Load Balancing Cluster

The following drawings illustrate client request handling in an NLB cluster. In the first scenario (Figure 5.7), session state (coherency) is not maintained; in the second scenario (Figure 5.8), session state is maintained.

Stateless Session

In this scenario, the client request is sent to the cluster IP address—seen by the client as a single host—and NLB routes the request to the appropriate host based on the load-balancing algorithm that's in use.

Figure 5.7 Client request handling on a load-balanced cluster without session coherency 

Processing Activities and Their Sequence

The processing steps for handling this type of client request are as follows:

  1. A client sends a request to the Web site, https://samples.microsoft.com, which has a cluster IP address of 207.46.130.30. 

  2. All of the members hear the incoming request and resolve its destination according to the cluster's load-balancing algorithm, which is based partially on NLB client affinity. 

  3. Server A, designated as the target, picks up the client request. 

  4. Server A resolves the client request, and then returns the requested item to the client. 

Stateful Session

This scenario is similar to the preceding one in that the client request is sent to the cluster IP address—seen by the client as a single host—and NLB routes the request to the appropriate host based on the load-balancing algorithm that's in use. However, the client request originates from a proxy server environment, so it's necessary to use the HTTP request forwarder to maintain session coherency and track which server handled the first client request. The next illustration, Figure 5.8, shows how two requests from the same client are handled by Application Center. The upper part of the drawing ("First Pass") shows how the first request is handled, and the lower part of the drawing ("Second Pass") shows how a subsequent request that was balanced to a different server would be processed by using request forwarding.

Processing Activities and Their Sequence

The processing steps for handling this type of client request during the first pass are as follows:

  1. A client sends a request for the Web site (https://samples.microsoft.com, which has the cluster IP address 207.46.130.30) to the proxy server farm. The request is handled by Proxy 2, which gives the request an originating address of 207.46.130.14. 

  2. Proxy server 2 sends the request to the Application Center cluster. 

  3. All of the members hear the incoming request and resolve the request based on the load-balancing algorithm, which is determined partially by the cluster's NLB client affinity. Server A picks up the client request. 

  4. The request forwarder on Server A determines if a cookie is required and, if so, creates one for the session. 

    Note At this point, the member that's designated to handle the request determines whether or not request forwarding is required. If it is, the member checks to see if session coherency is enabled. Among the other checks made at this point are: the nature of the request (for example, an ASP session or FrontPage publishing), whether the requested page is one of the files that should not be forwarded, and whether a cookie needs to be generated. If required, the member creates a routing cookie that contains information that uniquely identifies the server that owns the sticky client session. 

  5. Server A sends the cookie and the requested page to the proxy server that initiated the request, Proxy 2. 

  6. Proxy 2 sends the requested page and cookie to the client. 

    Figure 5.8 Client request handling on a load-balanced cluster with session coherency 

The processing steps for handling a client request during the second pass are as follows:

  1. The client sends a request that's handled by the proxy server Proxy 1. 

  2. Proxy 1 assigns an originating IP address of 207.46.130.45 to the request, and then sends the request to the cluster. 

  3. The members resolve the request based on the load-balancing algorithm, which is determined by the cluster's NLB client affinity. This time, Server B picks up the client request. 

  4. Server B determines that the requested file is dynamic and checks for a cookie in the request header. Based on the cookie information, Server A is identified as the target for the request. The request forwarder sends the request to Server A over the back-end network. 

    Note A new cookie may be created if a request arrives that doesn't have its sticky flag enabled or if the routing information points to an invalid target. 

    If the member specified in the cookie is out of service or offline, the request is handled locally. The original ASP session information is removed from the cookie and changed to reflect the new sticky server. 

  5. Server A prepares the response, which is then pipelined by the request forwarder to Server B. 

  6. Server B passes the requested page to Proxy 1. 

  7. Proxy 1 sends the response to the client. 

The preceding steps provide a simple summary of the process that takes place when session coherency is enabled and the HTTP request forwarder is used.

Network Load Balancing Administration

The Application Center user interface enables you to work with the various aspects of load balancing on a cluster through properties dialog boxes at the cluster and member nodes level.

Setting a Server Offline or Online

Right-clicking an individual member node exposes the pop-up menu to take a member out of the load-balancing loop (Set Offline). If you haven't already drained the member's connections, you'll be prompted by the dialog box shown in Figure 5.4 earlier in this chapter. As previously noted, you can accept the default draining period of 20 minutes, enter your own one-off draining period, or take the member out of the load-balancing loop immediately. Unless you're working in a test environment, you should allow the member to drain its connections. This enables a cleaner NLB convergence by reducing the risk of disrupted user sessions.

Note Even if the member being drained has no more connections coming in from NLB, the member is drained for the full draining period to ensure that any potential requests that are forwarded from other cluster members are handled.

Adding a member to the load-balancing loop is done in the same way as taking a member out of the loop. The only difference is that a draining period isn't required. After NLB receives a heartbeat message from the new member, convergence is initiated and a new load balancing membership list is calculated along with revised load distribution.

Configuring Load Balancing Weights

There are several scenarios where you might want to adjust the relative amount of traffic that an individual member handles:

  • You need to compensate for hardware-based performance differences. (This is the most common scenario.) 

  • You have a cluster in which updates to the Web server occur frequently and a large volume of data is replicated throughout the day. In this instance, it would make sense to have the controller respond to fewer client requests and dedicate more resources to cluster synchronization. Alternatively, you could set the controller offline for load balancing during this replication period. 

  • You have a cluster member that's also handling FTP and e-mail traffic. Once again, you can free up resources on this member by reducing the relative amount of HTTP requests to which it responds. 

Regardless of the client affinity that's assigned to the cluster, you can adjust load-balancing weights on a per-server basis by using the membername Properties dialog box shown in Figure 5.9.

As you can see in Figure 5.9, a member's load balancing weight is adjusted by using a slider that operates on a scale from 1 through 20. By default, Application Center sets every member's weight at the mid-point in this scale when you create a cluster and add members.

Note When you adjust the load balancing weight on a single member, the load mapping for the entire cluster is recalculated.

The weight scale's range does not represent a percentage of the client traffic distribution. In order to determine what percentage of the traffic every member receives, you have to do the following calculation:

  • Add up the total of the server weights for the cluster 

  • Calculate each member's server weight as a percentage of the total weight for the cluster 

Figure 5.9 ACDW518AS Properties dialog box for adjusting server weight 

Table 5.6 shows how server weights can be expressed as a percentage of the cluster traffic in a cluster of five members.

Note For this example, we arbitrarily decided that the weights represented in the scale ran from 1 through 100, and that the average load would have a value of 50.

Table 5.6 Sample Server Weight Calculation 

Member   Weight   Total weight for cluster   Percentage of traffic

A        50       250                        20
B        75       250                        30
C        45       250                        18
D        30       250                        12
E        50       250                        20

As the example in Table 5.6 illustrates, it takes a fairly significant difference in server weights to create a large difference in the percentage of the cluster traffic that each member receives; for example, the difference in weight between members A and B is 25, yet the traffic percentage difference is only 10. However, this weight-to-percentage ratio changes according to the size of the cluster. Let's assume that you have a cluster that consists of 10 members and that the total cluster weight is 500. In this scenario, members A and B have the same weights as in the five-member cluster. The new percentage of the load handled by each member is 10 and 15 percent, respectively. The gap between the two is reduced to 5 percent from 10 percent.
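
The percentage calculation itself is trivial to script, which makes it easy to preview the effect of a weight change before you apply it. The following Python sketch reproduces the figures in Table 5.6 and the ten-member variation discussed above.

def traffic_percentages(weights):
    total = sum(weights.values())
    return {member: 100.0 * weight / total for member, weight in weights.items()}

five_member = {"A": 50, "B": 75, "C": 45, "D": 30, "E": 50}
print(traffic_percentages(five_member))      # A: 20.0, B: 30.0, C: 18.0, D: 12.0, E: 20.0

# Ten-member cluster with a total weight of 500: A and B keep the same
# weights, but their shares drop to 10 and 15 percent.
print(100.0 * 50 / 500, 100.0 * 75 / 500)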

Tip If you want to set load balancing weights on a per-member basis, you should do a similar calculation before establishing new load weights. This will enable you to get a better picture of the impact of these changes on load distribution across the cluster. Chapter 10, "Working with Performance Counters," provides an example of load-balancing weight adjustment on a cluster that has load applied to it (by using the Web Application Stress tool).

Note You can also manually configure load-balancing weight for a member on a port rule basis.

Configuring Client Affinity

You can change a cluster's client affinity by opening the clustername Properties dialog box from the clustername node in the Application Center snap-in. By using the provided drop-down list, you can change the default affinity setting from Single to None or Class C.

Tip If you decide that it's necessary to change a cluster's load-balancing algorithm by changing client affinity, you should consider doing this during a period when cluster traffic is at a minimum. Even with NLB's effectiveness, convergence can have an impact on your clients' connections. Remapping the cluster during off hours will help reduce the risk of having a negative impact on client sessions.

Configuring Request Forwarding

You can configure request forwarding, and with it how session state is handled (a process described in detail in "Maintaining Session State with Network Load Balancing" earlier in this chapter), via the Request Forwarding tab on the RKWebCluster Properties dialog box, shown in Figure 5.10.

Figure 5.10 Request Forwarding tab on the RKWebCluster Properties dialog box, which is used for configuring request forwarding on a cluster 

Let's examine the different elements of the advanced load balancing options configuration in more detail.

Your configuration options for advanced load balancing are as follows:

  • Enable Web request forwarding—Clearing this check box disables request forwarding for dynamic HTTP requests. In the request forwarding logic framework illustrated in Figure 5.6, this means that "Always Add Cookies" evaluates to false and "Ignore ASP Session States" evaluates to true. As a result, HTML-based administration (HTMLA), FrontPage, and DAV publishing requests can still be forwarded, and Application Center Administration requests are forwarded. 

  • Enable for Web sites using ASP session state only—When this option is selected, forwarding is enabled for all sites that have the ASP Allow Session State Enabled property set to true. This is an inheritable property, which is true (the default) for all virtual directories when IIS is installed. This means that all sites have ASP session coherency enabled by default in the IIS space. 

  • Enable for all Web sites—Clicking this option means that forwarding cookies will always be added to a client request. In terms of the process flow chart, "Always Add Cookies" evaluates to true. 

  • Forward Distributed Authoring and Versioning (DAV) requests and Forward FrontPage publishing requests—These options enable you to specify the type of tool that you're using to publish content. If either of these check boxes is selected, the corresponding requests are automatically forwarded to the controller. 

Note The MD_RF_FORWARDING_ENABLED_FOR_VDIR property allows you to specify whether a virtual directory has forwarding disabled or whether the virtual directory requires all requests to be forwarded to the controller. If this property is set to 0 or 1, it enables or disables forwarding for the VDir. If the property is set to 2, requests will always be forwarded to the controller.
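
As an illustration of how a deployment script might interpret this property, here is a short Python sketch. Only the meaning of the value 2 (always forward to the controller) is stated explicitly above; treating 1 as "enabled" and 0 as "disabled" is an assumption made here for the example.

FORWARDING_DISABLED = 0        # assumed meaning of 0
FORWARDING_ENABLED = 1         # assumed meaning of 1
FORWARD_TO_CONTROLLER = 2      # documented: always forward to the controller

def describe_vdir_forwarding(value):
    return {
        FORWARDING_DISABLED: "forwarding disabled for this virtual directory",
        FORWARDING_ENABLED: "forwarding enabled for this virtual directory",
        FORWARD_TO_CONTROLLER: "all requests forwarded to the controller",
    }.get(value, "unknown setting")

print(describe_vdir_forwarding(2))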

Component Load Balancing

Component Load Balancing (CLB) provides dynamic load balancing for COM+ application components. With CLB enabled, an Application Center COM+ application cluster instantiates components in response to requests from an Application Center Web cluster, from a COM+ routing cluster, or from clients running the Win32 API. Functionally, an Application Center Web cluster and a COM+ routing cluster are the same; both support CLB and can route requests to a COM+ application cluster.

Note Only COM+ object activation, or instantiation, is load balanced by CLB; queued components cannot be load balanced.

By providing a mechanism for segregating the component tier and load balancing COM+ component requests, this feature:

  • Lets you scale out the component layer independently of the Web tier. 

  • Allows you to distribute the workload across tiers. 

    Note Typically your performance gains are greater when you scale your Web tier than when you distribute applications on a component server tier. 

  • Enables you to manage the business logic tier independently of the front-end Web tier or back-end database tier. 

  • Provides an additional layer for securing applications. 

    Note As is the case with NLB, there is no single point of failure because each member in a COM+ routing cluster or Web cluster functions as a router. 

CLB uses a hybrid of adaptive load balancing and round-robin processing to distribute client requests across a cluster. Let's examine the different elements of CLB and their role in load balancing COM+ component requests.

CLB Architecture

The key elements of the CLB architecture are as follows:

  • The component server routing list (on the Web cluster or COM+ routing cluster) 

  • The COM+ CLB service (on the Web cluster or COM+ routing cluster) 

  • The CLB Tracker and Tracker objects (on the COM+ application cluster) 

  • The CLB Activator 

The Component Server Routing List

After you've created a Web cluster or COM+ routing cluster (the front-end members), you have to explicitly enumerate the members on the COM+ application cluster (the back-end members) that you want to handle component requests. This option, shown in Figure 5.11, is applied to the entire cluster—you cannot create a separate routing list for individual members.

Tip You can discover the members of a COM+ application cluster by running the following commands from the command line:

  • On the routing cluster's controller, type: ac clb /listmembers 

  • On the COM+ application cluster controller, type: ac cluster /listmembers 

Bb734910.f05uj11(en-us,TechNet.10).gif 

Figure 5.11 RKWebCluster Properties dialog box with a Component servers routing list 

The list of component servers on the back end is stored in the metabase and the registry, and the CLB service references it to determine which members of the COM+ application cluster have to be polled.

The COM+ CLB Service

This Application Center service polls the COM+ application cluster members to obtain individual COM+ server response times.

Note Because the COM+ CLB service runs on every member of the front-end (Web cluster or COM+ routing cluster), each member maintains its own list of component server response times. This design means that back-end member response time information isn't lost if one of the front-end members fails.

After obtaining response time information for each member that it's aware of—determined by the component server list—the COM+ CLB service organizes the list of polled members in ascending order according to their response times and writes this information to a shared memory table on the member that did the polling.

Polling

Members of a Web cluster or COM+ routing cluster that activate components on a COM+ application cluster poll the COM+ cluster's members every 200 milliseconds to obtain their response times. A member's response time, relative to the other members, indicates the load on that member. After each poll, the members are placed in a table in order of increasing response time, and subsequent activation requests are sent to the members in the order in which they appear in the table. This table of server response times is the pivotal element for distributing component requests across a CLB cluster.

The CLB Tracker and Tracker Objects

The COM+ CLB service uses two objects during polling for gathering response time information:

  • The CLB Tracker object, shown in Figure 5.12, ships with Application Center. 

  • The Tracker object, shown in Figure 5.13, ships with the Windows operating system. 

    Bb734910.f05uj12(en-us,TechNet.10).gif 

    Figure 5.12 The CLB Tracker object and its interfaces 

    Application Center installs the CLB Tracker object on a computer during the set-up process. The CLB Tracker is activated only on COM+ application cluster members that are being polled by a routing cluster. 

    Bb734910.f05uj13(en-us,TechNet.10).gif 

    Figure 5.13 The Tracker object and its interfaces 

The Tracker object is installed as part of the Windows 2000 Server set-up process and is active on all servers running Windows 2000 Server. This object's instantiation on a server provides the response time data that's used to determine which COM+ server should handle incoming client requests.

The Polling Process

The polling process for a single front-end routing server and a single back-end application server consists of the following steps, sketched in code after the list:

  1. The COM+ CLB service on a front-end member reads the component server routing list to determine which members have to be polled. 

  2. The service calls into an instance of the CLB Tracker object running on the first member in the list. 

  3. The CLB Tracker object calls into the Tracker object and gathers response time information for the target. 

  4. The CLB service receives the response time information from the CLB Tracker object and stores it in memory. 

  5. The CLB service moves to the next member in the routing list and repeats the preceding steps until every member in the component server list is polled. 

  6. The CLB service orders the list of members (and their response times) that it's holding in memory in ascending order according to response time, and then writes this information to a shared memory table on the front-end member. 

  7. In 200 milliseconds, the polling process is repeated. 
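The following sketch models the preceding steps: it walks the routing list, simulates the CLB Tracker call, sorts the results in ascending order of response time, and refreshes the shared table every 200 milliseconds. The tracker call and member names are simulated; this is an illustration of the flow, not the actual service.

    import random
    import threading
    import time

    POLL_INTERVAL = 0.2  # the 200-millisecond polling period

    def get_response_time(member):
        """Stand-in for calling the CLB Tracker object on a back-end member.

        Returns a simulated response time in milliseconds, or None if the
        member can't be contacted (it then drops out of the table until the
        next successful poll).
        """
        if random.random() < 0.05:          # simulate an unreachable member
            return None
        return random.uniform(10, 100)

    def poll_once(routing_list):
        """Steps 1-6: build and sort the response-time table for one pass."""
        table = []
        for member in routing_list:
            response_time = get_response_time(member)
            if response_time is not None:
                table.append((member, response_time))
        table.sort(key=lambda entry: entry[1])   # ascending response time
        return table

    def polling_loop(routing_list, shared):
        """Step 7: repeat the poll every 200 milliseconds."""
        while True:
            shared["table"] = poll_once(routing_list)   # the shared memory table
            time.sleep(POLL_INTERVAL)

    # Hypothetical component server routing list and shared memory table.
    routing_list = ["S1", "S2", "S3"]
    shared = {"table": []}
    threading.Thread(target=polling_loop, args=(routing_list, shared), daemon=True).start()
    time.sleep(1)
    print(shared["table"])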

Is it a heartbeat? 

In a sense, the component server polling activity serves the same purpose as an NLB or Application Center cluster heartbeat.

If an instance of the CLB Tracker object can't be contacted on the target, the member's response time can't be added to the response-time table. When this table is parsed to determine where to route a COM+ request, the member doesn't exist. For all intents and purposes, the member is offline for CLB.

If the member can be polled during the next cycle, its name will reappear in the response-time table and it will be back in the load-balancing loop.

Figure 5.14 illustrates the polling process with a single front-end member and three back-end CLB cluster members. There is some overhead involved in having every member of a front-end cluster poll each of the back-end members, and this overhead, along with the time it takes requests to traverse a network, has to be taken into account when you're planning on distributing an application across tiers.

Bb734910.f05uj14(en-us,TechNet.10).gif 

Figure 5.14 The CLB server polling process 

Now let's take a look at the remaining piece of CLB, the CLB Activator.

The CLB Activator

This program processes all the incoming CoCreateInstance and CoGetClassObject requests for components marked as "supports dynamic load balancing" and, after parsing the response-time table, changes the incoming RemoteServerName value to the name of the component server that should handle the request. COM+ on the routing server then forwards the request to COM+ on the selected component server, which instantiates the object and returns a response to the original client with its server address. All subsequent method calls on the object are made directly from the client to the component server for the lifetime of the object.

Let's use Figure 5.15 to demonstrate this CLB routing process. For the purpose of this example, let's assume that the server response-time table has just been updated for the front-end member and that the first request is coming in.

After the CLB Activator receives the incoming CoCreateInstance, it parses the response-time table. Server S3 has the lowest response time (25 ms). Therefore, it has the lowest load and should receive the request. The CLB Activator changes the value of RemoteServerName to "S3" and passes this information to COM+, which in turn directs the request to S3. When the next CoCreateInstance request comes in, the CLB Activator implements the round-robin aspect of CLB: it identifies S1 as the next least-loaded cluster member and changes the RemoteServerName value to "S1". Once again, this information is passed to COM+, which sends the new request to S1.

After the CLB Activator processes the last server in the list, it moves to the top of the list and continues assigning server names in round-robin fashion. This looping through the server list to handle incoming requests continues on the front-end member until the response-time table is updated after the next polling period. With new values in place, the CLB Activator starts at the top of the server list and works through the list until the next polling update to the response-time table.

Bb734910.f05uj15(en-us,TechNet.10).gif 

Figure 5.15 The CLB routing process 

Note In terms of processing overhead, the greatest hit occurs during the process of identifying the appropriate host and instantiating the object on that member. After that point, client-to-member communication is direct and is not slowed down by any intervening layers.
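The round-robin walk over the sorted table can be sketched as follows. The class and member names are illustrative only; Application Center's actual implementation lives inside the CLB Activator and COM+.

    class ClbActivatorSketch:
        """Illustrative model of how activation requests are assigned to members.

        The table is the list of (member, response_time) pairs produced by a
        polling pass, already sorted in ascending order of response time.
        """

        def __init__(self, table):
            self.update_table(table)

        def update_table(self, table):
            # A fresh table (after each 200 ms poll) restarts the rotation
            # at the least-loaded member.
            self.members = [member for member, _ in table]
            self.next_index = 0

        def route(self):
            """Return the member name to substitute as RemoteServerName."""
            if not self.members:
                raise RuntimeError("no component servers are currently reachable")
            member = self.members[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.members)
            return member

    # Table as in Figure 5.15: S3 has the lowest response time, so it is first.
    activator = ClbActivatorSketch([("S3", 25), ("S1", 40), ("S2", 60)])
    print([activator.route() for _ in range(5)])   # S3, S1, S2, S3, S1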

Component Load Balancing Scenarios


Using CLB on a back-end cluster in combination with various front-end cluster configurations provides numerous opportunities to develop multi-tier clusters to host your various applications. There are three primary CLB models that Application Center supports:

  • Two-tier with full load balancing—A front-end Web cluster passes requests to a back-end CLB cluster. 

  • Three-tier with full load balancing—A front-end Web cluster passes requests to a load-balanced middle tier that routes requests to a back-end CLB cluster. 

  • Three-tier with fail over—A front-end Web cluster passes requests to a middle tier of two members (one member acts as a backup and doesn't serve requests), which routes requests to a back-end CLB cluster. 

When should you use a multi-tier load balancing topology? 

Although Application Center supports multi-tier scenarios, you should not implement these scenarios simply as de facto models for distributing application components. There are good reasons for distributing applications across tiers, as well as keeping everything on a single tier. You have to fully analyze your technical and business requirements before making the split/no-split decision. There is no automatic, or right, answer to this question.

The main reasons for setting up separate Web and COM+ application server tiers include:

  • Security—An additional layer of firewalls can be placed between one tier and the other. 

  • Administrative partitioning—Different groups of developers and administrators are responsible for the HTML/ASP and COM+ applications. Putting the two groups on different tiers helps prevent conflicts between them. 

  • Sharing a single COM+ application cluster among multiple Web clusters. 

  • In some scenarios (for example, a low-throughput environment where each request is very expensive), spreading the multiple costly COM+ calls generated by a single Web request across several servers in a CLB-enabled COM+ application cluster can improve response time, although not throughput. 

Note In a high-throughput environment, this benefit is muted, because load will be balanced evenly around the cluster whether it is multi-tier or not.

The main reasons for choosing not to set up separate Web and COM+ application server tiers include:

  • Performance—Remote access is more expensive than running locally and overall performance will degrade if a single front-end cluster is split into two clusters without adding more hardware. 

  • Administrative complexity—Managing two clusters is more complex than managing one. 

  • It is difficult to make full use of the hardware—You must carefully balance hardware between the Web cluster and COM+ application cluster. Adding capacity becomes more complex, and it's likely that one tier or the other will end up with less capacity. This causes a bottleneck that requires more hardware to maintain optimal headroom; in addition, more monitoring is necessary to balance hardware utilization. 

  • Dependency maintenance—Whenever a member is added to the COM+ application cluster, the cluster on the front end must be updated with the new membership list of the back-end component members. 

Note You can use a routing cluster, but this approach has its own set of problems: you have an additional tier to manage, with its inherent monitoring and throughput costs.

Two-Tier with Full Load Balancing

In the two-tier model shown in Figure 5.16, an Application Center Web cluster that is using NLB on the front end acts as a component routing cluster to component servers on the back-end cluster, which uses CLB. Out-of-process COM+ object calls are made to the CLB cluster members. The component server routing list and server response-time table reside on each front-end cluster member.

The following table (Table 5.7) summarizes the cluster settings for the two clusters in Figure 5.16.

Table 5.7 Key Cluster Settings for a Two-Tier Cluster 

Setting                                   Cluster on front tier   Cluster on back tier
NLB (1)                                   Yes                     No
Is a router                               Yes                     No
Component installed                       COM object              COM object
COM proxy remote server name (RSN)        n/a                     n/a
Component is marked for load balancing    Yes                     No

1 You can achieve this with third-party load balancing as well.

Bb734910.f05uj16(en-us,TechNet.10).gif 

Figure 5.16 Two-tier cluster model with NLB and CLB clusters 

In this two-tier scenario, the front-end cluster uses NLB to distribute incoming client HTTP requests. The appropriate COM+ objects and applications on the front-end members are configured to support load balancing and the cluster is set up as a router—a role that you establish by selecting the General/Web cluster option as the cluster type when you create the cluster by using the New Cluster Wizard. This routing cluster that you create can handle both HTTP and component requests. (The back-end cluster with COM+ servers shown in Figure 5.16 is the COM+ application cluster, which is also created with the New Cluster Wizard, with COM+ application cluster selected as the cluster type).

Three-Tier with Full Load Balancing

This next model is virtually identical to the fail over model discussed in the next section. The notable difference is that full load balancing is enabled across the middle tier of routing servers. Once again, HTTP traffic is handled for the most part by the front end, while the middle cluster is dedicated to handling component requests from the front end and from clients running the Win32 API.

Table 5.8 summarizes the cluster settings for the three clusters in Figure 5.17.

Table 5.8 Key Cluster Settings for a Three-Tier Cluster with Full Load Balancing 

Setting                                   Cluster on front tier   Cluster on middle tier   Cluster on back tier
NLB                                       Yes                     Yes                      No
Is a router                               Yes                     Yes                      No
Component installed                       COM Proxy               COM Object               COM Object
COM proxy RSN                             Middle cluster IP       n/a                      n/a
Component is marked for load balancing    No                      Yes                      No

Bb734910.f05uj17(en-us,TechNet.10).gif 

Figure 5.17 A three-tier cluster with load balancing across all three tiers 

Using CLB with rich clients 

If you configure clients that are using the Win32 API to use the Web cluster's IP address or name (which you have to register with a name service), NLB will load balance the instantiation requests across the Web/CLB routing cluster. The selected routing member then dynamically re-routes the instantiation request to one of the COM+ application servers by using the CLB dynamic load-balancing algorithm. The following actions occur during this process:

  1. The client process running the Win32 API issues a CoCreateInstance (CCI) call. 

  2. OLE and the Service Control Manager (SCM) on the client running the Win32 API find the proxy and forward the CCI over a TCP connection to the Web/CLB routing cluster address. 

  3. NLB on the Web/CLB routing cluster routes the TCP connection to one of the Web/CLB routing members based on the NLB load-balancing algorithm. 

  4. The NLB-designated Web/CLB routing member accepts the connection and hands the CCI to its SCM. 

  5. The SCM on the Web/CLB routing member determines that the request is for an object instantiation of a component marked as supporting dynamic load balancing and selects a COM+ application server based on the CLB load-balancing algorithm. 

  6. The SCM on the Web/CLB routing member resends the CCI with the client address of the client running the Win32 API over a TCP connection to the selected COM+ application server. 

  7. The COM+ application server accepts the TCP connection and hands the CCI to its SCM. 

  8. The COM+ application server instantiates the requested component and returns its response directly to the original client running the Win32 API. 

  9. All subsequent method, addref, and release requests are made over direct TCP connections between the client running the Win32 API and the COM+ application server on the CLB tier. 

The trick in getting this to work is that your Win32 client proxy must be configured to make a call to the Web/CLB routing cluster (by name or IP address), rather than the name or dedicated IP address of the exporting server. There are two ways to accomplish this:

  • Open Component Services. Right-click My Computer, and then on the pop-up menu, click Properties. Click the Options tab, and then under Export, set Application proxy RSN to the cluster name or cluster IP address. Then, export the application as an application proxy. The resulting Windows Installer package can be installed on any client running the Win32 API and will automatically point the proxy at the cluster. 

  • Set up a stand-alone server (or take a server off the network), install the COM+ application, and configure the server with the cluster's name and virtual IP address. Next, export the application proxy from this server, and then reconfigure the server's IP address and name to legal values. Finally, add the server to the network. 
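Either way, the net effect is that the proxy's remote server name on the client resolves to the cluster rather than to an individual member. As a purely illustrative aside, that remote server name is ordinarily stored as the RemoteServerName value under the application's AppID in the registry, so it can also be inspected or adjusted by script. The AppID below is a placeholder, and this sketch is not an Application Center procedure.

    # Windows-only sketch; uses the standard library winreg module.
    import winreg

    def set_proxy_remote_server_name(app_id, cluster_name):
        """Point an installed application proxy's RemoteServerName at the cluster.

        app_id is the COM+ application's AppID GUID; the value used below is a
        placeholder, so look up the real AppID for your application first.
        """
        key_path = rf"AppID\{app_id}"
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, key_path, 0,
                            winreg.KEY_SET_VALUE) as key:
            winreg.SetValueEx(key, "RemoteServerName", 0, winreg.REG_SZ, cluster_name)

    # Placeholder AppID and cluster name; substitute your own values before running.
    # set_proxy_remote_server_name("{00000000-0000-0000-0000-000000000000}", "RKWebCluster")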

As you can see from the preceding example cluster topologies, using CLB in conjunction with a load-balanced Web cluster provides a high degree of flexibility. The way in which you can combine these load-balancing technologies will be determined by the particular applications that you want to host.

Chapter 8, "Creating Clusters and Deploying Applications," steps through the creation of a multi-tier cluster that employs NLB and CLB. Chapter 8 also describes how COM objects are installed correctly on the Web cluster and CLB application cluster and enabled for CLB.

Three-Tier with Fail Over

In the three-tier model shown in Figure 5.18, which is a variation on the three-tier model shown in Figure 5.17, a front-end cluster of Web servers passes component requests to a load-balanced middle tier that's set up as a COM+ routing cluster. In this model, only the front end handles HTTP requests; the middle tier handles only component requests.

Note This model mimics a traditional failover scenario, in which the failover state must provide at least as much throughput as the normal state, and it assumes that the backup member is equivalent in power to the cluster controller. If you do not have such a requirement, you should use the general three-tier model outlined in the previous section, because it makes better use of the routing tier's processing capacity.

The middle tier consists of two members, but because the controller is the only member online for load balancing, it receives all the incoming requests. The second member acts as a standby that can take over as the cluster controller if the current controller fails. The standby member is in the synchronization loop, so it has all the configuration settings, such as the component server routing list, necessary for it to step into the current controller's role.

This model also supports access to the middle tier by clients running the Win32 API. COM proxy requests are sent to the cluster controller in the middle tier for processing.

The following table (Table 5.9) summarizes the cluster settings for the three clusters in Figure 5.18.

Table 5.9 Key Cluster Settings for a Three-Tier Cluster with Fail Over 

Setting                                   Cluster on front tier   Cluster on middle tier   Cluster on back tier
NLB                                       Yes                     Yes                      No
Is a router                               Yes                     Yes                      No
Component installed                       COM Proxy               COM Object               COM Object
COM proxy RSN                             Middle cluster IP       n/a                      n/a
Component is marked for load balancing    No                      Yes                      No

Bb734910.f05uj18(en-us,TechNet.10).gif 

Figure 5.18 A three-tier cluster with the middle tier used for fail over 

The main difference between this three-tier scenario and the two-tier scenario shown in Figure 5.16 is that COM activation calls are proxied to the member in the second tier.

Note Although any COM object that is properly installed and marked for load balancing will be load-balanced in a CLB cluster, COM proxies are the exception—they are not load-balanced.

Resources


The following book and Web sites provide additional information about Windows 2000 Advanced Server, NLB, and COM+.

Books

The Internet Information Services Resource Guide (Microsoft Press, 2000). This resource guide is one of the volumes that make up the Microsoft Windows 2000 Resource Kit. The guide provides information not found in the core documentation, along with software tools on a CD.

https://www.microsoft.com/applicationcenter/ 

The Application Center Web site.

https://www.microsoft.com/windows2000/default.asp 

The Windows 2000 Web site provides the most up-to-date information about Windows 2000 Server and other server products, such as Application Center.
