Cache Array Routing Protocol

With the Cache Array Routing Protocol (CARP), an array containing multiple Forefront TMG servers can act as a single logical cache. CARP allows an array to efficiently balance Web-based client load and distribute cached content among array members. CARP provides client computers with the information and algorithms required to identify the best server in the array to serve their request, eliminating the need for array members to forward requests between them. CARP also supports array member selection by the servers and by chained proxy servers.

Note  Arrays that contain more than one Forefront TMG server are supported only in Forefront TMG Enterprise Edition.

CARP uses hash-based routing to determine the best path for resolving a request within an array. The request resolution path is based upon hashes of the array member identities and the host name in the Uniform Resource Locator (URL) specified in the request. For any given host name from a URL, a Web browser can determine exactly where in the array the information is stored if the information has already been cached from a previous request, and array members can determine where the information is to be cached for future requests.
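
As a rough illustration of hash-based routing, the following Python sketch ranks array members for a host name by combining a hash of each member identity with a hash of the host name. It is not the hash function that CARP itself defines: SHA-256 stands in for it, and the member names and the rank_members function are hypothetical.

```python
import hashlib


def _hash32(text: str) -> int:
    """Stand-in 32-bit hash (truncated SHA-256); not CARP's own hash function."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:4], "big")


def rank_members(host: str, members: list[str]) -> list[str]:
    """Return the array members ordered from best to worst for a host name."""
    host_hash = _hash32(host.lower())

    def score(member: str) -> int:
        # Combine the member hash with the host hash, then hash the pair again.
        return _hash32(f"{_hash32(member.lower())}:{host_hash}")

    return sorted(members, key=score, reverse=True)


# Hypothetical array members and host name, for illustration only.
members = ["tmg1.example.com", "tmg2.example.com", "tmg3.example.com"]
print(rank_members("www.contoso.com", members))
```

Because the ranking depends only on the member identities and the host name, any client or array member that runs the same computation arrives at the same list without exchanging messages.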

CARP provides powerful benefits:

  • Because CARP determines the best request resolution path, no query messaging takes place between proxy servers in an array, unlike in conventional Internet Cache Protocol (ICP) networks. CARP thereby avoids the query congestion that grows heavier as the number of servers increases.
  • CARP eliminates the duplication of content that otherwise occurs on an array of proxy servers. With an ICP network, an array of five proxy servers can rapidly evolve into duplicate caches of the most frequently requested objects. The hash-based routing of CARP keeps this from happening by allowing all five Forefront TMG servers to exist as a single logical cache. The result is a faster response to queries and a far more efficient use of server resources.
  • CARP has positive scalability. Due to its hash-based routing and its resultant independence from peer-to-peer pinging, CARP becomes faster and more efficient as more proxy servers are added. ICP arrays must conduct queries to determine the location of cached information. This is an inefficient process that generates extraneous network traffic. ICP arrays have negative scalability: the more servers are added to the array, the more querying is required between servers to locate cached information.
  • CARP automatically adjusts to additions or deletions of servers in the array. The hash-based routing means that, when a server is either taken offline or added, only minimal reassignment of caches for specific URLs is required.
  • CARP ensures that the cached objects are distributed either evenly among all servers in the array or according to the load factor that you configure for each server.
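
The claim that only minimal reassignment is needed when a server leaves the array can be checked with a small Python sketch. It assumes a highest-score-wins selection with SHA-256 as a stand-in hash (not CARP's own function). The member names, host names, and the owner function are hypothetical.

```python
import hashlib


def score(host: str, member: str) -> int:
    """Stand-in combined hash of a member identity and a host name."""
    digest = hashlib.sha256(f"{member}|{host}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def owner(host: str, members: list[str]) -> str:
    """Return the member with the highest score for this host name."""
    return max(members, key=lambda m: score(host, m))


hosts = [f"site{i}.example.com" for i in range(1000)]
before = ["tmg1", "tmg2", "tmg3", "tmg4", "tmg5"]
after = [m for m in before if m != "tmg3"]  # tmg3 is taken offline

moved = [h for h in hosts if owner(h, before) != owner(h, after)]

# Only host names that the removed member owned are reassigned;
# every other host name keeps its original owner.
assert all(owner(h, before) == "tmg3" for h in moved)
print(f"{len(moved)} of {len(hosts)} host names reassigned")
```

Removing a member cannot change which of the remaining members scores highest for a host name, so content already cached on the surviving servers stays where it is.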

How CARP Works

The CARP process provides efficient routing for requests on the client side and on the server side.

Clients can be Web browsers or downstream proxy servers. Client-side CARP works by having the client select an array member for each host name contained in URLs. On the client side, Forefront TMG uses CARP as follows:

  1. Web browsers retrieve a proxy selection script from an array. Forefront TMG generates this script in response to automatic discovery requests and to specific queries for Wpad.dat and Array.dll?Get.Routing.Script that are sent to an array member using its name, its IP address, or the DNS name (DNSName) of the array. The script is specific to the network where the client resides. A downstream proxy server sends an array.dll?Get.Info.v2 request to an upstream proxy server for a script containing a table that lists the fully qualified domain name (FQDN), NetBIOS name, and default IP address of the upstream array members on the network where the downstream server resides.
  2. The host name in the URL entered by the user in a Web browser is passed to the script, which computes a prioritized list of array members that will serve Web content from any URL containing that host name. Each array member in the list is identified by its FQDN, NetBIOS name, or default IP address in the client's network, depending on the value of the CARPNameSystem property of the Forefront TMG Web proxy.
  3. The Web browser connects to the first server in the list and requests that it retrieve the page. If the first server does not respond, the next server in the list is contacted, and so on until the object can be retrieved.
  4. The script always returns the same server list for a given host name, ensuring that content from all URLs containing the same host name is cached on one array server.
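
Steps 2 through 4 can be sketched in Python as follows. This is a simplified model, not the script that Forefront TMG generates: the ranking uses SHA-256 as a stand-in hash, and prioritized_members, choose_proxy, and the responds callback are hypothetical names.

```python
import hashlib
from typing import Callable, Optional


def prioritized_members(host: str, members: list[str]) -> list[str]:
    """Return the same ordered member list every time for a given host name."""
    def score(member: str) -> int:
        digest = hashlib.sha256(f"{member}|{host.lower()}".encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big")

    return sorted(members, key=score, reverse=True)


def choose_proxy(host: str, members: list[str],
                 responds: Callable[[str], bool]) -> Optional[str]:
    """Try each member in priority order; return the first one that responds."""
    for member in prioritized_members(host, members):
        if responds(member):
            return member
    return None  # no array member responded


# Hypothetical array in which the preferred member is down.
members = ["tmg1", "tmg2", "tmg3"]
preferred = prioritized_members("www.contoso.com", members)[0]
print(choose_proxy("www.contoso.com", members, lambda m: m != preferred))
```

When the preferred member is unavailable, the client falls back to the second entry in the list, and it returns to the preferred member as soon as that member responds again.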

With server-side CARP, clients select the array member for sending a request in a predetermined, round-robin, or random fashion. If the array member receiving the request cannot serve the requested object from its cache, it uses the CARP algorithm to create a prioritized list of array members for the host name in the URL and forwards the request to the IP address for intra-array communication (IntraArrayAddress) of the first server in the list. If that array member cannot serve the requested object, the next server in the list is contacted, and so on until the object is served or all the servers in the list have been contacted.
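
The server-side flow can be modeled with the Python sketch below, which lists the members consulted for a request that arrives at an arbitrary array member. It is a simplification under stated assumptions: SHA-256 stands in for the CARP hash, the receiving member is assumed not to be asked a second time, and resolution_path and can_serve are hypothetical names.

```python
import hashlib
from typing import Callable


def prioritized_members(host: str, members: list[str]) -> list[str]:
    """Order the array members for a host name (stand-in hash, not CARP's)."""
    def score(member: str) -> int:
        digest = hashlib.sha256(f"{member}|{host.lower()}".encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big")

    return sorted(members, key=score, reverse=True)


def resolution_path(host: str, receiver: str, members: list[str],
                    can_serve: Callable[[str], bool]) -> list[str]:
    """List the members consulted, in order, for a request received by `receiver`."""
    path = [receiver]
    if can_serve(receiver):
        return path  # served by the member that received the request
    for member in prioritized_members(host, members):
        if member == receiver:
            continue  # assumption: the receiver is not asked a second time
        path.append(member)
        if can_serve(member):
            break
    return path


members = ["tmg1", "tmg2", "tmg3"]
ranked = prioritized_members("www.contoso.com", members)
# The request lands on the lowest-ranked member; only the top-ranked one can serve it.
print(resolution_path("www.contoso.com", ranked[-1], members, lambda m: m == ranked[0]))
```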

Server-side CARP is enabled for Web requests coming from a particular network when the ResolveInArray property of the network's Web listener is set to True. However, no Web requests from the network are handled unless the EnableWebProxyClients property for the network is also set to True. Note also that no objects are cached on any server in the array unless the CacheLimitInMegs property of at least one cache drive of at least one server belonging to the array is set to a nonzero value.

The script generated by Forefront TMG implements the CARP algorithm. The script includes information about the configuration and current status of the array. The script ensures that the URL space is divided among the array members either evenly or in accordance with configurable load factors.

The result is a specific location for each cached object, meaning that the Web browser or downstream server can determine exactly where a requested URL is already stored or where it will be stored after caching. The hashing functions used ensure that the load is statistically distributed and balanced across the array.

Because Forefront TMG servers in an array may have different hardware and some may be more powerful than others, you may want to divide the cache load accordingly. For this reason, you can configure the CARP functions by adjusting the load factor for each server in the array using the LoadFactor property.
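
The effect of a load factor can be sketched in Python with weighted highest-score selection. Note that this is a swapped-in technique (weighted rendezvous hashing with a SHA-256 stand-in hash), not the multiplier calculation that CARP itself uses, and the load factor values, member names, and function names are hypothetical.

```python
import hashlib
import math
from collections import Counter


def weighted_score(host: str, member: str, load_factor: float) -> float:
    """Score a member for a host name, scaled by its load factor."""
    digest = hashlib.sha256(f"{member}|{host}".encode("utf-8")).digest()
    # Map the hash to a value strictly between 0 and 1.
    u = (int.from_bytes(digest[:4], "big") + 0.5) / 2**32
    # log(u) is negative, so the score is positive and grows with load_factor.
    return -load_factor / math.log(u)


def owner(host: str, load_factors: dict[str, float]) -> str:
    """Return the member with the highest weighted score for this host name."""
    return max(load_factors, key=lambda m: weighted_score(host, m, load_factors[m]))


# Hypothetical array: tmg3 has twice the load factor of the other members.
load_factors = {"tmg1": 1.0, "tmg2": 1.0, "tmg3": 2.0}
counts = Counter(owner(f"site{i}.example.com", load_factors) for i in range(10000))
print(counts)
```

With this scheme each member's share of host names is proportional to its load factor, so tmg3 should end up owning roughly half of the host names and the other two members roughly a quarter each.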

CARP Exceptions

Some Web sites require that client IP addresses remain unchanged throughout a session. Requests should be sent to these Web sites without using the CARP algorithm. CARP can be disabled for these Web sites by adding them to a domain name set referenced by the CARPExceptions property of the FPCWebProxy object.
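
The decision to bypass CARP for an excepted site can be modeled with the Python sketch below. It does not use the Forefront TMG object model: the domain name set is represented as a plain set of strings, the "*." prefix is assumed to match any subdomain, and bypasses_carp is a hypothetical function name.

```python
def bypasses_carp(host: str, exception_domains: set[str]) -> bool:
    """Return True if the host name matches an entry in the exception set."""
    host = host.lower().rstrip(".")
    for entry in exception_domains:
        entry = entry.lower()
        if entry.startswith("*."):
            # Assumption: "*.contoso.com" matches "shop.contoso.com"
            # but not the bare name "contoso.com".
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False


# Hypothetical exception set, for illustration only.
exceptions = {"*.contoso.com", "www.fabrikam.com"}
print(bypasses_carp("shop.contoso.com", exceptions))
print(bypasses_carp("www.example.org", exceptions))
```

A request whose host name matches the set would be handled by the array member that received it rather than routed by the CARP algorithm.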

Build date: 6/30/2010