Using FetchUrl to Extend the Cache
The Forefront TMG Web proxy normally fetches objects from the Internet and caches them automatically as proxy clients request them or as predefined content download jobs require them. Sometimes, however, an application needs to override these automatic operations and exercise greater control over the source of the objects placed in the cache or the length of time they remain cached. For example, an application may need to:
- Make data downloaded from a nonstandard network source (such as a satellite or private network) accessible to Web browsers as if it were normal Internet data.
- Update certain objects within the Web proxy's cache on a fixed schedule.
- Place objects in the Web proxy's cache that would normally not be cached.
The FetchUrl method supports these operations by letting an application direct the Web proxy to fetch an object from a given site (specified in the FetchUrl parameter) and store the object in its cache. The object can be assigned a new name (specified in the CacheUrl parameter), which Web proxy clients then use to retrieve the object from the Web proxy cache. It can also be assigned a particular time to live (TTL) within the cache (specified in the TtlInMinutes parameter).
FetchUrl Usage Examples
Here are three examples of how you might use the FetchUrl method.
Consider a proxy installation with a very slow Internet connection whose users require quick access to the fictional Web site called "fiction.microsoft.com". One solution to this problem would be for the owners of fiction.microsoft.com to produce a CD-ROM containing all the content found on their Web site and mail it weekly to the proxy customer. An application could use FetchUrl to cause the proxy to fetch the data from the CD-ROM and store it in the cache as if it were data from the Internet fiction.microsoft.com site, with a TTL of one week. However, the Forefront TMG Web proxy supports only the HTTP and FTP protocols, so the data cannot be fetched directly from the CD-ROM. There are several possible solutions to this problem:
- The data on the CD-ROM could be copied into the publishing directory of an HTTP or FTP server, or a publishing server could be configured to serve the data directly from the CD-ROM. This is a good solution if the data is provided as normal files (HTML, GIF, and so on) without any HTTP headers already added by the original publishing server, fiction.microsoft.com. If the files already contained HTTP headers from fiction.microsoft.com, serving the data from another HTTP server would add a second set of headers, which would render the objects unintelligible to the proxy and to browsers.
- If the files already contain a set of HTTP headers (for example, if they are the product of a raw Internet download of the contents of fiction.microsoft.com rather than a copy of the data files from the fiction.microsoft.com Web servers), the easiest solution might be to write an alternative HTTP server application that serves the data files without adding HTTP headers. The application would listen on its own socket, and FetchUrl would be called with a URL that points to it, so that the application supplies the data before it reaches the Web proxy. This gives the application complete control over the data sent to the Web proxy. This method is also particularly useful when the application needs to perform simple housekeeping operations on cached objects, such as generating a 304 "Not Modified" HTTP response to leave an object unmodified but extend its TTL, or a 404 "Not Found" response to delete an object from the cache.
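The second option above can be sketched in a few lines. This is a minimal illustration in Python, not part of the Forefront TMG SDK: the function names, the STORED sample object, and the host and port passed to serve are all hypothetical. It assumes each stored object is a complete raw HTTP response (status line, headers, and body) and writes those bytes back unchanged.

```python
import socket

def handle_connection(conn: socket.socket, raw_response: bytes) -> None:
    """Read one request head, then reply with raw_response exactly as stored."""
    with conn:
        request = b""
        # Read until the blank line that ends the request headers.
        while b"\r\n\r\n" not in request:
            chunk = conn.recv(4096)
            if not chunk:
                break
            request += chunk
        # The stored bytes already begin with a status line and headers,
        # so nothing is prepended or appended here.
        conn.sendall(raw_response)

def serve(host: str, port: int, raw_response: bytes) -> None:
    """Accept connections forever, answering each with the same raw response."""
    with socket.create_server((host, port)) as listener:
        while True:
            conn, _ = listener.accept()
            handle_connection(conn, raw_response)

# Hypothetical stored object: a raw download that already contains its headers.
STORED = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>hello</p>\n"
)
```

A real application would choose the stored object from the request path. The same handler could also return a prepared 304 or 404 response for the housekeeping cases described above.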
In a Web hosting scenario, your organization may regularly update part of the information on its Web site. All or part of the Web site data can be cached on a Forefront TMG cache. When the Web site information changes, the cache will have to be updated. You can use the FetchUrl method to streamline the update process. In the central database for the Web site information, create an application that generates a file listing all of the files that have changed. Use FetchUrl to retrieve that file. Parse the file to create a list of files that require updating. Then use FetchUrl to retrieve each updated file from the database. This use of FetchUrl allows you to update only the cache files that have changed.
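The parsing step in this scenario might look like the following sketch. The manifest format (one relative path per line, with blank lines and lines starting with "#" ignored), the function names, and the example.com URLs are assumptions made for illustration; the source does not define a format for the list of changed files.

```python
from urllib.parse import urljoin

def parse_changed_files(manifest_text: str) -> list[str]:
    """Return the relative paths listed in the manifest, skipping blanks and comments."""
    paths = []
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        paths.append(line)
    return paths

def plan_fetches(manifest_text: str, source_base: str, cache_base: str) -> list[tuple[str, str]]:
    """Pair each changed path with the URL to fetch it from and the URL to cache it under."""
    return [
        (urljoin(source_base, path), urljoin(cache_base, path))
        for path in parse_changed_files(manifest_text)
    ]
```

Each resulting pair corresponds to the FetchUrl and CacheUrl parameters of one FetchUrl call, so only the objects named in the manifest are refreshed.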
In a Web hosting scenario, users may upload updated Web objects to your publishing server at unpredictable frequencies. These items will not be cached by Forefront TMG if the existing cached object still has a valid TTL. You can use FetchUrl to delete the cached object from the cache, so that the updated object will be cached. For more information, see Using FetchURL to Delete a Cached Object.
To retrieve an HTTP object in response to a call to FetchUrl, the Web proxy applies the normal HTTP GET method to the URL specified in the FetchUrl parameter. If there is already a cached copy of the URL specified by CacheUrl (loaded into the proxy either by normal operations or by an earlier call to FetchUrl), the proxy includes an "If-Modified-Since" HTTP header in the request, which makes the request a conditional GET. The response to a conditional GET may be either a new copy of the data, which is stored in the cache, or an HTTP status code 304 "Not Modified" response, which causes the proxy to extend the TTL of the object without changing the data in the cache.
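From the point of view of the server that answers the proxy, the conditional GET decision can be sketched as follows. This is an illustration of standard HTTP behavior in Python, not of Forefront TMG internals; the function name and its return shape are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def answer_conditional_get(request_headers: dict, last_modified: datetime, body: bytes):
    """Return (status, headers, body) for a GET that may carry If-Modified-Since.

    last_modified must be an aware datetime in UTC.
    """
    stamp = format_datetime(last_modified, usegmt=True)
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_dt = None  # Unparseable date: fall back to a full response.
        if since_dt is not None:
            if since_dt.tzinfo is None:
                since_dt = since_dt.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= since_dt:
                return 304, {"Last-Modified": stamp}, b""
    headers = {"Last-Modified": stamp, "Content-Length": str(len(body))}
    return 200, headers, body
```

A 304 result carries no body, which matches the case in which the proxy keeps the cached data and only extends its TTL. A 200 result carries the new copy to be stored.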
If FetchUrl is used in an array containing multiple Forefront TMG computers (available only in Forefront TMG Enterprise Edition), the object is automatically fetched by the single array member that is the designated owner of that object according to the Cache Array Routing Protocol (CARP) algorithm.
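The idea of a single designated owner per URL can be illustrated with a generic highest-score hashing sketch. This is not the CARP hash function or the Forefront TMG implementation: the SHA-256 scoring, the function names, and the member names are assumptions chosen only to show how every member can independently agree on one owner.

```python
import hashlib

def score(member: str, url: str) -> int:
    """Combine a member name and a URL into a deterministic numeric score."""
    digest = hashlib.sha256(f"{member}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def owner_for(url: str, members: list[str]) -> str:
    """Return the member with the highest score for this URL."""
    return max(members, key=lambda member: score(member, url))
```

Because the score depends only on the member name and the URL, the result does not depend on which member performs the calculation, and removing a member that is not the owner leaves the owner unchanged.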
The following code sample demonstrates VBScript usage of FetchUrl.
The example companies, organizations, products, domain names, email addresses, logos, people, places, and events depicted herein are fictitious. No association with any real company, organization, product, domain name, email address, logo, person, places, or events is intended or should be inferred.
' When using VBScript, the enum values should be declared as constants
' or explicit numeric values must be used instead
Const fpcTtlIfNone = 1
customTTL = 10

Set root = CreateObject("FPC.Root")
Set isaArray = root.GetContainingArray()
Set myCache = isaArray.Cache.CacheContents

' Fetch "http://www.wideworldimporters.com/home.htm" and store it in the
' cache under the name "http://cache_server/home.htm"
myCache.FetchURL "http://www.wideworldimporters.com/home.htm", "http://cache_server/home.htm", customTTL, fpcTtlIfNone
Build date: 7/12/2010