How can I make my web site faster with caching?
My name is Kai Lee, a program manager from the Web Content Management team. Today, I would like to explore the caching features in MOSS 2007 and help you answer the question: “How can I make my web site faster with caching?” Imagine for a minute that you have an external-facing web site with these types of user access:
I. One portion that is available to the public with anonymous user access
II. Another portion that is only available to some authenticated users, such as registered subscribers to your site or paid customers.
III. Yet another portion that is available to both anonymous and authenticated users.
So, let’s take a look at them one at a time to see how caching can help to make your web site faster.
I. Anonymous User Access.
Usually this part of your site is the most heavily accessed because it is open to the public; it is also sufficient in many cases to present new updates every few minutes instead of instantaneously whenever changes are made. In this case, proper use of output caching can really make a huge difference in system throughput and response time.
MOSS 2007 uses ASP.NET output caching to store the HTML markup generated by ASPX pages so that subsequent requests for the same page will be served from the cache instead. By saving the HTML markup in a cache, ASP.NET is able to serve up web pages much faster with less CPU usage and fewer database roundtrips because it does not need to run code to generate the same HTML markup every time the page is requested. In an environment where anonymous users and output caching are enabled, you can achieve over 1000 ASPX requests per second with a single web front-end; without it, you may only get about 80 RPS or less. Obviously, the exact performance numbers will change depending on the nature of the pages and the hardware you are using, but you can appreciate the relative performance gain between cached and uncached web pages.
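To put the two figures side by side, here is a small back-of-the-envelope calculation. It uses only the numbers quoted above (1000 and 80 requests per second) and assumes, purely for illustration, that requests are handled one at a time on a single front-end.

```python
# Figures quoted in the post; real numbers depend on your pages and hardware.
cached_rps = 1000
uncached_rps = 80

# Relative throughput gain from output caching.
speedup = cached_rps / uncached_rps

# Average time budget per request, in milliseconds, if requests were
# processed strictly one after another (a simplifying assumption).
cached_ms = 1000 / cached_rps
uncached_ms = 1000 / uncached_rps

print(f"throughput gain: {speedup:.1f}x")
print(f"time per request: {cached_ms:.1f} ms cached vs {uncached_ms:.1f} ms uncached")
```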
To enable output caching for your site, the MOSS Publishing Infrastructure feature must be enabled for your site collection. If you created your site collection using either the Publishing Portal or the Collaboration Portal site template, this feature should be enabled by default.
To check if this is enabled, navigate to the site collection features page as follows:
1. Click Site Actions->Site Settings->Modify All Site Settings
2. Click Site Collection Features under the Site Collection Administration column.
Once you have enabled the publishing infrastructure feature, follow these steps:
Step 1: Create a cache profile that specifies the behavior of output caching.
MOSS 2007 provides additional support on top of ASP.NET for configuring cache settings through output cache profiles. A cache profile is basically a list item that specifies the behavior of output caching. Each site collection may contain one or more of these profiles, and they can be assigned to different portions of your web site in order to optimize system performance.
Below is an example of what a cache profile looks like. There are quite a few settings on this cache, but the key ones for our discussion are Checking for ACL, Vary by User Rights, Duration and Check for Changes. This cache profile is intended for the part of your web site where the content is publicly available. Therefore, the settings for checking ACL (i.e. permission) and vary by user rights are off to optimize performance. I will discuss these settings later in this post. When the page is cached, all requests to that page are served from the cache. The cache expires after the specified duration (e.g. 180 seconds), and then the updated content is cached for the next set of users.
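The Duration setting can be pictured with a small sketch. This is not how MOSS or ASP.NET implement output caching internally; `CacheProfile` and `OutputCache` are hypothetical names, only the Duration setting is modeled, and the clock is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CacheProfile:
    """Hypothetical stand-in for a cache profile; only Duration is modeled."""
    name: str
    duration_seconds: int


class OutputCache:
    """Caches rendered markup per URL until the profile's duration elapses."""

    def __init__(self, profile: CacheProfile, render: Callable[[str], str]):
        self.profile = profile
        self.render = render
        self.render_count = 0
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str, now: float) -> str:
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self.profile.duration_seconds:
            return entry[1]  # still within the duration: serve cached markup
        html = self.render(url)  # expired or never cached: render again
        self.render_count += 1
        self._entries[url] = (now, html)
        return html


cache = OutputCache(CacheProfile("Public Internet", 180),
                    lambda url: f"<html>{url}</html>")
cache.get("/default.aspx", now=0)    # first request renders the page
cache.get("/default.aspx", now=60)   # served from the cache
cache.get("/default.aspx", now=200)  # 180 seconds have passed: rendered again
print(cache.render_count)
```

Three requests result in two renders: the second request falls inside the 180-second window, the third does not.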
There are a few out-of-the-box cache profiles stored in a centralized list on each site collection, but you can create a new one or modify an existing one by editing the list item to fit your specific needs. The name of the list is Cache Profiles, and you can get to it by clicking Site Actions->Site Settings->Modify All Site Settings at the root web. Then, click Site Collection Cache Profiles under the Site Collection Administration column on the site settings page.
Step 2: Enable output caching and assign the cache profile to the right section of your web site.
Once you have created your cache profile, you are ready to assign this to the right section of the site. For this post, I will assume that most of your site is available for anonymous access. So, follow these steps.
a) Go to the site settings page, and click Site Collection Output Cache Settings. You should see this page:
b) Enable the Output Cache checkbox.
c) Set the anonymous cache profile to the sample one mentioned above – Public Internet. This will allow anonymous users to take advantage of output caching for the entire web site.
d) Check the two boxes in the page output cache policy section so that you can override the settings for specific sub-sites or web pages later.
Once you have completed these steps, output caching is enabled for the entire site collection, using the specified cache profile for anonymous users. However, you also have the option to selectively control the behavior of the output cache by assigning cache profiles at two other levels: sub-site and page layout. I will discuss those optional settings later on in this post.
By the way, if you enable debug cache information in the cache profile, a comment will be left at the end of the HTML markup of the web page indicating if output caching is enabled. For example:
<!-- Rendered using cache profile:Public Internet at: 2006-11-03T14:38:59 -->
This option is really for debugging purposes, and it is not required to enable output caching. It tells you what profile the page is using to cache the content, and the timestamp of the cached version of the page.
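If you want to check this from a script rather than by viewing the page source, something along these lines may help. The pattern is derived only from the single sample comment shown above, so other variations of the comment are not handled; `parse_cache_comment` is a hypothetical helper name.

```python
import re
from datetime import datetime
from typing import Optional, Tuple

# Pattern based solely on the sample comment in this post.
DEBUG_COMMENT = re.compile(
    r"<!--\s*Rendered using cache profile:(?P<profile>.+?)\s+at:\s*(?P<stamp>\S+)\s*-->"
)


def parse_cache_comment(html: str) -> Optional[Tuple[str, datetime]]:
    """Return (profile name, timestamp) if the debug comment is present."""
    match = DEBUG_COMMENT.search(html)
    if match is None:
        return None  # no debug comment found in this markup
    return match.group("profile"), datetime.fromisoformat(match.group("stamp"))


sample = ("<html>...</html>"
          "<!-- Rendered using cache profile:Public Internet at: 2006-11-03T14:38:59 -->")
print(parse_cache_comment(sample))
```

Fetching the same page twice and comparing the timestamps is one way to see whether the second response came from the cache.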
Step 3: Enable disk-based caching
In a typical web site, each web page may contain one or more references to images or CSS files. These resources may be stored in libraries in the MOSS site. MOSS provides a disk-based cache for these objects. The purpose of this cache is to avoid extra database roundtrips required to pass down these objects when the page loads. This is done by storing cached versions of the resources on the file system of the Web front-end server. By default, the disk-based cache is disabled, so you must enable it to take effect. Each web front end server in your farm has its own disk-based cache, and they are independent of each other.
To enable the disk-based cache (also known as the “blob” cache) on a web front-end server, change enabled to true in the following line in the web.config file of the web application that hosts your web site. The maxSize attribute is the maximum size of the cache in gigabytes, and location specifies the location of the cache on disk. Because you want disk access to be fast, avoid placing the disk-based cache on a disk that is also used for system paging, and make sure the location has sufficient space without much file fragmentation.
<BlobCache location="C:\blobCache" path="\.(gif|jpg|png|css|js)$" maxSize="10" max-age="86400" enabled="True"/>
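A quick way to sanity-check the line before pasting it into web.config is to parse it. The sketch below runs Python's standard XML parser over the exact line above. Treating maxSize as binary gigabytes (1024^3 bytes) is an assumption; the post only says gigabytes.

```python
import xml.etree.ElementTree as ET

# The BlobCache line from this post, kept in a raw string because of the backslashes.
line = (r'<BlobCache location="C:\blobCache" path="\.(gif|jpg|png|css|js)$" '
        r'maxSize="10" max-age="86400" enabled="True"/>')

blob = ET.fromstring(line).attrib

location = blob["location"]                        # folder on the front-end's disk
max_size_bytes = int(blob["maxSize"]) * 1024 ** 3  # assumes binary gigabytes
enabled = blob["enabled"].lower() == "true"        # tolerate "True" or "true"

print(location, max_size_bytes, enabled)
```

If the line is not well-formed XML, `ET.fromstring` raises a `ParseError`, which is cheaper to discover here than after the web application fails to start.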
One interesting new setting, added after Beta 2 Technical Refresh, is the max-age attribute. Max-age is the amount of time (in seconds) that client web browsers will cache the objects locally on the client machines. If this setting is non-zero, the client browser will not re-request the same resource URL until its local version has expired. If you have a number of CSS files or pictures (say, a company logo) that you don’t change very often, and these CSS and image files are referenced in many web pages within your site, you should definitely take advantage of this setting. If disk-based caching is not enabled, or max-age is set to zero, the client browser will re-request the same CSS files and images again and again from your site. Sure, MOSS 2007 is smart enough to send a return code of 304 back to the client browser indicating that the local version is up to date. This 304 return code tells the client browser to skip the extra download, but the exchange still generates unnecessary HTTP requests over the network, and possibly database hits for permission checking, for resources that the client browser already has. So, use this setting whenever it is appropriate.
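The effect of max-age on the browser can be sketched as a simple freshness check. This is a deliberately simplified model of browser caching, not a full implementation of the HTTP caching rules, and `is_fresh` is a hypothetical name.

```python
def is_fresh(fetched_at: float, now: float, max_age: int) -> bool:
    """True while the browser may reuse its local copy without asking the server."""
    return max_age > 0 and (now - fetched_at) < max_age


MAX_AGE = 86400  # seconds, i.e. 24 hours, as in the BlobCache line above

print(is_fresh(0, 3600, MAX_AGE))   # one hour after download: no request needed
print(is_fresh(0, 90000, MAX_AGE))  # 25 hours after download: browser asks again
print(is_fresh(0, 3600, 0))         # max-age of zero: browser asks every time
```

Only in the last two cases does the browser contact the server, which may then answer with a 304 if nothing has changed.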
If you have other resources with a special file extension, you can change the path attribute to include them. The path attribute is just a regular expression, so you can specify a particular file extension or path to cache your resources. For example, to also cache bitmap files, add bmp to the list of extensions: path="\.(gif|jpg|png|css|js|bmp)$".
Disk-based caching is useful not just for anonymous access, but also for authenticated user access. Therefore, use it regardless of the type of access to your site. The disk-based cache only manages resources that are published and stored in a SharePoint library. Draft or checked-out items are not cached.
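Before changing the path attribute, it can be worth trying the expression against a few URLs. The sketch below assumes that bmp is simply added to the alternation in the default expression, and that Python's `re` module treats this simple pattern the same way as the regular expression engine used by MOSS. The URLs are made up for illustration.

```python
import re

default_path = re.compile(r"\.(gif|jpg|png|css|js)$")
extended_path = re.compile(r"\.(gif|jpg|png|css|js|bmp)$")  # bmp added

# Hypothetical URLs, for illustration only.
urls = ["/images/logo.bmp", "/images/logo.png", "/pages/default.aspx"]

for url in urls:
    print(url,
          "default:", bool(default_path.search(url)),
          "extended:", bool(extended_path.search(url)))
```

The bitmap is picked up only by the extended expression, while the ASPX page is matched by neither, so it stays out of the disk-based cache.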
See more here: http://msdn2.microsoft.com/en-us/library/aa604896.aspx.
First time access
Once these caches are in place, the very first user access will not be faster (in fact, it may even be a bit slower than before), but subsequent access should be a lot faster than before. In some sites, it may also be wise to warm up the site by hitting some popular pages during system startup so that real users will not even experience the delay in hitting the pages the very first time.
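A warm-up pass can be as simple as requesting a list of popular pages once after startup. The following is a minimal sketch using only the Python standard library; `warm_up` and `http_status` are hypothetical helper names, and the URL in the usage comment is a placeholder.

```python
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def http_status(url: str) -> int:
    """Request the page once and return the HTTP status code."""
    with urlopen(url, timeout=30) as response:
        return response.status


def warm_up(urls: Iterable[str],
            fetch: Callable[[str], int] = http_status) -> Dict[str, int]:
    """Hit each URL once so the caches are populated before real users arrive."""
    results: Dict[str, int] = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except OSError:
            results[url] = -1  # unreachable or error response; keep going
    return results


# Example (placeholder URL, not run here):
# warm_up(["http://www.example.com/Pages/default.aspx"])
```

Because each web front-end server keeps its own caches, a script like this would need to target every front end individually rather than only the load-balanced address.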
II. Site with Authenticated Users access.
For the part of your web site that requires authenticated user access, you can change the output caching behavior by overriding the cache profile previously assigned to the entire site collection. This can be achieved in two ways. But first, let’s create a new cache profile for authenticated users. In the following cache profile, I have enabled the options to perform an ACL check (i.e. permission check) and to Vary by User Rights. I enable these settings to account for authenticated user access, so that access to cached items is trimmed based on the access control on the items (ACL check), and the system doesn’t share the output cache between users with different permission settings in MOSS (vary by user rights). However, I also want to disable change checking to get better performance, since I don’t need to expire the cached page immediately whenever a change is made in the site collection. Instead, page updates will show up after a fixed period of time.
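One way to picture Vary by User Rights is a cache whose key includes the user's set of rights, so that users with identical rights share an entry and everyone else gets their own. The sketch below is only a conceptual model; how MOSS actually builds its cache key is not described in this post, and `VaryByRightsCache` and the right names are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple


class VaryByRightsCache:
    """Keeps one cached copy of a page per distinct set of user rights."""

    def __init__(self, render: Callable[[str, FrozenSet[str]], str]):
        self.render = render
        self.render_count = 0
        self._entries: Dict[Tuple[str, FrozenSet[str]], str] = {}

    def get(self, url: str, rights: Iterable[str]) -> str:
        key = (url, frozenset(rights))  # same URL + same rights = same entry
        if key not in self._entries:
            self._entries[key] = self.render(url, key[1])
            self.render_count += 1
        return self._entries[key]


cache = VaryByRightsCache(lambda url, rights: f"{url} for {sorted(rights)}")
cache.get("/members/news.aspx", {"view"})          # first subscriber: rendered
cache.get("/members/news.aspx", {"view"})          # second subscriber: shared copy
cache.get("/members/news.aspx", {"view", "edit"})  # editor: separate copy
print(cache.render_count)
```

The more distinct permission sets your users have, the less sharing takes place, which is one reason authenticated caching gains less than anonymous caching.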
Once you have created a new cache profile, you can use these two ways to override the default output caching behavior set previously on the entire site collection:
1. Assign a different cache profile for a sub site in your site collection
a) Navigate to the site settings page of your sub-site
b) Click Site Output Cache under the Site Administration category (don’t confuse this link with the site collection output cache page).
c) Underneath the Anonymous Cache Profile, simply select the Disabled option.
d) Underneath the Authenticated Cache Profile section, check Select a page output cache profile.
e) Select your new cache profile that you created above.
Use this approach if your entire sub-site is dedicated to authenticated users.
2. If instead you have specific pages that you want to assign a different cache profile to, you can do so as follows:
a) Navigate to the master page gallery.
b) In the list, edit the properties of the specific page layout.
c) At the end of the edit form, you should see a drop down for authenticated cache profile; select the new cache profile. Remember to check in and publish the page layout if required to take effect.
Choose the second way if you want certain pages throughout the site collection to be cached differently for authenticated users. Then, make sure these pages share a special set of page layouts. Regardless of where the pages are created, the output caching behavior will follow the cache profile you have assigned to the page layout.
By the way, the two approaches are not mutually exclusive. You can apply both at the same time, but the page layout settings will override the other settings in case of conflict.
Now, output caching for authenticated users is not as fast as it is for anonymous users, but it is still much better than having no caching at all.
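The precedence described above can be summarized in a few lines. This is only a sketch of the rule as stated in this post (page layout wins over sub-site, which wins over the site collection). It ignores the policy checkboxes that allow overrides in the first place, `resolve_profile` is a hypothetical name, and "Subscribers" and "Premium" are made-up profile names.

```python
from typing import Optional


def resolve_profile(site_collection: str,
                    site: Optional[str] = None,
                    page_layout: Optional[str] = None) -> str:
    """Pick the most specific cache profile that has been assigned."""
    if page_layout is not None:
        return page_layout    # page layout overrides everything else
    if site is not None:
        return site           # sub-site overrides the site collection
    return site_collection    # default for the whole site collection


print(resolve_profile("Public Internet"))
print(resolve_profile("Public Internet", site="Subscribers"))
print(resolve_profile("Public Internet", site="Subscribers", page_layout="Premium"))
```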
III. Web site with both anonymous and authenticated user access
In some cases, a part of your site or your entire site may support both anonymous and authenticated users at the same time. The web page may serve up slightly different content based on whether a user is logged in or not. In that case, you can still enable different caching profiles for anonymous and authenticated users.
As you may have noticed by now, when you assign cache profiles at the various levels (site collection, site, and page layout), there are always two entries: authenticated and anonymous cache profiles. If you enable both forms of access on your site, you can select a different cache profile for each type of user access, and output caching will behave differently for the two types of user access. Here’s an example of the output cache configured with one cache profile for anonymous users, and another for authenticated users:
So far, I have shown you how to set up output and disk-based caches for web sites with different types of user access. Now, I would like to mention another type of cache: the Object Cache. Unlike the output cache or disk-based cache, this cache is enabled by default, and is useful not only for published pages or resources but also for draft items.
MOSS 2007 stores its content in a SQL Server database, and the various controls and parts on a page layout retrieve the data from the database at run time and at editing time. However, this retrieval can be very expensive because a database round trip is slow relative to accessing virtual memory or the file system. Even though you may have output caching and disk-based caching enabled, access to these items may still need to occur whenever a change is made during editing, and when the output and disk-based caches expire. Hence, MOSS 2007 uses an object cache to store the most frequently accessed items, such as the site navigation structure and the content in the different fields of a document library item or a list item. In doing so, the caching occurs at a more granular level, and it is less expensive over time. This cache is very much internal to MOSS 2007, so you should not have to change much except for specifying the size of this object cache.
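Expressed as a sketch, the pair of entries amounts to a lookup on whether the request is authenticated. `OutputCacheSettings` and `profile_for` are hypothetical names, and "Subscribers" is a made-up profile name; only "Public Internet" comes from this post.

```python
from typing import NamedTuple


class OutputCacheSettings(NamedTuple):
    """The two entries found at each level of output cache configuration."""
    anonymous_profile: str
    authenticated_profile: str


def profile_for(settings: OutputCacheSettings, is_authenticated: bool) -> str:
    """Choose the cache profile based on the type of user access."""
    if is_authenticated:
        return settings.authenticated_profile
    return settings.anonymous_profile


settings = OutputCacheSettings(anonymous_profile="Public Internet",
                               authenticated_profile="Subscribers")
print(profile_for(settings, is_authenticated=False))
print(profile_for(settings, is_authenticated=True))
```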
As a system administrator, you can control the maximum amount of memory allocated for this object cache. The default is 100MB per site collection. There are Performance Monitor counters you can use to monitor the usage of this object cache, and they are available in a performance object called SharePoint Publishing Cache. Check the publishing cache hit ratio and the total object discards counters before you adjust the object cache size. Note: these performance counters are not enabled for Beta 2 or Beta 2 Technical Refresh. Look for these counters in the final release of MOSS 2007. If you are not familiar with Performance Monitor, check out this link: http://technet2.microsoft.com/WindowsServer/f/?en/library/90c2549c-4eb5-45d8-86a9-ea009a4fcd6e1033.mspx.
The steps to set the object cache are as follows:
1. Click Site Actions->Site Settings->Modify All Site Settings
2. Click Site Collection Object Cache
3. Change the value of Max. Cache Size and click OK.
Keep in mind that the different caches share the same virtual memory address space within the same process. So, if you set the object cache size to a large value, you may not have enough space for other caches, such as the output cache. This is particularly important if you are running on a 32-bit OS, as each IIS worker process (W3WP) really has a maximum of 2 GB of memory for the application to play with (even though you may have a lot more physical memory available), and this limited space is also shared by the DLLs and modules loaded and used by MOSS 2007. If you use output caching and have a big site, you should seriously consider using 64-bit hardware and a 64-bit OS, so that you have more memory space for each worker process. If you have to run on a 32-bit OS, be careful when setting this cache size to a value greater than 300MB. My advice is to start with a low number (say 200MB or less), and then increase it if you see that the hit ratio is low (< 85%). Double-check the users’ experience while editing a page. This is also a good indicator of whether your cache size is set appropriately.
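The tuning advice above can be written down as a small rule of thumb. The 85% threshold and the 300MB caution for 32-bit systems come from this post; the 50MB step, the function names, and the hit and miss counts are assumptions made for illustration.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of lookups served from the object cache."""
    total = hits + misses
    return hits / total if total else 0.0


def suggest_cache_size_mb(current_mb: int, ratio: float,
                          step_mb: int = 50,       # assumed increment
                          threshold: float = 0.85,
                          ceiling_mb: int = 300) -> int:
    """Grow the cache in small steps while the hit ratio stays below the threshold."""
    if ratio >= threshold or current_mb >= ceiling_mb:
        return current_mb  # good enough, or already at the 32-bit caution limit
    return min(current_mb + step_mb, ceiling_mb)


ratio = hit_ratio(hits=700, misses=300)  # hypothetical counter readings
print(ratio, suggest_cache_size_mb(200, ratio))
```

On a 64-bit OS the ceiling could reasonably be raised, since the 300MB caution is specific to the 2 GB address space of a 32-bit worker process.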
See more here: http://msdn2.microsoft.com/en-us/library/aa661294.aspx
| Use this type of caching… | To cache at this level | Notes |
| --- | --- | --- |
| Output Caching and Cache Profiles | Individual page level | Ideal for heavily accessed web sites that need to present updated content only every few minutes or more. |
| Object Caching | Individual Web Part control, field control, and content level | Includes cross-list query caching and navigation caching. Great for editing and viewing operations. |
| Disk-based Caching for Binary Large Objects | Individual binary large object (BLOB) level; caches images, sound, movies, and code | Supports .gif, .jpg, .js, and .css out of the box, but can be extended to cover other file extensions. |
I hope that this blog post will help you with making your web site faster! Let me know if you have questions, as well as if you’d like me to discuss any of the above topics in more detail in a future post.
Kai Lee – Program Manager