Improving Report Performance with Caching
Hey There – I've not blogged in a while, and as a Reporting Services admin I’ve recently taken an interest in report caching issues, so I thought I would share some of my learnings and recommendations.
First of all, I’ll give a quick review of the factors you need to consider when making report caching decisions.
Considerations for Caching Reports
How long does the report take to run?
Caching a long running report will allow the report to be rendered faster and give a better experience for the user, but it will not improve the time for reports that are already fast for live executions. Long running reports usually also have a high processing cost for both the report server that is processing the report, as well as the data source server that is executing the report’s queries. Considering this will allow you to determine if a report should be cached.
How “fresh” does the report data need to be?
Some reports, for example, those that reflect information in a fixed prior date range, will always return the exact same data every time the report queries are executed. These are perfect candidates for caching. Some reports, for example those showing current “status” information are only valid with the most current data available. These are the worst candidates for caching. Most other reports fall somewhere in the middle; the value and accuracy of the report is directly related to the age of the data, some latency is ok, but beyond a certain point it really should be refreshed. Considering this will also allow you to determine if a report should be cached, as well as how long the report should stay in the cache.
How often is the report used, and how many people use it?
If a report is used very frequently, the costs of caching the report will be justified, but if the report is not used very often, the cost of caching it will not have any return.
When do users commonly use the report?
Some reports are repeatedly used at predictable times of the day, week, month etc, for example, a morning report showing the activity for the previous day. Other reports see a flurry of activity at non-predictable times, for example, when your manager sends the group an email questioning the status of a large project. Considering this will allow you to determine how and when the report should be cached.
What parameter values are commonly used for the report?
If it is a report without any parameters, or users commonly choose the same parameters every time the report is rendered, the cached version will be valid for most requests. But if users commonly specify parameters that are not the same as those used in cached versions of the report, the cached version will not be used, and the report’s queries will be executed live.
Will users want to go back and see old versions of the report created at various points in the past?
This is common for daily, weekly, or monthly status reports where users want to see older historic versions of the report.
All of these options must be considered together to determine if the report should be cached and how it should be cached.
Types of Report Caching
There are two main types of report caching. When you click on the "Properties" tab of a report in Report Manager and choose the “Execution” option on the left side of the page, you will be configuring “Execution snapshots” that will only be kept in the cache for a specified period of time. When you choose the “History” option, you will be configuring “History Snapshots” that will store several cached versions of the report that can be manually chosen from the “History” tab of Report Manager at some time in the future. History snapshots can also store the cached versions of Execution snapshots.
Execution Snapshot Configuration Options
There are two primary options for configuring “Execution Snapshot” report caching. You will see that by default, the “Always run this report with the most recent data” and “Do not cache temporary copies of this report” options are chosen by default. This of course means that the report’s data set queries will be executed each time the report is rendered. You may be able to avoid the execution time experience for the report user, as well as the reducing the processing cost for the report server and the database server the report is based on, by caching the report.
The first option is controlling how the caching is triggered. You can have the report cached when a user renders the report and a compatible cached version is not available. This is good for reports that are commonly used at unpredictable times of the day/week/month. The first person to run the report will see the time delay for the report queries to be executed live, but it will be cached for later users, and when activity slows down again, the report server and data source servers will not pay the price of executing and caching the report. This is a good option if you want to cache several versions of the report with different parameter values. This is the method that will be used when you choose one of the “Cache a temporary copy of the report…” options in Report Manager.
Or you can have the report server automatically cache a new version on a specified interval or schedule, so a reasonably fresh cached version is always available. This is the method that is used when you choose the “Render this report from a report execution snapshot” option in Report Manager. It is a good option if you know when and how often the report will probably be used, but if the report is not used, the cost of executing and caching the report will be unnecessarily spent for no benefit to report users. This option is only good if the report has no parameters, or all the parameters have defaults, because the report server cannot prepare a cached version of the report if it does not know what users will specify when they run the report. One precaution to be made here is that you need to consider how long the report takes to execute live when you are configuring how often it should be triggered. I mentioned in my blog post about report performance a story about a subscription that fired every 15 minutes for a report that ran for over an hour. This resulted in several executions running in parallel and slamming the SQL Server box that was executing the query. The same situation can happen with report snapshots, and the obvious suggestion is to make sure you don't initiate frequent snapshots for long running reports, or you will make enemies in dba and report server admin circles!
The second option is choosing how long the report will stay in the cache. You can have the cached version expired and removed at set interval or a scheduled basis, depending on when the age of the data in the report becomes outdated. This can be specified by choosing one of the “Cache a temporary copy of the report…” options in Report Manager.
History Snapshot Configuration Options
The options for history snapshots are exposed via the History option of the Properties tab in Report Manager. They are pretty straight forward and similar to execution snapshots in many aspected; they can be created on a scheduled basis or by active live requests from execution snapshots, and they can be automatically be cleaned up on a scheduled basis. There are a couple key differences; they are exposed and visible for viewing by
selecting them in the History tab of Report Manager, whereas execution snapshots are automatically used (or not) depending on if they exist or have compatible parameters. You can also manually generate them (assuming the parameters all have defaults) using the "New Snapshot" option on the History tab.
Report Design Considerations for Caching Reports
There are a couple considerations to keep in mind for cached reports. The first issue is that it is important for users to know that they are seeing a cached and potentially outdated view of the report. You can accomplish this by adding a textbox near the top of the report with an expression similar to:
="Report Executed at: " + Now().ToShortTimeString
As I’ve mentioned several times above, the cached version of the report is only valid if the parameter values of the cached report are the same as the values chosen by the person running the report. For this reason, if you want to take advantage of report caching, you should ensure that the report parameters have defaults that will be commonly used by the report users.
Pre-loading the Execution Snapshot Cache
One of the drawbacks of non-scheduled execution snapshots is that they are only cached after the report was run once, and the first person to run the live version of the report pays the price of building the cached version. One way to avoid this is to have the report executed via a subscription that fires on a scheduled basis, and its execution caches the report without a user having to pay the price. A static subscription can be used if there are no parameters, all the parameters have defaults, or a single set of parameters will fit the needs of most users. You can also use a data driven subscription with a query that returns multiple sets of parameters to cover the needs of various combinations of parameters.
Identifying Existing Reports that are Good Candidates for Caching
As I’ve mentioned in another blog posting, and Robert Bruckner covers extensively in his blog, the report server catalog database has a view called ExecutionLog2 that has entries for every report execution. You can use information in this view to identify the reports that are frequently executed live along with the parameter values that are specified, and are good candidates for caching based on the caching considerations listed above. I created a report to serve this purpose, and it is based on the query below.
select top 1000
el.ReportPath, 'LastModBy'=u.UserName, 'Parameters'=convert(varchar(max),el.Parameters),
from ReportServer.dbo.ExecutionLog2 el
left outer join ReportServer.dbo.Catalog c on c.Path = el.ReportPath
left outer join ReportServer.dbo.Users u on u.UserId = c.ModifiedByID
-- report parameters
(@reportpath = 'ALL' or el.ReportPath = @reportpath)
and el.TimeStart < DATEADD(dd,@daycount,GetDate())
and el.Source = 'Live' -- only show live executions, not cached executions
and el.ReportAction = 'Render' –- only include rendering operations
and el.ReportPath != 'Unknown' -- exclude reports that have been deleted after executing
and el.Status = 'rsSuccess' --
group by el.ReportPath, u.UserName, convert(varchar(max),el.Parameters)
-- Only show the report executions that happen frequently, report parameter
having COUNT(*) > @minexeccount
order by Avg(datediff(ss,el.TimeStart, el.TimeEnd)/60.0) desc
By putting interactive sorting on the report’s table columns, I can look for the longer running reports, those that are executed frequently with a given set of parameters and so on. And I added drillthrough links that allow me to easily filter to just the data for a given report, as well as a MAILTO: URL that lets me easily contact the person who owns the report.
You can also use a similar query to monitor cached report activity. Reports that are executed live will have an ExecutionLog2 entry with RequestType=Interactive or Subscription, Report Action=Render, and Source=Live. Those that are rendered from an execution snapshot will have Source=Snapshot, and those from a history snapshot will have Source=History. If you have created scheduled execution or history snapshots for expensive reports, but don't see anyone actually using them, you may want to consider canceling the snapshot.
Internal Scheduling and Storage of Report Snapshots
Disclaimer: we do not officially support querying all of the Reporting Services tables mentioned below, the columns and information in them may change in the future.
In my blog post about monitoring and troubleshooting subscriptions, I described how the scheduling of subscriptions is controlled by a SQL Agent Job that places an entry in the Events table, and it is moved to the Notifications table when the report server is processing the subscription. The same scheduling and delivery process is used by the schedules that refresh execution and history snapshots, with the exception that snapshots are stored in the catalog database rather than being sent to a file or email address. You can use the same basic techniques for monitoring, diagnosing, and troubleshooting snapshots.
I mentioned several times above that the benefits of caching does come with a cost. Part of this is the execution cost of the report's dataset queries in the datasource server and part of this is in the report processing cost in the report server. But it also has a processing cost of managing snapshot cache entries, and a storage cost for storing the cached version of the report in the report server catalog database.
The report snapshot cached entries are stored in the Snapshot table with information regarding when the cached entry was created, when it will expire, the parameter hash values used to determine if the cached report is valid for a new user request and so on.
cached reports are stored in the Chunkdata table, and large report snapshots can require significant storage. For example, the Chunkdata table in my report server has 258 cached snapshot entries, with about 85mb of total data in the Content column, a max size of 14mb, and an average size of 335kb. The server does not use snapshots particularly heavily, at least until I start asking people to use them:), but this can be a significant storage issue in the report server catalog database if you do cache a lot of large reports, so it is something worth montoring from time to time in your server.
I hope this information helps, take care, thanks, happy reporting, and of course:
"This blog is provided 'AS IS' with no warranties, and confers no rights."