Monitoring and Troubleshooting Subscriptions
As the owner of several production reports, I have subscriptions that fire on a daily or weekly basis to remind people to close their bugs or verify their labruns. As usual, the password for the account I use in the report datasources will expire, and I invariably forget to update all of them and I'll get the "I've not received the report for days" email. As the owner of the Report Server, I also don't want my box to waste time and resources processing a bunch of invalid reports or subscriptions.
For these reasons, I want to monitor the Report Server and know what is not working in the reports and subscriptions.
Let me start by giving a high level view of how subscriptions are defined and processed, but first:
Disclaimer: we do not officially support querying all of the Reporting Services tables mentioned below, the columns and information in them may change in the future.
Note: This information refers to the SQL 2008 catalog, this info will generally also apply to SQL 2005, but I've not verified it there yet. Also, unless otherwise specified, the tables mentioned below are in your Report Server catalog database with whatever name you gave it, and yes, you need permissions necessary to query them.
Ok, back to the fun stuff.
When you create a subscription several things are added to the RS server:
- A row is placed in the Subscriptions table identifying the name of the report, along with parameter settings, data driven query info and so on to process the subscription
- A row is placed in the Schedule and ReportSchedule tables with the timing of the subscription
- A SQL Server Agent job is created to control the scheduled execution of the report, and this is stored in the sysjobs and sysjobsteps of the MSDB database. The agent job name is a guid value that represents the ScheduleId (yes, in case you've not yet noticed, this makes your Agent Job list messy with a bunch of guids that make it hard to work with your non-subscription jobs, we are hearing you:).
This query shows how the subscription information is stored in the various related tables, along with the most useful information you will need to diagnose issues later:
ReportServer.dbo.Catalog c on c.ItemID = s.Report_OID
ReportServer.dbo.ReportSchedule rs on rs.SubscriptionID = s.SubscriptionID
ReportServer.dbo.Users uc on uc.UserID = c.ModifiedByID
ReportServer.dbo.Users us on us.UserID = s.OwnerId
msdb.dbo.sysjobs j on j.name = CONVERT(nvarchar(128),rs.ScheduleId)
I use a similar query in a report with parameters to filter to certain reports, subscriptions, owners, and last_status values.
When the subscription runs several things happen:
- The SQL Server Agent job fires and puts a row in the Event table in the RS catalog with the settings necessary to process the subscription
- The RS server service has a limited number of threads (2 per CPU) that poll the Event table every few seconds looking for subscriptions to process
- When it finds an event, it puts a row in the Notifications table and starts processing the subscription, which includes executing the data driven subscription query (if present), processing the report, rendering it to the specified format (HTML, PDF, Excel etc), and delivering the final result to either an email address or a file share. Note that there will also be a row in the ExecutionLog table for the report execution.
This query shows the subscription processing information that you can use to diagnose problems:
Subscriptions s on n.SubscriptionID = s.SubscriptionID
Catalog c on c.ItemID = n.ReportID
Users uo on uo.UserID = s.OwnerID
Users um on um.UserID = s.ModifiedByID
Common problems with subscription delivery:
If your subscription emails are not being delivered to Inboxes across the company, there are several different things that can cause this:
- The SQL Server Agent job is not firing
This could be because the service is not running due to a password expiration or some other cause, and RS does not see anything to process. In this case, get the ScheduleId guid from the above query, find the job with that guid as its name, and check the job history to make sure the job fired. You will also see that the Event and Notifications tables are empty.
- RS is not processing the subscription events
The main issue here is that the RS service event processing threads are not processing the events. The main cause here is that the event processing threads are all tied up, and not processing the events as fast as they arrive. You will see that the agent jobs are firing, but the Events and Notifications tables have a large number of entries. The number of threads is fixed, you cannot increase it, so you need to figure out what they are processing and why they are not firing or keeping up. You can get the subscription and report information, and run the report or the data driven query manually to see if they are healthy. One thing I noticed is that a person had scheduled a report execution snapshot to fire every half hour, but the report took over an hour to process, so at any given time, several instances of the event were running at the same time occupying all of the threads, and nothing else was getting executed.
You may also notice that everything is healthy, but there are common hotspot times in the day (for example, 8am or midnight) when a lot of subscriptions are set to fire at the same time, and they may take a long time to complete, so the subscription processing cannot keep up and falls behind. In this case, you may encourage people to run them at different times if possible, or you could consider adding a second reporting server configured as a two node NLB cluster with the other. In this case, both servers will be polling the events table in the RS catalog and processing the events in parallel.
- SMTP is not delivering the emails
This could be because the email address is bad, or there are problems with the SMTP configuration. I won't go into SMTP here, I'm a database guy, but RS needs it to be working to deliver the report. The Event and Notifications tables will be empty (RS has done its part), the LastStatus column of the Subscriptions table and the subscription status in Report Manager will show an error similar to "Failure sending mail: The user or group name 'domain\username' is not recognized. Mail will not be resent."
- The report or subscription is invalid
Again, this can usually be tested by running the report manually, but there are a myriad of causes that will also show in the subscription status. For example, the datasource may be invalid due to an expired password or an account for a person who has left the company, a subreport may have been removed or renamed, a server, database, or table used in the query may have been removed or renamed and so on. Statistically, this is the source of over 90+% of failed subscriptions I've seen.
So in summary, you can write reports based on the information in these catalog tables to monitor subscriptions, and take action before you get the "I've not received reports for days" emails, and encourage the report and subscription owners to keep their stuff clean and healthy and be good reporting citizens.
In case its not obvious, you may want to host these monitoring reports on a different server because if subscriptions are not firing, you may not get the subscription based monitoring report to tell you that subscriptions are broken!
I hope this helps, please let me know!
As usual, "This blog is provided 'AS IS' with no warranties, and confers no rights."