Chasing down slow page renders in SharePoint

So we know there’s many ways to skin a cat by now (I hope)…and this is yet another way. I’m sure we’ve all stress tested our SharePoint environments and know where all the hardware and software bottlenecks are as well as what is to be expected. Well, what happens when that one user decides to go all ‘power user’ on you and create all kinds of neat and complex lists and filters. At first, you might think it’s cool and all, until they come to you and tell you that all of the sudden their home page to their site takes 30+ seconds to render on every page hit.

Some of you may look at Task Manager to see what’s going on at a high level. You notice there’s not much going on from the WFE’s memory and CPU utilization. Nothing looks abnormal. Next, you might hit the event logs to see if there’s any errors, and you’d come up empty. Now what? On to the diagnostic logs for SharePoint (ULS logs), but what might we look for? I’d always start my searches for “exception”, “fail”, “error” and in this case I’d come up empty (well, empty for chasing down this particular problem anyway). Hm, what’s next? Dumps maybe? Well, if you’re so inclined and know what you’re looking at, sure…go for it. What you’d likely find in this case is one thread per user request sitting and hanging out waiting for a response back from SQL.

Interestingly enough, SharePoint has a query timer built right in the product and it logs long running queries and the time taken right to the ULS logs, you just have to know what you’re looking for.

Enter the great 8sli. You might say you were just in the ULS logs and didn’t find anything. Well, that’s because you, like me at the time, probably didn’t know what it was you were looking for. Here’s an example of what this particular entry would look like:

09/06/2011 12:59:21.04 w3wp.exe (0x1B74) 0x23F4 Windows SharePoint Services Database 8sli Monitorable List item query elapsed time: 35547 milliseconds, Additional data (if available): Query HRESULT: 0 List internal name, flags, and URL: {44B9C682-4C16-4731-9861-7681793B4BEC}, flags=0x0000000004801008, URL="REMOVED" Query XML: "<Query><Where><Eq><FieldRef Name="ID"/><Value Type="Integer">15</Value></Eq></Where></Query>" SQL Query: " SELECT t2.[tp_Created] AS c3c8,t1.[Type] AS c0,t3.[tp_ID] AS c10c5,t1.[Id] AS c15,t4.[tp_Created] AS c17c8,UserData.[tp_ItemOrder],UserData.[tp_ModerationStatus],UserData.[nvarchar1],UserData.[nvarchar6],UserData.[tp_Created],t1.[CheckinComment] AS c23,UserData.[tp_WorkflowInstanceID],t2.[nvarchar4] AS c3c6,t3.[tp_Created] AS c10c8

As you can see from the above, the reason we didn’t find it in the first place when we were in the ULS logs was that we weren’t looking for “monitorable” because that would have yielded us far too many results to sift through. In this case, though, it would have been lucrative to do so given we were chasing a reason why a page might take so long to render.

This entry in and of itself does not provide us a smoking gun, but the URL (which I removed) would have contained a pointer to where the query originated which is a huge step in the right direction. Once you have the URL nailed, you can then fire up a SQL profiler trace when you hit that URL and you would find some parameter data that would help you figure out why the query was taking so long. You’d also have things like the ListID, SiteID and other valuable things to help you on your hunt.

Pro Tip #489: Use ULSViewer, it makes short work of filtering ULS logs by these tags. It can be downloaded here: Ulsviewer

Thanks to my buddy Alonzo Milow for pointing this out to me. He’s a MS PFE on my team, but I don’t think he blogs yet. I’ll work on that.

Happy hunting the great 8sli.