Perfmon - High CPU Analysis
There are five major resources in the operating system: physical disk, memory, process, CPU and network. When any one of these resources is not utilised properly, performance deteriorates, and the result can be a system crash or a process hang.
Perfmon.exe is a performance monitoring tool that visually displays the built-in performance data of the operating system by using counters.
Counters are used to provide information about how well an operating system, application, service, or driver is performing.
Configure Perfmon Data - Data Counters Needed
There are several counters in perfmon that can help us to understand the issues we are facing when trying to troubleshoot the application.
Here we will include only the counters that are most helpful for analysing a high CPU issue.
The counters are:
- "\Process(*)\% Processor Time"
- "\Process(<process name>)\ID Process"
- "\Thread(<process name>)\% Processor Time"
- "\Thread(<process name>)\ID Thread"
I'll explain each of these counters in detail in the following sections.
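The same four counters can also be captured from a script rather than the perfmon UI. The snippet below is only a sketch: it builds the counter paths for one process and assembles a typeperf command line without running it. The helper names, the "w3wp*" wildcard pattern, and the default interval and sample count are assumptions made for illustration, not part of the original post.

```python
def high_cpu_counters(process_name):
    """Return the four counter paths used for a high CPU investigation."""
    # Assumption: a trailing wildcard also matches instances such as
    # w3wp#1 (Process object) and w3wp/0 (Thread object).
    pattern = process_name + "*"
    return [
        r"\Process(*)\% Processor Time",
        rf"\Process({pattern})\ID Process",
        rf"\Thread({pattern})\% Processor Time",
        rf"\Thread({pattern})\ID Thread",
    ]


def typeperf_command(process_name, output_csv, interval_s=5, samples=60):
    """Build (but do not run) a typeperf command as an argument list."""
    return (
        ["typeperf"]
        + high_cpu_counters(process_name)
        + ["-si", str(interval_s), "-sc", str(samples), "-f", "CSV", "-o", output_csv]
    )
```

On a Windows machine the returned list could be handed to subprocess.run; keeping it as a list avoids quoting problems with the spaces inside the counter names.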
Analysis of perfmon trace - counters to be added
One of the important scenarios where perfmon traces are really helpful is the high CPU issue. A lot of information relevant to the root cause of the issue can be pulled out from the perfmon trace.
Performance counters can be added as shown in the figure. Right-click on the screen and select Add Counters.
From the counters screen that pops up choose the required counter.
Analyzing the high CPU dumps becomes much simpler if we are able to figure out the process causing the high CPU. The following counters have to be added to determine this:
- Add "\Process(*)\% Processor Time": Initially the process causing the high CPU is unknown. To determine exactly which process it is, we add the % Processor Time counter for all the processes under the Process object. This counter lists the percentage of CPU consumed by each process during the time of the issue. From this counter we will be able to pick out the process consuming the most CPU.
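If the trace has been exported to CSV (an assumption here; the relog tool can convert a .blg log to CSV), picking out the busiest process can be sketched in a few lines. The function names, the server name and the sample values below are hypothetical and only illustrate the idea of averaging each process's % Processor Time column.

```python
import csv
import io
import re
from collections import defaultdict

# Matches column headers such as \\SERVER\Process(w3wp)\% Processor Time
COUNTER_RE = re.compile(r"\\Process\((?P<instance>[^)]+)\)\\% Processor Time$")

# Hypothetical export: one timestamp column followed by one column per process.
SAMPLE = "\n".join([
    r'"(PDH-CSV 4.0)","\\SERVER\Process(_Total)\% Processor Time","\\SERVER\Process(w3wp)\% Processor Time","\\SERVER\Process(w3wp#1)\% Processor Time"',
    r'"10:00:00","300","180","12"',
    r'"10:00:05","310","190"," "',
])


def average_cpu_by_process(csv_text):
    """Average the % Processor Time samples of every process instance."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = {}
    for index, name in enumerate(header):
        match = COUNTER_RE.search(name)
        # _Total and Idle are aggregates, not candidate culprits.
        if match and match.group("instance") not in ("_Total", "Idle"):
            columns[index] = match.group("instance")
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in reader:
        for index, instance in columns.items():
            try:
                value = float(row[index])
            except (ValueError, IndexError):
                continue  # blank or missing sample
            totals[instance] += value
            counts[instance] += 1
    return {name: totals[name] / counts[name] for name in totals}


def top_process(csv_text):
    """Return the instance name with the highest average CPU."""
    averages = average_cpu_by_process(csv_text)
    return max(averages, key=averages.get)
```

With the sample data above, top_process(SAMPLE) picks the "w3wp" instance. Note that a process's % Processor Time can exceed 100 on a multi-core machine, so the values are compared with each other rather than against a fixed ceiling.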
There can be occasions in which multiple instances of the same process are running, and the only way to tell them apart is the process identifier (PID).
- The "\Process(<process name>)\ID Process" counter helps in figuring out the process identifier of the process. Search for the process name as shown in the below figure and select Add Searched Instances.
After adding the searched instances, click on OK. The process ID of each of the w3wp processes gets listed as shown in the below figure.
Every process has multiple threads running within it. Now that we have located the process that is causing the high CPU, and its process ID, the next step is to find out which of those threads are consuming the most CPU cycles and making the process the culprit. For this we need to add another counter that retrieves thread-related information.
- The "\Thread(<process name>)\% Processor Time" counter adds the threads belonging to the process. In the below figure we can see that multiple threads have each consumed some amount of CPU, cumulatively leading to the spike in CPU.
As we can see in the above figure, a particular set of threads is causing the issue, while another set of threads consumes relatively little CPU, in the range of 0-20%. The latter can be eliminated from the analysis so that we look only into the threads that are responsible for the high CPU.
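The elimination step described here can be sketched as a simple filter. The function name, the thread instance names and the sample values are hypothetical; the 20% cut-off follows the range mentioned in the text and would need adjusting for a real trace.

```python
def busy_threads(samples_by_thread, threshold=20.0):
    """Return (thread, peak CPU) pairs whose peak exceeds the threshold,
    sorted with the busiest thread first."""
    peaks = {
        name: max(values)
        for name, values in samples_by_thread.items()
        if values  # skip threads with no samples
    }
    hot = [(name, peak) for name, peak in peaks.items() if peak > threshold]
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Hypothetical % Processor Time samples per thread instance.
SAMPLE_THREADS = {
    "w3wp/12": [3.0, 5.5, 8.0],
    "w3wp/44": [62.0, 95.5, 90.0],
    "w3wp/45": [40.0, 71.0, 66.5],
}
```

Here busy_threads(SAMPLE_THREADS) keeps "w3wp/44" and "w3wp/45" and drops "w3wp/12", which never rises above the threshold.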
- We'll figure out the IDs of these threads using the "\Thread(<process name>)\ID Thread" counter. Match each ID value to the one in the DebugDiag dump, look into the corresponding stack trace, and figure out what might have caused the CPU spike.
And thus we have the list of counters that will give us a rich understanding as to why a particular thread in a particular process was responsible for high CPU.
Configure debugdiag for collecting dumps
Download the debugdiag tool from : https://www.microsoft.com/en-us/download/details.aspx?id=49924
Install the tool and open the DebugDiag collection tool. A window similar to the one below should be visible:
Click on Cancel and choose the Processes tab.
The processes running on the machine are now listed. Choose the w3wp (w3wp.exe) process corresponding to the application pool that is causing the performance issue, right-click it and select Create Full User Dumps, as shown below:
Now the dump gets collected and saved.
In the next section, let's look at how to analyse the dumps and correlate them with the details obtained from perfmon, using the debugging tool "windbg".
Correlating with the dumps
In computers, debugging is the process of locating and fixing or bypassing bugs (errors) in computer program code or the engineering of a hardware device. To debug a program or hardware device is to start with a problem, isolate the source of the problem, and then fix it. Windbg is a tool used for debugging.
Refer to https://msdn.microsoft.com/en-us/library/windows/hardware/hh406283(v=vs.85).aspx to understand how to debug dumps using windbg.
Open the dump in windbg and, once the dump is loaded, run the commands below to analyse it for performance-related issues:
" ~ " can be used to list the threads that are running in the application.
After figuring out the thread ID from perfmon, for example 4496, run a command such as "? 0n4496" in the Windows debugger to get the equivalent hexadecimal value (the 0n prefix marks the number as decimal).
Going back to the output we got from " ~ ", we can see that the thread with the instance ID 44 has the corresponding hexadecimal value, 1190.
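The conversion and lookup can also be checked outside the debugger. The sketch below converts the decimal thread ID from perfmon to hexadecimal and searches a saved copy of the " ~ " output for it. The sample lines, the process ID 1a2c and the assumed "index  Id: pid.tid" layout are hypothetical and only mirror the 4496 / 1190 / 44 example from the text.

```python
import re

# Assumed layout of a " ~ " output line:  <index>  Id: <pid hex>.<tid hex> ...
THREAD_LINE_RE = re.compile(
    r"^[.#\s]*(?P<index>\d+)\s+Id:\s*(?P<pid>[0-9a-fA-F]+)\.(?P<tid>[0-9a-fA-F]+)"
)

# Hypothetical thread listing saved from the debugger.
SAMPLE_TILDE_OUTPUT = """\
.  0  Id: 1a2c.1b04 Suspend: 0 Teb: 000000e5`2f3a0000 Unfrozen
  43  Id: 1a2c.f3c Suspend: 0 Teb: 000000e5`2f3f6000 Unfrozen
  44  Id: 1a2c.1190 Suspend: 0 Teb: 000000e5`2f3f8000 Unfrozen
"""


def to_debugger_hex(perfmon_thread_id):
    """Convert the decimal ID Thread value to the hex form the debugger shows."""
    return format(perfmon_thread_id, "x")


def debugger_thread_index(tilde_output, perfmon_thread_id):
    """Return the debugger thread index whose OS thread ID matches, or None."""
    for line in tilde_output.splitlines():
        match = THREAD_LINE_RE.match(line)
        # Compare as integers so leading zeros or letter case do not matter.
        if match and int(match.group("tid"), 16) == perfmon_thread_id:
            return int(match.group("index"))
    return None
```

For the example values, to_debugger_hex(4496) gives "1190" and debugger_thread_index(SAMPLE_TILDE_OUTPUT, 4496) gives 44.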
To look into the call stack of this thread alone, switch to it and dump its stack, for example with "~44s" followed by "k" (or "~44k" in a single step).
This shows the native call stack of the thread.
If you want to check the managed call stack of the thread, download the netext extension from http://netext.codeplex.com/ and save it to the folder where your windbg is present: C:\Program Files (x86)\Windows Kits\10\Debuggers\x64 if the dump is from a 64-bit process, otherwise C:\Program Files (x86)\Windows Kits\10\Debuggers\x86. Then load the extension, for example with ".load netext".
For netext to work, the dump needs to be indexed first; in netext this is done with the "!windex" command.
To check the managed call stack of the thread, run netext's managed stack command, for example "!wclrstack".
This can also be achieved by loading the sos extension with ".loadby sos clr" and executing the command "!CLRStack".
Hope the post helps in simplifying the analysis process !! :)