Deciphering Storport Traces 101
Welcome back to the CORE Team Blog -- Paul Reynolds here.
UPDATE 01-04-17: In previous blogs, I wrote about how to capture Storport traces in Windows 8 and Windows 2012 using the Performance Monitor GUI (see Tracing with Storport in Windows 2012 and Windows 8 with KB2819476 hotfix a nd Tracing with Storport in Windows 2012 and Windows 8 without KB2819476 hotfix). I would like to suggest another approach to capturing Storport traces in Windows (2008 R2 SP1 and up) which is simpler and more elegant - a batch file:
ECHO These commands will enable tracing:
logman create trace "storport" -ow -o c:\perflogs\storport.etl -p "Microsoft-Windows-StorPort" 0xffffffffffffffff 0xff -nb 16 16 -bs 1024 -mode Circular -f bincirc -max 4096 -ets
ECHO Reproduce your issue and enter any key to stop tracing
logman stop "storport" -ets
echo Tracing has been captured and saved successfully at c:\perflogs\storport.etl
NOTE: copy this to notepad first to get rid of any unrecognizable characters
This creates a circular 4GB Storport trace. You can edit it and create event tasks in the System or Application Event Logs to stop it so it is handy for troubleshooting. This has proven to be a much better way to create, start, and stop the Storport traces and I recommend you use this approach instead. It does not have filtering, but I no longer recommend using filtering in Storport traces as it can lead to misleading results and we now have tools (PowerBI, etc.) in Excel that can handle millions of rows of data.
END UPDATE 01-07-16
I would like to explore what information you can draw from the raw data contained in a Storport trace. What conclusions can you reach regarding your disk performance? Do you have a disk that does not perform well?
To accomplish this, we will take advantage of free tools available to Windows and Office (2013 and up) users:
- Windows PowerShell
- Windows Performance Toolkit
- Excel PowerPivot
First, we need to talk briefly about the Windows Storage Stack and where Storport traces are taken. It is important to note that Storport traces are at the very last “rung of the ladder” before Windows hands off I/O request packets to hardware. Hardware in this case encompasses firmware, drivers, HBAs, storage fabrics – anything after the Windows Operating System. It is important because it can help delineate where the problem is. Is the problem with the Operating System or with the hardware? Should I call Microsoft to open a support case or it is more appropriate to talk to my Storage Vendor? Storport traces are the perfect place to start to answer questions such as these.
As the old adage goes: a picture is worth a thousand words. When it comes time to deciphering Storport Traces, viewing graphs that summarize data and show trends, and viewing charts that have average and maximums, are much more helpful than looking at raw data. The top-level steps we will undertake are:
- Capture the Storport Trace
- Convert to the Storport Trace into CSV format
- Scrub the data in the CSV file
- Import the data into an Excel Spreadsheet
The easiest way to accomplish the last 3 steps above is to automate it using a Windows PowerShell script. At the end of the blog, there is a zip to download that has this file. Save the PowerShell script as StorPortPACMAN.PS1 in a directory on your C: drive called StorPortPACMAN :
PLEASE NOTE: you must have the following prerequisites installed to successfully run the script above:
1. XPERF needs to be installed and in your system path.
Windows 8 or 10 – use the Windows Assessment and Deployment Kit (ADK): https://msdn.microsoft.com/en-us/windows/hardware/commercialize/adk-install
2. PowerPivot is part of Office 2013 and up, but it needs to be added as a COM Addin. Click here for instructions.
At the end of this blog, there are also 2 spreadsheets in the zip you must download into the C:\StorPortPACMAN directory.
Run the script, and depending on which version of Windows was being run on the server, one of the Excel Spreadsheets will open and be refreshed with data in your Storport trace.
Here are screenshots of sample graphs and charts you will see in the first page of the spreadsheets:
This first graph shows request duration values over the time of the Storport trace. It is very useful to get the “big picture”. For the most part, the disk in this graph is performing fine except for a period near the beginning. Depending on the application you are investigating, this may be fine and the average value is what is important to you. Averages and maximums are shown in the chart below, which is next to this graph in the spreadsheet:
Finally, the chart data is presented in the graph below:
The Excel spreadsheets and their corresponding graphs and charts are only samples of what you can do for your disk analysis using Storport traces and Excel PowerPivot tools. Why use PowerPivot? For Storport traces it means being able to view and summarize more data. Excel by itself is an excellent tool, but problems may start to develop when your data approaches or exceeds a million rows of data.
It is not uncommon for a Storport trace to contain tens of millions of rows of data, especially if you decide to not use a filter while capturing the data. I generally suggest to not use a filter as your averages will be closer to results you obtain from other tools such as Perfmon or XPERF. Using filters will cause your disk to look worse than it really is.
Request Duration Times, as a rule of thumb, can be summed up as follows for SCSI Read(10) and Non-cached Write(10) I/O:
< 9ms = excellent
< 15ms = good
< 25ms = fair
> 25ms = poor
We purposely use the caveat, rule of thumb, because these values are using an assumed 64KB data I/O size. If a read or write is larger or smaller than that, these values should be adjusted.
Using an Excel Pivot table and chart lends well to focusing on a specific LUN through the use of filters built into the table or chart.
We occasionally see a Storport trace that does not make sense. One example is a VM that has more than one SCSI disk and the Excel spreadsheet shows only one LUN. What happened to my other disk? It is there, but because of the way storage may be presented, the same LUN number is used for more than one disk. So what does one do in a situation such at this? The easiest way is to drop and drag a second property of a disk into the chart or graph you are viewing. There are 4 properties we gather in a Storport trace:
Since using LUN is not enough, we can add a second property to the chart, Target. To do this, at the top of the spreadsheet click on the Analyze tab under Pivottable Tools, then click on Field List to expose the data available to you. Drag and Drop the Target field onto Rows and underneath LUN. You will see something similar to this:
This will let you see all your disks if they have the same LUN number.
If you are at all like me, viewing charts and graphs might be fine for seeing the big picture and giving presentations, but your inner engineer is screaming to look at the data in detail, to make sure you are comfortable with it and understand it. To do this, click on the POWERPIVOT tab and then the Manage button in the Data Model area. This will open a new window that exposes the raw data to you so that you may filter it, sort it, or anything else you wish to make sure it all makes sense to you before using it.
I hope you find this blog helpful and gives you a new tool to use when investigating disk performance.