Case Study: Part 1: CXPACKET Wait Stats & 'max degree of parallelism' Option: Introduction to Using Wait Stats to Identify & Remediate Query Parallelism Bottlenecks
This is the first of a four-part series:
- Introduction to Query Parallelism (this post)
- Flipping the Bit
- Expensive Query Identification & Remediation
The SQL Server configuration option max degree of parallelism determines the maximum number of processors which can participate in query execution. The default value is 0, allowing SQL Server to utilize all available cores at query runtime to contribute in parallel to query processing.
By dividing operations such as range scans into smaller chunks, and amalgamating the results, parallelism can provide performance benefits for queries used for, e.g., for reporting, DSS, & warehouse systems.
Yet, generally speaking, the aforementioned default value, 0, is not optimal for OLTP systems.
Detecting Bottlenecks: The CXPACKET Wait Type
Documenting whether query parallelism is a bottleneck is done by interrogating SQL Server wait stats:
SQL Server Edition Command 2000 DBCC sqlperf(waitstats) 2005; 2008 sys.dm_os_wait_stats
See my post Drum Roll, Please...The Debut of The SQL DMV All-Stars Dream Team! for syntax.
The cardinal characteristic of parallelism bottlenecks is the CXPACKET wait stat. Former SQL Customer Advisory Team Member Tom Davidson provides guidance—along with sidekicks Danny Tambs & Sanjay Mishra—in the Performance Tuning Waits Queues.doc in their Best Practices article:
SQL Server 2005 Waits and Queues
CXPACKET waits accounting for more than 5% of total relevant resource waits indicate a query parallelism bottleneck. Per Tom's doc: "Ordinarily a well tuned OLTP application would not parallelize unless an index is missing, there is an incomplete WHERE clause, or the query is not a true OLTP transaction."
Suppressing Query Parallelism
Set max degree of parallelism to 1 to suppress query parallelism:
EXEC dbo.sp_configure 'show advanced options', 1; GO RECONFIGURE WITH OVERRIDE; GO EXEC dbo.sp_configure 'max degree of parallelism', 1; GO RECONFIGURE WITH OVERRIDE; GO
SQL α-Geek Joe Chang has some interesting things to say about this option (& a lot of other good stuff) in this post:
Storage Performance for SQL Server
"Large queries on transaction processing servers should be controlled by setting the max degree of parallelism to 1 or perhaps no more than 1 out of 4 processors."
"Since SQL Server cannot know whether it is being used for transaction processing, or data warehousing, it cannot automatically determine the best disk IO settings. There really should be a setting for the database usage model. In addition, it is helpful to declare the number of physical disk drives in each array. This allows the SQL Server engine to know how many concurrent IO can be issued while maintaining low latency, without relying entirely on the disk access time feedback. The SQL Server engine internal IO governor is beneficial, but this alone cannot mask transient IO surges effectively on a weak disk system. If IO is not throttled enough, disk access latency can become excessive. If IO is throttled, then the full IO performance capability of the disk system is not realized."
See my next post, part 2 of this 4-part series, to see the effects of changing the max degree of parallelism option from the default value of 0 to 1.
For systems with hybrid characteristics, e.g., mixed OLTP & DSS, use representative workloads to experiment with values other than 0 & 1.
Since a system re-start is not required to change the setting, consider changing the instance-wide option from 1 to 0 prior to tasks such as index maintenance, then afterward re-setting it to 1. For production queries righteously worthy of parallelism, there are two other options:
- To target specific queries, use the MAXDOP query hint.
- For a general approach, elevate the cost threshold for parallelism option from the default value of 5. The optimizer is prohibited from considering parallel plans until the estimated query cost exceeds this option's value.
But Wait, There's More!
Here's a list of additional considerations:
- Make certain statistics are up-to-date; the optimizer thrives—or dies—based on up-to-date statistics.
- Fragmentation of internal structures & external disk must be kept low.
- In no case should max degree of parallelism be set to a value higher than the number of physical cores.
- For most servers max degree of parallelism should be set to no more than eight, even if more than eight cores are available.
- For NUMA-enabled servers, max degree of parallelism should not exceed the number of cores assigned to each NUMA node.
- Hyper-Threading often (not always) compromises SQL Server performance. My recommendation: In the absence of unequivocal supporting evidence that H-T enhances performance in your environment, H-T should be disabled. (Stay tuned for an upcoming posts on SQL Server & H-T, including information on Intel's new i7 CPU & its new! improved? implementation of H-T.)
Pet Peeve re: Precise Usage
For the record & to be precise, max degree of parallelism & MAXDOP are not the same thing. Though related, the former is a configuration option, the latter is a query hint. Though the former is a seven-syllable mouthful, I encourage the use of engineering discipline & communicating with precision.
My Related Posts
Jimmy May , MCDBA, MCSE, MCITP: DBA + DB Dev | Senior Performance Consultant: SQL Server
A.C.E.: Assessment Consulting & Engineering Services
Performance is paramount: Asking users to wait is like asking them to leave.