The RegexOptions.Compiled Flag and Slow Performance on 64-Bit .NET Framework 2.0 [Josh Free]

Developers who use System.Text.RegularExpressions.Regex with the RegexOptions.Compiled flag may notice degraded performance when their applications run on the 64-Bit .NET Framework 2.0.

The performance problem occurs in the Regex(String pattern, RegexOptions options) constructor when it is given a very large, unoptimized regular expression together with the RegexOptions.Compiled flag, as in this example:

private static Regex nonwords = new Regex(@"\b("

   +@"a|aboard|about|above|absent|according\sto|across|after|against|ago|ahead\sof|ain't|all|along|alongside|"

   +@"also|although|am|amid|amidst|among|amongst|an|and|anti|anybody|anyone|anything|apart|apart\sfrom|are|"

   +@"aren't|around|as|as\sfar\sas|as\ssoon\sas|as\swell\sas|aside|at|atop|away|be|because|because\sof|before|"

   +@"behind|below|beneath|beside|besides|between|betwixt|beyond|but|by|by\smeans\sof|by\sthe\stime|can|cannot|"

   +@"circa|close\sto|com|concerning|considering|could|couldn't|cum|'d|despite|did|didn't|do|does|doesn't|don't|"

   +@"down|due\sto|during|each_other|'em|even\sif|even\sthough|ever|every|every\stime|everybody|everyone|"

   +@"everything|except|far\sfrom|few|first\stime|following|for|from|get|got|had|hadn't|has|hasn't|have|"

   +@"haven't|he|hence|her|here|hers|herself|him|himself|his|how|i|if|in|in\saccordance\swith|in\saddition\sto|in\scase|"

   +@"in\sfront\sof|in\slieu\sof|in\splace\sof|in\sspite\sof|in\sthe\sevent\sthat|in\sto|inside|inside\sof|"

   +@"instead\sof|into|is|isn't|it|itself|just\sin\scase|like|'ll|lots|may|me|mid|might|mightn't|mine|more|most|"

   +@"must|mustn't|myself|near|near\sto|nearest|no|no\sone|nobody|none|not|nothing|notwithstanding|now\sthat|of|"

   +@"ya|ye|yes|you|your|yours|yourself"

   +@")\b", (RegexOptions.IgnoreCase | RegexOptions.Compiled));

The compilation performance problem in the 64-Bit .NET Framework 2.0 is fixed by the hotfix described at http://support.microsoft.com/kb/917507. The fix will also be released broadly in Service Pack 1 of .NET Framework 2.0.

There are also several workarounds to this issue.

Reduce the Regular Expression Pattern

Developers can reduce the size of their regular expressions by simplifying them. For instance, the unoptimized pattern

"aa|ab|ac|ad|ae|af|ag|ah|ai|aj|ak"

can be replaced with this pattern:

"a[a-k]"

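Before swapping in a simplified pattern, it can be worth confirming that it matches the same text as the original. The following is a minimal sketch of such a check for the two patterns above; the class name and the sample inputs are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class PatternEquivalenceCheck
{
    static void Main()
    {
        // The two patterns from the text above.
        Regex longForm = new Regex("aa|ab|ac|ad|ae|af|ag|ah|ai|aj|ak");
        Regex shortForm = new Regex("a[a-k]");

        // Hypothetical sample inputs; a real check would use text from your application.
        string[] samples = { "aa", "ak", "al", "ba", "xaky", "a" };

        foreach (string s in samples)
        {
            Match m1 = longForm.Match(s);
            Match m2 = shortForm.Match(s);

            // The patterns agree on an input when both succeed (or both fail)
            // and they report the same matched text at the same position.
            bool agree = m1.Success == m2.Success
                      && m1.Index == m2.Index
                      && m1.Value == m2.Value;

            Console.WriteLine("{0,-5} long={1,-5} short={2,-5} agree={3}",
                              s, m1.Success, m2.Success, agree);
        }
    }
}
```

A handful of samples is not a proof of equivalence, but it is a cheap way to catch a simplification that changes behavior.
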
Use Regex Pre-Compilation Instead of Compiling-on-the-Fly

Developers can use Regex.CompileToAssembly to build an assembly containing their regular expression, so that the expression is compiled once ahead of time instead of every time the application starts. For more details on Regular Expression Compilation options, please see the CLR Inside Out article in the January 2006 edition of MSDN Magazine.
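
As a rough sketch of this approach, the small build-time program below calls Regex.CompileToAssembly on the .NET Framework. The type name NonWordsRegex, the namespace MyCompany.Text, and the assembly name are assumptions made up for this example, and the pattern is a shortened stand-in for the large word list shown earlier.

```csharp
using System;
using System.Reflection;
using System.Text.RegularExpressions;

class BuildRegexAssembly
{
    static void Main()
    {
        // Shortened stand-in for the large pattern; the full word list would go here.
        string pattern = @"\b(a|aboard|about|above)\b";

        RegexCompilationInfo info = new RegexCompilationInfo(
            pattern,
            RegexOptions.IgnoreCase,
            "NonWordsRegex",    // hypothetical name of the generated Regex type
            "MyCompany.Text",   // hypothetical namespace for the generated type
            true);              // make the generated type public

        // Hypothetical assembly name; the DLL is written to the current directory.
        AssemblyName assemblyName = new AssemblyName("MyCompany.Text.Regexes");

        Regex.CompileToAssembly(new RegexCompilationInfo[] { info }, assemblyName);
    }
}
```

The application would then reference the generated DLL and create the regular expression with new MyCompany.Text.NonWordsRegex(), which derives from Regex, rather than passing the pattern and RegexOptions.Compiled to the Regex constructor at startup.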

Remove the RegexOptions.Compiled Flag From Your Code

If you have never profiled your application, or if you have profiled it and Regex is not the run-time bottleneck, consider dropping the RegexOptions.Compiled flag as a workaround until .NET Framework 2.0 Service Pack 1 is released.
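
One simple way to see what the flag costs at construction time is to time the Regex constructor with and without it. The sketch below does this with System.Diagnostics.Stopwatch; the class and method names are hypothetical, and the pattern is again a shortened stand-in for your own. It measures construction only, not matching speed, and the numbers will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

class RegexConstructionTiming
{
    // Returns the elapsed milliseconds for a single call to the Regex constructor.
    static long TimeConstruction(string pattern, RegexOptions options)
    {
        Stopwatch watch = Stopwatch.StartNew();
        Regex regex = new Regex(pattern, options);
        watch.Stop();

        GC.KeepAlive(regex); // keep the instance referenced until timing is done
        return watch.ElapsedMilliseconds;
    }

    static void Main()
    {
        // Shortened stand-in; substitute the pattern your application actually uses.
        string pattern = @"\b(a|aboard|about|above|absent|according\sto|across|after)\b";

        long withoutFlag = TimeConstruction(pattern, RegexOptions.IgnoreCase);
        long withFlag = TimeConstruction(pattern, RegexOptions.IgnoreCase | RegexOptions.Compiled);

        Console.WriteLine("Without RegexOptions.Compiled: {0} ms", withoutFlag);
        Console.WriteLine("With RegexOptions.Compiled:    {0} ms", withFlag);
    }
}
```

If construction with the flag dominates startup and matching is not a hot path, removing the flag is likely the cheaper option until the fix is installed.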