You can optimize the performance of applications that make extensive use of regular expressions by understanding how the regular expression engine compiles expressions and how regular expressions are cached. This article discusses compilation, source generation, and caching of compiled regular expressions.
By default, the regular expression engine compiles a regular expression to a sequence of internal instructions (these are high-level codes that are different from common intermediate language, or CIL). When the engine executes a regular expression, it interprets the internal codes.
If a Regex object is constructed with the RegexOptions.Compiled option, it compiles the regular expression to explicit CIL code instead of high-level regular expression internal instructions. This allows .NET's just-in-time (JIT) compiler to convert the expression to native machine code for higher performance. The cost of constructing the Regex object may be higher, but the cost of performing matches with it is likely to be much smaller.
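The trade-off described above can be sketched as follows. This is a minimal illustration, not a benchmark: the class name `CompiledRegexSketch`, the pattern, and the helper `CountMatches` are hypothetical and chosen only for this example. Both objects match identically; they differ only in construction cost and matching speed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CompiledRegexSketch
{
    // Hypothetical pattern, used only for illustration.
    private const string Pattern = @"\b\d{3}-\d{4}\b";

    // Default construction: the pattern is parsed into internal instructions
    // that the engine interprets on every match.
    public static readonly Regex Interpreted = new Regex(Pattern);

    // RegexOptions.Compiled: the pattern is emitted as CIL, which the JIT
    // compiler can turn into native code. Construction costs more up front.
    public static readonly Regex Compiled = new Regex(Pattern, RegexOptions.Compiled);

    // Counts matches with whichever Regex instance the caller passes in.
    public static int CountMatches(Regex regex, string input) =>
        regex.Matches(input).Count;
}
```

Because the extra construction cost is paid once, this option tends to pay off only when the same Regex object performs many matches.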
Source generation for regular expressions is available in .NET 7 and later versions. The source generator emits, as C# code, a custom Regex-derived implementation with logic similar to what RegexOptions.Compiled emits in IL. You get all the throughput performance benefits of RegexOptions.Compiled and the start-up benefits of Regex.CompileToAssembly, but without the complexity of CompileToAssembly. The source that's emitted is part of your project, which means it's also easily viewable and debuggable.
Where possible, use source-generated regular expressions instead of compiling regular expressions using the RegexOptions.Compiled option. For more information about source-generated regular expressions, see .NET regular expression source generators.
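A source-generated regular expression might look like the following sketch, assuming a project that targets .NET 7 or later. The class name `GeneratedRegexSketch`, the method names, and the pattern are hypothetical; `GeneratedRegexAttribute` is the attribute that triggers the generator.

```csharp
using System.Text.RegularExpressions;

public static partial class GeneratedRegexSketch
{
    // The source generator supplies the body of this partial method at
    // build time, so no pattern parsing or compilation happens at run time.
    [GeneratedRegex(@"\b\d{3}-\d{4}\b")]
    private static partial Regex PhoneFragment();

    // Hypothetical helper that exposes the generated Regex to callers.
    public static bool ContainsPhoneFragment(string input) =>
        PhoneFragment().IsMatch(input);
}
```

The containing type must be declared `partial` so that the generated implementation can be merged into it.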
To improve performance, the regular expression engine maintains an application-wide cache of compiled regular expressions. The cache stores regular expression patterns that are used only in static method calls. (Regular expression patterns supplied to instance methods aren't cached.) Caching avoids the need to reparse an expression into high-level byte code each time it's used.
The maximum number of cached regular expressions is determined by the value of the static (Shared in Visual Basic) Regex.CacheSize property. By default, the regular expression engine caches up to 15 compiled regular expressions. If the number of compiled regular expressions exceeds the cache size, the least recently used regular expression is discarded and the new regular expression is cached.
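As a rough sketch of how the cache size might be adjusted, the following hypothetical helper raises Regex.CacheSize when the caller cycles through more patterns than the cache currently holds, then uses only static Regex.IsMatch calls so that the cache applies. The class name, method name, and sizing rule are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexCacheSketch
{
    // Counts how many (input, pattern) pairs match, using static calls only.
    public static int CountMatchingPairs(string[] inputs, string[] patterns)
    {
        // Assumption for this sketch: one cache slot per distinct pattern is
        // enough to keep every pattern from being evicted between uses.
        if (patterns.Length > Regex.CacheSize)
        {
            Regex.CacheSize = patterns.Length;
        }

        int hits = 0;
        foreach (string input in inputs)
        {
            foreach (string pattern in patterns)
            {
                // Static call: the parsed pattern is looked up in the cache.
                if (Regex.IsMatch(input, pattern))
                {
                    hits++;
                }
            }
        }
        return hits;
    }
}
```

Whether a larger cache helps depends on how many distinct patterns the application actually rotates through; measuring before and after is the safer guide.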
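Your application can reuse regular expressions in one of the following two ways: by calling a static Regex method, such as Regex.IsMatch, which relies on the engine's cache rather than on a Regex object of your own, or by reusing a single Regex object for as long as it's needed.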
Because of the overhead of object instantiation and regular expression compilation, creating and rapidly destroying numerous Regex objects is an expensive process. For applications that use a large number of different regular expressions, you can optimize performance by using calls to static Regex methods and possibly by increasing the size of the regular expression cache.
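The cost of repeated instantiation can be avoided by holding one Regex object and reusing it, as in this minimal sketch. The class name `RegexReuseSketch`, the pattern, and the method are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexReuseSketch
{
    // Constructed once and shared by every call, instead of creating a new
    // Regex object inside the loop for each line.
    private static readonly Regex Word = new Regex(@"\w+");

    // Counts word-like tokens across all lines using the shared instance.
    public static int CountWords(IEnumerable<string> lines)
    {
        int total = 0;
        foreach (string line in lines)
        {
            total += Word.Matches(line).Count;
        }
        return total;
    }
}
```

Regex objects are immutable and safe to share across threads, which is why a static readonly field is a common home for a pattern that is used repeatedly.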