Best practices for regular expressions in .NET

The regular expression engine in .NET is a powerful, full-featured tool that processes text based on pattern matches rather than on comparing and matching literal text. In most cases, it performs pattern matching rapidly and efficiently. However, in some cases, the regular expression engine can appear to be very slow. In extreme cases, it can even appear to stop responding as it processes a relatively small input over the course of hours or even days.

This topic outlines some of the best practices that developers can adopt to ensure that their regular expressions achieve optimal performance.

Consider the input source

In general, regular expressions can accept two types of input: constrained or unconstrained. Constrained input is text that originates from a known or reliable source and follows a predefined format. Unconstrained input is text that originates from an unreliable source, such as a web user, and may not follow a predefined or expected format.

Regular expression patterns are typically written to match valid input. That is, developers examine the text that they want to match and then write a regular expression pattern that matches it. Developers then determine whether this pattern requires correction or further elaboration by testing it with multiple valid input items. When the pattern matches all presumed valid inputs, it is declared to be production-ready and can be included in a released application. This makes a regular expression pattern suitable for matching constrained input. However, it does not make it suitable for matching unconstrained input.

To match unconstrained input, a regular expression must be able to efficiently handle three kinds of text:

  • Text that matches the regular expression pattern.

  • Text that does not match the regular expression pattern.

  • Text that nearly matches the regular expression pattern.

The last text type is especially problematic for a regular expression that has been written to handle constrained input. If that regular expression also relies on extensive backtracking, the regular expression engine can spend an inordinate amount of time (in some cases, many hours or days) processing seemingly innocuous text.

Warning

The following example uses a regular expression that is prone to excessive backtracking and that is likely to reject valid email addresses. You should not use it in an email validation routine. If you would like a regular expression that validates email addresses, see How to: Verify that Strings Are in Valid Email Format.

For example, consider a very commonly used but extremely problematic regular expression for validating the alias of an email address. The regular expression ^[0-9A-Z]([-.\w]*[0-9A-Z])*$ is written to process what is considered to be a valid email alias, which consists of an alphanumeric character, followed by zero or more characters that can be alphanumeric, periods, or hyphens. The alias must also end with an alphanumeric character. However, as the following example shows, although this regular expression handles valid input easily, it is very inefficient when it processes nearly valid input.

using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      Stopwatch sw;    
      string[] addresses = { "AAAAAAAAAAA@contoso.com", 
                             "AAAAAAAAAAaaaaaaaaaa!@contoso.com" };
      // The following regular expression should not actually be used to 
      // validate an email address.
      string pattern = @"^[0-9A-Z]([-.\w]*[0-9A-Z])*$";
      string input; 
      
      foreach (var address in addresses) {
         string mailBox = address.Substring(0, address.IndexOf("@"));       
         int index = 0;
         for (int ctr = mailBox.Length - 1; ctr >= 0; ctr--) {
            index++;

            input = mailBox.Substring(ctr, index); 
            sw = Stopwatch.StartNew();
            Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
            sw.Stop();
            if (m.Success)
               Console.WriteLine("{0,2}. Matched '{1,25}' in {2}", 
                                 index, m.Value, sw.Elapsed);
            else                     
               Console.WriteLine("{0,2}. Failed  '{1,25}' in {2}", 
                                 index, input, sw.Elapsed);
         }
         Console.WriteLine();
      }
   }
}

// The example displays output similar to the following:
//     1. Matched '                        A' in 00:00:00.0007122
//     2. Matched '                       AA' in 00:00:00.0000282
//     3. Matched '                      AAA' in 00:00:00.0000042
//     4. Matched '                     AAAA' in 00:00:00.0000038
//     5. Matched '                    AAAAA' in 00:00:00.0000042
//     6. Matched '                   AAAAAA' in 00:00:00.0000042
//     7. Matched '                  AAAAAAA' in 00:00:00.0000042
//     8. Matched '                 AAAAAAAA' in 00:00:00.0000087
//     9. Matched '                AAAAAAAAA' in 00:00:00.0000045
//    10. Matched '               AAAAAAAAAA' in 00:00:00.0000045
//    11. Matched '              AAAAAAAAAAA' in 00:00:00.0000045
//    
//     1. Failed  '                        !' in 00:00:00.0000447
//     2. Failed  '                       a!' in 00:00:00.0000071
//     3. Failed  '                      aa!' in 00:00:00.0000071
//     4. Failed  '                     aaa!' in 00:00:00.0000061
//     5. Failed  '                    aaaa!' in 00:00:00.0000081
//     6. Failed  '                   aaaaa!' in 00:00:00.0000126
//     7. Failed  '                  aaaaaa!' in 00:00:00.0000359
//     8. Failed  '                 aaaaaaa!' in 00:00:00.0000414
//     9. Failed  '                aaaaaaaa!' in 00:00:00.0000758
//    10. Failed  '               aaaaaaaaa!' in 00:00:00.0001462
//    11. Failed  '              aaaaaaaaaa!' in 00:00:00.0002885
//    12. Failed  '             Aaaaaaaaaaa!' in 00:00:00.0005780
//    13. Failed  '            AAaaaaaaaaaa!' in 00:00:00.0011628
//    14. Failed  '           AAAaaaaaaaaaa!' in 00:00:00.0022851
//    15. Failed  '          AAAAaaaaaaaaaa!' in 00:00:00.0045864
//    16. Failed  '         AAAAAaaaaaaaaaa!' in 00:00:00.0093168
//    17. Failed  '        AAAAAAaaaaaaaaaa!' in 00:00:00.0185993
//    18. Failed  '       AAAAAAAaaaaaaaaaa!' in 00:00:00.0366723
//    19. Failed  '      AAAAAAAAaaaaaaaaaa!' in 00:00:00.1370108
//    20. Failed  '     AAAAAAAAAaaaaaaaaaa!' in 00:00:00.1553966
//    21. Failed  '    AAAAAAAAAAaaaaaaaaaa!' in 00:00:00.3223372

Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim sw As Stopwatch    
      Dim addresses() As String = { "AAAAAAAAAAA@contoso.com", 
                                 "AAAAAAAAAAaaaaaaaaaa!@contoso.com" }
      ' The following regular expression should not actually be used to 
      ' validate an email address.
      Dim pattern As String = "^[0-9A-Z]([-.\w]*[0-9A-Z])*$"
      Dim input As String 
      
      For Each address In addresses
         Dim mailBox As String = address.Substring(0, address.IndexOf("@"))       
         Dim index As Integer = 0
         For ctr As Integer = mailBox.Length - 1 To 0 Step -1
            index += 1
            input = mailBox.Substring(ctr, index) 
            sw = Stopwatch.StartNew()
            Dim m As Match = Regex.Match(input, pattern, RegexOptions.IgnoreCase)
            sw.Stop()
            if m.Success Then
               Console.WriteLine("{0,2}. Matched '{1,25}' in {2}", 
                                 index, m.Value, sw.Elapsed)
            Else                     
               Console.WriteLine("{0,2}. Failed  '{1,25}' in {2}", 
                                 index, input, sw.Elapsed)
            End If                  
         Next
         Console.WriteLine()
      Next
   End Sub
End Module
' The example displays output similar to the following:
'     1. Matched '                        A' in 00:00:00.0007122
'     2. Matched '                       AA' in 00:00:00.0000282
'     3. Matched '                      AAA' in 00:00:00.0000042
'     4. Matched '                     AAAA' in 00:00:00.0000038
'     5. Matched '                    AAAAA' in 00:00:00.0000042
'     6. Matched '                   AAAAAA' in 00:00:00.0000042
'     7. Matched '                  AAAAAAA' in 00:00:00.0000042
'     8. Matched '                 AAAAAAAA' in 00:00:00.0000087
'     9. Matched '                AAAAAAAAA' in 00:00:00.0000045
'    10. Matched '               AAAAAAAAAA' in 00:00:00.0000045
'    11. Matched '              AAAAAAAAAAA' in 00:00:00.0000045
'    
'     1. Failed  '                        !' in 00:00:00.0000447
'     2. Failed  '                       a!' in 00:00:00.0000071
'     3. Failed  '                      aa!' in 00:00:00.0000071
'     4. Failed  '                     aaa!' in 00:00:00.0000061
'     5. Failed  '                    aaaa!' in 00:00:00.0000081
'     6. Failed  '                   aaaaa!' in 00:00:00.0000126
'     7. Failed  '                  aaaaaa!' in 00:00:00.0000359
'     8. Failed  '                 aaaaaaa!' in 00:00:00.0000414
'     9. Failed  '                aaaaaaaa!' in 00:00:00.0000758
'    10. Failed  '               aaaaaaaaa!' in 00:00:00.0001462
'    11. Failed  '              aaaaaaaaaa!' in 00:00:00.0002885
'    12. Failed  '             Aaaaaaaaaaa!' in 00:00:00.0005780
'    13. Failed  '            AAaaaaaaaaaa!' in 00:00:00.0011628
'    14. Failed  '           AAAaaaaaaaaaa!' in 00:00:00.0022851
'    15. Failed  '          AAAAaaaaaaaaaa!' in 00:00:00.0045864
'    16. Failed  '         AAAAAaaaaaaaaaa!' in 00:00:00.0093168
'    17. Failed  '        AAAAAAaaaaaaaaaa!' in 00:00:00.0185993
'    18. Failed  '       AAAAAAAaaaaaaaaaa!' in 00:00:00.0366723
'    19. Failed  '      AAAAAAAAaaaaaaaaaa!' in 00:00:00.1370108
'    20. Failed  '     AAAAAAAAAaaaaaaaaaa!' in 00:00:00.1553966
'    21. Failed  '    AAAAAAAAAAaaaaaaaaaa!' in 00:00:00.3223372

As the output from the example shows, the regular expression engine processes the valid email alias in about the same time interval regardless of its length. On the other hand, when the nearly valid email alias has more than five characters, processing time approximately doubles for each additional character in the string. Extrapolating from the timings shown in this output, a nearly valid 35-character string would take over an hour to process, and a nearly valid 39-character string would take nearly a day.

Because this regular expression was developed solely by considering the format of input to be matched, it fails to take account of input that does not match the pattern. This, in turn, can allow unconstrained input that nearly matches the regular expression pattern to significantly degrade performance.

To solve this problem, you can do the following:

  • When developing a pattern, you should consider how backtracking might affect the performance of the regular expression engine, particularly if your regular expression is designed to process unconstrained input. For more information, see the Take Charge of Backtracking section.

  • Thoroughly test your regular expression using invalid and near-valid input as well as valid input. To generate input for a particular regular expression randomly, you can use Rex, which is a regular expression exploration tool from Microsoft Research.
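
The two suggestions above can be sketched as a small probe that runs a pattern against valid, invalid, and nearly valid input under a match time-out. This is only an illustration under stated assumptions: the class name NearValidInputTest, the one-second time-out, and the sample strings are hypothetical, and the Safer pattern is one possible rewrite with less nested repetition, not necessarily the rewrite a given application needs. The sketch relies on the Regex.IsMatch overload that accepts a TimeSpan and on RegexMatchTimeoutException, both available since .NET Framework 4.5.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NearValidInputTest
{
   // Pattern from the example above; its nested quantifiers invite heavy backtracking.
   public const string Risky = @"^[0-9A-Z]([-.\w]*[0-9A-Z])*$";

   // Hypothetical rewrite: a single quantifier, plus a lookbehind that checks
   // that the last character is alphanumeric.
   public const string Safer = @"^[0-9A-Z][-.\w]*(?<=[0-9A-Z])$";

   // Returns "match", "no match", or "timeout" for the given pattern and input.
   public static string Probe(string pattern, string input, TimeSpan timeout)
   {
      try {
         return Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase, timeout)
                ? "match" : "no match";
      }
      catch (RegexMatchTimeoutException) {
         return "timeout";
      }
   }

   public static void Main()
   {
      string[] inputs = { "AAAAAAAAAAA",                                // valid
                          "!!!",                                        // invalid
                          "AAAAAAAAAAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!" }; // nearly valid
      TimeSpan timeout = TimeSpan.FromSeconds(1);

      foreach (string input in inputs)
         Console.WriteLine("{0,-42} risky: {1,-9} safer: {2}", input,
                           Probe(Risky, input, timeout),
                           Probe(Safer, input, timeout));
   }
}
```

Treating a time-out as a test failure makes a backtracking problem visible during development instead of in production. Whether the risky pattern actually times out depends on the runtime version and the hardware.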

Handle object instantiation appropriately

At the heart of .NET’s regular expression object model is the System.Text.RegularExpressions.Regex class, which represents the regular expression engine. Often, the single greatest factor that affects regular expression performance is the way in which the Regex engine is used. Defining a regular expression involves tightly coupling the regular expression engine with a regular expression pattern. That coupling process, whether it involves instantiating a Regex object by passing its constructor a regular expression pattern or calling a static method by passing it the regular expression pattern along with the string to be analyzed, is by necessity an expensive one.
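
To get a rough feel for the cost of that coupling step, the following sketch compares constructing a Regex for every input with reusing a single instance. The class name InstantiationCost, the phone-number-like pattern, and the generated inputs are assumptions made for illustration, and the timings it prints depend on the machine and the runtime.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class InstantiationCost
{
   // Counts matching inputs while constructing a new Regex for every input.
   public static int CountWithNewInstances(string pattern, string[] inputs)
   {
      int hits = 0;
      foreach (string input in inputs)
         if (new Regex(pattern).IsMatch(input))
            hits++;
      return hits;
   }

   // Counts matching inputs while reusing a single Regex instance.
   public static int CountWithSharedInstance(string pattern, string[] inputs)
   {
      Regex regex = new Regex(pattern);
      int hits = 0;
      foreach (string input in inputs)
         if (regex.IsMatch(input))
            hits++;
      return hits;
   }

   public static void Main()
   {
      string pattern = @"\b\d{3}-\d{4}\b";     // hypothetical pattern
      string[] inputs = new string[10000];
      for (int i = 0; i < inputs.Length; i++)
         inputs[i] = "call 555-" + i.ToString("D4");

      Stopwatch sw = Stopwatch.StartNew();
      int perCall = CountWithNewInstances(pattern, inputs);
      sw.Stop();
      Console.WriteLine("New instance per input: {0:N0} matches in {1}", perCall, sw.Elapsed);

      sw = Stopwatch.StartNew();
      int shared = CountWithSharedInstance(pattern, inputs);
      sw.Stop();
      Console.WriteLine("Shared instance:        {0:N0} matches in {1}", shared, sw.Elapsed);
   }
}
```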

Note

For a more detailed discussion of the performance implications of using interpreted and compiled regular expressions, see Optimizing Regular Expression Performance, Part II: Taking Charge of Backtracking in the BCL Team blog.

You can couple the regular expression engine with a particular regular expression pattern and then use the engine to match text in several ways:

  • You can call a static pattern-matching method, such as Regex.Match(String, String). This does not require instantiation of a regular expression object.

  • You can instantiate a Regex object and call an instance pattern-matching method of an interpreted regular expression. This is the default method for binding the regular expression engine to a regular expression pattern. It results when a Regex object is instantiated without an options argument that includes the Compiled flag.

  • You can instantiate a Regex object and call an instance pattern-matching method of a compiled regular expression. Regular expression objects represent compiled patterns when a Regex object is instantiated with an options argument that includes the Compiled flag.

  • You can create a special-purpose Regex object that is tightly coupled with a particular regular expression pattern, compile it, and save it to a standalone assembly. You do this by calling the Regex.CompileToAssembly method.

The particular way in which you call regular expression matching methods can have a significant impact on your application. The following sections discuss when to use static method calls, interpreted regular expressions, and compiled regular expressions to improve your application's performance.

Important

The form of the method call (static, interpreted, compiled) affects performance if the same regular expression is used repeatedly in method calls, or if an application makes extensive use of regular expression objects.
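
As a side-by-side reference, the first three call forms listed above might look like the following sketch. The class name BindingStyles, its method names, and the sample pattern are hypothetical; only Regex.Match, the Regex constructors, and RegexOptions.Compiled come from the .NET API. All three forms return the same match and differ only in how and when the pattern is bound to the engine.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BindingStyles
{
   // Hypothetical pattern: a word that ends in "ing".
   private const string Pattern = @"\b\w+ing\b";

   // Instance forms: each object is constructed once and reused by every call.
   private static readonly Regex Interpreted = new Regex(Pattern);
   private static readonly Regex Compiled = new Regex(Pattern, RegexOptions.Compiled);

   // Static form: no Regex object is instantiated by the caller.
   public static string FirstMatchStatic(string text)
   {
      return Regex.Match(text, Pattern).Value;
   }

   // Interpreted instance form (no Compiled flag).
   public static string FirstMatchInterpreted(string text)
   {
      return Interpreted.Match(text).Value;
   }

   // Compiled instance form (Compiled flag set).
   public static string FirstMatchCompiled(string text)
   {
      return Compiled.Match(text).Value;
   }

   public static void Main()
   {
      string text = "The engine is processing text";
      Console.WriteLine("Static:      {0}", FirstMatchStatic(text));
      Console.WriteLine("Interpreted: {0}", FirstMatchInterpreted(text));
      Console.WriteLine("Compiled:    {0}", FirstMatchCompiled(text));
   }
}
```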

Static regular expressions

Static regular expression methods are recommended as an alternative to repeatedly instantiating a regular expression object with the same regular expression. Unlike regular expression patterns used by regular expression objects, either the operation codes or the compiled Microsoft intermediate language (MSIL) from patterns used in static method calls is cached internally by the regular expression engine.

For example, an event handler frequently calls another method to validate user input. This is reflected in the following code, in which a Button control's Click event is used to call a method named IsValidCurrency, which checks whether the user has entered a currency symbol followed by at least one decimal digit.

public void OKButton_Click(object sender, EventArgs e) 
{
   if (! String.IsNullOrEmpty(sourceCurrency.Text))
      if (RegexLib.IsValidCurrency(sourceCurrency.Text))
         PerformConversion();
      else
         status.Text = "The source currency value is invalid.";
}

Public Sub OKButton_Click(sender As Object, e As EventArgs) _
           Handles OKButton.Click

   If Not String.IsNullOrEmpty(sourceCurrency.Text) Then
      If RegexLib.IsValidCurrency(sourceCurrency.Text) Then
         PerformConversion()
      Else
         status.Text = "The source currency value is invalid."
      End If          
   End If
End Sub

A very inefficient implementation of the IsValidCurrency method is shown in the following example. Note that each method call reinstantiates a Regex object with the same pattern. This, in turn, means that the regular expression pattern must be recompiled each time the method is called.

using System;
using System.Text.RegularExpressions;

public class RegexLib
{
   public static bool IsValidCurrency(string currencyValue)
   {
      string pattern = @"\p{Sc}+\s*\d+";
      Regex currencyRegex = new Regex(pattern);
      return currencyRegex.IsMatch(currencyValue);
   }
}

Imports System.Text.RegularExpressions

Public Module RegexLib
   Public Function IsValidCurrency(currencyValue As String) As Boolean
      Dim pattern As String = "\p{Sc}+\s*\d+"
      Dim currencyRegex As New Regex(pattern)
      Return currencyRegex.IsMatch(currencyValue) 
   End Function
End Module

You should replace this inefficient code with a call to the static Regex.IsMatch(String, String) method. This eliminates the need to instantiate a Regex object each time you want to call a pattern-matching method, and enables the regular expression engine to retrieve a compiled version of the regular expression from its cache.

using System;
using System.Text.RegularExpressions;

public class RegexLib
{
   public static bool IsValidCurrency(string currencyValue)
   {
      string pattern = @"\p{Sc}+\s*\d+";
      return Regex.IsMatch(currencyValue, pattern); 
   }
}

Imports System.Text.RegularExpressions

Public Module RegexLib
   Public Function IsValidCurrency(currencyValue As String) As Boolean
      Dim pattern As String = "\p{Sc}+\s*\d+"
      Return Regex.IsMatch(currencyValue, pattern)
   End Function
End Module

By default, the 15 most recently used static regular expression patterns are cached. For applications that require a larger number of cached static regular expressions, the size of the cache can be adjusted by setting the Regex.CacheSize property.
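
A minimal sketch of adjusting the cache follows. Regex.CacheSize is the static property named above; the helper name EnsureCacheSize and the value 50 are assumptions chosen for illustration. An application would normally set the size once at startup, before its static regular expression methods are called.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CacheSizeExample
{
   // Raises the static regular expression cache size if it is below the
   // requested minimum, and returns the size that is in effect afterward.
   public static int EnsureCacheSize(int minimum)
   {
      if (Regex.CacheSize < minimum)
         Regex.CacheSize = minimum;
      return Regex.CacheSize;
   }

   public static void Main()
   {
      Console.WriteLine("Cache size before: {0}", Regex.CacheSize);
      Console.WriteLine("Cache size after:  {0}", EnsureCacheSize(50));
   }
}
```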

The regular expression \p{Sc}+\s*\d+ that is used in this example verifies that the input string contains one or more currency symbols followed by optional white space and at least one decimal digit. The pattern is defined as shown in the following table.

Pattern     Description
\p{Sc}+     Match one or more characters in the Unicode Symbol, Currency category.
\s*         Match zero or more white-space characters.
\d+         Match one or more decimal digits.
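
The following sketch runs the pattern against a few sample strings. Because the pattern has no anchors, it succeeds whenever a currency amount appears anywhere in the input. The class name CurrencyPatternDemo, the sample strings, and the anchored variant built in IsOnlyCurrencyAmount are assumptions added for comparison and are not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CurrencyPatternDemo
{
   public const string Pattern = @"\p{Sc}+\s*\d+";

   // True if a currency amount appears anywhere in the input.
   public static bool ContainsCurrencyAmount(string input)
   {
      return Regex.IsMatch(input, Pattern);
   }

   // Hypothetical stricter variant: the whole input must be a currency amount.
   public static bool IsOnlyCurrencyAmount(string input)
   {
      return Regex.IsMatch(input, "^" + Pattern + "$");
   }

   public static void Main()
   {
      // \u20AC is the euro sign, which belongs to the Symbol, Currency category.
      string[] samples = { "$100", "\u20AC 25", "100", "Total: $5 due" };
      foreach (string sample in samples)
         Console.WriteLine("{0,-15} contains: {1,-6} whole input: {2}", sample,
                           ContainsCurrencyAmount(sample),
                           IsOnlyCurrencyAmount(sample));
   }
}
```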

Interpreted vs. compiled regular expressions

Regular expression patterns that are not bound to the regular expression engine through the specification of the Compiled option are interpreted. When a regular expression object is instantiated, the regular expression engine converts the regular expression to a set of operation codes. When an instance method is called, the operation codes are converted to MSIL and executed by the JIT compiler. Similarly, when a static regular expression method is called and the regular expression cannot be found in the cache, the regular expression engine converts the regular expression to a set of operation codes and stores them in the cache. It then converts these operation codes to MSIL so that the JIT compiler can execute them. Interpreted regular expressions reduce startup time at the cost of slower execution time. Because of this, they are best used when the regular expression is used in a small number of method calls, or if the exact number of calls to regular expression methods is unknown but is expected to be small. As the number of method calls increases, the performance gain from reduced startup time is outstripped by the slower execution speed.

Regular expression patterns that are bound to the regular expression engine through the specification of the Compiled option are compiled. This means that, when a regular expression object is instantiated, or when a static regular expression method is called and the regular expression cannot be found in the cache, the regular expression engine converts the regular expression to an intermediary set of operation codes, which it then converts to MSIL. When a method is called, the JIT compiler executes the MSIL. In contrast to interpreted regular expressions, compiled regular expressions increase startup time but execute individual pattern-matching methods faster. As a result, the performance benefit that results from compiling the regular expression increases in proportion to the number of regular expression methods called.

To summarize, we recommend that you use interpreted regular expressions when you call regular expression methods with a specific regular expression relatively infrequently. You should use compiled regular expressions when you call regular expression methods with a specific regular expression relatively frequently. The exact threshold at which the slower execution speeds of interpreted regular expressions outweigh gains from their reduced startup time, or the threshold at which the slower startup times of compiled regular expressions outweigh gains from their faster execution speeds, is difficult to determine. It depends on a variety of factors, including the complexity of the regular expression and the specific data that it processes. To determine whether interpreted or compiled regular expressions offer the best performance for your particular application scenario, you can use the Stopwatch class to compare their execution times.
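
One way to make such a comparison repeatable is a small timing helper like the sketch below, which needs no input file. The class name InterpretedVsCompiled, the helper TimeMatches, the pattern, and the synthetic input are all assumptions that stand in for an application's real pattern and data. The stopwatch is started before the Regex object is constructed so that startup cost is part of the measurement.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class InterpretedVsCompiled
{
   // Times construction of the Regex plus evaluation of every match in the input.
   public static TimeSpan TimeMatches(string pattern, RegexOptions options,
                                      string input, out int count)
   {
      Stopwatch sw = Stopwatch.StartNew();
      Regex regex = new Regex(pattern, options);
      count = regex.Matches(input).Count;   // Reading Count evaluates all matches.
      sw.Stop();
      return sw.Elapsed;
   }

   public static void Main()
   {
      // Synthetic input: a stand-in for real application data.
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < 20000; i++)
         sb.Append("item").Append(i).Append(" costs 42 units. ");
      string input = sb.ToString();
      string pattern = @"\b\d+\b";          // hypothetical pattern

      int count;
      TimeSpan elapsed = TimeMatches(pattern, RegexOptions.None, input, out count);
      Console.WriteLine("Interpreted: {0:N0} matches in {1}", count, elapsed);

      elapsed = TimeMatches(pattern, RegexOptions.Compiled, input, out count);
      Console.WriteLine("Compiled:    {0:N0} matches in {1}", count, elapsed);
   }
}
```

Running the helper with both a small and a large input gives a rough idea of where the threshold lies for a particular pattern.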

The following example compares the performance of compiled and interpreted regular expressions when reading the first ten sentences and when reading all the sentences in the text of Theodore Dreiser's The Financier. As the output from the example shows, when only ten calls are made to regular expression matching methods, an interpreted regular expression offers better performance than a compiled regular expression. However, a compiled regular expression offers better performance when a large number of calls (in this case, over 13,000) are made.

using System;
using System.Diagnostics;
using System.IO;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b(\w+((\r?\n)|,?\s))*\w+[.?:;!]";
      Stopwatch sw;
      Match match;
      int ctr;

      StreamReader inFile = new StreamReader(@".\Dreiser_TheFinancier.txt");
      string input = inFile.ReadToEnd();
      inFile.Close();
      
      // Read first ten sentences with interpreted regex.
      Console.WriteLine("10 Sentences with Interpreted Regex:");
      sw = Stopwatch.StartNew();
      Regex int10 = new Regex(pattern, RegexOptions.Singleline);
      match = int10.Match(input);
      for (ctr = 0; ctr <= 9; ctr++) {
         if (match.Success)
            // Do nothing with the match except get the next match.
            match = match.NextMatch();
         else
            break;
      }
      sw.Stop();
      Console.WriteLine("   {0} matches in {1}", ctr, sw.Elapsed);
      
      // Read first ten sentences with compiled regex.
      Console.WriteLine("10 Sentences with Compiled Regex:");
      sw = Stopwatch.StartNew();
      Regex comp10 = new Regex(pattern, 
                   RegexOptions.Singleline | RegexOptions.Compiled);
      match = comp10.Match(input);
      for (ctr = 0; ctr <= 9; ctr++) {
         if (match.Success)
            // Do nothing with the match except get the next match.
            match = match.NextMatch();
         else
            break;
      }
      sw.Stop();
      Console.WriteLine("   {0} matches in {1}", ctr, sw.Elapsed);
      
      // Read all sentences with interpreted regex.
      Console.WriteLine("All Sentences with Interpreted Regex:");
      sw = Stopwatch.StartNew();
      Regex intAll = new Regex(pattern, RegexOptions.Singleline);
      match = intAll.Match(input);
      int matches = 0;
      while (match.Success) {
         matches++;
         // Do nothing with the match except get the next match.
         match = match.NextMatch();
      }
      sw.Stop();
      Console.WriteLine("   {0:N0} matches in {1}", matches, sw.Elapsed);
      
      // Read all sentences with compiled regex.
      Console.WriteLine("All Sentences with Compiled Regex:");
      sw = Stopwatch.StartNew();
      Regex compAll = new Regex(pattern, 
                      RegexOptions.Singleline | RegexOptions.Compiled);
      match = compAll.Match(input);
      matches = 0;
      while (match.Success) {
         matches++;
         // Do nothing with the match except get the next match.
         match = match.NextMatch();
      }
      sw.Stop();
      Console.WriteLine("   {0:N0} matches in {1}", matches, sw.Elapsed);      
   }
}
// The example displays output like the following:
//       10 Sentences with Interpreted Regex:
//          10 matches in 00:00:00.0047491
//       10 Sentences with Compiled Regex:
//          10 matches in 00:00:00.0141872
//       All Sentences with Interpreted Regex:
//          13,443 matches in 00:00:01.1929928
//       All Sentences with Compiled Regex:
//          13,443 matches in 00:00:00.7635869
//       
//       >compare1
//       10 Sentences with Interpreted Regex:
//          10 matches in 00:00:00.0046914
//       10 Sentences with Compiled Regex:
//          10 matches in 00:00:00.0143727
//       All Sentences with Interpreted Regex:
//          13,443 matches in 00:00:01.1514100
//       All Sentences with Compiled Regex:
//          13,443 matches in 00:00:00.7432921

Imports System.Diagnostics
Imports System.IO
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "\b(\w+((\r?\n)|,?\s))*\w+[.?:;!]"
      Dim sw As Stopwatch
      Dim match As Match
      Dim ctr As Integer

      Dim inFile As New StreamReader(".\Dreiser_TheFinancier.txt")
      Dim input As String = inFile.ReadToEnd()
      inFile.Close()
      
      ' Read first ten sentences with interpreted regex.
      Console.WriteLine("10 Sentences with Interpreted Regex:")
      sw = Stopwatch.StartNew()
      Dim int10 As New Regex(pattern, RegexOptions.SingleLine)
      match = int10.Match(input)
      For ctr = 0 To 9
         If match.Success Then
            ' Do nothing with the match except get the next match.
            match = match.NextMatch()
         Else
            Exit For
         End If
      Next
      sw.Stop()
      Console.WriteLine("   {0} matches in {1}", ctr, sw.Elapsed)
      
      ' Read first ten sentences with compiled regex.
      Console.WriteLine("10 Sentences with Compiled Regex:")
      sw = Stopwatch.StartNew()
      Dim comp10 As New Regex(pattern, 
                   RegexOptions.SingleLine Or RegexOptions.Compiled)
      match = comp10.Match(input)
      For ctr = 0 To 9
         If match.Success Then
            ' Do nothing with the match except get the next match.
            match = match.NextMatch()
         Else
            Exit For
         End If
      Next
      sw.Stop()
      Console.WriteLine("   {0} matches in {1}", ctr, sw.Elapsed)
      
      ' Read all sentences with interpreted regex.
      Console.WriteLine("All Sentences with Interpreted Regex:")
      sw = Stopwatch.StartNew()
      Dim intAll As New Regex(pattern, RegexOptions.SingleLine)
      match = intAll.Match(input)
      Dim matches As Integer = 0
      Do While match.Success
         matches += 1
         ' Do nothing with the match except get the next match.
         match = match.NextMatch()
      Loop
      sw.Stop()
      Console.WriteLine("   {0:N0} matches in {1}", matches, sw.Elapsed)
      
      ' Read all sentences with compiled regex.
      Console.WriteLine("All Sentences with Compiled Regex:")
      sw = Stopwatch.StartNew()
      Dim compAll As New Regex(pattern, 
                     RegexOptions.SingleLine Or RegexOptions.Compiled)
      match = compAll.Match(input)
      matches = 0
      Do While match.Success
         matches += 1
         ' Do nothing with the match except get the next match.
         match = match.NextMatch()
      Loop
      sw.Stop()
      Console.WriteLine("   {0:N0} matches in {1}", matches, sw.Elapsed)      
   End Sub
End Module
' The example displays output like the following:
'       10 Sentences with Interpreted Regex:
'          10 matches in 00:00:00.0047491
'       10 Sentences with Compiled Regex:
'          10 matches in 00:00:00.0141872
'       All Sentences with Interpreted Regex:
'          13,443 matches in 00:00:01.1929928
'       All Sentences with Compiled Regex:
'          13,443 matches in 00:00:00.7635869
'       
'       >compare1
'       10 Sentences with Interpreted Regex:
'          10 matches in 00:00:00.0046914
'       10 Sentences with Compiled Regex:
'          10 matches in 00:00:00.0143727
'       All Sentences with Interpreted Regex:
'          13,443 matches in 00:00:01.1514100
'       All Sentences with Compiled Regex:
'          13,443 matches in 00:00:00.7432921

The regular expression pattern used in the example, \b(\w+((\r?\n)|,?\s))*\w+[.?:;!], is defined as shown in the following table.

Pattern Description
\b Begin the match at a word boundary.
\w+ Match one or more word characters.
((\r?\n)|,?\s) Match either zero or one carriage return followed by a newline character, or zero or one comma followed by a white-space character.
(\w+((\r?\n)|,?\s))* Match zero or more occurrences of one or more word characters that are followed either by zero or one carriage return and a newline character, or by zero or one comma followed by a white-space character.
\w+ Match one or more word characters.
[.?:;!] Match a period, question mark, colon, semicolon, or exclamation point.

Regular expressions: Compiled to an assembly

.NET also enables you to create an assembly that contains compiled regular expressions. This moves the performance hit of regular expression compilation from run time to design time. However, it also involves some additional work: you must define the regular expressions in advance and compile them to an assembly. The compiler can then reference this assembly when compiling source code that uses the assembly's regular expressions. Each compiled regular expression in the assembly is represented by a class that derives from Regex.

To compile regular expressions to an assembly, you call the Regex.CompileToAssembly(RegexCompilationInfo[], AssemblyName) method and pass it an array of RegexCompilationInfo objects that represent the regular expressions to be compiled, and an AssemblyName object that contains information about the assembly to be created.

We recommend that you compile regular expressions to an assembly in the following situations:

  • If you are a component developer who wants to create a library of reusable regular expressions.

  • If you expect your regular expression's pattern-matching methods to be called an indeterminate number of times, anywhere from once or twice to thousands or tens of thousands of times. Unlike compiled or interpreted regular expressions, regular expressions that are compiled to separate assemblies offer performance that is consistent regardless of the number of method calls.

If you are using compiled regular expressions to optimize performance, you should not use reflection to create the assembly, load the regular expression engine, and execute its pattern-matching methods. This requires that you avoid building regular expression patterns dynamically, and that you specify any pattern-matching options (such as case-insensitive pattern matching) at the time the assembly is created. It also requires that you separate the code that creates the assembly from the code that uses the regular expression.

The following example shows how to create an assembly that contains a compiled regular expression. It creates an assembly named RegexLib.dll with a single regular expression class, SentencePattern, that contains the sentence-matching regular expression pattern used in the Interpreted vs. Compiled Regular Expressions section.

using System;
using System.Reflection;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      RegexCompilationInfo SentencePattern =
                           new RegexCompilationInfo(@"\b(\w+((\r?\n)|,?\s))*\w+[.?:;!]",
                                                    RegexOptions.Multiline,
                                                    "SentencePattern",
                                                    "Utilities.RegularExpressions",
                                                    true);
      RegexCompilationInfo[] regexes = { SentencePattern };
      AssemblyName assemName = new AssemblyName("RegexLib, Version=1.0.0.1001, Culture=neutral, PublicKeyToken=null");
      Regex.CompileToAssembly(regexes, assemName);
   }
}
Imports System.Reflection
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim SentencePattern As New RegexCompilationInfo("\b(\w+((\r?\n)|,?\s))*\w+[.?:;!]",
                                                      RegexOptions.Multiline,
                                                      "SentencePattern",
                                                      "Utilities.RegularExpressions",
                                                      True)
      Dim regexes() As RegexCompilationInfo = {SentencePattern}
      Dim assemName As New AssemblyName("RegexLib, Version=1.0.0.1001, Culture=neutral, PublicKeyToken=null")
      Regex.CompileToAssembly(regexes, assemName)
   End Sub
End Module

When the example is compiled to an executable and run, it creates an assembly named RegexLib.dll. The regular expression is represented by a class named Utilities.RegularExpressions.SentencePattern that is derived from Regex. The following example then uses the compiled regular expression to extract the sentences from the text of Theodore Dreiser's The Financier.

using System;
using System.IO;
using System.Text.RegularExpressions;
using Utilities.RegularExpressions;

public class Example
{
   public static void Main()
   {
      SentencePattern pattern = new SentencePattern();
      StreamReader inFile = new StreamReader(@".\Dreiser_TheFinancier.txt");
      string input = inFile.ReadToEnd();
      inFile.Close();
      
      MatchCollection matches = pattern.Matches(input);
      Console.WriteLine("Found {0:N0} sentences.", matches.Count);      
   }
}
// The example displays the following output:
//      Found 13,443 sentences.
Imports System.IO
Imports System.Text.RegularExpressions
Imports Utilities.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As New SentencePattern()
      Dim inFile As New StreamReader(".\Dreiser_TheFinancier.txt")
      Dim input As String = inFile.ReadToEnd()
      inFile.Close()
      
      Dim matches As MatchCollection = pattern.Matches(input)
      Console.WriteLine("Found {0:N0} sentences.", matches.Count)      
   End Sub
End Module
' The example displays the following output:
'      Found 13,443 sentences.

Take charge of backtracking

Ordinarily, the regular expression engine uses linear progression to move through an input string and compare it to a regular expression pattern. However, when indeterminate quantifiers such as *, +, and ? are used in a regular expression pattern, the regular expression engine may give up a portion of successful partial matches and return to a previously saved state in order to search for a successful match for the entire pattern. This process is known as backtracking.

Note

For more information on backtracking, see Details of Regular Expression Behavior and Backtracking. For a detailed discussion of backtracking, see Optimizing Regular Expression Performance, Part II: Taking Charge of Backtracking in the BCL Team blog.

Support for backtracking gives regular expressions power and flexibility. It also places the responsibility for controlling the operation of the regular expression engine in the hands of regular expression developers. Because developers are often not aware of this responsibility, their misuse of backtracking or reliance on excessive backtracking often plays the most significant role in degrading regular expression performance. In a worst-case scenario, execution time can double for each additional character in the input string. In fact, if input nearly matches the regular expression pattern, excessive backtracking can easily produce the programmatic equivalent of an endless loop: the regular expression engine may take hours or even days to process a relatively short input string.

Often, applications pay a performance penalty for using backtracking despite the fact that backtracking is not essential for a match. For example, the regular expression \b\p{Lu}\w*\b matches all words that begin with an uppercase character, as the following table shows.

Pattern Description
\b Begin the match at a word boundary.
\p{Lu} Match an uppercase character.
\w* Match zero or more word characters.
\b End the match at a word boundary.

Because a word boundary is not the same as, or a subset of, a word character, there is no possibility that the regular expression engine will cross a word boundary when matching word characters. This means that for this regular expression, backtracking can never contribute to the overall success of any match. It can only degrade performance, because the regular expression engine is forced to save its state for each successful preliminary match of a word character.

If you determine that backtracking is not necessary, you can disable it by using the (?>subexpression) language element. The following example parses an input string by using two regular expressions. The first, \b\p{Lu}\w*\b, relies on backtracking. The second, \b\p{Lu}(?>\w*)\b, disables backtracking. As the output from the example shows, they both produce the same result.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "This this word Sentence name Capital";
      string pattern = @"\b\p{Lu}\w*\b";
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine(match.Value);

      Console.WriteLine();
      
      pattern = @"\b\p{Lu}(?>\w*)\b";   
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine(match.Value);
   }
}
// The example displays the following output:
//       This
//       Sentence
//       Capital
//       
//       This
//       Sentence
//       Capital
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "This this word Sentence name Capital"
      Dim pattern As String = "\b\p{Lu}\w*\b"
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine(match.Value)
      Next
      Console.WriteLine()
      
      pattern = "\b\p{Lu}(?>\w*)\b"   
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine(match.Value)
      Next
   End Sub
End Module
' The example displays the following output:
'       This
'       Sentence
'       Capital
'       
'       This
'       Sentence
'       Capital

In many cases, backtracking is essential for matching a regular expression pattern to input text. However, excessive backtracking can severely degrade performance and create the impression that an application has stopped responding. In particular, this happens when quantifiers are nested and the text that matches the outer subexpression is a subset of the text that matches the inner subexpression.
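As a minimal sketch of a case where backtracking is essential, the following code compares a pattern that may backtrack with the same pattern wrapped in the (?>subexpression) element. The class name BacktrackingDemo, its method names, and the sample word are illustrative assumptions, not part of the original examples.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BacktrackingDemo
{
   // \w+ first consumes the whole word, then backtracks one character
   // so that the literal "s" can match the final letter.
   public static bool EndsInSWithBacktracking(string word)
   {
      return Regex.IsMatch(word, @"^\w+s$");
   }

   // The atomic group (?>\w+) consumes the whole word and never gives
   // a character back, so the literal "s" has nothing left to match.
   public static bool EndsInSWithoutBacktracking(string word)
   {
      return Regex.IsMatch(word, @"^(?>\w+)s$");
   }

   public static void Main()
   {
      Console.WriteLine(EndsInSWithBacktracking("cats"));      // True
      Console.WriteLine(EndsInSWithoutBacktracking("cats"));   // False
   }
}
```

Disabling backtracking is therefore safe only when, as in \b\p{Lu}\w*\b, the text consumed by the quantified subexpression can never be needed by the part of the pattern that follows it.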

Warning

In addition to avoiding excessive backtracking, you should use the timeout feature to ensure that excessive backtracking does not severely degrade regular expression performance. For more information, see the Use Time-out Values section.

For example, the regular expression pattern ^[0-9A-Z]([-.\w]*[0-9A-Z])*\$$ is intended to match a part number that consists of at least one alphanumeric character. Any additional characters can consist of an alphanumeric character, a hyphen, an underscore, or a period, though the last character must be alphanumeric. A dollar sign terminates the part number. In some cases, this regular expression pattern can exhibit extremely poor performance because quantifiers are nested, and because the subexpression [0-9A-Z] is a subset of the subexpression [-.\w]*.
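The following sketch probes the nested-quantifier pattern under a time-out so that a slow match attempt cannot hang the caller. The MatchOutcome enum, the PartNumberProbe class, the 250-millisecond limit, and the near-miss input are assumptions made for illustration. Whether the near-miss input actually times out depends on the machine and the .NET version, so no output is shown for it.

```csharp
using System;
using System.Text.RegularExpressions;

public enum MatchOutcome { Matched, NotMatched, TimedOut }

public static class PartNumberProbe
{
   // Pattern with nested quantifiers, taken from the text above.
   private const string NestedPattern = @"^[0-9A-Z]([-.\w]*[0-9A-Z])*\$$";

   // Hypothetical helper: reports how a single match attempt ended.
   public static MatchOutcome Probe(string input, int timeoutMs)
   {
      try {
         bool ok = Regex.IsMatch(input, NestedPattern, RegexOptions.None,
                                 TimeSpan.FromMilliseconds(timeoutMs));
         return ok ? MatchOutcome.Matched : MatchOutcome.NotMatched;
      }
      catch (RegexMatchTimeoutException) {
         return MatchOutcome.TimedOut;
      }
   }

   public static void Main()
   {
      // A valid part number matches quickly.
      Console.WriteLine(Probe("A1C$", 250));

      // A long run of valid characters that lacks the closing "$"
      // nearly matches, which is the case that invites heavy backtracking.
      string nearMiss = new string('A', 40) + "!";
      Console.WriteLine(Probe(nearMiss, 250));
   }
}
```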

In these cases, you can optimize regular expression performance by removing the nested quantifiers and replacing the outer subexpression with a zero-width lookahead or lookbehind assertion. Lookahead and lookbehind assertions are anchors; they do not move the pointer in the input string, but instead look ahead or behind to check whether a specified condition is met. For example, the part number regular expression can be rewritten as ^[0-9A-Z][-.\w]*(?<=[0-9A-Z])\$$. This regular expression pattern is defined as shown in the following table.

Pattern Description
^ Begin the match at the beginning of the input string.
[0-9A-Z] Match an alphanumeric character. The part number must consist of at least this character.
[-.\w]* Match zero or more occurrences of any word character, hyphen, or period.
(?<=[0-9A-Z]) Look behind the current position, just before the ending dollar sign, to ensure that the previous character is alphanumeric.
\$ Match a dollar sign.
$ End the match at the end of the input string.

The following example illustrates the use of this regular expression to match an array containing possible part numbers.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"^[0-9A-Z][-.\w]*(?<=[0-9A-Z])\$$";
      string[] partNos = { "A1C$", "A4", "A4$", "A1603D$", "A1603D#" };
      
      foreach (var input in partNos) {
         Match match = Regex.Match(input, pattern);
         if (match.Success)
            Console.WriteLine(match.Value);
         else
            Console.WriteLine("Match not found.");
      }      
   }
}
// The example displays the following output:
//       A1C$
//       Match not found.
//       A4$
//       A1603D$
//       Match not found.
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "^[0-9A-Z][-.\w]*(?<=[0-9A-Z])\$$"
      Dim partNos() As String = { "A1C$", "A4", "A4$", "A1603D$", 
                                  "A1603D#" }
      
      For Each input As String In partNos
         Dim match As Match = Regex.Match(input, pattern)
         If match.Success Then
            Console.WriteLine(match.Value)
         Else
            Console.WriteLine("Match not found.")
         End If
      Next      
   End Sub
End Module
' The example displays the following output:
'       A1C$
'       Match not found.
'       A4$
'       A1603D$
'       Match not found.

The regular expression language in .NET includes the following language elements that you can use to eliminate nested quantifiers. For more information, see Grouping Constructs.

Language element Description
(?= subexpression ) Zero-width positive lookahead. Look ahead of the current position to determine whether subexpression matches the input string.
(?! subexpression ) Zero-width negative lookahead. Look ahead of the current position to determine whether subexpression does not match the input string.
(?<= subexpression ) Zero-width positive lookbehind. Look behind the current position to determine whether subexpression matches the input string.
(?<! subexpression ) Zero-width negative lookbehind. Look behind the current position to determine whether subexpression does not match the input string.
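As a brief illustration of two of these elements, the sketch below uses a positive lookahead and a negative lookbehind. The LookaroundDemo class, the FindAll helper, and the sample strings are hypothetical and chosen only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
   // Hypothetical helper: collects the text of every match.
   public static List<string> FindAll(string input, string pattern)
   {
      var results = new List<string>();
      foreach (Match m in Regex.Matches(input, pattern))
         results.Add(m.Value);
      return results;
   }

   public static void Main()
   {
      // Positive lookahead: words immediately followed by a comma.
      // The comma is checked but is not part of the match.
      Console.WriteLine(String.Join("|",
         FindAll("apple, banana cherry, date", @"\b\w+(?=,)")));

      // Negative lookbehind: whole numbers not preceded by a dollar sign.
      Console.WriteLine(String.Join("|",
         FindAll("cost $30 for 12 items", @"(?<!\$)\b\d+\b")));
   }
}
// The example displays the following output:
//       apple|cherry
//       12
```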

Use time-out values

If your regular expression processes input that nearly matches the regular expression pattern, it can often rely on excessive backtracking, which impacts its performance significantly. In addition to carefully considering your use of backtracking and testing the regular expression against near-matching input, you should always set a time-out value to ensure that the impact of excessive backtracking, if it occurs, is minimized.

The regular expression time-out interval defines the period of time that the regular expression engine will look for a single match before it times out. The default time-out interval is Regex.InfiniteMatchTimeout, which means that the regular expression will not time out. You can override this value and define a time-out interval, for example by passing a TimeSpan value to a Regex constructor that accepts one, as the GetWordData example in this section does.

If you have defined a time-out interval and a match is not found at the end of that interval, the regular expression method throws a RegexMatchTimeoutException exception. In your exception handler, you can choose to retry the match with a longer time-out interval, abandon the match attempt and assume that there is no match, or abandon the match attempt and log the exception information for future analysis.
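One of the handler choices described above, abandoning the attempt, assuming there is no match, and logging the details, might be sketched as follows. The SafeMatcher class, the IsMatchOrFalse helper, and the 500-millisecond interval are assumptions for illustration; the sketch relies on the static Regex.IsMatch overload that accepts a TimeSpan.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeMatcher
{
   // Hypothetical helper: treats a timed-out match attempt as "no match".
   public static bool IsMatchOrFalse(string input, string pattern, TimeSpan timeout)
   {
      try {
         return Regex.IsMatch(input, pattern, RegexOptions.None, timeout);
      }
      catch (RegexMatchTimeoutException e) {
         // Record enough detail to analyze the failure later.
         Console.Error.WriteLine("Timed out after {0} matching pattern '{1}'.",
                                 e.MatchTimeout, e.Pattern);
         return false;
      }
   }

   public static void Main()
   {
      TimeSpan timeout = TimeSpan.FromMilliseconds(500);
      Console.WriteLine(IsMatchOrFalse("Sentence", @"\b\p{Lu}\w*\b", timeout));   // True
      Console.WriteLine(IsMatchOrFalse("sentence", @"\b\p{Lu}\w*\b", timeout));   // False
   }
}
```

Treating a time-out as a non-match is only appropriate when a false negative is acceptable; for input validation, rejecting the input on time-out is usually the safer default.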

The following example defines a GetWordData method that instantiates a regular expression with a time-out interval of 350 milliseconds to calculate the number of words and average number of characters in a word in a text document. If the matching operation times out, the time-out interval is increased by 350 milliseconds and the Regex object is re-instantiated. If the new time-out interval exceeds 1 second, the method re-throws the exception to the caller.

using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      RegexUtilities util = new RegexUtilities();
      string title = "Doyle - The Hound of the Baskervilles.txt";
      try {
         var info = util.GetWordData(title);
         Console.WriteLine("Words:               {0:N0}", info.Item1);
         Console.WriteLine("Average Word Length: {0:N2} characters", info.Item2); 
      }
      catch (IOException e) {
         Console.WriteLine("IOException reading file '{0}'", title);
         Console.WriteLine(e.Message);
      }
      catch (RegexMatchTimeoutException e) {
         Console.WriteLine("The operation timed out after {0:N0} milliseconds", 
                           e.MatchTimeout.TotalMilliseconds);
      }
   }
}

public class RegexUtilities
{
   public Tuple<int, double> GetWordData(string filename)
   { 
      const int MAX_TIMEOUT = 1000;   // Maximum timeout interval in milliseconds.
      const int INCREMENT = 350;      // Milliseconds increment of timeout.
      
      List<string> exclusions = new List<string>( new string[] { "a", "an", "the" });
      int[] wordLengths = new int[29];        // Allocate an array of more than ample size.
      string input = null;
      StreamReader sr = null;
      try { 
         sr = new StreamReader(filename);
         input = sr.ReadToEnd();
      }
      catch (FileNotFoundException e) {
         string msg = String.Format("Unable to find the file '{0}'", filename);
         throw new IOException(msg, e);
      }
      catch (IOException e) {
         throw new IOException(e.Message, e);
      }
      finally {
         if (sr != null) sr.Close(); 
      }

      int timeoutInterval = INCREMENT;
      bool init = false;
      Regex rgx = null;
      Match m = null;
      int indexPos = 0;  
      do {
         try {
            if (! init) {
               rgx = new Regex(@"\b\w+\b", RegexOptions.None, 
                               TimeSpan.FromMilliseconds(timeoutInterval));
               m = rgx.Match(input, indexPos);
               init = true;
            }
            else { 
               m = m.NextMatch();
            }
            if (m.Success) {    
               if ( !exclusions.Contains(m.Value.ToLower()))
                  wordLengths[m.Value.Length]++;

               indexPos += m.Length + 1;   
            }
         }
         catch (RegexMatchTimeoutException e) {
            if (e.MatchTimeout.TotalMilliseconds < MAX_TIMEOUT) {
               timeoutInterval += INCREMENT;
               init = false;
            }
            else {
               // Rethrow the exception.
               throw; 
            }   
         }          
      } while (m == null || m.Success);   // m is still null if the very first attempt timed out.
            
      // If regex completed successfully, calculate number of words and average length.
      int nWords = 0; 
      long totalLength = 0;
      
      for (int ctr = wordLengths.GetLowerBound(0); ctr <= wordLengths.GetUpperBound(0); ctr++) {
         nWords += wordLengths[ctr];
         totalLength += ctr * wordLengths[ctr];
      }
      // Cast to double so the average keeps its fractional part.
      return new Tuple<int, double>(nWords, (double) totalLength / nWords);
   }
}
Imports System.Collections.Generic
Imports System.IO
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim util As New RegexUtilities()
      Dim title As String = "Doyle - The Hound of the Baskervilles.txt"
      Try
         Dim info = util.GetWordData(title)
         Console.WriteLine("Words:               {0:N0}", info.Item1)
         Console.WriteLine("Average Word Length: {0:N2} characters", info.Item2) 
      Catch e As IOException
         Console.WriteLine("IOException reading file '{0}'", title)
         Console.WriteLine(e.Message)
      Catch e As RegexMatchTimeoutException
         Console.WriteLine("The operation timed out after {0:N0} milliseconds", 
                           e.MatchTimeout.TotalMilliseconds)
      End Try
   End Sub
End Module

Public Class RegexUtilities
   Public Function GetWordData(filename As String) As Tuple(Of Integer, Double) 
      Const MAX_TIMEOUT As Integer = 1000  ' Maximum timeout interval in milliseconds.
      Const INCREMENT As Integer = 350     ' Milliseconds increment of timeout.
      
      Dim exclusions As New List(Of String)({"a", "an", "the" })
      Dim wordLengths(30) As Integer        ' Allocate an array of more than ample size.
      Dim input As String = Nothing
      Dim sr As StreamReader = Nothing
      Try 
         sr = New StreamReader(filename)
         input = sr.ReadToEnd()
      Catch e As FileNotFoundException
         Dim msg As String = String.Format("Unable to find the file '{0}'", filename)
         Throw New IOException(msg, e)
      Catch e As IOException
         Throw New IOException(e.Message, e)
      Finally
         If sr IsNot Nothing Then sr.Close() 
      End Try

      Dim timeoutInterval As Integer = INCREMENT
      Dim init As Boolean = False
      Dim rgx As Regex = Nothing
      Dim m As Match = Nothing
      Dim indexPos As Integer = 0  
      Do
         Try
            If Not init Then
               rgx = New Regex("\b\w+\b", RegexOptions.None, 
                               TimeSpan.FromMilliseconds(timeoutInterval))
               m = rgx.Match(input, indexPos)
               init = True
            Else 
               m = m.NextMatch()
            End If
            If m.Success Then    
               If Not exclusions.Contains(m.Value.ToLower()) Then
                  wordLengths(m.Value.Length) += 1
               End If
               indexPos += m.Length + 1   
            End If
         Catch e As RegexMatchTimeoutException
            If e.MatchTimeout.TotalMilliseconds < MAX_TIMEOUT Then
               timeoutInterval += INCREMENT
               init = False
            Else
               ' Rethrow the exception.
               Throw 
            End If   
         End Try          
      Loop While m Is Nothing OrElse m.Success   ' m is still Nothing if the very first attempt timed out.
            
      ' If regex completed successfully, calculate number of words and average length.
      Dim nWords As Integer
      Dim totalLength As Long
      
      For ctr As Integer = wordLengths.GetLowerBound(0) To wordLengths.GetUpperBound(0)
         nWords += wordLengths(ctr)
         totalLength += ctr * wordLengths(ctr)
      Next
      Return New Tuple(Of Integer, Double)(nWords, totalLength/nWords)
   End Function
End Class

Capture only when necessary

Regular expressions in .NET support a number of grouping constructs, which let you group a regular expression pattern into one or more subexpressions. The most commonly used grouping constructs in .NET regular expression language are (subexpression), which defines a numbered capturing group, and (?<name>subexpression), which defines a named capturing group. Grouping constructs are essential for creating backreferences and for defining a subexpression to which a quantifier is applied.

However, the use of these language elements has a cost. They cause the GroupCollection object returned by the Match.Groups property to be populated with the most recent unnamed or named captures, and if a single grouping construct has captured multiple substrings in the input string, they also populate the CaptureCollection object returned by the Group.Captures property of a particular capturing group with multiple Capture objects.

Often, grouping constructs are used in a regular expression only so that quantifiers can be applied to them, and the groups captured by these subexpressions are not subsequently used. For example, the regular expression \b(\w+[;,]?\s?)+[.?!] is designed to capture an entire sentence. The following table describes the language elements in this regular expression pattern and their effect on the Match object's Match.Groups and Group.Captures collections.

Pattern Description
\b Begin the match at a word boundary.
\w+ Match one or more word characters.
[;,]? Match zero or one comma or semicolon.
\s? Match zero or one white-space character.
(\w+[;,]?\s?)+ Match one or more occurrences of one or more word characters followed by an optional comma or semicolon followed by an optional white-space character. This defines the first capturing group, which is necessary so that the combination of multiple word characters (that is, a word) followed by an optional punctuation symbol will be repeated until the regular expression engine reaches the end of a sentence.
[.?!] Match a period, question mark, or exclamation point.

As the following example shows, when a match is found, both the GroupCollection and CaptureCollection objects are populated with captures from the match. In this case, the capturing group (\w+[;,]?\s?) exists so that the + quantifier can be applied to it, which enables the regular expression pattern to match each word in a sentence. Otherwise, it would match the last word in a sentence.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "This is one sentence. This is another.";
      string pattern = @"\b(\w+[;,]?\s?)+[.?!]";
      
      foreach (Match match in Regex.Matches(input, pattern)) {
         Console.WriteLine("Match: '{0}' at index {1}.", 
                           match.Value, match.Index);
         int grpCtr = 0;
         foreach (Group grp in match.Groups) {
            Console.WriteLine("   Group {0}: '{1}' at index {2}.",
                              grpCtr, grp.Value, grp.Index);
            int capCtr = 0;
            foreach (Capture cap in grp.Captures) {
               Console.WriteLine("      Capture {0}: '{1}' at {2}.",
                                 capCtr, cap.Value, cap.Index);
               capCtr++;
            }
            grpCtr++;
         }          
         Console.WriteLine();        
      }
   }
}
// The example displays the following output:
//       Match: 'This is one sentence.' at index 0.
//          Group 0: 'This is one sentence.' at index 0.
//             Capture 0: 'This is one sentence.' at 0.
//          Group 1: 'sentence' at index 12.
//             Capture 0: 'This ' at 0.
//             Capture 1: 'is ' at 5.
//             Capture 2: 'one ' at 8.
//             Capture 3: 'sentence' at 12.
//       
//       Match: 'This is another.' at index 22.
//          Group 0: 'This is another.' at index 22.
//             Capture 0: 'This is another.' at 22.
//          Group 1: 'another' at index 30.
//             Capture 0: 'This ' at 22.
//             Capture 1: 'is ' at 27.
//             Capture 2: 'another' at 30.
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "This is one sentence. This is another."
      Dim pattern As String = "\b(\w+[;,]?\s?)+[.?!]"
      
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("Match: '{0}' at index {1}.", 
                           match.Value, match.Index)
         Dim grpCtr As Integer = 0
         For Each grp As Group In match.Groups
            Console.WriteLine("   Group {0}: '{1}' at index {2}.",
                              grpCtr, grp.Value, grp.Index)
            Dim capCtr As Integer = 0
            For Each cap As Capture In grp.Captures
               Console.WriteLine("      Capture {0}: '{1}' at {2}.",
                                 capCtr, cap.Value, cap.Index)
               capCtr += 1
            Next
            grpCtr += 1
         Next          
         Console.WriteLine()        
      Next    
   End Sub
End Module
' The example displays the following output:
'       Match: 'This is one sentence.' at index 0.
'          Group 0: 'This is one sentence.' at index 0.
'             Capture 0: 'This is one sentence.' at 0.
'          Group 1: 'sentence' at index 12.
'             Capture 0: 'This ' at 0.
'             Capture 1: 'is ' at 5.
'             Capture 2: 'one ' at 8.
'             Capture 3: 'sentence' at 12.
'       
'       Match: 'This is another.' at index 22.
'          Group 0: 'This is another.' at index 22.
'             Capture 0: 'This is another.' at 22.
'          Group 1: 'another' at index 30.
'             Capture 0: 'This ' at 22.
'             Capture 1: 'is ' at 27.
'             Capture 2: 'another' at 30.

When you use subexpressions only to apply quantifiers to them and you are not interested in the captured text, you should disable group captures. For example, the (?:subexpression) language element prevents the group to which it applies from capturing matched substrings. In the following example, the regular expression pattern from the previous example is changed to \b(?:\w+[;,]?\s?)+[.?!]. As the output shows, this prevents the regular expression engine from populating the GroupCollection and CaptureCollection collections for that group; only the entire match (group 0) is recorded.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "This is one sentence. This is another.";
      string pattern = @"\b(?:\w+[;,]?\s?)+[.?!]";
      
      foreach (Match match in Regex.Matches(input, pattern)) {
         Console.WriteLine("Match: '{0}' at index {1}.", 
                           match.Value, match.Index);
         int grpCtr = 0;
         foreach (Group grp in match.Groups) {
            Console.WriteLine("   Group {0}: '{1}' at index {2}.",
                              grpCtr, grp.Value, grp.Index);
            int capCtr = 0;
            foreach (Capture cap in grp.Captures) {
               Console.WriteLine("      Capture {0}: '{1}' at {2}.",
                                 capCtr, cap.Value, cap.Index);
               capCtr++;
            }
            grpCtr++;
         }          
         Console.WriteLine();        
      }
   }
}
// The example displays the following output:
//       Match: 'This is one sentence.' at index 0.
//          Group 0: 'This is one sentence.' at index 0.
//             Capture 0: 'This is one sentence.' at 0.
//       
//       Match: 'This is another.' at index 22.
//          Group 0: 'This is another.' at index 22.
//             Capture 0: 'This is another.' at 22.
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "This is one sentence. This is another."
      Dim pattern As String = "\b(?:\w+[;,]?\s?)+[.?!]"
      
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("Match: '{0}' at index {1}.", 
                           match.Value, match.Index)
         Dim grpCtr As Integer = 0
         For Each grp As Group In match.Groups
            Console.WriteLine("   Group {0}: '{1}' at index {2}.",
                              grpCtr, grp.Value, grp.Index)
            Dim capCtr As Integer = 0
            For Each cap As Capture In grp.Captures
               Console.WriteLine("      Capture {0}: '{1}' at {2}.",
                                 capCtr, cap.Value, cap.Index)
               capCtr += 1
            Next
            grpCtr += 1
         Next          
         Console.WriteLine()        
      Next    
   End Sub
End Module
' The example displays the following output:
'       Match: 'This is one sentence.' at index 0.
'          Group 0: 'This is one sentence.' at index 0.
'             Capture 0: 'This is one sentence.' at 0.
'       
'       Match: 'This is another.' at index 22.
'          Group 0: 'This is another.' at index 22.
'             Capture 0: 'This is another.' at 22.

You can disable captures in one of the following ways:

  • Use the (?:subexpression) language element. This element prevents the capture of matched substrings in the group to which it applies. It does not disable substring captures in any nested groups.

  • Use the ExplicitCapture option. It disables all unnamed or implicit captures in the regular expression pattern. When you use this option, only substrings that match named groups defined with the (?<name>subexpression) language element can be captured. The ExplicitCapture flag can be passed to the options parameter of a Regex class constructor or to the options parameter of a Regex static matching method.

  • Use the n option in the (?imnsx) language element. This option disables all unnamed or implicit captures from the point in the regular expression pattern at which the element appears. Captures stay disabled until the end of the pattern or until a (?-n) option re-enables unnamed or implicit captures. For more information, see Miscellaneous Constructs.

  • Use the n option in the (?imnsx:subexpression) language element. This option disables all unnamed or implicit captures in subexpression. Captures by any unnamed or implicit nested capturing groups are disabled as well.
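
The options in the list above can be compared side by side by counting the groups that each variant of the pattern produces. The following is a minimal sketch rather than part of the original samples: the class name DisableCapturesExample and the helper CountGroups are hypothetical names introduced only for illustration, and the sketch relies on Match.Groups.Count including group 0 (the entire match).

```csharp
using System;
using System.Text.RegularExpressions;

public class DisableCapturesExample
{
   // Hypothetical helper: returns the number of groups in the first match.
   // The count always includes group 0, which represents the entire match.
   public static int CountGroups(string input, string pattern, RegexOptions options)
   {
      Match match = Regex.Match(input, pattern, options);
      return match.Groups.Count;
   }

   public static void Main()
   {
      string input = "This is one sentence.";

      // Default behavior: the unnamed group captures (group 0 plus group 1).
      Console.WriteLine(CountGroups(input, @"\b(\w+[;,]?\s?)+[.?!]", RegexOptions.None));              // 2

      // (?:subexpression) disables capturing for that group only...
      Console.WriteLine(CountGroups(input, @"\b(?:\w+[;,]?\s?)+[.?!]", RegexOptions.None));            // 1

      // ...but a nested unnamed group still captures.
      Console.WriteLine(CountGroups(input, @"\b(?:(\w+)[;,]?\s?)+[.?!]", RegexOptions.None));          // 2

      // RegexOptions.ExplicitCapture disables all unnamed captures in the pattern.
      Console.WriteLine(CountGroups(input, @"\b(\w+[;,]?\s?)+[.?!]", RegexOptions.ExplicitCapture));   // 1

      // Named groups are still captured when ExplicitCapture is set.
      Console.WriteLine(CountGroups(input, @"\b(?<word>\w+[;,]?\s?)+[.?!]", RegexOptions.ExplicitCapture)); // 2

      // The inline (?n) option disables unnamed captures from that point onward.
      Console.WriteLine(CountGroups(input, @"(?n)\b(\w+[;,]?\s?)+[.?!]", RegexOptions.None));          // 1

      // (?n:subexpression) disables unnamed captures inside subexpression, nested groups included.
      Console.WriteLine(CountGroups(input, @"\b(?n:(\w+[;,]?\s?)+)[.?!]", RegexOptions.None));         // 1
   }
}
```

A count of 1 means that only the entire match was recorded. The ExplicitCapture flag is convenient when a pattern mixes named groups that you need with many grouping-only parentheses, because none of the parentheses has to be rewritten as (?:...).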

Title | Description
Details of Regular Expression Behavior | Examines the implementation of the regular expression engine in .NET. The topic focuses on the flexibility of regular expressions and explains the developer's responsibility for ensuring the efficient and robust operation of the regular expression engine.
Backtracking | Explains what backtracking is and how it affects regular expression performance, and examines language elements that provide alternatives to backtracking.
Regular Expression Language - Quick Reference | Describes the elements of the regular expression language in .NET and provides links to detailed documentation for each language element.