Grouping Constructs in Regular Expressions

Grouping constructs delineate the subexpressions of a regular expression and capture the substrings of an input string. You can use grouping constructs to do the following:

  • Match a subexpression that is repeated in the input string.

  • Apply a quantifier to a subexpression that has multiple regular expression language elements. For more information about quantifiers, see Quantifiers.

  • Include a subexpression in the string that is returned by the Regex.Replace and Match.Result methods.

  • Retrieve individual subexpressions from the Match.Groups property and process them separately from the matched text as a whole.

The following table lists the grouping constructs supported by the .NET regular expression engine and indicates whether they are capturing or noncapturing.

Grouping construct | Capturing or noncapturing
Matched subexpressions | Capturing
Named matched subexpressions | Capturing
Balancing group definitions | Capturing
Noncapturing groups | Noncapturing
Group options | Noncapturing
Zero-width positive lookahead assertions | Noncapturing
Zero-width negative lookahead assertions | Noncapturing
Zero-width positive lookbehind assertions | Noncapturing
Zero-width negative lookbehind assertions | Noncapturing
Nonbacktracking subexpressions | Noncapturing

For information on groups and the regular expression object model, see Grouping constructs and regular expression objects.

Matched Subexpressions

The following grouping construct captures a matched subexpression:

( subexpression )

where subexpression is any valid regular expression pattern. Captures that use parentheses are numbered automatically from left to right based on the order of the opening parentheses in the regular expression, starting from one. The capture that is numbered zero is the text matched by the entire regular expression pattern.

Note

By default, the (subexpression) language element captures the matched subexpression. However, if the RegexOptions parameter of a regular expression pattern matching method includes the RegexOptions.ExplicitCapture flag, or if the n option is applied to this subexpression (see Group options later in this topic), the matched subexpression is not captured.
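As a rough illustration of the note above, the following sketch counts the groups that a match exposes with and without RegexOptions.ExplicitCapture. The class name ExplicitCaptureDemo, the CountGroups helper, and the sample input are assumptions made for this illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExplicitCaptureDemo
{
    // Returns the number of groups exposed by the match, including group 0.
    public static int CountGroups(string input, string pattern, RegexOptions options)
    {
        Match m = Regex.Match(input, pattern, options);
        return m.Groups.Count;
    }

    public static void Main()
    {
        string input = "2024-05";
        string pattern = @"(\d{4})-(\d{2})";

        // Default behavior: both parenthesized subexpressions are captured.
        Console.WriteLine(CountGroups(input, pattern, RegexOptions.None));            // 3

        // ExplicitCapture: unnamed parentheses only group; nothing is captured.
        Console.WriteLine(CountGroups(input, pattern, RegexOptions.ExplicitCapture)); // 1
    }
}
```

With RegexOptions.None the pattern defines groups 0, 1, and 2. With ExplicitCapture only group 0 (the whole match) remains, because neither parenthesized subexpression is named.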

You can access captured groups in four ways:

  • By using the backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax \number, where number is the ordinal number of the captured subexpression.

  • By using the named backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax \k<name>, where name is the name of a capturing group, or \k<number>, where number is the ordinal number of a capturing group. A capturing group has a default name that is identical to its ordinal number. For more information, see Named matched subexpressions later in this topic.

  • By using the $number replacement sequence in a Regex.Replace or Match.Result method call, where number is the ordinal number of the captured subexpression.

  • Programmatically, by using the GroupCollection object returned by the Match.Groups property. The member at position zero in the collection represents the entire regular expression match. Each subsequent member represents a matched subexpression. For more information, see the Grouping Constructs and Regular Expression Objects section.
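The four access paths listed above can be sketched side by side. The class name CaptureAccessDemo, its helper methods, and the sample strings are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureAccessDemo
{
    // Numbered backreference: \1 must repeat whatever group 1 captured.
    public static bool HasDoubledPair(string input) =>
        Regex.IsMatch(input, @"(\w\w)\1");

    // Named backreference syntax with an ordinal: \k<1> refers to the same group.
    public static bool HasDoubledPairByOrdinalName(string input) =>
        Regex.IsMatch(input, @"(\w\w)\k<1>");

    // Replacement sequences: $1 and $2 insert the captured substrings.
    public static string SwapFirstTwoWords(string input) =>
        Regex.Replace(input, @"^(\w+)\s(\w+)", "$2 $1");

    public static void Main()
    {
        Console.WriteLine(HasDoubledPair("abab"));              // True
        Console.WriteLine(HasDoubledPairByOrdinalName("abab")); // True
        Console.WriteLine(SwapFirstTwoWords("hello world"));    // world hello

        // Programmatic access through the GroupCollection.
        Match m = Regex.Match("hello world", @"^(\w+)\s(\w+)");
        Console.WriteLine(m.Groups[0].Value); // hello world (entire match)
        Console.WriteLine(m.Groups[1].Value); // hello
        Console.WriteLine(m.Groups[2].Value); // world
    }
}
```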

The following example illustrates a regular expression that identifies duplicated words in text. The regular expression pattern's two capturing groups represent the two instances of the duplicated word. The second instance is captured to report its starting position in the input string.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"(\w+)\s(\1)\W";
      string input = "He said that that was the the correct answer.";
      foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
         Console.WriteLine("Duplicate '{0}' found at positions {1} and {2}.", 
                           match.Groups[1].Value, match.Groups[1].Index, match.Groups[2].Index);
   }
}
// The example displays the following output:
//       Duplicate 'that' found at positions 8 and 13.
//       Duplicate 'the' found at positions 22 and 26.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "(\w+)\s(\1)\W"
      Dim input As String = "He said that that was the the correct answer."
      For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
         Console.WriteLine("Duplicate '{0}' found at positions {1} and {2}.", _
                           match.Groups(1).Value, match.Groups(1).Index, match.Groups(2).Index)
      Next
   End Sub
End Module
' The example displays the following output:
'       Duplicate 'that' found at positions 8 and 13.
'       Duplicate 'the' found at positions 22 and 26.

The regular expression pattern is the following:

(\w+)\s(\1)\W

The following table shows how the regular expression pattern is interpreted.

Pattern | Description
(\w+) | Match one or more word characters. This is the first capturing group.
\s | Match a white-space character.
(\1) | Match the string in the first captured group. This is the second capturing group. The example assigns it to a captured group so that the starting position of the duplicate word can be retrieved from that group's Index property.
\W | Match a non-word character, including white space and punctuation. This prevents the regular expression pattern from matching a word that starts with the word from the first captured group.

Named Matched Subexpressions

The following grouping construct captures a matched subexpression and lets you access it by name or by number:

(?<name>subexpression)

or:

(?'name'subexpression)

where name is a valid group name, and subexpression is any valid regular expression pattern. name must not contain any punctuation characters and cannot begin with a number.

Note

If the RegexOptions parameter of a regular expression pattern matching method includes the RegexOptions.ExplicitCapture flag, or if the n option is applied to this subexpression (see Group options later in this topic), the only way to capture a subexpression is to explicitly name capturing groups.

You can access named captured groups in the following ways:

  • By using the named backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax \k<name>, where name is the name of the captured subexpression.

  • By using the backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax \number, where number is the ordinal number of the captured subexpression. Named matched subexpressions are numbered consecutively from left to right after matched subexpressions.

  • By using the ${name} replacement sequence in a Regex.Replace or Match.Result method call, where name is the name of the captured subexpression.

  • By using the $number replacement sequence in a Regex.Replace or Match.Result method call, where number is the ordinal number of the captured subexpression.

  • Programmatically, by using the GroupCollection object returned by the Match.Groups property. The member at position zero in the collection represents the entire regular expression match. Each subsequent member represents a matched subexpression. Named captured groups are stored in the collection after numbered captured groups.

  • Programmatically, by providing the subexpression name to the GroupCollection object's indexer (in C#) or to its Item[String] property (in Visual Basic).
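A minimal sketch of the ${name} replacement sequence and the name-based indexer described in the list above. The class name NamedGroupDemo, the date pattern, and the group names year, month, and day are assumptions chosen for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Three named groups; with no unnamed groups present they are numbered 1, 2, 3.
    private const string Pattern = @"(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})";

    // ${name} inserts the substring captured by the named group.
    public static string Reorder(string input) =>
        Regex.Replace(input, Pattern, "${day}/${month}/${year}");

    // The GroupCollection indexer accepts a group name.
    public static string YearOf(string input) =>
        Regex.Match(input, Pattern).Groups["year"].Value;

    public static void Main()
    {
        Console.WriteLine(Reorder("2024-05-17")); // 17/05/2024
        Console.WriteLine(YearOf("2024-05-17"));  // 2024

        // The same group is also reachable by its ordinal number.
        Match m = Regex.Match("2024-05-17", Pattern);
        Console.WriteLine(m.Groups[1].Value);     // 2024
    }
}
```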

A simple regular expression pattern illustrates how numbered (unnamed) and named groups can be referenced either programmatically or by using regular expression language syntax. The regular expression ((?<One>abc)\d+)?(?<Two>xyz)(.*) produces the following capturing groups by number and by name. The first capturing group (number 0) always refers to the entire pattern.

Number | Name | Pattern
0 | 0 (default name) | ((?<One>abc)\d+)?(?<Two>xyz)(.*)
1 | 1 (default name) | ((?<One>abc)\d+)
2 | 2 (default name) | (.*)
3 | One | (?<One>abc)
4 | Two | (?<Two>xyz)
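The numbering in the table can be checked with the Regex object's own metadata methods. This is a sketch only: the class name GroupNumberingDemo, the DescribeGroups helper, and the sample input "abc123xyzdef" are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupNumberingDemo
{
    private static readonly Regex Rx =
        new Regex(@"((?<One>abc)\d+)?(?<Two>xyz)(.*)");

    // Builds a "number=name" listing from the regex's group metadata.
    public static string DescribeGroups()
    {
        var parts = new List<string>();
        foreach (int number in Rx.GetGroupNumbers())
            parts.Add(number + "=" + Rx.GroupNameFromNumber(number));
        return string.Join(", ", parts);
    }

    // Returns the value captured by the group with the given number.
    public static string ValueOf(string input, int groupNumber) =>
        Rx.Match(input).Groups[groupNumber].Value;

    public static void Main()
    {
        // Unnamed groups come first (1, 2); named groups follow (3, 4).
        Console.WriteLine(DescribeGroups()); // 0=0, 1=1, 2=2, 3=One, 4=Two

        Console.WriteLine(ValueOf("abc123xyzdef", 1)); // abc123
        Console.WriteLine(ValueOf("abc123xyzdef", 2)); // def
        Console.WriteLine(ValueOf("abc123xyzdef", 3)); // abc
        Console.WriteLine(ValueOf("abc123xyzdef", 4)); // xyz
    }
}
```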

The following example illustrates a regular expression that identifies duplicated words and the word that immediately follows each duplicated word. The regular expression pattern defines two named subexpressions: duplicateWord, which represents the duplicated word, and nextWord, which represents the word that follows the duplicated word.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"(?<duplicateWord>\w+)\s\k<duplicateWord>\W(?<nextWord>\w+)";
      string input = "He said that that was the the correct answer.";
      foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
         Console.WriteLine("A duplicate '{0}' at position {1} is followed by '{2}'.", 
                           match.Groups["duplicateWord"].Value, match.Groups["duplicateWord"].Index, 
                           match.Groups["nextWord"].Value);

   }
}
// The example displays the following output:
//       A duplicate 'that' at position 8 is followed by 'was'.
//       A duplicate 'the' at position 22 is followed by 'correct'.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "(?<duplicateWord>\w+)\s\k<duplicateWord>\W(?<nextWord>\w+)"
      Dim input As String = "He said that that was the the correct answer."
      For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
         Console.WriteLine("A duplicate '{0}' at position {1} is followed by '{2}'.", _
                           match.Groups("duplicateWord").Value, match.Groups("duplicateWord").Index, _
                           match.Groups("nextWord").Value)
      Next
   End Sub
End Module
' The example displays the following output:
'    A duplicate 'that' at position 8 is followed by 'was'.
'    A duplicate 'the' at position 22 is followed by 'correct'.

The regular expression pattern is as follows:

(?<duplicateWord>\w+)\s\k<duplicateWord>\W(?<nextWord>\w+)

The following table shows how the regular expression is interpreted.

Pattern | Description
(?<duplicateWord>\w+) | Match one or more word characters. Name this capturing group duplicateWord.
\s | Match a white-space character.
\k<duplicateWord> | Match the string from the captured group that is named duplicateWord.
\W | Match a non-word character, including white space and punctuation. This prevents the regular expression pattern from matching a word that starts with the word from the first captured group.
(?<nextWord>\w+) | Match one or more word characters. Name this capturing group nextWord.

Note that a group name can be repeated in a regular expression. For example, it is possible for more than one group to be named digit, as the following example illustrates. In the case of duplicate names, the value of the Group object is determined by the last successful capture in the input string. In addition, the CaptureCollection is populated with information about each capture just as it would be if the group name were not duplicated.

In the following example, the regular expression \D+(?<digit>\d+)\D+(?<digit>\d+)? includes two occurrences of a group named digit. The first digit named group captures one or more digit characters. The second digit named group captures either zero or one occurrence of one or more digit characters. As the output from the example shows, if the second capturing group successfully matches text, the value of that text defines the value of the Group object. If the second capturing group does not match the input string, the value of the last successful match defines the value of the Group object.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      String pattern = @"\D+(?<digit>\d+)\D+(?<digit>\d+)?";
      String[] inputs = { "abc123def456", "abc123def" };
      foreach (var input in inputs) {
         Match m = Regex.Match(input, pattern);
         if (m.Success) {
            Console.WriteLine("Match: {0}", m.Value);
            for (int grpCtr = 1; grpCtr < m.Groups.Count; grpCtr++) {
               Group grp = m.Groups[grpCtr];
               Console.WriteLine("Group {0}: {1}", grpCtr, grp.Value);
               for (int capCtr = 0; capCtr < grp.Captures.Count; capCtr++)
                  Console.WriteLine("   Capture {0}: {1}", capCtr,
                                    grp.Captures[capCtr].Value);
            }
         }
         else {
            Console.WriteLine("The match failed.");
         }
         Console.WriteLine();
      }
   }
}
// The example displays the following output:
//       Match: abc123def456
//       Group 1: 456
//          Capture 0: 123
//          Capture 1: 456
//
//       Match: abc123def
//       Group 1: 123
//          Capture 0: 123

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "\D+(?<digit>\d+)\D+(?<digit>\d+)?"
      Dim inputs() As String = { "abc123def456", "abc123def" }
      For Each input As String In inputs
         Dim m As Match = Regex.Match(input, pattern)
         If m.Success Then
            Console.WriteLine("Match: {0}", m.Value)
            For grpCtr As Integer = 1 to m.Groups.Count - 1
               Dim grp As Group = m.Groups(grpCtr)
               Console.WriteLine("Group {0}: {1}", grpCtr, grp.Value)
               For capCtr As Integer = 0 To grp.Captures.Count - 1
                  Console.WriteLine("   Capture {0}: {1}", capCtr,
                                    grp.Captures(capCtr).Value)
               Next
            Next
         Else
            Console.WriteLine("The match failed.")
         End If
         Console.WriteLine()
      Next
   End Sub
End Module
' The example displays the following output:
'       Match: abc123def456
'       Group 1: 456
'          Capture 0: 123
'          Capture 1: 456
'
'       Match: abc123def
'       Group 1: 123
'          Capture 0: 123

The following table shows how the regular expression is interpreted.

Pattern | Description
\D+ | Match one or more non-decimal digit characters.
(?<digit>\d+) | Match one or more decimal digit characters. Assign the match to the digit named group.
\D+ | Match one or more non-decimal digit characters.
(?<digit>\d+)? | Match zero or one occurrence of one or more decimal digit characters. Assign the match to the digit named group.

Balancing Group Definitions

A balancing group definition deletes the definition of a previously defined group and stores, in the current group, the interval between the previously defined group and the current group. This grouping construct has the following format:

(?<name1-name2>subexpression)

or:

(?'name1-name2'subexpression)

where name1 is the current group (optional), name2 is a previously defined group, and subexpression is any valid regular expression pattern. The balancing group definition deletes the definition of name2 and stores the interval between name2 and name1 in name1. If no name2 group is defined, the match backtracks. Because deleting the last definition of name2 reveals the previous definition of name2, this construct lets you use the stack of captures for group name2 as a counter for keeping track of nested constructs such as parentheses or opening and closing brackets.

The balancing group definition uses name2 as a stack. The beginning character of each nested construct is placed in the group and in its Group.Captures collection. When the closing character is matched, its corresponding opening character is removed from the group, and the Captures collection is decreased by one. After the opening and closing characters of all nested constructs have been matched, name2 is empty.

Note

After you modify the regular expression in the following example to use the appropriate opening and closing character of a nested construct, you can use it to handle most nested constructs, such as mathematical expressions or lines of program code that include multiple nested method calls.

The following example uses a balancing group definition to match left and right angle brackets (<>) in an input string. The example defines two named groups, Open and Close, that are used like a stack to track matching pairs of angle brackets. Each captured left angle bracket is pushed into the capture collection of the Open group, and each captured right angle bracket is pushed into the capture collection of the Close group. The balancing group definition ensures that there is a matching right angle bracket for each left angle bracket. The final subpattern, (?(Open)(?!)), is evaluated only if the Open group is not empty (that is, if at least one nested construct has not been closed). If the final subpattern is evaluated, the match fails, because the (?!) subpattern is a zero-width negative lookahead assertion that always fails.

using System;
using System.Text.RegularExpressions;

class Example
{
   public static void Main() 
   {
      string pattern = "^[^<>]*" +
                       "(" + 
                       "((?'Open'<)[^<>]*)+" +
                       "((?'Close-Open'>)[^<>]*)+" +
                       ")*" +
                       "(?(Open)(?!))$";
      string input = "<abc><mno<xyz>>";

      Match m = Regex.Match(input, pattern);
      if (m.Success == true)
      {
         Console.WriteLine("Input: \"{0}\" \nMatch: \"{1}\"", input, m);
         int grpCtr = 0;
         foreach (Group grp in m.Groups)
         {
            Console.WriteLine("   Group {0}: {1}", grpCtr, grp.Value);
            grpCtr++;
            int capCtr = 0;
            foreach (Capture cap in grp.Captures)
            {            
                Console.WriteLine("      Capture {0}: {1}", capCtr, cap.Value);
                capCtr++;
            }
          }
      }
      else
      {
         Console.WriteLine("Match failed.");
      }   
    }
}
// The example displays the following output:
//    Input: "<abc><mno<xyz>>"
//    Match: "<abc><mno<xyz>>"
//       Group 0: <abc><mno<xyz>>
//          Capture 0: <abc><mno<xyz>>
//       Group 1: <mno<xyz>>
//          Capture 0: <abc>
//          Capture 1: <mno<xyz>>
//       Group 2: <xyz
//          Capture 0: <abc
//          Capture 1: <mno
//          Capture 2: <xyz
//       Group 3: >
//          Capture 0: >
//          Capture 1: >
//          Capture 2: >
//       Group 4:
//       Group 5: mno<xyz>
//          Capture 0: abc
//          Capture 1: xyz
//          Capture 2: mno<xyz>

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main() 
        Dim pattern As String = "^[^<>]*" & _
                                "(" + "((?'Open'<)[^<>]*)+" & _
                                "((?'Close-Open'>)[^<>]*)+" + ")*" & _
                                "(?(Open)(?!))$"
        Dim input As String = "<abc><mno<xyz>>"
        Dim m As Match = Regex.Match(input, pattern)
        If m.Success Then
            Console.WriteLine("Input: ""{0}"" " & vbCrLf & "Match: ""{1}""", _
                               input, m)
            Dim grpCtr As Integer = 0
            For Each grp As Group In m.Groups
               Console.WriteLine("   Group {0}: {1}", grpCtr, grp.Value)
               grpCtr += 1
               Dim capCtr As Integer = 0
               For Each cap As Capture In grp.Captures            
                  Console.WriteLine("      Capture {0}: {1}", capCtr, cap.Value)
                  capCtr += 1
               Next
            Next
        Else
            Console.WriteLine("Match failed.")
        End If
    End Sub
End Module  
' The example displays the following output:
'       Input: "<abc><mno<xyz>>"
'       Match: "<abc><mno<xyz>>"
'          Group 0: <abc><mno<xyz>>
'             Capture 0: <abc><mno<xyz>>
'          Group 1: <mno<xyz>>
'             Capture 0: <abc>
'             Capture 1: <mno<xyz>>
'          Group 2: <xyz
'             Capture 0: <abc
'             Capture 1: <mno
'             Capture 2: <xyz
'          Group 3: >
'             Capture 0: >
'             Capture 1: >
'             Capture 2: >
'          Group 4:
'          Group 5: mno<xyz>
'             Capture 0: abc
'             Capture 1: xyz
'             Capture 2: mno<xyz>

The regular expression pattern is:

^[^<>]*(((?'Open'<)[^<>]*)+((?'Close-Open'>)[^<>]*)+)*(?(Open)(?!))$

The regular expression is interpreted as follows:

Pattern | Description
^ | Begin at the start of the string.
[^<>]* | Match zero or more characters that are not left or right angle brackets.
(?'Open'<) | Match a left angle bracket and assign it to a group named Open.
[^<>]* | Match zero or more characters that are not left or right angle brackets.
((?'Open'<)[^<>]*)+ | Match one or more occurrences of a left angle bracket followed by zero or more characters that are not left or right angle brackets. This is the second capturing group.
(?'Close-Open'>) | Match a right angle bracket, assign the substring between the Open group and the current group to the Close group, and delete the definition of the Open group.
[^<>]* | Match zero or more occurrences of any character that is neither a left nor a right angle bracket.
((?'Close-Open'>)[^<>]*)+ | Match one or more occurrences of a right angle bracket, followed by zero or more occurrences of any character that is neither a left nor a right angle bracket. When matching the right angle bracket, assign the substring between the Open group and the current group to the Close group, and delete the definition of the Open group. This is the third capturing group.
(((?'Open'<)[^<>]*)+((?'Close-Open'>)[^<>]*)+)* | Match zero or more occurrences of the following pattern: one or more occurrences of a left angle bracket, followed by zero or more non-angle bracket characters, followed by one or more occurrences of a right angle bracket, followed by zero or more occurrences of non-angle brackets. When matching the right angle bracket, delete the definition of the Open group, and assign the substring between the Open group and the current group to the Close group. This is the first capturing group.
(?(Open)(?!)) | If the Open group exists, abandon the match if an empty string can be matched, but do not advance the position of the regular expression engine in the string. This is a zero-width negative lookahead assertion. Because an empty string is always implicitly present in an input string, this match always fails. Failure of this match indicates that the angle brackets are not balanced.
$ | Match the end of the input string.

The final subexpression, (?(Open)(?!)), indicates whether the nesting constructs in the input string are properly balanced (for example, whether each left angle bracket is matched by a right angle bracket). It uses conditional matching based on a valid captured group; for more information, see Alternation Constructs. If the Open group is defined, the regular expression engine attempts to match the subexpression (?!) in the input string. The Open group should be defined only if nesting constructs are unbalanced. Therefore, the pattern to be matched in the input string should be one that always causes the match to fail. In this case, (?!) is a zero-width negative lookahead assertion that always fails, because an empty string is always implicitly present at the next position in the input string.

In the example, the regular expression engine evaluates the input string "<abc><mno<xyz>>" as shown in the following table.

步驟Step 模式Pattern 結果Result
11 ^ 從輸入字串的開頭開始比對。Starts the match at the beginning of the input string
22 [^<>]* 在左角括號之前尋找非角括號字元;沒有找到配對。Looks for non-angle bracket characters before the left angle bracket;finds no matches.
33 (((?'Open'<) 比對 "<abc>" 中的左角括號,並將其指派給 Open 群組。Matches the left angle bracket in "<abc>" and assigns it to the Open group.
44 [^<>]* 比對 "abc"。Matches "abc".
55 )+ "<abc" 是第二個擷取群組的值。"<abc" is the value of the second captured group.

輸入字串中的下一個字元不是左角括號,所以規則運算式引擎未回送至 (?'Open'<)[^<>]*) 子模式。The next character in the input string is not a left angle bracket, so the regular expression engine does not loop back to the (?'Open'<)[^<>]*) subpattern.
66 ((?'Close-Open'>) 比對 "<abc>" 中的右角括號,將 "abc" (介於 Open 群組與右角括號之間的子字串) 指派給 Open 群組,並刪除 Close 群組的目前值 ("<"),使其空白。Matches the right angle bracket in "<abc>", assigns "abc", which is the substring between the Open group and the right angle bracket, to the Close group, and deletes the current value ("<") of the Open group, leaving it empty.
77 [^<>]* 在右角括號之後尋找非角括號字元;沒有找到配對。Looks for non-angle bracket characters after the right angle bracket; finds no matches.
88 )+ 第三個擷取群組的值是 ">"。The value of the third captured group is ">".

輸入字串中的下一個字元不是右角括號,所以規則運算式引擎未回送至 ((?'Close-Open'>)[^<>]*) 子模式。The next character in the input string is not a right angle bracket, so the regular expression engine does not loop back to the ((?'Close-Open'>)[^<>]*) subpattern.
9 )* The value of the first captured group is "<abc>".

The next character in the input string is a left angle bracket, so the regular expression engine loops back to the (((?'Open'<) subpattern.
10 (((?'Open'<) Matches the left angle bracket in "<mno" and assigns it to the Open group. Its Group.Captures collection now has a single value, "<".
11 [^<>]* Matches "mno".
12 )+ "<mno" is the value of the second captured group.

The next character in the input string is a left angle bracket, so the regular expression engine loops back to the ((?'Open'<)[^<>]*) subpattern.
13 (((?'Open'<) Matches the left angle bracket in "<xyz>" and assigns it to the Open group. The Group.Captures collection of the Open group now includes two captures: the left angle bracket from "<mno", and the left angle bracket from "<xyz>".
14 [^<>]* Matches "xyz".
15 )+ "<xyz" is the value of the second captured group.

The next character in the input string is not a left angle bracket, so the regular expression engine does not loop back to the ((?'Open'<)[^<>]*) subpattern.
16 ((?'Close-Open'>) Matches the right angle bracket in "<xyz>", assigns "xyz" (the substring between the Open group and the right angle bracket) to the Close group, and deletes the current value of the Open group. The value of the previous capture (the left angle bracket in "<mno") becomes the current value of the Open group. The Captures collection of the Open group now includes a single capture, the left angle bracket from "<mno".
17 [^<>]* Looks for non-angle bracket characters; finds no matches.
18 )+ The value of the third captured group is ">".

The next character in the input string is a right angle bracket, so the regular expression engine loops back to the ((?'Close-Open'>)[^<>]*) subpattern.
19 ((?'Close-Open'>) Matches the final right angle bracket in "xyz>>", assigns "mno<xyz>" (the substring between the Open group and the right angle bracket) to the Close group, and deletes the current value of the Open group. The Open group is now empty.
20 [^<>]* Looks for non-angle bracket characters; finds no matches.
21 )+ The value of the third captured group is ">".

The next character in the input string is not a right angle bracket, so the regular expression engine does not loop back to the ((?'Close-Open'>)[^<>]*) subpattern.
22 )* The value of the first captured group is "<mno<xyz>>".

The next character in the input string is not a left angle bracket, so the regular expression engine does not loop back to the (((?'Open'<) subpattern.
23 (?(Open)(?!)) The Open group is not defined, so no match is attempted.
24 $ Matches the end of the input string.
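
The walkthrough above can be condensed into a small check. The following is a minimal sketch, not a definitive implementation. It assumes that the full pattern being traced is ^[^<>]*(((?'Open'<)[^<>]*)+((?'Close-Open'>)[^<>]*)+)*(?(Open)(?!))$ (the leading portion is covered by the earlier steps of the table). The class name BalancedDemo and the helper IsBalanced are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedDemo
{
   // Assumed to be the pattern that the walkthrough traces.
   private const string Pattern =
      @"^[^<>]*(((?'Open'<)[^<>]*)+((?'Close-Open'>)[^<>]*)+)*(?(Open)(?!))$";

   // Hypothetical helper: true when the angle brackets in the input are balanced.
   public static bool IsBalanced(string input)
   {
      return Regex.IsMatch(input, Pattern);
   }

   public static void Main()
   {
      foreach (string input in new[] { "<abc><mno<xyz>>", "<abc><mno<xyz>" })
         Console.WriteLine("{0}: {1}", input, IsBalanced(input));

      // Each successful Close-Open match records the text between the brackets.
      Match match = Regex.Match("<abc><mno<xyz>>", Pattern);
      foreach (Capture capture in match.Groups["Close"].Captures)
         Console.WriteLine("   Close capture: {0}", capture.Value);
   }
}
```

The second input leaves one capture in the Open group when the end of the string is reached, so the (?(Open)(?!)) test forces the match to fail.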

Noncapturing Groups

The following grouping construct does not capture the substring that is matched by a subexpression:

(?:subexpression)

where subexpression is any valid regular expression pattern. The noncapturing group construct is typically used when a quantifier is applied to a group, but the substrings captured by the group are of no interest.

Note

If a regular expression includes nested grouping constructs, an outer noncapturing group construct does not apply to the inner nested group constructs.

The following example illustrates a regular expression that includes noncapturing groups. Note that the output does not include any captured groups.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"(?:\b(?:\w+)\W*)+\.";
      string input = "This is a short sentence.";
      Match match = Regex.Match(input, pattern);
      Console.WriteLine("Match: {0}", match.Value);
      for (int ctr = 1; ctr < match.Groups.Count; ctr++)
         Console.WriteLine("   Group {0}: {1}", ctr, match.Groups[ctr].Value);
   }
}
// The example displays the following output:
//       Match: This is a short sentence.
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "(?:\b(?:\w+)\W*)+\."
      Dim input As String = "This is a short sentence."
      Dim match As Match = Regex.Match(input, pattern)
      Console.WriteLine("Match: {0}", match.Value)
      For ctr As Integer = 1 To match.Groups.Count - 1
         Console.WriteLine("   Group {0}: {1}", ctr, match.Groups(ctr).Value)
      Next
   End Sub
End Module
' The example displays the following output:
'       Match: This is a short sentence.

The regular expression (?:\b(?:\w+)\W*)+\. matches a sentence that is terminated by a period. Because the regular expression focuses on sentences and not on individual words, the grouping constructs serve only to apply quantifiers. The regular expression pattern is interpreted as shown in the following table.

Pattern Description
\b Begin the match at a word boundary.
(?:\w+) Match one or more word characters. Do not assign the matched text to a captured group.
\W* Match zero or more non-word characters.
(?:\b(?:\w+)\W*)+ Match the pattern of one or more word characters starting at a word boundary, followed by zero or more non-word characters, one or more times. Do not assign the matched text to a captured group.
\. Match a period.
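
To make the difference between capturing and noncapturing parentheses concrete, the following sketch counts the groups that each variant of the pattern produces for the same input. It is an illustration under stated assumptions rather than part of the original example: the class name GroupCountDemo and the helper CountGroups are hypothetical, and the last call uses the RegexOptions.ExplicitCapture option as an alternative way to suppress unnamed captures.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupCountDemo
{
   // Hypothetical helper: number of groups in the match, including group 0.
   public static int CountGroups(string pattern, RegexOptions options = RegexOptions.None)
   {
      return Regex.Match("This is a short sentence.", pattern, options).Groups.Count;
   }

   public static void Main()
   {
      // Capturing parentheses: group 0 plus two numbered groups.
      Console.WriteLine(CountGroups(@"(\b(\w+)\W*)+\."));
      // Noncapturing throughout: only group 0 (the whole match).
      Console.WriteLine(CountGroups(@"(?:\b(?:\w+)\W*)+\."));
      // The outer (?:...) does not make the inner (\w+) noncapturing.
      Console.WriteLine(CountGroups(@"(?:\b(\w+)\W*)+\."));
      // ExplicitCapture treats unnamed parentheses as noncapturing.
      Console.WriteLine(CountGroups(@"(\b(\w+)\W*)+\.", RegexOptions.ExplicitCapture));
   }
}
```

The third call illustrates the note about nested constructs: the inner group still captures even though the outer group does not.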

Group Options

The following grouping construct applies or disables the specified options within a subexpression:

(?imnsx-imnsx: subexpression )

where subexpression is any valid regular expression pattern. For example, (?i-s:) turns on case insensitivity and disables single-line mode. For more information about the inline options you can specify, see Regular Expression Options.

Note

You can specify options that apply to an entire regular expression rather than a subexpression by using a System.Text.RegularExpressions.Regex class constructor or a static method. You can also specify inline options that apply after a specific point in a regular expression by using the (?imnsx-imnsx) language construct.

The group options construct is not a capturing group. That is, although any portion of a string that is matched by subexpression is included in the match, it is not included in a captured group nor used to populate the GroupCollection object.

For example, the regular expression \b(?ix: d \w+)\s in the following example uses inline options in a grouping construct to enable case-insensitive matching and ignore pattern white space in identifying all words that begin with the letter "d". The regular expression is defined as shown in the following table.

Pattern Description
\b Begin the match at a word boundary.
(?ix: d \w+) Using case-insensitive matching and ignoring white space in this pattern, match a "d" followed by one or more word characters.
\s Match a white-space character.
string pattern = @"\b(?ix: d \w+)\s";
string input = "Dogs are decidedly good pets.";

foreach (Match match in Regex.Matches(input, pattern))
   Console.WriteLine("'{0}' found at index {1}.", match.Value, match.Index);
// The example displays the following output:
//    'Dogs ' found at index 0.
//    'decidedly ' found at index 9.
Dim pattern As String = "\b(?ix: d \w+)\s"
Dim input As String = "Dogs are decidedly good pets."

For Each match As Match In Regex.Matches(input, pattern)
   Console.WriteLine("'{0}' found at index {1}.", match.Value, match.Index)
Next
' The example displays the following output:
'    'Dogs ' found at index 0.
'    'decidedly ' found at index 9.      
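
The scope of a group option can be seen by comparing it with an inline option that has no subexpression. The following is a minimal sketch; the class name OptionScopeDemo and the test strings are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionScopeDemo
{
   public static void Main()
   {
      // (?i:a) makes only "a" case-insensitive; "b" stays case-sensitive.
      Console.WriteLine(Regex.IsMatch("Ab", @"^(?i:a)b$"));   // True
      Console.WriteLine(Regex.IsMatch("AB", @"^(?i:a)b$"));   // False

      // (?i) without a subexpression applies from that point onward.
      Console.WriteLine(Regex.IsMatch("aB", @"^a(?i)b$"));    // True
      Console.WriteLine(Regex.IsMatch("AB", @"^a(?i)b$"));    // False

      // The group options construct does not add a capturing group.
      Console.WriteLine(Regex.Match("Ab", @"^(?i:a)b$").Groups.Count);   // 1
   }
}
```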

Zero-Width Positive Lookahead Assertions

The following grouping construct defines a zero-width positive lookahead assertion:

(?= subexpression )

where subexpression is any regular expression pattern. For a match to be successful, the input string must match the regular expression pattern in subexpression, although the matched substring is not included in the match result. A zero-width positive lookahead assertion does not backtrack.

Typically, a zero-width positive lookahead assertion is found at the end of a regular expression pattern. It defines a substring that must be found at the end of a string for a match to occur but that should not be included in the match. It is also useful for preventing excessive backtracking. You can use a zero-width positive lookahead assertion to ensure that a particular captured group begins with text that matches a subset of the pattern defined for that captured group. For example, if a capturing group matches consecutive word characters, you can use a zero-width positive lookahead assertion to require that the first character be an alphabetical uppercase character.

The following example uses a zero-width positive lookahead assertion to match the word that precedes the verb "is" in the input string.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b\w+(?=\sis\b)";
      string[] inputs = { "The dog is a Malamute.", 
                          "The island has beautiful birds.", 
                          "The pitch missed home plate.", 
                          "Sunday is a weekend day." };

      foreach (string input in inputs)
      {
         Match match = Regex.Match(input, pattern);
         if (match.Success)
            Console.WriteLine("'{0}' precedes 'is'.", match.Value);
         else
            Console.WriteLine("'{0}' does not match the pattern.", input); 
      }
   }
}
// The example displays the following output:
//    'dog' precedes 'is'.
//    'The island has beautiful birds.' does not match the pattern.
//    'The pitch missed home plate.' does not match the pattern.
//    'Sunday' precedes 'is'.
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "\b\w+(?=\sis\b)"
      Dim inputs() As String = { "The dog is a Malamute.", _
                                 "The island has beautiful birds.", _
                                 "The pitch missed home plate.", _
                                 "Sunday is a weekend day." }

      For Each input As String In inputs
         Dim match As Match = Regex.Match(input, pattern)
         If match.Success Then
            Console.WriteLine("'{0}' precedes 'is'.", match.Value)
         Else
            Console.WriteLine("'{0}' does not match the pattern.", input) 
         End If     
      Next
   End Sub
End Module
' The example displays the following output:
'       'dog' precedes 'is'.
'       'The island has beautiful birds.' does not match the pattern.
'       'The pitch missed home plate.' does not match the pattern.
'       'Sunday' precedes 'is'.

The regular expression \b\w+(?=\sis\b) is interpreted as shown in the following table.

Pattern Description
\b Begin the match at a word boundary.
\w+ Match one or more word characters.
(?=\sis\b) Determine whether the word characters are followed by a white-space character and the string "is", which ends on a word boundary. If so, the match is successful.
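
The other use described in this section, requiring that a run of word characters begin with an uppercase letter, can be sketched as follows. The pattern, the input string, and the class name LookaheadDemo are assumptions made for illustration and do not come from the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
   public static void Main()
   {
      // (?=\p{Lu}) checks the first character without consuming it,
      // so \w+ still matches the whole word.
      string pattern = @"\b(?=\p{Lu})\w+\b";
      string input = "The quick Brown fox has 9Lives";

      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine(match.Value);
   }
}
```

With this input, only "The" and "Brown" are matched. "9Lives" is skipped because the word begins with a digit, and there is no word boundary before the "L".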

Zero-Width Negative Lookahead Assertions

The following grouping construct defines a zero-width negative lookahead assertion:

(?! subexpression )

where subexpression is any regular expression pattern. For the match to be successful, the input string must not match the regular expression pattern in subexpression. Because the assertion is zero-width, the text that it examines is not included in the match result.

A zero-width negative lookahead assertion is typically used either at the beginning or at the end of a regular expression. At the beginning of a regular expression, it can define a specific pattern that should not be matched when the beginning of the regular expression defines a similar but more general pattern to be matched. In this case, it is often used to limit backtracking. At the end of a regular expression, it can define a subexpression that cannot occur at the end of a match.

The following example defines a regular expression that uses a zero-width negative lookahead assertion at the beginning of the regular expression to match words that do not begin with "un".

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b(?!un)\w+\b";
      string input = "unite one unethical ethics use untie ultimate";
      foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
         Console.WriteLine(match.Value);
   }
}
// The example displays the following output:
//       one
//       ethics
//       use
//       ultimate
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "\b(?!un)\w+\b"
      Dim input As String = "unite one unethical ethics use untie ultimate"
      For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
         Console.WriteLine(match.Value)
      Next
   End Sub
End Module
' The example displays the following output:
'       one
'       ethics
'       use
'       ultimate

The regular expression \b(?!un)\w+\b is interpreted as shown in the following table.

Pattern Description
\b Begin the match at a word boundary.
(?!un) Determine whether the next two characters are "un". If they are not, a match is possible.
\w+ Match one or more word characters.
\b End the match at a word boundary.

The following example defines a regular expression that uses a zero-width negative lookahead assertion at the end of the regular expression to match words that do not end with a punctuation character.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b\w+\b(?!\p{P})";
      string input = "Disconnected, disjointed thoughts in a sentence fragment.";
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine(match.Value);
   }
}
// The example displays the following output:
//       disjointed
//       thoughts
//       in
//       a
//       sentence
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "\b\w+\b(?!\p{P})"
      Dim input As String = "Disconnected, disjointed thoughts in a sentence fragment."
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine(match.Value)
      Next   
   End Sub
End Module
' The example displays the following output:
'       disjointed
'       thoughts
'       in
'       a
'       sentence

The regular expression \b\w+\b(?!\p{P}) is interpreted as shown in the following table.

Pattern Description
\b Begin the match at a word boundary.
\w+ Match one or more word characters.
\b End the match at a word boundary.
(?!\p{P}) If the next character is not a punctuation symbol (such as a period or a comma), the match succeeds.

Zero-Width Positive Lookbehind Assertions

The following grouping construct defines a zero-width positive lookbehind assertion:

(?<= subexpression )

where subexpression is any regular expression pattern. For a match to be successful, subexpression must occur in the input string immediately to the left of the current position, although subexpression is not included in the match result. A zero-width positive lookbehind assertion does not backtrack.

Zero-width positive lookbehind assertions are typically used at the beginning of regular expressions. The pattern that they define is a precondition for a match, although it is not a part of the match result.

For example, the following example matches the last two digits of the year for the twenty-first century (that is, it requires that the digits "20" precede the matched string).

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "2010 1999 1861 2140 2009";
      string pattern = @"(?<=\b20)\d{2}\b";
      
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine(match.Value);
   }
}
// The example displays the following output:
//       10
//       09
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "2010 1999 1861 2140 2009"
      Dim pattern As String = "(?<=\b20)\d{2}\b"
      
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine(match.Value)
      Next      
   End Sub
End Module
' The example displays the following output:
'       10
'       09

The regular expression pattern (?<=\b20)\d{2}\b is interpreted as shown in the following table.

Pattern Description
\d{2} Match two decimal digits.
(?<=\b20) Continue the match if the two decimal digits are preceded by the decimal digits "20" on a word boundary.
\b End the match at a word boundary.

Zero-width positive lookbehind assertions are also used to limit backtracking when the last character or characters in a captured group must be a subset of the characters that match that group's regular expression pattern. For example, if a group captures all consecutive word characters, you can use a zero-width positive lookbehind assertion to require that the last character be alphabetical.
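
The requirement that the last character of a word be alphabetical can be sketched as follows. The pattern, the input string, and the class name LookbehindDemo are assumptions introduced for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
   public static void Main()
   {
      // \w+ matches letters, digits, and underscores; (?<=\p{L}) then
      // requires that the character just matched is a letter.
      string pattern = @"\b\w+(?<=\p{L})\b";
      string input = "abc1 def x_ gh9i";

      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine(match.Value);
   }
}
```

With this input, only "def" and "gh9i" are matched, because "abc1" ends with a digit and "x_" ends with an underscore.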

Zero-Width Negative Lookbehind Assertions

The following grouping construct defines a zero-width negative lookbehind assertion:

(?<! subexpression )

where subexpression is any regular expression pattern. For a match to be successful, subexpression must not occur in the input string immediately to the left of the current position. However, any substring that does not match subexpression is not included in the match result.

Zero-width negative lookbehind assertions are typically used at the beginning of regular expressions. The pattern that they define precludes a match in the string that follows. They are also used to limit backtracking when the last character or characters in a captured group must not be one or more of the characters that match that group's regular expression pattern. For example, if a group captures all consecutive word characters, you can use a zero-width negative lookbehind assertion to require that the last character not be an underscore (_).

The following example matches the date for any day of the week that is not a weekend (that is, neither Saturday nor Sunday).

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string[] dates = { "Monday February 1, 2010", 
                         "Wednesday February 3, 2010", 
                         "Saturday February 6, 2010", 
                         "Sunday February 7, 2010", 
                         "Monday, February 8, 2010" };
      string pattern = @"(?<!(Saturday|Sunday) )\b\w+ \d{1,2}, \d{4}\b";
      
      foreach (string dateValue in dates)
      {
         Match match = Regex.Match(dateValue, pattern);
         if (match.Success)
            Console.WriteLine(match.Value);
      }      
   }
}
// The example displays the following output:
//       February 1, 2010
//       February 3, 2010
//       February 8, 2010
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim dates() As String = { "Monday February 1, 2010", _
                                "Wednesday February 3, 2010", _
                                "Saturday February 6, 2010", _
                                "Sunday February 7, 2010", _
                                "Monday, February 8, 2010" }
      Dim pattern As String = "(?<!(Saturday|Sunday) )\b\w+ \d{1,2}, \d{4}\b"
      
      For Each dateValue As String In dates
         Dim match As Match = Regex.Match(dateValue, pattern)
         If match.Success Then
            Console.WriteLine(match.Value)
         End If   
      Next      
   End Sub
End Module
' The example displays the following output:
'       February 1, 2010
'       February 3, 2010
'       February 8, 2010

The regular expression pattern (?<!(Saturday|Sunday) )\b\w+ \d{1,2}, \d{4}\b is interpreted as shown in the following table.

Pattern Description
\b Begin the match at a word boundary.
\w+ Match one or more word characters followed by a white-space character.
\d{1,2}, Match either one or two decimal digits followed by a comma and a white-space character.
\d{4}\b Match four decimal digits, and end the match at a word boundary.
(?<!(Saturday|Sunday) ) If the match is preceded by something other than the strings "Saturday" or "Sunday" followed by a space, the match is successful.
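
The other use mentioned in this section, requiring that the last character of a word not be an underscore, can be sketched as follows. The pattern, the input string, and the class name NegativeLookbehindDemo are hypothetical and are not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegativeLookbehindDemo
{
   public static void Main()
   {
      // (?<!_) rejects a word whose last matched character is an underscore.
      string pattern = @"\b\w+(?<!_)\b";
      string input = "alpha beta_ _gamma delta__";

      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine(match.Value);
   }
}
```

With this input, "alpha" and "_gamma" are matched. "beta_" and "delta__" are rejected because each ends with an underscore, and shortening the match does not help because no word boundary falls between a letter and an underscore.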

Nonbacktracking Subexpressions

The following grouping construct represents a nonbacktracking subexpression (also known as an atomic group or a "greedy" subexpression):

(?> subexpression )

where subexpression is any regular expression pattern.

Ordinarily, if a regular expression includes an optional or alternative matching pattern and a match does not succeed, the regular expression engine can branch in multiple directions to match an input string with a pattern. If a match is not found when it takes the first branch, the regular expression engine can back up, or backtrack, to the point where it took the first branch and attempt the match using the second branch. This process can continue until all branches have been tried.

The (?>subexpression) language construct disables backtracking. The regular expression engine will match as many characters in the input string as it can. When no further match is possible, it will not backtrack to attempt alternate pattern matches. (That is, the subexpression matches only strings that would be matched by the subexpression alone; it does not attempt to match a string based on the subexpression and any subexpressions that follow it.)

This option is recommended if you know that backtracking will not succeed. Preventing the regular expression engine from performing unnecessary searching improves performance.

The following example illustrates how a nonbacktracking subexpression modifies the results of a pattern match. The backtracking regular expression successfully matches a series of repeated characters followed by one more occurrence of the same character on a word boundary, but the nonbacktracking regular expression does not.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string[] inputs = { "cccd.", "aaad", "aaaa" };
      string back = @"(\w)\1+.\b";
      string noback = @"(?>(\w)\1+).\b";
      
      foreach (string input in inputs)
      {
         Match match1 = Regex.Match(input, back);
         Match match2 = Regex.Match(input, noback);
         Console.WriteLine("{0}: ", input);

         Console.Write("   Backtracking : ");
         if (match1.Success)
            Console.WriteLine(match1.Value);
         else
            Console.WriteLine("No match");
         
         Console.Write("   Nonbacktracking: ");
         if (match2.Success)
            Console.WriteLine(match2.Value);
         else
            Console.WriteLine("No match");
      }
   }
}
// The example displays the following output:
//    cccd.:
//       Backtracking : cccd
//       Nonbacktracking: cccd
//    aaad:
//       Backtracking : aaad
//       Nonbacktracking: aaad
//    aaaa:
//       Backtracking : aaaa
//       Nonbacktracking: No match
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim inputs() As String = { "cccd.", "aaad", "aaaa" }
      Dim back As String = "(\w)\1+.\b"
      Dim noback As String = "(?>(\w)\1+).\b"
      
      For Each input As String In inputs
         Dim match1 As Match = Regex.Match(input, back)
         Dim match2 As Match = Regex.Match(input, noback)
         Console.WriteLine("{0}: ", input)

         Console.Write("   Backtracking : ")
         If match1.Success Then
            Console.WriteLine(match1.Value)
         Else
            Console.WriteLine("No match")
         End If
         
         Console.Write("   Nonbacktracking: ")
         If match2.Success Then
            Console.WriteLine(match2.Value)
         Else
            Console.WriteLine("No match")
         End If
      Next
   End Sub
End Module
' The example displays the following output:
'    cccd.:
'       Backtracking : cccd
'       Nonbacktracking: cccd
'    aaad:
'       Backtracking : aaad
'       Nonbacktracking: aaad
'    aaaa:
'       Backtracking : aaaa
'       Nonbacktracking: No match

The nonbacktracking regular expression (?>(\w)\1+).\b is defined as shown in the following table.

Pattern Description
(\w) Match a single word character and assign it to the first capturing group.
\1+ Match the value of the first captured substring one or more times.
. Match any character.
\b End the match on a word boundary.
(?>(\w)\1+) Match one or more occurrences of a duplicated word character, but do not backtrack to match the last character on a word boundary.
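
The effect of disabling backtracking can also be seen in a smaller case. The following is a minimal sketch; the patterns, the input string "aaab", and the class name AtomicDemo are hypothetical and chosen only to isolate the behavior.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AtomicDemo
{
   public static void Main()
   {
      // a+ first takes "aaa", then gives one "a" back so that "ab" can match.
      Console.WriteLine(Regex.IsMatch("aaab", @"a+ab"));       // True

      // (?>a+) keeps every "a" it matched, so the following "a" cannot match.
      Console.WriteLine(Regex.IsMatch("aaab", @"(?>a+)ab"));   // False

      // When nothing after the group needs an "a", the match still succeeds.
      Console.WriteLine(Regex.IsMatch("aaab", @"(?>a+)b"));    // True
   }
}
```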

Grouping Constructs and Regular Expression Objects

Substrings that are matched by a regular expression capturing group are represented by System.Text.RegularExpressions.Group objects, which can be retrieved from the System.Text.RegularExpressions.GroupCollection object that is returned by the Match.Groups property. The GroupCollection object is populated as follows:

  • The first Group object in the collection (the object at index zero) represents the entire match.

  • The next set of Group objects represents unnamed (numbered) capturing groups. They appear in the order in which they are defined in the regular expression, from left to right. The index values of these groups range from 1 to the number of unnamed capturing groups in the collection. (The index of a particular group is equivalent to its numbered backreference. For more information about backreferences, see Backreference Constructs.)

  • The final set of Group objects represents named capturing groups. They appear in the order in which they are defined in the regular expression, from left to right. The index value of the first named capturing group is one greater than the index of the last unnamed capturing group. If there are no unnamed capturing groups in the regular expression, the index value of the first named capturing group is one.
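
The ordering rules in the list above can be sketched as follows. The pattern, the group name first, the input string, and the class name GroupOrderDemo are hypothetical and introduced only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupOrderDemo
{
   public static void Main()
   {
      // The named group comes first in the pattern, but it is indexed
      // after the unnamed group.
      Regex regex = new Regex(@"(?<first>\w+)\s(\w+)");
      Match match = regex.Match("hello world");

      for (int ctr = 0; ctr < match.Groups.Count; ctr++)
         Console.WriteLine("Group {0} ({1}): {2}", ctr,
                           regex.GroupNameFromNumber(ctr), match.Groups[ctr].Value);
   }
}
```

Here index 1 holds "world" (the unnamed group) and index 2 holds "hello" (the named group), even though the named group appears first in the pattern.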

If you apply a quantifier to a capturing group, the corresponding Group object's Capture.Value, Capture.Index, and Capture.Length properties reflect the last substring that is captured by the capturing group. You can retrieve the complete set of substrings captured by groups that have quantifiers from the CaptureCollection object that is returned by the Group.Captures property.

The following example clarifies the relationship between the Group and Capture objects.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"(\b(\w+)\W+)+";
      string input = "This is a short sentence.";
      Match match = Regex.Match(input, pattern);
      Console.WriteLine("Match: '{0}'", match.Value);
      for (int ctr = 1; ctr < match.Groups.Count; ctr++)
      {
         Console.WriteLine("   Group {0}: '{1}'", ctr, match.Groups[ctr].Value);
         int capCtr = 0;
         foreach (Capture capture in match.Groups[ctr].Captures)
         {
            Console.WriteLine("      Capture {0}: '{1}'", capCtr, capture.Value);
            capCtr++;
         }
      }
   }
}
// The example displays the following output:
//       Match: 'This is a short sentence.'
//          Group 1: 'sentence.'
//             Capture 0: 'This '
//             Capture 1: 'is '
//             Capture 2: 'a '
//             Capture 3: 'short '
//             Capture 4: 'sentence.'
//          Group 2: 'sentence'
//             Capture 0: 'This'
//             Capture 1: 'is'
//             Capture 2: 'a'
//             Capture 3: 'short'
//             Capture 4: 'sentence'
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "(\b(\w+)\W+)+"
      Dim input As String = "This is a short sentence."
      Dim match As Match = Regex.Match(input, pattern)
      Console.WriteLine("Match: '{0}'", match.Value)
      For ctr As Integer = 1 To match.Groups.Count - 1
         Console.WriteLine("   Group {0}: '{1}'", ctr, match.Groups(ctr).Value)
         Dim capCtr As Integer = 0
         For Each capture As Capture In match.Groups(ctr).Captures
            Console.WriteLine("      Capture {0}: '{1}'", capCtr, capture.Value)
            capCtr += 1
         Next
      Next
   End Sub
End Module
' The example displays the following output:
'       Match: 'This is a short sentence.'
'          Group 1: 'sentence.'
'             Capture 0: 'This '
'             Capture 1: 'is '
'             Capture 2: 'a '
'             Capture 3: 'short '
'             Capture 4: 'sentence.'
'          Group 2: 'sentence'
'             Capture 0: 'This'
'             Capture 1: 'is'
'             Capture 2: 'a'
'             Capture 3: 'short'
'             Capture 4: 'sentence'

The regular expression pattern (\b(\w+)\W+)+ extracts individual words from a string. It is defined as shown in the following table.

Pattern Description
\b Begin the match at a word boundary.
(\w+) Match one or more word characters. Together, these characters form a word. This is the second capturing group.
\W+ Match one or more non-word characters.
(\b(\w+)\W+)+ Match the pattern of one or more word characters followed by one or more non-word characters one or more times. This is the first capturing group.

The second capturing group matches each word of the sentence. The first capturing group matches each word along with the punctuation and white space that follow the word. The Group object whose index is 2 provides information about the text matched by the second capturing group. The complete set of words captured by the capturing group is available from the CaptureCollection object returned by the Group.Captures property.

See also