Alternation Constructs in Regular Expressions

Alternation constructs modify a regular expression to enable either/or or conditional matching. .NET supports three alternation constructs: pattern matching with |, conditional matching with (?(expression)yes|no), and conditional matching based on a valid captured group.

Pattern Matching with |

You can use the vertical bar (|) character to match any one of a series of patterns, where the | character separates each pattern.

Like the positive character class, the | character can be used to match any one of a number of single characters. The following example uses both a positive character class and either/or pattern matching with the | character to locate occurrences of the words "gray" or "grey" in a string. In this case, the | character produces the more verbose of the two regular expressions.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      // Regular expression using character class.
      string pattern1 = @"\bgr[ae]y\b";
      // Regular expression using either/or.
      string pattern2 = @"\bgr(a|e)y\b";
      
      string input = "The gray wolf blended in among the grey rocks.";
      foreach (Match match in Regex.Matches(input, pattern1))
         Console.WriteLine("'{0}' found at position {1}", 
                           match.Value, match.Index);
      Console.WriteLine();
      foreach (Match match in Regex.Matches(input, pattern2))
         Console.WriteLine("'{0}' found at position {1}", 
                           match.Value, match.Index);
   }
}
// The example displays the following output:
//       'gray' found at position 4
//       'grey' found at position 35
//       
//       'gray' found at position 4
//       'grey' found at position 35           
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      ' Regular expression using character class.
      Dim pattern1 As String = "\bgr[ae]y\b"
      ' Regular expression using either/or.
      Dim pattern2 As String = "\bgr(a|e)y\b"
      
      Dim input As String = "The gray wolf blended in among the grey rocks."
      For Each match As Match In Regex.Matches(input, pattern1)
         Console.WriteLine("'{0}' found at position {1}", _
                           match.Value, match.Index)
      Next      
      Console.WriteLine()
      For Each match As Match In Regex.Matches(input, pattern2)
         Console.WriteLine("'{0}' found at position {1}", _
                           match.Value, match.Index)
      Next      
   End Sub
End Module
' The example displays the following output:
'       'gray' found at position 4
'       'grey' found at position 35
'       
'       'gray' found at position 4
'       'grey' found at position 35           

The regular expression that uses the | character, \bgr(a|e)y\b, is interpreted as shown in the following table:

Pattern    Description
\b         Start at a word boundary.
gr         Match the characters "gr".
(a|e)      Match either an "a" or an "e".
y\b        Match a "y" on a word boundary.
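
The parentheses in \bgr(a|e)y\b matter, because the | character has the lowest precedence of the regular expression operators. As a minimal sketch (the class name AlternationScope and the helper FindAll are illustrative and not part of the original example), the following compares the grouped pattern with the ungrouped pattern \bgra|ey\b, which matches either "gra" at the start of a word or "ey" at the end of a word:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AlternationScope
{
   // Hypothetical helper: returns the value of every match of a pattern.
   public static List<string> FindAll(string input, string pattern)
   {
      var results = new List<string>();
      foreach (Match match in Regex.Matches(input, pattern))
         results.Add(match.Value);
      return results;
   }

   public static void Main()
   {
      string input = "The gray wolf blended in among the grey rocks.";
      // Grouped: the alternation is limited to the single letter.
      Console.WriteLine(string.Join(", ", FindAll(input, @"\bgr(a|e)y\b")));
      // Ungrouped: the alternation splits the whole pattern in two.
      Console.WriteLine(string.Join(", ", FindAll(input, @"\bgra|ey\b")));
   }
}
// The sketch displays the following output:
//       gray, grey
//       gra, ey
```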

The | character can also be used to perform an either/or match with multiple characters or subexpressions, which can include any combination of character literals and regular expression language elements. (The character class does not provide this functionality.) The following example uses the | character to extract either a U.S. Social Security Number (SSN), which is a 9-digit number with the format ddd-dd-dddd, or a U.S. Employer Identification Number (EIN), which is a 9-digit number with the format dd-ddddddd.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b(\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b";
      string input = "01-9999999 020-333333 777-88-9999";
      Console.WriteLine("Matches for {0}:", pattern);
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("   {0} at position {1}", match.Value, match.Index);
   }
}
// The example displays the following output:
//       Matches for \b(\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b:
//          01-9999999 at position 0
//          777-88-9999 at position 22
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "\b(\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b"
      Dim input As String = "01-9999999 020-333333 777-88-9999"
      Console.WriteLine("Matches for {0}:", pattern)
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("   {0} at position {1}", match.Value, match.Index)
      Next   
   End Sub
End Module
' The example displays the following output:
'       Matches for \b(\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b:
'          01-9999999 at position 0
'          777-88-9999 at position 22

The regular expression \b(\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b is interpreted as shown in the following table:

Pattern                            Description
\b                                 Start at a word boundary.
(\d{2}-\d{7}|\d{3}-\d{2}-\d{4})    Match either of the following: two decimal digits followed by a hyphen followed by seven decimal digits; or three decimal digits, a hyphen, two decimal digits, another hyphen, and four decimal digits.
\b                                 End the match at a word boundary.
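
When the calling code needs to know which alternative produced a match, one possible approach is to wrap each alternative in its own named group and check which group succeeded. The following is a sketch only; the group names ein and ssn, the class name AlternativeLabels, and the helper Label are assumptions introduced for illustration and do not appear in the original pattern:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AlternativeLabels
{
   // Same two alternatives as the example, each wrapped in an
   // illustrative named group ("ein" and "ssn").
   private const string Pattern =
      @"\b(?:(?<ein>\d{2}-\d{7})|(?<ssn>\d{3}-\d{2}-\d{4}))\b";

   public static List<string> Label(string input)
   {
      var labels = new List<string>();
      foreach (Match match in Regex.Matches(input, Pattern))
      {
         // Only the group inside the alternative that matched reports success.
         string kind = match.Groups["ein"].Success ? "EIN" : "SSN";
         labels.Add(kind + ": " + match.Value);
      }
      return labels;
   }

   public static void Main()
   {
      foreach (string label in Label("01-9999999 020-333333 777-88-9999"))
         Console.WriteLine(label);
   }
}
// The sketch displays the following output:
//       EIN: 01-9999999
//       SSN: 777-88-9999
```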

Conditional matching with an expression

This language element attempts to match one of two patterns depending on whether it can match an initial pattern. Its syntax is:

(?( expression ) yes | no )

where expression is the initial pattern to match, yes is the pattern to match if expression is matched, and no is the optional pattern to match if expression is not matched. The regular expression engine treats expression as a zero-width assertion; that is, the regular expression engine does not advance in the input stream after it evaluates expression. Therefore, this construct is equivalent to the following:

(?(?= expression ) yes | no )

where (?=expression) is a zero-width assertion construct. (For more information, see Grouping Constructs.) Because the regular expression engine interprets expression as an anchor (a zero-width assertion), expression must either be a zero-width assertion (for more information, see Anchors) or a subexpression that is also contained in yes. Otherwise, the yes pattern cannot be matched.
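
The equivalence of the two forms can be checked directly. The following sketch is not a definitive test; the class name ConditionalEquivalence and the helper FindAll are illustrative. It runs an EIN/SSN pattern once with an implicit test, (?(\d{2}-)...), and once with the explicit lookahead form, (?(?=\d{2}-)...), against the same input:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ConditionalEquivalence
{
   // Hypothetical helper: returns the value of every match of a pattern.
   public static List<string> FindAll(string input, string pattern)
   {
      var results = new List<string>();
      foreach (Match match in Regex.Matches(input, pattern))
         results.Add(match.Value);
      return results;
   }

   public static void Main()
   {
      string input = "01-9999999 020-333333 777-88-9999";
      // Implicit test: the engine treats \d{2}- as a zero-width assertion.
      string implicitTest = @"\b(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b";
      // Explicit test: the same condition written as a lookahead.
      string explicitTest = @"\b(?(?=\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b";

      Console.WriteLine(string.Join(", ", FindAll(input, implicitTest)));
      Console.WriteLine(string.Join(", ", FindAll(input, explicitTest)));
   }
}
// The sketch displays the following output:
//       01-9999999, 777-88-9999
//       01-9999999, 777-88-9999
```

Because the test consumes no input, the yes pattern has to repeat \d{2}- in order to match those characters.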

Note

If expression is a named or numbered capturing group, the alternation construct is interpreted as a capture test; for more information, see the next section, Conditional matching based on a valid captured group. In other words, the regular expression engine does not attempt to match the captured substring, but instead tests for the presence or absence of the group.

The following example is a variation of the example that appears in the Pattern Matching with | section. It uses conditional matching to determine whether the first three characters after a word boundary are two digits followed by a hyphen. If they are, it attempts to match a U.S. Employer Identification Number (EIN). If not, it attempts to match a U.S. Social Security Number (SSN).

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b";
      string input = "01-9999999 020-333333 777-88-9999";
      Console.WriteLine("Matches for {0}:", pattern);
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("   {0} at position {1}", match.Value, match.Index);
   }
}
// The example displays the following output:
//       Matches for \b(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b:
//          01-9999999 at position 0
//          777-88-9999 at position 22
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "\b(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b"
      Dim input As String = "01-9999999 020-333333 777-88-9999"
      Console.WriteLine("Matches for {0}:", pattern)
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("   {0} at position {1}", match.Value, match.Index)
      Next   
   End Sub
End Module
' The example displays the following output:
'       Matches for \b(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b:
'          01-9999999 at position 0
'          777-88-9999 at position 22

The regular expression pattern \b(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b is interpreted as shown in the following table:

Pattern              Description
\b                   Start at a word boundary.
(?(\d{2}-)           Determine whether the next three characters consist of two digits followed by a hyphen.
\d{2}-\d{7}          If the previous pattern matches, match two digits followed by a hyphen followed by seven digits.
\d{3}-\d{2}-\d{4}    If the previous pattern does not match, match three decimal digits, a hyphen, two decimal digits, another hyphen, and four decimal digits.
\b                   Match a word boundary.

Conditional matching based on a valid captured group

This language element attempts to match one of two patterns depending on whether it has matched a specified capturing group. Its syntax is:

(?( name ) yes | no )

or

(?( number ) yes | no )

where name is the name and number is the number of a capturing group, yes is the expression to match if name or number has a match, and no is the optional expression to match if it does not.

If name does not correspond to the name of a capturing group that is used in the regular expression pattern, the alternation construct is interpreted as an expression test, as explained in the previous section. Typically, this means that expression evaluates to false. If number does not correspond to a numbered capturing group that is used in the regular expression pattern, the regular expression engine throws an ArgumentException.
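
The behavior for an undefined group number can be observed when the Regex object is constructed. The following is a minimal sketch under stated assumptions: the class name CaptureTestValidation and the helper IsValidPattern are hypothetical, and the second pattern deliberately refers to group 2, which does not exist.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureTestValidation
{
   // Hypothetical helper: reports whether the Regex constructor accepts a pattern.
   public static bool IsValidPattern(string pattern)
   {
      try
      {
         new Regex(pattern);
         return true;
      }
      catch (ArgumentException)
      {
         // Thrown when the pattern cannot be parsed, for example when a
         // capture test refers to a group number that is not defined.
         return false;
      }
   }

   public static void Main()
   {
      // Group 1 exists, so (?(1)...) is a valid capture test.
      Console.WriteLine(IsValidPattern(@"\b(\d{2}-)?(?(1)\d{7}|\d{3}-\d{2}-\d{4})\b"));
      // There is no group 2, so (?(2)...) is rejected.
      Console.WriteLine(IsValidPattern(@"\b(\d{2}-)?(?(2)\d{7}|\d{3}-\d{2}-\d{4})\b"));
   }
}
// The sketch displays the following output:
//       True
//       False
```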

The following example is a variation of the example that appears in the Pattern Matching with | section. It uses a capturing group named n2 that consists of two digits followed by a hyphen. The alternation construct tests whether this capturing group has been matched in the input string. If it has, the alternation construct attempts to match the last seven digits of a nine-digit U.S. Employer Identification Number (EIN). If it has not, it attempts to match a nine-digit U.S. Social Security Number (SSN).

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b(?<n2>\d{2}-)?(?(n2)\d{7}|\d{3}-\d{2}-\d{4})\b";
      string input = "01-9999999 020-333333 777-88-9999";
      Console.WriteLine("Matches for {0}:", pattern);
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("   {0} at position {1}", match.Value, match.Index);
   }
}
// The example displays the following output:
//       Matches for \b(?<n2>\d{2}-)?(?(n2)\d{7}|\d{3}-\d{2}-\d{4})\b:
//          01-9999999 at position 0
//          777-88-9999 at position 22
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "\b(?<n2>\d{2}-)?(?(n2)\d{7}|\d{3}-\d{2}-\d{4})\b"
      Dim input As String = "01-9999999 020-333333 777-88-9999"
      Console.WriteLine("Matches for {0}:", pattern)
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("   {0} at position {1}", match.Value, match.Index)
      Next   
   End Sub
End Module
' The example displays the following output:
'       Matches for \b(?<n2>\d{2}-)?(?(n2)\d{7}|\d{3}-\d{2}-\d{4})\b:
'          01-9999999 at position 0
'          777-88-9999 at position 22

The regular expression pattern \b(?<n2>\d{2}-)?(?(n2)\d{7}|\d{3}-\d{2}-\d{4})\b is interpreted as shown in the following table:

Pattern               Description
\b                    Start at a word boundary.
(?<n2>\d{2}-)?        Match zero or one occurrence of two digits followed by a hyphen. Name this capturing group n2.
(?(n2)                Test whether n2 was matched in the input string.
\d{7}                 If n2 was matched, match seven decimal digits.
|\d{3}-\d{2}-\d{4}    If n2 was not matched, match three decimal digits, a hyphen, two decimal digits, another hyphen, and four decimal digits.
\b                    Match a word boundary.
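
To see which branch the capture test selected for each match, the calling code can inspect the Success property of the n2 group. The following sketch reuses the pattern from the example; the class name CaptureTestBranches, the helper Describe, and the branch labels are illustrative additions:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureTestBranches
{
   private const string Pattern =
      @"\b(?<n2>\d{2}-)?(?(n2)\d{7}|\d{3}-\d{2}-\d{4})\b";

   // Hypothetical helper: pairs each match with the branch that produced it.
   public static List<string> Describe(string input)
   {
      var lines = new List<string>();
      foreach (Match match in Regex.Matches(input, Pattern))
      {
         // n2 succeeds only when the optional two-digit prefix was captured,
         // which is the same condition the (?(n2)...) test evaluates.
         string branch = match.Groups["n2"].Success ? "yes branch (EIN)"
                                                    : "no branch (SSN)";
         lines.Add(match.Value + ": " + branch);
      }
      return lines;
   }

   public static void Main()
   {
      foreach (string line in Describe("01-9999999 020-333333 777-88-9999"))
         Console.WriteLine(line);
   }
}
// The sketch displays the following output:
//       01-9999999: yes branch (EIN)
//       777-88-9999: no branch (SSN)
```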

A variation of this example that uses a numbered group instead of a named group is shown in the following example. Its regular expression pattern is \b(\d{2}-)?(?(1)\d{7}|\d{3}-\d{2}-\d{4})\b.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b(\d{2}-)?(?(1)\d{7}|\d{3}-\d{2}-\d{4})\b";
      string input = "01-9999999 020-333333 777-88-9999";
      Console.WriteLine("Matches for {0}:", pattern);
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("   {0} at position {1}", match.Value, match.Index);
   }
}
// The example displays the following output:
//       Matches for \b(\d{2}-)?(?(1)\d{7}|\d{3}-\d{2}-\d{4})\b:
//          01-9999999 at position 0
//          777-88-9999 at position 22
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "\b(\d{2}-)?(?(1)\d{7}|\d{3}-\d{2}-\d{4})\b"
      Dim input As String = "01-9999999 020-333333 777-88-9999"
      Console.WriteLine("Matches for {0}:", pattern)
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("   {0} at position {1}", match.Value, match.Index)
      Next   
   End Sub
End Module
' The example displays the following output:
'       Matches for \b(\d{2}-)?(?(1)\d{7}|\d{3}-\d{2}-\d{4})\b:
'          01-9999999 at position 0
'          777-88-9999 at position 22

See also