CharUnicodeInfo.GetUnicodeCategory Method

Definition

Gets the Unicode category of a Unicode character.

Overloads

GetUnicodeCategory(Char)

Gets the Unicode category of the specified character.

GetUnicodeCategory(Int32)

Gets the Unicode category of the specified character.

GetUnicodeCategory(String, Int32)

Gets the Unicode category of the character at the specified index of the specified string.

GetUnicodeCategory(Char)

Gets the Unicode category of the specified character.

public:
 static System::Globalization::UnicodeCategory GetUnicodeCategory(char ch);
public static System.Globalization.UnicodeCategory GetUnicodeCategory (char ch);
static member GetUnicodeCategory : char -> System.Globalization.UnicodeCategory
Public Shared Function GetUnicodeCategory (ch As Char) As UnicodeCategory

Parameters

ch
Char

The Unicode character for which to get the Unicode category.

Returns

A UnicodeCategory value indicating the category of the specified character.

Examples

The following code example shows the values returned by the GetNumericValue, GetDigitValue, GetDecimalDigitValue, and GetUnicodeCategory methods for different types of characters.

using namespace System;
using namespace System::Globalization;
void PrintProperties( Char c );
int main()
{
   Console::WriteLine( "                                        c  Num   Dig   Dec   UnicodeCategory" );
   Console::Write( "U+0061 LATIN SMALL LETTER A            " );
   PrintProperties( L'a' );
   Console::Write( "U+0393 GREEK CAPITAL LETTER GAMMA      " );
   PrintProperties( L'\u0393' );
   Console::Write( "U+0039 DIGIT NINE                      " );
   PrintProperties( L'9' );
   Console::Write( "U+00B2 SUPERSCRIPT TWO                 " );
   PrintProperties( L'\u00B2' );
   Console::Write( "U+00BC VULGAR FRACTION ONE QUARTER     " );
   PrintProperties( L'\u00BC' );
   Console::Write( "U+0BEF TAMIL DIGIT NINE                " );
   PrintProperties( L'\u0BEF' );
   Console::Write( "U+0BF0 TAMIL NUMBER TEN                " );
   PrintProperties( L'\u0BF0' );
   Console::Write( "U+0F33 TIBETAN DIGIT HALF ZERO         " );
   PrintProperties( L'\u0F33' );
   Console::Write( "U+2788 CIRCLED SANS-SERIF DIGIT NINE   " );
   PrintProperties( L'\u2788' );
}

void PrintProperties( Char c )
{
   Console::Write( " {0,-3}", c );
   Console::Write( " {0,-5}", CharUnicodeInfo::GetNumericValue( c ) );
   Console::Write( " {0,-5}", CharUnicodeInfo::GetDigitValue( c ) );
   Console::Write( " {0,-5}", CharUnicodeInfo::GetDecimalDigitValue( c ) );
   Console::WriteLine( "{0}", CharUnicodeInfo::GetUnicodeCategory( c ) );
}

/*
This code produces the following output.  Some characters might not display at the console.

                                        c  Num   Dig   Dec   UnicodeCategory
U+0061 LATIN SMALL LETTER A             a   -1    -1    -1   LowercaseLetter
U+0393 GREEK CAPITAL LETTER GAMMA       \u0393   -1    -1    -1   UppercaseLetter
U+0039 DIGIT NINE                       9   9     9     9    DecimalDigitNumber
U+00B2 SUPERSCRIPT TWO                  \u00B2   2     2     2    OtherNumber
U+00BC VULGAR FRACTION ONE QUARTER      \u00BC   0.25  -1    -1   OtherNumber
U+0BEF TAMIL DIGIT NINE                 \u0BEF   9     9     9    DecimalDigitNumber
U+0BF0 TAMIL NUMBER TEN                 \u0BF0   10    -1    -1   OtherNumber
U+0F33 TIBETAN DIGIT HALF ZERO          \u0F33   -0.5  -1    -1   OtherNumber
U+2788 CIRCLED SANS-SERIF DIGIT NINE    \u2788   9     9     -1   OtherNumber

*/

using System;
using System.Globalization;

public class SamplesCharUnicodeInfo  {

   public static void Main()  {

      Console.WriteLine( "                                        c  Num   Dig   Dec   UnicodeCategory" );

      Console.Write( "U+0061 LATIN SMALL LETTER A            " );
      PrintProperties( 'a' );

      Console.Write( "U+0393 GREEK CAPITAL LETTER GAMMA      " );
      PrintProperties( '\u0393' );

      Console.Write( "U+0039 DIGIT NINE                      " );
      PrintProperties( '9' );

      Console.Write( "U+00B2 SUPERSCRIPT TWO                 " );
      PrintProperties( '\u00B2' );

      Console.Write( "U+00BC VULGAR FRACTION ONE QUARTER     " );
      PrintProperties( '\u00BC' );

      Console.Write( "U+0BEF TAMIL DIGIT NINE                " );
      PrintProperties( '\u0BEF' );

      Console.Write( "U+0BF0 TAMIL NUMBER TEN                " );
      PrintProperties( '\u0BF0' );

      Console.Write( "U+0F33 TIBETAN DIGIT HALF ZERO         " );
      PrintProperties( '\u0F33' );

      Console.Write( "U+2788 CIRCLED SANS-SERIF DIGIT NINE   " );
      PrintProperties( '\u2788' );

   }

   public static void PrintProperties( char c )  {
      Console.Write( " {0,-3}", c );
      Console.Write( " {0,-5}", CharUnicodeInfo.GetNumericValue( c ) );
      Console.Write( " {0,-5}", CharUnicodeInfo.GetDigitValue( c ) );
      Console.Write( " {0,-5}", CharUnicodeInfo.GetDecimalDigitValue( c ) );
      Console.WriteLine( "{0}", CharUnicodeInfo.GetUnicodeCategory( c ) );
   }

}


/*
This code produces the following output.  Some characters might not display at the console.

                                        c  Num   Dig   Dec   UnicodeCategory
U+0061 LATIN SMALL LETTER A             a   -1    -1    -1   LowercaseLetter
U+0393 GREEK CAPITAL LETTER GAMMA       \u0393   -1    -1    -1   UppercaseLetter
U+0039 DIGIT NINE                       9   9     9     9    DecimalDigitNumber
U+00B2 SUPERSCRIPT TWO                  \u00B2   2     2     2    OtherNumber
U+00BC VULGAR FRACTION ONE QUARTER      \u00BC   0.25  -1    -1   OtherNumber
U+0BEF TAMIL DIGIT NINE                 \u0BEF   9     9     9    DecimalDigitNumber
U+0BF0 TAMIL NUMBER TEN                 \u0BF0   10    -1    -1   OtherNumber
U+0F33 TIBETAN DIGIT HALF ZERO          \u0F33   -0.5  -1    -1   OtherNumber
U+2788 CIRCLED SANS-SERIF DIGIT NINE    \u2788   9     9     -1   OtherNumber

*/

Imports System
Imports System.Globalization

Public Class SamplesCharUnicodeInfo   

   Public Shared Sub Main()

      Console.WriteLine("                                        c  Num   Dig   Dec   UnicodeCategory")

      Console.Write("U+0061 LATIN SMALL LETTER A            ")
      PrintProperties("a"c)

      Console.Write("U+0393 GREEK CAPITAL LETTER GAMMA      ")
      PrintProperties(ChrW(&H0393))

      Console.Write("U+0039 DIGIT NINE                      ")
      PrintProperties("9"c)

      Console.Write("U+00B2 SUPERSCRIPT TWO                 ")
      PrintProperties(ChrW(&H00B2))

      Console.Write("U+00BC VULGAR FRACTION ONE QUARTER     ")
      PrintProperties(ChrW(&H00BC))

      Console.Write("U+0BEF TAMIL DIGIT NINE                ")
      PrintProperties(ChrW(&H0BEF))

      Console.Write("U+0BF0 TAMIL NUMBER TEN                ")
      PrintProperties(ChrW(&H0BF0))

      Console.Write("U+0F33 TIBETAN DIGIT HALF ZERO         ")
      PrintProperties(ChrW(&H0F33))

      Console.Write("U+2788 CIRCLED SANS-SERIF DIGIT NINE   ")
      PrintProperties(ChrW(&H2788))

   End Sub

   Public Shared Sub PrintProperties(c As Char)
      Console.Write(" {0,-3}", c)
      Console.Write(" {0,-5}", CharUnicodeInfo.GetNumericValue(c))
      Console.Write(" {0,-5}", CharUnicodeInfo.GetDigitValue(c))
      Console.Write(" {0,-5}", CharUnicodeInfo.GetDecimalDigitValue(c))
      Console.WriteLine("{0}", CharUnicodeInfo.GetUnicodeCategory(c))
   End Sub

End Class


'This code produces the following output.  Some characters might not display at the console.
'
'                                        c  Num   Dig   Dec   UnicodeCategory
'U+0061 LATIN SMALL LETTER A             a   -1    -1    -1   LowercaseLetter
'U+0393 GREEK CAPITAL LETTER GAMMA       \u0393   -1    -1    -1   UppercaseLetter
'U+0039 DIGIT NINE                       9   9     9     9    DecimalDigitNumber
'U+00B2 SUPERSCRIPT TWO                  \u00B2   2     2     2    OtherNumber
'U+00BC VULGAR FRACTION ONE QUARTER      \u00BC   0.25  -1    -1   OtherNumber
'U+0BEF TAMIL DIGIT NINE                 \u0BEF   9     9     9    DecimalDigitNumber
'U+0BF0 TAMIL NUMBER TEN                 \u0BF0   10    -1    -1   OtherNumber
'U+0F33 TIBETAN DIGIT HALF ZERO          \u0F33   -0.5  -1    -1   OtherNumber
'U+2788 CIRCLED SANS-SERIF DIGIT NINE    \u2788   9     9     -1   OtherNumber

Remarks

Unicode characters are divided into categories. A character's category is one of its properties. For example, a character might be an uppercase letter, a lowercase letter, a decimal digit number, a letter number, a connector punctuation, a math symbol, or a currency symbol. The UnicodeCategory enumeration defines the categories of Unicode characters. For more information on Unicode characters, see the Unicode Standard.

The GetUnicodeCategory(Char) method assumes that ch corresponds to a single linguistic character and returns its category. This means that, for either half of a surrogate pair, it returns UnicodeCategory.Surrogate instead of the category of the character that the pair encodes. For example, the Ugaritic alphabet occupies code points U+10380 to U+1039F. The following example uses the ConvertFromUtf32 method to instantiate a string that represents UGARITIC LETTER ALPA (U+10380), which is the first letter of the Ugaritic alphabet. As the output from the example shows, the GetUnicodeCategory(Char) method returns UnicodeCategory.Surrogate if it is passed either the high surrogate or the low surrogate of this character.

int utf32 = 0x10380;       // UGARITIC LETTER ALPA
string surrogate = Char.ConvertFromUtf32(utf32);
foreach (var ch in surrogate)
   Console.WriteLine("U+{0:X4}: {1:G}", 
                     Convert.ToUInt16(ch), 
                     System.Globalization.CharUnicodeInfo.GetUnicodeCategory(ch));
// The example displays the following output:
//       U+D800: Surrogate
//       U+DF80: Surrogate

Dim utf32 As Integer = &h10380       ' UGARITIC LETTER ALPA
Dim surrogate As String = Char.ConvertFromUtf32(utf32)
For Each ch In surrogate
   Console.WriteLine("U+{0:X4}: {1:G}", 
                     Convert.ToUInt16(ch), 
                     System.Globalization.CharUnicodeInfo.GetUnicodeCategory(ch))
Next
' The example displays the following output:
'       U+D800: Surrogate
'       U+DF80: Surrogate

Note that the CharUnicodeInfo.GetUnicodeCategory method does not always return the same UnicodeCategory value as the Char.GetUnicodeCategory method when passed a particular character as a parameter. The CharUnicodeInfo.GetUnicodeCategory method is designed to reflect the current version of the Unicode standard. In contrast, although the Char.GetUnicodeCategory method usually reflects the current version of the Unicode standard, it might return a character's category based on a previous version of the standard, or it might return a category that differs from the current standard to preserve backward compatibility.
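One way to see whether the two methods disagree on a given runtime is to compare them over a range of UTF-16 code units. The following is a minimal sketch, not part of the original sample set; the class name CategoryComparisonDemo and the helper CountDifferences are hypothetical. The count depends on the runtime and its Unicode data, and may well be zero, so no output is shown.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class, introduced only for this illustration.
public static class CategoryComparisonDemo
{
    // Counts the UTF-16 code units in [first, last] for which
    // Char.GetUnicodeCategory and CharUnicodeInfo.GetUnicodeCategory disagree.
    public static int CountDifferences(char first, char last)
    {
        int differences = 0;

        // Loop over int rather than char so that last == '\uFFFF' cannot overflow.
        for (int value = first; value <= last; value++)
        {
            char ch = (char)value;
            if (Char.GetUnicodeCategory(ch) != CharUnicodeInfo.GetUnicodeCategory(ch))
                differences++;
        }
        return differences;
    }

    public static void Main()
    {
        // The result varies by runtime version, so it is printed rather than assumed.
        Console.WriteLine("Differences in U+0000..U+FFFF: {0}",
                          CountDifferences('\u0000', '\uFFFF'));
    }
}
```

If the count is nonzero, the same loop can print each differing code unit with both categories to show where the two methods diverge.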

GetUnicodeCategory(Int32)

Gets the Unicode category of the specified character.

public:
 static System::Globalization::UnicodeCategory GetUnicodeCategory(int codePoint);
public static System.Globalization.UnicodeCategory GetUnicodeCategory (int codePoint);
static member GetUnicodeCategory : int -> System.Globalization.UnicodeCategory
Public Shared Function GetUnicodeCategory (codePoint As Integer) As UnicodeCategory

Parameters

codePoint
Int32

A number representing the 32-bit code point value of the Unicode character.

Returns

A UnicodeCategory value indicating the category of the specified character.
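Because this overload takes a full code point, a supplementary character can be classified without first splitting it into surrogates. The following is a minimal sketch, assuming a runtime that exposes this overload publicly (.NET Core 3.0 or later); the class name CodePointCategoryDemo and the helper Describe are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class, introduced only for this illustration.
public static class CodePointCategoryDemo
{
    // Formats one code point together with the category reported for it.
    public static string Describe(int codePoint)
    {
        UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(codePoint);
        return string.Format("U+{0:X4}: {1}", codePoint, category);
    }

    public static void Main()
    {
        Console.WriteLine(Describe(0x0061));    // LATIN SMALL LETTER A
        Console.WriteLine(Describe(0x0039));    // DIGIT NINE
        Console.WriteLine(Describe(0x10380));   // UGARITIC LETTER ALPA
    }
}
// The example displays the following output:
//       U+0061: LowercaseLetter
//       U+0039: DecimalDigitNumber
//       U+10380: OtherLetter
```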

GetUnicodeCategory(String, Int32)

Gets the Unicode category of the character at the specified index of the specified string.

public:
 static System::Globalization::UnicodeCategory GetUnicodeCategory(System::String ^ s, int index);
public static System.Globalization.UnicodeCategory GetUnicodeCategory (string s, int index);
static member GetUnicodeCategory : string * int -> System.Globalization.UnicodeCategory
Public Shared Function GetUnicodeCategory (s As String, index As Integer) As UnicodeCategory

Parameters

s
String

The String containing the Unicode character for which to get the Unicode category.

index
Int32

The index of the Unicode character for which to get the Unicode category.

Returns

A UnicodeCategory value indicating the category of the character at the specified index of the specified string.

Exceptions

ArgumentNullException

s is null.

ArgumentOutOfRangeException

index is outside the range of valid indexes in s.

Examples

The following code example shows the values returned by the GetNumericValue, GetDigitValue, GetDecimalDigitValue, and GetUnicodeCategory methods for each character in a string.

using namespace System;
using namespace System::Globalization;
int main()
{
   
   // The String to get information for.
   String^ s = "a9\u0393\u00B2\u00BC\u0BEF\u0BF0\u2788";
   Console::WriteLine( "String: {0}", s );
   
   // Print the values for each of the characters in the string.
   Console::WriteLine( "index c  Num   Dig   Dec   UnicodeCategory" );
   for ( int i = 0; i < s->Length; i++ )
   {
      Console::Write( "{0,-5} {1,-3}", i, s[ i ] );
      Console::Write( " {0,-5}", CharUnicodeInfo::GetNumericValue( s, i ) );
      Console::Write( " {0,-5}", CharUnicodeInfo::GetDigitValue( s, i ) );
      Console::Write( " {0,-5}", CharUnicodeInfo::GetDecimalDigitValue( s, i ) );
      Console::WriteLine( "{0}", CharUnicodeInfo::GetUnicodeCategory( s, i ) );

   }
}

/*
This code produces the following output.  Some characters might not display at the console.

String: a9\u0393\u00B2\u00BC\u0BEF\u0BF0\u2788
index c  Num   Dig   Dec   UnicodeCategory
0     a   -1    -1    -1   LowercaseLetter
1     9   9     9     9    DecimalDigitNumber
2     \u0393   -1    -1    -1   UppercaseLetter
3     \u00B2   2     2     2    OtherNumber
4     \u00BC   0.25  -1    -1   OtherNumber
5     \u0BEF   9     9     9    DecimalDigitNumber
6     \u0BF0   10    -1    -1   OtherNumber
7     \u2788   9     9     -1   OtherNumber

*/

using System;
using System.Globalization;

public class SamplesCharUnicodeInfo  {

   public static void Main()  {

      // The String to get information for.
      String s = "a9\u0393\u00B2\u00BC\u0BEF\u0BF0\u2788";
      Console.WriteLine( "String: {0}", s );

      // Print the values for each of the characters in the string.
      Console.WriteLine( "index c  Num   Dig   Dec   UnicodeCategory" );
      for ( int i = 0; i < s.Length; i++ )  {
         Console.Write( "{0,-5} {1,-3}", i, s[i] );
         Console.Write( " {0,-5}", CharUnicodeInfo.GetNumericValue( s, i ) );
         Console.Write( " {0,-5}", CharUnicodeInfo.GetDigitValue( s, i ) );
         Console.Write( " {0,-5}", CharUnicodeInfo.GetDecimalDigitValue( s, i ) );
         Console.WriteLine( "{0}", CharUnicodeInfo.GetUnicodeCategory( s, i ) );
      }

   }

}


/*
This code produces the following output.  Some characters might not display at the console.

String: a9\u0393\u00B2\u00BC\u0BEF\u0BF0\u2788
index c  Num   Dig   Dec   UnicodeCategory
0     a   -1    -1    -1   LowercaseLetter
1     9   9     9     9    DecimalDigitNumber
2     \u0393   -1    -1    -1   UppercaseLetter
3     \u00B2   2     2     2    OtherNumber
4     \u00BC   0.25  -1    -1   OtherNumber
5     \u0BEF   9     9     9    DecimalDigitNumber
6     \u0BF0   10    -1    -1   OtherNumber
7     \u2788   9     9     -1   OtherNumber

*/

Imports System
Imports System.Globalization

Public Class SamplesCharUnicodeInfo   

   Public Shared Sub Main()

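      ' The String to get information for.
      ' Visual Basic string literals do not support \u escapes, so the
      ' non-ASCII characters are built with ChrW.
      Dim s As String = "a9" & ChrW(&H0393) & ChrW(&H00B2) & ChrW(&H00BC) & ChrW(&H0BEF) & ChrW(&H0BF0) & ChrW(&H2788)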
      Console.WriteLine("String: {0}", s)

      ' Print the values for each of the characters in the string.
      Console.WriteLine("index c  Num   Dig   Dec   UnicodeCategory")
      Dim i As Integer
      For i = 0 To s.Length - 1
         Console.Write("{0,-5} {1,-3}", i, s(i))
         Console.Write(" {0,-5}", CharUnicodeInfo.GetNumericValue(s, i))
         Console.Write(" {0,-5}", CharUnicodeInfo.GetDigitValue(s, i))
         Console.Write(" {0,-5}", CharUnicodeInfo.GetDecimalDigitValue(s, i))
         Console.WriteLine("{0}", CharUnicodeInfo.GetUnicodeCategory(s, i))
      Next i

   End Sub

End Class


'This code produces the following output.  Some characters might not display at the console.
'
'String: a9\u0393\u00B2\u00BC\u0BEF\u0BF0\u2788
'index c  Num   Dig   Dec   UnicodeCategory
'0     a   -1    -1    -1   LowercaseLetter
'1     9   9     9     9    DecimalDigitNumber
'2     \u0393   -1    -1    -1   UppercaseLetter
'3     \u00B2   2     2     2    OtherNumber
'4     \u00BC   0.25  -1    -1   OtherNumber
'5     \u0BEF   9     9     9    DecimalDigitNumber
'6     \u0BF0   10    -1    -1   OtherNumber
'7     \u2788   9     9     -1   OtherNumber

Remarks

Unicode characters are divided into categories. A character's category is one of its properties. For example, a character might be an uppercase letter, a lowercase letter, a decimal digit number, a letter number, a connector punctuation, a math symbol, or a currency symbol. The UnicodeCategory enumeration defines the categories of Unicode characters. For more information on Unicode characters, see the Unicode Standard.

If the Char object at position index is the first character of a valid surrogate pair, the GetUnicodeCategory(String, Int32) method returns the Unicode category of the surrogate pair instead of returning UnicodeCategory.Surrogate. For example, the Ugaritic alphabet occupies code points U+10380 to U+1039F. The following example uses the ConvertFromUtf32 method to instantiate a string that represents UGARITIC LETTER ALPA (U+10380), which is the first letter of the Ugaritic alphabet. As the output from the example shows, the GetUnicodeCategory(String, Int32) method returns UnicodeCategory.OtherLetter if it is passed the index of the high surrogate of this character, which indicates that it considers the surrogate pair. However, if it is passed the index of the low surrogate, it considers only the low surrogate in isolation and returns UnicodeCategory.Surrogate.

int utf32 = 0x10380;       // UGARITIC LETTER ALPA
string surrogate = Char.ConvertFromUtf32(utf32);
for (int ctr = 0; ctr < surrogate.Length; ctr++)
   Console.WriteLine("U+{0:X4}: {1:G}", 
                     Convert.ToUInt16(surrogate[ctr]), 
                     System.Globalization.CharUnicodeInfo.GetUnicodeCategory(surrogate, ctr));
// The example displays the following output:
//       U+D800: OtherLetter
//       U+DF80: Surrogate

Dim utf32 As Integer = &h10380       ' UGARITIC LETTER ALPA
Dim surrogate As String = Char.ConvertFromUtf32(utf32)
For ctr As Integer = 0 To surrogate.Length - 1
   Console.WriteLine("U+{0:X4}: {1:G}", 
                     Convert.ToUInt16(surrogate(ctr)), 
                     System.Globalization.CharUnicodeInfo.GetUnicodeCategory(surrogate, ctr))
Next
' The example displays the following output:
'       U+D800: OtherLetter
'       U+DF80: Surrogate      
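
Since the low surrogate of a valid pair is reported as UnicodeCategory.Surrogate when indexed on its own, code that wants one category per character can step over it. The following is a minimal sketch of that approach using Char.IsSurrogatePair(String, Int32); the class name StringCategoryDemo and the helper CategoriesByCodePoint are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper class, introduced only for this illustration.
public static class StringCategoryDemo
{
    // Returns one category per code point in s. The low surrogate of each
    // valid pair is skipped so that it is not reported separately as Surrogate.
    public static List<UnicodeCategory> CategoriesByCodePoint(string s)
    {
        var result = new List<UnicodeCategory>();
        for (int i = 0; i < s.Length; i++)
        {
            // At the index of a high surrogate, this returns the pair's category.
            result.Add(CharUnicodeInfo.GetUnicodeCategory(s, i));

            if (Char.IsSurrogatePair(s, i))
                i++;   // Step over the low surrogate.
        }
        return result;
    }

    public static void Main()
    {
        // "a", UGARITIC LETTER ALPA (two UTF-16 code units), and "9".
        string s = "a" + Char.ConvertFromUtf32(0x10380) + "9";
        foreach (UnicodeCategory category in CategoriesByCodePoint(s))
            Console.WriteLine(category);
    }
}
// The example displays the following output:
//       LowercaseLetter
//       OtherLetter
//       DecimalDigitNumber
```

An unpaired surrogate is still reported as Surrogate, because IsSurrogatePair returns false for it and nothing is skipped.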

Note that the CharUnicodeInfo.GetUnicodeCategory method does not always return the same UnicodeCategory value as the Char.GetUnicodeCategory method when passed a particular character as a parameter. The CharUnicodeInfo.GetUnicodeCategory method is designed to reflect the current version of the Unicode standard. In contrast, although the Char.GetUnicodeCategory method usually reflects the current version of the Unicode standard, it might return a character's category based on a previous version of the standard, or it might return a category that differs from the current standard to preserve backward compatibility.
