CharUnicodeInfo.GetUnicodeCategory Method

Definition

Namespace: System.Globalization

Assembly: System.Globalization.dll

Assembly: System.Runtime.dll

Assembly: mscorlib.dll

Assembly: netstandard.dll

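Important

Some information relates to prerelease product that may be substantially modified before it is released. Microsoft makes no warranties, express or implied, with respect to the information provided here.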

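Gets the Unicode category of a Unicode character.

Overloads

GetUnicodeCategory(Char)	Gets the Unicode category of the specified character.
GetUnicodeCategory(Int32)	Gets the Unicode category of the character with the specified code point.
GetUnicodeCategory(String, Int32)	Gets the Unicode category of the character at the specified index of the specified string.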

GetUnicodeCategory(Char)

Source: CharUnicodeInfo.cs

Gets the Unicode category of the specified character.

public:
 static System::Globalization::UnicodeCategory GetUnicodeCategory(char ch);

public static System.Globalization.UnicodeCategory GetUnicodeCategory (char ch);

static member GetUnicodeCategory : char -> System.Globalization.UnicodeCategory

Public Shared Function GetUnicodeCategory (ch As Char) As UnicodeCategory

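Parameters

ch: Char

The Unicode character for which to get the Unicode category.

Returns

UnicodeCategory

A UnicodeCategory value indicating the category of the specified character.

Examples

The following code example shows the values returned by the GetNumericValue, GetDigitValue, GetDecimalDigitValue, and GetUnicodeCategory methods for different types of characters.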

using namespace System;
using namespace System::Globalization;
void PrintProperties( Char c );
int main()
{
   Console::WriteLine( "                                        c  Num   Dig   Dec   UnicodeCategory" );
   Console::Write( "U+0061 LATIN SMALL LETTER A            " );
   PrintProperties( L'a' );
   Console::Write( "U+0393 GREEK CAPITAL LETTER GAMMA      " );
   PrintProperties( L'\u0393' );
   Console::Write( "U+0039 DIGIT NINE                      " );
   PrintProperties( L'9' );
   Console::Write( "U+00B2 SUPERSCRIPT TWO                 " );
   PrintProperties( L'\u00B2' );
   Console::Write( "U+00BC VULGAR FRACTION ONE QUARTER     " );
   PrintProperties( L'\u00BC' );
   Console::Write( "U+0BEF TAMIL DIGIT NINE                " );
   PrintProperties( L'\u0BEF' );
   Console::Write( "U+0BF0 TAMIL NUMBER TEN                " );
   PrintProperties( L'\u0BF0' );
   Console::Write( "U+0F33 TIBETAN DIGIT HALF ZERO         " );
   PrintProperties( L'\u0F33' );
   Console::Write( "U+2788 CIRCLED SANS-SERIF DIGIT NINE   " );
   PrintProperties( L'\u2788' );
}

void PrintProperties( Char c )
{
   Console::Write( " {0,-3}", c );
   Console::Write( " {0,-5}", CharUnicodeInfo::GetNumericValue( c ) );
   Console::Write( " {0,-5}", CharUnicodeInfo::GetDigitValue( c ) );
   Console::Write( " {0,-5}", CharUnicodeInfo::GetDecimalDigitValue( c ) );
   Console::WriteLine( "{0}", CharUnicodeInfo::GetUnicodeCategory( c ) );
}

/*
This code produces the following output.  Some characters might not display at the console.

                                        c  Num   Dig   Dec   UnicodeCategory
U+0061 LATIN SMALL LETTER A             a   -1    -1    -1   LowercaseLetter
U+0393 GREEK CAPITAL LETTER GAMMA       Γ   -1    -1    -1   UppercaseLetter
U+0039 DIGIT NINE                       9   9     9     9    DecimalDigitNumber
U+00B2 SUPERSCRIPT TWO                  ²   2     2     -1   OtherNumber
U+00BC VULGAR FRACTION ONE QUARTER      ¼   0.25  -1    -1   OtherNumber
U+0BEF TAMIL DIGIT NINE                 ௯   9     9     9    DecimalDigitNumber
U+0BF0 TAMIL NUMBER TEN                 ௰   10    -1    -1   OtherNumber
U+0F33 TIBETAN DIGIT HALF ZERO          ༳   -0.5  -1    -1   OtherNumber
U+2788 CIRCLED SANS-SERIF DIGIT NINE    ➈   9     9     -1   OtherNumber

*/

using System;
using System.Globalization;

public class SamplesCharUnicodeInfo  {

   public static void Main()  {

      Console.WriteLine( "                                        c  Num   Dig   Dec   UnicodeCategory" );

      Console.Write( "U+0061 LATIN SMALL LETTER A            " );
      PrintProperties( 'a' );

      Console.Write( "U+0393 GREEK CAPITAL LETTER GAMMA      " );
      PrintProperties( '\u0393' );

      Console.Write( "U+0039 DIGIT NINE                      " );
      PrintProperties( '9' );

      Console.Write( "U+00B2 SUPERSCRIPT TWO                 " );
      PrintProperties( '\u00B2' );

      Console.Write( "U+00BC VULGAR FRACTION ONE QUARTER     " );
      PrintProperties( '\u00BC' );

      Console.Write( "U+0BEF TAMIL DIGIT NINE                " );
      PrintProperties( '\u0BEF' );

      Console.Write( "U+0BF0 TAMIL NUMBER TEN                " );
      PrintProperties( '\u0BF0' );

      Console.Write( "U+0F33 TIBETAN DIGIT HALF ZERO         " );
      PrintProperties( '\u0F33' );

      Console.Write( "U+2788 CIRCLED SANS-SERIF DIGIT NINE   " );
      PrintProperties( '\u2788' );
   }

   public static void PrintProperties( char c )  {
      Console.Write( " {0,-3}", c );
      Console.Write( " {0,-5}", CharUnicodeInfo.GetNumericValue( c ) );
      Console.Write( " {0,-5}", CharUnicodeInfo.GetDigitValue( c ) );
      Console.Write( " {0,-5}", CharUnicodeInfo.GetDecimalDigitValue( c ) );
      Console.WriteLine( "{0}", CharUnicodeInfo.GetUnicodeCategory( c ) );
   }
}


/*
This code produces the following output.  Some characters might not display at the console.

                                        c  Num   Dig   Dec   UnicodeCategory
U+0061 LATIN SMALL LETTER A             a   -1    -1    -1   LowercaseLetter
U+0393 GREEK CAPITAL LETTER GAMMA       Γ   -1    -1    -1   UppercaseLetter
U+0039 DIGIT NINE                       9   9     9     9    DecimalDigitNumber
U+00B2 SUPERSCRIPT TWO                  ²   2     2     -1   OtherNumber
U+00BC VULGAR FRACTION ONE QUARTER      ¼   0.25  -1    -1   OtherNumber
U+0BEF TAMIL DIGIT NINE                 ௯   9     9     9    DecimalDigitNumber
U+0BF0 TAMIL NUMBER TEN                 ௰   10    -1    -1   OtherNumber
U+0F33 TIBETAN DIGIT HALF ZERO          ༳   -0.5  -1    -1   OtherNumber
U+2788 CIRCLED SANS-SERIF DIGIT NINE    ➈   9     9     -1   OtherNumber

*/

Imports System.Globalization

Public Class SamplesCharUnicodeInfo

   Public Shared Sub Main()

      Console.WriteLine("                                        c  Num   Dig   Dec   UnicodeCategory")

      Console.Write("U+0061 LATIN SMALL LETTER A            ")
      PrintProperties("a"c)

      Console.Write("U+0393 GREEK CAPITAL LETTER GAMMA      ")
      PrintProperties(ChrW(&H0393))

      Console.Write("U+0039 DIGIT NINE                      ")
      PrintProperties("9"c)

      Console.Write("U+00B2 SUPERSCRIPT TWO                 ")
      PrintProperties(ChrW(&H00B2))

      Console.Write("U+00BC VULGAR FRACTION ONE QUARTER     ")
      PrintProperties(ChrW(&H00BC))

      Console.Write("U+0BEF TAMIL DIGIT NINE                ")
      PrintProperties(ChrW(&H0BEF))

      Console.Write("U+0BF0 TAMIL NUMBER TEN                ")
      PrintProperties(ChrW(&H0BF0))

      Console.Write("U+0F33 TIBETAN DIGIT HALF ZERO         ")
      PrintProperties(ChrW(&H0F33))

      Console.Write("U+2788 CIRCLED SANS-SERIF DIGIT NINE   ")
      PrintProperties(ChrW(&H2788))

   End Sub

   Public Shared Sub PrintProperties(c As Char)
      Console.Write(" {0,-3}", c)
      Console.Write(" {0,-5}", CharUnicodeInfo.GetNumericValue(c))
      Console.Write(" {0,-5}", CharUnicodeInfo.GetDigitValue(c))
      Console.Write(" {0,-5}", CharUnicodeInfo.GetDecimalDigitValue(c))
      Console.WriteLine("{0}", CharUnicodeInfo.GetUnicodeCategory(c))
   End Sub

End Class


'This code produces the following output.  Some characters might not display at the console.
'
'                                        c  Num   Dig   Dec   UnicodeCategory
'U+0061 LATIN SMALL LETTER A             a   -1    -1    -1   LowercaseLetter
'U+0393 GREEK CAPITAL LETTER GAMMA       Γ   -1    -1    -1   UppercaseLetter
'U+0039 DIGIT NINE                       9   9     9     9    DecimalDigitNumber
'U+00B2 SUPERSCRIPT TWO                  ²   2     2     -1   OtherNumber
'U+00BC VULGAR FRACTION ONE QUARTER      ¼   0.25  -1    -1   OtherNumber
'U+0BEF TAMIL DIGIT NINE                 ௯   9     9     9    DecimalDigitNumber
'U+0BF0 TAMIL NUMBER TEN                 ௰   10    -1    -1   OtherNumber
'U+0F33 TIBETAN DIGIT HALF ZERO          ༳   -0.5  -1    -1   OtherNumber
'U+2788 CIRCLED SANS-SERIF DIGIT NINE    ➈   9     9     -1   OtherNumber

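Remarks

Unicode characters are divided into categories. A character's category is one of its properties. For example, a character might be an uppercase letter, a lowercase letter, a decimal digit number, a letter number, a connector punctuation mark, a math symbol, or a currency symbol. The UnicodeCategory enumeration defines the categories that a Unicode character can belong to. For more information on Unicode characters, see the Unicode Standard.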
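
As a minimal sketch of the categories listed above, the following C# snippet passes one representative character per category to GetUnicodeCategory(Char). The class name CategorySamples and the helper Describe are hypothetical names introduced only for this illustration; the sample characters were chosen by the editor, not taken from the original page.

```csharp
using System;
using System.Globalization;

public static class CategorySamples
{
    // Hypothetical helper: returns the category name for a single UTF-16 code unit.
    public static string Describe(char c) =>
        CharUnicodeInfo.GetUnicodeCategory(c).ToString();

    public static void Main()
    {
        // One character per category named in the paragraph above:
        // uppercase letter, lowercase letter, decimal digit number,
        // letter number (U+2167 ROMAN NUMERAL EIGHT), connector punctuation,
        // math symbol, currency symbol.
        char[] samples = { 'A', 'a', '7', '\u2167', '_', '+', '$' };
        foreach (char c in samples)
            Console.WriteLine($"U+{(int)c:X4}: {Describe(c)}");
    }
}
// The example displays the following output:
//       U+0041: UppercaseLetter
//       U+0061: LowercaseLetter
//       U+0037: DecimalDigitNumber
//       U+2167: LetterNumber
//       U+005F: ConnectorPunctuation
//       U+002B: MathSymbol
//       U+0024: CurrencySymbol
```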

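The GetUnicodeCategory(Char) method assumes that ch corresponds to a single linguistic character and returns its category. This means that, for each half of a surrogate pair, it returns UnicodeCategory.Surrogate instead of the category of the character that the pair encodes. For example, the Ugaritic alphabet occupies code points U+10380 to U+1039F. The following example uses the ConvertFromUtf32 method to instantiate a string that represents UGARITIC LETTER ALPA (U+10380), which is the first letter of the Ugaritic alphabet. As the output from the example shows, the GetUnicodeCategory(Char) method returns UnicodeCategory.Surrogate whether it is passed the high surrogate or the low surrogate of this character.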

int utf32 = 0x10380;       // UGARITIC LETTER ALPA
string surrogate = Char.ConvertFromUtf32(utf32);
foreach (var ch in surrogate)
    Console.WriteLine($"U+{(ushort)ch:X4}: {System.Globalization.CharUnicodeInfo.GetUnicodeCategory(ch):G}");
// The example displays the following output:
//       U+D800: Surrogate
//       U+DF80: Surrogate

Dim utf32 As Integer = &h10380       ' UGARITIC LETTER ALPA
Dim surrogate As String = Char.ConvertFromUtf32(utf32)
For Each ch In surrogate
   Console.WriteLine("U+{0:X4}: {1:G}", 
                     Convert.ToUInt16(ch), 
                     System.Globalization.CharUnicodeInfo.GetUnicodeCategory(ch))
Next
' The example displays the following output:
'       U+D800: Surrogate
'       U+DF80: Surrogate
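
When only the two surrogate Char values are at hand, one possible way to obtain the category of the encoded character is to combine them into a code point first. The sketch below assumes a runtime that provides the GetUnicodeCategory(Int32) overload documented on this page; the class name SurrogatePairCategory and the helper FromPair are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;

public static class SurrogatePairCategory
{
    // Hypothetical helper: combines a high and a low surrogate into one code point
    // and returns the category of that code point.
    public static UnicodeCategory FromPair(char high, char low)
    {
        // Char.ConvertToUtf32 throws ArgumentOutOfRangeException
        // if the two values do not form a valid surrogate pair.
        int codePoint = char.ConvertToUtf32(high, low);
        return CharUnicodeInfo.GetUnicodeCategory(codePoint);
    }

    public static void Main()
    {
        string alpa = char.ConvertFromUtf32(0x10380);   // UGARITIC LETTER ALPA
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory(alpa[0]));
        Console.WriteLine(FromPair(alpa[0], alpa[1]));
    }
}
// The example displays the following output:
//       Surrogate
//       OtherLetter
```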

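Note that the CharUnicodeInfo.GetUnicodeCategory method does not always return the same UnicodeCategory value as the Char.GetUnicodeCategory method when passed the same character. The CharUnicodeInfo.GetUnicodeCategory method is designed to reflect the current version of the Unicode standard. In contrast, although the Char.GetUnicodeCategory method usually reflects the current version of the Unicode standard, it might return a character's category based on a previous version of the standard, or it might return a category that differs from the current standard to preserve backward compatibility.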
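
Whether the two methods actually disagree depends on the runtime in use. The following sketch is one way to check this on a given installation: it compares both methods over every UTF-16 code unit and reports how many differ. No expected output is shown because the result may vary by runtime version. The class name CategoryComparison and the helper CountMismatches are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Globalization;

public static class CategoryComparison
{
    // Hypothetical helper: counts the UTF-16 code units for which
    // Char.GetUnicodeCategory and CharUnicodeInfo.GetUnicodeCategory disagree
    // on the runtime that executes this code.
    public static int CountMismatches()
    {
        int mismatches = 0;
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            if (char.GetUnicodeCategory(c) != CharUnicodeInfo.GetUnicodeCategory(c))
                mismatches++;
        }
        return mismatches;
    }

    public static void Main()
    {
        Console.WriteLine($"Code units with differing categories: {CountMismatches()}");
    }
}
```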

See also

UnicodeCategory

GetUnicodeCategory(Int32)

Source: CharUnicodeInfo.cs

Gets the Unicode category of the character with the specified code point.

public:
 static System::Globalization::UnicodeCategory GetUnicodeCategory(int codePoint);

public static System.Globalization.UnicodeCategory GetUnicodeCategory (int codePoint);

static member GetUnicodeCategory : int -> System.Globalization.UnicodeCategory

Public Shared Function GetUnicodeCategory (codePoint As Integer) As UnicodeCategory

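Parameters

codePoint: Int32

A number representing the 32-bit code point value of the Unicode character.

Returns

UnicodeCategory

A UnicodeCategory value indicating the category of the specified character.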
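
The original page gives no example for this overload. As a hedged sketch, the snippet below passes three code points to GetUnicodeCategory(Int32), including one outside the Basic Multilingual Plane that cannot be represented by a single Char. The class name CodePointCategory and the helper Format are hypothetical names, and the code points were chosen by the editor.

```csharp
using System;
using System.Globalization;

public static class CodePointCategory
{
    // Hypothetical helper: formats a code point together with its category.
    public static string Format(int codePoint) =>
        $"U+{codePoint:X4}: {CharUnicodeInfo.GetUnicodeCategory(codePoint)}";

    public static void Main()
    {
        // LATIN CAPITAL LETTER A, EURO SIGN, UGARITIC LETTER ALPA
        int[] codePoints = { 0x0041, 0x20AC, 0x10380 };
        foreach (int cp in codePoints)
            Console.WriteLine(Format(cp));
    }
}
// The example displays the following output:
//       U+0041: UppercaseLetter
//       U+20AC: CurrencySymbol
//       U+10380: OtherLetter
```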

GetUnicodeCategory(String, Int32)

Source: CharUnicodeInfo.cs

Gets the Unicode category of the character at the specified index of the specified string.

public:
 static System::Globalization::UnicodeCategory GetUnicodeCategory(System::String ^ s, int index);

public static System.Globalization.UnicodeCategory GetUnicodeCategory (string s, int index);

static member GetUnicodeCategory : string * int -> System.Globalization.UnicodeCategory

Public Shared Function GetUnicodeCategory (s As String, index As Integer) As UnicodeCategory

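Parameters

s: String

The String containing the Unicode character for which to get the Unicode category.

index: Int32

The index of the Unicode character for which to get the Unicode category.

Returns

UnicodeCategory

A UnicodeCategory value indicating the category of the character at the specified index of the specified string.

Exceptions

ArgumentNullException

s is null.

ArgumentOutOfRangeException

index is outside the range of valid indexes in s.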
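
The two exceptions above can be observed with a small probe. This is only an illustrative sketch: the class name CategoryGuards and the helper Probe are hypothetical, and catching the exceptions is done here purely to display their type names, not as a recommended control-flow pattern.

```csharp
using System;
using System.Globalization;

public static class CategoryGuards
{
    // Hypothetical helper: returns the category name, or the name of the
    // argument exception that GetUnicodeCategory(String, Int32) throws.
    public static string Probe(string s, int index)
    {
        try
        {
            return CharUnicodeInfo.GetUnicodeCategory(s, index).ToString();
        }
        catch (ArgumentException ex)
        {
            // ArgumentNullException and ArgumentOutOfRangeException
            // both derive from ArgumentException.
            return ex.GetType().Name;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Probe("a9", 1));
        Console.WriteLine(Probe("a9", 2));   // index equals s.Length
        Console.WriteLine(Probe("a9", -1));  // negative index
        Console.WriteLine(Probe(null, 0));   // null string
    }
}
// The example displays the following output:
//       DecimalDigitNumber
//       ArgumentOutOfRangeException
//       ArgumentOutOfRangeException
//       ArgumentNullException
```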

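Examples

The following code example shows the values returned by the GetNumericValue, GetDigitValue, GetDecimalDigitValue, and GetUnicodeCategory methods for different types of characters.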

using namespace System;
using namespace System::Globalization;
int main()
{

   // The String to get information for.
   String^ s = "a9\u0393\u00B2\u00BC\u0BEF\u0BF0\u2788";
   Console::WriteLine( "String: {0}", s );

   // Print the values for each of the characters in the string.
   Console::WriteLine( "index c  Num   Dig   Dec   UnicodeCategory" );
   for ( int i = 0; i < s->Length; i++ )
   {
      Console::Write( "{0,-5} {1,-3}", i, s[ i ] );
      Console::Write( " {0,-5}", CharUnicodeInfo::GetNumericValue( s, i ) );
      Console::Write( " {0,-5}", CharUnicodeInfo::GetDigitValue( s, i ) );
      Console::Write( " {0,-5}", CharUnicodeInfo::GetDecimalDigitValue( s, i ) );
      Console::WriteLine( "{0}", CharUnicodeInfo::GetUnicodeCategory( s, i ) );

   }
}

/*
This code produces the following output.  Some characters might not display at the console.

String: a9Γ²¼௯௰➈
index c  Num   Dig   Dec   UnicodeCategory
0     a   -1    -1    -1   LowercaseLetter
1     9   9     9     9    DecimalDigitNumber
2     Γ   -1    -1    -1   UppercaseLetter
3     ²   2     2     -1   OtherNumber
4     ¼   0.25  -1    -1   OtherNumber
5     ௯   9     9     9    DecimalDigitNumber
6     ௰   10    -1    -1   OtherNumber
7     ➈   9     9     -1   OtherNumber

*/

using System;
using System.Globalization;

public class SamplesCharUnicodeInfo  {

   public static void Main()  {

      // The String to get information for.
      String s = "a9\u0393\u00B2\u00BC\u0BEF\u0BF0\u2788";
      Console.WriteLine( "String: {0}", s );

      // Print the values for each of the characters in the string.
      Console.WriteLine( "index c  Num   Dig   Dec   UnicodeCategory" );
      for ( int i = 0; i < s.Length; i++ )  {
         Console.Write( "{0,-5} {1,-3}", i, s[i] );
         Console.Write( " {0,-5}", CharUnicodeInfo.GetNumericValue( s, i ) );
         Console.Write( " {0,-5}", CharUnicodeInfo.GetDigitValue( s, i ) );
         Console.Write( " {0,-5}", CharUnicodeInfo.GetDecimalDigitValue( s, i ) );
         Console.WriteLine( "{0}", CharUnicodeInfo.GetUnicodeCategory( s, i ) );
      }
   }
}


/*
This code produces the following output.  Some characters might not display at the console.

String: a9Γ²¼௯௰➈
index c  Num   Dig   Dec   UnicodeCategory
0     a   -1    -1    -1   LowercaseLetter
1     9   9     9     9    DecimalDigitNumber
2     Γ   -1    -1    -1   UppercaseLetter
3     ²   2     2     -1   OtherNumber
4     ¼   0.25  -1    -1   OtherNumber
5     ௯   9     9     9    DecimalDigitNumber
6     ௰   10    -1    -1   OtherNumber
7     ➈   9     9     -1   OtherNumber

*/

Imports System.Globalization

Public Class SamplesCharUnicodeInfo

   Public Shared Sub Main()

      ' The String to get information for.
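      ' Visual Basic string literals do not support \u escapes, so build the string with ChrW.
      Dim s As String = "a9" & ChrW(&H0393) & ChrW(&H00B2) & ChrW(&H00BC) & ChrW(&H0BEF) & ChrW(&H0BF0) & ChrW(&H2788)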
      Console.WriteLine("String: {0}", s)

      ' Print the values for each of the characters in the string.
      Console.WriteLine("index c  Num   Dig   Dec   UnicodeCategory")
      Dim i As Integer
      For i = 0 To s.Length - 1
         Console.Write("{0,-5} {1,-3}", i, s(i))
         Console.Write(" {0,-5}", CharUnicodeInfo.GetNumericValue(s, i))
         Console.Write(" {0,-5}", CharUnicodeInfo.GetDigitValue(s, i))
         Console.Write(" {0,-5}", CharUnicodeInfo.GetDecimalDigitValue(s, i))
         Console.WriteLine("{0}", CharUnicodeInfo.GetUnicodeCategory(s, i))
      Next i

   End Sub

End Class


'This code produces the following output.  Some characters might not display at the console.
'
'String: a9Γ²¼௯௰➈
'index c  Num   Dig   Dec   UnicodeCategory
'0     a   -1    -1    -1   LowercaseLetter
'1     9   9     9     9    DecimalDigitNumber
'2     Γ   -1    -1    -1   UppercaseLetter
'3     ²   2     2     -1   OtherNumber
'4     ¼   0.25  -1    -1   OtherNumber
'5     ௯   9     9     9    DecimalDigitNumber
'6     ௰   10    -1    -1   OtherNumber
'7     ➈   9     9     -1   OtherNumber

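Remarks

If the Char object at position index is the first character of a valid surrogate pair, the GetUnicodeCategory(String, Int32) method returns the Unicode category of the surrogate pair instead of returning UnicodeCategory.Surrogate. For example, the Ugaritic alphabet occupies code points U+10380 to U+1039F. The following example uses the ConvertFromUtf32 method to instantiate a string that represents UGARITIC LETTER ALPA (U+10380), which is the first letter of the Ugaritic alphabet. As the output from the example shows, the GetUnicodeCategory(String, Int32) method returns UnicodeCategory.OtherLetter when index points at the high surrogate of this character, which indicates that it considers the surrogate pair as a whole. However, when index points at the low surrogate, the method considers only the low surrogate in isolation and returns UnicodeCategory.Surrogate.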

int utf32 = 0x10380;       // UGARITIC LETTER ALPA
string surrogate = Char.ConvertFromUtf32(utf32);
for (int ctr = 0; ctr < surrogate.Length; ctr++)
    Console.WriteLine($"U+{(ushort)surrogate[ctr]:X4}: {System.Globalization.CharUnicodeInfo.GetUnicodeCategory(surrogate, ctr):G}");
// The example displays the following output:
//       U+D800: OtherLetter
//       U+DF80: Surrogate

Dim utf32 As Integer = &h10380       ' UGARITIC LETTER ALPA
Dim surrogate As String = Char.ConvertFromUtf32(utf32)
For ctr As Integer = 0 To surrogate.Length - 1
   Console.WriteLine("U+{0:X4}: {1:G}", 
                     Convert.ToUInt16(surrogate(ctr)), 
                     System.Globalization.CharUnicodeInfo.GetUnicodeCategory(surrogate, ctr))
Next
' The example displays the following output:
'       U+D800: OtherLetter
'       U+DF80: Surrogate
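
Because the low surrogate on its own is reported as Surrogate, a loop that wants one category per code point needs to skip it. The following is a minimal sketch of one way to do that with Char.IsSurrogatePair(String, Int32); the class name CodePointWalker and the method Categories are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CodePointWalker
{
    // Hypothetical helper: yields one category per code point,
    // skipping the low surrogate of each valid surrogate pair.
    public static IEnumerable<UnicodeCategory> Categories(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            yield return CharUnicodeInfo.GetUnicodeCategory(s, i);
            if (char.IsSurrogatePair(s, i))
                i++;   // the pair was already classified as a whole
        }
    }

    public static void Main()
    {
        // 'a', UGARITIC LETTER ALPA (two UTF-16 code units), '9'
        string text = "a" + char.ConvertFromUtf32(0x10380) + "9";
        Console.WriteLine(string.Join(", ", Categories(text)));
    }
}
// The example displays the following output:
//       LowercaseLetter, OtherLetter, DecimalDigitNumber
```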

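Note that the CharUnicodeInfo.GetUnicodeCategory method does not always return the same UnicodeCategory value as the Char.GetUnicodeCategory method when passed the same character. The CharUnicodeInfo.GetUnicodeCategory method is designed to reflect the current version of the Unicode standard. In contrast, although the Char.GetUnicodeCategory method usually reflects the current version of the Unicode standard, it might return a character's category based on a previous version of the standard, or it might return a category that differs from the current standard to preserve backward compatibility.

See also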

UnicodeCategory
