UnicodeCategory Enum

Definition

Defines the Unicode category of a character.

public enum class UnicodeCategory
public enum UnicodeCategory
[System.Serializable]
public enum UnicodeCategory
[System.Runtime.InteropServices.ComVisible(true)]
[System.Serializable]
public enum UnicodeCategory
type UnicodeCategory = 
Public Enum UnicodeCategory
Inheritance
UnicodeCategory
Attributes

Fields

ClosePunctuation 21

Closing character of one of the paired punctuation marks, such as parentheses, square brackets, and braces. Signified by the Unicode designation "Pe" (punctuation, close). The value is 21.

ConnectorPunctuation 18

Connector punctuation character that connects two characters. Signified by the Unicode designation "Pc" (punctuation, connector). The value is 18.

Control 14

Control code character, with a Unicode value of U+007F or in the range U+0000 through U+001F or U+0080 through U+009F. Signified by the Unicode designation "Cc" (other, control). The value is 14.

CurrencySymbol 26

Currency symbol character. Signified by the Unicode designation "Sc" (symbol, currency). The value is 26.

DashPunctuation 19

Dash or hyphen character. Signified by the Unicode designation "Pd" (punctuation, dash). The value is 19.

DecimalDigitNumber 8

Decimal digit character, that is, a character representing an integer in the range 0 through 9. Signified by the Unicode designation "Nd" (number, decimal digit). The value is 8.

EnclosingMark 7

Enclosing mark character, which is a nonspacing combining character that surrounds all previous characters up to and including a base character. Signified by the Unicode designation "Me" (mark, enclosing). The value is 7.

FinalQuotePunctuation 23

Closing or final quotation mark character. Signified by the Unicode designation "Pf" (punctuation, final quote). The value is 23.

Format 15

Format character that affects the layout of text or the operation of text processes, but is not normally rendered. Signified by the Unicode designation "Cf" (other, format). The value is 15.

InitialQuotePunctuation 22

Opening or initial quotation mark character. Signified by the Unicode designation "Pi" (punctuation, initial quote). The value is 22.

LetterNumber 9

Number represented by a letter instead of a decimal digit, for example, U+2164 (Roman numeral five), which resembles the letter "V". Signified by the Unicode designation "Nl" (number, letter). The value is 9.

LineSeparator 12

Character that is used to separate lines of text. Signified by the Unicode designation "Zl" (separator, line). The value is 12.

LowercaseLetter 1

Lowercase letter. Signified by the Unicode designation "Ll" (letter, lowercase). The value is 1.

MathSymbol 25

Mathematical symbol character, such as "+" or "=". Signified by the Unicode designation "Sm" (symbol, math). The value is 25.

ModifierLetter 3

Modifier letter character, which is a free-standing spacing character that indicates modifications of a preceding letter. Signified by the Unicode designation "Lm" (letter, modifier). The value is 3.

ModifierSymbol 27

Modifier symbol character, which indicates modifications of surrounding characters. For example, the free-standing circumflex accent (U+005E) and grave accent (U+0060) belong to this category. Signified by the Unicode designation "Sk" (symbol, modifier). The value is 27.

NonSpacingMark 5

Nonspacing character that indicates modifications of a base character. Signified by the Unicode designation "Mn" (mark, nonspacing). The value is 5.

OpenPunctuation 20

Opening character of one of the paired punctuation marks, such as parentheses, square brackets, and braces. Signified by the Unicode designation "Ps" (punctuation, open). The value is 20.

OtherLetter 4

Letter that is not an uppercase letter, a lowercase letter, a titlecase letter, or a modifier letter. Signified by the Unicode designation "Lo" (letter, other). The value is 4.

OtherNotAssigned 29

Character that is not assigned to any Unicode category. Signified by the Unicode designation "Cn" (other, not assigned). The value is 29.

OtherNumber 10

Number that is neither a decimal digit nor a letter number, for example, the fraction 1/2 (U+00BD). Signified by the Unicode designation "No" (number, other). The value is 10.

OtherPunctuation 24

Punctuation character that is not a connector, a dash, open punctuation, close punctuation, an initial quote, or a final quote. Signified by the Unicode designation "Po" (punctuation, other). The value is 24.

OtherSymbol 28

Symbol character that is not a mathematical symbol, a currency symbol, or a modifier symbol. Signified by the Unicode designation "So" (symbol, other). The value is 28.

ParagraphSeparator 13

Character used to separate paragraphs. Signified by the Unicode designation "Zp" (separator, paragraph). The value is 13.

PrivateUse 17

Private-use character, with a Unicode value in the range U+E000 through U+F8FF. Signified by the Unicode designation "Co" (other, private use). The value is 17.

SpaceSeparator 11

Space character, which has no glyph but is not a control or format character. Signified by the Unicode designation "Zs" (separator, space). The value is 11.

SpacingCombiningMark 6

Spacing character that indicates modifications of a base character and affects the width of the glyph for that base character. Signified by the Unicode designation "Mc" (mark, spacing combining). The value is 6.

Surrogate 16

High surrogate or low surrogate character. Surrogate code values are in the range U+D800 through U+DFFF. Signified by the Unicode designation "Cs" (other, surrogate). The value is 16.

TitlecaseLetter 2

Titlecase letter. Signified by the Unicode designation "Lt" (letter, titlecase). The value is 2.

UppercaseLetter 0

Uppercase letter. Signified by the Unicode designation "Lu" (letter, uppercase). The value is 0.

Examples

The following example displays the characters in the UppercaseLetter category together with their code points. To display the characters in any other category, replace UppercaseLetter with the category of interest in the assignment to the category variable. The example examines only the code units U+0000 through U+FFFE, so characters outside the Basic Multilingual Plane are not listed. Note that the output for some categories can be extensive.

using System;
using System.Globalization;

public class Example
{
   public static void Main()
   {
      int ctr = 0;   // Number of matching characters found so far.
      // Change this assignment to list the characters in another category.
      UnicodeCategory category = UnicodeCategory.UppercaseLetter;

      // Examine each UTF-16 code unit from U+0000 through U+FFFE.
      for (ushort codePoint = 0; codePoint < ushort.MaxValue; codePoint++) {
         Char ch = Convert.ToChar(codePoint);

         if (CharUnicodeInfo.GetUnicodeCategory(ch) == category) {
            // Start a new output line before every group of five characters.
            if (ctr % 5 == 0)
               Console.WriteLine();
            Console.Write("{0} (U+{1:X4})     ", ch, codePoint);
            ctr++;
         }
      }
      Console.WriteLine();
      Console.WriteLine("\n{0} characters are in the {1:G} category",
                        ctr, category);
   }
}

Imports System.Globalization

Module Example
   Public Sub Main()
      Dim ctr As Integer = 0   ' Number of matching characters found so far.
      ' Change this assignment to list the characters in another category.
      Dim category As UnicodeCategory = UnicodeCategory.UppercaseLetter
      
      ' Examine each UTF-16 code unit from U+0000 through U+FFFE.
      For codePoint As UShort = 0 To UShort.MaxValue - 1
         Dim ch As Char = Convert.ToChar(codePoint)

         If CharUnicodeInfo.GetUnicodeCategory(ch) = category Then
            ' Start a new output line before every group of five characters.
            If ctr Mod 5 = 0 Then Console.WriteLine()
            Console.Write("{0} (U+{1:X4})     ", ch, codePoint)
            ctr += 1
         End If 
      Next
      Console.WriteLine()
      Console.WriteLine()
      Console.WriteLine("{0} characters are in the {1:G} category", 
                        ctr, category)   
   End Sub
End Module

Remarks

The Char.GetUnicodeCategory and CharUnicodeInfo.GetUnicodeCategory methods return a member of the UnicodeCategory enumeration. The UnicodeCategory enumeration is also used to support Char methods, such as IsUpper(Char). Such methods determine whether a specified character is a member of a particular Unicode general category. A Unicode general category defines the broad classification of a character, that is, its designation as a type of letter, decimal digit, separator, mathematical symbol, punctuation, and so on.
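As a minimal sketch of the relationship described above, the following C# snippet prints the category of a few characters and compares Char.IsUpper with a direct category check. The class name CategoryLookupSketch and the helper IsUppercaseByCategory are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Globalization;

public static class CategoryLookupSketch
{
    // Hypothetical helper: true when the character's general category
    // is UppercaseLetter ("Lu").
    public static bool IsUppercaseByCategory(char ch)
    {
        return CharUnicodeInfo.GetUnicodeCategory(ch) == UnicodeCategory.UppercaseLetter;
    }

    public static void Main()
    {
        foreach (char ch in new[] { 'A', 'a', '5', ' ', '+' })
        {
            // Char.GetUnicodeCategory returns a UnicodeCategory member.
            Console.WriteLine("'{0}': {1}, IsUpper = {2}, by category = {3}",
                              ch, Char.GetUnicodeCategory(ch),
                              Char.IsUpper(ch), IsUppercaseByCategory(ch));
        }
    }
}
```

For these five characters the two checks agree, and the reported categories are UppercaseLetter, LowercaseLetter, DecimalDigitNumber, SpaceSeparator, and MathSymbol, in that order.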

This enumeration is based on The Unicode Standard, version 5.0. For more information, see the "UCD File Format" and "General Category Values" subtopics at the Unicode Character Database.

The Unicode Standard defines the following:

A surrogate pair is a coded character representation for a single abstract character that consists of a sequence of two code units, where the first unit of the pair is a high surrogate and the second is a low surrogate. A high surrogate is a Unicode code point in the range U+D800 through U+DBFF and a low surrogate is a Unicode code point in the range U+DC00 through U+DFFF.
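The following C# sketch illustrates how a surrogate pair is classified. It assumes the character U+10400 (DESERET CAPITAL LETTER LONG I) as the sample; the class name SurrogatePairSketch and the helper CategoryOfFirstCharacter are hypothetical. Each code unit on its own reports the Surrogate category, while the CharUnicodeInfo.GetUnicodeCategory(String, Int32) overload classifies the pair as a whole.

```csharp
using System;
using System.Globalization;

public static class SurrogatePairSketch
{
    // Hypothetical helper: category of the character that starts at index 0,
    // treating a surrogate pair as one character.
    public static UnicodeCategory CategoryOfFirstCharacter(string s)
    {
        return CharUnicodeInfo.GetUnicodeCategory(s, 0);
    }

    public static void Main()
    {
        // U+10400 lies outside the Basic Multilingual Plane, so a .NET
        // string stores it as two UTF-16 code units.
        string s = "\U00010400";

        Console.WriteLine("Code units: {0}", s.Length);
        Console.WriteLine("High surrogate U+{0:X4}: {1}",
                          (int)s[0], CharUnicodeInfo.GetUnicodeCategory(s[0]));
        Console.WriteLine("Low surrogate U+{0:X4}: {1}",
                          (int)s[1], CharUnicodeInfo.GetUnicodeCategory(s[1]));
        Console.WriteLine("Pair U+{0:X}: {1}",
                          Char.ConvertToUtf32(s, 0), CategoryOfFirstCharacter(s));
    }
}
```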

A combining character sequence is a combination of a base character and one or more combining characters. A surrogate pair represents a base character or a combining character. A combining character is either spacing or nonspacing. A spacing combining character takes up a spacing position by itself when rendered, while a nonspacing combining character does not. Diacritics are an example of nonspacing combining characters.
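As a small illustration of combining characters, the C# sketch below inspects the sequence "e" followed by U+0301 (COMBINING ACUTE ACCENT) and also looks at U+0903 (DEVANAGARI SIGN VISARGA) as a spacing example. The sample characters, the class name CombiningMarkSketch, and the helper IsCombiningMark are choices made for this example, not part of the API.

```csharp
using System;
using System.Globalization;

public static class CombiningMarkSketch
{
    // Hypothetical helper: true for any of the three mark categories
    // (nonspacing, spacing combining, enclosing).
    public static bool IsCombiningMark(char ch)
    {
        UnicodeCategory cat = CharUnicodeInfo.GetUnicodeCategory(ch);
        return cat == UnicodeCategory.NonSpacingMark
            || cat == UnicodeCategory.SpacingCombiningMark
            || cat == UnicodeCategory.EnclosingMark;
    }

    public static void Main()
    {
        string sequence = "e\u0301";   // base letter + nonspacing diacritic

        foreach (char ch in sequence)
        {
            Console.WriteLine("U+{0:X4}: {1}, combining = {2}",
                              (int)ch, CharUnicodeInfo.GetUnicodeCategory(ch),
                              IsCombiningMark(ch));
        }

        // StringInfo treats the base character and its mark as one text element.
        Console.WriteLine("Text elements: {0}",
                          new StringInfo(sequence).LengthInTextElements);

        // A spacing combining mark, for comparison.
        Console.WriteLine("U+0903: {0}",
                          CharUnicodeInfo.GetUnicodeCategory('\u0903'));
    }
}
```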

A modifier letter is a free-standing spacing character that, like a combining character, indicates modifications of a preceding letter.

An enclosing mark is a nonspacing combining character that surrounds all previous characters up to and including a base character.

A format character is a character that is not normally rendered but that affects the layout of text or the operation of text processes.
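To make the last three definitions concrete, the following C# sketch reports the category of one sample character of each kind. The samples (U+02B0, U+20DD, U+200E), the class name MarkAndFormatSketch, and the helper Describe are assumptions chosen for illustration.

```csharp
using System;
using System.Globalization;

public static class MarkAndFormatSketch
{
    // Hypothetical helper: formats a code point together with its category.
    public static string Describe(char ch)
    {
        return string.Format("U+{0:X4}: {1}",
                             (int)ch, CharUnicodeInfo.GetUnicodeCategory(ch));
    }

    public static void Main()
    {
        Console.WriteLine(Describe('\u02B0'));   // MODIFIER LETTER SMALL H
        Console.WriteLine(Describe('\u20DD'));   // COMBINING ENCLOSING CIRCLE
        Console.WriteLine(Describe('\u200E'));   // LEFT-TO-RIGHT MARK
    }
}
```

The three lines report ModifierLetter, EnclosingMark, and Format, respectively.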

The Unicode Standard defines several variations of some punctuation marks. For example, a hyphen can be one of several code values that represent a hyphen, such as U+002D (hyphen-minus), U+00AD (soft hyphen), U+2010 (hyphen), or U+2011 (nonbreaking hyphen). The same is true for dashes, space characters, and quotation marks.
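One practical consequence is that visually similar hyphens do not all share a category. The C# sketch below, with the hypothetical class HyphenVariantsSketch and helper IsDash, checks the four code points listed above: the soft hyphen is classified as Format rather than DashPunctuation, so a test for DashPunctuation alone would miss it.

```csharp
using System;
using System.Globalization;

public static class HyphenVariantsSketch
{
    // Hypothetical helper: true when the character is in the "Pd" category.
    public static bool IsDash(char ch)
    {
        return CharUnicodeInfo.GetUnicodeCategory(ch) == UnicodeCategory.DashPunctuation;
    }

    public static void Main()
    {
        // hyphen-minus, soft hyphen, hyphen, nonbreaking hyphen
        char[] hyphens = { '\u002D', '\u00AD', '\u2010', '\u2011' };

        foreach (char ch in hyphens)
        {
            Console.WriteLine("U+{0:X4}: {1}, IsDash = {2}",
                              (int)ch, CharUnicodeInfo.GetUnicodeCategory(ch),
                              IsDash(ch));
        }
    }
}
```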

The Unicode Standard also assigns codes to representations of decimal digits that are specific to a given script or language, for example, U+0030 (digit zero) and U+0660 (Arabic-Indic digit zero).
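Because script-specific digits share the DecimalDigitNumber category, code can handle them uniformly. The following C# sketch is one possible approach, not a library feature: the class DecimalDigitSketch and the helper ParseDecimalDigits are hypothetical, and the helper relies on CharUnicodeInfo.GetDecimalDigitValue to map each digit to its value.

```csharp
using System;
using System.Globalization;

public static class DecimalDigitSketch
{
    // Hypothetical helper: converts a string of decimal digits from any
    // script into an int. Throws if a character is not a decimal digit.
    public static int ParseDecimalDigits(string text)
    {
        int result = 0;
        foreach (char ch in text)
        {
            int digit = CharUnicodeInfo.GetDecimalDigitValue(ch);   // -1 if not a digit
            if (digit < 0)
                throw new FormatException("Not a decimal digit: U+" + ((int)ch).ToString("X4"));
            result = checked(result * 10 + digit);
        }
        return result;
    }

    public static void Main()
    {
        // Both zeros are in the DecimalDigitNumber category.
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory('\u0030'));
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory('\u0660'));

        // ASCII digits and Arabic-Indic digits (U+0664, U+0662) for forty-two.
        Console.WriteLine(ParseDecimalDigits("42"));
        Console.WriteLine(ParseDecimalDigits("\u0664\u0662"));
    }
}
```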

Applies to

See also