NormalizationForm Enum

Definition

Defines the type of normalization to perform.

public enum class NormalizationForm
[System.Runtime.InteropServices.ComVisible(true)]
public enum NormalizationForm
type NormalizationForm = 
Public Enum NormalizationForm
Inheritance
NormalizationForm
Attributes

Fields

FormC 1

Indicates that a Unicode string is normalized using full canonical decomposition, followed by the replacement of sequences with their primary composites, if possible.

FormD 2

Indicates that a Unicode string is normalized using full canonical decomposition.

FormKC 5

Indicates that a Unicode string is normalized using full compatibility decomposition, followed by the replacement of sequences with their primary composites, if possible.

FormKD 6

Indicates that a Unicode string is normalized using full compatibility decomposition.

Remarks

Some Unicode sequences are considered equivalent because they represent the same character. For example, the following sequences are considered equivalent because any of them can be used to represent "ắ":

  • "\u1EAF"

  • "\u0103\u0301"

  • "\u0061\u0306\u0301"

However, ordinal (that is, binary) comparisons consider these sequences different because they contain different Unicode code values. Before performing ordinal comparisons, applications must normalize these strings to decompose them into their basic components.
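The following is a minimal sketch of that behavior, not an official sample. It compares the three representations of "ắ" listed above, first as written and then after normalizing each one with String.Normalize. The class name OrdinalComparisonSketch and the helper OrdinalEquals are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Text;

public static class OrdinalComparisonSketch
{
    // Hypothetical helper: a plain ordinal (binary) comparison of two strings.
    public static bool OrdinalEquals(string left, string right)
    {
        return string.Equals(left, right, StringComparison.Ordinal);
    }

    public static void Main()
    {
        string precomposed = "\u1EAF";             // single code value
        string partial = "\u0103\u0301";           // a-breve + combining acute
        string decomposed = "\u0061\u0306\u0301";  // a + combining breve + combining acute

        // Without normalization, the code values differ.
        Console.WriteLine(OrdinalEquals(precomposed, partial));   // False
        Console.WriteLine(OrdinalEquals(partial, decomposed));    // False

        // After normalizing every string to the same form, they match.
        NormalizationForm form = NormalizationForm.FormD;
        Console.WriteLine(OrdinalEquals(precomposed.Normalize(form), partial.Normalize(form)));  // True
        Console.WriteLine(OrdinalEquals(partial.Normalize(form), decomposed.Normalize(form)));   // True
    }
}
```

Any of the four forms would work here, as long as both operands are normalized to the same one before the comparison.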

Each composite Unicode character is mapped to a more basic sequence of one or more characters. The process of decomposition replaces composite characters in a string with their more basic mappings. A full decomposition recursively performs this replacement until none of the characters in the string can be decomposed further.
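To make the recursive replacement visible, the sketch below prints the code values of "ắ" before and after a full canonical decomposition (FormD), and then recomposes the result with FormC. The class DecompositionSketch and the helper ToCodeValues are hypothetical names used only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text;

public static class DecompositionSketch
{
    // Hypothetical helper: renders each UTF-16 code unit as U+XXXX.
    public static string ToCodeValues(string text)
    {
        return string.Join(" ", text.Select(c => $"U+{(int)c:X4}"));
    }

    public static void Main()
    {
        string composite = "\u1EAF";

        // Full canonical decomposition: U+1EAF -> U+0103 U+0301 -> U+0061 U+0306 U+0301.
        string decomposed = composite.Normalize(NormalizationForm.FormD);
        Console.WriteLine(ToCodeValues(composite));   // U+1EAF
        Console.WriteLine(ToCodeValues(decomposed));  // U+0061 U+0306 U+0301

        // FormC decomposes and then replaces sequences with their primary composites.
        string recomposed = decomposed.Normalize(NormalizationForm.FormC);
        Console.WriteLine(ToCodeValues(recomposed));  // U+1EAF
    }
}
```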

Unicode defines two types of decomposition: compatibility decomposition and canonical decomposition. In compatibility decomposition, formatting information might be lost. In canonical decomposition, which is a subset of compatibility decomposition, formatting information is preserved.
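As a rough illustration of formatting loss, the sketch below normalizes two characters chosen for this example (they do not appear elsewhere on this page): the ligature U+FB01 ("ﬁ") and the superscript U+00B2 ("²"). Both have a compatibility decomposition but no canonical one. The names CompatibilitySketch and ToCodeValues are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text;

public static class CompatibilitySketch
{
    // Hypothetical helper: renders each UTF-16 code unit as U+XXXX.
    public static string ToCodeValues(string text)
    {
        return string.Join(" ", text.Select(c => $"U+{(int)c:X4}"));
    }

    public static void Main()
    {
        string ligature = "\uFB01";     // LATIN SMALL LIGATURE FI
        string superscript = "\u00B2";  // SUPERSCRIPT TWO

        // Canonical decomposition leaves both characters unchanged.
        Console.WriteLine(ToCodeValues(ligature.Normalize(NormalizationForm.FormD)));     // U+FB01
        Console.WriteLine(ToCodeValues(superscript.Normalize(NormalizationForm.FormD)));  // U+00B2

        // Compatibility decomposition drops the ligature and superscript formatting.
        Console.WriteLine(ToCodeValues(ligature.Normalize(NormalizationForm.FormKD)));    // U+0066 U+0069
        Console.WriteLine(ToCodeValues(superscript.Normalize(NormalizationForm.FormKC))); // U+0032
    }
}
```

Because the compatibility forms discard this kind of distinction, they tend to suit searching and matching better than storing text whose appearance must round-trip.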

Two character sequences are considered to have canonical equivalence if their full canonical decompositions are identical. Likewise, two character sequences are considered to have compatibility equivalence if their full compatibility decompositions are identical.
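These two definitions translate fairly directly into code. The sketch below is one possible reading, not a framework API: the hypothetical helpers AreCanonicallyEquivalent and AreCompatibilityEquivalent compare the FormD and FormKD results of both inputs with an ordinal comparison.

```csharp
using System;
using System.Text;

public static class EquivalenceSketch
{
    // Hypothetical helper: true when the full canonical decompositions are identical.
    public static bool AreCanonicallyEquivalent(string left, string right)
    {
        return string.Equals(
            left.Normalize(NormalizationForm.FormD),
            right.Normalize(NormalizationForm.FormD),
            StringComparison.Ordinal);
    }

    // Hypothetical helper: true when the full compatibility decompositions are identical.
    public static bool AreCompatibilityEquivalent(string left, string right)
    {
        return string.Equals(
            left.Normalize(NormalizationForm.FormKD),
            right.Normalize(NormalizationForm.FormKD),
            StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Two spellings of the same character are canonically equivalent.
        Console.WriteLine(AreCanonicallyEquivalent("\u1EAF", "\u0061\u0306\u0301"));  // True

        // The "fi" ligature (U+FB01) and the letters "fi" are only compatibility equivalent.
        Console.WriteLine(AreCanonicallyEquivalent("\uFB01", "fi"));    // False
        Console.WriteLine(AreCompatibilityEquivalent("\uFB01", "fi"));  // True
    }
}
```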

For more information about normalization, decompositions, and equivalence, see Unicode Standard Annex #15: Unicode Normalization Forms at unicode.org.

Applies to

See also