NormalizationForm Enum

Definition

Defines the type of normalization to perform.

public enum class NormalizationForm
[System.Runtime.InteropServices.ComVisible(true)]
public enum NormalizationForm
type NormalizationForm = 
Public Enum NormalizationForm
Inheritance
NormalizationForm
Attributes

Fields

FormC 1

Indicates that a Unicode string is normalized using full canonical decomposition, followed by the replacement of sequences with their primary composites, if possible.

FormD 2

Indicates that a Unicode string is normalized using full canonical decomposition.

FormKC 5

Indicates that a Unicode string is normalized using full compatibility decomposition, followed by the replacement of sequences with their primary composites, if possible.

FormKD 6

Indicates that a Unicode string is normalized using full compatibility decomposition.

Remarks

Some Unicode sequences are considered equivalent because they represent the same character. For example, the following are considered equivalent because any of them can be used to represent "ắ":

  • "\u1EAF"

  • "\u0103\u0301"

  • "\u0061\u0306\u0301"

However, ordinal (that is, binary) comparisons consider these sequences different because they contain different Unicode code values. Before performing ordinal comparisons, applications must normalize these strings to decompose them into their basic components.
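The difference between comparing before and after normalization can be sketched as follows. This is a minimal illustration, not code from this page: the class name and the `EqualsAfterNormalizing` helper are hypothetical, and Form C is chosen arbitrarily (any one form works as long as both operands use the same one). It relies only on `String.Normalize(NormalizationForm)` and `String.Equals` with `StringComparison.Ordinal`.

```csharp
using System;
using System.Text;

public static class OrdinalComparisonSketch
{
    // Hypothetical helper: normalize both operands to Form C, then compare ordinally.
    public static bool EqualsAfterNormalizing(string left, string right) =>
        string.Equals(
            left.Normalize(NormalizationForm.FormC),
            right.Normalize(NormalizationForm.FormC),
            StringComparison.Ordinal);

    public static void Main()
    {
        // The three representations of "ắ" listed above.
        string precomposed = "\u1EAF";
        string partlyComposed = "\u0103\u0301";
        string decomposed = "\u0061\u0306\u0301";

        // Ordinal comparison sees different code values, so both lines print False.
        Console.WriteLine(string.Equals(precomposed, partlyComposed, StringComparison.Ordinal));
        Console.WriteLine(string.Equals(precomposed, decomposed, StringComparison.Ordinal));

        // With both operands in the same normalization form, both lines print True.
        Console.WriteLine(EqualsAfterNormalizing(precomposed, partlyComposed));
        Console.WriteLine(EqualsAfterNormalizing(precomposed, decomposed));
    }
}
```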

Each composite Unicode character is mapped to a more basic sequence of one or more characters. The process of decomposition replaces composite characters in a string with their more basic mappings. A full decomposition recursively performs this replacement until none of the characters in the string can be decomposed further.
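To make the recursive replacement concrete, the following sketch prints the UTF-16 code values of "ắ" before and after full canonical decomposition. The class name and the `Describe` helper are illustrative assumptions; only `String.Normalize` and `NormalizationForm.FormD` come from the framework.

```csharp
using System;
using System.Linq;
using System.Text;

public static class DecompositionSketch
{
    // Hypothetical helper: lists each UTF-16 code unit as U+XXXX.
    public static string Describe(string text) =>
        string.Join(" ", text.Select(c => $"U+{(int)c:X4}"));

    public static void Main()
    {
        string composite = "\u1EAF";

        // Full canonical decomposition: U+1EAF maps to U+0103 U+0301,
        // and U+0103 in turn maps to U+0061 U+0306.
        string decomposed = composite.Normalize(NormalizationForm.FormD);

        Console.WriteLine(Describe(composite));   // U+1EAF
        Console.WriteLine(Describe(decomposed));  // U+0061 U+0306 U+0301
    }
}
```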

Unicode defines two types of decompositions: compatibility decomposition and canonical decomposition. In compatibility decomposition, formatting information might be lost. In canonical decomposition, which is a subset of compatibility decomposition, formatting information is preserved.
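As an illustration of lost formatting, consider the superscript two character (U+00B2), which is not mentioned on this page and is used here only as an assumed example. Canonical decomposition is expected to leave it unchanged, while compatibility decomposition replaces it with the plain digit. The class and helper names below are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text;

public static class CompatibilitySketch
{
    // Hypothetical helper: lists each UTF-16 code unit as U+XXXX.
    public static string Describe(string text) =>
        string.Join(" ", text.Select(c => $"U+{(int)c:X4}"));

    public static void Main()
    {
        string squared = "x\u00B2"; // "x" followed by SUPERSCRIPT TWO

        // Canonical decomposition keeps the superscript character as it is.
        Console.WriteLine(Describe(squared.Normalize(NormalizationForm.FormD)));   // U+0078 U+00B2

        // Compatibility decomposition drops the superscript formatting.
        Console.WriteLine(Describe(squared.Normalize(NormalizationForm.FormKD)));  // U+0078 U+0032
    }
}
```

Because the superscript cannot be recovered from the result, the compatibility forms are generally better suited to matching and searching than to text that will be displayed again.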

Two sets of characters are considered to have canonical equivalence if their full canonical decompositions are identical. Likewise, two sets of characters are considered to have compatibility equivalence if their full compatibility decompositions are identical.
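These two definitions translate fairly directly into code. The sketch below is one possible reading, assuming hypothetical helper names `CanonicallyEquivalent` and `CompatibilityEquivalent`; the sample characters (the angstrom sign U+212B, the letter Å U+00C5, and the "fi" ligature U+FB01) are assumptions chosen for illustration and do not appear on this page.

```csharp
using System;
using System.Text;

public static class EquivalenceSketch
{
    // Hypothetical helper: true when the full canonical decompositions match.
    public static bool CanonicallyEquivalent(string a, string b) =>
        string.Equals(
            a.Normalize(NormalizationForm.FormD),
            b.Normalize(NormalizationForm.FormD),
            StringComparison.Ordinal);

    // Hypothetical helper: true when the full compatibility decompositions match.
    public static bool CompatibilityEquivalent(string a, string b) =>
        string.Equals(
            a.Normalize(NormalizationForm.FormKD),
            b.Normalize(NormalizationForm.FormKD),
            StringComparison.Ordinal);

    public static void Main()
    {
        // ANGSTROM SIGN and LATIN CAPITAL LETTER A WITH RING ABOVE share one
        // canonical decomposition, so this prints True.
        Console.WriteLine(CanonicallyEquivalent("\u212B", "\u00C5"));

        // The "fi" ligature has only a compatibility mapping to "fi":
        // the first line prints False, the second prints True.
        Console.WriteLine(CanonicallyEquivalent("\uFB01", "fi"));
        Console.WriteLine(CompatibilityEquivalent("\uFB01", "fi"));
    }
}
```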

For more information about normalization, decompositions, and equivalence, see Unicode Standard Annex #15: Unicode Normalization Forms at unicode.org.

Applies to

See also