String.Normalize String.Normalize String.Normalize String.Normalize Method

定義

傳回新的字串,其二進位表示為特定的 Unicode 正規化格式。Returns a new string whose binary representation is in a particular Unicode normalization form.

多載

Normalize() Normalize() Normalize() Normalize()

傳回新的字串,其文字值與這個字串相同,但是其二進位表示為 Unicode 正規化格式 C。Returns a new string whose textual value is the same as this string, but whose binary representation is in Unicode normalization form C.

Normalize(NormalizationForm) Normalize(NormalizationForm) Normalize(NormalizationForm)

傳回新的字串,其文字值與這個字串相同,但是其二進位表示為特定的 Unicode 正規化格式。Returns a new string whose textual value is the same as this string, but whose binary representation is in the specified Unicode normalization form.

Normalize() Normalize() Normalize() Normalize()

傳回新的字串,其文字值與這個字串相同,但是其二進位表示為 Unicode 正規化格式 C。Returns a new string whose textual value is the same as this string, but whose binary representation is in Unicode normalization form C.

public:
 System::String ^ Normalize();
public string Normalize ();
member this.Normalize : unit -> string
Public Function Normalize () As String

傳回

新的正規化字串,其文字值與這個字串相同,但是其二進位表示為正規化格式 C。A new, normalized string whose textual value is the same as this string, but whose binary representation is in normalization form C.

例外狀況

目前的執行個體包含無效的 Unicode 字元。The current instance contains invalid Unicode characters.

範例

下列範例會將字串標準化為四個正規化形式的每一個,確認字串已正規化為指定的正規化表單,然後列出標準化字串中的程式碼點。The following example normalizes a string to each of four normalization forms, confirms the string was normalized to the specified normalization form, then lists the code points in the normalized string.

using namespace System;
using namespace System::Text;

void Show( String^ title, String^ s )
{
   Console::Write( "Characters in string {0} = ", title );
   for each (short x in s) {
      Console::Write("{0:X4} ", x);
   }
   Console::WriteLine();
}

int main()
{
   
   // Character c; combining characters acute and cedilla; character 3/4
   array<Char>^temp0 = {L'c',L'\u0301',L'\u0327',L'\u00BE'};
   String^ s1 = gcnew String( temp0 );
   String^ s2 = nullptr;
   String^ divider = gcnew String( '-',80 );
   divider = String::Concat( Environment::NewLine, divider, Environment::NewLine );

   Show( "s1", s1 );
   Console::WriteLine();
   Console::WriteLine( "U+0063 = LATIN SMALL LETTER C" );
   Console::WriteLine( "U+0301 = COMBINING ACUTE ACCENT" );
   Console::WriteLine( "U+0327 = COMBINING CEDILLA" );
   Console::WriteLine( "U+00BE = VULGAR FRACTION THREE QUARTERS" );
   Console::WriteLine( divider );
   Console::WriteLine( "A1) Is s1 normalized to the default form (Form C)?: {0}", s1->IsNormalized() );
   Console::WriteLine( "A2) Is s1 normalized to Form C?:  {0}", s1->IsNormalized( NormalizationForm::FormC ) );
   Console::WriteLine( "A3) Is s1 normalized to Form D?:  {0}", s1->IsNormalized( NormalizationForm::FormD ) );
   Console::WriteLine( "A4) Is s1 normalized to Form KC?: {0}", s1->IsNormalized( NormalizationForm::FormKC ) );
   Console::WriteLine( "A5) Is s1 normalized to Form KD?: {0}", s1->IsNormalized( NormalizationForm::FormKD ) );
   Console::WriteLine( divider );
   Console::WriteLine( "Set string s2 to each normalized form of string s1." );
   Console::WriteLine();
   Console::WriteLine( "U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE" );
   Console::WriteLine( "U+0033 = DIGIT THREE" );
   Console::WriteLine( "U+2044 = FRACTION SLASH" );
   Console::WriteLine( "U+0034 = DIGIT FOUR" );
   Console::WriteLine( divider );
   s2 = s1->Normalize();
   Console::Write( "B1) Is s2 normalized to the default form (Form C)?: " );
   Console::WriteLine( s2->IsNormalized() );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormC );
   Console::Write( "B2) Is s2 normalized to Form C?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormC ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormD );
   Console::Write( "B3) Is s2 normalized to Form D?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormD ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormKC );
   Console::Write( "B4) Is s2 normalized to Form KC?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormKC ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormKD );
   Console::Write( "B5) Is s2 normalized to Form KD?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormKD ) );
   Show( "s2", s2 );
   Console::WriteLine();
}

/*
This example produces the following results:

Characters in string s1 = 0063 0301 0327 00BE

U+0063 = LATIN SMALL LETTER C
U+0301 = COMBINING ACUTE ACCENT
U+0327 = COMBINING CEDILLA
U+00BE = VULGAR FRACTION THREE QUARTERS

--------------------------------------------------------------------------------

A1) Is s1 normalized to the default form (Form C)?: False
A2) Is s1 normalized to Form C?:  False
A3) Is s1 normalized to Form D?:  False
A4) Is s1 normalized to Form KC?: False
A5) Is s1 normalized to Form KD?: False

--------------------------------------------------------------------------------

Set string s2 to each normalized form of string s1.

U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
U+0033 = DIGIT THREE
U+2044 = FRACTION SLASH
U+0034 = DIGIT FOUR

--------------------------------------------------------------------------------

B1) Is s2 normalized to the default form (Form C)?: True
Characters in string s2 = 1E09 00BE

B2) Is s2 normalized to Form C?: True
Characters in string s2 = 1E09 00BE

B3) Is s2 normalized to Form D?: True
Characters in string s2 = 0063 0327 0301 00BE

B4) Is s2 normalized to Form KC?: True
Characters in string s2 = 1E09 0033 2044 0034

B5) Is s2 normalized to Form KD?: True
Characters in string s2 = 0063 0327 0301 0033 2044 0034

*/
using System;
using System.Text;

class Example
{
    public static void Main() 
    {
       // Character c; combining characters acute and cedilla; character 3/4
       string s1 = new String( new char[] {'\u0063', '\u0301', '\u0327', '\u00BE'});
       string s2 = null;
       string divider = new String('-', 80);
       divider = String.Concat(Environment.NewLine, divider, Environment.NewLine);
   
       Show("s1", s1);
       Console.WriteLine();
       Console.WriteLine("U+0063 = LATIN SMALL LETTER C");
       Console.WriteLine("U+0301 = COMBINING ACUTE ACCENT");
       Console.WriteLine("U+0327 = COMBINING CEDILLA");
       Console.WriteLine("U+00BE = VULGAR FRACTION THREE QUARTERS");
       Console.WriteLine(divider);
   
       Console.WriteLine("A1) Is s1 normalized to the default form (Form C)?: {0}", 
                                    s1.IsNormalized());
       Console.WriteLine("A2) Is s1 normalized to Form C?:  {0}", 
                                    s1.IsNormalized(NormalizationForm.FormC));
       Console.WriteLine("A3) Is s1 normalized to Form D?:  {0}", 
                                    s1.IsNormalized(NormalizationForm.FormD));
       Console.WriteLine("A4) Is s1 normalized to Form KC?: {0}", 
                                    s1.IsNormalized(NormalizationForm.FormKC));
       Console.WriteLine("A5) Is s1 normalized to Form KD?: {0}", 
                                    s1.IsNormalized(NormalizationForm.FormKD));
   
       Console.WriteLine(divider);
   
       Console.WriteLine("Set string s2 to each normalized form of string s1.");
       Console.WriteLine();
       Console.WriteLine("U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE");
       Console.WriteLine("U+0033 = DIGIT THREE");
       Console.WriteLine("U+2044 = FRACTION SLASH");
       Console.WriteLine("U+0034 = DIGIT FOUR");
       Console.WriteLine(divider);
   
       s2 = s1.Normalize();
       Console.Write("B1) Is s2 normalized to the default form (Form C)?: ");
       Console.WriteLine(s2.IsNormalized());
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormC);
       Console.Write("B2) Is s2 normalized to Form C?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormC));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormD);
       Console.Write("B3) Is s2 normalized to Form D?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormD));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormKC);
       Console.Write("B4) Is s2 normalized to Form KC?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKC));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormKD);
       Console.Write("B5) Is s2 normalized to Form KD?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKD));
       Show("s2", s2);
       Console.WriteLine();
    }

    private static void Show(string title, string s)
    {
       Console.Write("Characters in string {0} = ", title);
       foreach(short x in s) {
           Console.Write("{0:X4} ", x);
       }
       Console.WriteLine();
    }
}
/*
This example produces the following results:

Characters in string s1 = 0063 0301 0327 00BE

U+0063 = LATIN SMALL LETTER C
U+0301 = COMBINING ACUTE ACCENT
U+0327 = COMBINING CEDILLA
U+00BE = VULGAR FRACTION THREE QUARTERS

--------------------------------------------------------------------------------

A1) Is s1 normalized to the default form (Form C)?: False
A2) Is s1 normalized to Form C?:  False
A3) Is s1 normalized to Form D?:  False
A4) Is s1 normalized to Form KC?: False
A5) Is s1 normalized to Form KD?: False

--------------------------------------------------------------------------------

Set string s2 to each normalized form of string s1.

U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
U+0033 = DIGIT THREE
U+2044 = FRACTION SLASH
U+0034 = DIGIT FOUR

--------------------------------------------------------------------------------

B1) Is s2 normalized to the default form (Form C)?: True
Characters in string s2 = 1E09 00BE

B2) Is s2 normalized to Form C?: True
Characters in string s2 = 1E09 00BE

B3) Is s2 normalized to Form D?: True
Characters in string s2 = 0063 0327 0301 00BE

B4) Is s2 normalized to Form KC?: True
Characters in string s2 = 1E09 0033 2044 0034

B5) Is s2 normalized to Form KD?: True
Characters in string s2 = 0063 0327 0301 0033 2044 0034

*/
Imports System.Text

Class Example
   Public Shared Sub Main()
      ' Character c; combining characters acute and cedilla; character 3/4
      Dim s1 = New [String](New Char() {ChrW(&H0063), ChrW(&H0301), ChrW(&H0327), ChrW(&H00BE)})
      Dim s2 As String = Nothing
      Dim divider = New [String]("-"c, 80)
      divider = [String].Concat(Environment.NewLine, divider, Environment.NewLine)
      
      Show("s1", s1)
      Console.WriteLine()
      Console.WriteLine("U+0063 = LATIN SMALL LETTER C")
      Console.WriteLine("U+0301 = COMBINING ACUTE ACCENT")
      Console.WriteLine("U+0327 = COMBINING CEDILLA")
      Console.WriteLine("U+00BE = VULGAR FRACTION THREE QUARTERS")

      Console.WriteLine(divider)
      
      Console.WriteLine("A1) Is s1 normalized to the default form (Form C)?: {0}", s1.IsNormalized())
      Console.WriteLine("A2) Is s1 normalized to Form C?:  {0}", s1.IsNormalized(NormalizationForm.FormC))
      Console.WriteLine("A3) Is s1 normalized to Form D?:  {0}", s1.IsNormalized(NormalizationForm.FormD))
      Console.WriteLine("A4) Is s1 normalized to Form KC?: {0}", s1.IsNormalized(NormalizationForm.FormKC))
      Console.WriteLine("A5) Is s1 normalized to Form KD?: {0}", s1.IsNormalized(NormalizationForm.FormKD))
      
      Console.WriteLine(divider)
      
      Console.WriteLine("Set string s2 to each normalized form of string s1.")
      Console.WriteLine()
      Console.WriteLine("U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE")
      Console.WriteLine("U+0033 = DIGIT THREE")
      Console.WriteLine("U+2044 = FRACTION SLASH")
      Console.WriteLine("U+0034 = DIGIT FOUR")
      Console.WriteLine(divider)
      
      s2 = s1.Normalize()
      Console.Write("B1) Is s2 normalized to the default form (Form C)?: ")
      Console.WriteLine(s2.IsNormalized())
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormC)
      Console.Write("B2) Is s2 normalized to Form C?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormC))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormD)
      Console.Write("B3) Is s2 normalized to Form D?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormD))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormKC)
      Console.Write("B4) Is s2 normalized to Form KC?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKC))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormKD)
      Console.Write("B5) Is s2 normalized to Form KD?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKD))
      Show("s2", s2)
      Console.WriteLine()
   End Sub 
   
   Private Shared Sub Show(title As String, s As String)
      Console.Write("Characters in string {0} = ", title)
      For Each x As Char In s
         Console.Write("{0:X4} ", AscW(x))
      Next 
      Console.WriteLine()
   End Sub 
End Class 
'This example produces the following results:
'
'Characters in string s1 = 0063 0301 0327 00BE
'
'U+0063 = LATIN SMALL LETTER C
'U+0301 = COMBINING ACUTE ACCENT
'U+0327 = COMBINING CEDILLA
'U+00BE = VULGAR FRACTION THREE QUARTERS
'
'--------------------------------------------------------------------------------
'
'A1) Is s1 normalized to the default form (Form C)?: False
'A2) Is s1 normalized to Form C?:  False
'A3) Is s1 normalized to Form D?:  False
'A4) Is s1 normalized to Form KC?: False
'A5) Is s1 normalized to Form KD?: False
'
'--------------------------------------------------------------------------------
'
'Set string s2 to each normalized form of string s1.
'
'U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
'U+0033 = DIGIT THREE
'U+2044 = FRACTION SLASH
'U+0034 = DIGIT FOUR
'
'--------------------------------------------------------------------------------
'
'B1) Is s2 normalized to the default form (Form C)?: True
'Characters in string s2 = 1E09 00BE
'
'B2) Is s2 normalized to Form C?: True
'Characters in string s2 = 1E09 00BE
'
'B3) Is s2 normalized to Form D?: True
'Characters in string s2 = 0063 0327 0301 00BE
'
'B4) Is s2 normalized to Form KC?: True
'Characters in string s2 = 1E09 0033 2044 0034
'
'B5) Is s2 normalized to Form KD?: True
'Characters in string s2 = 0063 0327 0301 0033 2044 0034
'

備註

某些 Unicode 字元有多個對等的二進位表示,其中包含組合和/或複合 Unicode 字元的集合。Some Unicode characters have multiple equivalent binary representations consisting of sets of combining and/or composite Unicode characters. 例如,下列任何一個程式碼點都可以代表字母 "ắ":For example, any of the following code points can represent the letter "ắ":

  • U+1EAFU+1EAF

  • U+0103 U+0301U+0103 U+0301

  • U+0061 U+0306 U+0301U+0061 U+0306 U+0301

單一字元的多個標記法會使搜尋、排序、比對和其他作業變得更複雜。The existence of multiple representations for a single character complicates searching, sorting, matching, and other operations.

Unicode 標準定義了一個稱為正規化的進程,它會在指定字元的任何對等二進位表示時,傳回一個二進位標記法。The Unicode standard defines a process called normalization that returns one binary representation when given any of the equivalent binary representations of a character. 正規化可以使用遵守不同規則的數種演算法(稱為正規化形式)來執行。Normalization can be performed with several algorithms, called normalization forms, that obey different rules. .NET 支援由 Unicode 標準所定義的四種正規化形式(C、D、KC 和 KD)。.NET supports the four normalization forms (C, D, KC, and KD) that are defined by the Unicode standard. 當兩個字串以相同正規化格式表示時,可以使用序數比較來進行比較。When two strings are represented in the same normalization form, they can be compared by using ordinal comparison.

若要標準化並比較兩個字串,請執行下列動作:To normalize and compare two strings, do the following:

  1. 取得要與輸入來源比較的字串,例如檔案或使用者輸入裝置。Obtain the strings to be compared from an input source, such as a file or a user input device.

  2. Normalize()呼叫方法,將字串標準化為正規化格式 C。Call the Normalize() method to normalize the strings to normalization form C.

  3. 若要比較兩個字串,請呼叫支援序數位符串比較的方法(例如Compare(String, String, StringComparison)方法),並提供的StringComparison.Ordinal值或StringComparison.OrdinalIgnoreCase做為StringComparison引數。To compare two strings, call a method that supports ordinal string comparison, such as the Compare(String, String, StringComparison) method, and supply a value of StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase as the StringComparison argument. 若要排序正規化字串的陣列,請將comparer StringComparer.OrdinalStringComparer.OrdinalIgnoreCase的值傳遞至的適當Array.Sort多載。To sort an array of normalized strings, pass a comparer value of StringComparer.Ordinal or StringComparer.OrdinalIgnoreCase to an appropriate overload of Array.Sort.

  4. 根據上一個步驟所指示的順序,發出排序輸出中的字串。Emit the strings in the sorted output based on the order indicated by the previous step.

如需支援的 Unicode 正規化格式的說明, System.Text.NormalizationForm請參閱。For a description of supported Unicode normalization forms, see System.Text.NormalizationForm.

給呼叫者的注意事項

IsNormalizedfalse方法在字串中遇到第一個非正規化字元時,就會傳回。The IsNormalized method returns false as soon as it encounters the first non-normalized character in a string. 因此,如果字串包含後面接著無效 Unicode 字元的非正規化Normalize字元,則方法會ArgumentException擲回,但IsNormalized false會傳回。Therefore, if a string contains non-normalized characters followed by invalid Unicode characters, the Normalize method will throw an ArgumentException although IsNormalized returns false.

另請參閱

Normalize(NormalizationForm) Normalize(NormalizationForm) Normalize(NormalizationForm)

傳回新的字串,其文字值與這個字串相同,但是其二進位表示為特定的 Unicode 正規化格式。Returns a new string whose textual value is the same as this string, but whose binary representation is in the specified Unicode normalization form.

public:
 System::String ^ Normalize(System::Text::NormalizationForm normalizationForm);
public string Normalize (System.Text.NormalizationForm normalizationForm);
member this.Normalize : System.Text.NormalizationForm -> string

參數

normalizationForm
NormalizationForm NormalizationForm NormalizationForm NormalizationForm

Unicode 正規化格式。A Unicode normalization form.

傳回

新的字串,其文字值與這個字串相同,但是其二進位表示為 normalizationForm 參數指定的正規化格式。A new string whose textual value is the same as this string, but whose binary representation is in the normalization form specified by the normalizationForm parameter.

例外狀況

目前的執行個體包含無效的 Unicode 字元。The current instance contains invalid Unicode characters.

範例

下列範例會將字串標準化為四個正規化形式的每一個,確認字串已正規化為指定的正規化表單,然後列出標準化字串中的程式碼點。The following example normalizes a string to each of four normalization forms, confirms the string was normalized to the specified normalization form, then lists the code points in the normalized string.

using namespace System;
using namespace System::Text;

void Show( String^ title, String^ s )
{
   Console::Write( "Characters in string {0} = ", title );
   for each (short x in s) {
      Console::Write("{0:X4} ", x);
   }
   Console::WriteLine();
}

int main()
{
   
   // Character c; combining characters acute and cedilla; character 3/4
   array<Char>^temp0 = {L'c',L'\u0301',L'\u0327',L'\u00BE'};
   String^ s1 = gcnew String( temp0 );
   String^ s2 = nullptr;
   String^ divider = gcnew String( '-',80 );
   divider = String::Concat( Environment::NewLine, divider, Environment::NewLine );

   Show( "s1", s1 );
   Console::WriteLine();
   Console::WriteLine( "U+0063 = LATIN SMALL LETTER C" );
   Console::WriteLine( "U+0301 = COMBINING ACUTE ACCENT" );
   Console::WriteLine( "U+0327 = COMBINING CEDILLA" );
   Console::WriteLine( "U+00BE = VULGAR FRACTION THREE QUARTERS" );
   Console::WriteLine( divider );
   Console::WriteLine( "A1) Is s1 normalized to the default form (Form C)?: {0}", s1->IsNormalized() );
   Console::WriteLine( "A2) Is s1 normalized to Form C?:  {0}", s1->IsNormalized( NormalizationForm::FormC ) );
   Console::WriteLine( "A3) Is s1 normalized to Form D?:  {0}", s1->IsNormalized( NormalizationForm::FormD ) );
   Console::WriteLine( "A4) Is s1 normalized to Form KC?: {0}", s1->IsNormalized( NormalizationForm::FormKC ) );
   Console::WriteLine( "A5) Is s1 normalized to Form KD?: {0}", s1->IsNormalized( NormalizationForm::FormKD ) );
   Console::WriteLine( divider );
   Console::WriteLine( "Set string s2 to each normalized form of string s1." );
   Console::WriteLine();
   Console::WriteLine( "U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE" );
   Console::WriteLine( "U+0033 = DIGIT THREE" );
   Console::WriteLine( "U+2044 = FRACTION SLASH" );
   Console::WriteLine( "U+0034 = DIGIT FOUR" );
   Console::WriteLine( divider );
   s2 = s1->Normalize();
   Console::Write( "B1) Is s2 normalized to the default form (Form C)?: " );
   Console::WriteLine( s2->IsNormalized() );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormC );
   Console::Write( "B2) Is s2 normalized to Form C?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormC ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormD );
   Console::Write( "B3) Is s2 normalized to Form D?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormD ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormKC );
   Console::Write( "B4) Is s2 normalized to Form KC?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormKC ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormKD );
   Console::Write( "B5) Is s2 normalized to Form KD?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormKD ) );
   Show( "s2", s2 );
   Console::WriteLine();
}

/*
This example produces the following results:

Characters in string s1 = 0063 0301 0327 00BE

U+0063 = LATIN SMALL LETTER C
U+0301 = COMBINING ACUTE ACCENT
U+0327 = COMBINING CEDILLA
U+00BE = VULGAR FRACTION THREE QUARTERS

--------------------------------------------------------------------------------

A1) Is s1 normalized to the default form (Form C)?: False
A2) Is s1 normalized to Form C?:  False
A3) Is s1 normalized to Form D?:  False
A4) Is s1 normalized to Form KC?: False
A5) Is s1 normalized to Form KD?: False

--------------------------------------------------------------------------------

Set string s2 to each normalized form of string s1.

U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
U+0033 = DIGIT THREE
U+2044 = FRACTION SLASH
U+0034 = DIGIT FOUR

--------------------------------------------------------------------------------

B1) Is s2 normalized to the default form (Form C)?: True
Characters in string s2 = 1E09 00BE

B2) Is s2 normalized to Form C?: True
Characters in string s2 = 1E09 00BE

B3) Is s2 normalized to Form D?: True
Characters in string s2 = 0063 0327 0301 00BE

B4) Is s2 normalized to Form KC?: True
Characters in string s2 = 1E09 0033 2044 0034

B5) Is s2 normalized to Form KD?: True
Characters in string s2 = 0063 0327 0301 0033 2044 0034

*/
using System;
using System.Text;

class Example
{
    public static void Main() 
    {
       // Character c; combining characters acute and cedilla; character 3/4
       string s1 = new String( new char[] {'\u0063', '\u0301', '\u0327', '\u00BE'});
       string s2 = null;
       string divider = new String('-', 80);
       divider = String.Concat(Environment.NewLine, divider, Environment.NewLine);
   
       Show("s1", s1);
       Console.WriteLine();
       Console.WriteLine("U+0063 = LATIN SMALL LETTER C");
       Console.WriteLine("U+0301 = COMBINING ACUTE ACCENT");
       Console.WriteLine("U+0327 = COMBINING CEDILLA");
       Console.WriteLine("U+00BE = VULGAR FRACTION THREE QUARTERS");
       Console.WriteLine(divider);
   
       Console.WriteLine("A1) Is s1 normalized to the default form (Form C)?: {0}", 
                                    s1.IsNormalized());
       Console.WriteLine("A2) Is s1 normalized to Form C?:  {0}", 
                                    s1.IsNormalized(NormalizationForm.FormC));
       Console.WriteLine("A3) Is s1 normalized to Form D?:  {0}", 
                                    s1.IsNormalized(NormalizationForm.FormD));
       Console.WriteLine("A4) Is s1 normalized to Form KC?: {0}", 
                                    s1.IsNormalized(NormalizationForm.FormKC));
       Console.WriteLine("A5) Is s1 normalized to Form KD?: {0}", 
                                    s1.IsNormalized(NormalizationForm.FormKD));
   
       Console.WriteLine(divider);
   
       Console.WriteLine("Set string s2 to each normalized form of string s1.");
       Console.WriteLine();
       Console.WriteLine("U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE");
       Console.WriteLine("U+0033 = DIGIT THREE");
       Console.WriteLine("U+2044 = FRACTION SLASH");
       Console.WriteLine("U+0034 = DIGIT FOUR");
       Console.WriteLine(divider);
   
       s2 = s1.Normalize();
       Console.Write("B1) Is s2 normalized to the default form (Form C)?: ");
       Console.WriteLine(s2.IsNormalized());
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormC);
       Console.Write("B2) Is s2 normalized to Form C?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormC));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormD);
       Console.Write("B3) Is s2 normalized to Form D?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormD));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormKC);
       Console.Write("B4) Is s2 normalized to Form KC?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKC));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormKD);
       Console.Write("B5) Is s2 normalized to Form KD?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKD));
       Show("s2", s2);
       Console.WriteLine();
    }

    private static void Show(string title, string s)
    {
       Console.Write("Characters in string {0} = ", title);
       foreach(short x in s) {
           Console.Write("{0:X4} ", x);
       }
       Console.WriteLine();
    }
}
/*
This example produces the following results:

Characters in string s1 = 0063 0301 0327 00BE

U+0063 = LATIN SMALL LETTER C
U+0301 = COMBINING ACUTE ACCENT
U+0327 = COMBINING CEDILLA
U+00BE = VULGAR FRACTION THREE QUARTERS

--------------------------------------------------------------------------------

A1) Is s1 normalized to the default form (Form C)?: False
A2) Is s1 normalized to Form C?:  False
A3) Is s1 normalized to Form D?:  False
A4) Is s1 normalized to Form KC?: False
A5) Is s1 normalized to Form KD?: False

--------------------------------------------------------------------------------

Set string s2 to each normalized form of string s1.

U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
U+0033 = DIGIT THREE
U+2044 = FRACTION SLASH
U+0034 = DIGIT FOUR

--------------------------------------------------------------------------------

B1) Is s2 normalized to the default form (Form C)?: True
Characters in string s2 = 1E09 00BE

B2) Is s2 normalized to Form C?: True
Characters in string s2 = 1E09 00BE

B3) Is s2 normalized to Form D?: True
Characters in string s2 = 0063 0327 0301 00BE

B4) Is s2 normalized to Form KC?: True
Characters in string s2 = 1E09 0033 2044 0034

B5) Is s2 normalized to Form KD?: True
Characters in string s2 = 0063 0327 0301 0033 2044 0034

*/
Imports System.Text

Class Example
   Public Shared Sub Main()
      ' Character c; combining characters acute and cedilla; character 3/4
      Dim s1 = New [String](New Char() {ChrW(&H0063), ChrW(&H0301), ChrW(&H0327), ChrW(&H00BE)})
      Dim s2 As String = Nothing
      Dim divider = New [String]("-"c, 80)
      divider = [String].Concat(Environment.NewLine, divider, Environment.NewLine)
      
      Show("s1", s1)
      Console.WriteLine()
      Console.WriteLine("U+0063 = LATIN SMALL LETTER C")
      Console.WriteLine("U+0301 = COMBINING ACUTE ACCENT")
      Console.WriteLine("U+0327 = COMBINING CEDILLA")
      Console.WriteLine("U+00BE = VULGAR FRACTION THREE QUARTERS")

      Console.WriteLine(divider)
      
      Console.WriteLine("A1) Is s1 normalized to the default form (Form C)?: {0}", s1.IsNormalized())
      Console.WriteLine("A2) Is s1 normalized to Form C?:  {0}", s1.IsNormalized(NormalizationForm.FormC))
      Console.WriteLine("A3) Is s1 normalized to Form D?:  {0}", s1.IsNormalized(NormalizationForm.FormD))
      Console.WriteLine("A4) Is s1 normalized to Form KC?: {0}", s1.IsNormalized(NormalizationForm.FormKC))
      Console.WriteLine("A5) Is s1 normalized to Form KD?: {0}", s1.IsNormalized(NormalizationForm.FormKD))
      
      Console.WriteLine(divider)
      
      Console.WriteLine("Set string s2 to each normalized form of string s1.")
      Console.WriteLine()
      Console.WriteLine("U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE")
      Console.WriteLine("U+0033 = DIGIT THREE")
      Console.WriteLine("U+2044 = FRACTION SLASH")
      Console.WriteLine("U+0034 = DIGIT FOUR")
      Console.WriteLine(divider)
      
      s2 = s1.Normalize()
      Console.Write("B1) Is s2 normalized to the default form (Form C)?: ")
      Console.WriteLine(s2.IsNormalized())
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormC)
      Console.Write("B2) Is s2 normalized to Form C?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormC))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormD)
      Console.Write("B3) Is s2 normalized to Form D?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormD))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormKC)
      Console.Write("B4) Is s2 normalized to Form KC?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKC))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormKD)
      Console.Write("B5) Is s2 normalized to Form KD?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKD))
      Show("s2", s2)
      Console.WriteLine()
   End Sub 
   
   Private Shared Sub Show(title As String, s As String)
      Console.Write("Characters in string {0} = ", title)
      For Each x As Char In s
         Console.Write("{0:X4} ", AscW(x))
      Next 
      Console.WriteLine()
   End Sub 
End Class 
'This example produces the following results:
'
'Characters in string s1 = 0063 0301 0327 00BE
'
'U+0063 = LATIN SMALL LETTER C
'U+0301 = COMBINING ACUTE ACCENT
'U+0327 = COMBINING CEDILLA
'U+00BE = VULGAR FRACTION THREE QUARTERS
'
'--------------------------------------------------------------------------------
'
'A1) Is s1 normalized to the default form (Form C)?: False
'A2) Is s1 normalized to Form C?:  False
'A3) Is s1 normalized to Form D?:  False
'A4) Is s1 normalized to Form KC?: False
'A5) Is s1 normalized to Form KD?: False
'
'--------------------------------------------------------------------------------
'
'Set string s2 to each normalized form of string s1.
'
'U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
'U+0033 = DIGIT THREE
'U+2044 = FRACTION SLASH
'U+0034 = DIGIT FOUR
'
'--------------------------------------------------------------------------------
'
'B1) Is s2 normalized to the default form (Form C)?: True
'Characters in string s2 = 1E09 00BE
'
'B2) Is s2 normalized to Form C?: True
'Characters in string s2 = 1E09 00BE
'
'B3) Is s2 normalized to Form D?: True
'Characters in string s2 = 0063 0327 0301 00BE
'
'B4) Is s2 normalized to Form KC?: True
'Characters in string s2 = 1E09 0033 2044 0034
'
'B5) Is s2 normalized to Form KD?: True
'Characters in string s2 = 0063 0327 0301 0033 2044 0034
'

備註

某些 Unicode 字元有多個對等的二進位表示,其中包含組合和/或複合 Unicode 字元的集合。Some Unicode characters have multiple equivalent binary representations consisting of sets of combining and/or composite Unicode characters. 單一字元的多個標記法會使搜尋、排序、比對和其他作業變得更複雜。The existence of multiple representations for a single character complicates searching, sorting, matching, and other operations.

Unicode 標準定義了一個稱為正規化的進程,它會在指定字元的任何對等二進位表示時,傳回一個二進位標記法。The Unicode standard defines a process called normalization that returns one binary representation when given any of the equivalent binary representations of a character. 正規化可以使用遵守不同規則的數種演算法(稱為正規化形式)來執行。Normalization can be performed with several algorithms, called normalization forms, that obey different rules. .NET 支援由 Unicode 標準所定義的四種正規化形式(C、D、KC 和 KD)。.NET supports the four normalization forms (C, D, KC, and KD) that are defined by the Unicode standard. 當兩個字串以相同正規化格式表示時,可以使用序數比較來進行比較。When two strings are represented in the same normalization form, they can be compared by using ordinal comparison.

若要標準化並比較兩個字串,請執行下列動作:To normalize and compare two strings, do the following:

  1. 取得要與輸入來源比較的字串,例如檔案或使用者輸入裝置。Obtain the strings to be compared from an input source, such as a file or a user input device.

  2. Normalize(NormalizationForm)呼叫方法,將字串標準化為指定的正規化表單。Call the Normalize(NormalizationForm) method to normalize the strings to a specified normalization form.

  3. 若要比較兩個字串,請呼叫支援序數位符串比較的方法(例如Compare(String, String, StringComparison)方法),並提供的StringComparison.Ordinal值或StringComparison.OrdinalIgnoreCase做為StringComparison引數。To compare two strings, call a method that supports ordinal string comparison, such as the Compare(String, String, StringComparison) method, and supply a value of StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase as the StringComparison argument. 若要排序正規化字串的陣列,請將comparer StringComparer.OrdinalStringComparer.OrdinalIgnoreCase的值傳遞至的適當Array.Sort多載。To sort an array of normalized strings, pass a comparer value of StringComparer.Ordinal or StringComparer.OrdinalIgnoreCase to an appropriate overload of Array.Sort.

  4. 根據上一個步驟所指示的順序,發出排序輸出中的字串。Emit the strings in the sorted output based on the order indicated by the previous step.

如需支援的 Unicode 正規化格式的說明, System.Text.NormalizationForm請參閱。For a description of supported Unicode normalization forms, see System.Text.NormalizationForm.

給呼叫者的注意事項

IsNormalizedfalse方法在字串中遇到第一個非正規化字元時,就會傳回。The IsNormalized method returns false as soon as it encounters the first non-normalized character in a string. 因此,如果字串包含後面接著無效 Unicode 字元的非Normalize正規化字元,則方法可能會ArgumentException擲回, IsNormalizedfalse會傳回。Therefore, if a string contains non-normalized characters followed by invalid Unicode characters, the Normalize method may throw an ArgumentException although IsNormalized returns false.

另請參閱

適用於