String.Normalize String.Normalize String.Normalize String.Normalize Method

定義

バイナリ表現が特定の Unicode 正規形である新しい文字列を返します。Returns a new string whose binary representation is in a particular Unicode normalization form.

オーバーロード

Normalize() Normalize() Normalize() Normalize()

この文字列と同じテキスト値を持ち、なおかつ、バイナリ表現が Unicode 正規形 C である新しい文字列を返します。Returns a new string whose textual value is the same as this string, but whose binary representation is in Unicode normalization form C.

Normalize(NormalizationForm) Normalize(NormalizationForm) Normalize(NormalizationForm)

この文字列と同じテキスト値を持ち、なおかつ、バイナリ表現が、指定された Unicode 正規形である新しい文字列を返します。Returns a new string whose textual value is the same as this string, but whose binary representation is in the specified Unicode normalization form.

Normalize() Normalize() Normalize() Normalize()

この文字列と同じテキスト値を持ち、なおかつ、バイナリ表現が Unicode 正規形 C である新しい文字列を返します。Returns a new string whose textual value is the same as this string, but whose binary representation is in Unicode normalization form C.

public:
 System::String ^ Normalize();
public string Normalize ();
member this.Normalize : unit -> string
Public Function Normalize () As String

戻り値

この文字列と同じテキスト値を持ち、なおかつ、バイナリ表現が正規形 C である新しい文字列。A new, normalized string whose textual value is the same as this string, but whose binary representation is in normalization form C.

例外

現在のインスタンスに、正しくない Unicode 文字が含まれています。The current instance contains invalid Unicode characters.

次の例では、文字列を4つの正規化された各形式に正規化し、文字列が指定した正規化形式に正規化されたことを確認した後、正規化された文字列内のコードポイントを一覧表示します。The following example normalizes a string to each of four normalization forms, confirms the string was normalized to the specified normalization form, then lists the code points in the normalized string.

using namespace System;
using namespace System::Text;

void Show( String^ title, String^ s )
{
   Console::Write( "Characters in string {0} = ", title );
   for each (short x in s) {
      Console::Write("{0:X4} ", x);
   }
   Console::WriteLine();
}

int main()
{
   
   // Character c; combining characters acute and cedilla; character 3/4
   array<Char>^temp0 = {L'c',L'\u0301',L'\u0327',L'\u00BE'};
   String^ s1 = gcnew String( temp0 );
   String^ s2 = nullptr;
   String^ divider = gcnew String( '-',80 );
   divider = String::Concat( Environment::NewLine, divider, Environment::NewLine );

   Show( "s1", s1 );
   Console::WriteLine();
   Console::WriteLine( "U+0063 = LATIN SMALL LETTER C" );
   Console::WriteLine( "U+0301 = COMBINING ACUTE ACCENT" );
   Console::WriteLine( "U+0327 = COMBINING CEDILLA" );
   Console::WriteLine( "U+00BE = VULGAR FRACTION THREE QUARTERS" );
   Console::WriteLine( divider );
   Console::WriteLine( "A1) Is s1 normalized to the default form (Form C)?: {0}", s1->IsNormalized() );
   Console::WriteLine( "A2) Is s1 normalized to Form C?:  {0}", s1->IsNormalized( NormalizationForm::FormC ) );
   Console::WriteLine( "A3) Is s1 normalized to Form D?:  {0}", s1->IsNormalized( NormalizationForm::FormD ) );
   Console::WriteLine( "A4) Is s1 normalized to Form KC?: {0}", s1->IsNormalized( NormalizationForm::FormKC ) );
   Console::WriteLine( "A5) Is s1 normalized to Form KD?: {0}", s1->IsNormalized( NormalizationForm::FormKD ) );
   Console::WriteLine( divider );
   Console::WriteLine( "Set string s2 to each normalized form of string s1." );
   Console::WriteLine();
   Console::WriteLine( "U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE" );
   Console::WriteLine( "U+0033 = DIGIT THREE" );
   Console::WriteLine( "U+2044 = FRACTION SLASH" );
   Console::WriteLine( "U+0034 = DIGIT FOUR" );
   Console::WriteLine( divider );
   s2 = s1->Normalize();
   Console::Write( "B1) Is s2 normalized to the default form (Form C)?: " );
   Console::WriteLine( s2->IsNormalized() );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormC );
   Console::Write( "B2) Is s2 normalized to Form C?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormC ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormD );
   Console::Write( "B3) Is s2 normalized to Form D?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormD ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormKC );
   Console::Write( "B4) Is s2 normalized to Form KC?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormKC ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormKD );
   Console::Write( "B5) Is s2 normalized to Form KD?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormKD ) );
   Show( "s2", s2 );
   Console::WriteLine();
}

/*
This example produces the following results:

Characters in string s1 = 0063 0301 0327 00BE

U+0063 = LATIN SMALL LETTER C
U+0301 = COMBINING ACUTE ACCENT
U+0327 = COMBINING CEDILLA
U+00BE = VULGAR FRACTION THREE QUARTERS

--------------------------------------------------------------------------------

A1) Is s1 normalized to the default form (Form C)?: False
A2) Is s1 normalized to Form C?:  False
A3) Is s1 normalized to Form D?:  False
A4) Is s1 normalized to Form KC?: False
A5) Is s1 normalized to Form KD?: False

--------------------------------------------------------------------------------

Set string s2 to each normalized form of string s1.

U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
U+0033 = DIGIT THREE
U+2044 = FRACTION SLASH
U+0034 = DIGIT FOUR

--------------------------------------------------------------------------------

B1) Is s2 normalized to the default form (Form C)?: True
Characters in string s2 = 1E09 00BE

B2) Is s2 normalized to Form C?: True
Characters in string s2 = 1E09 00BE

B3) Is s2 normalized to Form D?: True
Characters in string s2 = 0063 0327 0301 00BE

B4) Is s2 normalized to Form KC?: True
Characters in string s2 = 1E09 0033 2044 0034

B5) Is s2 normalized to Form KD?: True
Characters in string s2 = 0063 0327 0301 0033 2044 0034

*/
using System;
using System.Text;

class Example
{
    public static void Main() 
    {
       // Character c; combining characters acute and cedilla; character 3/4
       string s1 = new String( new char[] {'\u0063', '\u0301', '\u0327', '\u00BE'});
       string s2 = null;
       string divider = new String('-', 80);
       divider = String.Concat(Environment.NewLine, divider, Environment.NewLine);
   
       Show("s1", s1);
       Console.WriteLine();
       Console.WriteLine("U+0063 = LATIN SMALL LETTER C");
       Console.WriteLine("U+0301 = COMBINING ACUTE ACCENT");
       Console.WriteLine("U+0327 = COMBINING CEDILLA");
       Console.WriteLine("U+00BE = VULGAR FRACTION THREE QUARTERS");
       Console.WriteLine(divider);
   
       Console.WriteLine("A1) Is s1 normalized to the default form (Form C)?: {0}", 
                                    s1.IsNormalized());
       Console.WriteLine("A2) Is s1 normalized to Form C?:  {0}", 
                                    s1.IsNormalized(NormalizationForm.FormC));
       Console.WriteLine("A3) Is s1 normalized to Form D?:  {0}", 
                                    s1.IsNormalized(NormalizationForm.FormD));
       Console.WriteLine("A4) Is s1 normalized to Form KC?: {0}", 
                                    s1.IsNormalized(NormalizationForm.FormKC));
       Console.WriteLine("A5) Is s1 normalized to Form KD?: {0}", 
                                    s1.IsNormalized(NormalizationForm.FormKD));
   
       Console.WriteLine(divider);
   
       Console.WriteLine("Set string s2 to each normalized form of string s1.");
       Console.WriteLine();
       Console.WriteLine("U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE");
       Console.WriteLine("U+0033 = DIGIT THREE");
       Console.WriteLine("U+2044 = FRACTION SLASH");
       Console.WriteLine("U+0034 = DIGIT FOUR");
       Console.WriteLine(divider);
   
       s2 = s1.Normalize();
       Console.Write("B1) Is s2 normalized to the default form (Form C)?: ");
       Console.WriteLine(s2.IsNormalized());
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormC);
       Console.Write("B2) Is s2 normalized to Form C?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormC));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormD);
       Console.Write("B3) Is s2 normalized to Form D?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormD));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormKC);
       Console.Write("B4) Is s2 normalized to Form KC?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKC));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormKD);
       Console.Write("B5) Is s2 normalized to Form KD?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKD));
       Show("s2", s2);
       Console.WriteLine();
    }

    private static void Show(string title, string s)
    {
       Console.Write("Characters in string {0} = ", title);
       foreach(short x in s) {
           Console.Write("{0:X4} ", x);
       }
       Console.WriteLine();
    }
}
/*
This example produces the following results:

Characters in string s1 = 0063 0301 0327 00BE

U+0063 = LATIN SMALL LETTER C
U+0301 = COMBINING ACUTE ACCENT
U+0327 = COMBINING CEDILLA
U+00BE = VULGAR FRACTION THREE QUARTERS

--------------------------------------------------------------------------------

A1) Is s1 normalized to the default form (Form C)?: False
A2) Is s1 normalized to Form C?:  False
A3) Is s1 normalized to Form D?:  False
A4) Is s1 normalized to Form KC?: False
A5) Is s1 normalized to Form KD?: False

--------------------------------------------------------------------------------

Set string s2 to each normalized form of string s1.

U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
U+0033 = DIGIT THREE
U+2044 = FRACTION SLASH
U+0034 = DIGIT FOUR

--------------------------------------------------------------------------------

B1) Is s2 normalized to the default form (Form C)?: True
Characters in string s2 = 1E09 00BE

B2) Is s2 normalized to Form C?: True
Characters in string s2 = 1E09 00BE

B3) Is s2 normalized to Form D?: True
Characters in string s2 = 0063 0327 0301 00BE

B4) Is s2 normalized to Form KC?: True
Characters in string s2 = 1E09 0033 2044 0034

B5) Is s2 normalized to Form KD?: True
Characters in string s2 = 0063 0327 0301 0033 2044 0034

*/
Imports System.Text

Class Example
   Public Shared Sub Main()
      ' Character c; combining characters acute and cedilla; character 3/4
      Dim s1 = New [String](New Char() {ChrW(&H0063), ChrW(&H0301), ChrW(&H0327), ChrW(&H00BE)})
      Dim s2 As String = Nothing
      Dim divider = New [String]("-"c, 80)
      divider = [String].Concat(Environment.NewLine, divider, Environment.NewLine)
      
      Show("s1", s1)
      Console.WriteLine()
      Console.WriteLine("U+0063 = LATIN SMALL LETTER C")
      Console.WriteLine("U+0301 = COMBINING ACUTE ACCENT")
      Console.WriteLine("U+0327 = COMBINING CEDILLA")
      Console.WriteLine("U+00BE = VULGAR FRACTION THREE QUARTERS")

      Console.WriteLine(divider)
      
      Console.WriteLine("A1) Is s1 normalized to the default form (Form C)?: {0}", s1.IsNormalized())
      Console.WriteLine("A2) Is s1 normalized to Form C?:  {0}", s1.IsNormalized(NormalizationForm.FormC))
      Console.WriteLine("A3) Is s1 normalized to Form D?:  {0}", s1.IsNormalized(NormalizationForm.FormD))
      Console.WriteLine("A4) Is s1 normalized to Form KC?: {0}", s1.IsNormalized(NormalizationForm.FormKC))
      Console.WriteLine("A5) Is s1 normalized to Form KD?: {0}", s1.IsNormalized(NormalizationForm.FormKD))
      
      Console.WriteLine(divider)
      
      Console.WriteLine("Set string s2 to each normalized form of string s1.")
      Console.WriteLine()
      Console.WriteLine("U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE")
      Console.WriteLine("U+0033 = DIGIT THREE")
      Console.WriteLine("U+2044 = FRACTION SLASH")
      Console.WriteLine("U+0034 = DIGIT FOUR")
      Console.WriteLine(divider)
      
      s2 = s1.Normalize()
      Console.Write("B1) Is s2 normalized to the default form (Form C)?: ")
      Console.WriteLine(s2.IsNormalized())
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormC)
      Console.Write("B2) Is s2 normalized to Form C?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormC))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormD)
      Console.Write("B3) Is s2 normalized to Form D?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormD))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormKC)
      Console.Write("B4) Is s2 normalized to Form KC?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKC))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormKD)
      Console.Write("B5) Is s2 normalized to Form KD?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKD))
      Show("s2", s2)
      Console.WriteLine()
   End Sub 
   
   Private Shared Sub Show(title As String, s As String)
      Console.Write("Characters in string {0} = ", title)
      For Each x As Char In s
         Console.Write("{0:X4} ", AscW(x))
      Next 
      Console.WriteLine()
   End Sub 
End Class 
'This example produces the following results:
'
'Characters in string s1 = 0063 0301 0327 00BE
'
'U+0063 = LATIN SMALL LETTER C
'U+0301 = COMBINING ACUTE ACCENT
'U+0327 = COMBINING CEDILLA
'U+00BE = VULGAR FRACTION THREE QUARTERS
'
'--------------------------------------------------------------------------------
'
'A1) Is s1 normalized to the default form (Form C)?: False
'A2) Is s1 normalized to Form C?:  False
'A3) Is s1 normalized to Form D?:  False
'A4) Is s1 normalized to Form KC?: False
'A5) Is s1 normalized to Form KD?: False
'
'--------------------------------------------------------------------------------
'
'Set string s2 to each normalized form of string s1.
'
'U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
'U+0033 = DIGIT THREE
'U+2044 = FRACTION SLASH
'U+0034 = DIGIT FOUR
'
'--------------------------------------------------------------------------------
'
'B1) Is s2 normalized to the default form (Form C)?: True
'Characters in string s2 = 1E09 00BE
'
'B2) Is s2 normalized to Form C?: True
'Characters in string s2 = 1E09 00BE
'
'B3) Is s2 normalized to Form D?: True
'Characters in string s2 = 0063 0327 0301 00BE
'
'B4) Is s2 normalized to Form KC?: True
'Characters in string s2 = 1E09 0033 2044 0034
'
'B5) Is s2 normalized to Form KD?: True
'Characters in string s2 = 0063 0327 0301 0033 2044 0034
'

注釈

一部の Unicode 文字には、組み合わせと複合 Unicode 文字のセットで構成される等価のバイナリ表現が複数あります。Some Unicode characters have multiple equivalent binary representations consisting of sets of combining and/or composite Unicode characters. たとえば、次のコードポイントは、"ắ" という文字を表すことができます。For example, any of the following code points can represent the letter "ắ":

  • U+1EAFU+1EAF

  • U + 0103 U + 0301U+0103 U+0301

  • U + 0061 U + 0306 U + 0301U+0061 U+0306 U+0301

1つの文字に対して複数の表現が存在すると、検索、並べ替え、照合、およびその他の操作が複雑になります。The existence of multiple representations for a single character complicates searching, sorting, matching, and other operations.

Unicode 規格では、1つの文字に相当するバイナリ表現を指定した場合に1つのバイナリ表現を返す正規化と呼ばれるプロセスが定義されています。The Unicode standard defines a process called normalization that returns one binary representation when given any of the equivalent binary representations of a character. 正規化は、正規化形式と呼ばれるいくつかのアルゴリズムを使用して実行できます。これは、さまざまな規則に従います。Normalization can be performed with several algorithms, called normalization forms, that obey different rules. .NET では、Unicode 規格で定義されている4つの正規化形式 (C、D、KC、および KD) がサポートされています。.NET supports the four normalization forms (C, D, KC, and KD) that are defined by the Unicode standard. 2つの文字列が同じ正規化形式で表される場合は、序数に基づく比較を使用して比較できます。When two strings are represented in the same normalization form, they can be compared by using ordinal comparison.

2つの文字列を正規化して比較するには、次の手順を実行します。To normalize and compare two strings, do the following:

  1. ファイルやユーザー入力デバイスなどの入力ソースと比較する文字列を取得します。Obtain the strings to be compared from an input source, such as a file or a user input device.

  2. メソッドをNormalize()呼び出して、文字列を正規形 C に正規化します。Call the Normalize() method to normalize the strings to normalization form C.

  3. 2つの文字列を比較するCompare(String, String, StringComparison)には、メソッドなどの序数の文字列比較をサポートするメソッドを呼び出し、 StringComparison.Ordinal StringComparison引数StringComparison.OrdinalIgnoreCaseとしてまたはの値を指定します。To compare two strings, call a method that supports ordinal string comparison, such as the Compare(String, String, StringComparison) method, and supply a value of StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase as the StringComparison argument. 正規化された文字列の配列を並べ替えるにcomparerは、 StringComparer.OrdinalまたStringComparer.OrdinalIgnoreCaseはの値をのArray.Sort適切なオーバーロードに渡します。To sort an array of normalized strings, pass a comparer value of StringComparer.Ordinal or StringComparer.OrdinalIgnoreCase to an appropriate overload of Array.Sort.

  4. 前の手順で示された順序に基づいて、並べ替えられた出力内の文字列を生成します。Emit the strings in the sorted output based on the order indicated by the previous step.

サポートされている Unicode 正規形の説明System.Text.NormalizationFormについては、「」を参照してください。For a description of supported Unicode normalization forms, see System.Text.NormalizationForm.

注意 (呼び出し元)

メソッドIsNormalizedは、 false文字列内の最初の正規化されていない文字が検出されるとすぐに戻ります。The IsNormalized method returns false as soon as it encounters the first non-normalized character in a string. したがって、文字列に非正規化文字の後に無効な Unicode 文字が含まNormalizeれている場合、 IsNormalizedメソッドfalseはを返しますが、はArgumentExceptionをスローします。Therefore, if a string contains non-normalized characters followed by invalid Unicode characters, the Normalize method will throw an ArgumentException although IsNormalized returns false.

こちらもご覧ください

Normalize(NormalizationForm) Normalize(NormalizationForm) Normalize(NormalizationForm)

この文字列と同じテキスト値を持ち、なおかつ、バイナリ表現が、指定された Unicode 正規形である新しい文字列を返します。Returns a new string whose textual value is the same as this string, but whose binary representation is in the specified Unicode normalization form.

public:
 System::String ^ Normalize(System::Text::NormalizationForm normalizationForm);
public string Normalize (System.Text.NormalizationForm normalizationForm);
member this.Normalize : System.Text.NormalizationForm -> string

パラメーター

normalizationForm
NormalizationForm NormalizationForm NormalizationForm NormalizationForm

Unicode 正規形。A Unicode normalization form.

戻り値

この文字列と同じテキスト値を持ち、なおかつ、バイナリ表現が、normalizationForm パラメーターで指定された正規形である新しい文字列。A new string whose textual value is the same as this string, but whose binary representation is in the normalization form specified by the normalizationForm parameter.

例外

現在のインスタンスに、正しくない Unicode 文字が含まれています。The current instance contains invalid Unicode characters.

次の例では、文字列を4つの正規化された各形式に正規化し、文字列が指定した正規化形式に正規化されたことを確認した後、正規化された文字列内のコードポイントを一覧表示します。The following example normalizes a string to each of four normalization forms, confirms the string was normalized to the specified normalization form, then lists the code points in the normalized string.

using namespace System;
using namespace System::Text;

void Show( String^ title, String^ s )
{
   Console::Write( "Characters in string {0} = ", title );
   for each (short x in s) {
      Console::Write("{0:X4} ", x);
   }
   Console::WriteLine();
}

int main()
{
   
   // Character c; combining characters acute and cedilla; character 3/4
   array<Char>^temp0 = {L'c',L'\u0301',L'\u0327',L'\u00BE'};
   String^ s1 = gcnew String( temp0 );
   String^ s2 = nullptr;
   String^ divider = gcnew String( '-',80 );
   divider = String::Concat( Environment::NewLine, divider, Environment::NewLine );

   Show( "s1", s1 );
   Console::WriteLine();
   Console::WriteLine( "U+0063 = LATIN SMALL LETTER C" );
   Console::WriteLine( "U+0301 = COMBINING ACUTE ACCENT" );
   Console::WriteLine( "U+0327 = COMBINING CEDILLA" );
   Console::WriteLine( "U+00BE = VULGAR FRACTION THREE QUARTERS" );
   Console::WriteLine( divider );
   Console::WriteLine( "A1) Is s1 normalized to the default form (Form C)?: {0}", s1->IsNormalized() );
   Console::WriteLine( "A2) Is s1 normalized to Form C?:  {0}", s1->IsNormalized( NormalizationForm::FormC ) );
   Console::WriteLine( "A3) Is s1 normalized to Form D?:  {0}", s1->IsNormalized( NormalizationForm::FormD ) );
   Console::WriteLine( "A4) Is s1 normalized to Form KC?: {0}", s1->IsNormalized( NormalizationForm::FormKC ) );
   Console::WriteLine( "A5) Is s1 normalized to Form KD?: {0}", s1->IsNormalized( NormalizationForm::FormKD ) );
   Console::WriteLine( divider );
   Console::WriteLine( "Set string s2 to each normalized form of string s1." );
   Console::WriteLine();
   Console::WriteLine( "U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE" );
   Console::WriteLine( "U+0033 = DIGIT THREE" );
   Console::WriteLine( "U+2044 = FRACTION SLASH" );
   Console::WriteLine( "U+0034 = DIGIT FOUR" );
   Console::WriteLine( divider );
   s2 = s1->Normalize();
   Console::Write( "B1) Is s2 normalized to the default form (Form C)?: " );
   Console::WriteLine( s2->IsNormalized() );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormC );
   Console::Write( "B2) Is s2 normalized to Form C?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormC ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormD );
   Console::Write( "B3) Is s2 normalized to Form D?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormD ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormKC );
   Console::Write( "B4) Is s2 normalized to Form KC?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormKC ) );
   Show( "s2", s2 );
   Console::WriteLine();
   s2 = s1->Normalize( NormalizationForm::FormKD );
   Console::Write( "B5) Is s2 normalized to Form KD?: " );
   Console::WriteLine( s2->IsNormalized( NormalizationForm::FormKD ) );
   Show( "s2", s2 );
   Console::WriteLine();
}

/*
This example produces the following results:

Characters in string s1 = 0063 0301 0327 00BE

U+0063 = LATIN SMALL LETTER C
U+0301 = COMBINING ACUTE ACCENT
U+0327 = COMBINING CEDILLA
U+00BE = VULGAR FRACTION THREE QUARTERS

--------------------------------------------------------------------------------

A1) Is s1 normalized to the default form (Form C)?: False
A2) Is s1 normalized to Form C?:  False
A3) Is s1 normalized to Form D?:  False
A4) Is s1 normalized to Form KC?: False
A5) Is s1 normalized to Form KD?: False

--------------------------------------------------------------------------------

Set string s2 to each normalized form of string s1.

U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
U+0033 = DIGIT THREE
U+2044 = FRACTION SLASH
U+0034 = DIGIT FOUR

--------------------------------------------------------------------------------

B1) Is s2 normalized to the default form (Form C)?: True
Characters in string s2 = 1E09 00BE

B2) Is s2 normalized to Form C?: True
Characters in string s2 = 1E09 00BE

B3) Is s2 normalized to Form D?: True
Characters in string s2 = 0063 0327 0301 00BE

B4) Is s2 normalized to Form KC?: True
Characters in string s2 = 1E09 0033 2044 0034

B5) Is s2 normalized to Form KD?: True
Characters in string s2 = 0063 0327 0301 0033 2044 0034

*/
using System;
using System.Text;

class Example
{
    public static void Main() 
    {
       // Character c; combining characters acute and cedilla; character 3/4
       string s1 = new String( new char[] {'\u0063', '\u0301', '\u0327', '\u00BE'});
       string s2 = null;
       string divider = new String('-', 80);
       divider = String.Concat(Environment.NewLine, divider, Environment.NewLine);
   
       Show("s1", s1);
       Console.WriteLine();
       Console.WriteLine("U+0063 = LATIN SMALL LETTER C");
       Console.WriteLine("U+0301 = COMBINING ACUTE ACCENT");
       Console.WriteLine("U+0327 = COMBINING CEDILLA");
       Console.WriteLine("U+00BE = VULGAR FRACTION THREE QUARTERS");
       Console.WriteLine(divider);
   
       Console.WriteLine("A1) Is s1 normalized to the default form (Form C)?: {0}", 
                                    s1.IsNormalized());
       Console.WriteLine("A2) Is s1 normalized to Form C?:  {0}", 
                                    s1.IsNormalized(NormalizationForm.FormC));
       Console.WriteLine("A3) Is s1 normalized to Form D?:  {0}", 
                                    s1.IsNormalized(NormalizationForm.FormD));
       Console.WriteLine("A4) Is s1 normalized to Form KC?: {0}", 
                                    s1.IsNormalized(NormalizationForm.FormKC));
       Console.WriteLine("A5) Is s1 normalized to Form KD?: {0}", 
                                    s1.IsNormalized(NormalizationForm.FormKD));
   
       Console.WriteLine(divider);
   
       Console.WriteLine("Set string s2 to each normalized form of string s1.");
       Console.WriteLine();
       Console.WriteLine("U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE");
       Console.WriteLine("U+0033 = DIGIT THREE");
       Console.WriteLine("U+2044 = FRACTION SLASH");
       Console.WriteLine("U+0034 = DIGIT FOUR");
       Console.WriteLine(divider);
   
       s2 = s1.Normalize();
       Console.Write("B1) Is s2 normalized to the default form (Form C)?: ");
       Console.WriteLine(s2.IsNormalized());
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormC);
       Console.Write("B2) Is s2 normalized to Form C?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormC));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormD);
       Console.Write("B3) Is s2 normalized to Form D?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormD));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormKC);
       Console.Write("B4) Is s2 normalized to Form KC?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKC));
       Show("s2", s2);
       Console.WriteLine();
   
       s2 = s1.Normalize(NormalizationForm.FormKD);
       Console.Write("B5) Is s2 normalized to Form KD?: ");
       Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKD));
       Show("s2", s2);
       Console.WriteLine();
    }

    private static void Show(string title, string s)
    {
       Console.Write("Characters in string {0} = ", title);
       foreach(short x in s) {
           Console.Write("{0:X4} ", x);
       }
       Console.WriteLine();
    }
}
/*
This example produces the following results:

Characters in string s1 = 0063 0301 0327 00BE

U+0063 = LATIN SMALL LETTER C
U+0301 = COMBINING ACUTE ACCENT
U+0327 = COMBINING CEDILLA
U+00BE = VULGAR FRACTION THREE QUARTERS

--------------------------------------------------------------------------------

A1) Is s1 normalized to the default form (Form C)?: False
A2) Is s1 normalized to Form C?:  False
A3) Is s1 normalized to Form D?:  False
A4) Is s1 normalized to Form KC?: False
A5) Is s1 normalized to Form KD?: False

--------------------------------------------------------------------------------

Set string s2 to each normalized form of string s1.

U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
U+0033 = DIGIT THREE
U+2044 = FRACTION SLASH
U+0034 = DIGIT FOUR

--------------------------------------------------------------------------------

B1) Is s2 normalized to the default form (Form C)?: True
Characters in string s2 = 1E09 00BE

B2) Is s2 normalized to Form C?: True
Characters in string s2 = 1E09 00BE

B3) Is s2 normalized to Form D?: True
Characters in string s2 = 0063 0327 0301 00BE

B4) Is s2 normalized to Form KC?: True
Characters in string s2 = 1E09 0033 2044 0034

B5) Is s2 normalized to Form KD?: True
Characters in string s2 = 0063 0327 0301 0033 2044 0034

*/
Imports System.Text

Class Example
   Public Shared Sub Main()
      ' Character c; combining characters acute and cedilla; character 3/4
      Dim s1 = New [String](New Char() {ChrW(&H0063), ChrW(&H0301), ChrW(&H0327), ChrW(&H00BE)})
      Dim s2 As String = Nothing
      Dim divider = New [String]("-"c, 80)
      divider = [String].Concat(Environment.NewLine, divider, Environment.NewLine)
      
      Show("s1", s1)
      Console.WriteLine()
      Console.WriteLine("U+0063 = LATIN SMALL LETTER C")
      Console.WriteLine("U+0301 = COMBINING ACUTE ACCENT")
      Console.WriteLine("U+0327 = COMBINING CEDILLA")
      Console.WriteLine("U+00BE = VULGAR FRACTION THREE QUARTERS")

      Console.WriteLine(divider)
      
      Console.WriteLine("A1) Is s1 normalized to the default form (Form C)?: {0}", s1.IsNormalized())
      Console.WriteLine("A2) Is s1 normalized to Form C?:  {0}", s1.IsNormalized(NormalizationForm.FormC))
      Console.WriteLine("A3) Is s1 normalized to Form D?:  {0}", s1.IsNormalized(NormalizationForm.FormD))
      Console.WriteLine("A4) Is s1 normalized to Form KC?: {0}", s1.IsNormalized(NormalizationForm.FormKC))
      Console.WriteLine("A5) Is s1 normalized to Form KD?: {0}", s1.IsNormalized(NormalizationForm.FormKD))
      
      Console.WriteLine(divider)
      
      Console.WriteLine("Set string s2 to each normalized form of string s1.")
      Console.WriteLine()
      Console.WriteLine("U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE")
      Console.WriteLine("U+0033 = DIGIT THREE")
      Console.WriteLine("U+2044 = FRACTION SLASH")
      Console.WriteLine("U+0034 = DIGIT FOUR")
      Console.WriteLine(divider)
      
      s2 = s1.Normalize()
      Console.Write("B1) Is s2 normalized to the default form (Form C)?: ")
      Console.WriteLine(s2.IsNormalized())
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormC)
      Console.Write("B2) Is s2 normalized to Form C?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormC))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormD)
      Console.Write("B3) Is s2 normalized to Form D?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormD))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormKC)
      Console.Write("B4) Is s2 normalized to Form KC?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKC))
      Show("s2", s2)
      Console.WriteLine()
      
      s2 = s1.Normalize(NormalizationForm.FormKD)
      Console.Write("B5) Is s2 normalized to Form KD?: ")
      Console.WriteLine(s2.IsNormalized(NormalizationForm.FormKD))
      Show("s2", s2)
      Console.WriteLine()
   End Sub 
   
   Private Shared Sub Show(title As String, s As String)
      Console.Write("Characters in string {0} = ", title)
      For Each x As Char In s
         Console.Write("{0:X4} ", AscW(x))
      Next 
      Console.WriteLine()
   End Sub 
End Class 
'This example produces the following results:
'
'Characters in string s1 = 0063 0301 0327 00BE
'
'U+0063 = LATIN SMALL LETTER C
'U+0301 = COMBINING ACUTE ACCENT
'U+0327 = COMBINING CEDILLA
'U+00BE = VULGAR FRACTION THREE QUARTERS
'
'--------------------------------------------------------------------------------
'
'A1) Is s1 normalized to the default form (Form C)?: False
'A2) Is s1 normalized to Form C?:  False
'A3) Is s1 normalized to Form D?:  False
'A4) Is s1 normalized to Form KC?: False
'A5) Is s1 normalized to Form KD?: False
'
'--------------------------------------------------------------------------------
'
'Set string s2 to each normalized form of string s1.
'
'U+1E09 = LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
'U+0033 = DIGIT THREE
'U+2044 = FRACTION SLASH
'U+0034 = DIGIT FOUR
'
'--------------------------------------------------------------------------------
'
'B1) Is s2 normalized to the default form (Form C)?: True
'Characters in string s2 = 1E09 00BE
'
'B2) Is s2 normalized to Form C?: True
'Characters in string s2 = 1E09 00BE
'
'B3) Is s2 normalized to Form D?: True
'Characters in string s2 = 0063 0327 0301 00BE
'
'B4) Is s2 normalized to Form KC?: True
'Characters in string s2 = 1E09 0033 2044 0034
'
'B5) Is s2 normalized to Form KD?: True
'Characters in string s2 = 0063 0327 0301 0033 2044 0034
'

注釈

一部の Unicode 文字には、組み合わせと複合 Unicode 文字のセットで構成される等価のバイナリ表現が複数あります。Some Unicode characters have multiple equivalent binary representations consisting of sets of combining and/or composite Unicode characters. 1つの文字に対して複数の表現が存在すると、検索、並べ替え、照合、およびその他の操作が複雑になります。The existence of multiple representations for a single character complicates searching, sorting, matching, and other operations.

Unicode 規格では、1つの文字に相当するバイナリ表現を指定した場合に1つのバイナリ表現を返す正規化と呼ばれるプロセスが定義されています。The Unicode standard defines a process called normalization that returns one binary representation when given any of the equivalent binary representations of a character. 正規化は、正規化形式と呼ばれるいくつかのアルゴリズムを使用して実行できます。これは、さまざまな規則に従います。Normalization can be performed with several algorithms, called normalization forms, that obey different rules. .NET では、Unicode 規格で定義されている4つの正規化形式 (C、D、KC、および KD) がサポートされています。.NET supports the four normalization forms (C, D, KC, and KD) that are defined by the Unicode standard. 2つの文字列が同じ正規化形式で表される場合は、序数に基づく比較を使用して比較できます。When two strings are represented in the same normalization form, they can be compared by using ordinal comparison.

2つの文字列を正規化して比較するには、次の手順を実行します。To normalize and compare two strings, do the following:

  1. ファイルやユーザー入力デバイスなどの入力ソースと比較する文字列を取得します。Obtain the strings to be compared from an input source, such as a file or a user input device.

  2. 文字列をNormalize(NormalizationForm)指定された正規化形式に正規化するには、メソッドを呼び出します。Call the Normalize(NormalizationForm) method to normalize the strings to a specified normalization form.

  3. 2つの文字列を比較するCompare(String, String, StringComparison)には、メソッドなどの序数の文字列比較をサポートするメソッドを呼び出し、 StringComparison.Ordinal StringComparison引数StringComparison.OrdinalIgnoreCaseとしてまたはの値を指定します。To compare two strings, call a method that supports ordinal string comparison, such as the Compare(String, String, StringComparison) method, and supply a value of StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase as the StringComparison argument. 正規化された文字列の配列を並べ替えるにcomparerは、 StringComparer.OrdinalまたStringComparer.OrdinalIgnoreCaseはの値をのArray.Sort適切なオーバーロードに渡します。To sort an array of normalized strings, pass a comparer value of StringComparer.Ordinal or StringComparer.OrdinalIgnoreCase to an appropriate overload of Array.Sort.

  4. 前の手順で示された順序に基づいて、並べ替えられた出力内の文字列を生成します。Emit the strings in the sorted output based on the order indicated by the previous step.

サポートされている Unicode 正規形の説明System.Text.NormalizationFormについては、「」を参照してください。For a description of supported Unicode normalization forms, see System.Text.NormalizationForm.

注意 (呼び出し元)

メソッドIsNormalizedは、 false文字列内の最初の正規化されていない文字が検出されるとすぐに戻ります。The IsNormalized method returns false as soon as it encounters the first non-normalized character in a string. したがって、文字列に非正規化文字の後に無効な Unicode 文字が含まNormalizeれている場合、 IsNormalizedメソッドfalseはを返しますが、はArgumentExceptionをスローします。Therefore, if a string contains non-normalized characters followed by invalid Unicode characters, the Normalize method may throw an ArgumentException although IsNormalized returns false.

こちらもご覧ください

適用対象