StringInfo.ParseCombiningCharacters Method

Returns the indexes of each base character, high surrogate, or control character within the specified string.

**Namespace:** System.Globalization
**Assembly:** mscorlib (in mscorlib.dll)

Syntax

Visual Basic (Declaration)
Public Shared Function ParseCombiningCharacters ( _
    str As String _
) As Integer()

Visual Basic (Usage)
Dim str As String
Dim returnValue As Integer()

returnValue = StringInfo.ParseCombiningCharacters(str)

C#
public static int[] ParseCombiningCharacters (
    string str
)

C++
public:
static array<int>^ ParseCombiningCharacters (
    String^ str
)

J#
public static int[] ParseCombiningCharacters (
    String str
)

JScript
public static function ParseCombiningCharacters (
    str : String
) : int[]

Parameters

  • str
    The string to search.

Return Value

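An array of integers that contains the zero-based indexes of each base character, high surrogate, or control character within the specified string.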

Exceptions

Exception type          Condition

ArgumentNullException   str is a null reference (Nothing in Visual Basic).
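
The exception condition above can be observed with a short check. This is a minimal sketch; the class name NullArgumentDemo and the helper ThrowsOnNull are illustrative names, not part of the .NET Framework.

```csharp
using System;
using System.Globalization;

public static class NullArgumentDemo
{
    // Hypothetical helper: returns true if a null argument raises
    // ArgumentNullException, false if the call returns normally.
    public static bool ThrowsOnNull()
    {
        try
        {
            StringInfo.ParseCombiningCharacters(null);
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsOnNull());
    }
}
```

Callers that may receive a null string would typically validate the argument before calling the method rather than rely on catching the exception.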

Remarks

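The Unicode Standard defines a surrogate pair as a coded character representation for a single abstract character that consists of a sequence of two code units, where the first unit of the pair is a high surrogate and the second is a low surrogate. A high surrogate is a Unicode code point in the range U+D800 through U+DBFF, and a low surrogate is a Unicode code point in the range U+DC00 through U+DFFF.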
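
As an illustration of the ranges just described, the following sketch tests whether two adjacent UTF-16 code units form a surrogate pair. It relies on Char.IsHighSurrogate, Char.IsLowSurrogate, and Char.ConvertToUtf32; the class name SurrogateDemo and the helper IsSurrogatePairAt are hypothetical.

```csharp
using System;

public static class SurrogateDemo
{
    // Hypothetical helper: true when the code unit at 'index' is a high
    // surrogate (U+D800..U+DBFF) immediately followed by a low surrogate
    // (U+DC00..U+DFFF).
    public static bool IsSurrogatePairAt(string s, int index)
    {
        return index >= 0
            && index + 1 < s.Length
            && char.IsHighSurrogate(s[index])
            && char.IsLowSurrogate(s[index + 1]);
    }

    public static void Main()
    {
        // U+10000 is encoded in UTF-16 as the pair D800 DC00.
        string s = "\ud800\udc00";
        Console.WriteLine(IsSurrogatePairAt(s, 0));                  // True
        Console.WriteLine(char.ConvertToUtf32(s, 0).ToString("X"));  // 10000
    }
}
```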

A control character is a character whose Unicode value is U+007F or falls in the range U+0000 through U+001F or U+0080 through U+009F.
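
The ranges above can be written directly as a range check. The sketch below is illustrative only: ControlCharDemo and IsControlByRange are hypothetical names, and the comparison against Char.IsControl is a sanity check, not a statement about how the framework implements it.

```csharp
using System;

public static class ControlCharDemo
{
    // Hypothetical helper that encodes the documented ranges:
    // U+0000..U+001F, U+007F, and U+0080..U+009F.
    public static bool IsControlByRange(char c)
    {
        return c <= '\u001F' || (c >= '\u007F' && c <= '\u009F');
    }

    public static void Main()
    {
        // Compare the range check with Char.IsControl for every UTF-16 code unit.
        int mismatches = 0;
        for (int i = 0; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            if (IsControlByRange(c) != char.IsControl(c))
            {
                mismatches++;
            }
        }
        Console.WriteLine("Mismatches: {0}", mismatches);
    }
}
```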

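The .NET Framework defines a text element as a unit of text that is displayed as a single character, that is, a grapheme. A text element can be a base character, a surrogate pair, or a combining character sequence. The Unicode Standard defines a combining character sequence as a combination of a base character and one or more combining characters. A surrogate pair can represent a base character or a combining character. For more information on surrogate pairs and combining character sequences, see the Unicode Standard at http://www.unicode.org.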

If a combining character sequence is invalid, every combining character in that sequence is also returned.
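
A simple case of an invalid sequence is a combining character with no base character in front of it. The sketch below uses a string that starts with U+0301 (COMBINING ACUTE ACCENT); the class name LeadingCombiningMarkDemo and the helper Indexes are illustrative assumptions. With a single leading combining character, the mark is expected to be reported as its own element at index 0, followed by the base character at index 1.

```csharp
using System;
using System.Globalization;

public static class LeadingCombiningMarkDemo
{
    // Hypothetical wrapper around the documented method.
    public static int[] Indexes(string s)
    {
        return StringInfo.ParseCombiningCharacters(s);
    }

    public static void Main()
    {
        // U+0301 has no preceding base character, so it cannot attach to one.
        string s = "\u0301a";
        foreach (int index in Indexes(s))
        {
            Console.WriteLine("Text element starts at index {0}", index);
        }
    }
}
```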

Each index in the resulting array is the beginning of a text element, that is, the index of the base character or the high surrogate.

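The length of each element is easily computed as the difference between successive indexes; the length of the last element is the length of the string minus the last index. The length of the array is always less than or equal to the length of the string. For example, given the string "\u4f00\u302a\ud800\udc00\u4f01", this method returns the indexes 0, 2, and 4.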
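
The length computation described above can be sketched as follows. TextElementLengths and Compute are hypothetical names introduced for illustration; only StringInfo.ParseCombiningCharacters comes from the framework.

```csharp
using System;
using System.Globalization;

public static class TextElementLengths
{
    // Hypothetical helper: returns the length, in UTF-16 code units,
    // of each text element in 's'.
    public static int[] Compute(string s)
    {
        int[] starts = StringInfo.ParseCombiningCharacters(s);
        int[] lengths = new int[starts.Length];
        for (int i = 0; i < starts.Length; i++)
        {
            // The element ends where the next one starts,
            // or at the end of the string for the last element.
            int end = (i + 1 < starts.Length) ? starts[i + 1] : s.Length;
            lengths[i] = end - starts[i];
        }
        return lengths;
    }

    public static void Main()
    {
        string s = "\u4f00\u302a\ud800\udc00\u4f01";
        int[] starts = StringInfo.ParseCombiningCharacters(s);
        int[] lengths = Compute(s);
        for (int i = 0; i < starts.Length; i++)
        {
            Console.WriteLine("Element {0}: start {1}, length {2}",
                i, starts[i], lengths[i]);
        }
    }
}
```

For the sample string, the documented indexes 0, 2, and 4 give lengths of 2, 2, and 1: a base character plus a combining character, a surrogate pair, and a single base character.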

Equivalent Members

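Starting with version 2.0 of the .NET Framework, the SubstringByTextElements method and the LengthInTextElements property provide an easy-to-use implementation of the functionality offered by the ParseCombiningCharacters method.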
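
A minimal sketch of the equivalent members, assuming .NET Framework 2.0 or later. The class name EquivalentMembersDemo and its two helper methods are hypothetical; LengthInTextElements and SubstringByTextElements are instance members of StringInfo.

```csharp
using System;
using System.Globalization;

public static class EquivalentMembersDemo
{
    // Hypothetical helper: number of text elements in 's'.
    public static int CountTextElements(string s)
    {
        return new StringInfo(s).LengthInTextElements;
    }

    // Hypothetical helper: the first text element of a non-empty string.
    public static string FirstTextElement(string s)
    {
        return new StringInfo(s).SubstringByTextElements(0, 1);
    }

    public static void Main()
    {
        // 'a' with two combining marks, then 'b', then 'c' with a cedilla.
        string s = "a\u0304\u0308bc\u0327";

        // The element count matches the number of indexes
        // returned by ParseCombiningCharacters.
        int[] starts = StringInfo.ParseCombiningCharacters(s);
        Console.WriteLine(CountTextElements(s) == starts.Length);  // True

        // The first text element spans three UTF-16 code units.
        Console.WriteLine(FirstTextElement(s).Length);             // 3
    }
}
```

Using these members avoids the manual index arithmetic needed when working with the array returned by ParseCombiningCharacters.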

Example

The following code example demonstrates calling the ParseCombiningCharacters method. This code example is part of a larger example provided for the StringInfo class.

C#
using System;
using System.Text;
using System.Globalization;

public sealed class App {
   static void Main() {
      // The string below contains combining characters.
      String s = "a\u0304\u0308bc\u0327";

      // Show each 'character' in the string.
      EnumTextElements(s);

      // Show the index in the string where each 'character' starts.
      EnumTextElementIndexes(s);
   }

   // Show how to enumerate each real character (honoring surrogates) in a string.
   static void EnumTextElements(String s) {
      // This StringBuilder holds the output results.
      StringBuilder sb = new StringBuilder();

      // Use the enumerator returned from GetTextElementEnumerator 
      // method to examine each real character.
      TextElementEnumerator charEnum = StringInfo.GetTextElementEnumerator(s);
      while (charEnum.MoveNext()) {
         sb.AppendFormat(
           "Character at index {0} is '{1}'{2}",
           charEnum.ElementIndex, charEnum.GetTextElement(),
           Environment.NewLine);
      }

      // Show the results.
      Console.WriteLine("Result of GetTextElementEnumerator:");
      Console.WriteLine(sb);
   }

   // Show how to discover the index of each real character (honoring surrogates) in a string.
   static void EnumTextElementIndexes(String s) {
      // This StringBuilder holds the output results.
      StringBuilder sb = new StringBuilder();

      // Use the ParseCombiningCharacters method to 
      // get the index of each real character in the string.
      Int32[] textElemIndex = StringInfo.ParseCombiningCharacters(s);

      // Iterate through each real character showing the character and the index where it was found.
      for (Int32 i = 0; i < textElemIndex.Length; i++) {
         sb.AppendFormat(
            "Character {0} starts at index {1}{2}",
            i, textElemIndex[i], Environment.NewLine);
      }

      // Show the results.
      Console.WriteLine("Result of ParseCombiningCharacters:");
      Console.WriteLine(sb);
   }
}

// This code produces the following output.
//
// Result of GetTextElementEnumerator:
// Character at index 0 is 'a-"'
// Character at index 3 is 'b'
// Character at index 4 is 'c,'
// 
// Result of ParseCombiningCharacters:
// Character 0 starts at index 0
// Character 1 starts at index 3
// Character 2 starts at index 4

C++
using namespace System;
using namespace System::Text;
using namespace System::Globalization;


// Show how to enumerate each real character (honoring surrogates)
// in a string.

void EnumTextElements(String^ combiningChars)
{
    // This StringBuilder holds the output results.
    StringBuilder^ sb = gcnew StringBuilder();

    // Use the enumerator returned from GetTextElementEnumerator
    // method to examine each real character.
    TextElementEnumerator^ charEnum =
        StringInfo::GetTextElementEnumerator(combiningChars);
    while (charEnum->MoveNext())
    {
        sb->AppendFormat("Character at index {0} is '{1}'{2}", 
            charEnum->ElementIndex, charEnum->GetTextElement(), 
            Environment::NewLine);
    }

    // Show the results.
    Console::WriteLine("Result of GetTextElementEnumerator:");
    Console::WriteLine(sb);
}


// Show how to discover the index of each real character
// (honoring surrogates) in a string.

void EnumTextElementIndexes(String^ combiningChars)
{
    // This StringBuilder holds the output results.
    StringBuilder^ sb = gcnew StringBuilder();

    // Use the ParseCombiningCharacters method to
    // get the index of each real character in the string.
    array <int>^ textElemIndex =
        StringInfo::ParseCombiningCharacters(combiningChars);

    // Iterate through each real character showing the character
    // and the index where it was found.
    for (int i = 0; i < textElemIndex->Length; i++)
    {
        sb->AppendFormat("Character {0} starts at index {1}{2}",
            i, textElemIndex[i], Environment::NewLine);
    }

    // Show the results.
    Console::WriteLine("Result of ParseCombiningCharacters:");
    Console::WriteLine(sb);
}

int main()
{

    // The string below contains combining characters.
    String^ combiningChars = L"a\u0304\u0308bc\u0327";

    // Show each 'character' in the string.
    EnumTextElements(combiningChars);

    // Show the index in the string where each 'character' starts.
    EnumTextElementIndexes(combiningChars);

}

// This code produces the following output.
//
// Result of GetTextElementEnumerator:
// Character at index 0 is 'a-"'
// Character at index 3 is 'b'
// Character at index 4 is 'c,'
//
// Result of ParseCombiningCharacters:
// Character 0 starts at index 0
// Character 1 starts at index 3
// Character 2 starts at index 4

Platforms

Windows 98, Windows 2000 SP4, Windows CE, Windows Millennium Edition, Windows Mobile for Pocket PC, Windows Mobile for Smartphone, Windows Server 2003, Windows XP Media Center Edition, Windows XP Professional x64 Edition, Windows XP SP2, Windows XP Starter Edition

The .NET Framework does not support all versions of every platform. For a list of the supported versions, see System Requirements.

Version Information

.NET Framework

Supported in: 2.0, 1.1, 1.0

.NET Compact Framework

Supported in: 2.0, 1.0

See Also

Reference

StringInfo Class
StringInfo Members
System.Globalization Namespace
SubstringByTextElements
LengthInTextElements