Unicode: The Wide-Character Set

A wide character is a 2-byte multilingual character code. Almost every character in use in modern computing worldwide, including technical symbols and special publishing characters, can be represented according to the Unicode specification as a single wide character encoded in UTF-16. Characters that do not fit in one wide character are represented by a pair of wide characters, known as a surrogate pair. Developed and maintained by a large consortium that includes Microsoft, the Unicode standard is now widely accepted.

A wide character is of type wchar_t. A wide-character string is represented as a wchar_t[] array and is pointed to by a wchar_t* pointer. You can represent any ASCII character as a wide character by prefixing the letter L to the character literal. For example, L'\0' is the terminating wide (16-bit) null character. Similarly, you can represent any ASCII string literal as a wide-character string literal by prefixing the letter L to the ASCII literal (L"Hello").
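As a minimal sketch of the L prefix and the terminating wide null character, the following C fragment stores L"Hello" in a wchar_t[] array and walks it through a wchar_t* pointer. The helper names count_wide_chars and greeting are hypothetical and chosen only for illustration; the standard wcslen function does the same counting.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: count wide characters up to, but not including,
   the terminating L'\0'. The standard wcslen does the same job; the loop
   is written out here only to make the terminator visible. */
size_t count_wide_chars(const wchar_t *s)
{
    size_t n = 0;
    while (s[n] != L'\0') {
        ++n;
    }
    return n;
}

/* Hypothetical helper: return a pointer to a wide-character string
   stored in a wchar_t[] array. */
const wchar_t *greeting(void)
{
    /* Five characters plus the terminating L'\0': six array elements. */
    static const wchar_t text[] = L"Hello";
    return text;
}
```

Calling count_wide_chars(greeting()) yields 5, the same value as wcslen(L"Hello"), while the array itself holds six elements because of the terminator.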

Generally, wide characters take up more space in memory than multibyte characters but are faster to process. In addition, a multibyte encoding can represent only one locale at a time, whereas Unicode represents all the world's character sets simultaneously.
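The space trade-off can be illustrated with a short C sketch that compares the storage needed for the same ASCII text in narrow and wide form, and converts one to the other with the standard mbstowcs function. The helper names narrow_storage_bytes, wide_storage_bytes, and widen are assumptions made for this example. The exact wide size depends on sizeof(wchar_t), which is 2 with the Microsoft compiler and commonly 4 on other platforms.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: bytes needed to store a narrow (multibyte) string,
   including its terminating '\0'. */
size_t narrow_storage_bytes(const char *s)
{
    return strlen(s) + 1;
}

/* Hypothetical helper: bytes needed to store a wide string,
   including its terminating L'\0'. */
size_t wide_storage_bytes(const wchar_t *s)
{
    return (wcslen(s) + 1) * sizeof(wchar_t);
}

/* Hypothetical wrapper around the standard mbstowcs: convert a multibyte
   string to wide characters. Returns the number of wide characters written,
   excluding the terminator, or (size_t)-1 if the input is not valid in the
   current locale. */
size_t widen(wchar_t *dest, const char *src, size_t dest_len)
{
    return mbstowcs(dest, src, dest_len);
}
```

For the text "Hello", narrow_storage_bytes reports 6 bytes, while wide_storage_bytes reports six times sizeof(wchar_t). Because every wide character has the same size, code can index directly to the nth character without scanning for lead bytes, which is where the processing advantage over multibyte strings comes from.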

See Also

Universal C runtime routines by category