Lexical structure

Programs

A C# program consists of one or more source files, known formally as compilation units (Compilation units). A source file is an ordered sequence of Unicode characters. Source files typically have a one-to-one correspondence with files in a file system, but this correspondence is not required. For maximal portability, it is recommended that files in a file system be encoded with the UTF-8 encoding.

Conceptually speaking, a program is compiled using three steps:

  1. Transformation, which converts a file from a particular character repertoire and encoding scheme into a sequence of Unicode characters.
  2. Lexical analysis, which translates a stream of Unicode input characters into a stream of tokens.
  3. Syntactic analysis, which translates the stream of tokens into executable code.

Grammars

This specification presents the syntax of the C# programming language using two grammars. The lexical grammar (Lexical grammar) defines how Unicode characters are combined to form line terminators, white space, comments, tokens, and pre-processing directives. The syntactic grammar (Syntactic grammar) defines how the tokens resulting from the lexical grammar are combined to form C# programs.

Grammar notation

The lexical and syntactic grammars are presented in Backus-Naur form using the notation of the ANTLR grammar tool.

Lexical grammar

The lexical grammar of C# is presented in Lexical analysis, Tokens, and Pre-processing directives. The terminal symbols of the lexical grammar are the characters of the Unicode character set, and the lexical grammar specifies how characters are combined to form tokens (Tokens), white space (White space), comments (Comments), and pre-processing directives (Pre-processing directives).

Every source file in a C# program must conform to the input production of the lexical grammar (Lexical analysis).

Syntactic grammar

The syntactic grammar of C# is presented in the chapters and appendices that follow this chapter. The terminal symbols of the syntactic grammar are the tokens defined by the lexical grammar, and the syntactic grammar specifies how tokens are combined to form C# programs.

Every source file in a C# program must conform to the compilation_unit production of the syntactic grammar (Compilation units).

Lexical analysis

The input production defines the lexical structure of a C# source file. Each source file in a C# program must conform to this lexical grammar production.

input
    : input_section?
    ;

input_section
    : input_section_part+
    ;

input_section_part
    : input_element* new_line
    | pp_directive
    ;

input_element
    : whitespace
    | comment
    | token
    ;

Five basic elements make up the lexical structure of a C# source file: line terminators (Line terminators), white space (White space), comments (Comments), tokens (Tokens), and pre-processing directives (Pre-processing directives). Of these basic elements, only tokens are significant in the syntactic grammar of a C# program (Syntactic grammar).

The lexical processing of a C# source file consists of reducing the file into a sequence of tokens, which becomes the input to the syntactic analysis. Line terminators, white space, and comments can serve to separate tokens, and pre-processing directives can cause sections of the source file to be skipped, but otherwise these lexical elements have no impact on the syntactic structure of a C# program.

In the case of interpolated string literals (Interpolated string literals), a single token is initially produced by lexical analysis, but is broken up into several input elements which are repeatedly subjected to lexical analysis until all interpolated string literals have been resolved. The resulting tokens then serve as input to the syntactic analysis.

When several lexical grammar productions match a sequence of characters in a source file, the lexical processing always forms the longest possible lexical element. For example, the character sequence // is processed as the beginning of a single-line comment because that lexical element is longer than a single / token.
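The longest-match rule also applies to operators. The following sketch is an illustration, not part of the specification text; the class name LongestMatchDemo is made up for the example. It shows that the character sequence a+++b is split into the tokens a, ++, +, b, because ++ is the longest operator token that can be formed after a.

```csharp
using System;

static class LongestMatchDemo
{
    static void Main()
    {
        int a = 1, b = 2;

        // The lexer forms "++" first (longest possible token), then "+",
        // so this statement is read as (a++) + b.
        int r = a+++b;

        Console.WriteLine($"r = {r}, a = {a}"); // r = 3, a = 2
    }
}
```

Writing a + ++b instead requires white space (or parentheses) to separate the tokens, which is exactly the separating role that white space plays in lexical analysis.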

Line terminators

Line terminators divide the characters of a C# source file into lines.

new_line
    : '<Carriage return character (U+000D)>'
    | '<Line feed character (U+000A)>'
    | '<Carriage return character (U+000D) followed by line feed character (U+000A)>'
    | '<Next line character (U+0085)>'
    | '<Line separator character (U+2028)>'
    | '<Paragraph separator character (U+2029)>'
    ;

For compatibility with source code editing tools that add end-of-file markers, and to enable a source file to be viewed as a sequence of properly terminated lines, the following transformations are applied, in order, to every source file in a C# program:

  • If the last character of the source file is a Control-Z character (U+001A), this character is deleted.
  • A carriage-return character (U+000D) is added to the end of the source file if that source file is non-empty and if the last character of the source file is not a carriage return (U+000D), a line feed (U+000A), a line separator (U+2028), or a paragraph separator (U+2029).
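The two transformations above can be sketched as a small helper. This is only an illustration of the rule as worded here, not how any particular compiler implements it; the names SourceNormalizer and NormalizeEnd are hypothetical.

```csharp
using System;

static class SourceNormalizer
{
    // Hypothetical helper: applies the two end-of-file transformations in order.
    public static string NormalizeEnd(string source)
    {
        // Step 1: delete a trailing Control-Z (U+001A), if present.
        if (source.Length > 0 && source[source.Length - 1] == '\u001A')
            source = source.Substring(0, source.Length - 1);

        // Step 2: append a carriage return unless the file is empty or already
        // ends with CR, LF, line separator, or paragraph separator.
        if (source.Length > 0)
        {
            char last = source[source.Length - 1];
            bool terminated = last == '\u000D' || last == '\u000A'
                           || last == '\u2028' || last == '\u2029';
            if (!terminated)
                source += '\u000D';
        }
        return source;
    }

    static void Main()
    {
        Console.WriteLine(NormalizeEnd("class C {}\u001A") == "class C {}\r"); // True
        Console.WriteLine(NormalizeEnd("class C {}\n") == "class C {}\n");     // True
        Console.WriteLine(NormalizeEnd("") == "");                             // True
    }
}
```

The order matters: the Control-Z is removed first, so a file ending in a line feed followed by Control-Z does not receive an extra carriage return.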

Comments

Two forms of comments are supported: single-line comments and delimited comments. Single-line comments start with the characters // and extend to the end of the source line. Delimited comments start with the characters /* and end with the characters */. Delimited comments may span multiple lines.

comment
    : single_line_comment
    | delimited_comment
    ;

single_line_comment
    : '//' input_character*
    ;

input_character
    : '<Any Unicode character except a new_line_character>'
    ;

new_line_character
    : '<Carriage return character (U+000D)>'
    | '<Line feed character (U+000A)>'
    | '<Next line character (U+0085)>'
    | '<Line separator character (U+2028)>'
    | '<Paragraph separator character (U+2029)>'
    ;

delimited_comment
    : '/*' delimited_comment_section* asterisk+ '/'
    ;

delimited_comment_section
    : '/'
    | asterisk* not_slash_or_asterisk
    ;

asterisk
    : '*'
    ;

not_slash_or_asterisk
    : '<Any Unicode character except / or *>'
    ;

Comments do not nest. The character sequences /* and */ have no special meaning within a // comment, and the character sequences // and /* have no special meaning within a delimited comment.

Comments are not processed within character and string literals.
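As a brief illustration of the last two rules (the class name CommentsInLiterals is invented for this sketch), comment delimiters inside a string literal are ordinary characters, and a /* inside a // comment does not open a delimited comment:

```csharp
using System;

static class CommentsInLiterals
{
    static void Main()
    {
        // The delimiters below are part of the string value, not comments.
        string s = "/* not a comment */ // still part of the string";

        // This line mentions /* but no delimited comment starts here.
        char slash = '/';

        Console.WriteLine(s);
        Console.WriteLine(slash);
    }
}
```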

The example

/* Hello, world program
   This program writes "hello, world" to the console
*/
class Hello
{
    static void Main() {
        System.Console.WriteLine("hello, world");
    }
}

includes a delimited comment.

The example

// Hello, world program
// This program writes "hello, world" to the console
//
class Hello // any name will do for this class
{
    static void Main() { // this method must be named "Main"
        System.Console.WriteLine("hello, world");
    }
}

shows several single-line comments.

White space

White space is defined as any character with Unicode class Zs (which includes the space character) as well as the horizontal tab character, the vertical tab character, and the form feed character.

whitespace
    : '<Any character with Unicode class Zs>'
    | '<Horizontal tab character (U+0009)>'
    | '<Vertical tab character (U+000B)>'
    | '<Form feed character (U+000C)>'
    ;
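The whitespace production can be approximated with the .NET character-category API. The sketch below is an assumption-laden illustration (the names LexicalWhitespace and IsWhitespace are hypothetical); note that it deliberately differs from char.IsWhiteSpace, which also accepts line terminators.

```csharp
using System;
using System.Globalization;

static class LexicalWhitespace
{
    // Hypothetical helper mirroring the whitespace production:
    // Unicode class Zs, horizontal tab, vertical tab, or form feed.
    public static bool IsWhitespace(char c) =>
        char.GetUnicodeCategory(c) == UnicodeCategory.SpaceSeparator
        || c == '\u0009'
        || c == '\u000B'
        || c == '\u000C';

    static void Main()
    {
        Console.WriteLine(IsWhitespace(' '));       // True  (space, class Zs)
        Console.WriteLine(IsWhitespace('\u00A0'));  // True  (no-break space, class Zs)
        Console.WriteLine(IsWhitespace('\t'));      // True  (horizontal tab)
        Console.WriteLine(IsWhitespace('\n'));      // False (line feed is a line terminator)
    }
}
```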

Tokens

There are several kinds of tokens: identifiers, keywords, literals, operators, and punctuators. White space and comments are not tokens, though they act as separators for tokens.

token
    : identifier
    | keyword
    | integer_literal
    | real_literal
    | character_literal
    | string_literal
    | interpolated_string_literal
    | operator_or_punctuator
    ;

Unicode character escape sequences

A Unicode character escape sequence represents a Unicode character. Unicode character escape sequences are processed in identifiers (Identifiers), character literals (Character literals), and regular string literals (String literals). A Unicode character escape is not processed in any other location (for example, to form an operator, punctuator, or keyword).

unicode_escape_sequence
    : '\\u' hex_digit hex_digit hex_digit hex_digit
    | '\\U' hex_digit hex_digit hex_digit hex_digit hex_digit hex_digit hex_digit hex_digit
    ;

A Unicode escape sequence represents the single Unicode character formed by the hexadecimal number following the "\u" or "\U" characters. Since C# uses a 16-bit encoding of Unicode code points in characters and string values, a Unicode character in the range U+10000 to U+10FFFF is not permitted in a character literal and is represented using a Unicode surrogate pair in a string literal. Unicode characters with code points above 0x10FFFF are not supported.

Multiple translations are not performed. For instance, the string literal "\u005Cu005C" is equivalent to "\u005C" rather than "\". The Unicode value \u005C is the character "\".
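Both points can be checked with a short program. This is a minimal sketch; the class name UnicodeEscapeDemo is made up for the example.

```csharp
using System;

static class UnicodeEscapeDemo
{
    static void Main()
    {
        // \u005C becomes a backslash, and the following "u005C" is NOT
        // re-scanned as another escape: the result is the six characters \u005C.
        string once = "\u005Cu005C";
        Console.WriteLine(once.Length);        // 6
        Console.WriteLine(once == @"\u005C");  // True

        // A code point above U+FFFF is stored as a surrogate pair in a string.
        string astral = "\U0001F600";
        Console.WriteLine(astral.Length);                    // 2
        Console.WriteLine(char.IsSurrogatePair(astral, 0));  // True
    }
}
```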

The example

class Class1
{
    static void Test(bool \u0066) {
        char c = '\u0066';
        if (\u0066)
            System.Console.WriteLine(c.ToString());
    }        
}

shows several uses of \u0066, which is the escape sequence for the letter "f". The program is equivalent to

class Class1
{
    static void Test(bool f) {
        char c = 'f';
        if (f)
            System.Console.WriteLine(c.ToString());
    }        
}

Identifiers

The rules for identifiers given in this section correspond exactly to those recommended by the Unicode Standard Annex 31, except that underscore is allowed as an initial character (as is traditional in the C programming language), Unicode escape sequences are permitted in identifiers, and the "@" character is allowed as a prefix to enable keywords to be used as identifiers.

identifier
    : available_identifier
    | '@' identifier_or_keyword
    ;

available_identifier
    : '<An identifier_or_keyword that is not a keyword>'
    ;

identifier_or_keyword
    : identifier_start_character identifier_part_character*
    ;

identifier_start_character
    : letter_character
    | '_'
    ;

identifier_part_character
    : letter_character
    | decimal_digit_character
    | connecting_character
    | combining_character
    | formatting_character
    ;

letter_character
    : '<A Unicode character of classes Lu, Ll, Lt, Lm, Lo, or Nl>'
    | '<A unicode_escape_sequence representing a character of classes Lu, Ll, Lt, Lm, Lo, or Nl>'
    ;

combining_character
    : '<A Unicode character of classes Mn or Mc>'
    | '<A unicode_escape_sequence representing a character of classes Mn or Mc>'
    ;

decimal_digit_character
    : '<A Unicode character of the class Nd>'
    | '<A unicode_escape_sequence representing a character of the class Nd>'
    ;

connecting_character
    : '<A Unicode character of the class Pc>'
    | '<A unicode_escape_sequence representing a character of the class Pc>'
    ;

formatting_character
    : '<A Unicode character of the class Cf>'
    | '<A unicode_escape_sequence representing a character of the class Cf>'
    ;

For information on the Unicode character classes mentioned above, see The Unicode Standard, Version 3.0, section 4.5.

Examples of valid identifiers include "identifier1", "_identifier2", and "@if".

An identifier in a conforming program must be in the canonical format defined by Unicode Normalization Form C, as defined by Unicode Standard Annex 15. The behavior when encountering an identifier not in Normalization Form C is implementation-defined; however, a diagnostic is not required.

The prefix "@" enables the use of keywords as identifiers, which is useful when interfacing with other programming languages. The character @ is not actually part of the identifier, so the identifier might be seen in other languages as a normal identifier, without the prefix. An identifier with an @ prefix is called a verbatim identifier. Use of the @ prefix for identifiers that are not keywords is permitted, but strongly discouraged as a matter of style.

The example:

class @class
{
    public static void @static(bool @bool) {
        if (@bool)
            System.Console.WriteLine("true");
        else
            System.Console.WriteLine("false");
    }    
}

class Class1
{
    static void M() {
        cl\u0061ss.st\u0061tic(true);
    }
}

defines a class named "class" with a static method named "static" that takes a parameter named "bool". Note that since Unicode escapes are not permitted in keywords, the token "cl\u0061ss" is an identifier, and is the same identifier as "@class".

Two identifiers are considered the same if they are identical after the following transformations are applied, in order:

  • The prefix "@", if used, is removed.
  • Each unicode_escape_sequence is transformed into its corresponding Unicode character.
  • Any formatting_characters are removed.

Identifiers containing two consecutive underscore characters (U+005F) are reserved for use by the implementation. For example, an implementation might provide extended keywords that begin with two underscores.
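The first two identifier-equality transformations can be observed from ordinary code. The sketch below is illustrative only (the class IdentifierDemo and the local names are invented); it relies on nameof, which reports an identifier's name after the "@" prefix has been dropped and escapes have been decoded.

```csharp
using System;

static class IdentifierDemo
{
    static void Main()
    {
        int @if = 1;           // verbatim identifier: the name is "if"
        int tot\u0061l = 2;    // escape sequence: the name is "total"

        // "total" and "tot\u0061l" denote the same variable.
        total += @if;

        Console.WriteLine(total);                // 3
        Console.WriteLine(nameof(@if));          // if
        Console.WriteLine(nameof(tot\u0061l));   // total
    }
}
```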

Keywords

A keyword is an identifier-like sequence of characters that is reserved, and cannot be used as an identifier except when prefaced by the @ character.

keyword
    : 'abstract' | 'as'       | 'base'       | 'bool'      | 'break'
    | 'byte'     | 'case'     | 'catch'      | 'char'      | 'checked'
    | 'class'    | 'const'    | 'continue'   | 'decimal'   | 'default'
    | 'delegate' | 'do'       | 'double'     | 'else'      | 'enum'
    | 'event'    | 'explicit' | 'extern'     | 'false'     | 'finally'
    | 'fixed'    | 'float'    | 'for'        | 'foreach'   | 'goto'
    | 'if'       | 'implicit' | 'in'         | 'int'       | 'interface'
    | 'internal' | 'is'       | 'lock'       | 'long'      | 'namespace'
    | 'new'      | 'null'     | 'object'     | 'operator'  | 'out'
    | 'override' | 'params'   | 'private'    | 'protected' | 'public'
    | 'readonly' | 'ref'      | 'return'     | 'sbyte'     | 'sealed'
    | 'short'    | 'sizeof'   | 'stackalloc' | 'static'    | 'string'
    | 'struct'   | 'switch'   | 'this'       | 'throw'     | 'true'
    | 'try'      | 'typeof'   | 'uint'       | 'ulong'     | 'unchecked'
    | 'unsafe'   | 'ushort'   | 'using'      | 'virtual'   | 'void'
    | 'volatile' | 'while'
    ;

In some places in the grammar, specific identifiers have special meaning, but are not keywords. Such identifiers are sometimes referred to as "contextual keywords". For example, within a property declaration, the "get" and "set" identifiers have special meaning (Accessors). An identifier other than get or set is never permitted in these locations, so this use does not conflict with a use of these words as identifiers. In other cases, such as with the identifier "var" in implicitly typed local variable declarations (Local variable declarations), a contextual keyword can conflict with declared names. In such cases, the declared name takes precedence over the use of the identifier as a contextual keyword.
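Because contextual keywords are not reserved, they remain usable as ordinary names. The following sketch (with made-up names Counter and ContextualKeywordDemo) uses get and set both as accessor keywords and as local variable names, and uses var as a plain variable name:

```csharp
using System;

class Counter
{
    private int count;

    public int Count
    {
        get { return count; }    // "get" and "set" have special meaning here
        set { count = value; }
    }
}

static class ContextualKeywordDemo
{
    static void Main()
    {
        // Outside an accessor position these are ordinary identifiers.
        int get = 1, set = 2;
        int var = get + set;

        Counter c = new Counter { Count = var };
        Console.WriteLine(c.Count); // 3
    }
}
```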

Literals

A literal is a source code representation of a value.

literal
    : boolean_literal
    | integer_literal
    | real_literal
    | character_literal
    | string_literal
    | null_literal
    ;

Boolean literals

There are two boolean literal values: true and false.

boolean_literal
    : 'true'
    | 'false'
    ;

The type of a boolean_literal is bool.

Integer literals

Integer literals are used to write values of types int, uint, long, and ulong. Integer literals have two possible forms: decimal and hexadecimal.

integer_literal
    : decimal_integer_literal
    | hexadecimal_integer_literal
    ;

decimal_integer_literal
    : decimal_digit+ integer_type_suffix?
    ;

decimal_digit
    : '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
    ;

integer_type_suffix
    : 'U' | 'u' | 'L' | 'l' | 'UL' | 'Ul' | 'uL' | 'ul' | 'LU' | 'Lu' | 'lU' | 'lu'
    ;

hexadecimal_integer_literal
    : '0x' hex_digit+ integer_type_suffix?
    | '0X' hex_digit+ integer_type_suffix?
    ;

hex_digit
    : '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
    | 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'a' | 'b' | 'c' | 'd' | 'e' | 'f';

The type of an integer literal is determined as follows:

  • If the literal has no suffix, it has the first of these types in which its value can be represented: int, uint, long, ulong.
  • If the literal is suffixed by U or u, it has the first of these types in which its value can be represented: uint, ulong.
  • If the literal is suffixed by L or l, it has the first of these types in which its value can be represented: long, ulong.
  • If the literal is suffixed by UL, Ul, uL, ul, LU, Lu, lU, or lu, it is of type ulong.

If the value represented by an integer literal is outside the range of the ulong type, a compile-time error occurs.
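The typing rules for integer literals can be observed by asking each literal for its run-time type. This is a small illustrative program, not part of the specification; the class name IntegerLiteralTypes is invented.

```csharp
using System;

static class IntegerLiteralTypes
{
    static void Main()
    {
        // No suffix: first of int, uint, long, ulong that can hold the value.
        Console.WriteLine((2147483647).GetType());           // System.Int32
        Console.WriteLine((2147483648).GetType());           // System.UInt32
        Console.WriteLine((4294967296).GetType());           // System.Int64
        Console.WriteLine((9223372036854775808).GetType());  // System.UInt64

        // Hexadecimal literals follow the same rule.
        Console.WriteLine((0xFFFFFFFF).GetType());           // System.UInt32

        // Suffixes narrow the list of candidate types.
        Console.WriteLine((1U).GetType());                   // System.UInt32
        Console.WriteLine((1L).GetType());                   // System.Int64
        Console.WriteLine((1UL).GetType());                  // System.UInt64
    }
}
```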

As a matter of style, it is suggested that "L" be used instead of "l" when writing literals of type long, since it is easy to confuse the letter "l" with the digit "1".

To permit the smallest possible int and long values to be written as decimal integer literals, the following two rules exist:

  • When a decimal_integer_literal with the value 2147483648 (2^31) and no integer_type_suffix appears as the token immediately following a unary minus operator token (Unary minus operator), the result is a constant of type int with the value -2147483648 (-2^31). In all other situations, such a decimal_integer_literal is of type uint.
  • When a decimal_integer_literal with the value 9223372036854775808 (2^63) and no integer_type_suffix or the integer_type_suffix L or l appears as the token immediately following a unary minus operator token (Unary minus operator), the result is a constant of type long with the value -9223372036854775808 (-2^63). In all other situations, such a decimal_integer_literal is of type ulong.
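A short sketch of these two special cases follows (the class name MinValueLiterals is made up). The third declaration shows that the rule depends on the literal being the token immediately after the minus sign: with parentheses in between, the literal is an ordinary uint, and negating a uint yields a long.

```csharp
using System;

static class MinValueLiterals
{
    static void Main()
    {
        var minInt = -2147483648;             // special case: type int
        var minLong = -9223372036854775808;   // special case: type long
        var viaParens = -(2147483648);        // uint literal, negated as long

        Console.WriteLine(minInt.GetType());      // System.Int32
        Console.WriteLine(minLong.GetType());     // System.Int64
        Console.WriteLine(viaParens.GetType());   // System.Int64
        Console.WriteLine(minInt == int.MinValue);    // True
        Console.WriteLine(minLong == long.MinValue);  // True
    }
}
```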

Real literals

Real literals are used to write values of types float, double, and decimal.

real_literal
    : decimal_digit+ '.' decimal_digit+ exponent_part? real_type_suffix?
    | '.' decimal_digit+ exponent_part? real_type_suffix?
    | decimal_digit+ exponent_part real_type_suffix?
    | decimal_digit+ real_type_suffix
    ;

exponent_part
    : 'e' sign? decimal_digit+
    | 'E' sign? decimal_digit+
    ;

sign
    : '+'
    | '-'
    ;

real_type_suffix
    : 'F' | 'f' | 'D' | 'd' | 'M' | 'm'
    ;

If no real_type_suffix is specified, the type of the real literal is double. Otherwise, the real type suffix determines the type of the real literal, as follows:

  • A real literal suffixed by F or f is of type float. For example, the literals 1f, 1.5f, 1e10f, and 123.456F are all of type float.
  • A real literal suffixed by D or d is of type double. For example, the literals 1d, 1.5d, 1e10d, and 123.456D are all of type double.
  • A real literal suffixed by M or m is of type decimal. For example, the literals 1m, 1.5m, 1e10m, and 123.456M are all of type decimal. This literal is converted to a decimal value by taking the exact value, and, if necessary, rounding to the nearest representable value using banker's rounding (The decimal type). Any scale apparent in the literal is preserved unless the value is rounded or the value is zero (in which latter case the sign and scale will be 0). Hence, the literal 2.900m will be parsed to form the decimal with sign 0, coefficient 2900, and scale 3.

If the specified literal cannot be represented in the indicated type, a compile-time error occurs.

The value of a real literal of type float or double is determined by using the IEEE "round to nearest" mode.

Note that in a real literal, decimal digits are always required after the decimal point. For example, 1.3F is a real literal but 1.F is not.
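The suffix rules and the preserved decimal scale can be checked as follows. This is an illustrative sketch (the class name RealLiteralTypes is invented); it uses decimal.GetBits to read the coefficient and scale of 2.900m.

```csharp
using System;
using System.Globalization;

static class RealLiteralTypes
{
    static void Main()
    {
        Console.WriteLine((1.5).GetType());    // System.Double (no suffix)
        Console.WriteLine((1e10).GetType());   // System.Double (exponent, no suffix)
        Console.WriteLine((1f).GetType());     // System.Single
        Console.WriteLine((1d).GetType());     // System.Double
        Console.WriteLine((1.5m).GetType());   // System.Decimal

        // The scale written in a decimal literal is preserved.
        decimal d = 2.900m;
        int[] bits = decimal.GetBits(d);
        int coefficient = bits[0];             // low 32 bits of the coefficient
        int scale = (bits[3] >> 16) & 0xFF;    // scale is stored in bits 16-23
        Console.WriteLine(coefficient);        // 2900
        Console.WriteLine(scale);              // 3
        Console.WriteLine(d.ToString(CultureInfo.InvariantCulture)); // 2.900
    }
}
```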

Character literals

A character literal represents a single character, and usually consists of a character in quotes, as in 'a'.

Note: The ANTLR grammar notation makes the following confusing! In ANTLR, when you write \' it stands for a single quote '. And when you write \\ it stands for a single backslash \. Therefore the first rule for a character literal means it starts with a single quote, then a character, then a single quote. And the eleven possible simple escape sequences are \', \", \\, \0, \a, \b, \f, \n, \r, \t, \v.

character_literal
    : '\'' character '\''
    ;

character
    : single_character
    | simple_escape_sequence
    | hexadecimal_escape_sequence
    | unicode_escape_sequence
    ;

single_character
    : '<Any character except \' (U+0027), \\ (U+005C), and new_line_character>'
    ;

simple_escape_sequence
    : '\\\'' | '\\"' | '\\\\' | '\\0' | '\\a' | '\\b' | '\\f' | '\\n' | '\\r' | '\\t' | '\\v'
    ;

hexadecimal_escape_sequence
    : '\\x' hex_digit hex_digit? hex_digit? hex_digit?;

A character that follows a backslash character (\) in a character must be one of the following characters: ', ", \, 0, a, b, f, n, r, t, u, U, x, v. Otherwise, a compile-time error occurs.

A hexadecimal escape sequence represents a single Unicode character, with the value formed by the hexadecimal number following "\x".

If the value represented by a character literal is greater than U+FFFF, a compile-time error occurs.

A Unicode character escape sequence (Unicode character escape sequences) in a character literal must be in the range U+0000 to U+FFFF.

A simple escape sequence represents a Unicode character encoding, as described in the table below.

Escape sequence    Character name     Unicode encoding
\'                 Single quote       0x0027
\"                 Double quote       0x0022
\\                 Backslash          0x005C
\0                 Null               0x0000
\a                 Alert              0x0007
\b                 Backspace          0x0008
\f                 Form feed          0x000C
\n                 New line           0x000A
\r                 Carriage return    0x000D
\t                 Horizontal tab     0x0009
\v                 Vertical tab       0x000B

The type of a character_literal is char.
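A few of the escape forms can be verified directly against their Unicode encodings. This is a minimal sketch; the class name CharLiteralDemo is made up for the example.

```csharp
using System;

static class CharLiteralDemo
{
    static void Main()
    {
        // Simple escape sequences map to fixed code units.
        Console.WriteLine((int)'\n' == 0x000A);   // True
        Console.WriteLine((int)'\'' == 0x0027);   // True
        Console.WriteLine((int)'\0' == 0x0000);   // True

        // Hexadecimal and Unicode escapes name the character by value.
        Console.WriteLine('\x41' == 'A');         // True
        Console.WriteLine('\u0066' == 'f');       // True

        // Every character literal has type char.
        Console.WriteLine('a'.GetType());         // System.Char
    }
}
```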

String literals

C# supports two forms of string literals: regular string literals and verbatim string literals.

A regular string literal consists of zero or more characters enclosed in double quotes, as in "hello", and may include both simple escape sequences (such as \t for the tab character), and hexadecimal and Unicode escape sequences.

A verbatim string literal consists of an @ character followed by a double-quote character, zero or more characters, and a closing double-quote character. A simple example is @"hello". In a verbatim string literal, the characters between the delimiters are interpreted verbatim, the only exception being a quote_escape_sequence. In particular, simple escape sequences, and hexadecimal and Unicode escape sequences are not processed in verbatim string literals. A verbatim string literal may span multiple lines.

string_literal
    : regular_string_literal
    | verbatim_string_literal
    ;

regular_string_literal
    : '"' regular_string_literal_character* '"'
    ;

regular_string_literal_character
    : single_regular_string_literal_character
    | simple_escape_sequence
    | hexadecimal_escape_sequence
    | unicode_escape_sequence
    ;

single_regular_string_literal_character
    : '<Any character except " (U+0022), \\ (U+005C), and new_line_character>'
    ;

verbatim_string_literal
    : '@"' verbatim_string_literal_character* '"'
    ;

verbatim_string_literal_character
    : single_verbatim_string_literal_character
    | quote_escape_sequence
    ;

single_verbatim_string_literal_character
    : '<any character except ">'
    ;

quote_escape_sequence
    : '""'
    ;

A character that follows a backslash character (\) in a regular_string_literal_character must be one of the following characters: ', ", \, 0, a, b, f, n, r, t, u, U, x, v. Otherwise, a compile-time error occurs.

The example

string a = "hello, world";                   // hello, world
string b = @"hello, world";                  // hello, world

string c = "hello \t world";                 // hello      world
string d = @"hello \t world";                // hello \t world

string e = "Joe said \"Hello\" to me";       // Joe said "Hello" to me
string f = @"Joe said ""Hello"" to me";      // Joe said "Hello" to me

string g = "\\\\server\\share\\file.txt";    // \\server\share\file.txt
string h = @"\\server\share\file.txt";       // \\server\share\file.txt

string i = "one\r\ntwo\r\nthree";
string j = @"one
two
three";

shows a variety of string literals. The last string literal, j, is a verbatim string literal that spans multiple lines. The characters between the quotation marks, including white space such as new line characters, are preserved verbatim.

Since a hexadecimal escape sequence can have a variable number of hex digits, the string literal "\x123" contains a single character with hex value 123. To create a string containing the character with hex value 12 followed by the character 3, one could write "\x00123" or "\x12" + "3" instead.
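The variable-length behaviour of \x is easy to confirm by inspecting string lengths. The following sketch is illustrative only (the class name HexEscapeDemo is invented):

```csharp
using System;

static class HexEscapeDemo
{
    static void Main()
    {
        string one = "\x123";          // \x consumes all three hex digits
        string padded = "\x00123";     // \x consumes four digits (0012), then '3'
        string split = "\x12" + "3";   // concatenation ends the escape early

        Console.WriteLine(one.Length);              // 1
        Console.WriteLine(one[0] == '\u0123');      // True
        Console.WriteLine(padded.Length);           // 2
        Console.WriteLine(padded[0] == '\u0012');   // True
        Console.WriteLine(padded == split);         // True
    }
}
```

When the number of digits matters, a \u escape, which always takes exactly four hex digits, avoids this ambiguity.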

The type of a string_literal is string.

Each string literal does not necessarily result in a new string instance. When two or more string literals that are equivalent according to the string equality operator (String equality operators) appear in the same program, these string literals refer to the same string instance. For instance, the output produced by

class Test
{
    static void Main() {
        object a = "hello";
        object b = "hello";
        System.Console.WriteLine(a == b);
    }
}

is True because the two literals refer to the same string instance.

Interpolated string literals

Interpolated string literals are similar to string literals, but contain holes delimited by { and }, wherein expressions can occur. At runtime, the expressions are evaluated with the purpose of having their textual forms substituted into the string at the place where the hole occurs. The syntax and semantics of string interpolation are described in section (Interpolated strings).

Like string literals, interpolated string literals can be either regular or verbatim. Interpolated regular string literals are delimited by $" and ", and interpolated verbatim string literals are delimited by $@" and ".

Like other literals, lexical analysis of an interpolated string literal initially results in a single token, as per the grammar below. However, before syntactic analysis, the single token of an interpolated string literal is broken into several tokens for the parts of the string enclosing the holes, and the input elements occurring in the holes are lexically analysed again. This may in turn produce more interpolated string literals to be processed, but, if lexically correct, will eventually lead to a sequence of tokens for syntactic analysis to process.
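As a small usage illustration (the class name InterpolationDemo and the sample values are made up), the following program shows a regular and a verbatim interpolated string, the {{ and }} brace escapes, and holes with a format or an alignment:

```csharp
using System;

static class InterpolationDemo
{
    static void Main()
    {
        string name = "world";
        int n = 7;

        // Regular form: escapes such as \t would be processed; {{ and }} are literal braces.
        string regular = $"hello, {name}! {{n}} = {n:D3}";
        Console.WriteLine(regular);    // hello, world! {n} = 007

        // Verbatim form: backslashes are kept as written, holes still work.
        string verbatim = $@"C:\temp\{name}.txt";
        Console.WriteLine(verbatim);   // C:\temp\world.txt

        // A hole may carry an alignment (here: right-aligned in 4 columns).
        string aligned = $"[{n,4}]";
        Console.WriteLine(aligned);    // [   7]
    }
}
```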

interpolated_string_literal
    : '$' interpolated_regular_string_literal
    | '$' interpolated_verbatim_string_literal
    ;

interpolated_regular_string_literal
    : interpolated_regular_string_whole
    | interpolated_regular_string_start  interpolated_regular_string_literal_body interpolated_regular_string_end
    ;

interpolated_regular_string_literal_body
    : regular_balanced_text
    | interpolated_regular_string_literal_body interpolated_regular_string_mid regular_balanced_text
    ;

interpolated_regular_string_whole
    : '"' interpolated_regular_string_character* '"'
    ;

interpolated_regular_string_start
    : '"' interpolated_regular_string_character* '{'
    ;

interpolated_regular_string_mid
    : interpolation_format? '}' interpolated_regular_string_characters_after_brace? '{'
    ;

interpolated_regular_string_end
    : interpolation_format? '}' interpolated_regular_string_characters_after_brace? '"'
    ;

interpolated_regular_string_characters_after_brace
    : interpolated_regular_string_character_no_brace
    | interpolated_regular_string_characters_after_brace interpolated_regular_string_character
    ;

interpolated_regular_string_character
    : single_interpolated_regular_string_character
    | simple_escape_sequence
    | hexadecimal_escape_sequence
    | unicode_escape_sequence
    | open_brace_escape_sequence
    | close_brace_escape_sequence
    ;

interpolated_regular_string_character_no_brace
    : '<Any interpolated_regular_string_character except close_brace_escape_sequence and any hexadecimal_escape_sequence or unicode_escape_sequence designating } (U+007D)>'
    ;

single_interpolated_regular_string_character
    : '<Any character except \" (U+0022), \\ (U+005C), { (U+007B), } (U+007D), and new_line_character>'
    ;

open_brace_escape_sequence
    : '{{'
    ;

close_brace_escape_sequence
    : '}}'
    ;
    
regular_balanced_text
    : regular_balanced_text_part+
    ;

regular_balanced_text_part
    : single_regular_balanced_text_character
    | delimited_comment
    | '@' identifier_or_keyword
    | string_literal
    | interpolated_string_literal
    | '(' regular_balanced_text ')'
    | '[' regular_balanced_text ']'
    | '{' regular_balanced_text '}'
    ;
    
single_regular_balanced_text_character
    : '<Any character except / (U+002F), @ (U+0040), \" (U+0022), $ (U+0024), ( (U+0028), ) (U+0029), [ (U+005B), ] (U+005D), { (U+007B), } (U+007D) and new_line_character>'
    | '</ (U+002F), if not directly followed by / (U+002F) or * (U+002A)>'
    ;
    
interpolation_format
    : interpolation_format_character+
    ;
    
interpolation_format_character
    : '<Any character except \" (U+0022), : (U+003A), { (U+007B) and } (U+007D)>'
    ;
    
interpolated_verbatim_string_literal
    : interpolated_verbatim_string_whole
    | interpolated_verbatim_string_start interpolated_verbatim_string_literal_body interpolated_verbatim_string_end
    ;

interpolated_verbatim_string_literal_body
    : verbatim_balanced_text
    | interpolated_verbatim_string_literal_body interpolated_verbatim_string_mid verbatim_balanced_text
    ;
    
interpolated_verbatim_string_whole
    : '@"' interpolated_verbatim_string_character* '"'
    ;
    
interpolated_verbatim_string_start
    : '@"' interpolated_verbatim_string_character* '{'
    ;
    
interpolated_verbatim_string_mid
    : interpolation_format? '}' interpolated_verbatim_string_characters_after_brace? '{'
    ;
    
interpolated_verbatim_string_end
    : interpolation_format? '}' interpolated_verbatim_string_characters_after_brace? '"'
    ;
    
interpolated_verbatim_string_characters_after_brace
    : interpolated_verbatim_string_character_no_brace
    | interpolated_verbatim_string_characters_after_brace interpolated_verbatim_string_character
    ;
    
interpolated_verbatim_string_character
    : single_interpolated_verbatim_string_character
    | quote_escape_sequence
    | open_brace_escape_sequence
    | close_brace_escape_sequence
    ;
    
interpolated_verbatim_string_character_no_brace
    : '<Any interpolated_verbatim_string_character except close_brace_escape_sequence>'
    ;
    
single_interpolated_verbatim_string_character
    : '<Any character except \" (U+0022), { (U+007B) and } (U+007D)>'
    ;
    
verbatim_balanced_text
    : verbatim_balanced_text_part+
    ;

verbatim_balanced_text_part
    : single_verbatim_balanced_text_character
    | comment
    | '@' identifier_or_keyword
    | string_literal
    | interpolated_string_literal
    | '(' verbatim_balanced_text ')'
    | '[' verbatim_balanced_text ']'
    | '{' verbatim_balanced_text '}'
    ;
    
single_verbatim_balanced_text_character
    : '<Any character except / (U+002F), @ (U+0040), \" (U+0022), $ (U+0024), ( (U+0028), ) (U+0029), [ (U+005B), ] (U+005D), { (U+007B) and } (U+007D)>'
    | '</ (U+002F), if not directly followed by / (U+002F) or * (U+002A)>'
    ;

An interpolated_string_literal token is reinterpreted as multiple tokens and other input elements as follows, in order of occurrence in the interpolated_string_literal:

  • Occurrences of the following are reinterpreted as separate individual tokens: the leading $ sign, interpolated_regular_string_whole, interpolated_regular_string_start, interpolated_regular_string_mid, interpolated_regular_string_end, interpolated_verbatim_string_whole, interpolated_verbatim_string_start, interpolated_verbatim_string_mid, and interpolated_verbatim_string_end.
  • Occurrences of regular_balanced_text and verbatim_balanced_text between these are reprocessed as an input_section (Lexical analysis) and are reinterpreted as the resulting sequence of input elements. These may in turn include interpolated string literal tokens to be reinterpreted.

Syntactic analysis will recombine the tokens into an interpolated_string_expression (Interpolated strings).

Examples TODO
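
Pending the specification's own examples, the following sketch shows three kinds of hole content that the grammar above is designed to handle: a format, balanced text containing string literals, and a nested interpolated string literal. The class HoleExamples and its methods are illustrative assumptions, not part of the specification.

```csharp
using System;

class HoleExamples
{
    public static string Hex(int value) {
        // The text after ':' is an interpolation_format, here the X4 format string.
        return $"0x{value:X4}";
    }

    public static string Sign(int value) {
        // The parentheses keep the ':' of the conditional operator inside balanced
        // text, so it is not taken as the start of an interpolation_format.
        return $"{(value < 0 ? "negative" : "non-negative")}";
    }

    public static string Nested(string[] items) {
        // The hole contains another interpolated string literal, which is
        // reinterpreted in turn.
        return $"items: {$"[{string.Join(", ", items)}]"}";
    }

    static void Main() {
        Console.WriteLine(Hex(255));                    // 0x00FF
        Console.WriteLine(Sign(-1));                    // negative
        Console.WriteLine(Nested(new[] { "a", "b" }));  // items: [a, b]
    }
}
```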

The null literal

null_literal
    : 'null'
    ;

The null_literal can be implicitly converted to a reference type or nullable type.
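
A brief sketch of the two permitted conversions follows. The class NullLiteralDemo is a hypothetical name used only for illustration.

```csharp
using System;

class NullLiteralDemo
{
    public static bool ReferenceIsNull() {
        string s = null;        // conversion to a reference type
        return s == null;
    }

    public static bool NullableHasValue() {
        int? n = null;          // conversion to a nullable type
        return n.HasValue;
    }

    static void Main() {
        // int i = null;        // would not compile: int is a non-nullable value type
        Console.WriteLine(ReferenceIsNull());   // True
        Console.WriteLine(NullableHasValue());  // False
    }
}
```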

Operators and punctuators

There are several kinds of operators and punctuators. Operators are used in expressions to describe operations involving one or more operands. For example, the expression a + b uses the + operator to add the two operands a and b. Punctuators are for grouping and separating.

operator_or_punctuator
    : '{'  | '}'  | '['  | ']'  | '('   | ')'  | '.'  | ','  | ':'  | ';'
    | '+'  | '-'  | '*'  | '/'  | '%'   | '&'  | '|'  | '^'  | '!'  | '~'
    | '='  | '<'  | '>'  | '?'  | '??'  | '::' | '++' | '--' | '&&' | '||'
    | '->' | '==' | '!=' | '<=' | '>='  | '+=' | '-=' | '*=' | '/=' | '%='
    | '&=' | '|=' | '^=' | '<<' | '<<=' | '=>'
    ;

right_shift
    : '>>'
    ;

right_shift_assignment
    : '>>='
    ;

The vertical bar in the right_shift and right_shift_assignment productions is used to indicate that, unlike other productions in the syntactic grammar, no characters of any kind (not even white space) are allowed between the tokens. These productions are treated specially in order to enable the correct handling of type_parameter_lists (Type parameters).
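
The practical consequence can be sketched as follows: two adjacent > characters act as a shift operator in an expression, yet close two nested generic lists in a type. The class RightShiftDemo and its methods are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;

class RightShiftDemo
{
    public static int Shift(int value) {
        return value >> 2;      // two adjacent '>' characters form right_shift
    }

    public static int ShiftAssign(int value) {
        value >>= 1;            // right_shift_assignment
        return value;
    }

    public static int NestedCount() {
        // Here the two '>' characters close two nested generic lists instead.
        var nested = new List<List<int>> { new List<int> { 1, 2 }, new List<int> { 3 } };
        return nested.Count;
    }

    static void Main() {
        // int bad = 64 > > 2;  // would not compile: white space separates the two '>'
        Console.WriteLine(Shift(64));        // 16
        Console.WriteLine(ShiftAssign(64));  // 32
        Console.WriteLine(NestedCount());    // 2
    }
}
```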

Pre-processing directives

The pre-processing directives provide the ability to conditionally skip sections of source files, to report error and warning conditions, and to delineate distinct regions of source code. The term "pre-processing directives" is used only for consistency with the C and C++ programming languages. In C#, there is no separate pre-processing step; pre-processing directives are processed as part of the lexical analysis phase.

pp_directive
    : pp_declaration
    | pp_conditional
    | pp_line
    | pp_diagnostic
    | pp_region
    | pp_pragma
    ;

The following pre-processing directives are available:

  • #define and #undef, which are used to define and undefine, respectively, conditional compilation symbols (Declaration directives).
  • #if, #elif, #else, and #endif, which are used to conditionally skip sections of source code (Conditional compilation directives).
  • #line, which is used to control line numbers emitted for errors and warnings (Line directives).
  • #error and #warning, which are used to issue errors and warnings, respectively (Diagnostic directives).
  • #region and #endregion, which are used to explicitly mark sections of source code (Region directives).
  • #pragma, which is used to specify optional contextual information to the compiler (Pragma directives).

A pre-processing directive always occupies a separate line of source code and always begins with a # character and a pre-processing directive name. White space may occur before the # character and between the # character and the directive name.

A source line containing a #define, #undef, #if, #elif, #else, #endif, #line, or #endregion directive may end with a single-line comment. Delimited comments (the /* */ style of comments) are not permitted on source lines containing pre-processing directives.
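
The layout rules above can be sketched as follows. The symbol Tracing and the class DirectiveLayout are hypothetical names chosen for illustration.

```csharp
    #  define Tracing     // white space may surround '#'; a single-line comment may end the line

using System;

class DirectiveLayout
{
    public static string Mode() {
        #  if Tracing      // still a directive, despite the indentation
        return "tracing";
        #  else
        return "quiet";
        #  endif
    }

    static void Main() {
        // A delimited comment on a directive line, such as one following '#if Tracing',
        // would not be permitted.
        Console.WriteLine(Mode());   // tracing
    }
}
```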

Pre-processing directives are not tokens and are not part of the syntactic grammar of C#. However, pre-processing directives can be used to include or exclude sequences of tokens and can in that way affect the meaning of a C# program. For example, when compiled, the program:

#define A
#undef B

class C
{
#if A
    void F() {}
#else
    void G() {}
#endif

#if B
    void H() {}
#else
    void I() {}
#endif
}

results in the exact same sequence of tokens as the program:

class C
{
    void F() {}
    void I() {}
}

Thus, although the two programs are quite different lexically, they are identical syntactically.

Conditional compilation symbols

The conditional compilation functionality provided by the #if, #elif, #else, and #endif directives is controlled through pre-processing expressions (Pre-processing expressions) and conditional compilation symbols.

conditional_symbol
    : '<Any identifier_or_keyword except true or false>'
    ;

A conditional compilation symbol has two possible states: defined or undefined. At the beginning of the lexical processing of a source file, a conditional compilation symbol is undefined unless it has been explicitly defined by an external mechanism (such as a command-line compiler option). When a #define directive is processed, the conditional compilation symbol named in that directive becomes defined in that source file. The symbol remains defined until an #undef directive for that same symbol is processed, or until the end of the source file is reached. An implication of this is that #define and #undef directives in one source file have no effect on other source files in the same program.

When referenced in a pre-processing expression, a defined conditional compilation symbol has the boolean value true, and an undefined conditional compilation symbol has the boolean value false. There is no requirement that conditional compilation symbols be explicitly declared before they are referenced in pre-processing expressions. Instead, undeclared symbols are simply undefined and thus have the value false.

The name space for conditional compilation symbols is distinct and separate from all other named entities in a C# program. Conditional compilation symbols can only be referenced in #define and #undef directives and in pre-processing expressions.
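
To illustrate the separate name space, the sketch below uses the same name for a conditional compilation symbol and for a class. Both uses of the name Verbose are hypothetical and chosen only to show that they do not collide.

```csharp
#define Verbose

using System;

// A type named Verbose does not clash with the conditional compilation symbol Verbose.
class Verbose
{
    public static string Describe() {
#if Verbose
        return "symbol defined, type is " + typeof(Verbose).Name;
#else
        return "symbol undefined, type is " + typeof(Verbose).Name;
#endif
    }

    static void Main() {
        Console.WriteLine(Describe());   // symbol defined, type is Verbose
    }
}
```

Inside the #if directive the name refers to the symbol; inside typeof it refers to the class.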

Pre-processing expressions

Pre-processing expressions can occur in #if and #elif directives. The operators !, ==, !=, && and || are permitted in pre-processing expressions, and parentheses may be used for grouping.

pp_expression
    : whitespace? pp_or_expression whitespace?
    ;

pp_or_expression
    : pp_and_expression
    | pp_or_expression whitespace? '||' whitespace? pp_and_expression
    ;

pp_and_expression
    : pp_equality_expression
    | pp_and_expression whitespace? '&&' whitespace? pp_equality_expression
    ;

pp_equality_expression
    : pp_unary_expression
    | pp_equality_expression whitespace? '==' whitespace? pp_unary_expression
    | pp_equality_expression whitespace? '!=' whitespace? pp_unary_expression
    ;

pp_unary_expression
    : pp_primary_expression
    | '!' whitespace? pp_unary_expression
    ;

pp_primary_expression
    : 'true'
    | 'false'
    | conditional_symbol
    | '(' whitespace? pp_expression whitespace? ')'
    ;

When referenced in a pre-processing expression, a defined conditional compilation symbol has the boolean value true, and an undefined conditional compilation symbol has the boolean value false.

Evaluation of a pre-processing expression always yields a boolean value. The rules of evaluation for a pre-processing expression are the same as those for a constant expression (Constant expressions), except that the only user-defined entities that can be referenced are conditional compilation symbols.
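
A small sketch of these evaluation rules follows. The symbols A, B, and C and the class PpExpressions are hypothetical; B is deliberately never defined, so it evaluates to false.

```csharp
#define A
#define C

using System;

class PpExpressions
{
    public static string Evaluate() {
        string result = "";
#if A && !B                 // B is undefined, so !B is true
        result += "1";
#endif
#if B || (A == C)           // A and C are both defined: true == true
        result += "2";
#endif
#if A != true               // A is defined: true != true is false
        result += "3";
#endif
#if false || B              // neither operand is true
        result += "4";
#endif
        return result;
    }

    static void Main() {
        Console.WriteLine(Evaluate());   // 12
    }
}
```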

Declaration directives

The declaration directives are used to define or undefine conditional compilation symbols.

pp_declaration
    : whitespace? '#' whitespace? 'define' whitespace conditional_symbol pp_new_line
    | whitespace? '#' whitespace? 'undef' whitespace conditional_symbol pp_new_line
    ;

pp_new_line
    : whitespace? single_line_comment? new_line
    ;

The processing of a #define directive causes the given conditional compilation symbol to become defined, starting with the source line that follows the directive. Likewise, the processing of an #undef directive causes the given conditional compilation symbol to become undefined, starting with the source line that follows the directive.

Any #define and #undef directives in a source file must occur before the first token (Tokens) in the source file; otherwise a compile-time error occurs. In intuitive terms, #define and #undef directives must precede any "real code" in the source file.

The example:

#define Enterprise

#if Professional || Enterprise
    #define Advanced
#endif

namespace Megacorp.Data
{
    #if Advanced
    class PivotTable {...}
    #endif
}

is valid because the #define directives precede the first token (the namespace keyword) in the source file.

The following example results in a compile-time error because a #define follows real code:

#define A
namespace N
{
    #define B
    #if B
    class Class1 {}
    #endif
}

A #define may define a conditional compilation symbol that is already defined, without there being any intervening #undef for that symbol. The example below defines a conditional compilation symbol A and then defines it again.

#define A
#define A

An #undef may "undefine" a conditional compilation symbol that is not defined. The example below defines a conditional compilation symbol A and then undefines it twice; although the second #undef has no effect, it is still valid.

#define A
#undef A
#undef A

Conditional compilation directives

The conditional compilation directives are used to conditionally include or exclude portions of a source file.

pp_conditional
    : pp_if_section pp_elif_section* pp_else_section? pp_endif
    ;

pp_if_section
    : whitespace? '#' whitespace? 'if' whitespace pp_expression pp_new_line conditional_section?
    ;

pp_elif_section
    : whitespace? '#' whitespace? 'elif' whitespace pp_expression pp_new_line conditional_section?
    ;

pp_else_section
    : whitespace? '#' whitespace? 'else' pp_new_line conditional_section?
    ;

pp_endif
    : whitespace? '#' whitespace? 'endif' pp_new_line
    ;

conditional_section
    : input_section
    | skipped_section
    ;

skipped_section
    : skipped_section_part+
    ;

skipped_section_part
    : skipped_characters? new_line
    | pp_directive
    ;

skipped_characters
    : whitespace? not_number_sign input_character*
    ;

not_number_sign
    : '<Any input_character except #>'
    ;

As indicated by the syntax, conditional compilation directives must be written as sets consisting of, in order, an #if directive, zero or more #elif directives, zero or one #else directive, and an #endif directive. Between the directives are conditional sections of source code. Each section is controlled by the immediately preceding directive. A conditional section may itself contain nested conditional compilation directives provided these directives form complete sets.

A pp_conditional selects at most one of the contained conditional_sections for normal lexical processing:

  • The pp_expressions of the #if and #elif directives are evaluated in order until one yields true. If an expression yields true, the conditional_section of the corresponding directive is selected.
  • If all pp_expressions yield false, and if an #else directive is present, the conditional_section of the #else directive is selected.
  • Otherwise, no conditional_section is selected.

The selected conditional_section, if any, is processed as a normal input_section: the source code contained in the section must adhere to the lexical grammar; tokens are generated from the source code in the section; and pre-processing directives in the section have the prescribed effects.

The remaining conditional_sections, if any, are processed as skipped_sections: except for pre-processing directives, the source code in the section need not adhere to the lexical grammar; no tokens are generated from the source code in the section; and pre-processing directives in the section must be lexically correct but are not otherwise processed. Within a conditional_section that is being processed as a skipped_section, any nested conditional_sections (contained in nested #if...#endif and #region...#endregion constructs) are also processed as skipped_sections.
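
The selection rules above can be sketched with a single set of directives. The symbols Alpha, Beta, and Gamma and the class SectionSelection are hypothetical; only Beta is defined.

```csharp
#define Beta

using System;

class SectionSelection
{
    public static string Channel() {
#if Alpha
        return "alpha";
#elif Beta
        return "beta";           // selected: the first expression that yields true
#elif Beta || Gamma
        return "beta-or-gamma";  // also true, but a section has already been selected
#else
        this text is skipped ' " /* so it need not adhere to the lexical grammar
#endif
    }

    static void Main() {
        Console.WriteLine(Channel());   // beta
    }
}
```

Only the section following #elif Beta produces tokens, so the method body consists of a single return statement.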

The following example illustrates how conditional compilation directives can nest:

#define Debug       // Debugging on
#undef Trace        // Tracing off

class PurchaseTransaction
{
    void Commit() {
        #if Debug
            CheckConsistency();
            #if Trace
                WriteToLog(this.ToString());
            #endif
        #endif
        CommitHelper();
    }
}

Except for pre-processing directives, skipped source code is not subject to lexical analysis. For example, the following is valid despite the unterminated comment in the #else section:

#define Debug        // Debugging on

class PurchaseTransaction
{
    void Commit() {
        #if Debug
            CheckConsistency();
        #else
            /* Do something else
        #endif
    }
}

Note, however, that pre-processing directives are required to be lexically correct even in skipped sections of source code.

Pre-processing directives are not processed when they appear inside multi-line input elements. For example, the program:

class Hello
{
    static void Main() {
        System.Console.WriteLine(@"hello, 
#if Debug
        world
#else
        Nebraska
#endif
        ");
    }
}

results in the output:

hello,
#if Debug
        world
#else
        Nebraska
#endif

In peculiar cases, the set of pre-processing directives that is processed might depend on the evaluation of the pp_expression. The example:

#if X
    /*
#else
    /* */ class Q { }
#endif

always produces the same token stream (class Q { }), regardless of whether or not X is defined. If X is defined, the only processed directives are #if and #endif, due to the multi-line comment. If X is undefined, then three directives (#if, #else, #endif) are part of the directive set.

Diagnostic directives

The diagnostic directives are used to explicitly generate error and warning messages that are reported in the same way as other compile-time errors and warnings.

pp_diagnostic
    : whitespace? '#' whitespace? 'error' pp_message
    | whitespace? '#' whitespace? 'warning' pp_message
    ;

pp_message
    : new_line
    | whitespace input_character* new_line
    ;

The example:

#warning Code review needed before check-in

#if Debug && Retail
    #error A build can't be both debug and retail
#endif

class Test {...}

always produces a warning ("Code review needed before check-in"), and produces a compile-time error ("A build can't be both debug and retail") if the conditional symbols Debug and Retail are both defined. Note that a pp_message can contain arbitrary text; specifically, it need not contain well-formed tokens, as shown by the single quote in the word can't.

Region directives

The region directives are used to explicitly mark regions of source code.

pp_region
    : pp_start_region conditional_section? pp_end_region
    ;

pp_start_region
    : whitespace? '#' whitespace? 'region' pp_message
    ;

pp_end_region
    : whitespace? '#' whitespace? 'endregion' pp_message
    ;

No semantic meaning is attached to a region; regions are intended for use by the programmer or by automated tools to mark a section of source code. The message specified in a #region or #endregion directive likewise has no semantic meaning; it merely serves to identify the region. Matching #region and #endregion directives may have different pp_messages.

The lexical processing of a region:

#region
...
#endregion

corresponds exactly to the lexical processing of a conditional compilation directive of the form:

#if true
...
#endif
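
As a small sketch of regions in use, the following nests one region inside another and gives the inner #endregion a message that differs from its #region. The class RegionDemo and the region messages are hypothetical.

```csharp
using System;

class RegionDemo
{
    #region Public surface
    public static int Answer() {
        #region Nested: the computation
        int value = 6 * 7;
        #endregion this message need not match the one on the #region line
        return value;
    }
    #endregion

    static void Main() {
        Console.WriteLine(Answer());   // 42
    }
}
```

Removing all four region lines would leave the token stream unchanged, consistent with regions carrying no semantic meaning.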

Line directives

Line directives may be used to alter the line numbers and source file names that are reported by the compiler in output such as warnings and errors, and that are used by caller info attributes (Caller info attributes).

Line directives are most commonly used in meta-programming tools that generate C# source code from some other text input.

pp_line
    : whitespace? '#' whitespace? 'line' whitespace line_indicator pp_new_line
    ;

line_indicator
    : decimal_digit+ whitespace file_name
    | decimal_digit+
    | 'default'
    | 'hidden'
    ;

file_name
    : '"' file_name_character+ '"'
    ;

file_name_character
    : '<Any input_character except ">'
    ;

When no #line directives are present, the compiler reports true line numbers and source file names in its output. When processing a #line directive that includes a line_indicator that is not default, the compiler treats the line after the directive as having the given line number (and file name, if specified).

A #line default directive reverses the effect of all preceding #line directives. The compiler reports true line information for subsequent lines, precisely as if no #line directives had been processed.

A #line hidden directive has no effect on the file and line numbers reported in error messages, but does affect source level debugging. When debugging, all lines between a #line hidden directive and the subsequent #line directive (that is not #line hidden) have no line number information. When stepping through code in the debugger, these lines will be skipped entirely.

Note that a file_name differs from a regular string literal in that escape characters are not processed; the "\" character simply designates an ordinary backslash character within a file_name.
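
One way to observe the effect of a #line directive from within a program is through a caller info attribute. This is a sketch under stated assumptions: the file name Template.input, the class LineDirectiveDemo, and the helper CallerLine are hypothetical, while CallerLineNumber is the attribute from System.Runtime.CompilerServices.

```csharp
using System;
using System.Runtime.CompilerServices;

class LineDirectiveDemo
{
    // Returns the line number that the compiler attributes to the call site.
    static int CallerLine([CallerLineNumber] int line = 0) {
        return line;
    }

    public static int Remapped() {
#line 200 "Template.input"
        return CallerLine();    // the line after the directive is treated as line 200
#line default
    }

    static void Main() {
        Console.WriteLine(Remapped());   // 200
    }
}
```

After #line default, diagnostics and caller info for later lines refer to the true file and line numbers again.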

Pragma directives

The #pragma pre-processing directive is used to specify optional contextual information to the compiler. The information supplied in a #pragma directive will never change program semantics.

pp_pragma
    : whitespace? '#' whitespace? 'pragma' whitespace pragma_body pp_new_line
    ;

pragma_body
    : pragma_warning_body
    ;

C# provides #pragma directives to control compiler warnings. Future versions of the language may include additional #pragma directives. To ensure interoperability with other C# compilers, the Microsoft C# compiler does not issue compilation errors for unknown #pragma directives; such directives do however generate warnings.

Pragma warning

The #pragma warning directive is used to disable or restore all or a particular set of warning messages during compilation of the subsequent program text.

pragma_warning_body
    : 'warning' whitespace warning_action
    | 'warning' whitespace warning_action whitespace warning_list
    ;

warning_action
    : 'disable'
    | 'restore'
    ;

warning_list
    : decimal_digit+ (whitespace? ',' whitespace? decimal_digit+)*
    ;

A #pragma warning directive that omits the warning list affects all warnings. A #pragma warning directive that includes a warning list affects only those warnings that are specified in the list.

A #pragma warning disable directive disables all or the given set of warnings.

A #pragma warning restore directive restores all or the given set of warnings to the state that was in effect at the beginning of the compilation unit. Note that if a particular warning was disabled externally, a #pragma warning restore (whether for all or the specific warning) will not re-enable that warning.

The following example shows use of #pragma warning to temporarily disable the warning reported when obsoleted members are referenced, using the warning number from the Microsoft C# compiler.

using System;

class Program
{
    [Obsolete]
    static void Foo() {}

    static void Main() {
#pragma warning disable 612
    Foo();
#pragma warning restore 612
    }
}
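
The warning list form and the form without a list can be sketched in the same way. The warning numbers 168 and 219 are the Microsoft C# compiler's numbers for unused-variable warnings and are specific to that compiler; the class PragmaListDemo is a hypothetical name.

```csharp
using System;

class PragmaListDemo
{
    public static int Compute() {
#pragma warning disable 168, 219    // a warning list: only these two warnings are affected
        int declaredOnly;
        int assignedOnly = 1;
#pragma warning restore 168, 219

#pragma warning disable             // no list: all warnings are affected
        int alsoUnused = 2;
#pragma warning restore
        return 42;
    }

    static void Main() {
        Console.WriteLine(Compute());   // 42
    }
}
```

Pairing each disable with a restore keeps the suppression limited to the lines that need it.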