x64 software conventions

This section describes the C++ calling convention methodology for x64, the 64-bit extension to the x86 architecture.

Overview of x64 calling conventions

Two important differences between x86 and x64 are the 64-bit addressing capability and a flat set of 16 64-bit registers for general use. Given the expanded register set, x64 uses the __fastcall calling convention and a RISC-based exception-handling model. The __fastcall convention passes the first four arguments in registers and any additional arguments on the stack frame. For details on the x64 calling convention, including register usage, stack parameters, return values, and stack unwinding, see x64 calling convention.

Enable optimization for x64

The following compiler option helps you optimize your application for x64:

Types and storage

This section describes the enumeration and storage of data types for the x64 architecture.

Scalar types

Although it's possible to access data with any alignment, it's recommended to align data on its natural boundary, or some multiple of it, to avoid performance loss. Enums are constant integers and are treated as 32-bit integers. The following table describes the type definition and recommended storage for data as it pertains to alignment, using the following alignment values:

  • Byte - 8 bits

  • Word - 16 bits

  • Doubleword - 32 bits

  • Quadword - 64 bits

  • Octaword - 128 bits

Scalar Type              C Data Type                  Storage Size (in bytes)  Recommended Alignment
INT8                     char                         1                        Byte
UINT8                    unsigned char                1                        Byte
INT16                    short                        2                        Word
UINT16                   unsigned short               2                        Word
INT32                    int, long                    4                        Doubleword
UINT32                   unsigned int, unsigned long  4                        Doubleword
INT64                    __int64                      8                        Quadword
UINT64                   unsigned __int64             8                        Quadword
FP32 (single precision)  float                        4                        Doubleword
FP64 (double precision)  double                       8                        Quadword
POINTER                  *                            8                        Quadword
__m64                    struct __m64                 8                        Quadword
__m128                   struct __m128                16                       Octaword
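
As a rough illustration, the sizes and alignments in the table can be checked at compile time. The sketch below uses the standard C++ fixed-width types as stand-ins for the scalar type names, and the helper name has_layout is hypothetical. It assumes a 64-bit target. long, __m64, and __m128 are left out because they are compiler and platform specific.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when T has the given storage size and
// alignment, both expressed in bytes.
template <typename T>
constexpr bool has_layout(std::size_t size, std::size_t alignment) {
    return sizeof(T) == size && alignof(T) == alignment;
}

// Each check mirrors one row of the scalar type table (64-bit target assumed).
static_assert(has_layout<std::int8_t>(1, 1),   "INT8: 1 byte, byte aligned");
static_assert(has_layout<std::uint8_t>(1, 1),  "UINT8: 1 byte, byte aligned");
static_assert(has_layout<std::int16_t>(2, 2),  "INT16: 2 bytes, word aligned");
static_assert(has_layout<std::uint16_t>(2, 2), "UINT16: 2 bytes, word aligned");
static_assert(has_layout<std::int32_t>(4, 4),  "INT32: 4 bytes, doubleword aligned");
static_assert(has_layout<std::uint32_t>(4, 4), "UINT32: 4 bytes, doubleword aligned");
static_assert(has_layout<std::int64_t>(8, 8),  "INT64: 8 bytes, quadword aligned");
static_assert(has_layout<std::uint64_t>(8, 8), "UINT64: 8 bytes, quadword aligned");
static_assert(has_layout<float>(4, 4),         "FP32: 4 bytes, doubleword aligned");
static_assert(has_layout<double>(8, 8),        "FP64: 8 bytes, quadword aligned");
static_assert(has_layout<void*>(8, 8),         "POINTER: 8 bytes, quadword aligned");
```

If any row did not hold on the target being compiled for, the build would stop at the corresponding static_assert.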

Aggregates and unions

Other types, such as arrays, structs, and unions, have stricter alignment requirements that ensure consistent aggregate and union storage and data retrieval. Here are the definitions for array, structure, and union:

  • Array

    Contains an ordered group of adjacent data objects. Each object is called an element. All elements within an array have the same size and data type.

  • Structure

    Contains an ordered group of data objects. Unlike the elements of an array, the data objects within a structure can have different data types and sizes. Each data object in a structure is called a member.

  • Union

    An object that holds any one of a set of named members. The members of the named set can be of any type. The storage allocated for a union is equal to the storage required for the largest member of that union, plus any padding required for alignment.

The following table shows the strongly suggested alignment for the scalar members of unions and structures.

Scalar Type              C Data Type                  Required Alignment
INT8                     char                         Byte
UINT8                    unsigned char                Byte
INT16                    short                        Word
UINT16                   unsigned short               Word
INT32                    int, long                    Doubleword
UINT32                   unsigned int, unsigned long  Doubleword
INT64                    __int64                      Quadword
UINT64                   unsigned __int64             Quadword
FP32 (single precision)  float                        Doubleword
FP64 (double precision)  double                       Quadword
POINTER                  *                            Quadword
__m64                    struct __m64                 Quadword
__m128                   struct __m128                Octaword

The following aggregate alignment rules apply:

  • The alignment of an array is the same as the alignment of one of the elements of the array.

  • The alignment of the beginning of a structure or a union is the maximum alignment of any individual member. Each member within the structure or union must be placed at its proper alignment as defined in the previous table, which may require implicit internal padding, depending on the previous member.

  • Structure size must be an integral multiple of its alignment, which may require padding after the last member. Since structures and unions can be grouped in arrays, each array element of a structure or union must begin and end at the proper alignment previously determined.

  • Data can be aligned more strictly than these requirements demand, as long as the previous rules are maintained.

  • An individual compiler may adjust the packing of a structure for size reasons. For example, /Zp (Struct Member Alignment) allows for adjusting the packing of structures.
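
The rules above can be sketched as a small layout calculation. This is only an illustration under default packing (no /Zp adjustment); the names align_up, Member, Layout, struct_layout, and example_members are hypothetical, and alignments are assumed to be powers of two. It requires C++14 or later.

```cpp
#include <cstddef>

// Rounds offset up to the next multiple of alignment.
// Assumes alignment is a power of two.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

struct Member { std::size_t size; std::size_t alignment; };  // one scalar member
struct Layout { std::size_t size; std::size_t alignment; };  // resulting aggregate

// Applies the aggregate rules: each member is placed at its own alignment,
// the structure takes the maximum member alignment, and the total size is
// padded up to a multiple of that alignment.
template <std::size_t N>
constexpr Layout struct_layout(const Member (&members)[N]) {
    std::size_t offset = 0;
    std::size_t max_alignment = 1;
    for (std::size_t i = 0; i < N; ++i) {
        offset = align_up(offset, members[i].alignment) + members[i].size;
        if (members[i].alignment > max_alignment) {
            max_alignment = members[i].alignment;
        }
    }
    return Layout{align_up(offset, max_alignment), max_alignment};
}

// A 1-byte, an 8-byte, and a 2-byte member, each with natural alignment.
constexpr Member example_members[] = {{1, 1}, {8, 8}, {2, 2}};
static_assert(struct_layout(example_members).size == 24, "padded to 24 bytes");
static_assert(struct_layout(example_members).alignment == 8, "quadword aligned");
```

The 1-byte member is followed by 7 bytes of internal padding so the 8-byte member starts at offset 8, and 6 bytes of tail padding bring the total from 18 to 24.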

Examples of Structure Alignment

The following four examples each declare an aligned structure or union, and the corresponding figures illustrate the layout of that structure or union in memory. Each column in a figure represents a byte of memory, and the number in the column indicates the displacement of that byte. The name in the second row of each figure corresponds to the name of a variable in the declaration. The shaded columns indicate padding that is required to achieve the specified alignment.

Example 1

// Total size = 2 bytes, alignment = 2 bytes (word).

__declspec(align(2)) struct {
    short a;      // +0; size = 2 bytes
};

Figure: AMD conversion example 1 structure layout

Example 2

// Total size = 24 bytes, alignment = 8 bytes (quadword).

__declspec(align(8)) struct {
    int a;       // +0; size = 4 bytes
    double b;    // +8; size = 8 bytes
    short c;     // +16; size = 2 bytes
};

Figure: AMD conversion example 2 structure layout

Example 3

// Total size = 12 bytes, alignment = 4 bytes (doubleword).

__declspec(align(4)) struct {
    char a;       // +0; size = 1 byte
    short b;      // +2; size = 2 bytes
    char c;       // +4; size = 1 byte
    int d;        // +8; size = 4 bytes
};

Figure: AMD conversion example 3 structure layout

Example 4

// Total size = 8 bytes, alignment = 8 bytes (quadword).

__declspec(align(8)) union {
    char *p;      // +0; size = 8 bytes
    short s;      // +0; size = 2 bytes
    long l;       // +0; size = 4 bytes
};

Figure: AMD conversion example 4 union layout
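
The offsets and sizes stated in the four examples can be cross-checked with a short sketch. It is not the original code: it uses the standard alignas specifier in place of the Microsoft-specific __declspec(align(N)), fixed-width types in place of short, int, and long, and hypothetical type names (Example1 through Example4). A 64-bit target is assumed.

```cpp
#include <cstddef>
#include <cstdint>

struct alignas(2) Example1 {
    std::int16_t a;  // +0
};
static_assert(sizeof(Example1) == 2 && alignof(Example1) == 2, "Example 1");

struct alignas(8) Example2 {
    std::int32_t a;  // +0, followed by 4 bytes of padding
    double b;        // +8
    std::int16_t c;  // +16, followed by 6 bytes of tail padding
};
static_assert(offsetof(Example2, b) == 8, "b at +8");
static_assert(offsetof(Example2, c) == 16, "c at +16");
static_assert(sizeof(Example2) == 24 && alignof(Example2) == 8, "Example 2");

struct alignas(4) Example3 {
    char a;          // +0, followed by 1 byte of padding
    std::int16_t b;  // +2
    char c;          // +4, followed by 3 bytes of padding
    std::int32_t d;  // +8
};
static_assert(offsetof(Example3, b) == 2, "b at +2");
static_assert(offsetof(Example3, c) == 4, "c at +4");
static_assert(offsetof(Example3, d) == 8, "d at +8");
static_assert(sizeof(Example3) == 12 && alignof(Example3) == 4, "Example 3");

union alignas(8) Example4 {
    char* p;         // +0, 8 bytes (largest member)
    std::int16_t s;  // +0
    std::int32_t l;  // +0, stands in for the 4-byte long
};
static_assert(sizeof(Example4) == 8 && alignof(Example4) == 8, "Example 4");
```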


Structure bit fields are limited to 64 bits and can be of type signed int, unsigned int, int64, or unsigned int64. Bit fields that cross the type boundary skip bits to align the bit field to the next type alignment. For example, integer bit fields may not cross a 32-bit boundary.
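
A minimal sketch of the boundary rule, using hypothetical struct names. Bit field layout is implementation-defined in C++, so the sizes asserted here are an assumption about compilers that follow the rule described above (a 32-bit unsigned int is also assumed).

```cpp
// Two 20-bit fields cannot share one 32-bit unit: the second field would
// straddle the boundary, so it is moved to the start of the next unit.
struct Straddling {
    unsigned int low  : 20;  // bits 0..19 of the first 32-bit unit
    unsigned int high : 20;  // skips 12 bits, starts the second unit
};
static_assert(sizeof(Straddling) == 8, "second field starts a new 32-bit unit");

// A 12-bit and a 20-bit field fit exactly in one 32-bit unit.
struct Packed {
    unsigned int low  : 12;  // bits 0..11
    unsigned int high : 20;  // bits 12..31
};
static_assert(sizeof(Packed) == 4, "both fields share one 32-bit unit");
```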

Conflicts with the x86 compiler

Data types that are larger than 4 bytes are not automatically aligned on the stack when you use the x86 compiler to compile an application. Because the x86 compiler uses a 4-byte aligned stack, anything larger than 4 bytes, for example, a 64-bit integer, cannot be automatically aligned to an 8-byte address.

Working with unaligned data has two implications.

  • It may take longer to access unaligned locations than it takes to access aligned locations.

  • Unaligned locations cannot be used in interlocked operations.

If you require stricter alignment, use __declspec(align(N)) on your variable declarations. This causes the compiler to dynamically align the stack to meet your specifications. However, dynamically adjusting the stack at run time may cause slower execution of your application.
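
As an illustration, the sketch below requests 16-byte alignment for a stack variable and checks the resulting address. It uses the standard alignas specifier as a portable stand-in for __declspec(align(N)); the names is_aligned, Counter, and stack_counter_is_aligned are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// True when the address p is a multiple of alignment (in bytes).
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// An 8-byte value that asks for 16-byte alignment, stricter than its
// natural quadword alignment.
struct alignas(16) Counter {
    std::int64_t value;
};
static_assert(alignof(Counter) == 16, "requested alignment is honored by the type");
static_assert(sizeof(Counter) == 16, "size is padded to a multiple of the alignment");

// Places a Counter in automatic (stack) storage and reports whether its
// address meets the requested alignment.
inline bool stack_counter_is_aligned() {
    Counter local{};
    return is_aligned(&local, alignof(Counter));
}
```

A caller would expect stack_counter_is_aligned() to return true, since the compiler is responsible for arranging the stack so the requested alignment holds.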

Register usage

The x64 architecture provides for 16 general-purpose registers (hereafter referred to as integer registers) as well as 16 XMM/YMM registers available for floating-point use. Volatile registers are scratch registers presumed by the caller to be destroyed across a call. Nonvolatile registers are required to retain their values across a function call and must be saved by the callee if used.

Register volatility and preservation

The following table describes how each register is used across function calls:

Register                Status                                           Use
RAX                     Volatile                                         Return value register
RCX                     Volatile                                         First integer argument
RDX                     Volatile                                         Second integer argument
R8                      Volatile                                         Third integer argument
R9                      Volatile                                         Fourth integer argument
R10:R11                 Volatile                                         Must be preserved as needed by caller; used in syscall/sysret instructions
R12:R15                 Nonvolatile                                      Must be preserved by callee
RDI                     Nonvolatile                                      Must be preserved by callee
RSI                     Nonvolatile                                      Must be preserved by callee
RBX                     Nonvolatile                                      Must be preserved by callee
RBP                     Nonvolatile                                      May be used as a frame pointer; must be preserved by callee
RSP                     Nonvolatile                                      Stack pointer
XMM0, YMM0              Volatile                                         First FP argument; first vector-type argument when __vectorcall is used
XMM1, YMM1              Volatile                                         Second FP argument; second vector-type argument when __vectorcall is used
XMM2, YMM2              Volatile                                         Third FP argument; third vector-type argument when __vectorcall is used
XMM3, YMM3              Volatile                                         Fourth FP argument; fourth vector-type argument when __vectorcall is used
XMM4, YMM4              Volatile                                         Must be preserved as needed by caller; fifth vector-type argument when __vectorcall is used
XMM5, YMM5              Volatile                                         Must be preserved as needed by caller; sixth vector-type argument when __vectorcall is used
XMM6:XMM15, YMM6:YMM15  Nonvolatile (XMM), Volatile (upper half of YMM)  Must be preserved by callee. YMM registers must be preserved as needed by caller.
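
The argument rows of the table can be summarized as a small lookup. This is a sketch for the default convention only (not __vectorcall), with hypothetical names throughout. It also assumes that integer and floating-point arguments share the same four positional slots, so that a floating-point argument in the second position goes to XMM1; that detail comes from the separate x64 calling convention article rather than from the table itself.

```cpp
#include <cstddef>

enum class ArgKind { Integer, FloatingPoint };

enum class Location { RCX, RDX, R8, R9, XMM0, XMM1, XMM2, XMM3, Stack };

// Registers used for the first four argument positions, per the table.
constexpr Location kIntegerArgRegs[4] = {
    Location::RCX, Location::RDX, Location::R8, Location::R9};
constexpr Location kFloatArgRegs[4] = {
    Location::XMM0, Location::XMM1, Location::XMM2, Location::XMM3};

// Returns where the argument at a zero-based position is passed.
// Positions 4 and above go on the stack.
constexpr Location arg_location(std::size_t position, ArgKind kind) {
    return position >= 4 ? Location::Stack
         : kind == ArgKind::Integer ? kIntegerArgRegs[position]
                                    : kFloatArgRegs[position];
}

// For a hypothetical f(int a, double b, int c, float d, int e):
static_assert(arg_location(0, ArgKind::Integer) == Location::RCX, "a in RCX");
static_assert(arg_location(1, ArgKind::FloatingPoint) == Location::XMM1, "b in XMM1");
static_assert(arg_location(2, ArgKind::Integer) == Location::R8, "c in R8");
static_assert(arg_location(3, ArgKind::FloatingPoint) == Location::XMM3, "d in XMM3");
static_assert(arg_location(4, ArgKind::Integer) == Location::Stack, "e on the stack");
```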

On function exit and on function entry to C Runtime Library calls and Windows system calls, the direction flag in the CPU flags register is expected to be cleared.

Stack usage

For details on stack allocation, alignment, function types, and stack frames on x64, see x64 stack usage.

Prolog and epilog

Every function that allocates stack space, calls other functions, saves nonvolatile registers, or uses exception handling must have a prolog whose address limits are described in the unwind data associated with the respective function table entry, and epilogs at each exit to a function. For details on the required prolog and epilog code on x64, see x64 prolog and epilog.

x64 exception handling

For information on the conventions and data structures used to implement structured exception handling and C++ exception handling behavior on x64, see x64 exception handling.

Intrinsics and inline assembly

The x64 compiler does not support inline assembler. This means that functions that cannot be written in C or C++ have to be written either as separate subroutines or as intrinsic functions supported by the compiler. Certain functions are performance sensitive while others are not. Performance-sensitive functions should be implemented as intrinsic functions.

The intrinsics supported by the compiler are described in Compiler Intrinsics.
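
For example, a population count that might otherwise be written in inline assembly can be expressed with an intrinsic. The sketch below is one possible shape, not a recommended implementation: count_set_bits and check_count_set_bits are hypothetical names, __popcnt64 is the MSVC intrinsic from <intrin.h> (it relies on the processor supporting the POPCNT instruction), and the other branches exist only so the example also builds with other compilers.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>  // MSVC compiler intrinsics
#endif

// Counts the bits set in a 64-bit value without inline assembly.
inline unsigned count_set_bits(std::uint64_t value) {
#if defined(_MSC_VER) && defined(_M_X64)
    return static_cast<unsigned>(__popcnt64(value));            // MSVC x64 intrinsic
#elif defined(__GNUC__) || defined(__clang__)
    return static_cast<unsigned>(__builtin_popcountll(value));  // GCC/Clang builtin
#else
    unsigned count = 0;
    for (; value != 0; value &= value - 1) {  // clears the lowest set bit
        ++count;
    }
    return count;
#endif
}

// Small self-check that can be called from a test or from main.
inline void check_count_set_bits() {
    assert(count_set_bits(0) == 0);
    assert(count_set_bits(0xFF) == 8);
}
```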

Image format

The x64 executable image format is PE32+. Executable images (both DLLs and EXEs) are restricted to a maximum size of 2 gigabytes, so relative addressing with a 32-bit displacement can be used to address static image data. This data includes the import address table, string constants, static global data, and so on.
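
The reasoning behind the size limit can be sketched numerically: a signed 32-bit displacement spans roughly plus or minus 2 gigabytes, so any two locations inside an image of at most 2 gigabytes are within reach of each other. The names kMaxImageSize and fits_disp32 below are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// 2 gigabytes, the maximum image size stated above.
constexpr std::int64_t kMaxImageSize = std::int64_t{1} << 31;

// True when a byte displacement can be encoded as a signed 32-bit value.
constexpr bool fits_disp32(std::int64_t displacement) {
    return displacement >= std::numeric_limits<std::int32_t>::min() &&
           displacement <= std::numeric_limits<std::int32_t>::max();
}

// The farthest apart two bytes of one image can be is kMaxImageSize - 1.
static_assert(fits_disp32(kMaxImageSize - 1), "largest forward distance fits");
static_assert(fits_disp32(-(kMaxImageSize - 1)), "largest backward distance fits");
static_assert(!fits_disp32(std::int64_t{1} << 32), "a 4 GB distance would not fit");
```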

See also

Calling Conventions