Floating-Point Intrinsics Using Streaming SIMD Extensions
Each intrinsic entry is presented with its informal pseudo-code, followed by the name of the corresponding instruction in uppercase letters. For example, ADDSS is the first instruction listed in this section, and it corresponds to the following intrinsic:
__m128 _mm_add_ss(__m128 a, __m128 b); ADDSS
The variable r is generally used for the intrinsic's return value. A number appended to a variable name indicates the element of a packed object. For example, r0 is the lowest word of r. Some intrinsics are "composites" because they require more than one instruction to implement.
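As an illustration of the r0 notation, the following is a minimal sketch of calling _mm_add_ss from C. The helper name add_ss_demo and the use of unaligned loads and stores are assumptions made for this example and are not part of the reference. ADDSS adds only the lowest elements and passes the upper three elements of the first operand through unchanged.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ss, ... */

/* Hypothetical helper (not part of the intrinsics API): adds the lowest
   elements of two 4-float vectors with _mm_add_ss and writes all four
   result elements to out[0..3]. */
static void add_ss_demo(const float a_in[4], const float b_in[4], float out[4])
{
    /* Unaligned loads are used so the caller's arrays need no
       particular alignment; element 0 of each array becomes a0 / b0. */
    __m128 a = _mm_loadu_ps(a_in);
    __m128 b = _mm_loadu_ps(b_in);

    /* r0 = a0 + b0; r1 = a1; r2 = a2; r3 = a3 */
    __m128 r = _mm_add_ss(a, b);

    _mm_storeu_ps(out, r);
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, only the lowest element is summed, so the output is {11, 2, 3, 4}.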
You should be familiar with the hardware features provided by the Streaming SIMD Extensions (SSE) when writing programs with the intrinsics. The following are four important issues to keep in mind:
Certain intrinsics, such as _mm_loadr_ps and _mm_cmpgt_ss, are not directly supported by the instruction set. While these intrinsics are convenient programming aids, be mindful that they might consist of more than one machine-language instruction.
Floating-point data loaded or stored as __m128 objects must generally be 16-byte aligned.
Some intrinsics require that their arguments be immediates, that is, constant integers (literals), because of the nature of the instruction.
The result of arithmetic operations acting on two not-a-number (NaN) arguments is undefined. Therefore, floating-point operations using NaN arguments might not match the expected behavior of the corresponding assembly instructions.
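The first three issues can be sketched in a few lines of C. The union vec4 and the helpers reverse4_loadr and reverse4_shuffle are hypothetical names introduced only for this example. The shuffle-based version shows one plausible way to get the same result with an immediate argument; it is not a statement of how any particular compiler expands _mm_loadr_ps.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper type: the __m128 member gives the whole union
   16-byte alignment, so f[] is safe to pass to aligned loads. */
typedef union {
    __m128 v;
    float  f[4];
} vec4;

/* Reverses four floats with the composite intrinsic _mm_loadr_ps,
   which expects a 16-byte-aligned source address. */
static void reverse4_loadr(const float in[4], float out[4])
{
    vec4 src, dst;
    int i;
    for (i = 0; i < 4; ++i)
        src.f[i] = in[i];           /* copy into aligned storage */
    dst.v = _mm_loadr_ps(src.f);    /* r0 = in[3] ... r3 = in[0] */
    for (i = 0; i < 4; ++i)
        out[i] = dst.f[i];
}

/* Same reversal using a plain load followed by a shuffle. The third
   argument of _mm_shuffle_ps must be a compile-time constant
   (an immediate); a run-time variable is rejected by the compiler. */
static void reverse4_shuffle(const float in[4], float out[4])
{
    __m128 a = _mm_loadu_ps(in);    /* unaligned load, no alignment needed */
    __m128 r = _mm_shuffle_ps(a, a, _MM_SHUFFLE(0, 1, 2, 3));
    _mm_storeu_ps(out, r);
}
```

Both helpers turn {1, 2, 3, 4} into {4, 3, 2, 1}. Where the alignment of the data cannot be guaranteed, the unaligned forms _mm_loadu_ps and _mm_storeu_ps avoid the 16-byte requirement.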
The following floating-point operations are discussed: