Double precision support in C++ AMP
C++ AMP supports various data types in the restrict(amp) functions that you schedule on the accelerator. One of those types is double. Double precision comes with caveats that you need to know about, and this post aims to help you with them.
It all depends on your driver
Not all hardware supports doubles, so if you have double precision requirements, you need to check that your hardware has such support.
In addition, you must ensure that you have the latest driver for your hardware (from the hardware vendor’s website), because it is the driver that enables the support. A common pitfall is to read online specs saying that your hardware supports doubles, only to find that your installed driver reports otherwise because it hasn’t been updated.
To complicate matters further, there are two levels of double support and hence there are two APIs exposed by C++ AMP to let you query for that support.
Limited double precision
In Windows Display Driver Model (WDDM) 1.1, double precision is an optional feature for hardware vendors to implement in their DirectX drivers. Even if they do, it is what we call limited double precision support. That means that the following double operations are not supported:
- FMA (fused multiply add)
- Division
- Reciprocal (rcp)
- Casting between int and double
Through C++ AMP you’ll know whether your system has a WDDM 1.1 driver that supports limited double precision by checking the supports_limited_double_precision property on the accelerator, e.g.
bool can_use_doubles_with_limits = accelerator().supports_limited_double_precision;
(Full) Double precision
In WDDM 1.2 (currently supported by Windows 8 and Windows Server 2012 only), the limitations above were removed. So while double precision is still an optional feature, when hardware vendors opt in to implementing it, you get full (aka extended) double precision support.
Through C++ AMP you’ll know whether your system has a WDDM 1.2 driver that opted in to supporting double precision by checking the supports_double_precision property on the accelerator, e.g.
bool can_use_doubles = accelerator().supports_double_precision;
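Since there are two properties to check, it can help to fold them into a single value that the rest of your code branches on. The following is a minimal sketch in standard C++, not part of the C++ AMP API: the enum double_support and the function classify_double_support are hypothetical names introduced here for illustration, and the two bool arguments are assumed to come from the accelerator properties described above.

```cpp
// Hypothetical helper, not part of C++ AMP: folds the two capability flags
// into one level that calling code can switch on.
enum class double_support { none, limited, full };

inline double_support classify_double_support(bool supports_full,
                                              bool supports_limited)
{
    // Test the stronger capability first, so the result does not depend on
    // whether a driver with full support also reports limited support.
    if (supports_full)    return double_support::full;
    if (supports_limited) return double_support::limited;
    return double_support::none;
}

// Intended use in a C++ AMP program (requires <amp.h>):
//   concurrency::accelerator acc;
//   double_support level = classify_double_support(
//       acc.supports_double_precision,
//       acc.supports_limited_double_precision);
```

With a value like this you could, for example, choose a float fallback kernel when the level is none, and avoid the unsupported operations listed earlier when it is limited.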
Note that full double precision is required by the concurrency::precise_math functions in <amp_math.h>.
C++ AMP runtime exception
As you know, calls to the C++ AMP API on the host can throw exceptions; I encourage you to learn more about them in our blog post on C++ AMP Exceptions.
If you try to use doubles on a card whose driver does not support doubles at all, or whose driver has only limited double precision support and you try something unsupported such as division, you will get a concurrency::runtime_exception with one of the following messages:
"concurrency::parallel_for_each uses features (full double_precision) unsupported by the selected accelerator."
"concurrency::parallel_for_each uses features (limited_double_precision) unsupported by the selected accelerator."
Whichever message you get, if you are running under the debug configuration, it also includes the following text:
"ID3D11Device::CreateComputeShader: Shader uses features not recognized by this D3D version."
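For logging or diagnostics, you may want to tell the two cases apart when you catch the exception. The sketch below is standard C++ and only inspects the message text; the enum missing_double_feature and the function missing_feature_from_message are hypothetical names, not C++ AMP APIs. It assumes the messages contain the parenthesized feature names exactly as quoted above, so treat it as a fragile aid for diagnostics. Checking the accelerator properties before launching the kernel remains the more robust approach.

```cpp
#include <string>

// Hypothetical helper, not part of C++ AMP: reports which double precision
// feature an exception message says is missing.
enum class missing_double_feature { none, limited, full };

inline missing_double_feature missing_feature_from_message(const std::string& what)
{
    // Assumption: the runtime's message embeds the feature name in the
    // parenthesized form quoted in this post.
    if (what.find("(full double_precision)") != std::string::npos)
        return missing_double_feature::full;
    if (what.find("(limited_double_precision)") != std::string::npos)
        return missing_double_feature::limited;
    return missing_double_feature::none;
}

// Intended use in a C++ AMP program (requires <amp.h>):
//   try {
//       /* concurrency::parallel_for_each(...) using doubles */
//   } catch (const concurrency::runtime_exception& ex) {
//       missing_double_feature f = missing_feature_from_message(ex.what());
//       /* log f, then fall back to a float kernel or another accelerator */
//   }
```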