Blue Screen Appears on a Node Running a GPGPU Job

Updated: May 2011

Applies To: Windows HPC Server 2008, Windows HPC Server 2008 R2

If a blue screen occurs on a compute node that is running a long-running general-purpose computation on graphics processing units (GPGPU) job, and the GPU uses a Windows Display Driver Model (WDDM) driver, you may need to modify or disable the timeout detection and recovery (TDR) registry setting for the GPU on each compute node. Under WDDM, Windows resets a GPU that does not respond within the timeout period, so a long-running computation can be mistaken for a hung device.

To disable timeout detection and recovery, under HKLM\System\CurrentControlSet\Control\GraphicsDrivers, set the TdrLevel value (type REG_DWORD) to 0, and then restart the compute node so that the change takes effect. For more information, see Timeout Detection and Recovery of GPUs through WDDM (https://go.microsoft.com/fwlink/?LinkId=196045).

Caution
Incorrectly editing the registry may severely damage your system. Before making changes to the registry, you should back up any valued data on the computer.
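
The registry change described above can also be scripted. The following Python code is a minimal sketch, not a supported tool: it assumes that Python 3 is installed on the compute node, and the names TDR_KEY, reg_add_command, and set_tdr_level are hypothetical names chosen for this illustration. The first function only builds a reg.exe command line; the second writes the value by using the standard winreg module, which is available only on Windows and requires administrative rights.

```python
# Registry key (relative to HKEY_LOCAL_MACHINE) that holds the TDR settings.
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def reg_add_command(value_name="TdrLevel", data=0):
    """Return a reg.exe argument list that sets a REG_DWORD value under TDR_KEY.

    Nothing is executed here; the caller decides where and how to run it.
    """
    if not isinstance(data, int) or not 0 <= data <= 0xFFFFFFFF:
        raise ValueError("data must fit in an unsigned 32-bit DWORD")
    return [
        "reg", "add", "HKLM\\" + TDR_KEY,
        "/v", value_name,
        "/t", "REG_DWORD",
        "/d", str(data),
        "/f",  # overwrite an existing value without prompting
    ]


def set_tdr_level(level=0):
    """Write TdrLevel directly on the local node (Windows only, run elevated)."""
    import winreg  # imported here so the module still loads on other platforms

    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, TDR_KEY, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, "TdrLevel", 0, winreg.REG_DWORD, level)


if __name__ == "__main__":
    # Print the command for review instead of changing the registry.
    print(" ".join(reg_add_command()))
```

Printing the command first lets you review it, and back up the registry, before you run it on each compute node. Restart the node after the value is set.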