Debugging CNTK source code in Visual Studio

The steps for debugging CUDA kernels:

Install NVIDIA Nsight following NVIDIA's installation directions.
Follow the directions for “Local debugging”.
Set the environment variable NSIGHT_CUDA_DEBUGGER to 1.
Run Visual Studio and the Nsight Monitor as administrator.
In Nsight Monitor->Options->CUDA, set “Use this monitor for CUDA attach” to True. You may have to restart Nsight Monitor afterwards; if so, run it as administrator again.
In Visual Studio, go to Nsight->Options and make sure the options match those in Nsight Monitor (e.g. the ports are the same). In particular, make sure “Establish secure connection” has the same value in both.
Right click on the MathCUDA project in the solution explorer and go to Properties.
Go to Configuration Properties -> CUDA C/C++ -> Device and set Generate GPU Debug Information to Yes.
Go to Configuration Properties -> CUDA Linker -> General and set Generate GPU Debug Information to Yes.
Add your breakpoints in your kernel, rebuild CNTK, and get ready to run whatever you’re trying to debug.
In Visual Studio, go to Debug -> Attach to Process, set Transport to Nsight GPU Debugger, and set Qualifier to localhost.
Start CNTK.
Click refresh and find CNTK in the process list, then attach. When it hits a breakpoint you should be able to see all of your local variables from the kernel. If you only see CUDA globals like threadIdx and blockIdx, you haven’t properly set the GPU Debug flags in the MathCUDA properties.
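If attaching does not work, one thing worth ruling out is that the NSIGHT_CUDA_DEBUGGER variable from the steps above is not visible to the process you launch. This can happen, for example, if it was set after the launching shell or Visual Studio was opened. The following is a minimal sketch of such a check in standard C++. The function names are hypothetical helpers made up for illustration and are not part of CNTK or Nsight; the only assumption taken from this page is that the expected value is exactly 1.

```cpp
#include <cstdlib>
#include <cstring>
#include <iostream>

// Returns true when the given value is exactly "1", the value the steps
// above ask for. A null pointer (variable not set) counts as not enabled.
bool isDebuggerFlagSet(const char* value) {
    return value != nullptr && std::strcmp(value, "1") == 0;
}

// Reads NSIGHT_CUDA_DEBUGGER from the environment of the current process.
bool nsightCudaDebuggerEnabled() {
    return isDebuggerFlagSet(std::getenv("NSIGHT_CUDA_DEBUGGER"));
}

// Prints a one-line status message. Hypothetical helper for illustration.
void reportNsightEnvironment() {
    std::cout << "NSIGHT_CUDA_DEBUGGER is "
              << (nsightCudaDebuggerEnabled() ? "set to 1" : "not set to 1")
              << '\n';
}
```

Calling reportNsightEnvironment() from a small test program started the same way you start CNTK (same console or same Visual Studio instance) shows whether that environment actually carries the variable.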

Additional resources