YOLO on Azure Deep Learning Virtual Machine (DLVM – Linux)

In a previous post I covered setting up YOLO on an Azure DLVM.  In this post I'll cover setting up YOLO on a Linux (Ubuntu) DSVM/DLVM using Docker.

First, spin up a new Deep Learning Virtual Machine on Linux. It comes already set up with NVIDIA GPU CUDA drivers and Docker.

Then SSH in and clone my Docker GitHub repo (https://github.com/daltskin/dlvm-darknet). This repo uses a fork of https://github.com/AlexeyAB/darknet, includes an OpenCV dependency fix, and enables GPU and CUDNN in the Makefile.

$ git clone https://github.com/daltskin/DLVM-Darknet
$ sudo docker build DLVM-Darknet/darknet -t darknet:latest
$ sudo docker build DLVM-Darknet -t dlvm-darknet:latest
$ sudo docker run --runtime=nvidia dlvm-darknet:latest

You should see some output that looks like:

Loading weights from yolov3.weights...Total BFLOPS 65.864Done!
seen 64
./data/horses.jpg: Predicted in 0.101588 seconds.
horse: 89%
horse: 98%
horse: 97%
horse: 91%


Full output screenshot here


Using Docker should eliminate many of the dependency issues involved in setting up YOLO. However, you should still check the NVIDIA drivers on the host VM by running the following command:

$ nvidia-smi

You should see some output like this:

 | NVIDIA-SMI 396.26 Driver Version: 396.26 |
 | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
 | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
 | 0 Tesla K80 Off | 0000AC17:00:00.0 Off | 0 |
 | N/A 43C P0 56W / 149W | 0MiB / 11441MiB | 0% Default |

 | Processes: GPU Memory |
 | GPU PID Type Process name Usage |
 | No running processes found |
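
If you only want the driver version and GPU name rather than the full table, nvidia-smi can also print selected fields as CSV. The helper below is a minimal sketch: the function name gpu_summary is hypothetical (it is not part of the repo), and it assumes nvidia-smi is on the PATH when the driver is installed.

```shell
# gpu_summary: hypothetical helper, not part of the DLVM-Darknet repo.
# Prints one CSV line per GPU (name, driver version, total memory),
# or a hint on stderr if nvidia-smi is not available on the host.
gpu_summary() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; the NVIDIA driver may not be installed" >&2
        return 1
    fi
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}
```

Calling gpu_summary on the host VM should return one line per GPU. A non-zero exit status suggests the driver needs attention before running the container.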

It is recommended that you update the CUDA drivers on the host VM (see /en-us/azure/virtual-machines/linux/n-series-driver-setup?toc=/azure/virtual-machines/linux/toc.json#cuda-driver-updates):

$ sudo apt-get update
$ sudo apt-get upgrade -y
$ sudo apt-get dist-upgrade -y
$ sudo apt-get install cuda-drivers
$ sudo reboot

You may need to do this if running the darknet commands produces no predictions, e.g. if the output looks like this (note the missing predictions):

seen 64
./data/horses.jpg: Predicted in 0.0101588 seconds.
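
To spot this symptom without reading the output by eye, you can scan it for detection lines. This is a rough sketch only: has_predictions is a hypothetical helper name, and it assumes detections are printed as "label: NN%" lines (like "horse: 89%" in the earlier output).

```shell
# has_predictions: hypothetical helper, not part of darknet or the repo.
# Reads darknet output on stdin and succeeds only if at least one line
# looks like a detection, i.e. starts with "label: NN%" (e.g. "horse: 89%").
# The "Predicted in ... seconds." line starts with "./" so it is not counted.
has_predictions() {
    grep -E '^[[:alpha:]][[:alpha:] ]*: [0-9]+%' >/dev/null
}
```

For example, you could pipe the container output through it:

$ sudo docker run --runtime=nvidia dlvm-darknet:latest 2>&1 | has_predictions || echo "No predictions - check the CUDA drivers"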