Setting up more than 18 GPU Instances on Azure using VMs or Containers
I have been getting a number of questions about the availability of Azure N-Series GPU instances.
At present we have two SKUs: NC (GPU Compute) and NV (GPU Visualisation).
This blog post explains the differences between the SKUs and where NC vs NV hardware instances should be used: https://blogs.msdn.microsoft.com/uk_faculty_connection/2017/01/10/azure-cloud-gpu-for-datascience-and-academic-activities-such-as-cloud-rendering/
DataCenter & OS Availability
For NV machines (visualisation):
OS Availability – Windows Server 2016 or Windows Server 2012 R2
Region Availability – South Central US, West Europe, Southeast Asia
For NC machines (compute):
OS Availability – Windows Server 2016, Windows Server 2012 R2 or Ubuntu 16.04 LTS
Region Availability – South Central US, East US
– When creating the VM, choose HDD and not SSD as the disk type.
– Install the necessary GPU drivers. The documentation is available here: https://docs.microsoft.com/en-us/azure/virtual-machines/virtual-machines-windows-n-series-driver-setup (Windows) and https://docs.microsoft.com/en-us/azure/virtual-machines/virtual-machines-linux-n-series-driver-setup (Linux)
With regard to provisioning: at present, Azure GPU provisioning via portal.azure.com is limited to a maximum of 18 instances. If you require more instances, the following is a quick overview of the steps you should take.
Step 1. Make a provisioning request via the Azure Portal (http://portal.azure.com); that is, simply try to create the VM.
Step 2. You will initially get a failure, as the maximum number of NC24 is 20 instances. This places your request on hold, so please don’t allocate anything over 18 instances without letting me know.
The error when trying to allocate the VMs in South Central US is "Operation results in exceeding quota limits of Core. Maximum allowed: 20, Current in use: 18,"
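As a rough illustration of how to read that error, the sketch below pulls the two numbers out of the message and works out how many extra cores a new allocation would need. The message reports the limit in cores, so the sketch counts cores rather than instances. The function names and the `CORES_PER_SIZE` table are hypothetical helpers written for this example (NC6, NC12 and NC24 have 6, 12 and 24 cores respectively); they are not part of any Azure tool.

```python
import re

# Matches the two numbers in the core-quota error text quoted above.
QUOTA_ERROR = re.compile(r"Maximum allowed:\s*(\d+),\s*Current in use:\s*(\d+)")

# Cores per NC size (illustrative lookup table for this sketch).
CORES_PER_SIZE = {"NC6": 6, "NC12": 12, "NC24": 24}


def parse_quota_error(message):
    """Return (maximum_allowed, current_in_use), or None if no match."""
    match = QUOTA_ERROR.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def extra_cores_needed(message, size, instances):
    """Cores to ask for beyond the current quota to fit the new VMs."""
    parsed = parse_quota_error(message)
    if parsed is None:
        raise ValueError("not a recognised quota error message")
    maximum, in_use = parsed
    needed = in_use + CORES_PER_SIZE[size] * instances
    return max(0, needed - maximum)


error = ("Operation results in exceeding quota limits of Core. "
         "Maximum allowed: 20, Current in use: 18,")
print(parse_quota_error(error))             # (20, 18)
print(extra_cores_needed(error, "NC6", 1))  # 4
```

With 18 cores already in use and a limit of 20, even a single additional NC6 (6 cores) goes over the limit by 4 cores, which is why the request has to go through the quota process described in the next step.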
Step 3. You will receive an email within 24 hours from the provisioning team, who look at the request and generally get in touch to confirm this wasn’t a mistake, as NC series instances are costly.
To request a quota increase, you must open a Support Request with Microsoft. Load the Microsoft Azure Portal and click the question mark icon in the upper-right corner to get started.
The email you receive may come from an alias or from one of the provisioning team directly, but it will be titled like this:
RE: [REG:XXXXXXXX] Quota request - Cores Initial Response (where XXXXXXXX is a number)
Information you need to submit
If you require over 18 instances for your courses, it's always good to be clear about what they are being used for:
The Azure region you wish to deploy to
Requirement: number of instances x type of GPU instance, e.g. 20 x NC6 server instances (specification of available services and costs are below)
What the GPUs will be used for, e.g. GPU cores for use within a Deep Learning curriculum
Course title: e.g. Data Analysis and Probabilistic Inference
Course begin date: e.g. 16/1/17
Course end date: e.g. 1/4/17
Number of participating students: e.g. 120
Profile of students: e.g. 4th-year machine learning students (undergraduate)
Proposed MS Azure utilisation in support of course teaching: e.g. an estimated 880 hours of teaching utilisation. Be mindful of the costs, which will be charged to your Azure subscription.
So in this example you have 20 x NC6 for 880 hrs per instance, i.e. 17,600 hours (733 days) of compute at $0.90 per hr = a $15,840 charge to your Azure subscription.
Number of students: e.g. 125
What students will be doing: e.g. SSH access for students, approx. 3 students per NC6
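The arithmetic in the example above can be reproduced with a short sketch like the one below, which is handy for trying your own numbers before submitting a request. The function names are hypothetical, and the $0.90 hourly rate is simply the figure quoted in the example, not a current price. The sizing helper assumes every student is connected at the same time, which may be more pessimistic than a real course needs.

```python
import math


def estimate_cost(instances, hours_per_instance, rate_per_hour):
    """Return (total compute hours, equivalent days, total charge)."""
    total_hours = instances * hours_per_instance
    return total_hours, total_hours / 24, total_hours * rate_per_hour


def instances_needed(students, students_per_instance):
    """Instances required if all students are connected at once."""
    return math.ceil(students / students_per_instance)


# Figures from the example: 20 x NC6, 880 hours each, $0.90 per hour.
total_hours, days, cost = estimate_cost(20, 880, 0.90)
print(f"{total_hours:,} hours")   # 17,600 hours
print(f"{days:.0f} days")         # 733 days
print(f"${cost:,.2f}")            # $15,840.00

# Sizing by students instead: 125 students at roughly 3 per NC6.
print(instances_needed(125, 3))   # 42
```

Note that sizing purely by student numbers gives a larger figure than the 20 instances used in the cost example, so it is worth stating in your request which assumption you are working from.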
The provisioning team then checks with the capacity team to see whether a pre-authorisation has been given for you and for the details above. They then pass or fail the request; if it passes, your machines can be provisioned.
This typically takes 2 days.
Step 4. You will receive a confirmation email. Simply go back to portal.azure.com and create the VMs again in the same region with the same instance size; these get deployed within 15 mins.
Azure Batch Shipyard Data Science Containers: https://blogs.msdn.microsoft.com/uk_faculty_connection/2017/02/13/deep-learning-using-cntk-caffe-keras-theanotorch-tensorflow-on-docker-with-microsoft-azure-batch-shipyard/
Jupyter notebooks for TensorFlow Deep Learning: https://notebooks.azure.com/library/OEdO6ybBxM4/dashboard?page=1